NotImplementedError: Could not run 'aten::_local_scalar_dense' with arguments from the 'Meta' backend.
#54 · opened by duccio84
I'm getting this error when I try to load llama3.1-8b-instruct from its config file. I'm using transformers==4.43.0 and torch==2.1.2+cu121.
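A minimal sketch of a repro, assuming the model is built from its config under a meta-device context (the torch/utils/_device.py frame in the traceback below suggests one is active; the exact repo id is my guess):

import torch
from transformers import AutoConfig, AutoModelForCausalLM

# Assumed repo id. On transformers==4.43.0, instantiating the model without
# weights on the "meta" device produces the traceback below.
config = AutoConfig.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
with torch.device("meta"):
    model = AutoModelForCausalLM.from_config(config)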
File "/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 438, in from_config
return model_class._from_config(config, **kwargs)
File "/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1466, in _from_config
model = cls(config, **kwargs)
File "/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1066, in __init__
self.model = LlamaModel(config)
File "/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 845, in __init__
[LlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
File "/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 845, in <listcomp>
[LlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
File "/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 632, in __init__
self.self_attn = LLAMA_ATTENTION_CLASSES[config._attn_implementation](config=config, layer_idx=layer_idx)
File "/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 306, in __init__
self.rotary_emb = LlamaRotaryEmbedding(config=self.config)
File "/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 119, in __init__
inv_freq, self.attention_scaling = self.rope_init_fn(self.config, device, **self.rope_kwargs)
File "/lib/python3.10/site-packages/transformers/modeling_rope_utils.py", line 330, in _compute_llama3_parameters
if wavelen < high_freq_wavelen:
File "/lib/python3.10/site-packages/torch/utils/_device.py", line 77, in __torch_function__
return func(*args, **kwargs)
NotImplementedError: Could not run 'aten::_local_scalar_dense' with arguments from the 'Meta' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::_local_scalar_dense' is only available for these backends: [CPU, CUDA, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].
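For what it's worth, the failing frame is the `if wavelen < high_freq_wavelen:` check in _compute_llama3_parameters: converting a tensor to a Python bool dispatches aten::_local_scalar_dense, which has no Meta-backend kernel, so the check blows up when the model is built on the meta device. A standalone sketch of that failure mode (the values here are illustrative, not the real RoPE numbers):

import torch

# A Python `if` on a 0-dim tensor comparison calls bool() -> item(), which is
# dispatched to aten::_local_scalar_dense; there is no Meta kernel for it.
wavelen = torch.tensor(3.0, device="meta")
high_freq_wavelen = 2.0
try:
    if wavelen < high_freq_wavelen:
        pass
except (NotImplementedError, RuntimeError) as e:  # exact exception type varies by torch version
    print(e)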
I'm facing the same issue with Llama 3.1 models.
This PR fixes it: https://github.com/huggingface/transformers/pull/32244
This is not part of a release yet, but you can install transformers from the main branch.
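For example, installing directly from source until the next release ships the fix:

pip install git+https://github.com/huggingface/transformers.git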
See also this issue: https://github.com/huggingface/transformers/issues/32187