runtime error

tion` package not found, consider installing for better performance: No module named 'flash_attn'. Current `flash-attenton` does not support `window_size`. Either upgrade or use `attn_implementation='eager'`. Downloading shards: 0%| | 0/2 [00:00<?, ?it/s] Downloading shards: 50%|█████ | 1/2 [00:11<00:11, 11.79s/it] Downloading shards: 100%|██████████| 2/2 [00:27<00:00, 14.34s/it] Downloading shards: 100%|██████████| 2/2 [00:27<00:00, 13.96s/it] Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s] Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s] Traceback (most recent call last): File "/home/user/app/app.py", line 10, in <module> model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map='cuda', trust_remote_code=True) File "/usr/local/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained return model_class.from_pretrained( File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained ) = cls._load_pretrained_model( File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model( File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs) File "/usr/local/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 399, in set_module_tensor_to_device new_value = value.to(device) File "/usr/local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 302, in _lazy_init torch._C._cuda_init() RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx

Container logs:

Fetching error logs...