Could you update the model.safetensors.index.json file, please?
It gives the error:
FileNotFoundError: No such file or directory: ".../miqu-1-70b-sf-4.25bpw-h6-exl2/model-00001-of-00029.safetensors"
exllamav2 loaders do not need this file. What are you using to load the model? If you use exllamav2, ooba's text-generation-webui, or tabbyAPI, it should work as-is.
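For context on where this error comes from: loaders in the Transformers style read model.safetensors.index.json, whose weight_map maps every tensor name to a shard file, and then open each shard it names; a shard listed there but missing on disk raises exactly this FileNotFoundError. A minimal sketch with a hypothetical, abbreviated index (the real file maps thousands of tensors):

```python
import json

# Hypothetical, cut-down example of a model.safetensors.index.json;
# the real file's weight_map lists every tensor in the checkpoint.
index_text = """
{
  "metadata": {"total_size": 137953296384},
  "weight_map": {
    "model.embed_tokens.weight": "model-00001-of-00029.safetensors",
    "lm_head.weight": "model-00029-of-00029.safetensors"
  }
}
"""
index = json.loads(index_text)

# A Transformers-style loader collects the shard filenames from
# weight_map and opens each one; any shard missing on disk triggers
# a FileNotFoundError like the one above.
shards = sorted(set(index["weight_map"].values()))
print(shards)
```

This is why exllamav2-based loaders are unaffected: they read the quantized weights directly rather than resolving shards through this index.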
Hello, Lone Striker! Thanks for the cool model! I tried to run the SillyTavern Colab with this model, but got the same error.
"FileNotFoundError: No such file or directory: ".../miqu-1-70b-sf-4.25bpw-h6-exl2/model-00001-of-00029.safetensors"
(https://colab.research.google.com/github/TavernAI/TavernAI/blob/main/colab/GPU.ipynb#scrollTo=kzzNdltKQsY3)
The Oobabooga Colab crashes with OutOfMemory:
https://colab.research.google.com/github/oobabooga/text-generation-webui/blob/main/Colab-TextGen-GPU.ipynb#scrollTo=LGQ8BiMuXMDG
(args: --n-gpu-layers 128 --use_double_quant)
OutOfMemoryError: CUDA out of memory. Tried to allocate 34.00 MiB. GPU 0 has a total capacty of
14.75 GiB of which 27.06 MiB is free. Process 34959 has 14.72 GiB memory in use. Of the allocated
memory 14.49 GiB is allocated by PyTorch, and 120.95 MiB is reserved by PyTorch but unallocated. If
reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See
documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
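The traceback suggests tuning `max_split_size_mb`. For what it's worth, that only mitigates fragmentation and must be set before PyTorch initializes CUDA; it cannot help when the model simply does not fit (as turns out to be the case below). A sketch, with 128 as an illustrative value rather than a recommendation from this thread:

```python
import os

# The allocator reads this variable when CUDA is first initialized,
# so it must be set before importing torch / running the webui.
# 128 MiB is just an example split size.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

Equivalently, export `PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128` in the shell before launching.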
Please, can you tell me what to do? Is it possible to run your model in Colab so that I can connect my tavern to its API?
Are you using a free Colab account? This model is too big for the free tier. For the 4.25bpw model, you're looking at over 48 GB of VRAM to run it. Even the 2.4bpw model needs at least 24 GB of VRAM.
You should be able to load this model in 32 GB of VRAM. 64 GB would be more than enough.
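The figures above can be sanity-checked with back-of-the-envelope arithmetic: weight memory for an exl2 quant is roughly parameter count times bits per weight divided by 8, with extra headroom needed on top for the KV cache and activations. A rough sketch (estimates only, not exact figures for this quant):

```python
# Approximate weight-only VRAM for a 70B model at a given exl2
# bits-per-weight; real usage is higher due to KV cache, activations,
# and allocator overhead.
PARAMS = 70e9

def weight_gib(bpw: float) -> float:
    """Weight memory in GiB: params * bits / 8 bytes, in GiB."""
    return PARAMS * bpw / 8 / 2**30

print(f"4.25 bpw: ~{weight_gib(4.25):.1f} GiB of weights")
print(f"2.40 bpw: ~{weight_gib(2.40):.1f} GiB of weights")
```

That puts 4.25bpw weights in the mid-30s of GiB before any context, which is far beyond the ~15 GB GPU in the free Colab OOM trace above.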