maderix/llama-65b-4bit


maderix committed on Mar 14, 2023

Commit 6515d51 • 1 Parent(s): efd6696

Create README.md

Files changed (1)

README.md +17 -0

README.md ADDED

	@@ -0,0 +1,17 @@

+---
+language:
+- en
+library_name: transformers
+---
+Converted with https://github.com/qwopqwop200/GPTQ-for-LLaMa
+All models were tested on an A100-80G GPU.
+Installation instructions, as given in the repo above:
+1. Install Anaconda and create a conda environment with Python 3.8
+2. Install PyTorch (tested with torch-1.13-cu116)
+3. Install the Transformers library (you'll need the latest version, including this PR: https://github.com/huggingface/transformers/pull/21955).
+4. Install sentencepiece with pip
+5. Run `python setup_cuda.py install` inside that environment
+6. Either convert the LLaMA models yourself, following the instructions in the GPTQ-for-LLaMa repo,
+7. or use these weights directly by downloading the files individually, as described in the Hub download guide (https://huggingface.co/docs/huggingface_hub/guides/download)
+8. Profit!
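
The steps above could be sanity-checked and partly automated with a short Python script. This is only a sketch under stated assumptions: the function names `check_environment` and `download_weight_file` are illustrative and not part of the README or the GPTQ-for-LLaMa repo, the `filename` argument is a placeholder for a real file name from this repository's file listing, and the default repo id is taken from the page title. The download helper relies on `hf_hub_download` from the `huggingface_hub` package, which is the function the linked download guide describes.

```python
import sys
from importlib import metadata

# Packages named in steps 2-4 of the README (pip distribution names).
REQUIRED = ("torch", "transformers", "sentencepiece")


def check_environment(packages=REQUIRED):
    """Return {name: installed version or None}, plus the Python version."""
    report = {"python": "%d.%d" % sys.version_info[:2]}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None  # not installed in this environment
    return report


def download_weight_file(filename, repo_id="maderix/llama-65b-4bit"):
    """Download a single file from the Hub and return its local cache path.

    `filename` is a placeholder (assumption): look up the actual file
    names in the repository's file listing before calling this.
    """
    # Imported lazily so the environment check works without this package.
    from huggingface_hub import hf_hub_download  # pip install huggingface_hub

    return hf_hub_download(repo_id=repo_id, filename=filename)


if __name__ == "__main__":
    # Print what is installed; the README reports testing with
    # Python 3.8 and torch-1.13-cu116.
    for name, version in check_environment().items():
        print(f"{name}: {version or 'MISSING'}")
```

Running the script prints one line per package, so a missing dependency shows up before the CUDA extension is built. Calling `download_weight_file` with a real file name fetches only that file, which matches the "individually downloading" wording in step 7.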