Commit baf09b1 by TheBloke
Parent: 08bfd47

Update README.md

Files changed (1): README.md (+21 -21)
README.md CHANGED
@@ -159,38 +159,40 @@ It is strongly recommended to use the text-generation-webui one-click-installers

### Install the necessary packages

- Requires: Transformers 4.32.0 or later, Optimum 1.12.0 or later, and AutoGPTQ 0.4.2 or later.
+ Requires: Transformers 4.33.0 or later, Optimum 1.12.0 or later, and AutoGPTQ compiled from source with a patch.

```shell
- pip3 install transformers>=4.32.0 optimum>=1.12.0
- pip3 install auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/ # Use cu117 if on CUDA 11.7
+ pip3 install transformers>=4.33.0 optimum>=1.12.0
+ pip3 uninstall -y auto-gptq
+ git clone -b TB_Latest_Falcon https://github.com/TheBloke/AutoGPTQ
+ cd AutoGPTQ
+ pip3 install .
```
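A quick way to confirm the requirements above are met is to print the installed versions. This is a minimal sketch, not part of the original README; it uses only the standard library:

```python
# Minimal sketch: print the installed versions of the packages required above.
# AutoGPTQ must be the patched source build from the TB_Latest_Falcon branch.
from importlib.metadata import version, PackageNotFoundError

for pkg, minimum in [("transformers", ">=4.33.0"),
                     ("optimum", ">=1.12.0"),
                     ("auto-gptq", "patched source build")]:
    try:
        print(f"{pkg}: {version(pkg)} (need {minimum})")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```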

- If you have problems installing AutoGPTQ using the pre-built wheels, install it from source instead:

```shell
- pip3 uninstall -y auto-gptq
- git clone https://github.com/PanQiWei/AutoGPTQ
- cd AutoGPTQ
- pip3 install .
```

- ### For CodeLlama models only: you must use Transformers 4.33.0 or later.

- If 4.33.0 is not yet released when you read this, you will need to install Transformers from source:

```shell
- pip3 uninstall -y transformers
- pip3 install git+https://github.com/huggingface/transformers.git
```

- ### You can then use the following code
+ ### You then need to manually download the repo so it can be merged
+
+ I recommend using my fast download script

```shell
+ git clone https://github.com/TheBlokeAI/AIScripts
+ python3 AIScripts/hub_download.py TheBloke/Falcon-180B-Chat-GPTQ Falcon-180B-Chat-GPTQ --branch main # change branch if you want to use the 3-bit model instead
```
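If you prefer not to clone AIScripts, huggingface_hub can perform an equivalent download. A minimal sketch, assuming the target directory name matches the join step below:

```python
# Alternative sketch: fetch every file in the repo at a given revision
# with huggingface_hub, instead of the AIScripts helper above.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="TheBloke/Falcon-180B-Chat-GPTQ",
    revision="main",                    # change branch for the 3-bit model
    local_dir="Falcon-180B-Chat-GPTQ",  # assumed to match the directory used below
)
```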

+ ### Now join the files

```shell
+ cd Falcon-180B-Chat-GPTQ
+ # Windows users: see the command to use in the Description at the top of this README
+ cat model.safetensors-split-* > model.safetensors && rm model.safetensors-split-*
```
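On platforms without cat, the same join can be done in Python. A minimal sketch, assuming the split suffixes sort lexically in the right order:

```python
# Minimal sketch: concatenate the split files into one model.safetensors,
# equivalent to the cat command above. Run inside Falcon-180B-Chat-GPTQ.
import glob
import os

parts = sorted(glob.glob("model.safetensors-split-*"))  # assumes lexical order is correct
with open("model.safetensors", "wb") as out:
    for part in parts:
        with open(part, "rb") as f:
            # Stream in 64 MB chunks so the shards never need to fit in RAM.
            while chunk := f.read(64 * 1024 * 1024):
                out.write(chunk)
for part in parts:
    os.remove(part)
```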

+ ### And then finally you can run the following code

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

- model_name_or_path = "TheBloke/Falcon-180B-Chat-GPTQ"
- # To use a different branch, change revision
- # For example: revision="gptq-3bit--1g-actorder_True"
+ model_name_or_path = "/path/to/Falcon-180B-Chat-GPTQ" # change this to the path you downloaded the model to
+
model = AutoModelForCausalLM.from_pretrained(model_name_or_path,
                                             device_map="auto",
                                             revision="main")
@@ -199,9 +201,7 @@ tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

prompt = "Tell me about AI"
prompt_template=f'''User: {prompt}
- Assistant:
-
- '''
+ Assistant: '''

print("\n\n*** Generate:")

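For reference, the fragments shown in these hunks assemble into roughly the following end-to-end example. The pipeline settings (max_new_tokens, do_sample, temperature) are illustrative assumptions, not values from the original README:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_name_or_path = "/path/to/Falcon-180B-Chat-GPTQ"  # local path from the steps above

model = AutoModelForCausalLM.from_pretrained(model_name_or_path,
                                             device_map="auto",
                                             revision="main")
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

prompt = "Tell me about AI"
prompt_template = f'''User: {prompt}
Assistant: '''

# Generation settings below are assumptions for illustration.
pipe = pipeline("text-generation",
                model=model,
                tokenizer=tokenizer,
                max_new_tokens=512,
                do_sample=True,
                temperature=0.7)
print(pipe(prompt_template)[0]['generated_text'])
```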
@@ -229,9 +229,9 @@ print(pipe(prompt_template)[0]['generated_text'])
229
  <!-- README_GPTQ.md-compatibility start -->
230
  ## Compatibility
231
 
232
- The files provided have not yet been tested.
233
 
234
- [Huggingface Text Generation Inference (TGI)](https://github.com/huggingface/text-generation-inference) is compatible with all GPTQ models, but hasn't yet been tested with these files.
235
  <!-- README_GPTQ.md-compatibility end -->
236
 
237
  <!-- footer start -->
 