---
license: apache-2.0
language:
- en
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
pipeline_tag: image-text-to-text
---

# Llama 3.1 Vision by Capx AI

![image/png](https://cdn-uploads.huggingface.co/production/uploads/644bf6ef778ecbfb977e8e84/3D-oR8GazhHTaA-kVLNDk.png)

Read more on: https://huggingface.co/blog/adarshxs/capx-vision

## Directions to Run Inference
**The minimum requirement to run inference is an A100 40GB GPU.**
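
A quick way to confirm which GPU is visible and how much memory it has (optional; assumes `nvidia-smi` is available on the machine):

```bash
# Optional: list the detected GPU and its total memory
nvidia-smi --query-gpu=name,memory.total --format=csv
```
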
- Clone our fork of the Bunny by BAAI repository here: https://github.com/adarshxs/Capx-Llama3.1-Vision
- Create a conda virtual environment:
```bash
conda create -n capx python=3.10
conda activate capx
```
- Install the dependencies:
```bash
pip install --upgrade pip  # enable PEP 660 support
pip install transformers
pip install torch torchvision xformers --index-url https://download.pytorch.org/whl/cu118

# Install NVIDIA Apex
pip install ninja
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --global-option="--cpp_ext" --global-option="--cuda_ext" ./
cd ..

# Install FlashAttention
pip install packaging
pip install flash-attn --no-build-isolation

# Clone the inference repo and install it
git clone https://github.com/adarshxs/Capx-Llama3.1-Vision
cd Capx-Llama3.1-Vision
pip install -e .
```
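
After the installs finish, a quick import check can catch a broken build before loading the model (optional sketch; the package names are exactly those installed in the step above):

```bash
# Optional: confirm the compiled extensions import cleanly
python -c "import torch; print('torch', torch.__version__, '| CUDA available:', torch.cuda.is_available())"
python -c "import flash_attn; print('flash-attn OK')"
python -c "import apex; print('apex OK')"
```
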
- Run the CLI server:
```bash
python -m bunny.serve.cli \
    --model-path Capx/Llama-3.1-Vision \
    --model-type llama3.1-8b \
    --image-file /path/to/image \
    --conv-mode llama
```
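
Any local image works for `--image-file`. As one concrete example, you could download the banner image linked above and point the CLI at it (the filename `test.png` is just an illustration):

```bash
# Example run using the card's banner image as a test input
wget -O test.png https://cdn-uploads.huggingface.co/production/uploads/644bf6ef778ecbfb977e8e84/3D-oR8GazhHTaA-kVLNDk.png
python -m bunny.serve.cli \
    --model-path Capx/Llama-3.1-Vision \
    --model-type llama3.1-8b \
    --image-file ./test.png \
    --conv-mode llama
```
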

We thank the amazing team at BAAI for their Bunny project, upon which this work was built, and Meta AI for their Llama 3.1 model!