---
license: apache-2.0
language:
  - en
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
pipeline_tag: image-text-to-text
---

# Llama 3.1 Vision by Capx AI


## Directions to Run Inference

The minimum requirement for running inference is a single A100 40GB GPU.
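Before installing anything, you can confirm that a suitable GPU is visible (this assumes the NVIDIA driver and its `nvidia-smi` tool are already present):

```bash
# List visible GPUs with their total memory; expect an A100 with ~40 GB.
nvidia-smi --query-gpu=name,memory.total --format=csv
```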

- Clone our fork of BAAI's Bunny repository: https://github.com/adarshxs/Capx-Llama-3.1-Carrot
- Create and activate a conda virtual environment:

  ```bash
  conda create -n capx python=3.10
  conda activate capx
  ```
- Install the dependencies, then the inference repo itself (a post-install sanity check is shown after this list):

  ```bash
  pip install --upgrade pip  # enable PEP 660 support
  pip install transformers
  pip install torch torchvision xformers --index-url https://download.pytorch.org/whl/cu118

  # Install NVIDIA Apex
  pip install ninja
  git clone https://github.com/NVIDIA/apex
  cd apex
  pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --global-option="--cpp_ext" --global-option="--cuda_ext" ./
  cd ..

  # Install FlashAttention
  pip install packaging
  pip install flash-attn --no-build-isolation

  # Clone the inference repo and install it in editable mode
  git clone https://github.com/adarshxs/Capx-Llama3.1-Vision
  cd Capx-Llama3.1-Vision
  pip install -e .
  ```
    
- Run the CLI server (an optional note on pre-downloading the weights follows this list):

  ```bash
  python -m bunny.serve.cli \
    --model-path Capx/Llama-3.1-Vision \
    --model-type llama3.1-8b \
    --image-file /path/to/image \
    --conv-mode llama
  ```
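
After the install steps, a quick sanity check confirms that the CUDA build of PyTorch and the compiled extensions import cleanly; a minimal sketch, assuming the cu118 wheels and source builds from the steps above:

```bash
# Verify PyTorch sees the GPU, then that flash-attn and Apex import.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import flash_attn; print('flash-attn', flash_attn.__version__)"
python -c "import apex; print('apex OK')"
```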
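
Optionally, you can pre-download the weights so the first CLI run does not block on a large download; a sketch assuming the `huggingface_hub` CLI (the repo id matches `--model-path` above):

```bash
# Fetch the model repo into the local Hugging Face cache.
pip install -U "huggingface_hub[cli]"
huggingface-cli download Capx/Llama-3.1-Vision
```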
    

We thank the amazing team at BAAI for their Bunny project, upon which this work was built, and Meta AI for their Llama 3.1 model!