Notebook to test Llama 2 in Colab free tier
#3 opened by r3gm
Using the llama-cpp-python library
https://github.com/R3gm/InsightSolver-Colab/blob/main/LLM_Inference_with_llama_cpp_python__Llama_2_13b_chat.ipynb
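For context, here is a minimal sketch of the kind of call the notebook builds up to; the model filename, prompt, and n_gpu_layers value are my assumptions, and on the Colab free-tier GPU you would first install the wheel with cuBLAS enabled (CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python):

from llama_cpp import Llama

llm = Llama(
    model_path="./llama-2-13b-chat.ggmlv3.q4_K_M.bin",  # hypothetical path to a quantized GGML model
    n_gpu_layers=35,  # layers offloaded to the GPU; tune to fit VRAM
)
out = llm("Q: Name the planets in the solar system. A:", max_tokens=64, stop=["Q:"])
print(out["choices"][0]["text"])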
@r3gm, hi! Can you also show a CPU-only example for the Llama 2 13B models?
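llama-cpp-python runs on the CPU by default, so no special build is needed; a minimal CPU sketch for a 13B model (the model path and thread count are assumptions):

from llama_cpp import Llama

llm = Llama(
    model_path="./llama-2-13b-chat.ggmlv3.q4_K_M.bin",  # hypothetical path; any quantized 13B GGML file works
    n_threads=8,  # set to your physical core count
)
out = llm("Q: What is the capital of France? A:", max_tokens=32)
print(out["choices"][0]["text"])

A 4-bit quantized 13B model needs roughly 10 GB of RAM, so it fits on most machines, but expect generation to be much slower than on a GPU.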
@r3gm, any pointers on how to compile for Metal and run locally on an M2? Thanks.
Follow this guide: https://llama-cpp-python.readthedocs.io/en/latest/install/macos/
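In short (a sketch based on that guide, with the model path as an assumption): reinstall the wheel with Metal enabled, e.g. CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install llama-cpp-python, then offload to the GPU at load time:

from llama_cpp import Llama

# With a Metal build, a nonzero n_gpu_layers routes computation through the M2's GPU
llm = Llama(
    model_path="./llama-2-13b-chat.ggmlv3.q4_K_M.bin",  # hypothetical local path
    n_gpu_layers=1,
)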
Is there some way we could increase the context length? Currently it's 512 tokens. @r3gm
Yes, you pass the n_ctx argument, like so (using the llama-cpp-python Llama class directly):

from llama_cpp import Llama

llm = Llama(model_path="./models/7B/llama-2-7b-chat.ggmlv3.q4_K_M.bin", n_ctx=2048)
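Note that n_ctx covers the prompt plus the generated tokens, and Llama 2 models were trained with a 4096-token window, so n_ctx can be raised to 4096 but not usefully beyond. A quick way to check how many tokens a prompt uses (reusing the llm object above; tokenize takes bytes):

print(len(llm.tokenize(b"How long is this prompt?")))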
oh thanks