Notebook to test Llama 2 in Colab free tier
#3 opened by r3gm
Using the llama-cpp-python library
https://github.com/R3gm/InsightSolver-Colab/blob/main/LLM_Inference_with_llama_cpp_python__Llama_2_13b_chat.ipynb
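For context, here is a minimal sketch of the kind of call the notebook builds up to; the model filename, prompt, and n_gpu_layers value are my assumptions, and on the Colab free-tier GPU you would first install the wheel with cuBLAS enabled (CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python):

from llama_cpp import Llama

llm = Llama(
    model_path="./llama-2-13b-chat.ggmlv3.q4_K_M.bin",  # hypothetical path to a quantized GGML model
    n_gpu_layers=35,  # layers offloaded to the GPU; tune to fit VRAM
)
out = llm("Q: Name the planets in the solar system. A:", max_tokens=64, stop=["Q:"])
print(out["choices"][0]["text"])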
@r3gm, hi! Can you also show a CPU-only example for the Llama 2 13B models?
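llama-cpp-python runs on the CPU by default, so no special build is needed; a minimal CPU sketch for a 13B model (the model path and thread count are assumptions):

from llama_cpp import Llama

llm = Llama(
    model_path="./llama-2-13b-chat.ggmlv3.q4_K_M.bin",  # hypothetical path; any quantized 13B GGML file works
    n_threads=8,  # set to your physical core count
)
out = llm("Q: What is the capital of France? A:", max_tokens=32)
print(out["choices"][0]["text"])

A 4-bit quantized 13B model needs roughly 10 GB of RAM, so it fits on most machines, but expect generation to be much slower than on a GPU.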
@r3gm, any pointers on how to compile for Metal and run locally on an M2? Thanks.
Follow this guide: https://llama-cpp-python.readthedocs.io/en/latest/install/macos/
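In short (a sketch based on that guide, with the model path as an assumption): reinstall the wheel with Metal enabled, e.g. CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install llama-cpp-python, then offload to the GPU at load time:

from llama_cpp import Llama

# With a Metal build, a nonzero n_gpu_layers routes computation through the M2's GPU
llm = Llama(
    model_path="./llama-2-13b-chat.ggmlv3.q4_K_M.bin",  # hypothetical local path
    n_gpu_layers=1,
)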
Is there some way we could increase the context length? Currently it's 512 tokens. @r3gm
Yes, you pass the n_ctx argument, like so (using the llama-cpp-python Llama class directly):

from llama_cpp import Llama

llm = Llama(model_path="./models/7B/llama-2-7b-chat.ggmlv3.q4_K_M.bin", n_ctx=2048)
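Note that n_ctx covers the prompt plus the generated tokens, and Llama 2 models were trained with a 4096-token window, so n_ctx can be raised to 4096 but not usefully beyond. A quick way to check how many tokens a prompt uses (reusing the llm object above; tokenize takes bytes):

print(len(llm.tokenize(b"How long is this prompt?")))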
oh thanks