Open LLama 13B Open Instruct
- Model creator: VMware
- Original model: Open LLama 13B Open Instruct
Description
This repo contains the GGUF model files for Open LLama 13B Open Instruct.
These files are compatible with llama.cpp.
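As a rough sketch of how one of these GGUF files might be run from Python, the snippet below uses the llama-cpp-python bindings for llama.cpp. The file path in the usage comment is a placeholder rather than a file name taken from this repo, and the context size, token limit, and stop string are assumptions chosen for illustration. The helper names load_model and complete are hypothetical.

```python
from typing import Any, Callable, Dict


def load_model(gguf_path: str, n_ctx: int = 2048) -> Any:
    """Load a GGUF file through llama-cpp-python.

    The import is done lazily so the rest of this module can be used
    without the package installed (pip install llama-cpp-python).
    n_ctx=2048 is an assumed context size, not a value from this repo.
    """
    from llama_cpp import Llama

    return Llama(model_path=gguf_path, n_ctx=n_ctx)


def complete(llm: Callable[..., Dict[str, Any]], prompt: str, max_tokens: int = 256) -> str:
    """Run one completion and return only the generated text.

    `llm` is expected to behave like a llama_cpp.Llama instance: calling it
    returns a dict whose "choices" list holds the generated "text".
    """
    result = llm(prompt, max_tokens=max_tokens, stop=["### Instruction:"])
    return result["choices"][0]["text"].strip()


# Usage (placeholder path, replace with a real GGUF file from this repo):
#   llm = load_model("path/to/model.gguf")
#   print(complete(llm, "### Instruction:\nSay hello\n\n### Response:"))
```

Keeping the loading step separate from the completion step makes it easy to load the model once and reuse it across prompts.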
VMware/open-llama-13B-open-instruct
Instruction-tuned version of the fully trained Open LLama 13B model. The model is open for COMMERCIAL USE.
NOTE: The model was trained using the Alpaca prompt template.
NOTE: The fast tokenizer produces incorrect encodings; set use_fast=False when instantiating the tokenizer.
NOTE: The model might struggle with code because the tokenizer merges multiple spaces.
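Since the model expects Alpaca-style prompts, a small helper can keep the formatting consistent across calls. The sketch below reuses the template wording from the Transformers example in this card; the names ALPACA_TEMPLATE and build_prompt are hypothetical, and stripping surrounding whitespace from the instruction is an assumption, not something the model card requires.

```python
# Alpaca-style template, with wording taken from the Transformers example in this card.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
    "\n\n### Instruction:\n{instruction}\n\n### Response:"
)


def build_prompt(instruction: str) -> str:
    """Wrap a plain instruction in the Alpaca prompt template."""
    # Stripping whitespace is an illustrative choice (assumption).
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


print(build_prompt("Explain in simple terms how attention works"))
```

The resulting string can be passed directly to the tokenizer or to any other runtime that accepts a raw text prompt.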
License
- Commercially viable
- Instruction dataset: VMware/open-instruct-v1-oasst-dolly-hhrlhf, under cc-by-sa-3.0
- Language model: openlm-research/open_llama_13b, under apache-2.0
Nomenclature
- Model: Open-llama
- Model size: 13B parameters
- Dataset: Open-instruct-v1 (oasst, dolly, hhrlhf)
Use in Transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = 'VMware/open-llama-13b-open-instruct'

# The fast tokenizer encodes incorrectly for this model, so load the slow one.
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map='sequential')

# Alpaca prompt template used during training.
prompt_template = "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:"

prompt = 'Explain in simple terms how the attention mechanism of a transformer model works'
input_text = prompt_template.format(instruction=prompt)
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

# Generate, then drop the prompt tokens so only the response is decoded.
output_ids = model.generate(input_ids, max_length=512)
input_length = input_ids.shape[1]
output = tokenizer.decode(output_ids[0, input_length:])
print(output)
Finetuning details
The finetuning scripts will be available in our RAIL GitHub repository.
Evaluation
TODO