Navarasa

A collection of Gemma 7B/2B Indic Navarasa finetuned models.
This model is based on google/gemma-7b and has been LoRA finetuned on instruction datasets covering 9 Indian languages as well as English.
The model was finetuned with the unsloth library, and we provide unsloth-based inference code for faster generation; alternatively, you can run inference with the Hugging Face library. Training used approximately 500K instruction samples.
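For context, below is a minimal sketch of how such a LoRA finetune is typically set up with unsloth; the rank, alpha, and target modules shown are illustrative assumptions, not the exact Navarasa training configuration (see the Code Repository for the actual scripts).

```python
from unsloth import FastLanguageModel

# Load the base model (google/gemma-7b is gated; pass your HF token if required).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "google/gemma-7b",
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)

# Attach LoRA adapters. r, lora_alpha, and target_modules here are
# assumed values for illustration only.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    lora_alpha = 16,
    lora_dropout = 0,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)
```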
Install unsloth:

```
!pip install "unsloth[colab-ampere] @ git+https://github.com/unslothai/unsloth.git"
```
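On pre-Ampere GPUs such as T4 or V100, unsloth's install instructions use the plain colab extra instead (an assumption based on unsloth's published install docs; check the unsloth README for your setup):

```
!pip install "unsloth[colab] @ git+https://github.com/unslothai/unsloth.git"
```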
The instruction data and prompts follow this format:

```
### Instruction: {instruction}
### Input: {input}
### Response: {response}
```
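For example, a filled-in prompt (with a hypothetical model completion, shown only to illustrate the format) looks like this:

```
### Instruction: Translate the following sentence to Hindi.
### Input: This model is developed by Telugu LLM Labs
### Response: यह मॉडल Telugu LLM Labs द्वारा विकसित किया गया है।
```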
Inference with unsloth:

```python
from unsloth import FastLanguageModel
import torch

max_seq_length = 2048
dtype = None  # None for auto detection; Float16 for Tesla T4/V100, Bfloat16 for Ampere+
load_in_4bit = False

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Telugu-LLM-Labs/Indic-gemma-7b-finetuned-sft-Navarasa",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    device_map = "auto",
)
FastLanguageModel.for_inference(model)  # Enable native 2x faster inference

input_prompt = """
### Instruction:
{}
### Input:
{}
### Response:
{}"""

input_text = input_prompt.format(
    "Translate the following sentence to Hindi.",  # instruction
    "This model is developed by Telugu LLM Labs",  # input
    "",  # output - leave this blank for generation!
)

inputs = tokenizer([input_text], return_tensors = "pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens = 300, use_cache = True)
response = tokenizer.batch_decode(outputs)[0]
```
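The decoded string echoes the full prompt, so a small post-processing step (a sketch, assuming the template above) can isolate just the generated answer:

```python
# Everything after the "### Response:" marker is the model's answer;
# strip Gemma's <eos> token if it appears at the end.
answer = response.split("### Response:")[-1].replace("<eos>", "").strip()
print(answer)
```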
Inference with the Hugging Face library:

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

hf_token = "<your_hf_token>"  # access token; google/gemma-7b is a gated model

model = AutoPeftModelForCausalLM.from_pretrained(
    "Telugu-LLM-Labs/Indic-gemma-7b-finetuned-sft-Navarasa",
    load_in_4bit = False,
    token = hf_token,
    device_map = "auto",  # place weights on GPU to match the CUDA inputs below
)
tokenizer = AutoTokenizer.from_pretrained("Telugu-LLM-Labs/Indic-gemma-7b-finetuned-sft-Navarasa")

input_prompt = """
### Instruction:
{}
### Input:
{}
### Response:
{}"""

input_text = input_prompt.format(
    "Translate the following sentence to Hindi.",  # instruction
    "This model is developed by Telugu LLM Labs",  # input
    "",  # output - leave this blank for generation!
)

inputs = tokenizer([input_text], return_tensors = "pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens = 300, use_cache = True)
response = tokenizer.batch_decode(outputs)[0]
```
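To fit the 7B model on smaller GPUs, 4-bit loading should also work (a sketch, assuming bitsandbytes is installed; quantization may slightly affect output quality):

```python
# Same load path as above, but with weights quantized to 4 bits via bitsandbytes.
model = AutoPeftModelForCausalLM.from_pretrained(
    "Telugu-LLM-Labs/Indic-gemma-7b-finetuned-sft-Navarasa",
    load_in_4bit = True,
    token = hf_token,
    device_map = "auto",
)
```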
Refer to the blog post for sample outputs.
Please check our Code Repository for training and inference scripts.
The model is a collaborative effort by Ravi Theja and Ramsri Goutham. Feel free to DM either of us if you have any questions.
Base model: google/gemma-7b