Model Card for ppm-pt/gemma-2-nb-2b-it
Model Details
- Model Name: ppm-pt/gemma-2-nb-2b-it
- Base Model: google/gemma-2-2b-it
- Fine-Tuned On: NbAiLab/norwegian-alpaca
- Model Type: Instruction-tuned Causal Language Model
- Architecture: Transformer-based
- Languages: Multilingual (base model), Norwegian (fine-tuned)
Model Description
This model is a fine-tuned version of Google's gemma-2-2b-it, the instruction-tuned 2B-parameter variant of the Gemma 2 family. The "it" suffix stands for "instruction-tuned", indicating that the base model has been trained to follow instructions across a range of tasks and languages. Fine-tuning was performed on the NbAiLab/norwegian-alpaca dataset, which contains Norwegian instruction-following data.
The model leverages parameter-efficient fine-tuning (PEFT) techniques, specifically Low-Rank Adaptation (LoRA), and utilizes 4-bit quantization to optimize memory usage and computational efficiency.
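For inference on limited GPU memory, the model can also be loaded with the same 4-bit NF4 quantization used during fine-tuning. A minimal sketch (quantized loading is optional; the plain loading shown under "How to Get Started with the Model" also works):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization, mirroring the fine-tuning setup listed under Training Details
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained("ppm-pt/gemma-2-nb-2b-it")
model = AutoModelForCausalLM.from_pretrained(
    "ppm-pt/gemma-2-nb-2b-it",
    quantization_config=bnb_config,
    device_map="auto",
)
```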
- Developed by: sbdyrnes
- Model type: Instruction-tuned causal language model
- Language(s) (NLP): The Gemma base model's languages; fine-tuned for Norwegian Bokmål
- License: Gemma Terms of Use
- Finetuned from model: google/gemma-2-2b-it
Uses
Direct Use
The model is designed for generating Norwegian text for tasks such as:
- Instruction Following: Responding to user instructions in Norwegian.
- Text Generation: Generating coherent and contextually relevant Norwegian text.
- Conversational AI: Building chatbots or virtual assistants that communicate in Norwegian (a chat-template sketch follows this list).
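Gemma instruction-tuned checkpoints ship with a chat template, and this fine-tune is assumed to keep the base model's template. A minimal sketch of conversational use (the Norwegian prompt is purely illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ppm-pt/gemma-2-nb-2b-it")
model = AutoModelForCausalLM.from_pretrained(
    "ppm-pt/gemma-2-nb-2b-it", torch_dtype="auto", device_map="auto"
)

# Build a single-turn conversation with the tokenizer's chat template
messages = [{"role": "user", "content": "Forklar kort hva fotosyntese er."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```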
Out-of-Scope Use
The model is fine-tuned from google/gemma-2-2b-it, so all use must comply with the Gemma Terms of Use. The training data NbAiLab/norwegian-alpaca was generated using GPT-3.5, so this model cannot be used to compete with OpenAI in any way.
Bias, Risks, and Limitations
- Language Proficiency: While the model has been fine-tuned on Norwegian data, the base model may not have been extensively pre-trained on Norwegian. This could lead to less optimal performance compared to models pre-trained specifically on Norwegian text.
- Biases: The model may exhibit biases present in the training data, including cultural or societal biases.
- Inaccurate Information: The model may generate incorrect or outdated information.
- Ethical Considerations: Users should exercise caution when deploying the model in sensitive applications.
- Limitations: Use is bound by the Gemma Terms of Use and by the GPT-3.5 provenance of the training data; see Out-of-Scope Use above.
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model.
How to Get Started with the Model
Use the code below to get started with the model.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("ppm-pt/gemma-2-nb-2b-it", use_fast=True)
model = AutoModelForCausalLM.from_pretrained("ppm-pt/gemma-2-nb-2b-it", torch_dtype="auto")

# Prepare input text
input_text = "Hva er hovedstaden i Norge?"
input_ids = tokenizer.encode(input_text, return_tensors="pt")

# Generate output
output_ids = model.generate(input_ids, max_length=50, num_return_sequences=1)
output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(output_text)
```
Training Details
Training Data
NbAiLab/norwegian-alpaca
Fine-Tuning Procedure
Frameworks Used:
- Transformers
- PEFT
- BitsAndBytes
- Datasets
Training Configuration
```python
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type='nf4',
)

config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=modules,
    lora_dropout=0.05,
    bias='none',
    task_type='CAUSAL_LM',
)
```
Training Arguments:
- Epochs = 3
- Learning Rate = 2e-4
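A minimal sketch of how the configuration and arguments above could fit together in a 🤗 Trainer fine-tuning run. The LoRA target modules, prompt formatting, dataset column names, and batch size are assumptions for illustration; the actual training script is not included in this card:

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_id = "google/gemma-2-2b-it"
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Load the base model in 4-bit, as in the training configuration above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Attach LoRA adapters (target_modules is an assumption; the card only names "modules")
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Format the Alpaca-style examples into a single prompt string and tokenize
dataset = load_dataset("NbAiLab/norwegian-alpaca", split="train")

def to_text(example):
    prompt = example["instruction"]
    if example.get("input"):
        prompt += "\n\n" + example["input"]
    return {"text": prompt + "\n\n" + example["output"]}

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=512)

tokenized = dataset.map(to_text).map(
    tokenize, remove_columns=dataset.column_names + ["text"]
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="gemma-2-nb-2b-it",
        num_train_epochs=3,
        learning_rate=2e-4,
        per_device_train_batch_size=4,  # illustrative; not reported in the card
        bf16=True,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```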
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: NVIDIA RTX 4090
- Hours used: 4
- Compute Region: Norway
Model Card Contact
sbdyrnes