Finetuning
This model is a fine-tuned version of meta-llama/Llama-2-7b-hf on the IITB English-to-Hindi dataset.
- Source language: English
- Target language: Hindi
Model description
meta-llama/Llama-2-7b-hf fine-tuned for English-to-Hindi translation.
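The card does not include an inference snippet; below is a minimal sketch of loading the adapter on top of the base model and translating one sentence. The prompt template is an assumption, since the exact format used during fine-tuning is not stated here.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-hf"
adapter_id = "ritika-kumar/finetuned-llama2-7b-en-hi"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
# Attach the LoRA adapter produced by this fine-tune.
model = PeftModel.from_pretrained(model, adapter_id)

# Assumed prompt format; the template used during training is not documented here.
prompt = "Translate English to Hindi: How are you?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```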
Training and evaluation data
cfilt/iitb-english-hindi
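The dataset can be pulled directly from the Hugging Face Hub; a minimal sketch (the `translation` column layout is the standard one for this dataset, but verify against its card):

```python
from datasets import load_dataset

# IITB English-Hindi parallel corpus used for fine-tuning.
dataset = load_dataset("cfilt/iitb-english-hindi")

# Each example is expected to carry an {"en": ..., "hi": ...} pair.
print(dataset["train"][0]["translation"])
```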
Training hyperparameters
The following hyperparameters were used during training (a TrainingArguments sketch follows the list):
- num_train_epochs=1
- per_device_train_batch_size=4
- per_device_eval_batch_size = 4
- gradient_accumulation_steps=1
- optim="paged_adamw_32bit"
- learning_rate=2e-4
- weight_decay=0.001
- fp16=True
- max_grad_norm=0.3
- max_steps=-1
- warmup_ratio=0.03
- group_by_length=True
- lr_scheduler_type="constant"
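These settings map one-to-one onto `transformers.TrainingArguments`; a minimal sketch, where the output directory is an assumption (it is not given in the card):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",            # assumption: not stated in the card
    num_train_epochs=1,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=1,
    optim="paged_adamw_32bit",
    learning_rate=2e-4,
    weight_decay=0.001,
    fp16=True,
    max_grad_norm=0.3,
    max_steps=-1,                      # -1: train for the full num_train_epochs
    warmup_ratio=0.03,
    group_by_length=True,
    lr_scheduler_type="constant",
)
```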
Benchmark Evaluation
- BLEU score on Tatoeba: 12.605968092174914
- BLEU score on IN-22: 25.893729634826876
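The card does not state how these BLEU scores were computed; one common approach is corpus-level BLEU with sacrebleu, sketched below with placeholder strings (the actual inputs would be the model's translations of the Tatoeba / IN-22 test sets and their Hindi references).

```python
import sacrebleu

# Placeholder data: replace with model outputs and reference translations.
hypotheses = ["मैं घर जा रहा हूँ।"]
references = [["मैं घर जा रहा हूँ।"]]  # one inner list per reference set, aligned with hypotheses

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(bleu.score)
```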
Training procedure
The following bitsandbytes quantization config was used during training (a BitsAndBytesConfig sketch follows the list):
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float16
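The values above correspond to `transformers.BitsAndBytesConfig`; a minimal sketch (the `llm_int8_*` entries listed above are library defaults that only apply to 8-bit loading, so they are omitted):

```python
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float16,
)
# Passed as quantization_config to AutoModelForCausalLM.from_pretrained(...)
# when loading the base model for 4-bit fine-tuning.
```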
Framework versions
- PEFT 0.4.0
- Transformers 4.42.3
- Pytorch 2.1.2
- Datasets 2.20.0
- Tokenizers 0.19.1
Model tree for ritika-kumar/finetuned-llama2-7b-en-hi
- Base model: meta-llama/Llama-2-7b-hf