Gemma-2-9b-it-TR-DPO-V1
Gemma-2-9b-it-TR-DPO-V1 is a fine-tuned version of gemma-2-9b-it, trained on a synthetically generated preference dataset.
Training Info
Base Model: gemma-2-9b-it
Training Data: A synthetically generated preference dataset consisting of 10K samples was used. No proprietary data was utilized.
Training Time: 2 hours on a single NVIDIA H100
QLoRA Configs:
- lora_r: 64
- lora_alpha: 32
- lora_dropout: 0.05
- lora_target_linear: true
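To make the QLoRA settings above more concrete, here is a minimal sketch of how `lora_r` and `lora_alpha` interact. It assumes the standard LoRA formulation, in which the adapter update is scaled by `lora_alpha / lora_r`, and that each adapted linear layer adds `r * (d_in + d_out)` trainable parameters. The helper names and the 4096 x 4096 layer size are hypothetical and chosen only for illustration; they are not taken from the training code.

```python
# QLoRA hyperparameters copied from the list above.
qlora_config = {
    "lora_r": 64,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "lora_target_linear": True,
}


def lora_scaling(config):
    """Standard LoRA scaling factor applied to the adapter update: alpha / r."""
    return config["lora_alpha"] / config["lora_r"]


def adapter_params(d_in, d_out, r):
    """Trainable parameters added to one linear layer: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


scaling = lora_scaling(qlora_config)
print(f"LoRA scaling factor: {scaling}")

# Hypothetical 4096 x 4096 projection, used only to show the order of magnitude.
example_params = adapter_params(4096, 4096, qlora_config["lora_r"])
print(f"Adapter parameters for a 4096 x 4096 layer: {example_params}")
```

With these values the adapter update is scaled by 0.5. The `lora_target_linear: true` setting suggests adapters were attached to all linear layers, so the per-layer count would apply to each of them.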
The goal of fine-tuning was to improve output formatting and content quality in Turkish. The model is not necessarily more capable than the base model, but its outputs tend to be more pleasant to read and more likely to be preferred.
Compared to the base model, Gemma-2-9b-it-TR-DPO-V1 is more fluent and coherent in Turkish, and it can generate more informative and detailed answers for a given instruction.
Note that the model can still generate incorrect or nonsensical outputs, so please verify its outputs before using them.
How to use
Use the code snippet below to run the model:
import torch
import transformers
from transformers import BitsAndBytesConfig

# Load the weights in 4-bit NF4 with double quantization to reduce GPU memory use.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "Metin/Gemma-2-9b-it-TR-DPO-V1"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16, "quantization_config": bnb_config},
    device_map="auto",
)

# Turkish prompt: "How can I check whether an item is in a list in Python?"
messages = [
    {"role": "user", "content": "Python'da bir öğenin bir listede geçip geçmediğini nasıl kontrol edebilirim?"},
]

# Render the conversation with the model's chat template and open the model's turn.
prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Stop on the EOS token or on Gemma's end-of-turn token.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<end_of_turn>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.2,
    top_p=0.9,
)

# Print only the newly generated text, without the prompt.
print(outputs[0]["generated_text"][len(prompt):])
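For readers who want to see what the chat template step produces, the sketch below approximates the Gemma turn format as a plain Python function. `build_gemma_prompt` is a hypothetical helper written for illustration only. It assumes the usual Gemma markers (`<bos>`, `<start_of_turn>`, `<end_of_turn>`) and the mapping of the `assistant` role to `model`. The tokenizer's own `apply_chat_template` remains the authoritative source and should be used in practice.

```python
def build_gemma_prompt(messages, add_generation_prompt=True):
    """Approximate the Gemma chat format for a list of {"role", "content"} messages."""
    # Gemma names the assistant role "model"; other roles are not handled in this sketch.
    role_names = {"user": "user", "assistant": "model"}
    parts = ["<bos>"]
    for message in messages:
        role = role_names[message["role"]]
        content = message["content"].strip()
        parts.append(f"<start_of_turn>{role}\n{content}<end_of_turn>\n")
    if add_generation_prompt:
        # Open the model's turn so that generation continues from here.
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


example_prompt = build_gemma_prompt([{"role": "user", "content": "Merhaba!"}])
print(example_prompt)
```

Each turn ends with `<end_of_turn>`, which is why that token is a natural stopping point for generation in addition to the EOS token.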
OpenLLMTurkishLeaderboard_v0.2 benchmark results
- MMLU_TR_V0.2: 51.69%
- Truthful_QA_TR_V0.2: 54.72%
- ARC_TR_V0.2: 52.82%
- HellaSwag_TR_V0.2: 51.16%
- GSM8K_TR_V0.2: 65.07%
- Winogrande_TR_V0.2: 55.29%
- Average: 55.13%
These scores may differ from what you will get when you run the same benchmarks, as I did not use any inference engine (vLLM, TensorRT-LLM, etc.)
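As a quick consistency check, the reported average can be reproduced from the six individual scores. This sketch assumes the average is a simple unweighted mean, which the card does not state explicitly.

```python
# Benchmark scores (in percent) copied from the list above.
scores = {
    "MMLU_TR_V0.2": 51.69,
    "Truthful_QA_TR_V0.2": 54.72,
    "ARC_TR_V0.2": 52.82,
    "HellaSwag_TR_V0.2": 51.16,
    "GSM8K_TR_V0.2": 65.07,
    "Winogrande_TR_V0.2": 55.29,
}

# Unweighted mean over the six benchmarks (assumed averaging method).
average = sum(scores.values()) / len(scores)
print(f"Unweighted mean: {average:.3f}%")
```

The six scores sum to 330.75, so the mean is 55.125, which agrees with the reported 55.13% after rounding to two decimals.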
Evaluation results (self-reported)
- MMLU_TR_V0.2 (5-shot): 0.517
- Truthful_QA_V0.2 (0-shot): 0.547
- ARC_TR_V0.2 (25-shot): 0.528
- HellaSwag_TR_V0.2 (10-shot): 0.512
- GSM8K_TR_V0.2 (5-shot): 0.651
- Winogrande_TR_V0.2 (5-shot): 0.553