	---
	license: gemma
	language:
	- tr
	base_model:
	- google/gemma-2-9b-it
	pipeline_tag: text-generation

	model-index:
	- name: Gemma-2-9b-it-TR-DPO-V1
	  results:
	  - task:
	      type: multiple-choice
	    dataset:
	      type: multiple-choice
	      name: MMLU_TR_V0.2
	    metrics:
	    - name: 5-shot
	      type: 5-shot
	      value: 0.5169
	      verified: false
	  - task:
	      type: multiple-choice
	    dataset:
	      type: multiple-choice
	      name: Truthful_QA_V0.2
	    metrics:
	    - name: 0-shot
	      type: 0-shot
	      value: 0.5472
	      verified: false
	  - task:
	      type: multiple-choice
	    dataset:
	      type: multiple-choice
	      name: ARC_TR_V0.2
	    metrics:
	    - name: 25-shot
	      type: 25-shot
	      value: 0.5282
	      verified: false
	  - task:
	      type: multiple-choice
	    dataset:
	      type: multiple-choice
	      name: HellaSwag_TR_V0.2
	    metrics:
	    - name: 10-shot
	      type: 10-shot
	      value: 0.5116
	      verified: false
	  - task:
	      type: multiple-choice
	    dataset:
	      type: multiple-choice
	      name: GSM8K_TR_V0.2
	    metrics:
	    - name: 5-shot
	      type: 5-shot
	      value: 0.6507
	      verified: false
	  - task:
	      type: multiple-choice
	    dataset:
	      type: multiple-choice
	      name: Winogrande_TR_V0.2
	    metrics:
	    - name: 5-shot
	      type: 5-shot
	      value: 0.5529
	      verified: false
	---

	<img src="https://huggingface.co/Metin/Gemma-2-9b-it-TR-DPO-V1/resolve/main/gemma2_9b_it_dpo_tr_v1.png"
	alt="Logo of Gemma and country code 'TR' in the bottom right corner" width="420"/>

	# Gemma-2-9b-it-TR-DPO-V1

	Gemma-2-9b-it-TR-DPO-V1 is a fine-tuned version of [gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it).
	It was trained on a synthetically generated preference dataset.

	## Training Info

	- Base Model: [gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it)
	- Training Data: A synthetically generated preference dataset consisting of 10K samples was used. No proprietary data was utilized.
	- Training Time: 2 hours on a single NVIDIA H100

	- QLoRA Configs:
	  - lora_r: 64
	  - lora_alpha: 32
	  - lora_dropout: 0.05
	  - lora_target_linear: true
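
To make the QLoRA values above more concrete, here is a minimal sketch of what they imply under the standard LoRA formulation, where the low-rank update is scaled by `lora_alpha / lora_r`. The `LoraSettings` class and the 4096 x 4096 layer size are hypothetical and only for illustration; this is not the training code used for this model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """LoRA hyperparameters as listed in the Training Info section."""

    r: int = 64            # lora_r: rank of the low-rank update
    alpha: int = 32        # lora_alpha: numerator of the scaling factor
    dropout: float = 0.05  # lora_dropout

    @property
    def scaling(self) -> float:
        # Standard LoRA multiplies the update B @ A by alpha / r.
        return self.alpha / self.r

    def adapter_params(self, d_in: int, d_out: int) -> int:
        # A has shape (r, d_in) and B has shape (d_out, r).
        return self.r * (d_in + d_out)


settings = LoraSettings()

# Hypothetical 4096 x 4096 linear layer, chosen only for illustration.
frozen_params = 4096 * 4096
extra_params = settings.adapter_params(4096, 4096)

print("scaling factor:", settings.scaling)
print("adapter weights for one layer:", extra_params)
print("share of the frozen layer:", extra_params / frozen_params)
```

With `lora_r: 64` and `lora_alpha: 32`, the scaling factor is 0.5, so the adapter update is applied at half strength relative to its raw product.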

	The aim was to fine-tune the model to improve output format and content quality in Turkish. It is not necessarily smarter than the base model, but its outputs are more likable and preferable.

	Compared to the base model, Gemma-2-9b-it-TR-DPO-V1 is more fluent and coherent in Turkish. It can generate more informative and detailed answers for a given instruction.

	Note that the model can still generate incorrect or nonsensical outputs, so please verify them before use.
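
The model name suggests that the preference data was used with Direct Preference Optimization (DPO). As a rough illustration of what training on chosen/rejected pairs optimizes, the sketch below computes the standard DPO loss for a single pair from summed log-probabilities. The function name, the example numbers, and `beta=0.1` are assumptions for illustration; the actual training setup is not described in this card.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value, not taken from this model card
) -> float:
    """Standard DPO loss for one (chosen, rejected) preference pair."""
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    logits = beta * (policy_margin - ref_margin)
    # loss = -log(sigmoid(logits)), written in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# Made-up log-probabilities: the policy prefers the chosen answer
# more strongly than the frozen reference model does.
improved = dpo_loss(-40.0, -60.0, -45.0, -50.0)
# The policy behaves exactly like the reference model.
unchanged = dpo_loss(-45.0, -50.0, -45.0, -50.0)

print(improved < unchanged)
```

The loss falls below `log(2)` only when the fine-tuned model widens the gap between the chosen and rejected answers relative to the reference model.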

	## How to use

	Use the following code snippet to run the model:

	```python
	from transformers import BitsAndBytesConfig
	import transformers
	import torch

	# Load the weights in 4-bit NF4 to reduce GPU memory usage.
	bnb_config = BitsAndBytesConfig(
	    load_in_4bit=True,
	    bnb_4bit_use_double_quant=True,
	    bnb_4bit_quant_type="nf4",
	    bnb_4bit_compute_dtype=torch.bfloat16,
	)

	model_id = "Metin/Gemma-2-9b-it-TR-DPO-V1"

	pipeline = transformers.pipeline(
	    "text-generation",
	    model=model_id,
	    model_kwargs={"torch_dtype": torch.bfloat16, "quantization_config": bnb_config},
	    device_map="auto",
	)

	messages = [
	    {"role": "user", "content": "Python'da bir öğenin bir listede geçip geçmediğini nasıl kontrol edebilirim?"},
	]

	# Render the conversation with the model's chat template.
	prompt = pipeline.tokenizer.apply_chat_template(
	    messages,
	    tokenize=False,
	    add_generation_prompt=True,
	)

	# Stop at the end-of-sequence token or at Gemma's end-of-turn token.
	terminators = [
	    pipeline.tokenizer.eos_token_id,
	    pipeline.tokenizer.convert_tokens_to_ids("<end_of_turn>"),
	]

	outputs = pipeline(
	    prompt,
	    max_new_tokens=512,
	    eos_token_id=terminators,
	    do_sample=True,
	    temperature=0.2,
	    top_p=0.9,
	)

	# Print only the newly generated text, without the prompt.
	print(outputs[0]["generated_text"][len(prompt):])
	```
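
To see roughly what `apply_chat_template` produces for this model family, the sketch below builds a Gemma-style prompt by hand. `format_gemma_prompt` is a hypothetical helper written for illustration; it approximates the Gemma 2 chat format, and the template shipped with the tokenizer remains the authoritative source.

```python
def format_gemma_prompt(messages: list[dict[str, str]]) -> str:
    """Hand-written approximation of the Gemma 2 chat format."""
    parts = ["<bos>"]
    for message in messages:
        # Gemma calls the assistant role "model".
        role = "model" if message["role"] == "assistant" else message["role"]
        content = message["content"].strip()
        parts.append(f"<start_of_turn>{role}\n{content}<end_of_turn>\n")
    # Open a new model turn so generation continues from here.
    parts.append("<start_of_turn>model\n")
    return "".join(parts)


example = format_gemma_prompt([{"role": "user", "content": "Merhaba!"}])
print(example)
```

Each turn ends with `<end_of_turn>`, which is why that token is a natural stopping point during generation.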

	## OpenLLMTurkishLeaderboard_v0.2 benchmark results

	- MMLU_TR_V0.2: 51.69%
	- Truthful_QA_TR_V0.2: 54.72%
	- ARC_TR_V0.2: 52.82%
	- HellaSwag_TR_V0.2: 51.16%
	- GSM8K_TR_V0.2: 65.07%
	- Winogrande_TR_V0.2: 55.29%
	- Average: 55.13%

	These scores may differ from what you get when you run the same benchmarks, because I did not use an inference engine (vLLM, TensorRT-LLM, etc.).
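
The reported average is consistent with the unweighted mean of the six benchmark scores above. A quick check, using `Decimal` so that rounding to two places is not affected by floating-point representation:

```python
from decimal import Decimal, ROUND_HALF_UP

# Scores (in percent) copied from the benchmark list above.
scores = {
    "MMLU_TR_V0.2": Decimal("51.69"),
    "Truthful_QA_TR_V0.2": Decimal("54.72"),
    "ARC_TR_V0.2": Decimal("52.82"),
    "HellaSwag_TR_V0.2": Decimal("51.16"),
    "GSM8K_TR_V0.2": Decimal("65.07"),
    "Winogrande_TR_V0.2": Decimal("55.29"),
}

# Unweighted mean, rounded half-up to two decimal places.
mean = sum(scores.values()) / len(scores)
average = mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(f"Average: {average}%")  # Average: 55.13%
```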