Fine-tuned Qwen/Qwen2.5-0.5B-Instruct Model

Model Overview

This is a fine-tuned version of the Qwen/Qwen2.5-0.5B-Instruct model. It was fine-tuned on the Intel/orca_dpo_pairs dataset using Direct Preference Optimization (DPO) with Low-Rank Adaptation (LoRA).

Note: The fine-tuning followed the instructions in this blog.

Fine-tuning Details

  • Base Model: Qwen/Qwen2.5-0.5B-Instruct
  • Dataset: Intel/orca_dpo_pairs
  • Fine-tuning Method: DPO + LoRA (see the sketch below)
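
For reference, a minimal sketch of what this setup can look like using trl's DPOTrainer together with peft's LoraConfig. The exact trainer arguments vary across trl versions, and the hyperparameters below are illustrative placeholders, not the values actually used for this model:

from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "Qwen/Qwen2.5-0.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Intel/orca_dpo_pairs ships system/question/chosen/rejected columns;
# DPOTrainer expects prompt/chosen/rejected.
dataset = load_dataset("Intel/orca_dpo_pairs", split="train")
dataset = dataset.rename_column("question", "prompt").remove_columns(["system"])

# LoRA: only small low-rank adapter matrices are trained,
# keeping the base weights frozen.
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="Qwen-DPO", beta=0.1),  # beta scales the implicit KL penalty
    train_dataset=dataset,
    processing_class=tokenizer,  # named `tokenizer` in older trl releases
    peft_config=peft_config,
)
trainer.train()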

Usage Instructions

Install Dependencies

Before using this model, make sure the following dependencies are installed (a PyTorch backend is required by the text-generation pipeline):

pip install transformers datasets torch

Load the Model

import transformers
from transformers import AutoTokenizer

# Load the tokenizer from the fine-tuned model repository.
tokenizer = AutoTokenizer.from_pretrained("co-gy/Qwen-DPO")

# Build a chat-formatted prompt using the model's chat template.
messages = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "What is a Large Language Model?"}
]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

pipeline = transformers.pipeline(
    "text-generation",
    model="co-gy/Qwen-DPO",
    tokenizer=tokenizer,
)

# Sample a single completion; note that max_length counts the prompt tokens as well.
sequences = pipeline(
    prompt,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1,
    max_length=200,
)
print(sequences[0]['generated_text'])
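
If you would rather call generate directly instead of going through the pipeline, the following sketch does the equivalent with AutoModelForCausalLM, assuming the same repo id and sampling settings as above:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("co-gy/Qwen-DPO")
model = AutoModelForCausalLM.from_pretrained("co-gy/Qwen-DPO")

messages = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "What is a Large Language Model?"}
]
# Tokenize the chat-formatted prompt directly into input ids.
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        input_ids,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        max_new_tokens=200,  # unlike max_length, this does not count prompt tokens
    )

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))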