
Model Details

Model Description

  • Developed by: Chintan Shah

  • Model type: meta-llama/Llama-3.2-1B-Instruct

  • Finetuned from model: meta-llama/Llama-3.2-1B-Instruct

Training Details

Training Data

mlabonne/orpo-dpo-mix-40k

Training Procedure

ORPO
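For readers unfamiliar with ORPO (Odds Ratio Preference Optimization), the objective adds an odds-ratio penalty to the usual supervised loss on the chosen response. The sketch below is a minimal pure-Python illustration of that formula for a single preference pair, not the training code used for this model. The function names and the weight `lam=0.1` are assumptions for illustration; this card does not state the weight that was used.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp).

    avg_logp is the length-normalized log-likelihood of a response,
    so it must be strictly negative (p < 1).
    """
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(nll_chosen: float,
              avg_logp_chosen: float,
              avg_logp_rejected: float,
              lam: float = 0.1) -> float:
    """ORPO loss for one pair: SFT loss plus lam times the odds-ratio term.

    lam is a hypothetical default; the card does not report it.
    """
    log_odds_ratio = log_odds(avg_logp_chosen) - log_odds(avg_logp_rejected)
    # -log(sigmoid(x)) is the same as log(1 + exp(-x))
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    return nll_chosen + lam * odds_ratio_term
```

The penalty shrinks as the chosen response becomes more likely than the rejected one, which is what pushes the model toward the preferred answers in a single training stage.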

Training Parameters

Training Arguments:

  • Learning Rate: 1e-5
  • Batch Size: 1
  • max_steps: 1
  • Block Size: 512
  • Warmup Ratio: 0.1
  • Weight Decay: 0.01
  • Gradient Accumulation: 4
  • Mixed Precision: bf16
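The arguments above imply a small effective batch and, with `max_steps` set to 1, a very short run. The arithmetic can be sketched as follows; `num_devices = 1` is an assumption, since the card does not say how many devices were used.

```python
# Values taken from the training arguments listed above.
per_device_batch_size = 1
gradient_accumulation_steps = 4
max_steps = 1

# Assumption: a single device (not stated in the card).
num_devices = 1

# Preference pairs contributing to each optimizer update.
pairs_per_update = per_device_batch_size * gradient_accumulation_steps * num_devices

# Preference pairs seen over the whole run.
total_pairs_seen = pairs_per_update * max_steps

print(f"pairs per optimizer update: {pairs_per_update}")
print(f"pairs seen in total: {total_pairs_seen}")
```

Under these assumptions the run performs one optimizer update on four preference pairs, so the fine-tuned weights are expected to stay very close to the base model.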

Training Hyperparameters

  • Training regime: fp16 mixed precision

LoRA Configuration:

  • R: 16
  • Alpha: 32
  • Dropout: 0.05
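To give a sense of what these LoRA values mean, the sketch below computes the conventional LoRA scaling factor (alpha divided by rank) and the number of adapter parameters for one weight matrix. The 2048 x 2048 matrix shape is an assumption chosen to resemble an attention projection in a 1B-scale model; the card does not list which modules were adapted.

```python
# Values from the LoRA configuration above.
r = 16
alpha = 32

# Conventional LoRA scaling applied to the low-rank update B @ A.
scaling = alpha / r

# Assumption: one square projection matrix of this shape (hypothetical).
d_in = d_out = 2048

# LoRA adds A (r x d_in) and B (d_out x r) per adapted matrix.
lora_params = r * (d_in + d_out)
full_params = d_in * d_out
fraction = lora_params / full_params

print(f"scaling = {scaling}")
print(f"adapter params per matrix = {lora_params} ({fraction:.2%} of {full_params})")
```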

Evaluation

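| Tasks     | Version | Filter | n-shot | Metric     | Value  | Stderr   |
|-----------|---------|--------|--------|------------|--------|----------|
| hellaswag | 1       | none   | 0      | acc ↑      | 0.4408 | ± 0.0050 |
|           |         | none   | 0      | acc_norm ↑ | 0.5922 | ± 0.0049 |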
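The reported standard errors can be turned into approximate 95% confidence intervals, which helps when comparing these scores with other models. This is a small sketch using a normal approximation (z = 1.96); the helper name `ci95` is made up for illustration.

```python
def ci95(value: float, stderr: float, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% confidence interval under a normal approximation."""
    half_width = z * stderr
    return (value - half_width, value + half_width)


# Values and standard errors from the HellaSwag results above.
acc_ci = ci95(0.4408, 0.0050)
acc_norm_ci = ci95(0.5922, 0.0049)

print(f"acc:      {acc_ci[0]:.4f} to {acc_ci[1]:.4f}")
print(f"acc_norm: {acc_norm_ci[0]:.4f} to {acc_norm_ci[1]:.4f}")
```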

Testing Data, Factors & Metrics

Testing Data

https://github.com/EleutherAI/lm-evaluation-harness
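A zero-shot HellaSwag run with the harness could look roughly like the sketch below. It assumes the `lm-eval` Python package (v0.4 or later) is installed and uses its `simple_evaluate` entry point; the helper names `build_eval_kwargs` and `run_hellaswag` are hypothetical, and the exact settings used to produce the numbers in this card are not documented.

```python
def build_eval_kwargs(repo_id: str) -> dict:
    """Keyword arguments for a zero-shot HellaSwag run on a Hub model."""
    return {
        "model": "hf",                        # Hugging Face transformers backend
        "model_args": f"pretrained={repo_id}",
        "tasks": ["hellaswag"],
        "num_fewshot": 0,
    }


def run_hellaswag(repo_id: str) -> dict:
    """Run the evaluation and return the per-task results dictionary."""
    # Imported lazily so the helper above works without the package installed.
    import lm_eval  # assumption: installed via `pip install lm-eval`

    results = lm_eval.simple_evaluate(**build_eval_kwargs(repo_id))
    return results["results"]
```

Calling `run_hellaswag` with this model's Hub repository id downloads the weights and the dataset, so it needs network access and enough memory for a 1B-parameter model.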

Safetensors · Model size: 1.24B params · Tensor type: F32