phi-2-basic-maths

This model is a fine-tuned version of microsoft/phi-2 on the GSM8K dataset.

Model Description

The goal of this model is to evaluate how well Phi-2 solves reasoning problems after fine-tuning. It was trained with the TRL library, using LoRA, quantization, and Flash Attention.

To test it, you can use the following code:

import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer, pipeline

# Specify the model ID
peft_model_id = "Menouar/phi-2-basic-maths"

# Load Model with PEFT adapter
model = AutoPeftModelForCausalLM.from_pretrained(
  peft_model_id,
  device_map="auto",
  torch_dtype=torch.float16
)

tokenizer = AutoTokenizer.from_pretrained(peft_model_id)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
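
Once the pipeline exists, a question can be sent through it as in the minimal sketch below. The helper name `solve` and the "Question: ... Answer:" prompt layout are assumptions made for illustration; the prompt format actually used during fine-tuning is not stated here, so adjust it to match the training notebook.

```python
def solve(pipe, question, max_new_tokens=256):
    """Send one question through a text-generation pipeline and return the completion.

    The prompt layout is an assumption, not the documented training format.
    """
    prompt = f"Question: {question}\nAnswer:"
    outputs = pipe(
        prompt,
        max_new_tokens=max_new_tokens,  # cap the length of the generated solution
        do_sample=False,                # greedy decoding for reproducible answers
        return_full_text=False,         # return only the newly generated text
    )
    # A text-generation pipeline returns a list of dicts keyed by "generated_text".
    return outputs[0]["generated_text"].strip()
```

With the `pipe` object built above, `print(solve(pipe, "Tom has 3 apples and buys 4 more. How many apples does he have?"))` would print the model's worked answer.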

Training procedure

The complete training procedure can be found in my notebook.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 42
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 84
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 30
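
The relationship between the batch-size values listed above can be checked with a short sketch. The function name is hypothetical, and a single training device is assumed, which is consistent with 42 × 2 = 84.

```python
def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices=1):
    """Number of examples contributing to each optimizer step.

    num_devices=1 is an assumption; the hardware setup is not stated in this card.
    """
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# train_batch_size=42 and gradient_accumulation_steps=2, as listed above
total_train_batch_size = effective_batch_size(42, 2)
print(total_train_batch_size)  # 84
```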

Training results

The training results can be found on TensorBoard.

Evaluation procedure

The complete evaluation procedure can be found in my notebook.

Accuracy: 36.16%

Unclear answers: 7.81%
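
For readers who want to compute similar metrics themselves, here is a minimal sketch of one way to do it. It is not the notebook's implementation: it assumes the final answer is the last number in the generated text, and that an "unclear" answer is a generation from which no number can be extracted. GSM8K reference solutions end with a line of the form `#### <number>`, so the same extraction works on them.

```python
import re

# Matches integers and decimals, optionally negative, with optional thousands separators.
NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_answer(text):
    """Return the last number found in text as a float, or None if there is none."""
    matches = NUMBER.findall(text)
    if not matches:
        return None
    return float(matches[-1].replace(",", ""))


def score(generations, references):
    """Return accuracy and unclear-answer rate as percentages (assumed definitions)."""
    correct = unclear = 0
    for generated, reference in zip(generations, references):
        predicted = extract_answer(generated)
        if predicted is None:
            unclear += 1  # no number could be extracted from the generation
        elif predicted == extract_answer(reference):
            correct += 1
    total = len(references)
    return {"accuracy": 100 * correct / total, "unclear": 100 * unclear / total}
```

Under these definitions, an unclear answer also counts as incorrect, so the two percentages are measured over the same set of test questions.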

Framework versions

  • PEFT 0.8.2
  • Transformers 4.38.0.dev0
  • PyTorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.1

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric                               Value
Avg.                                 53.60
AI2 Reasoning Challenge (25-Shot)    55.80
HellaSwag (10-Shot)                  71.15
MMLU (5-Shot)                        47.27
TruthfulQA (0-shot)                  41.40
Winogrande (5-shot)                  75.30
GSM8k (5-shot)                       30.71