
This is a test DPO (Direct Preference Optimization) fine-tune of Microsoft phi-2.
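
For readers unfamiliar with DPO, here is a minimal sketch of the standard per-example DPO loss in plain Python. It is not the training code used for this model. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are illustrative assumptions; the inputs are summed log-probabilities of the chosen and rejected responses under the policy being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * (chosen_logratio - rejected_logratio))).

    All names and the beta default are illustrative assumptions,
    not values taken from this model's training run.
    """
    # How much more likely the policy makes each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) equals log(1 + exp(-margin)); branch for numerical stability.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: the loss is log(2).
baseline = dpo_loss(0.0, 0.0, 0.0, 0.0)
# Policy prefers the chosen response more than the reference does: loss falls below log(2).
improved = dpo_loss(-1.0, -5.0, -2.0, -4.0)
```

The loss drops as the policy raises the chosen response's likelihood relative to the rejected one, measured against the reference model.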

Two DPO datasets were used. Training ran for 1 epoch using QLoRA with rank 64.
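
To give a sense of what rank 64 means in terms of trainable parameters, the sketch below counts the weights a LoRA adapter adds to a single linear layer. The helper `lora_param_count` is hypothetical, and the 2560 x 2560 layer shape is an assumption based on phi-2's hidden size; which layers were actually adapted in this run is not stated here.

```python
def lora_param_count(d_in, d_out, rank):
    """Trainable parameters a LoRA adapter adds to one d_out x d_in linear layer.

    LoRA learns two low-rank factors, A (rank x d_in) and B (d_out x rank),
    while the original weight matrix stays frozen.
    """
    return rank * d_in + d_out * rank


hidden = 2560  # assumed hidden size for phi-2; illustrative only
rank = 64      # rank stated for this fine-tune

adapter_params = lora_param_count(hidden, hidden, rank)
full_params = hidden * hidden
print(adapter_params)                # 327680
print(adapter_params / full_params)  # 0.05
```

Under these assumptions, a rank-64 adapter trains about 5% as many parameters as the square layer it wraps; QLoRA additionally keeps the frozen base weights in 4-bit precision to reduce memory.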

Initial Evals

  • ARC: 63.14
  • TruthfulQA: 48.47

Model size: 2.78B params (Safetensors, tensor type BF16)