
problem0_model_diverse_more_aug_200

This model is a fine-tuned version of barc0/Llama-3.1-ARC-Potpourri-Transduction-8B on the tttx/problem0_diverse_more_aug dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0069
  • Problem Acc@1: 0.0
  • Solution Acc@1: 0.0
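
Because this checkpoint is a PEFT adapter on top of barc0/Llama-3.1-ARC-Potpourri-Transduction-8B, it can be used by loading the base model and attaching the adapter. The snippet below is a minimal sketch, assuming the adapter weights are hosted under tttx/problem0_model_diverse_more_aug_200 and that transformers and peft are installed; the prompt is a placeholder.

```python
# Minimal sketch: load the base model and attach this PEFT adapter.
# Assumes the adapter is available at tttx/problem0_model_diverse_more_aug_200.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "barc0/Llama-3.1-ARC-Potpourri-Transduction-8B"
adapter_id = "tttx/problem0_model_diverse_more_aug_200"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

prompt = "..."  # placeholder task prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```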

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • total_eval_batch_size: 8
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 2
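
The hyperparameters above map onto a standard transformers TrainingArguments configuration. The sketch below is illustrative only and assumes the Trainer API was used; the total train batch size of 16 follows from 2 per device × 4 devices × 2 gradient accumulation steps, and the bf16 flag is an assumption not stated in the card.

```python
# Sketch of a TrainingArguments configuration matching the hyperparameters listed above.
# Assumes the standard transformers Trainer API; the actual training script may differ.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="problem0_model_diverse_more_aug_200",
    learning_rate=1e-4,
    per_device_train_batch_size=2,   # 2 per device x 4 GPUs x 2 accumulation = 16 total
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,
    num_train_epochs=2,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    bf16=True,  # assumption; precision is not stated in the card
)
```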

Training results

Training Loss   Epoch    Step   Validation Loss   Problem Acc@1   Solution Acc@1
No log          0        0      0.0180            0.0             0.0
0.1106          0.3158   15     0.0113            0.0             0.0
0.0597          0.6316   30     0.0048            0.0             0.0
0.053           0.9474   45     0.0080            0.0             0.0
0.0397          1.2526   60     0.0052            0.0             0.0
0.0377          1.5684   75     0.0052            0.0             0.0
0.0377          1.8842   90     0.0078            0.0             0.0

Framework versions

  • PEFT 0.10.0
  • Transformers 4.47.0.dev0
  • Pytorch 2.4.0+cu121
  • Datasets 3.0.2
  • Tokenizers 0.20.0
