
collapse_gemma-2-2b_hs2_replace_iter15_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.5032
  • Num Input Tokens Seen: 4789008
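As a fine-tune of google/gemma-2-2b, the checkpoint should load through the standard transformers causal-LM classes. A minimal, untested usage sketch (the repository id is taken from this card; loading requires `transformers` and access to the gated gemma-2-2b base weights):

```python
# Repository id for this fine-tuned checkpoint (taken from this card).
REPO_ID = "RylanSchaeffer/collapse_gemma-2-2b_hs2_replace_iter15_sftsd0"


def generate(prompt: str, max_new_tokens: int = 20) -> str:
    """Load the fine-tuned checkpoint and generate a completion.

    Requires `transformers` installed and access to the gated
    google/gemma-2-2b weights on the Hugging Face Hub.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    # BF16 matches the tensor type this checkpoint is stored in.
    model = AutoModelForCausalLM.from_pretrained(REPO_ID, torch_dtype="bfloat16")
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```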

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-06
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 0
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
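The total train batch size above follows from the per-device batch size times the gradient accumulation steps (a single training device is assumed here, since no device count is listed):

```python
# Hyperparameters copied from the list above.
train_batch_size = 8              # per-device batch size
gradient_accumulation_steps = 16  # optimizer steps accumulate 16 micro-batches

# Effective (total) train batch size, assuming one device.
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 128, matching the reported value
```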

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log        | 0      | 0    | 1.3909          | 0                 |
| 1.4439        | 0.0511 | 5    | 1.2833          | 244288            |
| 0.799         | 0.1021 | 10   | 1.3407          | 490568            |
| 0.5575        | 0.1532 | 15   | 1.5374          | 736272            |
| 0.277         | 0.2042 | 20   | 1.7581          | 985808            |
| 0.1452        | 0.2553 | 25   | 2.0214          | 1224952           |
| 0.0706        | 0.3063 | 30   | 2.2516          | 1471584           |
| 0.043         | 0.3574 | 35   | 2.4168          | 1717056           |
| 0.0348        | 0.4084 | 40   | 2.4770          | 1963784           |
| 0.0245        | 0.4595 | 45   | 2.4962          | 2207608           |
| 0.0283        | 0.5105 | 50   | 2.5400          | 2454400           |
| 0.0243        | 0.5616 | 55   | 2.5265          | 2697040           |
| 0.0225        | 0.6126 | 60   | 2.5140          | 2950032           |
| 0.0246        | 0.6637 | 65   | 2.5142          | 3199600           |
| 0.0206        | 0.7147 | 70   | 2.5219          | 3444760           |
| 0.0233        | 0.7658 | 75   | 2.5338          | 3693184           |
| 0.0243        | 0.8168 | 80   | 2.5344          | 3942096           |
| 0.022         | 0.8679 | 85   | 2.5247          | 4194896           |
| 0.0222        | 0.9190 | 90   | 2.5153          | 4446336           |
| 0.0218        | 0.9700 | 95   | 2.5051          | 4689600           |
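The table shows training loss falling steadily while validation loss bottoms out almost immediately and then climbs, a pattern consistent with overfitting. A small sketch recovering the best checkpoint from the numbers above (values copied from the table):

```python
# (step, validation_loss) pairs copied from the training-results table.
val_loss = [
    (0, 1.3909), (5, 1.2833), (10, 1.3407), (15, 1.5374), (20, 1.7581),
    (25, 2.0214), (30, 2.2516), (35, 2.4168), (40, 2.4770), (45, 2.4962),
    (50, 2.5400), (55, 2.5265), (60, 2.5140), (65, 2.5142), (70, 2.5219),
    (75, 2.5338), (80, 2.5344), (85, 2.5247), (90, 2.5153), (95, 2.5051),
]

# The lowest validation loss is reached at step 5, after which it rises.
best_step, best_loss = min(val_loss, key=lambda pair: pair[1])
print(best_step, best_loss)  # 5 1.2833
```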

Framework versions

  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size: 2.61B parameters (BF16, stored as Safetensors)
