
collapse_gemma-2-2b_hs2_replace_iter10_sftsd1

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.5268
  • Num Input Tokens Seen: 4724520
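Assuming the reported loss is the mean token-level cross-entropy (the usual convention for causal language models), it corresponds to a perplexity of exp(2.5268) ≈ 12.5.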

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 8e-06
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 1
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
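
For reference, the configuration above maps onto transformers' `TrainingArguments` roughly as below. This is a minimal sketch, not the original training script: the output directory is a hypothetical placeholder, and the per-device batch sizes assume single-device training (8 × 16 gradient-accumulation steps = 128 total).

```python
# Minimal sketch of the hyperparameters above via transformers'
# TrainingArguments. The output path is hypothetical; the Adam
# betas/epsilon listed on this card match the transformers
# defaults, so they are left implicit.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="collapse_gemma-2-2b_hs2_replace_iter10_sftsd1",  # hypothetical
    learning_rate=8e-6,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    seed=1,
    gradient_accumulation_steps=16,  # 8 * 16 = 128 total train batch size
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.05,
    num_train_epochs=1,
)
```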

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log        | 0      | 0    | 1.3909          | 0                 |
| 1.4194        | 0.0511 | 5    | 1.2766          | 249528            |
| 0.9977        | 0.1021 | 10   | 1.2510          | 486928            |
| 0.6282        | 0.1532 | 15   | 1.3954          | 732392            |
| 0.4243        | 0.2042 | 20   | 1.5904          | 981424            |
| 0.2390        | 0.2553 | 25   | 1.7674          | 1221424           |
| 0.1077        | 0.3063 | 30   | 2.0512          | 1460896           |
| 0.0613        | 0.3574 | 35   | 2.2350          | 1707264           |
| 0.0388        | 0.4084 | 40   | 2.3523          | 1954216           |
| 0.0274        | 0.4595 | 45   | 2.4313          | 2196416           |
| 0.0264        | 0.5105 | 50   | 2.5008          | 2437952           |
| 0.0275        | 0.5616 | 55   | 2.5318          | 2688728           |
| 0.0234        | 0.6126 | 60   | 2.5308          | 2927768           |
| 0.0229        | 0.6637 | 65   | 2.5316          | 3169992           |
| 0.0235        | 0.7147 | 70   | 2.4937          | 3420824           |
| 0.0223        | 0.7658 | 75   | 2.4864          | 3662584           |
| 0.0224        | 0.8168 | 80   | 2.4968          | 3909304           |
| 0.0237        | 0.8679 | 85   | 2.5066          | 4155024           |
| 0.0221        | 0.9190 | 90   | 2.5201          | 4392176           |
| 0.0243        | 0.9700 | 95   | 2.5273          | 4627088           |

Framework versions

  • Transformers 4.44.0
  • PyTorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
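
A minimal usage sketch, assuming the checkpoint loads like any other causal language model in transformers (≥ 4.44.0, per the versions above); the prompt is an arbitrary example, and bfloat16 matches the published weights' dtype.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RylanSchaeffer/collapse_gemma-2-2b_hs2_replace_iter10_sftsd1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```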