collapse_gemma-2-2b_hs2_accumulatesubsample_iter16_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.3688	0.0535	5	1.2792	267216
1.1045	0.1071	10	1.2312	538752
0.9071	0.1606	15	1.2252	810008
0.8884	0.2142	20	1.2472	1090696
0.7495	0.2677	25	1.2613	1357168
0.6986	0.3213	30	1.2627	1622320
0.566	0.3748	35	1.2669	1885016
0.4901	0.4284	40	1.2516	2150224
0.5529	0.4819	45	1.2437	2426776
0.5463	0.5355	50	1.2177	2694864
0.5361	0.5890	55	1.2260	2967472
0.4028	0.6426	60	1.2219	3235816
0.4417	0.6961	65	1.2218	3506192
0.3597	0.7497	70	1.2340	3766528
0.4094	0.8032	75	1.2152	4037064
0.4751	0.8568	80	1.2227	4299936
0.3216	0.9103	85	1.2123	4567080
0.4108	0.9639	90	1.2150	4843480