collapse_gemma-2-2b_hs2_accumulatesubsample_iter15_sftsd1

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.3791	0.0528	5	1.2741	270544
1.0969	0.1057	10	1.2081	532816
0.9217	0.1585	15	1.2099	798280
0.794	0.2114	20	1.2350	1062080
0.6797	0.2642	25	1.2627	1325448
0.5979	0.3170	30	1.2847	1587720
0.5021	0.3699	35	1.2607	1843760
0.476	0.4227	40	1.2628	2107456
0.487	0.4756	45	1.2493	2370112
0.3335	0.5284	50	1.2433	2630520
0.2871	0.5812	55	1.2330	2893392
0.4017	0.6341	60	1.2188	3156472
0.4512	0.6869	65	1.2056	3422672
0.3667	0.7398	70	1.2041	3688952
0.366	0.7926	75	1.2002	3944200
0.4095	0.8454	80	1.2073	4204976
0.2232	0.8983	85	1.2062	4464672
0.3349	0.9511	90	1.1993	4725696