collapse_gemma-2-2b_hs2_replace_iter6_sftsd1

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. The evaluation-set results logged during training are listed in the table at the end of this card.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.553	0.0316	5	1.3090	258336
1.1605	0.0632	10	1.2408	515760
0.8944	0.0948	15	1.2981	775720
0.5199	0.1264	20	1.5066	1026360
0.3638	0.1580	25	1.6259	1281232
0.217	0.1896	30	1.8240	1539752
0.1148	0.2212	35	1.9752	1796440
0.1057	0.2528	40	2.1133	2056024
0.0565	0.2844	45	2.2720	2315344
0.0582	0.3160	50	2.4049	2578576
0.0373	0.3476	55	2.5018	2831728
0.0341	0.3791	60	2.4419	3089328
0.0415	0.4107	65	2.4454	3350752
0.0285	0.4423	70	2.4645	3607904
0.0276	0.4739	75	2.5049	3874552
0.0304	0.5055	80	2.5111	4127008
0.027	0.5371	85	2.5041	4384976
0.029	0.5687	90	2.5237	4651128
0.0285	0.6003	95	2.5093	4909080
0.0287	0.6319	100	2.5117	5165688
0.0262	0.6635	105	2.5054	5415464
0.0259	0.6951	110	2.4879	5674272
0.0271	0.7267	115	2.4664	5934728
0.027	0.7583	120	2.4789	6187376
0.0288	0.7899	125	2.4795	6450096
0.0247	0.8215	130	2.4943	6712368
0.0248	0.8531	135	2.4960	6970504
0.0282	0.8847	140	2.5069	7232472
0.0266	0.9163	145	2.5055	7495824
0.0311	0.9479	150	2.5049	7758216
0.0237	0.9795	155	2.5113	8017088
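
In the table above, validation loss reaches its minimum early (1.2408 at step 10) and then climbs to about 2.5 while training loss falls below 0.03, a pattern consistent with overfitting. The following is a minimal sketch of how one might locate the best logged checkpoint and the train/validation gap from these numbers. The rows are a subset copied from the table; the helper names `best_checkpoint` and `generalization_gap` are hypothetical and are not part of any training library.

```python
# Rows copied from the training-results table above:
# (step, training loss, validation loss). Only a subset is listed to keep
# the sketch short; step 0 has no training loss ("No log").
ROWS = [
    (0, None, 1.3956),
    (5, 1.553, 1.3090),
    (10, 1.1605, 1.2408),
    (15, 0.8944, 1.2981),
    (20, 0.5199, 1.5066),
    (50, 0.0582, 2.4049),
    (100, 0.0287, 2.5117),
    (155, 0.0237, 2.5113),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def generalization_gap(row):
    """Validation loss minus training loss, or None if no training loss was logged."""
    _step, train_loss, val_loss = row
    if train_loss is None:
        return None
    return val_loss - train_loss


if __name__ == "__main__":
    step, _train, val = best_checkpoint(ROWS)
    print(f"lowest validation loss {val} at step {step}")
    for row in ROWS:
        gap = generalization_gap(row)
        gap_text = "n/a" if gap is None else f"{gap:.4f}"
        print(f"step {row[0]:>3}: gap = {gap_text}")
```

A growing gap after the best step suggests that a checkpoint saved near step 10 would generalize better than the final one, assuming the evaluation set is representative.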