collapse_gemma-2-2b_hs2_replace_iter8_sftsd1

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

The following hyperparameters were used during training:

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.7525	0.0316	5	1.3104	254840
1.2052	0.0632	10	1.2361	505744
0.829	0.0948	15	1.3012	764784
0.5192	0.1264	20	1.5065	1017080
0.3515	0.1580	25	1.6064	1273384
0.2183	0.1896	30	1.7810	1528696
0.15	0.2212	35	1.9429	1786696
0.1138	0.2528	40	2.1156	2047952
0.059	0.2844	45	2.2852	2304208
0.0473	0.3160	50	2.3658	2562872
0.0341	0.3476	55	2.4285	2820024
0.033	0.3791	60	2.5434	3081376
0.0283	0.4107	65	2.5781	3330816
0.0293	0.4423	70	2.5558	3576176
0.0301	0.4739	75	2.5472	3824776
0.0286	0.5055	80	2.5378	4086992
0.0761	0.5371	85	2.5431	4343816
0.0281	0.5687	90	2.5042	4593704
0.0267	0.6003	95	2.4403	4863272
0.0277	0.6319	100	2.3900	5119864
0.028	0.6635	105	2.3840	5376216
0.0259	0.6951	110	2.4084	5631856
0.0245	0.7267	115	2.4373	5885432
0.0261	0.7583	120	2.4586	6140608
0.0265	0.7899	125	2.4941	6400528
0.0264	0.8215	130	2.5242	6657312
0.0256	0.8531	135	2.5193	6909008
0.0268	0.8847	140	2.5183	7169664
0.0259	0.9163	145	2.5392	7429120
0.0269	0.9479	150	2.5512	7684536
0.0232	0.9795	155	2.5518	7946120
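
The table shows validation loss reaching its minimum early and then rising while training loss keeps falling. The sketch below is one way to pick the checkpoint with the lowest validation loss. The rows are a subset copied from the table; the `rows` list and the `best_step` helper are illustrative names, not part of the original training code.

```python
# (step, training_loss, validation_loss) rows copied from the results table.
# Step 0 has no logged training loss ("No log"), so it is stored as None.
rows = [
    (0, None, 1.3956),
    (5, 1.7525, 1.3104),
    (10, 1.2052, 1.2361),
    (15, 0.829, 1.3012),
    (20, 0.5192, 1.5065),
    (50, 0.0473, 2.3658),
    (100, 0.0277, 2.3900),
    (155, 0.0232, 2.5518),
]


def best_step(rows):
    """Return the row with the lowest validation loss (hypothetical helper)."""
    return min(rows, key=lambda row: row[2])


step, train_loss, val_loss = best_step(rows)
print(f"lowest validation loss {val_loss} at step {step}")

# Gap between validation and training loss at the last logged step.
last_step, last_train, last_val = rows[-1]
print(f"step {last_step}: validation minus training loss = {last_val - last_train:.4f}")
```

On these rows the lowest validation loss is at step 10, which suggests that later checkpoints are overfit to the training data.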