collapse_gemma-2-2b_hs2_replace_iter3_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed


The hyperparameters used during training are not listed. The training results, logged every 5 steps, are:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.5416	0.0324	5	1.3068	269936
1.3641	0.0649	10	1.2115	552616
1.0289	0.0973	15	1.2221	823720
0.8688	0.1297	20	1.2844	1095864
0.7272	0.1621	25	1.4180	1364688
0.519	0.1946	30	1.5302	1638176
0.3073	0.2270	35	1.6787	1919208
0.2729	0.2594	40	1.7833	2198928
0.1705	0.2919	45	1.9387	2476232
0.0886	0.3243	50	1.9982	2749520
0.1069	0.3567	55	2.0681	3023760
0.0686	0.3891	60	2.0576	3297968
0.0828	0.4216	65	1.9080	3573328
0.0499	0.4540	70	1.9215	3843576
0.0494	0.4864	75	1.9651	4114016
0.0778	0.5188	80	1.9004	4382648
0.0607	0.5513	85	1.8523	4659656
0.0551	0.5837	90	1.7979	4931424
0.0352	0.6161	95	1.7820	5204968
0.0639	0.6486	100	1.8419	5483448
0.0605	0.6810	105	1.8618	5761696
0.0386	0.7134	110	1.7887	6038976
0.0399	0.7458	115	1.8088	6311440
0.0333	0.7783	120	1.9178	6585992
0.0458	0.8107	125	1.9033	6863656
0.0419	0.8431	130	1.8162	7138912
0.0386	0.8756	135	1.7969	7407560
0.0464	0.9080	140	1.8278	7687208
0.0376	0.9404	145	1.8610	7964184
0.0418	0.9728	150	1.8592	8240528
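
In the log above, training loss keeps falling while validation loss bottoms out early and then climbs, which is the usual sign of overfitting. The sketch below is one way to find the step with the lowest validation loss from a tab-separated log like this one. It is illustrative only: the names `LOG` and `best_checkpoint` are hypothetical, only the first seven rows of the table are copied in, and nothing in this card says that checkpoint selection or early stopping was actually used.

```python
import csv
import io

# First seven rows of the training log above, tab-separated.
# (Hypothetical constant name; extend with the remaining rows as needed.)
LOG = (
    "Training Loss\tEpoch\tStep\tValidation Loss\tInput Tokens Seen\n"
    "No log\t0\t0\t1.3956\t0\n"
    "1.5416\t0.0324\t5\t1.3068\t269936\n"
    "1.3641\t0.0649\t10\t1.2115\t552616\n"
    "1.0289\t0.0973\t15\t1.2221\t823720\n"
    "0.8688\t0.1297\t20\t1.2844\t1095864\n"
    "0.7272\t0.1621\t25\t1.4180\t1364688\n"
    "0.519\t0.1946\t30\t1.5302\t1638176\n"
)


def best_checkpoint(log_text):
    """Return (step, validation_loss) for the row with the lowest validation loss."""
    reader = csv.DictReader(io.StringIO(log_text), delimiter="\t")
    best = min(reader, key=lambda row: float(row["Validation Loss"]))
    return int(best["Step"]), float(best["Validation Loss"])


step, loss = best_checkpoint(LOG)
print(f"step {step}, validation loss {loss}")
# prints: step 10, validation loss 1.2115
```

For the rows shown, the minimum is at step 10 (epoch 0.0649), after which validation loss rises even as training loss drops below 0.1.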