collapse_gemma-2-2b_hs2_replace_iter2_sftsd2

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

The following hyperparameters were used during training:

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.7513	0.0345	5	1.3040	274624
1.4363	0.0690	10	1.1967	546200
1.2021	0.1035	15	1.1697	817584
1.0165	0.1380	20	1.1870	1088176
0.9588	0.1725	25	1.2338	1358336
0.856	0.2070	30	1.3377	1637992
0.6142	0.2415	35	1.3806	1911376
0.5705	0.2760	40	1.4600	2181176
0.5098	0.3105	45	1.5034	2462856
0.3225	0.3450	50	1.5081	2737752
0.3129	0.3795	55	1.5481	3012656
0.3444	0.4140	60	1.4783	3279744
0.2324	0.4485	65	1.4703	3547808
0.234	0.4830	70	1.4699	3817328
0.2621	0.5175	75	1.4305	4097184
0.1199	0.5520	80	1.4580	4367848
0.1915	0.5865	85	1.4274	4640592
0.2214	0.6210	90	1.4877	4922032
0.1506	0.6555	95	1.4413	5193088
0.1584	0.6900	100	1.4564	5464864
0.2169	0.7245	105	1.4504	5739032
0.1219	0.7589	110	1.4286	6012736
0.1687	0.7934	115	1.4840	6274808
0.1776	0.8279	120	1.4578	6548312
0.1197	0.8624	125	1.4703	6821112
0.1035	0.8969	130	1.4563	7098736
0.1298	0.9314	135	1.4510	7369552
0.0958	0.9659	140	1.4814	7640632
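
The table above shows training loss falling steadily while validation loss reaches its minimum early and then rises. A minimal sketch of how one might read that off the log is given below. It uses a hand-copied subset of rows from the table, and the helper names `best_step` and `gap` are hypothetical, introduced only for illustration; they are not part of any training script used for this model.

```python
# Subset of rows copied from the table above: (step, training loss, validation loss).
# Step 0 is omitted because its training loss is "No log".
rows = [
    (5, 1.7513, 1.3040),
    (10, 1.4363, 1.1967),
    (15, 1.2021, 1.1697),
    (20, 1.0165, 1.1870),
    (25, 0.9588, 1.2338),
    (50, 0.3225, 1.5081),
    (100, 0.1584, 1.4564),
    (140, 0.0958, 1.4814),
]


def best_step(rows):
    """Return the (step, train_loss, val_loss) row with the lowest validation loss."""
    return min(rows, key=lambda r: r[2])


def gap(row):
    """Return validation loss minus training loss for one row."""
    return row[2] - row[1]


best = best_step(rows)
last = rows[-1]
print(f"lowest validation loss: {best[2]:.4f} at step {best[0]}")
print(f"validation - training gap at step {last[0]}: {gap(last):.4f}")
```

In this subset the lowest validation loss is at step 15, which also holds for the full table. The widening gap between the two losses after that point is consistent with overfitting, so an earlier checkpoint may generalize better than the final one, assuming such checkpoints were saved.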