collapse_gemma-2-2b_hs2_replace_iter7_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. At the last logged evaluation (step 155, epoch 0.9764) it reached a validation loss of 2.5936 after 7,755,344 input tokens seen.

Model description

More information needed

The hyperparameters used during training are not listed in this card.

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.682	0.0315	5	1.3074	243600
1.1958	0.0630	10	1.2384	489576
0.8251	0.0945	15	1.3304	736640
0.5381	0.1260	20	1.4985	987872
0.2729	0.1575	25	1.6796	1238504
0.2581	0.1890	30	1.8457	1493072
0.1176	0.2205	35	2.0106	1744312
0.0711	0.2520	40	2.1725	1992528
0.05	0.2835	45	2.2581	2243824
0.0473	0.3150	50	2.3984	2490888
0.0535	0.3465	55	2.4441	2740728
0.032	0.3780	60	2.4463	2979648
0.0318	0.4094	65	2.4594	3231056
0.0359	0.4409	70	2.4814	3481768
0.0294	0.4724	75	2.5039	3739344
0.0275	0.5039	80	2.4899	3999888
0.0298	0.5354	85	2.4773	4250720
0.0296	0.5669	90	2.5022	4506360
0.0243	0.5984	95	2.5058	4764496
0.027	0.6299	100	2.5154	5009024
0.0268	0.6614	105	2.5056	5257688
0.0292	0.6929	110	2.5422	5501784
0.0297	0.7244	115	2.5510	5757400
0.0266	0.7559	120	2.5546	6003016
0.0255	0.7874	125	2.5727	6255120
0.0258	0.8189	130	2.5746	6501384
0.0282	0.8504	135	2.5777	6752008
0.0303	0.8819	140	2.5688	7004584
0.0259	0.9134	145	2.5679	7258288
0.0244	0.9449	150	2.5847	7508928
0.024	0.9764	155	2.5936	7755344
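
The table can be summarized programmatically. The sketch below is not part of the original card: it copies a handful of rows from the table by hand, and the helper names `best_step` and `tokens_per_step` are hypothetical. It finds the logged step with the lowest validation loss and estimates the average number of input tokens per training step.

```python
# A few rows copied from the table above:
# (step, training_loss, validation_loss, input_tokens_seen).
# training_loss is None where the table shows "No log".
rows = [
    (0, None, 1.3956, 0),
    (5, 1.682, 1.3074, 243600),
    (10, 1.1958, 1.2384, 489576),
    (15, 0.8251, 1.3304, 736640),
    (20, 0.5381, 1.4985, 987872),
    (155, 0.024, 2.5936, 7755344),
]


def best_step(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r[2])


def tokens_per_step(rows):
    """Average input tokens per training step between the first and last row."""
    first, last = rows[0], rows[-1]
    return (last[3] - first[3]) / (last[0] - first[0])


best = best_step(rows)
print(f"lowest validation loss {best[2]} at step {best[0]}")
print(f"about {tokens_per_step(rows):.0f} input tokens per step")
```

On these rows the lowest validation loss is at step 10, after which validation loss rises while training loss keeps falling. Extending `rows` with the remaining table entries does not change that result, since every later validation loss in the table is higher than 1.2384.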