Mistral-7B-v0.1-spin-10k

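This model is a fine-tuned version of mistralai/Mistral-7B-v0.1 on an unknown dataset. At the last logged evaluation (step 930, epoch 2.976), it achieves the following results on the evaluation set:

Loss: 0.1070
Rewards/real: 3.0044
Rewards/generated: -16.9373
Rewards/accuracies: 1.0
Rewards/margins: 19.9417
Logps/generated: -403.1098
Logps/real: -108.8719
Logits/generated: -2.6295
Logits/real: -2.3466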

Model description

More information needed
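
For readers who want to try a checkpoint of this family, the following is a minimal loading sketch using the Hugging Face transformers library. The card does not state the Hub path of the fine-tuned weights, so the default `model_id` below is the base model named above and is only a stand-in. The helper name `generate_text` and its parameters are hypothetical, chosen for illustration.

```python
def generate_text(prompt, model_id="mistralai/Mistral-7B-v0.1", max_new_tokens=64):
    """Generate a continuation of `prompt` with a causal language model.

    Assumption: `model_id` defaults to the base model because the card does
    not give the repository path of the fine-tuned checkpoint. Pass that path
    (or a local directory) to use the fine-tuned weights instead.
    """
    # Imported lazily so that defining this helper does not load the library.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # torch_dtype="auto" keeps the dtype stored in the checkpoint.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("The capital of France is")` downloads the weights on first use, so it needs network access and enough memory for a 7B-parameter model.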

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

The following hyperparameters were used during training:

More information needed

Training results

Training Loss	Epoch	Step	Validation Loss	Rewards/real	Rewards/generated	Rewards/accuracies	Rewards/margins	Logps/generated	Logps/real	Logits/generated	Logits/real
0.1107	0.1984	62	0.1023	3.4512	-9.1675	1.0	12.6187	-325.4117	-104.4035	-2.4554	-1.7675
0.1008	0.3968	124	0.1011	3.6116	-11.3858	1.0	14.9974	-347.5944	-102.7990	-2.5863	-2.2957
0.1026	0.5952	186	0.1003	3.6979	-11.6295	1.0	15.3274	-350.0322	-101.9367	-2.6687	-2.3709
0.101	0.7936	248	0.0999	3.7212	-12.7800	1.0	16.5012	-361.5364	-101.7031	-2.6775	-2.3954
0.1039	0.992	310	0.0996	3.7735	-13.5972	1.0	17.3707	-369.7089	-101.1806	-2.6756	-2.3816
0.0808	1.1904	372	0.1017	3.5481	-14.8177	1.0	18.3659	-381.9141	-103.4341	-2.6668	-2.3889
0.0776	1.3888	434	0.1017	3.5515	-14.4474	1.0	17.9989	-378.2113	-103.4007	-2.6759	-2.3898
0.0804	1.5872	496	0.1017	3.5414	-15.1983	1.0	18.7398	-385.7200	-103.5011	-2.6878	-2.4139
0.0795	1.7856	558	0.1021	3.5148	-15.5660	1.0	19.0807	-389.3963	-103.7679	-2.6798	-2.4105
0.0757	1.984	620	0.1021	3.5102	-15.1535	1.0	18.6637	-385.2720	-103.8135	-2.6714	-2.3995
0.0612	2.1824	682	0.1062	3.0811	-16.3490	1.0	19.4301	-397.2271	-108.1048	-2.6460	-2.3659
0.0645	2.3808	744	0.1072	2.9846	-16.5619	1.0	19.5465	-399.3556	-109.0693	-2.6555	-2.3741
0.068	2.5792	806	0.1071	2.9934	-16.6624	1.0	19.6558	-400.3607	-108.9816	-2.6421	-2.3601
0.0659	2.7776	868	0.1070	3.0113	-16.8614	1.0	19.8726	-402.3504	-108.8029	-2.6247	-2.3422
0.0618	2.976	930	0.1070	3.0044	-16.9373	1.0	19.9417	-403.1098	-108.8719	-2.6295	-2.3466
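
The table can be sanity-checked with a few lines of Python. The sketch below copies five columns from the rows above and checks two things: that Rewards/margins equals Rewards/real minus Rewards/generated up to rounding, and which logged step has the lowest validation loss. Reading the margin as that difference is an assumption based on how preference-style trainers usually log these metrics, since the card does not define the columns. The names `ROWS`, `margin_matches`, and `best_step` are illustrative.

```python
# (step, validation loss, rewards/real, rewards/generated, rewards/margins)
# Values copied from the training results table.
ROWS = [
    (62, 0.1023, 3.4512, -9.1675, 12.6187),
    (124, 0.1011, 3.6116, -11.3858, 14.9974),
    (186, 0.1003, 3.6979, -11.6295, 15.3274),
    (248, 0.0999, 3.7212, -12.7800, 16.5012),
    (310, 0.0996, 3.7735, -13.5972, 17.3707),
    (372, 0.1017, 3.5481, -14.8177, 18.3659),
    (434, 0.1017, 3.5515, -14.4474, 17.9989),
    (496, 0.1017, 3.5414, -15.1983, 18.7398),
    (558, 0.1021, 3.5148, -15.5660, 19.0807),
    (620, 0.1021, 3.5102, -15.1535, 18.6637),
    (682, 0.1062, 3.0811, -16.3490, 19.4301),
    (744, 0.1072, 2.9846, -16.5619, 19.5465),
    (806, 0.1071, 2.9934, -16.6624, 19.6558),
    (868, 0.1070, 3.0113, -16.8614, 19.8726),
    (930, 0.1070, 3.0044, -16.9373, 19.9417),
]


def margin_matches(real, generated, margin, tol=1e-3):
    """True if margin equals real - generated within a rounding tolerance."""
    return abs((real - generated) - margin) <= tol


def best_step(rows):
    """Return the step whose validation loss is lowest."""
    return min(rows, key=lambda row: row[1])[0]


if __name__ == "__main__":
    consistent = all(margin_matches(r, g, m) for _, _, r, g, m in ROWS)
    print("margins consistent:", consistent)
    print("lowest validation loss at step:", best_step(ROWS))
```

On these numbers the margin identity holds for every row, and the lowest validation loss (0.0996) falls at step 310, near the end of the first epoch. After that point the training loss keeps falling while the validation loss rises slightly, so an earlier checkpoint may be worth comparing against the final one.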