mistral-sft-spin-v

This model is a fine-tuned version of AmberYifan/mistral-safe-sft-full on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 0.2211
Rewards/real: 1.4973
Rewards/generated: -9.8210
Rewards/accuracies: 1.0
Rewards/margins: 11.3183
Logps/generated: -240.2288
Logps/real: -111.1903
Logits/generated: -2.8210
Logits/real: -2.7809
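
The reported margin looks like the difference between the real and generated rewards. The short check below assumes that definition (it is not stated in this card) and uses only the values listed above:

```python
# Values copied from the evaluation results above.
rewards_real = 1.4973
rewards_generated = -9.8210
reported_margin = 11.3183

# Assumption: margin = reward(real) - reward(generated).
margin = rewards_real - rewards_generated

print(f"computed margin: {margin:.4f}")
print(f"matches reported value: {abs(margin - reported_margin) < 1e-4}")
```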

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-07
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 3
total_train_batch_size: 24
total_eval_batch_size: 24
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 1
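
As a rough illustration of how these settings combine, the sketch below derives the total batch size and evaluates a linear warmup-then-decay schedule. The function `linear_schedule_lr` is a hypothetical helper written for this example, not the library's scheduler. `TOTAL_STEPS` is an assumption inferred from the step/epoch ratio in the training results (about 2084 steps per epoch). Rounding the warmup length up is also an assumption.

```python
import math

# Hyperparameters copied from the list above.
PEAK_LR = 5e-07
WARMUP_RATIO = 0.1
train_batch_size = 8
num_devices = 3

# 8 per device x 3 devices = 24, which matches total_train_batch_size
# (this implies no gradient accumulation, which the card does not list).
total_train_batch_size = train_batch_size * num_devices

# Assumption: roughly 2084 optimizer steps in the single epoch.
TOTAL_STEPS = 2084


def linear_schedule_lr(step, total_steps=TOTAL_STEPS,
                       peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Hypothetical helper: linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 100, 209, 1000, 2084):
    print(f"step {step:>4}: lr = {linear_schedule_lr(step):.3e}")
```

Under these assumptions the learning rate peaks at 5e-07 after roughly the first tenth of training and falls linearly to zero by the final step.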

Training results

Training Loss	Epoch	Step	Validation Loss	Rewards/real	Rewards/generated	Rewards/accuracies	Rewards/margins	Logps/generated	Logps/real	Logits/generated	Logits/real
0.2557	0.0960	200	0.2656	0.6126	-6.8929	0.9881	7.5055	-210.9483	-120.0376	-2.5883	-2.5051
0.2397	0.1919	400	0.2418	0.9896	-7.3355	1.0	8.3251	-215.3737	-116.2676	-2.7204	-2.6772
0.2781	0.2879	600	0.2353	1.0692	-7.6787	1.0	8.7478	-218.8055	-115.4722	-2.8163	-2.7864
0.2066	0.3839	800	0.2281	1.2656	-8.2419	1.0	9.5075	-224.4376	-113.5078	-2.7687	-2.7265
0.226	0.4798	1000	0.2251	1.3885	-8.7567	1.0	10.1452	-229.5862	-112.2788	-2.8022	-2.7554
0.2103	0.5758	1200	0.2228	1.4343	-9.0450	1.0	10.4792	-232.4687	-111.8212	-2.8119	-2.7702
0.221	0.6718	1400	0.2245	1.5045	-8.5829	1.0	10.0875	-227.8485	-111.1185	-2.8226	-2.7823
0.2098	0.7678	1600	0.2224	1.5080	-9.2156	1.0	10.7237	-234.1753	-111.0833	-2.8164	-2.7768
0.2122	0.8637	1800	0.2222	1.5266	-9.5583	1.0	11.0849	-237.6017	-110.8976	-2.8279	-2.7866
0.219	0.9597	2000	0.2211	1.4973	-9.8210	1.0	11.3183	-240.2288	-111.1903	-2.8210	-2.7809
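
The epoch and step columns allow a rough estimate of the training set size, which the card does not state. The sketch below divides step by epoch fraction and multiplies by the total train batch size of 24. The result is only an approximation, since the epoch values are rounded and the last batch may be partial.

```python
# (epoch, step) pairs copied from the first and last rows of the table above.
checkpoints = [(0.0960, 200), (0.9597, 2000)]
total_train_batch_size = 24  # from the hyperparameters

estimates = []
for epoch, step in checkpoints:
    steps_per_epoch = step / epoch
    # Assumption: one optimizer step consumes one full batch of 24 examples.
    approx_examples = steps_per_epoch * total_train_batch_size
    estimates.append((round(steps_per_epoch), round(approx_examples)))
    print(f"step {step}: ~{steps_per_epoch:.0f} steps/epoch, "
          f"~{approx_examples:.0f} training examples")
```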

Framework versions

Transformers 4.43.3
Pytorch 2.2.2+cu121
Datasets 2.20.0
Tokenizers 0.19.1
