checkpoints

This model is a fine-tuned version of meta-llama/Llama-3.1-8B on the finance-alpaca dataset. On the evaluation set it reaches a validation loss of 1.3893 at the final training step (step 1000).

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 2.5e-05
train_batch_size: 2
eval_batch_size: 2
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 4
optimizer: paged_adamw_32bit with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.03
training_steps: 1000
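
The derived quantities implied by this list can be checked with a short sketch. It assumes that the values map onto `transformers.TrainingArguments` field names, that training ran on a single device (consistent with 2 x 2 = 4), and that the cosine schedule uses linear warmup followed by decay to zero. The names `training_kwargs`, `NUM_DEVICES`, and `cosine_lr` are illustrative only and do not come from the original training script.

```python
import math

# Hyperparameters copied from the list above. The key names follow
# transformers.TrainingArguments fields; that mapping is an assumption.
training_kwargs = {
    "learning_rate": 2.5e-5,
    "per_device_train_batch_size": 2,
    "per_device_eval_batch_size": 2,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "optim": "paged_adamw_32bit",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.03,
    "max_steps": 1000,
}

NUM_DEVICES = 1  # assumption: single device, which matches 2 * 2 = 4

# Examples consumed per optimizer step.
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * NUM_DEVICES
)

# Number of warmup steps implied by the warmup ratio.
warmup_steps = math.ceil(
    training_kwargs["max_steps"] * training_kwargs["warmup_ratio"]
)


def cosine_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay to zero."""
    base_lr = training_kwargs["learning_rate"]
    total_steps = training_kwargs["max_steps"]
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print("effective batch size:", effective_batch_size)
print("warmup steps:", warmup_steps)
for step in (0, warmup_steps, 500, 1000):
    print(step, cosine_lr(step))
```

If the mapping holds, the dictionary could be passed as `TrainingArguments(output_dir=..., **training_kwargs)`; the output directory and the evaluation interval are not recorded in this card.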

Training Loss	Epoch	Step	Validation Loss
1.8003	0.0029	50	1.7726
1.6540	0.0059	100	1.6940
1.6105	0.0088	150	1.5665
1.4474	0.0118	200	1.4721
1.4677	0.0147	250	1.4386
1.3501	0.0176	300	1.4294
1.4011	0.0206	350	1.4220
1.4403	0.0235	400	1.4087
1.5017	0.0265	450	1.4170
1.2628	0.0294	500	1.3992
1.4797	0.0324	550	1.3977
1.4455	0.0353	600	1.3886
1.4230	0.0382	650	1.3895
1.3616	0.0412	700	1.3945
1.3128	0.0441	750	1.3885
1.3983	0.0471	800	1.3946
1.3529	0.0500	850	1.3834
1.3314	0.0529	900	1.3897
1.4412	0.0559	950	1.3831
1.3050	0.0588	1000	1.3893
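
As a rough way to read the table, the sketch below parses the logged rows, finds the step with the lowest validation loss, converts the final validation loss to perplexity, and estimates the training-set size from the epoch fraction. It assumes the reported loss is mean token-level cross-entropy in nats and that each step consumes 4 examples (the total_train_batch_size listed above). The variable names are illustrative and the estimates are approximate.

```python
import math

# Rows copied from the table above:
# training loss, epoch, step, validation loss.
LOG = """\
1.8003 0.0029 50 1.7726
1.654 0.0059 100 1.6940
1.6105 0.0088 150 1.5665
1.4474 0.0118 200 1.4721
1.4677 0.0147 250 1.4386
1.3501 0.0176 300 1.4294
1.4011 0.0206 350 1.4220
1.4403 0.0235 400 1.4087
1.5017 0.0265 450 1.4170
1.2628 0.0294 500 1.3992
1.4797 0.0324 550 1.3977
1.4455 0.0353 600 1.3886
1.423 0.0382 650 1.3895
1.3616 0.0412 700 1.3945
1.3128 0.0441 750 1.3885
1.3983 0.0471 800 1.3946
1.3529 0.05 850 1.3834
1.3314 0.0529 900 1.3897
1.4412 0.0559 950 1.3831
1.305 0.0588 1000 1.3893
"""

EFFECTIVE_BATCH_SIZE = 4  # total_train_batch_size from the hyperparameter list

# Each row becomes (train_loss, epoch, step, val_loss) as floats.
rows = [tuple(float(x) for x in line.split()) for line in LOG.strip().splitlines()]

# Checkpoint with the lowest validation loss.
best = min(rows, key=lambda r: r[3])
best_step, best_val_loss = int(best[2]), best[3]

# Final row of the log.
final_train_loss, final_epoch, final_step, final_val_loss = rows[-1]

# Perplexity, assuming the loss is mean cross-entropy in nats.
final_perplexity = math.exp(final_val_loss)

# Examples seen divided by the fraction of an epoch completed.
estimated_train_examples = final_step * EFFECTIVE_BATCH_SIZE / final_epoch

print("best step:", best_step, "validation loss:", best_val_loss)
print("final perplexity:", round(final_perplexity, 3))
print("estimated training examples:", round(estimated_train_examples))
```

The lowest validation loss falls at step 950 rather than at the final step, and the loss has been nearly flat since about step 600, while the run covered under 6% of one epoch.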