llama3-8b-instruct-journal-finetune

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct. The training dataset was not recorded (the card lists it as None). It achieves the following results on the evaluation set:

Loss: 2.0299
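
A loss value is often easier to interpret as a perplexity. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats, which the card does not state; the function name `perplexity` is illustrative only.

```python
import math

def perplexity(mean_cross_entropy):
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)

# Evaluation loss reported in this card; the unit (nats per token) is an assumption.
eval_loss = 2.0299
print(f"perplexity ~ {perplexity(eval_loss):.2f}")
```

Under that assumption the evaluation loss corresponds to a perplexity of roughly 7.6.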

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2.5e-05
train_batch_size: 2
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 1
training_steps: 500
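
To see what these settings imply for the learning rate over the run, here is a minimal sketch of a linear schedule with warmup. It assumes `lr_scheduler_type: linear` means the usual shape (linear ramp from 0 during warmup, then linear decay to 0 at the last step); the function name `linear_schedule_lr` is hypothetical and is not taken from the training code.

```python
def linear_schedule_lr(step, base_lr=2.5e-05, warmup_steps=1, total_steps=500):
    """Learning rate after `step` optimizer steps: linear warmup, then linear decay."""
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to base_lr.
        scale = step / max(1, warmup_steps)
    else:
        # Decay phase: fall linearly from base_lr to 0 at total_steps.
        scale = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * scale

# Defaults mirror the hyperparameters listed above.
for step in (0, 1, 250, 500):
    print(step, linear_schedule_lr(step))
```

With a single warmup step, the rate reaches its peak almost immediately and spends essentially the whole run decaying.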

Training results

Training Loss	Epoch	Step	Validation Loss
2.9002	2.0833	25	1.8777
1.1645	4.1667	50	1.6214
0.4078	6.25	75	1.7856
0.2373	8.3333	100	1.8434
0.2209	10.4167	125	1.7767
0.1953	12.5	150	1.8293
0.1755	14.5833	175	1.7663
0.1893	16.6667	200	1.8726
0.1621	18.75	225	1.9366
0.1657	20.8333	250	1.9146
0.1593	22.9167	275	1.9225
0.156	25.0	300	1.9411
0.1549	27.0833	325	1.9504
0.1525	29.1667	350	1.9608
0.1511	31.25	375	1.9924
0.1494	33.3333	400	1.9878
0.1488	35.4167	425	2.0089
0.1479	37.5	450	2.0089
0.1448	39.5833	475	2.0233
0.1447	41.6667	500	2.0299
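
The table can be read programmatically to find where validation loss was lowest and how many optimizer steps make up one epoch. The sketch below copies the rows from the table; the helper names `best_checkpoint` and `steps_per_epoch` are illustrative, and the steps-per-epoch figure is an estimate derived from the logged epoch values.

```python
# (training_loss, epoch, step, validation_loss), copied from the table above
rows = [
    (2.9002, 2.0833, 25, 1.8777), (1.1645, 4.1667, 50, 1.6214),
    (0.4078, 6.25, 75, 1.7856), (0.2373, 8.3333, 100, 1.8434),
    (0.2209, 10.4167, 125, 1.7767), (0.1953, 12.5, 150, 1.8293),
    (0.1755, 14.5833, 175, 1.7663), (0.1893, 16.6667, 200, 1.8726),
    (0.1621, 18.75, 225, 1.9366), (0.1657, 20.8333, 250, 1.9146),
    (0.1593, 22.9167, 275, 1.9225), (0.156, 25.0, 300, 1.9411),
    (0.1549, 27.0833, 325, 1.9504), (0.1525, 29.1667, 350, 1.9608),
    (0.1511, 31.25, 375, 1.9924), (0.1494, 33.3333, 400, 1.9878),
    (0.1488, 35.4167, 425, 2.0089), (0.1479, 37.5, 450, 2.0089),
    (0.1448, 39.5833, 475, 2.0233), (0.1447, 41.6667, 500, 2.0299),
]

def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r[3])

def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    _, epoch, step, _ = rows[-1]
    return round(step / epoch)

best = best_checkpoint(rows)
print("lowest validation loss:", best[3], "at step", best[2])
print("final validation loss:", rows[-1][3], "at step", rows[-1][2])
print("estimated steps per epoch:", steps_per_epoch(rows))
```

Training loss keeps falling while validation loss bottoms out early and then drifts upward, a pattern usually read as overfitting. If intermediate checkpoints were saved, an earlier one may generalize better than the final one.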

Framework versions

PEFT 0.11.1
Transformers 4.41.1
Pytorch 2.3.0+cu121
Datasets 2.19.1
Tokenizers 0.19.1
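
To check a local environment against the versions listed above, something like the following could be used. It is a sketch only: the distribution names are assumptions (in particular, "Pytorch" is assumed to map to the `torch` distribution), and `version_mismatches` is a hypothetical helper, not part of any of these libraries.

```python
from importlib import metadata

# Versions listed in the card; distribution names are assumed, not stated there.
PINNED = {
    "peft": "0.11.1",
    "transformers": "4.41.1",
    "torch": "2.3.0+cu121",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}

def _installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None

def version_mismatches(pinned, lookup=_installed_version):
    """Map each package whose installed version differs to (expected, found)."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches

for name, (expected, found) in version_mismatches(PINNED).items():
    print(f"{name}: card lists {expected}, found {found or 'not installed'}")
```

An exact match is not necessarily required to load the model; the comparison only flags where an environment differs from the one recorded here.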

Model repository: pmrster/llama3-8b-instruct-journal-finetune