See axolotl config

axolotl version: `0.4.1`

```yaml
base_model: mistralai/Mistral-7B-Instruct-v0.2
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: false
load_in_4bit: true
strict: false

chat_template: chatml
datasets:
  - path: Howard881010/climate-cal
    type: alpaca
    train_on_split: train
dataset_prepared_path:
val_set_size: 0.05
output_dir: ./finetune/output/climate-cal

adapter: qlora
lora_model_dir:

sequence_len: 1900
sample_packing: false
pad_to_sequence_len: true

lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules:
lora_target_linear: true
lora_fan_in_fan_out:

wandb_project: finetune
wandb_entity:
wandb_watch:
wandb_name: climate-cal
wandb_log_model:

gradient_accumulation_steps: 2
micro_batch_size: 1
num_epochs: 10
optimizer: paged_adamw_32bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

eval_sample_packing: false
warmup_steps: 10
evals_per_epoch: 4
eval_table_size:
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
seed: 42
```
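For readers who want to reproduce the setup outside axolotl, the QLoRA section of the config maps roughly onto the following transformers/peft objects. This is a minimal sketch, not the exact code axolotl runs; in particular, the explicit `target_modules` list is an assumption that approximates `lora_target_linear: true` for Mistral's architecture, and the compute dtype mirrors `bf16: auto` on Ampere-or-newer GPUs.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

base = "mistralai/Mistral-7B-Instruct-v0.2"

# 4-bit quantization, mirroring load_in_4bit: true in the config above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # corresponds to bf16: auto
)

model = AutoModelForCausalLM.from_pretrained(
    base, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base)

# LoRA adapter matching lora_r / lora_alpha / lora_dropout above.
# lora_target_linear: true tells axolotl to target all linear layers;
# the module list below is an assumption approximating that for Mistral.
lora_config = LoraConfig(
    r=32,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA weights are trainable
```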
# finetune/output/climate-cal
This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.2 on the Howard881010/climate-cal dataset (see the config above). It achieves the following results on the evaluation set:

- Loss: 0.0004
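A minimal inference sketch follows, assuming the LoRA adapter weights are published under the repo id `Rose-STL-Lab/climate-cal` (adjust the id if the adapter lives elsewhere; the prompt and generation settings are placeholders, not from this card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

# Attach the fine-tuned LoRA adapter on top of the frozen base weights.
# Repo id assumed; replace with wherever the adapter is stored.
model = PeftModel.from_pretrained(base, "Rose-STL-Lab/climate-cal")

prompt = "..."  # placeholder: supply a prompt in the format used during training
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```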
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (the batch-size totals are derived in the sketch after this list):
- learning_rate: 0.0002
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- total_eval_batch_size: 4
- optimizer: paged AdamW (32-bit) with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 10
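The two totals above follow directly from the per-device values; a quick sanity check:

```python
# How the effective batch sizes above are derived.
micro_batch_size = 1             # train_batch_size per device
eval_batch_size = 1              # eval batch size per device
gradient_accumulation_steps = 2
num_devices = 4

total_train_batch_size = micro_batch_size * gradient_accumulation_steps * num_devices
total_eval_batch_size = eval_batch_size * num_devices  # no accumulation at eval time

assert total_train_batch_size == 8
assert total_eval_batch_size == 4
```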
### Training results
| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.7597        | 0.0019 | 1    | 1.8445          |
| 1.0222        | 0.2498 | 133  | 1.0411          |
| 0.6943        | 0.4995 | 266  | 0.6397          |
| 0.5725        | 0.7493 | 399  | 0.3519          |
| 0.2125        | 0.9991 | 532  | 0.1868          |
| 0.0803        | 1.2488 | 665  | 0.1336          |
| 0.0509        | 1.4986 | 798  | 0.0889          |
| 0.0249        | 1.7484 | 931  | 0.0569          |
| 0.0614        | 1.9981 | 1064 | 0.0485          |
| 0.0256        | 2.2479 | 1197 | 0.0429          |
| 0.0383        | 2.4977 | 1330 | 0.0318          |
| 0.0122        | 2.7474 | 1463 | 0.0266          |
| 0.0144        | 2.9972 | 1596 | 0.0204          |
| 0.0119        | 3.2469 | 1729 | 0.0161          |
| 0.008         | 3.4967 | 1862 | 0.0127          |
| 0.0074        | 3.7465 | 1995 | 0.0089          |
| 0.0013        | 3.9962 | 2128 | 0.0079          |
| 0.0028        | 4.2460 | 2261 | 0.0068          |
| 0.0032        | 4.4958 | 2394 | 0.0052          |
| 0.0043        | 4.7455 | 2527 | 0.0046          |
| 0.0005        | 4.9953 | 2660 | 0.0027          |
| 0.0006        | 5.2451 | 2793 | 0.0024          |
| 0.0002        | 5.4948 | 2926 | 0.0015          |
| 0.0004        | 5.7446 | 3059 | 0.0014          |
| 0.0002        | 5.9944 | 3192 | 0.0007          |
| 0.0002        | 6.2441 | 3325 | 0.0007          |
| 0.0003        | 6.4939 | 3458 | 0.0006          |
| 0.0002        | 6.7437 | 3591 | 0.0005          |
| 0.0003        | 6.9934 | 3724 | 0.0005          |
| 0.0002        | 7.2432 | 3857 | 0.0005          |
| 0.0002        | 7.4930 | 3990 | 0.0004          |
| 0.0003        | 7.7427 | 4123 | 0.0004          |
| 0.0008        | 7.9925 | 4256 | 0.0004          |
| 0.0002        | 8.2423 | 4389 | 0.0004          |
| 0.0002        | 8.4920 | 4522 | 0.0004          |
| 0.0002        | 8.7418 | 4655 | 0.0004          |
| 0.0002        | 8.9915 | 4788 | 0.0004          |
| 0.0003        | 9.2413 | 4921 | 0.0004          |
| 0.0002        | 9.4911 | 5054 | 0.0004          |
| 0.0002        | 9.7408 | 5187 | 0.0004          |
| 0.0005        | 9.9906 | 5320 | 0.0004          |
### Framework versions
- PEFT 0.11.1
- Transformers 4.41.1
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1