See axolotl config

axolotl version: 0.4.1

base_model: codellama/CodeLlama-7b-hf
base_model_config: codellama/CodeLlama-7b-hf
model_type: LlamaForCausalLM
tokenizer_type: LlamaTokenizer
is_llama_derived_model: true
hub_model_id: EvolCodeLlama-7b

load_in_8bit: false
load_in_4bit: true
strict: false

datasets:
  - path: mlabonne/Evol-Instruct-Python-1k
    type: alpaca
dataset_prepared_path: last_run_prepared
val_set_size: 0.02
output_dir: ./qlora-out

adapter: qlora
lora_model_dir:

sequence_len: 2048
sample_packing: true

lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules:
lora_target_linear: true
lora_fan_in_fan_out:

wandb_project: axolotl
wandb_entity:
wandb_watch:
wandb_run_id:
wandb_log_model:

gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 3
optimizer: paged_adamw_32bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: true
fp16: false
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 100
eval_steps: 0.01
save_strategy: epoch
save_steps:
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
special_tokens:
  bos_token: "<s>"
  eos_token: "</s>"
  unk_token: "<unk>"
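
Two quantities implied by the config can be worked out by hand. The sketch below assumes the standard LoRA scaling factor of lora_alpha / lora_r, and the Transformers Trainer convention that an eval_steps value below 1 is a fraction of the total training steps, rounded up. The total of 352 steps is not in the config; it is taken from the last row of the training results table further down.

```python
import math

# Values copied from the axolotl config above.
lora_r = 32
lora_alpha = 16
eval_steps = 0.01  # below 1, so treated as a fraction of total steps

# Assumption: 352 optimizer steps in total, read from the final row of
# the training results table rather than from the config.
total_steps = 352

# Standard LoRA scaling applied to the adapter update
# (assumes rank-stabilized LoRA is not enabled).
lora_scaling = lora_alpha / lora_r

# Fractional eval_steps converted to a whole number of steps.
eval_interval = math.ceil(total_steps * eval_steps)

print(f"LoRA scaling: {lora_scaling}")
print(f"Evaluate every {eval_interval} steps")
```

An interval of 4 steps agrees with the spacing of the evaluation rows in the results table (steps 4, 8, 12, and so on).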

EvolCodeLlama-7b

This model is a fine-tuned version of codellama/CodeLlama-7b-hf, trained on the mlabonne/Evol-Instruct-Python-1k dataset listed in the axolotl config above. It achieves the following results on the evaluation set:

Loss: 0.3796

Model description

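Per the axolotl config above, this is a QLoRA fine-tune of codellama/CodeLlama-7b-hf: the base model is loaded in 4-bit, and LoRA adapters (rank 32, alpha 16, dropout 0.05) are trained on all linear layers. More information needed.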

Intended uses & limitations

More information needed

Training and evaluation data

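Per the axolotl config above, training used the mlabonne/Evol-Instruct-Python-1k dataset in alpaca format, with 2% of the examples held out for evaluation (val_set_size: 0.02). More information needed.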

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 2
eval_batch_size: 2
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 8
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 100
num_epochs: 3
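
As a rough check on these values, the sketch below recomputes the total train batch size and the learning rate at a given step. It assumes a single device (consistent with 2 x 4 = 8), a linear warmup followed by a half-cosine decay to zero, and 352 total steps as shown in the last row of the training results table. The function names are illustrative and not part of axolotl or Transformers.

```python
import math


def effective_batch_size(micro_batch_size, grad_accum_steps, num_devices=1):
    """Samples contributing to one optimizer step."""
    return micro_batch_size * grad_accum_steps * num_devices


def lr_at_step(step, base_lr=2e-4, warmup_steps=100, total_steps=352):
    """Linear warmup to base_lr, then half-cosine decay to zero.

    total_steps=352 is an assumption taken from the results table.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch_size(2, 4))
for step in (0, 50, 100, 226, 352):
    print(step, lr_at_step(step))
```

Under these assumptions the learning rate peaks at 0.0002 at step 100, which falls late in the first epoch, and decays to zero by step 352.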

Training results

Training Loss	Epoch	Step	Validation Loss
0.4828	0.0086	1	0.4975
0.4056	0.0343	4	0.4976
0.5046	0.0685	8	0.4973
0.3969	0.1028	12	0.4966
0.3404	0.1370	16	0.4947
0.4645	0.1713	20	0.4896
0.2892	0.2056	24	0.4789
0.2616	0.2398	28	0.4616
0.2586	0.2741	32	0.4430
0.3147	0.3084	36	0.4267
0.3686	0.3426	40	0.4158
0.2935	0.3769	44	0.4084
0.2419	0.4111	48	0.4026
0.2791	0.4454	52	0.3970
0.2381	0.4797	56	0.3922
0.2407	0.5139	60	0.3888
0.2686	0.5482	64	0.3872
0.3673	0.5824	68	0.3880
0.2665	0.6167	72	0.3848
0.3259	0.6510	76	0.3830
0.236	0.6852	80	0.3801
0.2301	0.7195	84	0.3786
0.3573	0.7537	88	0.3766
0.2409	0.7880	92	0.3745
0.3192	0.8223	96	0.3744
0.2652	0.8565	100	0.3720
0.2341	0.8908	104	0.3712
0.3651	0.9251	108	0.3709
0.1667	0.9593	112	0.3714
0.2755	0.9936	116	0.3699
0.2906	1.0254	120	0.3712
0.2079	1.0593	124	0.3708
0.3429	1.0932	128	0.3708
0.3296	1.1271	132	0.3721
0.2231	1.1610	136	0.3707
0.2098	1.1949	140	0.3686
0.2918	1.2288	144	0.3711
0.3803	1.2627	148	0.3676
0.2619	1.2966	152	0.3662
0.2261	1.3305	156	0.3679
0.1954	1.3644	160	0.3689
0.2183	1.3983	164	0.3677
0.2459	1.4322	168	0.3674
0.1979	1.4661	172	0.3669
0.2175	1.5	176	0.3653
0.26	1.5339	180	0.3652
0.2195	1.5678	184	0.3645
0.3344	1.6017	188	0.3645
0.1769	1.6356	192	0.3643
0.1829	1.6695	196	0.3639
0.2343	1.7034	200	0.3649
0.2568	1.7373	204	0.3650
0.1749	1.7712	208	0.3640
0.2118	1.8051	212	0.3628
0.2252	1.8390	216	0.3611
0.2301	1.8729	220	0.3602
0.1884	1.9068	224	0.3602
0.2023	1.9407	228	0.3600
0.2428	1.9746	232	0.3587
0.2413	2.0064	236	0.3583
0.2015	2.0407	240	0.3620
0.2131	2.0749	244	0.3728
0.1768	2.1092	248	0.3834
0.1615	2.1435	252	0.3810
0.1598	2.1777	256	0.3775
0.171	2.2120	260	0.3763
0.1973	2.2463	264	0.3759
0.1407	2.2805	268	0.3758
0.1998	2.3148	272	0.3771
0.1267	2.3490	276	0.3773
0.1526	2.3833	280	0.3782
0.1547	2.4176	284	0.3776
0.1439	2.4518	288	0.3768
0.1565	2.4861	292	0.3757
0.2113	2.5203	296	0.3767
0.1768	2.5546	300	0.3776
0.2366	2.5889	304	0.3792
0.1397	2.6231	308	0.3801
0.3598	2.6574	312	0.3805
0.1296	2.6916	316	0.3803
0.1344	2.7259	320	0.3805
0.2095	2.7602	324	0.3804
0.1646	2.7944	328	0.3800
0.1749	2.8287	332	0.3799
0.1597	2.8630	336	0.3800
0.1602	2.8972	340	0.3799
0.1786	2.9315	344	0.3797
0.1692	2.9657	348	0.3797
0.1887	3.0	352	0.3796
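
In the table, validation loss reaches its minimum of 0.3583 at step 236 and then rises to 0.3796 by step 352, while training loss keeps falling. A minimal sketch for locating that minimum is shown below. It parses only a short excerpt of rows copied from the table; the EvalRow type and parse_rows helper are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float


# Excerpt of rows copied from the table above, in the same column order:
# Training Loss, Epoch, Step, Validation Loss.
EXCERPT = """
0.4828 0.0086 1 0.4975
0.2755 0.9936 116 0.3699
0.2428 1.9746 232 0.3587
0.2413 2.0064 236 0.3583
0.1768 2.1092 248 0.3834
0.1887 3.0 352 0.3796
"""


def parse_rows(text):
    """Parse whitespace-separated table rows into EvalRow tuples."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss = line.split()
        rows.append(
            EvalRow(float(train_loss), float(epoch), int(step), float(val_loss))
        )
    return rows


rows = parse_rows(EXCERPT)
best = min(rows, key=lambda r: r.val_loss)
final = rows[-1]

print(f"best val loss {best.val_loss} at step {best.step} (epoch {best.epoch})")
print(f"final val loss {final.val_loss} at step {final.step}")
```

The same approach applies to the full table if all rows are pasted into the excerpt string.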

Framework versions

PEFT 0.13.0
Transformers 4.45.0
Pytorch 2.3.1+cu121
Datasets 2.21.0
Tokenizers 0.20.0
