lucyknada committed
Commit: 5c5b217
Parent: 7d3dcdd

Update README.md

Files changed (1): README.md +97 -0
README.md CHANGED
@@ -46,6 +46,103 @@ To create a working GGUF file, make the following adjustments:
 
 These modifications should allow you to use the model with llama.cpp, albeit with the mentioned context limitation.
 
+## axolotl config
+
+<details><summary>See axolotl config</summary>
+
+axolotl version: `0.4.1`
+```yaml
+base_model: IntervitensInc/Llama-3.1-Minitron-4B-Width-Base-chatml
+model_type: AutoModelForCausalLM
+tokenizer_type: AutoTokenizer
+
+load_in_8bit: false
+load_in_4bit: false
+strict: false
+
+datasets:
+  - path: anthracite-org/Gryphe-3.5-16k-Subset
+    type: sharegpt
+    conversation: chatml
+  - path: Epiculous/Synthstruct-Gens-v1-Filtered-n-Cleaned
+    type: sharegpt
+    conversation: chatml
+  - path: anthracite-org/Stheno-Data-Filtered
+    type: sharegpt
+    conversation: chatml
+  - path: Epiculous/SynthRP-Gens-v1-Filtered-n-Cleaned
+    type: sharegpt
+    conversation: chatml
+  - path: lodrick-the-lafted/NopmWritingStruct
+    type: sharegpt
+    conversation: chatml
+  - path: anthracite-org/kalo-opus-instruct-22k-no-refusal
+    type: sharegpt
+    conversation: chatml
+
+chat_template: chatml
+
+val_set_size: 0.01
+output_dir: ./outputs/out
+
+adapter:
+lora_r:
+lora_alpha:
+lora_dropout:
+lora_target_linear:
+
+sequence_len: 16384
+# sequence_len: 32768
+sample_packing: true
+eval_sample_packing: false
+pad_to_sequence_len: true
+
+wandb_project:
+wandb_entity:
+wandb_watch:
+wandb_name:
+wandb_log_model:
+
+gradient_accumulation_steps: 32
+micro_batch_size: 1
+num_epochs: 2
+optimizer: adamw_bnb_8bit
+lr_scheduler: cosine
+learning_rate: 0.00002
+weight_decay: 0.05
+
+train_on_inputs: false
+group_by_length: false
+bf16: auto
+fp16:
+tf32: true
+
+gradient_checkpointing: true
+early_stopping_patience:
+resume_from_checkpoint:
+local_rank:
+logging_steps: 1
+xformers_attention:
+flash_attention: true
+
+warmup_ratio: 0.1
+evals_per_epoch: 4
+eval_table_size:
+eval_max_new_tokens: 128
+saves_per_epoch: 1
+
+debug:
+deepspeed:
+fsdp:
+fsdp_config:
+
+special_tokens:
+  pad_token: <|finetune_right_pad_id|>
+```
+
+</details><br>
+
 ## Credits
 
 - [anthracite-org/Stheno-Data-Filtered](https://huggingface.co/datasets/anthracite-org/Stheno-Data-Filtered)
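The config sets `conversation: chatml` on every dataset and `chat_template: chatml`, meaning each ShareGPT-style conversation is rendered with ChatML role markers before tokenization. A minimal sketch of that rendering convention (an illustration of the ChatML format only, not axolotl's actual implementation; `to_chatml` is a hypothetical helper):

```python
# Sketch of the ChatML rendering that `conversation: chatml` refers to.
# Each turn is wrapped in <|im_start|>{role} ... <|im_end|> markers.
def to_chatml(turns):
    """Render (role, text) pairs into a ChatML-formatted string."""
    return "\n".join(
        f"<|im_start|>{role}\n{text}<|im_end|>" for role, text in turns
    )

example = to_chatml([("user", "Hi"), ("assistant", "Hello!")])
print(example)
# <|im_start|>user
# Hi<|im_end|>
# <|im_start|>assistant
# Hello!<|im_end|>
```

With `train_on_inputs: false`, loss is computed only on the assistant turns of such rendered conversations.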
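The batch geometry implied by the hyperparameters above can be worked out directly (values copied from the YAML; the arithmetic is the standard effective-batch calculation, not anything axolotl-specific, and is per GPU):

```python
# Effective batch geometry from the config's training hyperparameters.
micro_batch_size = 1              # sequences per forward pass
gradient_accumulation_steps = 32  # forward passes per optimizer step
sequence_len = 16384              # packed sequence length

# Sequences contributing to each optimizer step (per GPU):
effective_batch = micro_batch_size * gradient_accumulation_steps  # 32

# With sample_packing: true, each sequence slot is filled close to
# sequence_len, so tokens per optimizer step is roughly:
tokens_per_step = effective_batch * sequence_len  # 524288

print(effective_batch, tokens_per_step)  # 32 524288
```

The commented-out `# sequence_len: 32768` line would double the per-step token count while keeping the same number of sequences per optimizer step.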