liuylhf committed
Commit 2cb80d3
1 Parent(s): 04367ae

Model save

Files changed (1): README.md (+128 lines)

README.md ADDED
---
license: apache-2.0
library_name: peft
tags:
- generated_from_trainer
base_model: mistralai/Mixtral-8x7B-Instruct-v0.1
model-index:
- name: empower-functions-clean-data-one-more-functions
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
<details><summary>See axolotl config</summary>

axolotl version: `0.4.0`
```yaml
adapter: qlora
base_model: mistralai/Mixtral-8x7B-Instruct-v0.1
bf16: true
chat_template: inst
dataset_prepared_path: last_run_prepared
datasets:
- conversation: mistral
  path: 659f8b7bb7c243ab879f8bc17876ce4a/data/with_function_response/more_functions/one_more_function/function_used_training.jsonl
  type: sharegpt
- conversation: mistral
  path: 659f8b7bb7c243ab879f8bc17876ce4a/data/with_function_response/original_clean/function_not_used_training.jsonl
  type: sharegpt
debug: null
eval_max_new_tokens: 256
eval_steps: 0.05
eval_table_size: null
flash_attention: true
fp16: false
fsdp: null
fsdp_config: null
gradient_accumulation_steps: 4
gradient_checkpointing: true
group_by_length: false
hub_model_id: liuylhf/empower-functions-clean-data-one-more-functions
learning_rate: 0.0002
load_in_4bit: true
load_in_8bit: false
logging_steps: 1
lora_alpha: 64
lora_dropout: 0.05
lora_model_dir: null
lora_r: 32
lora_target_modules:
- q_proj
- k_proj
- v_proj
- o_proj
loss_watchdog_patience: 3
loss_watchdog_threshold: 5.0
lr_scheduler: cosine
micro_batch_size: 2
model_config:
  output_router_logits: true
model_type: AutoModelForCausalLM
num_epochs: 1
optimizer: paged_adamw_8bit
output_dir: 659f8b7bb7c243ab879f8bc17876ce4a/model
pad_to_sequence_len: true
sample_packing: true
save_steps: 0.1
sequence_len: 4096
strict: false
tf32: false
tokenizer_type: LlamaTokenizer
train_on_inputs: false
trust_remote_code: true
val_set_size: 0.01
wandb_log_model: end
wandb_name: more-tools
wandb_project: function-call
warmup_steps: 10
weight_decay: 0.0

```

</details><br>

# empower-functions-clean-data-one-more-functions

This model is a fine-tuned version of [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) on the two ShareGPT-format function-call datasets listed in the axolotl config above.
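
The snippet below is a minimal usage sketch, not part of the original card. It assumes the adapter published at `liuylhf/empower-functions-clean-data-one-more-functions` (the `hub_model_id` in the config) is applied on top of the base model loaded in 4-bit, mirroring the training-time quantization. The prompt and generation settings are illustrative only; the exact function-calling prompt format used in training is not documented here.

```python
# Illustrative inference sketch (not from the original card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"
adapter_id = "liuylhf/empower-functions-clean-data-one-more-functions"

# Mirrors load_in_4bit: true and bf16: true from the axolotl config.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
# Attach the QLoRA adapter weights on top of the quantized base model.
model = PeftModel.from_pretrained(model, adapter_id)

# Standard Mixtral-Instruct prompt format; the training-time function-call
# formatting is not specified in this card.
prompt = "[INST] What is the weather in San Francisco today? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```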

## Model description

This is a QLoRA (PEFT) adapter trained with axolotl on top of Mixtral-8x7B-Instruct-v0.1. The adapter targets the attention projections (q_proj, k_proj, v_proj, o_proj) with r=32, alpha=64, and dropout 0.05; see the config above for the full setup.

## Intended uses & limitations

More information needed

## Training and evaluation data

Per the axolotl config above, training used two ShareGPT-format JSONL conversation datasets: one where a function response is used in the conversation and one where it is not (judging by the file names). 1% of the data was held out for evaluation (val_set_size: 0.01).

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- total_eval_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 1
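
The total train batch size of 16 follows from micro_batch_size 2 x 2 GPUs x gradient_accumulation_steps 4. As a rough illustration only (training itself was driven by axolotl, not by this code), the LoRA hyperparameters above correspond to a PEFT `LoraConfig` along these lines:

```python
# Illustrative mapping of the config values onto peft's LoraConfig.
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,               # lora_r
    lora_alpha=64,      # lora_alpha
    lora_dropout=0.05,  # lora_dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # lora_target_modules
    task_type="CAUSAL_LM",
)

# Effective (total) train batch size reported above:
# micro_batch_size * num_devices * gradient_accumulation_steps
total_train_batch_size = 2 * 2 * 4  # = 16
```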

### Framework versions

- PEFT 0.9.0
- Transformers 4.39.0.dev0
- Pytorch 2.2.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.0