liuylhf committed
Commit cc83be7
1 Parent(s): 6f1b74e

Training in progress, step 520

Files changed (2):
  1. README.md +7 -21
  2. adapter_model.safetensors +1 -1
README.md CHANGED
@@ -2,11 +2,10 @@
 license: apache-2.0
 library_name: peft
 tags:
-- axolotl
 - generated_from_trainer
 base_model: mistralai/Mixtral-8x7B-Instruct-v0.1
 model-index:
-- name: empower-functions-smaller-training
+- name: empower-functions-more-tools
   results: []
 ---
@@ -28,7 +27,7 @@ datasets:
   path: ./data/with_function_response/function_not_used_training_small.jsonl
   type: sharegpt
 - conversation: mistral
-  path: ./data/with_function_response/function_used_training_small.jsonl
+  path: ./data/with_function_response/more_functions/function_used_training_small.jsonl
   type: sharegpt
 debug: null
 eval_max_new_tokens: 256
@@ -41,7 +40,7 @@ fsdp_config: null
 gradient_accumulation_steps: 4
 gradient_checkpointing: true
 group_by_length: false
-hub_model_id: liuylhf/empower-functions-smaller-training
+hub_model_id: liuylhf/empower-functions-more-tools
 learning_rate: 0.0002
 load_in_4bit: true
 load_in_8bit: false
@@ -62,7 +61,7 @@ micro_batch_size: 2
 model_config:
   output_router_logits: true
 model_type: AutoModelForCausalLM
-num_epochs: 4
+num_epochs: 2
 optimizer: paged_adamw_8bit
 output_dir: 2af0968cad514d6e9d5fb8448230e1c6/model
 pad_to_sequence_len: true
@@ -85,11 +84,9 @@ weight_decay: 0.0
 
 </details><br>
 
-# empower-functions-smaller-training
+# empower-functions-more-tools
 
-This model is a fine-tuned version of [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) on the None dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.0928
+This model is a fine-tuned version of [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) on an unknown dataset.
 
 ## Model description
 
@@ -120,18 +117,7 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 10
-- num_epochs: 4
-
-### Training results
-
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 2.2084 | 0.01 | 1 | 2.1525 |
-| 0.0991 | 0.8 | 75 | 0.1072 |
-| 0.0883 | 1.59 | 150 | 0.0976 |
-| 0.0808 | 2.38 | 225 | 0.0940 |
-| 0.0679 | 3.16 | 300 | 0.0928 |
-
+- num_epochs: 2
 
 ### Framework versions
 
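The config in the diff above sets micro_batch_size: 2 and gradient_accumulation_steps: 4, so the optimizer sees their product as the effective per-device batch size. A minimal sketch of that arithmetic (values taken from the config; any multi-GPU scaling factor is not shown in this commit and is omitted):

```python
# Values from the axolotl config in the README diff above.
micro_batch_size = 2             # examples per forward/backward pass
gradient_accumulation_steps = 4  # accumulated passes per optimizer step

# Gradients are accumulated across the micro-batches, so each optimizer
# step effectively covers this many examples on one device.
effective_batch_per_device = micro_batch_size * gradient_accumulation_steps
print(effective_batch_per_device)  # → 8
```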
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8ef1e12f488a9a6042827a605ebd24a9438710a3e375937d5478e5beb5c7c7a5
+oid sha256:b02b6f1d8710610647cd2407571320b616b98d42734899b1693475929cee3f9c
 size 109086416
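The adapter_model.safetensors entry above is a Git LFS pointer file, not the weights themselves: only the sha256 oid changes between commits, while the size stays 109086416 bytes. A small sketch of reading such a pointer into its key/value fields (parse_lfs_pointer is a hypothetical helper; the three-line pointer format is as shown in the diff):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>"; split on the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The new-side pointer from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:b02b6f1d8710610647cd2407571320b616b98d42734899b1693475929cee3f9c
size 109086416
"""
info = parse_lfs_pointer(pointer)
print(info["size"])  # → 109086416
```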