NeuralNovel committed
Commit debb14e
1 Parent(s): 84d69bb

Delete .ipynb_checkpoints

.ipynb_checkpoints/README-checkpoint.md DELETED
@@ -1,154 +0,0 @@
- ---
- license: apache-2.0
- base_model: mistralai/Mistral-7B-v0.1
- tags:
- - generated_from_trainer
- model-index:
- - name: out
-   results: []
- ---
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
- <details><summary>See axolotl config</summary>
-
- axolotl version: `0.4.0`
- ```yaml
- base_model: mistralai/Mistral-7B-v0.1
- model_type: AutoModelForCausalLM
- tokenizer_type: AutoTokenizer
- is_mistral_derived_model: true
-
- load_in_8bit: false
- load_in_4bit: false
- strict: false
-
- datasets:
- - path: practical-dreamer/RPGPT_PublicDomain-alpaca
-   type: alpaca
-   format: "[INST] {instruction} [/INST]"
-   no_input_format: "[INST] {instruction} [/INST]"
-
- datasets:
- - path: shuyuej/metamath_gsm8k
-   type: jeopardy
-   format: "[INST] {instruction} [/INST]"
-   no_input_format: "[INST] {instruction} [/INST]"
-
- datasets:
- - path: NeuralNovel/Neural-DPO
-   type:
-     system_prompt: ""
-     field_system: system
-     field_instruction: chosen
-     field_output: chosen
-     format: "[INST] {instruction} [/INST]"
-     no_input_format: "[INST] {instruction} [/INST]"
-
- dataset_prepared_path:
- val_set_size: 0.05
- output_dir: ./out
-
- sequence_len: 8192
- sample_packing: false
- pad_to_sequence_len: true
- eval_sample_packing: false
-
- wandb_project:
- wandb_entity:
- wandb_watch:
- wandb_name:
- wandb_log_model:
-
- gradient_accumulation_steps: 4
- micro_batch_size: 2
- num_epochs: 1
- optimizer: adamw_bnb_8bit
- lr_scheduler: cosine
- learning_rate: 0.000005
-
- train_on_inputs: false
- group_by_length: false
- bf16: auto
- fp16:
- tf32: false
-
- gradient_checkpointing: true
- early_stopping_patience:
- resume_from_checkpoint:
- local_rank:
- logging_steps: 1
- xformers_attention:
- flash_attention: true
-
- warmup_steps: 10
- evals_per_epoch: 4
- eval_table_size:
- eval_max_new_tokens: 128
- saves_per_epoch: 0
- debug:
- deepspeed:
- weight_decay: 0.0
- fsdp:
- fsdp_config:
- special_tokens:
-   bos_token: "<s>"
-   eos_token: "</s>"
-   unk_token: "<unk>"
-
- ```
-
- </details><br>
-
- # out
-
- This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the None dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.0000
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 5e-06
- - train_batch_size: 2
- - eval_batch_size: 2
- - seed: 42
- - gradient_accumulation_steps: 4
- - total_train_batch_size: 8
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 10
- - num_epochs: 1
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:-----:|:----:|:---------------:|
- | 0.2061        | 0.01  | 1    | 0.3139          |
- | 0.0           | 0.25  | 32   | 0.0000          |
- | 0.0           | 0.5   | 64   | 0.0010          |
- | 0.0           | 0.76  | 96   | 0.0000          |
-
-
- ### Framework versions
-
- - Transformers 4.38.0.dev0
- - Pytorch 2.2.0+cu121
- - Datasets 2.17.1
- - Tokenizers 0.15.0
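
The deleted model card above documents a Mistral-7B fine-tune trained with axolotl, where every dataset is mapped to the `[INST] {instruction} [/INST]` prompt template and the weights are written to `./out`. A minimal inference sketch under those assumptions follows; the local path and the example prompt are illustrative and not confirmed by this commit.

```python
# Sketch: load the fine-tune from the config's output_dir and prompt it with
# the [INST] ... [/INST] template used by all three datasets in the config.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./out"  # output_dir from the axolotl config; substitute a published repo id if one exists
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # consistent with bf16: auto in the training config
    device_map="auto",
)

prompt = "[INST] Summarise the rules of chess in two sentences. [/INST]"  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)  # eval_max_new_tokens in the config
print(tokenizer.decode(output[0], skip_special_tokens=True))
```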
.ipynb_checkpoints/config-checkpoint.json DELETED
@@ -1,26 +0,0 @@
- {
-   "_name_or_path": "mistralai/Mistral-7B-v0.1",
-   "architectures": [
-     "MistralForCausalLM"
-   ],
-   "attention_dropout": 0.0,
-   "bos_token_id": 1,
-   "eos_token_id": 2,
-   "hidden_act": "silu",
-   "hidden_size": 4096,
-   "initializer_range": 0.02,
-   "intermediate_size": 14336,
-   "max_position_embeddings": 32768,
-   "model_type": "mistral",
-   "num_attention_heads": 32,
-   "num_hidden_layers": 32,
-   "num_key_value_heads": 8,
-   "rms_norm_eps": 1e-05,
-   "rope_theta": 10000.0,
-   "sliding_window": 4096,
-   "tie_word_embeddings": false,
-   "torch_dtype": "bfloat16",
-   "transformers_version": "4.38.0.dev0",
-   "use_cache": false,
-   "vocab_size": 32000
- }
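
The deleted config-checkpoint.json mirrors the base model's configuration, with `use_cache` disabled and `torch_dtype` set to bfloat16 for training. A minimal sketch of rebuilding an equivalent object from the Hub (assumes network access; the printed values should match the file above):

```python
# Sketch: reconstruct an equivalent configuration from the base model,
# overriding the two fields the training run changed.
from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    torch_dtype="bfloat16",
    use_cache=False,
)
print(config.hidden_size, config.num_hidden_layers, config.num_key_value_heads)
# Expected: 4096 32 8, matching the deleted checkpoint file.
```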