---
license: other
---
# Overview

This is a 7B-parameter LLaMA model, fine-tuned on nearly 100k synthetic instructions generated by [airoboros](https://github.com/jondurbin/airoboros).

I used a jailbreak prompt to generate the synthetic instructions this time, which resulted in some questionable training data, such as synthesizing drugs and making homemade flamethrowers. Mind you, this was all generated by ChatGPT, not me, so I won't speak for any outputs the model produces.

I'm still combing through the data a bit to make sure there's nothing blatantly harmful in there.

The jailbreak prompt I used is the default prompt in the Python code when using the `--uncensored` flag:
[airoboros/self_instruct.py#L39](https://github.com/jondurbin/airoboros/blob/main/airoboros/self_instruct.py#L39)
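For context, generating the instructions looks roughly like the sketch below. Only the `--uncensored` flag and the script path are confirmed by the link above; the API-key handling and everything else here are assumptions, so check the script's own options before running it.

```
# Hypothetical invocation of the airoboros self-instruct script.
# --uncensored enables the jailbreak prompt mentioned above; the
# OPENAI_API_KEY convention is an assumption (the script drives ChatGPT).
git clone https://github.com/jondurbin/airoboros.git
cd airoboros
export OPENAI_API_KEY="sk-..."  # your OpenAI API key
python airoboros/self_instruct.py --uncensored
```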
### Fine-tuning method

I used the excellent [FastChat](https://github.com/lm-sys/FastChat) module, running with:

```
torchrun --nproc_per_node=8 --master_port=20001 /workspace/FastChat/fastchat/train/train_mem.py \
    --model_name_or_path /workspace/llama-7b \
    --data_path /workspace/as_conversations.json \
    --bf16 True \
    --output_dir /workspace/airoboros-uncensored-7b \
    --num_train_epochs 3 \
    --per_device_train_batch_size 24 \
    --per_device_eval_batch_size 24 \
    --gradient_accumulation_steps 2 \
    --evaluation_strategy "steps" \
    --eval_steps 1000 \
    --save_strategy "steps" \
    --save_steps 1000 \
    --save_total_limit 10 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.04 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --fsdp "full_shard auto_wrap" \
    --fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer' \
    --tf32 True \
    --model_max_length 2048 \
    --gradient_checkpointing True \
    --lazy_preprocess True
```

This ran on 8x NVIDIA 80GB A100s for about 17 hours.
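As a sanity check on the hyperparameters above: with 8 GPUs, a per-device batch size of 24, and 2 gradient accumulation steps, the effective global batch size is 8 × 24 × 2 = 384 sequences per optimizer step. Once training finishes, FastChat's interactive CLI is a quick way to smoke-test the checkpoint; a minimal sketch, assuming the output path from `--output_dir` above:

```
# Chat interactively with the fine-tuned checkpoint via FastChat.
# The model path is the training run's --output_dir from above.
python3 -m fastchat.serve.cli --model-path /workspace/airoboros-uncensored-7b
```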
### License

The model is licensed under the LLaMA license, and the dataset falls under OpenAI's terms of use because it was generated with ChatGPT. Everything else is free.