---
license: other
---
# Overview

This is a 7B-parameter LLaMA model, fine-tuned on nearly 100k synthetic instructions generated by [airoboros](https://github.com/jondurbin/airoboros).

I used a jailbreak prompt to generate the synthetic instructions this time, which resulted in some questionable training data, such as synthesizing drugs and making homemade flamethrowers. Mind you, this was all generated by ChatGPT, not me, so I won't speak for any outputs the model produces.

I'm still combing through the data a bit to make sure there's nothing blatantly harmful in there.

The jailbreak prompt I used is the default prompt in the Python code when using the `--uncensored` flag:
[airoboros/self_instruct.py#L39](https://github.com/jondurbin/airoboros/blob/main/airoboros/self_instruct.py#L39)
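For context, generating the instructions looks roughly like the sketch below. Only the `--uncensored` flag and the script path are confirmed by the link above; the API-key handling and everything else here are assumptions, so check the script's own options before running it.

```
# Hypothetical invocation of the airoboros self-instruct script.
# --uncensored enables the jailbreak prompt mentioned above; the
# OPENAI_API_KEY convention is an assumption (the script drives ChatGPT).
git clone https://github.com/jondurbin/airoboros.git
cd airoboros
export OPENAI_API_KEY="sk-..."  # your OpenAI API key
python airoboros/self_instruct.py --uncensored
```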
### Fine-tuning method

I used the excellent [FastChat](https://github.com/lm-sys/FastChat) module, running with:

```
torchrun --nproc_per_node=8 --master_port=20001 /workspace/FastChat/fastchat/train/train_mem.py \
    --model_name_or_path /workspace/llama-7b \
    --data_path /workspace/as_conversations.json \
    --bf16 True \
    --output_dir /workspace/airoboros-uncensored-7b \
    --num_train_epochs 3 \
    --per_device_train_batch_size 24 \
    --per_device_eval_batch_size 24 \
    --gradient_accumulation_steps 2 \
    --evaluation_strategy "steps" \
    --eval_steps 1000 \
    --save_strategy "steps" \
    --save_steps 1000 \
    --save_total_limit 10 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.04 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --fsdp "full_shard auto_wrap" \
    --fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer' \
    --tf32 True \
    --model_max_length 2048 \
    --gradient_checkpointing True \
    --lazy_preprocess True
```

This ran on 8x NVIDIA 80GB A100s for about 17 hours.
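As a sanity check on the hyperparameters above: with 8 GPUs, a per-device batch size of 24, and 2 gradient accumulation steps, the effective global batch size is 8 × 24 × 2 = 384 sequences per optimizer step. Once training finishes, FastChat's interactive CLI is a quick way to smoke-test the checkpoint; a minimal sketch, assuming the output path from `--output_dir` above:

```
# Chat interactively with the fine-tuned checkpoint via FastChat.
# The model path is the training run's --output_dir from above.
python3 -m fastchat.serve.cli --model-path /workspace/airoboros-uncensored-7b
```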
### License

The model is licensed under the LLaMA license, and the dataset falls under OpenAI's terms of use because it was generated with ChatGPT. Everything else is free.