Update README.md

Uses Llama 3.1 formatting.
LoRA: [mpasila/Llama-3.1-Discord-Short-LoRA-8B](https://huggingface.co/mpasila/Llama-3.1-Discord-Short-LoRA-8B)

Trained with regular LoRA (not quantized/QLoRA), with a LoRA rank of 128 and alpha set to 32. Trained for 1 epoch on an A40 for about 5.5 hours.
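
For reference, a rank-128 / alpha-32 adapter would be set up in Unsloth roughly as in the sketch below; the base model name, max sequence length, and target modules are illustrative assumptions and are not taken from this card:

```python
from unsloth import FastLanguageModel

# Load the base model in 16-bit precision (regular LoRA, not 4-bit QLoRA).
# The model name and max_seq_length are assumptions for illustration.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Meta-Llama-3.1-8B",
    max_seq_length = 4096,
    load_in_4bit = False,
)

# Attach LoRA adapters with rank 128 and alpha 32, as stated above.
model = FastLanguageModel.get_peft_model(
    model,
    r = 128,
    lora_alpha = 32,
    lora_dropout = 0,
    bias = "none",
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing = "unsloth",
    random_state = 3407,
)
```

The training arguments were: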

```python
from unsloth import UnslothTrainingArguments, is_bfloat16_supported

# Training arguments (this object is then passed as `args` to UnslothTrainer).
args = UnslothTrainingArguments(
    per_device_train_batch_size = 1,
    gradient_accumulation_steps = 8,   # effective batch size of 8

    warmup_ratio = 0.1,
    num_train_epochs = 1,

    learning_rate = 5e-5,
    embedding_learning_rate = 5e-6,    # separate, lower LR for embedding layers

    fp16 = not is_bfloat16_supported(),
    bf16 = is_bfloat16_supported(),
    logging_steps = 1,
    optim = "adamw_8bit",
    weight_decay = 0.00,
    lr_scheduler_type = "cosine",
    seed = 3407,
    output_dir = "outputs",
)
```
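
These arguments are passed as `args` to Unsloth's `UnslothTrainer`. A minimal sketch of that call, assuming a prepared `dataset` with a `"text"` field (names not taken from this card):

```python
from unsloth import UnslothTrainer

trainer = UnslothTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,       # assumed: the formatted Discord dataset
    dataset_text_field = "text",   # assumed field name
    max_seq_length = 4096,         # assumed; matches the value used when loading the model
    args = args,                   # the UnslothTrainingArguments above
)
trainer.train()
```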
# Uploaded model
- **Developed by:** mpasila