Update README.md
README.md
CHANGED
@@ -20,10 +20,6 @@ This model is llama-3-8b-instruct from Meta (uploaded by unsloth) trained on the
 
 The Qalore method uses QLoRA training along with the methods from GaLore for additional reductions in VRAM, allowing llama-3-8b to be loaded in 14.5 GB of VRAM. This allowed training to be completed on an RTX A4000 16GB in 130 hours for less than $20.
 
-Dataset used for training this model:
-
-- https://huggingface.co/datasets/Replete-AI/OpenCodeInterpreterData
-
 Qalore notebook for training:
 
 - https://colab.research.google.com/drive/1bX4BsjLcdNJnoAf7lGXmWOgaY8yekg8p?usp=sharing
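
For readers who want to see what the Qalore combination described above looks like in code, here is a minimal sketch of pairing QLoRA-style 4-bit loading with a GaLore optimizer. It is not the linked notebook: the model id, LoRA target modules, GaLore rank, and all other hyperparameters are illustrative assumptions, and the actual Qalore recipe may differ.

```python
# Minimal sketch: QLoRA (4-bit base model + LoRA adapters) combined with a
# GaLore optimizer from galore_torch. Hyperparameters and module names are
# illustrative assumptions, not the settings used for this model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from galore_torch import GaLoreAdamW8bit

model_id = "unsloth/llama-3-8b-Instruct"  # assumed base upload

# QLoRA part: load the base weights in 4-bit NF4 so they fit on a ~16 GB card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Attach small trainable LoRA adapters; only these are updated during training.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# GaLore part: give the 2-D trainable matrices a low-rank gradient projection,
# which cuts optimizer-state memory on top of the QLoRA savings.
galore_params = [p for p in model.parameters() if p.requires_grad and p.dim() == 2]
other_params = [p for p in model.parameters() if p.requires_grad and p.dim() != 2]
optimizer = GaLoreAdamW8bit(
    [
        {"params": other_params},
        {"params": galore_params, "rank": 128, "update_proj_gap": 200,
         "scale": 0.25, "proj_type": "std"},
    ],
    lr=2e-5,
)
# From here the optimizer can be passed to a transformers Trainer via
# `optimizers=(optimizer, None)` or used in a manual training loop.
```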