Update README.md
README.md
CHANGED
@@ -6,7 +6,7 @@ license_link: LICENSE
 
 ## Model description
 
-Yi-34B model fine-tuned on AEZAKMI v1 dataset that is derived from airoboros 2.2.1 and airoboros 2.2. Finetuned with axolotl, using qlora and nf4 double quant, 1 epoch, batch size 1, lr 0.00007, lr scheduler constant. Training took around 33 hours on single local RTX 3090 Ti.
+Yi-34B model fine-tuned on AEZAKMI v1 dataset that is derived from airoboros 2.2.1 and airoboros 2.2. Finetuned with axolotl, using qlora and nf4 double quant, 1 epoch, batch size 1, lr 0.00007, lr scheduler constant, sequence length 1200. Training took around 33 hours on single local RTX 3090 Ti.
 I had power target set to 320W for the GPU, and while I didn't measure power at the wall, it was probably something around 500W. Given the average electricity price in my region, this training run cost me around $3. This was my first attempt at training Yi-34B with this dataset.
 Main feature of this model is that it's output should be free of refusals and it feels somehow more natural than airoboros. Prompt format is standard chatml. Don't expect it to be good at math, riddles or be crazy smart. My end goal with AEZAKMI is to create a cozy free chatbot.
 
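
The cost figure in the README can be sanity-checked with a little arithmetic. The sketch below is only an illustration: it takes the README's rough numbers (about 33 hours, about 500W at the wall, about $3 total) at face value and assumes a constant power draw. The README does not state the electricity price, so the per-kWh value computed here is an implied estimate, and the function and constant names are made up for this example.

```python
# Rough sanity check of the training cost quoted in the README.
# All inputs are the README's approximate figures; constant draw is assumed.

def energy_kwh(power_watts: float, hours: float) -> float:
    """Energy in kilowatt-hours for a constant power draw."""
    return power_watts / 1000.0 * hours


def implied_price_per_kwh(total_cost: float, kwh: float) -> float:
    """Electricity price implied by a total cost and an energy amount."""
    return total_cost / kwh


TRAINING_HOURS = 33   # "around 33 hours"
WALL_WATTS = 500      # "probably something around 500W" (not measured)
TOTAL_COST_USD = 3.0  # "around $3"

kwh = energy_kwh(WALL_WATTS, TRAINING_HOURS)
price = implied_price_per_kwh(TOTAL_COST_USD, kwh)

print(f"Estimated energy used: {kwh:.1f} kWh")
print(f"Implied electricity price: ${price:.2f} per kWh")
```

With these inputs the run comes to roughly 16.5 kWh, which puts the implied price at about $0.18 per kWh. Since the wall power was not measured, both numbers are ballpark values.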