---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
base_model: unsloth/llama-3-8b-bnb-4bit
datasets:
- ruslandev/tagengo-subset-gpt-4o
---
# Uploaded model
- **Developed by:** ruslandev
- **License:** apache-2.0
- **Finetuned from model:** unsloth/llama-3-8b-bnb-4bit

This model is fine-tuned on the [ruslandev/tagengo-subset-gpt-4o](https://huggingface.co/datasets/ruslandev/tagengo-subset-gpt-4o) dataset.
Please note: this model was created for educational purposes and needs further training/fine-tuning.
# How to use
I recommend using my [gptchain](https://github.com/RuslanPeresy/gptchain) framework:
```
git clone https://github.com/RuslanPeresy/gptchain.git
cd gptchain
pip install -r requirements-train.txt
python gptchain.py chat -m ruslandev/llama-3-8b-gpt-4o \
--chatml true \
-q '[{"from": "human", "value": "Из чего состоит нейронная сеть?"}]'
```
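If you prefer to load the model directly, here is a minimal sketch using the Hugging Face transformers library. It assumes the tokenizer ships a chat template (the gptchain command above uses ChatML); adjust the message formatting if that is not the case.
```
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ruslandev/llama-3-8b-gpt-4o"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Same question as in the gptchain example above, in English:
# "What does a neural network consist of?"
messages = [{"role": "user", "content": "What does a neural network consist of?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```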
# Training
The [gptchain](https://github.com/RuslanPeresy/gptchain) framework was used for training.
```
python gptchain.py train -m unsloth/llama-3-8b-bnb-4bit \
-dn tagengo_subset_gpt4o \
-sp checkpoints/llama-3-8b-gpt-4o \
-hf llama-3-8b-gpt-4o \
--num-epochs 3
```
# Training hyperparameters
- learning_rate: 2e-4
- seed: 3407
- gradient_accumulation_steps: 4
- per_device_train_batch_size: 2
- optimizer: adamw_8bit
- lr_scheduler_type: linear
- warmup_steps: 5
- num_train_epochs: 3
- weight_decay: 0.01
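For reference, here is a minimal sketch of how these values map onto standard `transformers` `TrainingArguments`. The actual run was launched through gptchain, so this only mirrors the listed hyperparameters, not the exact training code.
```
# Sketch only: the listed hyperparameters expressed as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="checkpoints/llama-3-8b-gpt-4o",  # save path used in the training command above
    learning_rate=2e-4,
    seed=3407,
    gradient_accumulation_steps=4,
    per_device_train_batch_size=2,
    optim="adamw_8bit",           # requires bitsandbytes
    lr_scheduler_type="linear",
    warmup_steps=5,
    num_train_epochs=3,
    weight_decay=0.01,
)
```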
# Training results
[wandb report](https://api.wandb.ai/links/ruslandev/2i1pukst)
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)