Model Description
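- KT-AI/midm-bitext-S-7B-inst-v1 fine-tuned on the NSMC dataset
- When a movie review is included in the prompt and passed to the model, the model directly generates the prediction text '긍정' (positive) or '부정' (negative)
- At least 2,000 samples from the top of the NSMC train split were used for training
- Evaluation used only the first 1,000 samples of the test split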
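The generate-then-read-the-label flow described above can be sketched as follows. The card does not give the actual prompt, so the template in `build_prompt` is hypothetical, as are both function names; only the label strings '긍정' and '부정' come from the card. The 0/1 mapping follows the NSMC convention (0 = negative, 1 = positive).

```python
from typing import Optional

POSITIVE, NEGATIVE = "긍정", "부정"  # label strings the model is described as generating


def build_prompt(review: str) -> str:
    """Wrap a review in a prompt. Hypothetical template, not the author's."""
    return (
        "Classify the sentiment of the movie review.\n"
        f"Review: {review}\n"
        f"Answer ({POSITIVE} or {NEGATIVE}):"
    )


def parse_label(generated: str) -> Optional[int]:
    """Map the generated continuation (without the prompt) to an NSMC label.

    Returns 1 for positive, 0 for negative, None if neither word appears.
    If both appear, the one that occurs first wins.
    """
    pos = generated.find(POSITIVE)
    neg = generated.find(NEGATIVE)
    if pos == -1 and neg == -1:
        return None
    if neg == -1 or (pos != -1 and pos < neg):
        return 1
    return 0
```

Pass only the newly generated text to `parse_label`, since this example prompt itself contains both label words. Returning `None` for unparseable output keeps such cases visible rather than silently counting them as one class.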
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 2
- optimizer: adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- training_args.logging_steps: 50
- training_args.max_steps: 300
- trainable params: 16,744,448 || all params: 7,034,347,520 || trainable%: 0.23803839591934178
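As a quick consistency check on the values listed above, the following sketch recomputes the effective batch size, the number of training samples consumed in 300 steps, and the trainable-parameter percentage. The variable names are illustrative, and the single-device assumption is inferred from total_train_batch_size being 2.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 1
gradient_accumulation_steps = 2
max_steps = 300
trainable_params = 16_744_448
all_params = 7_034_347_520

# Assumption: a single device, which matches total_train_batch_size = 2.
effective_batch = train_batch_size * gradient_accumulation_steps

# Each optimizer step consumes one effective batch.
samples_seen = effective_batch * max_steps

# Share of weights that receive gradient updates, in percent.
trainable_pct = 100 * trainable_params / all_params

print(effective_batch, samples_seen, trainable_pct)
```

Under these assumptions the run sees 600 training examples, and only about 0.24% of the weights are updated.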
Training Results
TrainOutput( global_step=300, training_loss=2.666887741088867, metrics={'train_runtime': 961.226, 'train_samples_per_second': 0.624, 'train_steps_per_second': 0.312, 'total_flos': 9315508499251200.0, 'train_loss': 2.666887741088867, 'epoch': 0.3})
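The throughput figures in the TrainOutput line can be cross-checked against the step count and runtime. This is a minimal sketch with illustrative variable names; it assumes an effective batch size of 2 (from the hyperparameter list) and that the reported epoch is samples seen divided by training-set size.

```python
# Values copied from the TrainOutput line and the hyperparameter list.
global_step = 300
train_runtime_s = 961.226
total_train_batch_size = 2
reported_epoch = 0.3

steps_per_second = global_step / train_runtime_s
samples_per_second = global_step * total_train_batch_size / train_runtime_s

# Assumption: epoch = samples seen / training-set size.
implied_train_set_size = round(global_step * total_train_batch_size / reported_epoch)

print(round(steps_per_second, 3), round(samples_per_second, 3), implied_train_set_size)
```

Under that reading, an epoch of 0.3 after 600 samples implies a training set of about 2,000 examples, which is in line with the training split size stated in the model description.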
Accuracy
Midm: accuracy 0.88
| | TP (actual positive) | TN (actual negative) |
|---|---|---|
| PP (predicted positive) | 416 | 23 |
| PN (predicted negative) | 92 | 469 |
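The reported accuracy can be reproduced from the four counts in the table. The sketch below assumes that the columns TP/TN are the true labels and the rows PP/PN are the predicted labels; precision, recall, and F1 for the positive class are derived here and are not reported in the card itself.

```python
# Counts from the table, assuming rows = predicted label, columns = true label.
tp = 416  # predicted positive, actually positive
fp = 23   # predicted positive, actually negative
fn = 92   # predicted negative, actually positive
tn = 469  # predicted negative, actually negative

total = tp + fp + fn + tn
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"n={total} acc={accuracy:.3f} p={precision:.3f} r={recall:.3f} f1={f1:.3f}")
```

The counts sum to the 1,000 evaluated test samples, and the accuracy comes out at 0.885, consistent with the 0.88 reported above. Under the assumed orientation, most errors are positive reviews predicted as negative (92 versus 23).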
Model Card Authors
cxoijve
Model tree for cxoijve/hw_midm_7B_nsmc
Base model
KT-AI/midm-bitext-S-7B-inst-v1