language:
- ko
datasets:
- DopeorNope/DPO-Ko-Dataset
- DopeorNope/Orca_Near_Dedup-v2
library_name: transformers
pipeline_tag: text-generation
license: cc-by-nc-sa-4.0
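This model was developed by the LLM research consortium of (주)미디어그룹사람과숲 and (주)마커.
The model was trained and uploaded by the developer DopeorNope.
For questions about the model, please contact the developer DopeorNope (Seungyoo Lee).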
The license is cc-by-nc-sa-4.0.
🐻❄️COKAL-DPO_13b-v2🐻❄️
Model Details
Model Developers: Seungyoo Lee (DopeorNope)
Input: Models input text only.
Output: Models generate text only.
Model Architecture
COKAL-DPO_13b-v2 is an auto-regressive 13B language model based on the LLaMA2 transformer architecture.
Base Model: DopeorNope/COKAL_pre_DPO_Test_v2-13b
DopeorNope/COKAL_pre_DPO_Test_v2-13b is the SFT model on which DPO training was applied.
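For readers unfamiliar with DPO, the sketch below computes the standard DPO loss for a single preference pair from sequence log-probabilities under the policy being trained and under the frozen reference (SFT) model. It illustrates the general technique only, not the training code used for this model. The function name `dpo_loss` and the value `beta=0.1` are assumptions, since this card does not state the hyper-parameters.

```python
import math


def _softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    Inputs are summed token log-probabilities of each response.
    beta is an assumed value; the model card does not report it.
    """
    # How much more the policy prefers the chosen response than the reference does.
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    logits = beta * (policy_margin - ref_margin)
    # -log(sigmoid(logits)) == softplus(-logits)
    return _softplus(-logits)
```

When the policy and the reference agree exactly, the loss equals log 2; it falls below that as the policy favors the chosen response more strongly than the reference does.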
Training Dataset
- DPO training dataset: DopeorNope/DPO-Ko-Dataset - private
This dataset was collected and reorganized directly by DopeorNope, drawing on "lvwerra/stack-exchange-paired" for insight into how to build a paired dataset. (That is, stack-exchange-paired itself was not used; it only served as inspiration.)
- SFT training dataset: DopeorNope/Orca_Near_Dedup-v2 - private
This dataset is based on "kyujinpy/OpenOrca-KO" and has been processed with the Near Dedup algorithm to remove items whose Jaccard similarity to another item is 0.8 or higher. In addition, inconsistent inputs have been cleaned and corrected.
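The near-deduplication idea described above can be sketched as follows. This is a minimal illustration, not the pipeline used to build the dataset: the word 3-gram shingling, the pairwise comparison, and the function names are all assumptions, and large-scale implementations typically use MinHash with LSH instead of comparing every pair. Only the 0.8 Jaccard threshold comes from this card.

```python
def shingles(text, n=3):
    """Return the set of word n-grams (shingles) in a text."""
    words = text.split()
    if len(words) < n:
        return {tuple(words)}
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_dedup(texts, threshold=0.8, n=3):
    """Keep each text only if it is below the threshold against all kept texts."""
    kept, kept_shingles = [], []
    for text in texts:
        current = shingles(text, n)
        if all(jaccard(current, other) < threshold for other in kept_shingles):
            kept.append(text)
            kept_shingles.append(current)
    return kept
```

The first occurrence of each near-duplicate group is the one that survives, so the result depends on input order.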
Training
This model differs from "DopeorNope/COKAL-DPO_test-v2" in its hyper-parameters, which were changed for this final version.
I developed the model in an environment with four RTX 3090 GPUs running Ubuntu 18.04.
It seems that when a model is uploaded directly to a repository from a Linux server, the repository may report more parameters than the model actually has. This model is nonetheless based on a 13B architecture.
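One way to check the size independently of what a repository page reports is to count the parameters of the loaded model. The helper below is a small sketch; the name `count_parameters` is hypothetical, and it assumes a PyTorch-style model whose `parameters()` yields tensors with a `numel()` method.

```python
def count_parameters(model):
    """Return the total number of elements across all model parameters."""
    return sum(p.numel() for p in model.parameters())
```

For example, `count_parameters(model) / 1e9` gives the size in billions of parameters for a model loaded with transformers.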
Reference papers
Data Strategy:
Model Architecture:
Implementation Code
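from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Hugging Face Hub repository that hosts the model weights and tokenizer
repo = "HumanF-MarkrAI/COKAL-DPO-13b-v2"

# Load the weights in half precision; device_map='auto' (requires the
# accelerate package) places the layers on the available devices
model = AutoModelForCausalLM.from_pretrained(
    repo,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map='auto'
)

# Load the tokenizer that matches the model
model_tokenizer = AutoTokenizer.from_pretrained(repo)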
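Once `model` and `model_tokenizer` are loaded as above, text can be generated along the following lines. This is a minimal sketch: the helper name `generate_text` and the default `max_new_tokens=128` are illustrative assumptions, and this card does not specify a prompt template, so the prompt is passed through unchanged.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=128):
    """Generate a continuation of `prompt` with a causal LM and its tokenizer."""
    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Decoding settings come from the model's generation config unless
    # overridden; max_new_tokens bounds the generated length.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # The first sequence holds the prompt tokens followed by the new tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `generate_text(model, model_tokenizer, "안녕하세요")` returns the decoded prompt together with the generated continuation.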