---
license: apache-2.0
---

## Model

base_model: yanolja/KoSOLAR-10.7B-v0.2

## Dataset

* Collected from publicly available datasets
* Deduplicated using the algorithm from *Deduplicating Training Data Makes Language Models Better*

## Code

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "jingyeom/KoSoLAR-10.7B-v0.2_1.4_dedup"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```

## Benchmark

**[Ko-LLM-Leaderboard](https://huggingface.co/spaces/upstage/open-ko-llm-leaderboard)**
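To illustrate the deduplication idea referenced in the Dataset section, here is a minimal sketch of document-level exact deduplication by content hashing. This is a simplified stand-in for the method in *Deduplicating Training Data Makes Language Models Better* (which performs exact-substring deduplication with suffix arrays over the whole corpus); the function name and normalization choices below are illustrative, not the model authors' actual pipeline.

```python
import hashlib

def dedup_documents(docs):
    """Drop exact duplicate documents, keeping the first occurrence.

    Simplified illustration: hashes whitespace-normalized text, so only
    whole-document duplicates are removed (unlike suffix-array-based
    exact-substring dedup, which also removes repeated spans).
    """
    seen = set()
    unique = []
    for doc in docs:
        # Normalize trivially (strip surrounding whitespace) before hashing.
        key = hashlib.sha256(doc.strip().encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique

corpus = ["안녕하세요", "Hello world", "안녕하세요 ", "Goodbye"]
print(dedup_documents(corpus))  # → ['안녕하세요', 'Hello world', 'Goodbye']
```

In practice, near-duplicate detection (e.g. MinHash) or substring-level dedup catches far more repetition than this exact-match sketch.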