shisa-v1
JA/EN Bilingual LLMs
shisa-ai/shisa-v1-llama3-70b
Text Generation • Updated • 12 • 3
Note 2024-05: The shisa-v1 dataset applied to Llama 3 Instruct 70B outperforms gpt-3.5-turbo
shisa-ai/shisa-v1-llama3-8b
Text Generation • Updated • 250 • 3
Note 2024-05: The shisa-v1 dataset applied to Llama 3 Instruct 8B leads to significantly improved performance
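A minimal inference sketch for the 8B model above, assuming a recent transformers release and a CUDA-capable GPU; the prompt and generation settings are illustrative, not from the card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "shisa-ai/shisa-v1-llama3-8b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# As a Llama 3 Instruct fine-tune, the model expects the chat template,
# applied here via the tokenizer.
messages = [{"role": "user", "content": "日本の首都はどこですか？"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```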
augmxnt/shisa-gamma-7b-v1
Text Generation • Updated • 10.7k • 15
Note 2023-12: The shisa-v1 dataset applied to Japanese Stable LM Base Gamma 7B. Lower tokenizer efficiency, but better overall performance
augmxnt/shisa-7b-v1
Text Generation • Updated • 1.37k • 29
Note 2023-12: In addition to SFT, this model also underwent a DPO round, which improved its human preference ratings
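A hedged sketch of what a DPO round like the one above can look like with trl's DPOTrainer, assuming a trl version that provides DPOConfig; the preference file name, column layout (prompt/chosen/rejected), and hyperparameters are illustrative assumptions, not the shisa recipe.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# In practice the SFT checkpoint is the starting point for DPO.
model_id = "augmxnt/shisa-7b-v1"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Hypothetical preference dataset with prompt/chosen/rejected columns.
train_dataset = load_dataset(
    "json", data_files="preference_pairs.jsonl", split="train"
)

args = DPOConfig(
    output_dir="shisa-dpo",
    beta=0.1,  # strength of the KL penalty against the reference model
    per_device_train_batch_size=1,
)
# With ref_model left unset, trl creates a frozen reference copy internally.
trainer = DPOTrainer(
    model=model, args=args, train_dataset=train_dataset, processing_class=tokenizer
)
trainer.train()
```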
augmxnt/ultra-orca-boros-en-ja-v1
Viewer • Updated • 188k • 260 • 10
Note Largely synthetic dataset combining Airoboros, UltraChat, and Orca data in JA and EN, plus the Jaster train set
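A quick sketch for inspecting the SFT dataset above with the datasets library; the split name "train" is an assumption based on typical Hub datasets.

```python
from datasets import load_dataset

ds = load_dataset("augmxnt/ultra-orca-boros-en-ja-v1", split="train")
print(ds)      # column names and row count
print(ds[0])   # first example
```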
augmxnt/shisa-base-7b-v1
Text Generation • Updated • 1.34k • 16
Note 2023-12: A continued pretrain (8B tokens, 90% JA) of Mistral 7B v0.1 with tokenizer extension; probably needs another 10B tokens of pretraining tbh
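A hedged sketch of the tokenizer-extension step named above: new JA tokens are added to the base vocabulary and the embedding matrix is resized to match. The token list here is a hypothetical placeholder; in practice the additions would come from a tokenizer trained on a large JA corpus, and the new embedding rows are then learned during continued pretraining.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Hypothetical Japanese tokens to add (illustrative only).
extra_ja_tokens = ["こんにちは", "日本語", "東京"]
num_added = tokenizer.add_tokens(extra_ja_tokens)

# Grow the embedding (and tied output) matrix so the new ids have rows;
# these rows start randomly initialized and must be trained.
model.resize_token_embeddings(len(tokenizer))
print(f"added {num_added} tokens; new vocab size = {len(tokenizer)}")
```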
augmxnt/shisa-pretrain-en-ja-v1
Viewer • Updated • 4.7M • 54 • 7