sudy-super/Yamase-12B

sudy-super committed on Jul 23

Commit fe24a97 • 1 Parent(s): 34d3b55

Update README.md

Files changed (1)

README.md +82 -3

README.md CHANGED

@@ -1,3 +1,82 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+datasets:
+- llm-jp/oasst1-21k-ja
+- llm-jp/oasst2-33k-ja
+- HachiML/Hachi-Alpaca
+- Aratako/Rosebleu-1on1-Dialogues-RP
+- baobab-trees/wikipedia-human-retrieval-ja
+- aixsatoshi/Longcontext-aozora-summary
+- aixsatoshi/Longcontext-aozora-instruction
+- kunishou/amenokaku-code-instruct
+- HachiML/Evol-hh-rlhf-gen3-1k
+- Kendamarron/jimba-wiki-instruction-calm3
+- Manual-Dataset-Creation-Project/Malum-130
+- sudy-super/CoTangent
+- minnade/chat-daily
+---
+# Yamase-12B
+### Description
+Yamase-12B is a model fine-tuned from [Mistral-Nemo-Instruct](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) on approximately 110,000 examples, with the goal of improving its Japanese-language ability.
+### Usage
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+B_INST, E_INST = "[INST]", "[/INST]"
+text = "旅行に行くと高層ビルがたくさん建っていました。これからどのようなことが推測できますか？"  # "When I went on a trip, there were many high-rise buildings. What can be inferred from this?"
+model_name = "sudy-super/Yamase-12B"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
+if torch.cuda.is_available():
+    model = model.to("cuda")
+prompt = "{bos_token}{b_inst}{prompt}{e_inst}".format(
+    bos_token=tokenizer.bos_token,
+    b_inst=B_INST,
+    prompt=text,
+    e_inst=E_INST,
+)
+with torch.no_grad():
+    token_ids = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
+    output_ids = model.generate(
+        token_ids.to(model.device),
+        max_new_tokens=256,
+        pad_token_id=tokenizer.pad_token_id,
+        eos_token_id=tokenizer.eos_token_id,
+    )
+output = tokenizer.decode(output_ids.tolist()[0][token_ids.size(1) :], skip_special_tokens=True)
+print(output)
+"""
+"""
+```
+### Chat Template
+```
+<s>[INST]What will the weather be in Tokyo tomorrow?[/INST]It will be sunny.</s>[INST]How about Osaka?[/INST]It will rain.</s>
+```
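The template above can be assembled programmatically. The following is a minimal sketch, not code from the model card: the helper name `build_prompt` and the `(user, assistant)` tuple layout are hypothetical, and the `<s>` / `</s>` strings are taken from the template shown rather than read from the tokenizer.

```python
# Markers as they appear in the chat template above.
B_INST, E_INST = "[INST]", "[/INST]"
BOS, EOS = "<s>", "</s>"


def build_prompt(turns, bos=BOS, eos=EOS):
    """Join (user, assistant) pairs into one prompt string.

    `build_prompt` is a hypothetical helper for illustration.
    Pass None as the assistant reply of the last pair to leave the
    prompt open for generation.
    """
    parts = [bos]  # BOS appears once, at the very start
    for user, assistant in turns:
        parts.append(f"{B_INST}{user}{E_INST}")
        if assistant is not None:
            # Each completed assistant reply is closed with EOS.
            parts.append(f"{assistant}{eos}")
    return "".join(parts)


if __name__ == "__main__":
    history = [
        ("What will the weather be in Tokyo tomorrow?", "It will be sunny."),
        ("How about Osaka?", None),  # open turn: the model answers next
    ]
    print(build_prompt(history))
```

Because the string already starts with the BOS token, it should be encoded with `add_special_tokens=False`, as the Usage snippet does, so that BOS is not added twice.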
+### Hyperparameters
+```
+num_train_epochs: 5
+per_device_train_batch_size: 2
+per_device_eval_batch_size: 2
+gradient_accumulation_steps: 128
+learning_rate: 2e-5
+lr_scheduler_kwargs: {"min_lr": 2e-6}
+lr_scheduler_type: "cosine_with_min_lr"
+warmup_ratio: 0.1
+dataloader_pin_memory: True
+gradient_checkpointing: True
+bf16: True
+optim: "adamw_torch_fused"
+weight_decay: 0.0
+max_grad_norm: 1.0
+adam_beta2: 0.99
+label_smoothing_factor: 0.0
+seed: 42
+```
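To make the numbers above concrete, the sketch below works out the effective batch size and an approximate learning-rate curve implied by these settings. It is an illustration under stated assumptions, not the training code: the device count and total step count are not given in the README, so `num_devices` and `total_steps` are hypothetical inputs, and `lr_at` is a plain-Python approximation of a warmup-then-cosine schedule with a floor, not the library's own implementation.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 2e-5
MIN_LR = 2e-6
WARMUP_RATIO = 0.1
PER_DEVICE_TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 128


def effective_batch_size(num_devices):
    """Sequences per optimizer step. `num_devices` is an assumption:
    the README does not state how many devices were used."""
    return PER_DEVICE_TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * num_devices


def lr_at(step, total_steps):
    """Approximate learning rate at `step`: linear warmup from 0 to
    LEARNING_RATE, then cosine decay down to MIN_LR.
    `total_steps` is hypothetical; the README does not report it."""
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + (LEARNING_RATE - MIN_LR) * cosine


if __name__ == "__main__":
    print(effective_batch_size(num_devices=1))
    for step in (0, 50, 100, 550, 1000):
        print(step, lr_at(step, total_steps=1000))
```

With a single device, each optimizer step would cover 2 × 128 = 256 sequences; more devices scale that figure linearly.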
+### Author
+[Rakuto Suda](https://huggingface.co/sudy-super)