metadata
language:
  - en
license: apache-2.0
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - llama
  - trl
  - sft
base_model: meta-llama/Meta-Llama-3-8B-Instruct

Saga-8B

  • Developed by: saucam
  • License: apache-2.0
  • Finetuned from model: meta-llama/Meta-Llama-3-8B-Instruct

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Usage with Unsloth

from unsloth.chat_templates import get_chat_template
from unsloth import FastLanguageModel

max_seq_length = 2048 # Maximum sequence length used when loading the model
dtype = None # None lets Unsloth auto-detect the dtype for the GPU
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "saucam/Saga-8B",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = False, # Set to True to load 4-bit quantized weights and reduce memory use
    # token = "hf_...", # Only needed for gated or private models
)

tokenizer = get_chat_template(
    tokenizer,
    chat_template = "chatml", # Supports zephyr, chatml, mistral, llama, alpaca, vicuna, vicuna_old, unsloth
    mapping = {"role" : "from", "content" : "value", "user" : "human", "assistant" : "gpt"}, # ShareGPT style
    map_eos_token = True, # Maps <|im_end|> to the tokenizer's EOS token
)

FastLanguageModel.for_inference(model) # Enable native 2x faster inference

messages = [
    {"from": "human", "value": "What is a famous tall tower in Paris?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize = True,
    add_generation_prompt = True, # Must add for generation
    return_tensors = "pt",
).to("cuda")

outputs = model.generate(input_ids = inputs, max_new_tokens = 64, use_cache = True)
print(tokenizer.batch_decode(outputs))

Output:

==((====))==  Unsloth: Fast Llama patching release 2024.4
   \\   /|    GPU: NVIDIA A100 80GB PCIe. Max memory: 79.151 GB. Platform = Linux.
O^O/ \_/ \    Pytorch: 2.2.0+cu121. CUDA = 8.0. CUDA Toolkit = 12.1.
\        /    Bfloat16 = TRUE. Xformers = 0.0.24. FA = True.
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Loading checkpoint shards: 100%|██████████| 4/4 [00:03<00:00,  1.19it/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Unsloth: Will map <|im_end|> to EOS = <|im_end|>.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
['<|im_start|>user\nWhat is a famous tall tower in Paris?<|im_end|>\n<|im_start|>assistant\nThe Eiffel Tower is the most famous tall tower in Paris. It is a wrought iron tower that was built in 1889 as the entrance to the 1889 Exposition Universelle (Universal Exhibition) of Paris. The tower was named after its designer, engineer Gustave Eiffel. It stands ']
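The decoded output above includes the prompt that the ChatML template builds from the ShareGPT-style message. The following is a minimal pure-Python sketch of that rendering, useful for checking prompts without loading the model. It assumes the template simply wraps each turn in `<|im_start|>` and `<|im_end|>` markers, as the output suggests; `render_chatml` and `ROLE_MAP` are hypothetical names for illustration and are not part of Unsloth.

```python
# Hypothetical helper: reproduces the ChatML prompt seen in the output above.
# It is an illustration only and does not call Unsloth or the tokenizer.
ROLE_MAP = {"human": "user", "gpt": "assistant"}  # ShareGPT role -> ChatML role


def render_chatml(messages, add_generation_prompt=True):
    """Render ShareGPT-style messages ({"from", "value"}) as a ChatML string."""
    parts = []
    for message in messages:
        role = ROLE_MAP.get(message["from"], message["from"])
        parts.append(f"<|im_start|>{role}\n{message['value']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [{"from": "human", "value": "What is a famous tall tower in Paris?"}]
prompt = render_chatml(messages)
print(repr(prompt))
```

The rendered string should match the prompt portion of the decoded output, up to the start of the model's answer.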

Usage with Transformers

from transformers import AutoTokenizer
import transformers
import torch

model = "saucam/Saga-8B"
messages = [{"from": "human", "value": "Write a horror story about the monster of eldoria kingdom"}]

tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])

Output:

Loading checkpoint shards: 100%|██████████| 4/4 [00:12<00:00,  3.20s/it]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
<|im_start|>user
Write a horror story about the monster of eldoria kingdom<|im_end|>
<|im_start|>assistant
Title: The Eldorian Beast - A Tale of Eldoria Kingdom

In the heart of Eldoria Kingdom, nestled in the dense forests, lives a creature like no other. It's a tale of survival, love, and betrayal, woven into the intricate narrative of the Eldorian Beast.

The Eldorian Beast, a creature of Eldoria Kingdom, is a symbol of the kingdom's core beliefs and beliefs that reflect its core values. The Eldorian Beast is known for its loyalty, its bravery, and its resilience. Its heart is as big as its kingdom, and like the kingdom, it has its own secrets, challenges, and triumphs, all of which makes it a unique character. 

The Eldorian Beast is a wolf, not just any wolf but one that is a true guardian and protector of the kingdom. It is a wolf that knows the kingdom like no one else does, and knows the kingdom like it's its heart. It's a wolf that knows the kingdom's secrets and mysteries, and it's a wolf that knows the kingdom's strengths and weaknesses. 

The Eldorian Beast is not just a wolf. It's a wolf that has been through many challenges and has survived every obstacle, just like Eldoria Kingdom. It's a wolf that's been
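As the output shows, `generated_text` contains the ChatML prompt followed by the story. A small sketch for keeping only the assistant's reply is given below; `extract_reply` is a hypothetical helper written for this example, and it assumes the ChatML markers appear exactly as in the output above. Passing `return_full_text=False` to the pipeline call is another option.

```python
# Hypothetical helper: strips the ChatML prompt from pipeline output.
ASSISTANT_MARKER = "<|im_start|>assistant\n"
END_MARKER = "<|im_end|>"


def extract_reply(generated_text):
    """Return the text after the last assistant marker, without end markers."""
    _, found, reply = generated_text.rpartition(ASSISTANT_MARKER)
    if not found:
        # No marker present: assume the text is already just the reply.
        return generated_text.strip()
    # Drop anything from the end-of-turn marker onward, if the model emitted one.
    return reply.split(END_MARKER, 1)[0].strip()


sample = (
    "<|im_start|>user\nWrite a horror story<|im_end|>\n"
    "<|im_start|>assistant\nTitle: The Eldorian Beast"
)
print(extract_reply(sample))
```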

Training

2024-05-01T00:35:48.169914304Z wandb: Run history:
2024-05-01T00:35:48.169916994Z wandb:         train/epoch ▁▁▁▁▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇███
2024-05-01T00:35:48.169919544Z wandb:   train/global_step ▁▁▁▁▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇███
2024-05-01T00:35:48.169921664Z wandb:     train/grad_norm ▁▁▂▂▂▂▂▂▂▂▂▃▂▂▂▂▂▂▂▂▂▂▂▂█▂▂▂▂▂▃▂▂▃▂▃▂▃▂▁
2024-05-01T00:35:48.169923494Z wandb: train/learning_rate ████▇▇▇▇▇▆▆▆▆▆▆▅▅▅▅▅▄▄▄▄▄▄▃▃▃▃▃▂▂▂▂▂▂▁▁▁
2024-05-01T00:35:48.169925364Z wandb:          train/loss ▂▃▆▄▆█▆▂▃▅▅▄▄▃▄▅▆▄▄▄▅▅▂▆▄▃▄▁▅▃▆▃▃▄▄▄▃▆▃▅
2024-05-01T00:35:48.169927234Z wandb: 
2024-05-01T00:35:48.169929574Z wandb: Run summary:
2024-05-01T00:35:48.169931534Z wandb:               total_flos 1.5746891949997621e+19
2024-05-01T00:35:48.169933294Z wandb:              train/epoch 1.0
2024-05-01T00:35:48.169935114Z wandb:        train/global_step 30011
2024-05-01T00:35:48.169936884Z wandb:          train/grad_norm 0.77759
2024-05-01T00:35:48.169938934Z wandb:      train/learning_rate 0.0
2024-05-01T00:35:48.169940724Z wandb:               train/loss 1.0772
2024-05-01T00:35:48.169942854Z wandb:               train_loss 1.07496
2024-05-01T00:35:48.169944744Z wandb:            train_runtime 106480.5526
2024-05-01T00:35:48.169946874Z wandb: train_samples_per_second 2.255
2024-05-01T00:35:48.169948973Z wandb:   train_steps_per_second 0.282
2024-05-01T00:35:48.169950783Z wandb: 
2024-05-01T00:35:48.170089392Z wandb: 🚀 View run training at: https://wandb.ai/saucam/Saga-8B/runs/yv08wyiv
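The run summary figures can be cross-checked with a little arithmetic. The sketch below copies the logged values and derives the runtime in hours, the step rate, and an approximate number of samples per optimizer step. The samples-per-step figure is an inference from the logged rates, not something the log states directly, and the variable names are chosen for this example only.

```python
# Values copied from the "Run summary" section of the log.
train_runtime_s = 106480.5526
global_steps = 30011
samples_per_second = 2.255
steps_per_second_logged = 0.282

# Wall-clock training time in hours.
runtime_hours = train_runtime_s / 3600

# Step rate recomputed from steps and runtime; should agree with the logged value.
steps_per_second = global_steps / train_runtime_s

# Inferred (not logged): samples processed per optimizer step,
# i.e. the effective batch size implied by the two logged rates.
samples_per_step = samples_per_second / steps_per_second_logged

# Approximate number of samples seen over the single epoch.
approx_samples_seen = samples_per_second * train_runtime_s

print(f"runtime: {runtime_hours:.2f} h")
print(f"steps/s: {steps_per_second:.3f}")
print(f"samples per step (inferred): {samples_per_step:.2f}")
print(f"samples seen (approx.): {approx_samples_seen:.0f}")
```

Under these numbers the run took about 29.6 hours, and the logged rates imply roughly 8 samples per step, or on the order of 240,000 samples for the one epoch.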