---
datasets:
- cahya/instructions-all
license: bigscience-bloom-rail-1.0
language:
- de
- en
- es
- fr
- hi
- id
- ja
- ms
- pt
- ru
- th
- vi
- zh
pipeline_tag: text-generation
widget:
- text: "一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。Would you rate the previous review as positive, neutral or negative?"
  example_title: "zh-en sentiment"
- text: "一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。你认为这句话的立场是赞扬、中立还是批评?"
  example_title: "zh-zh sentiment"
- text: "Suggest at least five related search terms to \"Mạng neural nhân tạo\"."
  example_title: "vi-en query"
- text: "Proposez au moins cinq mots clés concernant «Réseau de neurones artificiels»."
  example_title: "fr-fr query"
- text: "Explain in a sentence in Telugu what is backpropagation in neural networks."
  example_title: "te-en qa"
- text: "Why is the sky blue?"
  example_title: "en-en qa"
- text: "Write a fairy tale about a troll saving a princess from a dangerous dragon. The fairy tale is a masterpiece that has achieved praise worldwide and its moral is \"Heroes Come in All Shapes and Sizes\". Story (in Spanish):"
  example_title: "es-en fable"
- text: "Write a fable about wood elves living in a forest that is suddenly invaded by ogres. The fable is a masterpiece that has achieved praise worldwide and its moral is \"Violence is the last refuge of the incompetent\". Fable (in Hindi):"
  example_title: "hi-en fable"
---
|
# Bloomz-7b1-instruct |
|
|
|
This is the Bloomz-7b1-mt model fine-tuned on the multilingual instruction dataset cahya/instructions-all using PEFT LoRA fine-tuning.
The following languages are supported: English, German, French, Spanish, Hindi, Indonesian, Japanese, Malay, Portuguese,
Russian, Thai, Vietnamese and Chinese.
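
The exact LoRA hyperparameters used for this fine-tune are not documented here. For orientation, a minimal PEFT LoRA setup for a BLOOM-family base model looks roughly like the sketch below; the rank, alpha, dropout and target modules are assumptions for illustration, not the values actually used.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Hypothetical LoRA configuration; r, lora_alpha and lora_dropout are assumed values
base = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-7b1-mt")
lora_config = LoraConfig(
    r=16,                                # assumed adapter rank
    lora_alpha=32,                       # assumed scaling factor
    target_modules=["query_key_value"],  # BLOOM's fused attention projection
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the LoRA adapter weights are trainable
```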
|
|
|
## Usage |
|
|
|
The following code runs inference with this model:
|
```python
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

peft_model_id = "cahya/bloomz-7b1-instruct"
config = PeftConfig.from_pretrained(peft_model_id)

# Load the base model in 8-bit, together with its tokenizer
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, peft_model_id)

# The model expects a "User: ...\nAssistant: " instruction template
batch = tokenizer("User: How old is the universe?\nAssistant: ", return_tensors="pt").to(0)

with torch.cuda.amp.autocast():
    output_tokens = model.generate(
        **batch,
        max_new_tokens=200,
        min_length=50,
        do_sample=True,
        top_k=40,
        top_p=0.9,
        temperature=0.2,
        repetition_penalty=1.2,
        num_return_sequences=1,
    )

print("\n\n", tokenizer.decode(output_tokens[0], skip_special_tokens=True))
```
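
To try the other widget prompts above, the generation call can be wrapped in a small helper. `ask` below is a hypothetical convenience function that reuses the `model` and `tokenizer` loaded above; it is not part of this repository.

```python
# Hypothetical helper reusing the model and tokenizer loaded above
def ask(prompt: str, **gen_kwargs) -> str:
    batch = tokenizer(f"User: {prompt}\nAssistant: ", return_tensors="pt").to(0)
    with torch.cuda.amp.autocast():
        output_tokens = model.generate(
            **batch, max_new_tokens=200, do_sample=True, top_k=40,
            top_p=0.9, temperature=0.2, repetition_penalty=1.2, **gen_kwargs,
        )
    return tokenizer.decode(output_tokens[0], skip_special_tokens=True)

print(ask("Why is the sky blue?"))
print(ask("Proposez au moins cinq mots clés concernant «Réseau de neurones artificiels»."))
```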