
Polyglot-4x7b-24b


Polyglot-4x7b is a 24.2B-parameter Mixture of Experts (MoE) approach to a multilingual model.

This project is an experiment to see whether each expert can specialize in a different language. The answer is yes.

The model is a merge of four Mistral-7B-based models, including experts capable of Chinese and Japanese output (a minimal routing sketch follows the list):

  • teknium/OpenHermes-2.5-Mistral-7B
  • oshizo/japanese-e5-mistral-7b_slerp
  • cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser
  • s3nh/Mistral-7B-Evol-Instruct-Chinese
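
To make the "one expert per language" idea concrete, below is a minimal sketch of Mixtral-style top-2 expert routing in PyTorch. The class name, dimensions, and layer layout are illustrative assumptions for exposition, not code taken from this repository:

import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEFeedForward(nn.Module):
    """Illustrative top-2 MoE feed-forward block (hypothetical names/sizes)."""
    def __init__(self, hidden_size=4096, ffn_size=14336, num_experts=4, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router: scores every token against each of the experts
        self.gate = nn.Linear(hidden_size, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_size, ffn_size, bias=False),
                nn.SiLU(),
                nn.Linear(ffn_size, hidden_size, bias=False),
            )
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (num_tokens, hidden_size)
        scores = self.gate(x)                              # (num_tokens, num_experts)
        weights, chosen = scores.topk(self.top_k, dim=-1)  # top-2 experts per token
        weights = F.softmax(weights, dim=-1)               # normalize the two weights
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e                # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

Under this scheme, a Japanese prompt should produce router scores that favor the Japanese-tuned expert, which is what the experiment set out to verify.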

TODO:

  1. [ ] polyglot tokenizer
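
Until the polyglot tokenizer exists, the merged model inherits Mistral's tokenizer, which typically spends more tokens per character on Chinese and Japanese text than on English. A quick hedged check, reusing the example prompts from this card:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("macadeliccc/laser-polyglot-4x7b")

samples = {
    "English":  "Write a quicksort algorithm in python",
    "Chinese":  "็”จPythonๅ†™ไธ€ไธชๅฟซ้€ŸๆŽ’ๅบ็ฎ—ๆณ•",
    "Japanese": "Pythonใงใ‚ฏใ‚คใƒƒใ‚ฏใ‚ฝใƒผใƒˆใ‚ขใƒซใ‚ดใƒชใ‚บใƒ ใ‚’ๆ›ธใ„ใฆใใ ใ•ใ„",
}
for lang, text in samples.items():
    # Higher tokens-per-character suggests the tokenizer covers
    # that language less efficiently.
    n_tokens = len(tokenizer(text)["input_ids"])
    print(f"{lang}: {n_tokens} tokens / {len(text)} characters")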

Other polyglot models

Code Example

  • Inference Colab
  • Live demo available on Spaces

from transformers import AutoModelForCausalLM, AutoTokenizer

def generate_response(prompt):
    """
    Generate a response from the model based on the input prompt.

    Args:
    prompt (str): Prompt for the model.

    Returns:
    str: The generated response from the model.
    """
    # Tokenize the input prompt
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Generate output tokens
    # Generate output tokens
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.pad_token_id,
    )

    # Decode the generated tokens to a string
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)

    return response

# Load the model and tokenizer
model_id = "macadeliccc/laser-polyglot-4x7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, load_in_4bit=True)  # 4-bit load requires bitsandbytes + CUDA

# Example prompts in different languages
english_prompt = "Write a quicksort algorithm in python"
chinese_prompt = "็”จPythonๅ†™ไธ€ไธชๅฟซ้€ŸๆŽ’ๅบ็ฎ—ๆณ•"
japanese_prompt = "Pythonใงใ‚ฏใ‚คใƒƒใ‚ฏใ‚ฝใƒผใƒˆใ‚ขใƒซใ‚ดใƒชใ‚บใƒ ใ‚’ๆ›ธใ„ใฆใใ ใ•ใ„"

# Generate and print responses for each language
print("English Response:")
print(generate_response(english_prompt), "\n")

print("Chinese Response:")
print(generate_response(chinese_prompt), "\n")

print("Japanese Response:")
print(generate_response(japanese_prompt), "\n")
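
Note that load_in_4bit=True requires the bitsandbytes package and a CUDA GPU, and newer transformers releases prefer an explicit quantization config. A hedged alternative loading path (the CPU fallback dtype is an assumption, not part of the original example):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "macadeliccc/laser-polyglot-4x7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)

if torch.cuda.is_available():
    # Explicit 4-bit quantization config (requires bitsandbytes)
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=quant_config, device_map="auto"
    )
else:
    # CPU fallback: bf16 weights (the card lists BF16 tensors)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)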

Example Output

English Response:

Write a quicksort algorithm in python.

def quicksort(arr):
    if len(arr) <= 1:
        return arr
    else:
        pivot = arr[0]
        less = [i for i in arr[1:] if i <= pivot]
        greater = [i for i in arr[1:] if i > pivot]
        return quicksort(less) + [pivot] + quicksort(greater)

arr = [5, 2, 9, 1, 5, 7, 4, 8, 6, 3]
print(quicksort(arr))

This is a simple implementation of the quicksort algorithm in python. The function quicksort takes an array as input and returns a sorted array. The algorithm works by selecting a pivot element from the array and partitioning the other elements into two sub-arrays, according to whether they are less than or greater than the pivot. The process is then repeated recursively on the sub-arrays until the entire array is sorted.

Chinese Response:

็”จPythonๅ†™ไธ€ไธชๅฟซ้€ŸๆŽ’ๅบ็ฎ—ๆณ•

def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    else:
        pivot = arr[0]
        less = [i for i in arr[1:] if i <= pivot]
        greater = [i for i in arr[1:] if i > pivot]
        return quick_sort(less) + [pivot] + quick_sort(greater)

arr = [3, 5, 2, 1, 4, 6, 8, 7]
print(quick_sort(arr))

่ฟ™ไธช็จ‹ๅบ็š„ๆ—ถ้—ดๅคๆ‚ๅบฆไธบO(nlogn)๏ผŒ็ฉบ้—ดๅคๆ‚ๅบฆไธบO(n)ใ€‚ (Translation: this program's time complexity is O(n log n), and its space complexity is O(n).)

Japanese Response:

Pythonใงใ‚ฏใ‚คใƒƒใ‚ฏใ‚ฝใƒผใƒˆใ‚ขใƒซใ‚ดใƒชใ‚บใƒ ใ‚’ๆ›ธใ„ใฆใใ ใ•ใ„ใ€‚

def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[0]
    left = [x for x in arr[1:] if x < pivot]
    right = [x for x in arr[1:] if x >= pivot]
    return quicksort(left) + [pivot] + quicksort(right)

print(quicksort([3,6,8,10,1,5,9,2,4,7]))

ใ“ใฎใ‚ณใƒผใƒ‰ใฏใ‚ฏใ‚คใƒƒใ‚ฏใ‚ฝใƒผใƒˆใ‚ขใƒซใ‚ดใƒชใ‚บใƒ ใ‚’ๅฎŸ่ฃ…ใ—ใฆใ„ใพใ™ใ€‚ใ‚ฏใ‚คใƒƒใ‚ฏใ‚ฝใƒผใƒˆใฏไธ€็จฎใฎๅˆ†ๅ‰ฒใจ conquers ใ‚ขใƒซใ‚ดใƒชใ‚บใƒ ใงใ€้…ๅˆ—ใ‚’ๅˆ†ๅ‰ฒใ—ใ€ใใ‚Œใžใ‚Œใฎ้ƒจๅˆ†้…ๅˆ—ใ‚’ๅ†ๅธฐ็š„ใซใ‚ฝใƒผใƒˆใ—ใพใ™ใ€‚

ใ“ใฎๅฎŸ่ฃ…ใงใฏใ€้…ๅˆ—ใฎๆœ€ๅˆใฎ่ฆ็ด ใ‚’ใƒ”ใƒœใƒƒใƒˆใจใ—ใฆไฝฟ็”จใ—ใพใ™ใ€‚ใใ—ใฆใ€้…ๅˆ—ใ‚’2ใคใฎ

Evaluations

| Tasks         | Version | Filter | n-shot | Metric   | Value  | Stderr   |
|---------------|---------|--------|--------|----------|--------|----------|
| arc_challenge | Yaml    | none   | 0      | acc      | 0.5495 | ± 0.0145 |
|               |         | none   | 0      | acc_norm | 0.5794 | ± 0.0144 |
| arc_easy      | Yaml    | none   | 0      | acc      | 0.8304 | ± 0.0077 |
|               |         | none   | 0      | acc_norm | 0.8068 | ± 0.0081 |
| boolq         | Yaml    | none   | 0      | acc      | 0.8749 | ± 0.0058 |
| hellaswag     | Yaml    | none   | 0      | acc      | 0.6276 | ± 0.0048 |
|               |         | none   | 0      | acc_norm | 0.8157 | ± 0.0039 |
| openbookqa    | Yaml    | none   | 0      | acc      | 0.3180 | ± 0.0208 |
|               |         | none   | 0      | acc_norm | 0.4460 | ± 0.0223 |
| piqa          | Yaml    | none   | 0      | acc      | 0.8139 | ± 0.0091 |
|               |         | none   | 0      | acc_norm | 0.8237 | ± 0.0089 |
| winogrande    | Yaml    | none   | 0      | acc      | 0.7419 | ± 0.0123 |
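
The table above follows the output format of EleutherAI's lm-evaluation-harness. A hedged sketch of how comparable zero-shot numbers can be reproduced (harness version and arguments are assumptions; exact scores vary with quantization and hardware):

import lm_eval  # pip install lm-eval

# Evaluate the merged model on the same zero-shot task set as the table.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=macadeliccc/laser-polyglot-4x7b",
    tasks=["arc_challenge", "arc_easy", "boolq", "hellaswag",
           "openbookqa", "piqa", "winogrande"],
    num_fewshot=0,
)
print(results["results"])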

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric                            | Value |
|-----------------------------------|-------|
| Avg.                              | 65.79 |
| AI2 Reasoning Challenge (25-Shot) | 64.16 |
| HellaSwag (10-Shot)               | 84.98 |
| MMLU (5-Shot)                     | 63.88 |
| TruthfulQA (0-shot)               | 55.47 |
| Winogrande (5-shot)               | 77.82 |
| GSM8k (5-shot)                    | 48.45 |