---
datasets:
  - WizardLM/WizardLM_evol_instruct_V2_196k
  - Open-Orca/OpenOrca
language:
  - en
tags:
  - chat
  - palmyra
---

Writer/palmyra-20b-chat


Usage


import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer

model_name = "Writer/palmyra-20b-chat"

tokenizer = AutoTokenizer.from_pretrained(model_name)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "What is the meaning of life?"

input_text = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions. "
    "USER: {prompt} "
    "ASSISTANT:"
)

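# Move the inputs to the device the model's first layer was placed on;
# with device_map="auto" this is not guaranteed to be plain "cuda".
model_inputs = tokenizer(input_text.format(prompt=prompt), return_tensors="pt").to(
    model.device
)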

gen_conf = {
    "top_k": 20,
    "max_new_tokens": 2048,
    "temperature": 0.6,
    "do_sample": True,
    "eos_token_id": tokenizer.eos_token_id,
}

streamer = TextStreamer(tokenizer)
if "token_type_ids" in model_inputs:
    del model_inputs["token_type_ids"]

all_inputs = {**model_inputs, **gen_conf}
output = model.generate(**all_inputs, streamer=streamer)

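print("-" * 20)
# generate() returns token IDs; decode them to get readable text.
print(tokenizer.decode(output[0], skip_special_tokens=True))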
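
The prompt template in the usage code can be wrapped in small helpers so that every request uses the same format. The following is a minimal sketch, not part of the model's official API: `SYSTEM_PROMPT`, `build_prompt`, and `extract_reply` are hypothetical names introduced here, and the `USER:` / `ASSISTANT:` layout is copied from the example above.

```python
# Hypothetical helpers; the template text mirrors the usage example above.
SYSTEM_PROMPT = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_prompt(user_message: str) -> str:
    """Return the single-turn prompt string in the format used above."""
    return f"{SYSTEM_PROMPT} USER: {user_message} ASSISTANT:"


def extract_reply(decoded_text: str, prompt: str) -> str:
    """Return only the assistant's reply from decoded generation output.

    The decoded output normally starts with the prompt itself. If it does,
    strip that prefix; otherwise fall back to the text after the first
    "ASSISTANT:" marker (a heuristic, assumed here for illustration).
    """
    if decoded_text.startswith(prompt):
        return decoded_text[len(prompt):].strip()
    _, _, reply = decoded_text.partition("ASSISTANT:")
    return reply.strip()


prompt = build_prompt("What is the meaning of life?")
print(prompt)
```

Pass `build_prompt(...)` to the tokenizer in place of `input_text.format(prompt=prompt)`, and pass the decoded output together with the same prompt to `extract_reply` to drop the echoed prompt.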

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 38.97 |
| ARC (25-shot) | 43.52 |
| HellaSwag (10-shot) | 72.83 |
| MMLU (5-shot) | 35.18 |
| TruthfulQA (0-shot) | 43.17 |
| Winogrande (5-shot) | 66.46 |
| GSM8K (5-shot) | 3.94 |
| DROP (3-shot) | 7.7 |
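
As a quick sanity check, the reported average is consistent with an unweighted mean of the seven benchmark scores. The sketch below assumes that is how "Avg." was computed, which the table itself does not state. The `scores` dictionary simply restates the values listed above.

```python
# Scores copied from the table above; the unweighted mean is an assumption.
scores = {
    "ARC (25-shot)": 43.52,
    "HellaSwag (10-shot)": 72.83,
    "MMLU (5-shot)": 35.18,
    "TruthfulQA (0-shot)": 43.17,
    "Winogrande (5-shot)": 66.46,
    "GSM8K (5-shot)": 3.94,
    "DROP (3-shot)": 7.7,
}

average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")
```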