---
language:
- en
license: llama2
tags:
- text generation
- instruct
datasets:
- PygmalionAI/PIPPA
- Open-Orca/OpenOrca
- Norquinal/claude_multiround_chat_30k
- jondurbin/airoboros-gpt4-1.4.1
- databricks/databricks-dolly-15k
pipeline_tag: text-generation
inference: false
model-index:
- name: pygmalion-2-7b
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 54.01
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PygmalionAI/pygmalion-2-7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 78.23
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PygmalionAI/pygmalion-2-7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 49.11
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PygmalionAI/pygmalion-2-7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 43.78
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PygmalionAI/pygmalion-2-7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 75.14
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PygmalionAI/pygmalion-2-7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 6.37
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PygmalionAI/pygmalion-2-7b
      name: Open LLM Leaderboard
---
|
<h1 style="text-align: center">Pygmalion-2 7B</h1>
<h2 style="text-align: center">An instruction-tuned Llama-2 biased towards fiction writing and conversation.</h2>
|
|
|
## Model Details

The long-awaited release of our new models based on Llama-2 is finally here. Pygmalion-2 7B (formerly known as Metharme) is based on
[Llama-2 7B](https://huggingface.co/meta-llama/llama-2-7b-hf) released by Meta AI.

The Metharme models were an experiment in building a model that is usable for conversation, roleplaying and storywriting,
but which can also be guided with natural language like other instruct models. After much deliberation, we concluded
that the Metharme prompting format is superior to (and easier to use than) the classic Pygmalion format.

This model was trained with supervised fine-tuning on a mixture of regular instruction data alongside roleplay, fictional stories,
and conversations with synthetically generated instructions attached.

This model is freely available for both commercial and non-commercial use, as per the Llama-2 license.
|
|
|
|
|
## Prompting

The model has been trained on prompts using three different roles, which are denoted by the following tokens: `<|system|>`, `<|user|>` and `<|model|>`.

The `<|system|>` prompt can be used to inject out-of-channel information behind the scenes, while the `<|user|>` prompt should be used to indicate user input.
The `<|model|>` token should then be used to indicate that the model should generate a response. These tokens can appear multiple times and be chained together to
form a conversation history.
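
For illustration, a hypothetical multi-turn history flattened into this format would look like the following (everything in `{{...}}` is a placeholder, not literal text):

```
<|system|>{{system prompt}}<|user|>{{first user message}}<|model|>{{model's first reply}}<|user|>{{second user message}}<|model|>
```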
|
|
|
### Prompting example

The system prompt has been designed to allow the model to "enter" various modes and dictate the reply length. Here's an example:

```
<|system|>Enter RP mode. Pretend to be {{char}} whose persona follows:
{{persona}}

You shall reply to the user while staying in character, and generate long responses.
```
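
For completeness, here is a minimal, unofficial sketch of running this prompt format with the Hugging Face `transformers` library; the generation parameters are illustrative choices rather than tuned recommendations:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer from the Hub.
model_name = "PygmalionAI/pygmalion-2-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Assemble a Metharme-style prompt. {{char}} and {{persona}} are
# placeholders you would substitute with your own character details;
# the trailing <|model|> token asks the model to generate a reply.
prompt = (
    "<|system|>Enter RP mode. Pretend to be {{char}} whose persona follows:\n"
    "{{persona}}\n\n"
    "You shall reply to the user while staying in character, and generate long responses."
    "<|user|>Hello! How are you today?<|model|>"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=256,  # illustrative cap on reply length
    do_sample=True,      # sampling tends to suit creative writing
    temperature=0.8,     # illustrative value, tune to taste
)

# Decode only the newly generated tokens, skipping the prompt.
reply = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(reply)
```

Sampling settings strongly affect creative output, so expect to experiment with them.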
|
|
|
## Dataset

The dataset used to fine-tune this model includes our own [PIPPA](https://huggingface.co/datasets/PygmalionAI/PIPPA), along with several other instruction
datasets, and datasets acquired from various RP forums.
|
|
|
## Limitations and biases

The intended use-case for this model is fictional writing for entertainment purposes. Any other sort of usage is out of scope.

As such, it was **not** fine-tuned to be safe and harmless: the base model _and_ this fine-tune have been trained on data known to contain profanity and texts that
are lewd or otherwise offensive. It may produce socially unacceptable or undesirable text, even if the prompt itself does not include anything explicitly offensive.
Outputs might often be factually wrong or misleading.
|
|
|
## Acknowledgements

We would like to thank [SpicyChat](https://spicychat.ai/) for sponsoring the training for this model.

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
|
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_PygmalionAI__pygmalion-2-7b).

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 51.11 |
| AI2 Reasoning Challenge (25-Shot) | 54.01 |
| HellaSwag (10-Shot)               | 78.23 |
| MMLU (5-Shot)                     | 49.11 |
| TruthfulQA (0-shot)               | 43.78 |
| Winogrande (5-shot)               | 75.14 |
| GSM8k (5-shot)                    |  6.37 |
|
|
|
|