---
inference: false
base_model:
- SanjiWatsuki/Silicon-Maid-7B
- sethuiyer/Aika-7B
- sethuiyer/Nandine-7b
- mlabonne/AlphaMonarch-7B
library_name: transformers
tags:
- mergekit
- merge
- not-for-all-audiences
license: cc
model-index:
- name: sethuiyer/Diana-7B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 68.34
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/Diana-7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 86.73
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/Diana-7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 64.58
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/Diana-7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 60.55
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/Diana-7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 80.19
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/Diana-7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 63.23
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/Diana-7B
      name: Open LLM Leaderboard
language:
- en
pipeline_tag: conversational
---

# Diana-7B

<p align="center">
  <img src="https://huggingface.co/sethuiyer/Diana-7B/resolve/main/diana.webp" height="128px" alt="Diana">
</p>

Diana-7B was rated **93.56/100** by GPT-4 on a set of 30 synthetic prompts that GPT-4 itself generated.

It is a merge of the following models, made with [mergekit](https://github.com/cg123/mergekit):

1. [mlabonne/AlphaMonarch-7B](https://huggingface.co/mlabonne/AlphaMonarch-7B): Strong conversational ability, a formal and sophisticated style, and solid reasoning skills.
2. [sethuiyer/Aika-7B](https://huggingface.co/sethuiyer/Aika-7B): A merge of SanjiWatsuki/Silicon-Maid-7B, Guilherme34/Samantha-v2, jan-hq/stealth-v1.3, and senseable/WestLake-7B-v2, designed for natural, human-like interaction, accurate information delivery, comprehensive analysis, emotional intelligence, clarity, and structure.
3. [SanjiWatsuki/Silicon-Maid-7B](https://huggingface.co/SanjiWatsuki/Silicon-Maid-7B): Known for its multi-turn conversational skill and logical coherence.
4. [sethuiyer/Nandine-7b](https://huggingface.co/sethuiyer/Nandine-7b): A merge of senseable/Westlake-7B, Guilherme34/Samantha-v2, and uukuguy/speechless-mistral-six-in-one-7b, with strengths in narrative skill, empathetic interaction, intellectual depth, and eloquent communication.

Combining these models gives Diana-7B a balanced blend of capabilities, suited to creative writing, thoughtful discussion, problem-solving, and general assistance.

## OpenLLM Benchmark

| Model                 | Average ⬆️ | ARC   | HellaSwag | MMLU  | TruthfulQA | Winogrande | GSM8K |
|-----------------------|------------|-------|-----------|-------|------------|------------|-------|
| sethuiyer/Diana-7B 📑 | 70.6       | 68.34 | 86.73     | 64.58 | 60.55      | 80.19      | 63.23 |

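The Average column appears to be the unweighted mean of the six benchmark scores in the row. A quick check in Python, with the scores copied from the table above (the variable names are illustrative only):

```python
# Scores copied from the Diana-7B row of the OpenLLM table.
scores = {
    "ARC": 68.34,
    "HellaSwag": 86.73,
    "MMLU": 64.58,
    "TruthfulQA": 60.55,
    "Winogrande": 80.19,
    "GSM8K": 63.23,
}

# Unweighted mean over the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 70.60
```
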
## Nous Benchmark

| Model                                                 | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|-------------------------------------------------------|--------:|--------:|-----------:|---------:|--------:|
| [Diana-7B](https://huggingface.co/sethuiyer/Diana-7B) |   44.38 |    75.1 |      60.55 |    44.58 |   56.09 |

## Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: mlabonne/AlphaMonarch-7B
dtype: bfloat16
merge_method: dare_ties
models:
  - model: mlabonne/AlphaMonarch-7B
  - model: sethuiyer/Aika-7B
    parameters:
      density: 0.85
      weight: 0.30
  - model: SanjiWatsuki/Silicon-Maid-7B
    parameters:
      density: 0.85
      weight: 0.50
  - model: sethuiyer/Nandine-7b
    parameters:
      density: 0.85
      weight: 0.30
parameters:
  int8_mask: true
```

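As rough intuition for the `density` and `weight` parameters: the DARE technique takes each fine-tuned model's difference from the base model (its delta), randomly drops part of it, and rescales the surviving entries by `1 / density` so the expected delta is preserved. The toy sketch below illustrates only that drop-and-rescale step on plain Python lists. It is not mergekit's implementation, it omits the TIES sign-election step entirely, and `dare_delta`, its arguments, and the sample values are hypothetical names and numbers chosen for illustration.

```python
import random


def dare_delta(base, tuned, density, weight, rng):
    """Toy DARE step: sparsify (tuned - base), rescale, then apply weight."""
    out = []
    for b, t in zip(base, tuned):
        delta = t - b
        # Keep each delta entry with probability `density`; drop it otherwise.
        if rng.random() < density:
            # Rescale survivors so the expected delta is unchanged.
            out.append(weight * delta / density)
        else:
            out.append(0.0)
    return out


rng = random.Random(0)  # fixed seed so the sketch is repeatable
base = [0.10, -0.20, 0.30, 0.00]   # made-up base weights
tuned = [0.15, -0.10, 0.25, 0.40]  # made-up fine-tuned weights
contribution = dare_delta(base, tuned, density=0.85, weight=0.30, rng=rng)
merged = [b + c for b, c in zip(base, contribution)]
print(merged)
```

With `density: 0.85`, roughly 15% of each model's delta entries would be dropped in this simplified picture, and the larger `weight` on Silicon-Maid-7B (0.50 versus 0.30) would give its delta more influence on the result.
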
## Prompt Template

```text
{bos}user
{ .Prompt }{eos}
{bos}assistant
```

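A minimal sketch of filling in the template above from Python. The helper name `build_prompt` is hypothetical, and the default `<s>` and `</s>` values are an assumption based on common Mistral-family tokenizers; check the model's tokenizer for its actual `bos_token` and `eos_token` before relying on them. The trailing newline after `assistant` is also an assumption.

```python
def build_prompt(user_message: str, bos: str = "<s>", eos: str = "</s>") -> str:
    """Render a single-turn prompt following the template above."""
    # The bos/eos defaults are assumptions; pass the tokenizer's own
    # bos_token and eos_token if they differ.
    return f"{bos}user\n{user_message}{eos}\n{bos}assistant\n"


prompt = build_prompt("Write a haiku about the moon.")
print(prompt)
```
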
## GGUF

GGUF files are available at [Diana-7B-GGUF](https://huggingface.co/sethuiyer/Diana-7B-GGUF/tree/main).

## Ollama

Diana is available on Ollama. To use it, run `ollama run stuehieyr/diana` in your terminal. If you have limited computing resources, this [video](https://www.youtube.com/watch?v=Qa1h7ygwQq8) shows how to run it on a Google Colab backend.