---
tags:
- merge
- mergekit
- louisbrulenaudet/Pearl-7B-slerp
- WizardLM/WizardMath-7B-V1.1
- cognitivecomputations/WestLake-7B-v2-laser
- CultriX/NeuralTrix-7B-dpo
- chemistry
- biology
- math
base_model:
- louisbrulenaudet/Pearl-7B-slerp
- WizardLM/WizardMath-7B-V1.1
- cognitivecomputations/WestLake-7B-v2-laser
- CultriX/NeuralTrix-7B-dpo
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
model-index:
- name: Pearl-7B-0211-ties
  results:
  - task:
      type: text-generation
    metrics:
    - name: Average
      type: Average
      value: 75.11
    - name: ARC
      type: ARC
      value: 71.42
    - name: GSM8K
      type: GSM8K
      value: 70.66
    - name: Winogrande
      type: Winogrande
      value: 84.37
    - name: TruthfulQA
      type: TruthfulQA
      value: 71.46
    - name: HellaSwag
      type: HellaSwag
      value: 88.86
    source:
      name: Open LLM Leaderboard
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
---
<center><img src='https://i.imgur.com/0xFTuAX.png' width='450px'></center>
# Pearl-7B-0211-ties, an xtraordinary 7B model
**03-22-2024 - To date, louisbrulenaudet/Pearl-34B-ties is the "Best 🤝 base merges and moerges model of around 30B" on the Open LLM Leaderboard.**
Pearl-7B-0211-ties is a merge of the following models:
* [louisbrulenaudet/Pearl-7B-slerp](https://huggingface.co/louisbrulenaudet/Pearl-7B-slerp)
* [WizardLM/WizardMath-7B-V1.1](https://huggingface.co/WizardLM/WizardMath-7B-V1.1)
* [cognitivecomputations/WestLake-7B-v2-laser](https://huggingface.co/cognitivecomputations/WestLake-7B-v2-laser)
* [CultriX/NeuralTrix-7B-dpo](https://huggingface.co/CultriX/NeuralTrix-7B-dpo)
## Evaluation
The model was evaluated on the Hugging Face Open LLM Leaderboard.
| Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | #Params (B) |
|--------------------------------------------------|---------|-------|-----------|-------|------------|------------|-------|--------------|
| **louisbrulenaudet/Pearl-34B-ties** | **75.48** | 70.99 | 84.83 | **76.63** | 70.32 | 82.64 | 67.48 | 34.39 |
| **louisbrulenaudet/Pearl-7B-0211-ties** | **75.11** | **71.42** | **88.86** | 63.91 | **71.46** | **84.37** | 70.66 | 7.24 |
| NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO | 73.35 | 71.08 | 87.29 | 72.17 | 54.83 | 83.11 | 71.65 | 46.7 |
| argilla/notus-8x7b-experiment | 73.18 | 70.99 | 87.73 | 71.33 | 65.79 | 81.61 | 61.64 | 46.7 |
| **louisbrulenaudet/Pearl-7B-slerp** | 72.75 | 68.00 | 87.16 | 64.04 | 62.35 | 81.29 | **73.62** | 7.24 |
| mistralai/Mixtral-8x7B-Instruct-v0.1 | 72.7 | 70.14 | 87.55 | 71.4 | 64.98 | 81.06 | 61.11 | 46.7 |
| microsoft/Orca-2-13b | 61.98 | 60.92 | 79.85 | 60.3 | 56.42 | 76.56 | 37.83 | 13 |
| microsoft/phi-2 | 61.33 | 61.09 | 75.11 | 58.11 | 44.47 | 74.35 | 54.81 | 2.78 |
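The Average column appears to be the arithmetic mean of the six benchmark columns. The short check below recomputes it for the two top Pearl rows, using scores copied from the table; the variable and function names are illustrative only, and other rows may differ by 0.01, presumably because the leaderboard averages unrounded scores.

```python
# Benchmark scores copied from the table above.
rows = {
    "louisbrulenaudet/Pearl-34B-ties": {
        "ARC": 70.99, "HellaSwag": 84.83, "MMLU": 76.63,
        "TruthfulQA": 70.32, "Winogrande": 82.64, "GSM8K": 67.48,
    },
    "louisbrulenaudet/Pearl-7B-0211-ties": {
        "ARC": 71.42, "HellaSwag": 88.86, "MMLU": 63.91,
        "TruthfulQA": 71.46, "Winogrande": 84.37, "GSM8K": 70.66,
    },
}


def mean_score(scores):
    """Arithmetic mean of the benchmark scores, rounded to two decimals."""
    return round(sum(scores.values()) / len(scores), 2)


averages = {name: mean_score(scores) for name, scores in rows.items()}
for name, value in averages.items():
    print(f"{name}: {value}")
```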
### Ties merging
TIES-Merging is a method for merging multiple task-specific models into a single multitask model. It addresses two main sources of interference in model merging: redundant parameter changes and sign disagreements between models.
The first challenge is redundancy in model parameters. TIES-Merging looks at the changes each task-specific model made during fine-tuning, retains only the top-k% most significant changes, and discards the rest.
The second challenge is conflict between parameter signs across different models. TIES-Merging resolves these conflicts by creating a unified sign vector that represents the most dominant direction of change across all models.
The TIES-Merging process consists of three steps:
- Trim: Reduces redundancy in task-specific models by retaining a fraction of the most significant parameters (density parameter) and resetting the remaining parameters to zero.
- Elect Sign: Resolves sign conflicts across different models by creating a unified sign vector based on the most dominant direction (positive or negative) in terms of cumulative magnitude.
- Disjoint Merge: Averages parameter values aligned with the unified sign vector, excluding zero values.
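The three steps above can be sketched on flat lists of floats. This is a minimal illustration under simplifying assumptions, not mergekit's implementation: the function names `trim` and `ties_merge`, the toy values, and the choice to apply the weights before electing the sign are all assumptions made for the example, and weights are assumed to be positive.

```python
def trim(task_vector, density):
    """Trim: keep the `density` fraction of largest-magnitude entries, zero the rest."""
    k = int(round(density * len(task_vector)))
    order = sorted(range(len(task_vector)), key=lambda i: abs(task_vector[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(task_vector)]


def ties_merge(base, finetuned, densities, weights):
    """Merge fine-tuned parameter lists into `base` with a simplified TIES procedure."""
    # Task vectors: what each fine-tuned model changed relative to the base.
    task_vectors = [[f - b for f, b in zip(model, base)] for model in finetuned]
    # Trim each task vector, then scale it by its model weight.
    scaled = [
        [w * v for v in trim(tv, d)]
        for tv, d, w in zip(task_vectors, densities, weights)
    ]
    merged = []
    for i, b in enumerate(base):
        column = [tv[i] for tv in scaled]
        # Elect sign: the direction with the larger cumulative magnitude wins.
        sign = 1.0 if sum(column) >= 0 else -1.0
        # Disjoint merge: average only non-zero entries that agree with the sign.
        agreeing = [(v, w) for v, w in zip(column, weights) if v != 0.0 and v * sign > 0]
        if not agreeing:
            merged.append(b)  # nothing survived trimming: keep the base value
            continue
        weight_sum = sum(w for _, w in agreeing)
        merged.append(b + sum(v for v, _ in agreeing) / weight_sum)
    return merged


base = [0.0, 0.0, 0.0, 0.0]
model_a = [1.0, 0.1, -2.0, 0.5]
model_b = [-3.0, 0.2, -1.0, 0.4]
result = ties_merge(base, [model_a, model_b], densities=[0.5, 0.5], weights=[0.5, 0.5])
print(result)
```

In this toy run the two models disagree on the sign of the first parameter; the larger negative change from `model_b` wins the election, so the positive change from `model_a` is excluded from the average rather than cancelling it out.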
## Configuration
```yaml
models:
  - model: OpenPipe/mistral-ft-optimized-1227
  - model: louisbrulenaudet/Pearl-7B-slerp
    parameters:
      density: 0.6
      weight: 0.3
  - model: WizardLM/WizardMath-7B-V1.1
    parameters:
      density: 0.55
      weight: 0.2
  - model: cognitivecomputations/WestLake-7B-v2-laser
    parameters:
      density: 0.55
      weight: 0.25
  - model: CultriX/NeuralTrix-7B-dpo
    parameters:
      density: 0.6
      weight: 0.25
merge_method: ties
base_model: OpenPipe/mistral-ft-optimized-1227
parameters:
  normalize: true
  int8_mask: true
dtype: float16
```
```
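As a quick sanity check on the configuration above, the snippet below sums the per-model weights and prints each model's relative share. It only restates values from the YAML; the `settings` dictionary and variable names are illustrative, and nothing here calls mergekit. The base model is left out because it carries no density or weight in the configuration.

```python
# Per-model settings copied from the configuration above: (density, weight).
settings = {
    "louisbrulenaudet/Pearl-7B-slerp": (0.6, 0.3),
    "WizardLM/WizardMath-7B-V1.1": (0.55, 0.2),
    "cognitivecomputations/WestLake-7B-v2-laser": (0.55, 0.25),
    "CultriX/NeuralTrix-7B-dpo": (0.6, 0.25),
}

# Sum of the merge weights across the four fine-tuned models.
total_weight = sum(weight for _, weight in settings.values())
print(f"total weight: {total_weight:.2f}")

for name, (density, weight) in settings.items():
    share = weight / total_weight
    print(f"{name}: keeps {density:.0%} of its changes, relative weight {share:.0%}")
```

With `normalize: true`, the merged changes should be rescaled by the sum of the contributing weights, so the relative shares matter more than the absolute values.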
## Usage
```python
# Install the dependencies first (in a notebook, prefix the command with "!"):
#   pip install -qU transformers accelerate
import torch
import transformers
from transformers import AutoTokenizer

model = "louisbrulenaudet/Pearl-7B-0211-ties"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the conversation with the model's chat template.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model in half precision and place it on the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample a response of up to 256 new tokens.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
## Citing & Authors
If you use this model in your research, please cite it with the following BibTeX entry.
```BibTeX
@misc{louisbrulenaudet2023,
  author = {Louis Brulé Naudet},
  title = {Pearl-7B-0211-ties, an xtraordinary 7B model},
  year = {2023},
  howpublished = {\url{https://huggingface.co/louisbrulenaudet/Pearl-7B-0211-ties}},
}
```
## Feedback
If you have any feedback, please reach out at [[email protected]](mailto:[email protected]).