---
license: other
tags:
- merge
- mergekit
- lazymergekit
base_model:
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
- mlabonne/Meta-Llama-3-120B-Instruct
---
![image/png](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/X1tDlFYMMFPNI_YkDXYbE.png)
# Meta-Llama-3-225B-Instruct
Meta-Llama-3-225B-Instruct is a self-merge of [mlabonne/Meta-Llama-3-120B-Instruct](https://huggingface.co/mlabonne/Meta-Llama-3-120B-Instruct), which is itself a self-merge of [meta-llama/Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct).
It was inspired by large merges like:
- [alpindale/goliath-120b](https://huggingface.co/alpindale/goliath-120b)
- [nsfwthrowitaway69/Venus-120b-v1.0](https://huggingface.co/nsfwthrowitaway69/Venus-120b-v1.0)
- [cognitivecomputations/MegaDolphin-120b](https://huggingface.co/cognitivecomputations/MegaDolphin-120b)
- [wolfram/miquliz-120b-v2.0](https://huggingface.co/wolfram/miquliz-120b-v2.0)
I don't recommend using it as it seems to break quite easily (but feel free to prove me wrong).
## 🧩 Configuration
```yaml
slices:
- sources:
  - layer_range: [0, 20]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [10, 30]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [20, 40]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [30, 50]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [40, 60]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [50, 70]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [60, 80]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [70, 90]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [80, 100]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [90, 110]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [100, 120]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [110, 130]
    model: mlabonne/Meta-Llama-3-120B-Instruct
- sources:
  - layer_range: [120, 140]
    model: mlabonne/Meta-Llama-3-120B-Instruct
merge_method: passthrough
dtype: float16
```
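The 13 slices follow a simple pattern: a 20-layer window is slid over the 140 layers of the source model in steps of 10, so consecutive slices overlap by 10 layers. As a sanity check, the slice list can be generated programmatically; the snippet below is a minimal sketch (it assumes PyYAML is installed and is not part of mergekit itself):

```python
# Hypothetical helper (not part of mergekit): reproduces the slice list above
# by sliding a 20-layer window over the source model's layers in steps of 10.
import yaml

SOURCE_MODEL = "mlabonne/Meta-Llama-3-120B-Instruct"
WINDOW, STEP, NUM_LAYERS = 20, 10, 140  # assumed layer count of the source model

slices = [
    {"sources": [{"layer_range": [start, start + WINDOW], "model": SOURCE_MODEL}]}
    for start in range(0, NUM_LAYERS - WINDOW + 1, STEP)
]
config = {"slices": slices, "merge_method": "passthrough", "dtype": "float16"}

print(yaml.dump(config, sort_keys=False))
```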
## 💻 Usage
```python
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "mlabonne/Meta-Llama-3-225B-Instruct"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Build the prompt with the model's chat template
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model across available devices and generate
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
``` |
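
At roughly 225B parameters, the float16 weights are far too large for most single-node setups. A common workaround is 4-bit quantization at load time; the snippet below is a rough sketch (not part of the original card), assuming `bitsandbytes` is installed and that enough combined GPU/CPU memory is available:

```python
!pip install -qU transformers accelerate bitsandbytes

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

model_id = "mlabonne/Meta-Llama-3-225B-Instruct"

# 4-bit NF4 quantization to cut memory use roughly 4x versus float16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

messages = [{"role": "user", "content": "What is a large language model?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```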