Commit 934e19e by DrNicefellow (parent: e4fc8dc): Update README.md

README.md CHANGED
@@ -2,11 +2,11 @@
 license: apache-2.0
 ---
 
-# Mixtral-8x7B--v0.1: Model
+# Mixtral-8x7B--v0.1: Model 3
 
 ## Model Description
 
-This model is the
+This model is the 3rd extracted standalone model from [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1), using the [Mixtral Model Expert Extractor tool](https://github.com/MeNicefellow/Mixtral-Model-Expert-Extractor) I made. It is constructed by selecting the first expert from each Mixture of Experts (MoE) layer. The extraction of this model is experimental, and it is expected to perform worse than Mistral-7B.
 
 ## Model Architecture
 
@@ -21,7 +21,7 @@ The architecture of this model includes:
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
-model_name = "DrNicefellow/Mistral-
+model_name = "DrNicefellow/Mistral-3-from-Mixtral-8x7B-v0.1"
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 model = AutoModelForCausalLM.from_pretrained(model_name)
 
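The updated description says the model was built by keeping the first expert of every MoE layer. The authoritative implementation is the linked Mixtral Model Expert Extractor tool; the sketch below is only a rough illustration of that idea, assuming the standard `transformers` module layouts for Mixtral and Mistral (expert weights `w1`/`w2`/`w3` mapped onto `gate_proj`/`up_proj`/`down_proj`). The config handling and output path are illustrative assumptions, not taken from the tool.

```python
# Rough sketch only, NOT the linked extractor tool: keep expert 0 of every
# Mixtral MoE layer and pack it into a dense Mistral-style checkpoint.
# Assumes enough CPU RAM to hold Mixtral-8x7B (roughly 90 GB in bfloat16).
import torch
from transformers import AutoModelForCausalLM, MistralConfig, MistralForCausalLM

mixtral = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-v0.1", torch_dtype=torch.bfloat16
)
cfg = mixtral.config

# Dense config mirroring the shared (non-MoE) dimensions of Mixtral.
dense = MistralForCausalLM(MistralConfig(
    vocab_size=cfg.vocab_size,
    hidden_size=cfg.hidden_size,
    intermediate_size=cfg.intermediate_size,
    num_hidden_layers=cfg.num_hidden_layers,
    num_attention_heads=cfg.num_attention_heads,
    num_key_value_heads=cfg.num_key_value_heads,
    max_position_embeddings=cfg.max_position_embeddings,
    rms_norm_eps=cfg.rms_norm_eps,
    rope_theta=cfg.rope_theta,
))

EXPERT = 0  # "first expert", per the model description
state = {}
for name, tensor in mixtral.state_dict().items():
    if ".block_sparse_moe.gate." in name:
        continue  # the MoE router has no counterpart in a dense model
    prefix = f"block_sparse_moe.experts.{EXPERT}."
    if prefix in name:
        # Mixtral expert w1/w3/w2 play the roles of Mistral gate_proj/up_proj/down_proj
        name = (name.replace(prefix + "w1", "mlp.gate_proj")
                    .replace(prefix + "w3", "mlp.up_proj")
                    .replace(prefix + "w2", "mlp.down_proj"))
    elif ".block_sparse_moe.experts." in name:
        continue  # all other experts are discarded
    state[name] = tensor

dense.load_state_dict(state)
dense.save_pretrained("./mistral-expert0-extracted")  # illustrative output path
```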
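Since the snippet in the diff stops after loading the model, here is a minimal follow-on generation example using the standard `transformers` generate API; the prompt and sampling settings are illustrative only.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "DrNicefellow/Mistral-3-from-Mixtral-8x7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# The extracted model is a base model (from Mixtral-8x7B-v0.1), not instruction-tuned,
# so use plain text completion rather than a chat template.
inputs = tokenizer("Mixture of Experts models route each token to", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```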