DrNicefellow committed on
Commit 934e19e
1 Parent(s): e4fc8dc

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -2,11 +2,11 @@
 license: apache-2.0
 ---
 
-# Mixtral-8x7B--v0.1: Model 2
+# Mixtral-8x7B--v0.1: Model 3
 
 ## Model Description
 
-This model is the 2nd extracted standalone model from the [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1), using the [Mixtral Model Expert Extractor tool](https://github.com/MeNicefellow/Mixtral-Model-Expert-Extractor) I made. It is constructed by selecting the first expert from each Mixture of Experts (MoE) layer. The extraction of this model is experimental. It is expected to be worse than Mistral-7B.
+This model is the 3rd extracted standalone model from the [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1), using the [Mixtral Model Expert Extractor tool](https://github.com/MeNicefellow/Mixtral-Model-Expert-Extractor) I made. It is constructed by selecting the first expert from each Mixture of Experts (MoE) layer. The extraction of this model is experimental. It is expected to be worse than Mistral-7B.
 
 ## Model Architecture
 
@@ -21,7 +21,7 @@ The architecture of this model includes:
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
-model_name = "DrNicefellow/Mistral-2-from-Mixtral-8x7B-v0.1"
+model_name = "DrNicefellow/Mistral-3-from-Mixtral-8x7B-v0.1"
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 model = AutoModelForCausalLM.from_pretrained(model_name)
 
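For context on what the README in this commit describes: taking one expert's feed-forward weights from every MoE layer and pairing them with Mixtral's shared attention, embedding, and norm weights yields a dense Mistral-shaped model. The sketch below is not the author's Mixtral-Model-Expert-Extractor tool, only a rough illustration of that idea; the module and state-dict key names are assumptions based on the Hugging Face transformers implementations of Mixtral and Mistral, and loading Mixtral-8x7B this way requires a very large amount of memory.

```python
# Rough illustration of extracting one expert per MoE layer into a dense model.
# Not the author's tool; key names assume the transformers Mixtral/Mistral code.
import torch
from transformers import MistralConfig, MistralForCausalLM, MixtralForCausalLM

EXPERT_INDEX = 0  # the README text says the first expert of each MoE layer is kept

# Load the full MoE model (roughly 90 GB of memory in bfloat16).
moe = MixtralForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-v0.1", torch_dtype=torch.bfloat16
)
cfg = moe.config

# Build a dense Mistral-style config with the same dimensions.
dense_cfg = MistralConfig(
    vocab_size=cfg.vocab_size,
    hidden_size=cfg.hidden_size,
    intermediate_size=cfg.intermediate_size,
    num_hidden_layers=cfg.num_hidden_layers,
    num_attention_heads=cfg.num_attention_heads,
    num_key_value_heads=cfg.num_key_value_heads,
    max_position_embeddings=cfg.max_position_embeddings,
    rms_norm_eps=cfg.rms_norm_eps,
    rope_theta=cfg.rope_theta,
    sliding_window=cfg.sliding_window,
)

# Map the chosen expert's w1/w3/w2 projections onto the dense MLP's gate/up/down.
proj_map = {"w1": "gate_proj", "w3": "up_proj", "w2": "down_proj"}
new_state = {}
for key, tensor in moe.state_dict().items():
    if ".block_sparse_moe.gate." in key:
        continue  # the MoE router has no counterpart in the dense model
    if ".block_sparse_moe.experts." in key:
        prefix, rest = key.split(".block_sparse_moe.experts.")
        idx, proj, suffix = rest.split(".")  # e.g. "0", "w1", "weight"
        if int(idx) != EXPERT_INDEX:
            continue
        new_state[f"{prefix}.mlp.{proj_map[proj]}.{suffix}"] = tensor
    else:
        new_state[key] = tensor  # embeddings, attention and norms are shared

dense = MistralForCausalLM(dense_cfg).to(torch.bfloat16)
dense.load_state_dict(new_state)
dense.save_pretrained("extracted-expert-model")
```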
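The usage snippet in the updated README stops after loading the model; a minimal end-to-end generation check might look like the following. The prompt and generation settings are illustrative only and are not part of the commit.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository name as given in the updated README.
model_name = "DrNicefellow/Mistral-3-from-Mixtral-8x7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Illustrative prompt and settings, not taken from the model card.
inputs = tokenizer("The Mixture of Experts architecture works by", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```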