Commit 934e19e by DrNicefellow (parent: e4fc8dc): Update README.md

README.md CHANGED
@@ -2,11 +2,11 @@
 license: apache-2.0
 ---
 
-# Mixtral-8x7B--v0.1: Model
+# Mixtral-8x7B--v0.1: Model 3
 
 ## Model Description
 
-This model is the
+This model is the 3rd extracted standalone model from [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1), using the [Mixtral Model Expert Extractor tool](https://github.com/MeNicefellow/Mixtral-Model-Expert-Extractor) I made. It is constructed by selecting the first expert from each Mixture of Experts (MoE) layer. The extraction of this model is experimental, and it is expected to perform worse than Mistral-7B.
 
 ## Model Architecture
 
@@ -21,7 +21,7 @@ The architecture of this model includes:
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
-model_name = "DrNicefellow/Mistral-
+model_name = "DrNicefellow/Mistral-3-from-Mixtral-8x7B-v0.1"
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 model = AutoModelForCausalLM.from_pretrained(model_name)
 
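The updated description says the model was built by keeping the first expert of every MoE layer. The authoritative implementation is the linked Mixtral Model Expert Extractor tool; the sketch below is only a rough illustration of that idea, assuming the standard `transformers` module layouts for Mixtral and Mistral (expert weights `w1`/`w2`/`w3` mapped onto `gate_proj`/`up_proj`/`down_proj`). The config handling and output path are illustrative assumptions, not taken from the tool.

```python
# Rough sketch only, NOT the linked extractor tool: keep expert 0 of every
# Mixtral MoE layer and pack it into a dense Mistral-style checkpoint.
# Assumes enough CPU RAM to hold Mixtral-8x7B (roughly 90 GB in bfloat16).
import torch
from transformers import AutoModelForCausalLM, MistralConfig, MistralForCausalLM

mixtral = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-v0.1", torch_dtype=torch.bfloat16
)
cfg = mixtral.config

# Dense config mirroring the shared (non-MoE) dimensions of Mixtral.
dense = MistralForCausalLM(MistralConfig(
    vocab_size=cfg.vocab_size,
    hidden_size=cfg.hidden_size,
    intermediate_size=cfg.intermediate_size,
    num_hidden_layers=cfg.num_hidden_layers,
    num_attention_heads=cfg.num_attention_heads,
    num_key_value_heads=cfg.num_key_value_heads,
    max_position_embeddings=cfg.max_position_embeddings,
    rms_norm_eps=cfg.rms_norm_eps,
    rope_theta=cfg.rope_theta,
))

EXPERT = 0  # "first expert", per the model description
state = {}
for name, tensor in mixtral.state_dict().items():
    if ".block_sparse_moe.gate." in name:
        continue  # the MoE router has no counterpart in a dense model
    prefix = f"block_sparse_moe.experts.{EXPERT}."
    if prefix in name:
        # Mixtral expert w1/w3/w2 play the roles of Mistral gate_proj/up_proj/down_proj
        name = (name.replace(prefix + "w1", "mlp.gate_proj")
                    .replace(prefix + "w3", "mlp.up_proj")
                    .replace(prefix + "w2", "mlp.down_proj"))
    elif ".block_sparse_moe.experts." in name:
        continue  # all other experts are discarded
    state[name] = tensor

dense.load_state_dict(state)
dense.save_pretrained("./mistral-expert0-extracted")  # illustrative output path
```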
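Since the snippet in the diff stops after loading the model, here is a minimal follow-on generation example using the standard `transformers` generate API; the prompt and sampling settings are illustrative only.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "DrNicefellow/Mistral-3-from-Mixtral-8x7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# The extracted model is a base model (from Mixtral-8x7B-v0.1), not instruction-tuned,
# so use plain text completion rather than a chat template.
inputs = tokenizer("Mixture of Experts models route each token to", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```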