nisten/llama3-2x8b-MoE-41k-experiment1

Text Generation

Mixture of Experts

text-generation-inference

Inference Endpoints

nisten committed on Apr 25

Commit 6fb6822 • 1 Parent(s): 6fccba9

Update README.md

Files changed (1)

README.md +8 -1

@@ -1,10 +1,17 @@
 ---
 license: llama3
 base_model: meta-llama/Meta-Llama-3-8B-Instruct
 ---
 Meow.
-Tis an experimental mixture of two expert models based on Llama 3 Instruct plain in combo with finetune. Specifically, it is built on top of the Meta-Llama-3-8B-Instruct model and Argilla Capybara dataset,

 ---
 license: llama3
 base_model: meta-llama/Meta-Llama-3-8B-Instruct
+datasets:
+- argilla/distilabel-capybara-dpo-7k-binarized
+language:
+- en
+library_name: transformers
+tags:
+- moe
 ---
 Meow.
+This is an experimental mixture-of-experts model with just 2 experts, based on plain Llama 3 Instruct combined with a finetune. Specifically, it is built on top of the Meta-Llama-3-8B-Instruct model, and the finetune is trained on the Argilla Capybara dataset.