Isotonic/smol_llama-4x220M-MoE


Isotonic committed on Feb 3

Commit 52fd47d (1 parent: 92b5c28)

Update README.md

Files changed (1)

README.md +33 -24

@@ -9,6 +9,15 @@ tags:
 - BEE-spoke-data/beecoder-220M-python
 - BEE-spoke-data/zephyr-220m-sft-full
 - BEE-spoke-data/zephyr-220m-dpo-full
 ---
 # smol_llama-4x220M-MoE
@@ -19,6 +28,30 @@ smol_llama-4x220M-MoE is a Mixure of Experts (MoE) made with the following model
 * [BEE-spoke-data/zephyr-220m-sft-full](https://huggingface.co/BEE-spoke-data/zephyr-220m-sft-full)
 * [BEE-spoke-data/zephyr-220m-dpo-full](https://huggingface.co/BEE-spoke-data/zephyr-220m-dpo-full)
 ## 🧩 Configuration
 ```yaml
 base_model: BEE-spoke-data/smol_llama-220M-openhermes
@@ -81,28 +114,4 @@ experts:
     - "learn new things"
     - "personal assistant"
     - "friendly helper"
-```
-## 💻 Usage
-```python
-!pip install -qU transformers bitsandbytes accelerate
-from transformers import AutoTokenizer
-import transformers
-import torch
-model = "Isotonic/smol_llama-4x220M-MoE"
-tokenizer = AutoTokenizer.from_pretrained(model)
-pipeline = transformers.pipeline(
-    "text-generation",
-    model=model,
-    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
-)
-messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
-prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
-outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
-print(outputs[0]["generated_text"])
 ```

 - BEE-spoke-data/beecoder-220M-python
 - BEE-spoke-data/zephyr-220m-sft-full
 - BEE-spoke-data/zephyr-220m-dpo-full
+datasets:
+- JeanKaddour/minipile
+- pszemraj/simple_wikipedia_LM
+- mattymchen/refinedweb-3m
+- HuggingFaceH4/ultrachat_200k
+- teknium/openhermes
+- HuggingFaceH4/ultrafeedback_binarized
+- EleutherAI/proof-pile-2
+- bigcode/the-stack-smol-xl
 ---
 # smol_llama-4x220M-MoE
 * [BEE-spoke-data/zephyr-220m-sft-full](https://huggingface.co/BEE-spoke-data/zephyr-220m-sft-full)
 * [BEE-spoke-data/zephyr-220m-dpo-full](https://huggingface.co/BEE-spoke-data/zephyr-220m-dpo-full)
+## 💻 Usage
+```python
+!pip install -qU transformers bitsandbytes accelerate
+from transformers import AutoTokenizer
+import transformers
+import torch
+model = "Isotonic/smol_llama-4x220M-MoE"
+tokenizer = AutoTokenizer.from_pretrained(model)
+pipeline = transformers.pipeline(
+    "text-generation",
+    model=model,
+    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
+)
+messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
+prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
+print(outputs[0]["generated_text"])
+```
 ## 🧩 Configuration
 ```yaml
 base_model: BEE-spoke-data/smol_llama-220M-openhermes
     - "learn new things"
     - "personal assistant"
     - "friendly helper"
 ```
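
The Usage snippet moved by this commit samples with `temperature=0.7, top_k=50, top_p=0.95`. As a rough illustration of what those three settings do to a next-token distribution, here is a minimal pure-Python sketch. The function name `filter_distribution` is hypothetical and is not part of `transformers`; the order of operations (temperature, then top-k, then top-p) is an assumption about the usual pipeline, not a copy of the library's implementation.

```python
import math


def filter_distribution(logits, temperature=0.7, top_k=50, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p filtering.

    Hypothetical helper for illustration only; not a transformers API.
    """
    if not logits:
        raise ValueError("logits must not be empty")
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # 1. Temperature: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [x / temperature for x in logits]

    # 2. Top-k: keep only the k highest-scoring token indices.
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    ranked = ranked[:top_k]

    # 3. Softmax over the survivors (subtract the max for numerical stability).
    peak = scaled[ranked[0]]
    weights = [math.exp(scaled[i] - peak) for i in ranked]
    total = sum(weights)
    probs = [w / total for w in weights]

    # 4. Top-p (nucleus): keep the smallest prefix whose cumulative mass reaches top_p.
    kept = []
    cumulative = 0.0
    for idx, p in zip(ranked, probs):
        kept.append((idx, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize so the kept probabilities sum to 1 before sampling.
    mass = sum(p for _, p in kept)
    return {idx: p / mass for idx, p in kept}


if __name__ == "__main__":
    # Toy logits for a four-token vocabulary.
    print(filter_distribution([2.0, 1.0, 0.0, -1.0], temperature=1.0, top_k=3, top_p=0.9))
```

Lower `temperature` or smaller `top_k`/`top_p` values leave fewer candidate tokens, which tends to make output more deterministic; the values in the README are a fairly common middle ground.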