seldonium-3b
Seldonium-3b is a merge of two existing models, rhysjones/phi-2-orange and cognitivecomputations/dolphin-2_6-phi-2. It was built with the "LazyMergekit" Colab notebook, which uses the Mergekit library to merge large language models (LLMs). The merge method is "linear", which combines the models by taking a weighted average of their parameters. The weight assigned to each model controls how much that model contributes to the result; here phi-2-orange is weighted 1.0 and dolphin-2_6-phi-2 is weighted 0.8. The aim is a single model that draws on the strengths of both source models.
🧩 Configuration
models:
  - model: rhysjones/phi-2-orange
    parameters:
      weight: 1.0
  - model: cognitivecomputations/dolphin-2_6-phi-2
    parameters:
      weight: 0.8
merge_method: linear
dtype: float16
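To make the linear method concrete, here is a minimal sketch of a weighted average over parameters, written in plain Python rather than with Mergekit itself. The function `linear_merge`, the toy dictionaries, and their values are hypothetical illustrations, not the real checkpoints or Mergekit's implementation. The sketch also assumes the weights are normalized to sum to 1 before averaging; that is believed to be Mergekit's default for the linear method, but check the Mergekit documentation before relying on it.

```python
def linear_merge(state_dicts, weights, normalize=True):
    """Weighted average of parameters that share names and shapes.

    state_dicts: list of {parameter_name: list of floats} (toy stand-ins for tensors)
    weights:     one float per model
    normalize:   if True, divide by the sum of the weights (assumed default)
    """
    if len(state_dicts) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights) if normalize else 1.0
    merged = {}
    for name in state_dicts[0]:
        tensors = [sd[name] for sd in state_dicts]
        # Combine the i-th element of every model's parameter, scaled by its weight.
        merged[name] = [
            sum(w * t[i] for w, t in zip(weights, tensors)) / total
            for i in range(len(tensors[0]))
        ]
    return merged


# Toy stand-ins for the two checkpoints (hypothetical values, not real weights).
orange = {"layer.weight": [1.0, 2.0, 3.0]}
dolphin = {"layer.weight": [3.0, 2.0, 1.0]}

# Same weights as the configuration: 1.0 and 0.8.
merged = linear_merge([orange, dolphin], weights=[1.0, 0.8])
print(merged["layer.weight"])
```

Under the normalization assumption, weights of 1.0 and 0.8 give effective contributions of roughly 56% (1.0 / 1.8) and 44% (0.8 / 1.8).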
💻 Usage
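# Notebook cell (Colab/Jupyter); in a regular shell, drop the leading "!".
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "jomangbp/seldonium-3b"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the conversation with the model's chat template and append the
# generation prompt so the model answers as the assistant.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model in half precision; device_map="auto" places it on the
# available device(s) and requires the accelerate package.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample up to 256 new tokens.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])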