Update: After further testing, this has turned out exactly as I wanted, and it is one of my favorite models! It remains coherent at longer context lengths and doesn't suffer from the repetition issues I was having with Lumimaid.
GGUF Quants of gghfez/SmartMaid-123b
SmartMaid-123b
This experimental model is a hybrid of Mistral-Large-Instruct-2407 and Lumimaid-v0.2-123B, created by applying a LoRA (Low-Rank Adaptation) to the mlp.down_proj module.
Model Details
- Base Model: Mistral-Large-Instruct-2407
- Influence Model: Lumimaid-v0.2-123B
- Method: LoRA extraction from Lumimaid and targeted application to Mistral-Large
- LoRA Configuration:
  - Rank: 32
  - Alpha: 64
  - Target Module: mlp.down_proj
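To make the configuration above concrete, here is a minimal sketch in plain Python. It assumes the conventional LoRA scaling of alpha / rank, which the card does not state explicitly, and the `LoraSettings` class and the example parameter names are hypothetical, used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hypothetical container for the values listed in this card."""
    rank: int = 32
    alpha: int = 64
    target_module: str = "mlp.down_proj"

    @property
    def scaling(self) -> float:
        # Assumption: the usual LoRA convention, delta_W = (alpha / rank) * B @ A.
        return self.alpha / self.rank

    def targets(self, param_name: str) -> bool:
        # A parameter is affected only if its name contains the target module.
        return self.target_module in param_name


settings = LoraSettings()
print(settings.scaling)
print(settings.targets("model.layers.0.mlp.down_proj.weight"))
print(settings.targets("model.layers.0.self_attn.q_proj.weight"))
```

Under that assumption, rank 32 with alpha 64 means the low-rank update is multiplied by 2 before it is added to the base weights, and only the down-projection of each MLP block is touched.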
Process
- A LoRA was extracted from Lumimaid-v0.2-123B.
- This LoRA was then applied to a fresh instance of Mistral-Large-Instruct-2407, targeting only the mlp.down_proj modules.
- The resulting model was merged to create this standalone version.
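The apply-and-merge steps above can be sketched on toy matrices. This is not the script used to build the model: `merge_lora`, the tiny 2x2 weights, and the parameter names are all hypothetical, and the alpha / rank scaling is the standard LoRA convention assumed here. The point is only to show that the low-rank update is added to matching modules while everything else is copied through unchanged.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(base_weights, lora_pairs, rank, alpha, target="mlp.down_proj"):
    """Return new weights with the LoRA update folded into targeted modules only."""
    scale = alpha / rank  # assumed standard LoRA scaling
    merged = {}
    for name, weight in base_weights.items():
        pair = lora_pairs.get(name)
        if pair is None or target not in name:
            # Not a targeted module: keep the base weights as they are.
            merged[name] = [row[:] for row in weight]
            continue
        lora_b, lora_a = pair
        delta = matmul(lora_b, lora_a)  # low-rank update B @ A
        merged[name] = [
            [w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)
        ]
    return merged


identity = [[1.0, 0.0], [0.0, 1.0]]
base = {
    "layers.0.mlp.down_proj": identity,
    "layers.0.self_attn.q_proj": identity,
}
# Toy rank-1 adapters: (B with shape 2x1, A with shape 1x2).
lora = {
    "layers.0.mlp.down_proj": ([[1.0], [0.0]], [[0.5, 0.5]]),
    "layers.0.self_attn.q_proj": ([[1.0], [1.0]], [[1.0, 1.0]]),
}
merged = merge_lora(base, lora, rank=1, alpha=2)
print(merged["layers.0.mlp.down_proj"])
print(merged["layers.0.self_attn.q_proj"])
```

Even though the toy adapter also contains an attention update, the `target` filter discards it, which mirrors the choice to apply the extracted LoRA only to mlp.down_proj.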
Purpose
The aim of this model is to incorporate the enhanced prose qualities of Lumimaid-v0.2-123B while retaining the core intelligence and capabilities of Mistral-Large.
By applying the LoRA to the mlp.down_proj module, we sought to influence the model's language generation style without significantly altering its underlying knowledge and reasoning abilities.
Prompt Template
<s>[INST] {input} [/INST] {output}</s>
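As a small illustration of the template above, the following sketch builds a prompt string by hand. The helper names `format_prompt` and `format_turn` and the `add_bos` flag are hypothetical. Some runtimes may insert the `<s>` token on their own, so it is worth checking whether yours does before adding it a second time.

```python
def format_prompt(user_input: str, add_bos: bool = True) -> str:
    """Build the text sent to the model for a single user turn."""
    bos = "<s>" if add_bos else ""
    return f"{bos}[INST] {user_input} [/INST]"


def format_turn(user_input: str, output: str, add_bos: bool = True) -> str:
    """Build a completed turn, including the model output and closing token."""
    return f"{format_prompt(user_input, add_bos)} {output}</s>"


print(format_prompt("Write a haiku about rain."))
print(format_turn("Hello", "Hi there!"))
```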