|
--- |
|
license: other |
|
language: |
|
- en |
|
base_model: |
|
- mistralai/Mistral-Large-Instruct-2407 |
|
- NeverSleep/Lumimaid-v0.2-123B |
|
--- |
|
|
|
# SmartMaid-123b |
|
|
|
This experimental model is a hybrid of [Mistral-Large-Instruct-2407](https://huggingface.co/mistralai/Mistral-Large-Instruct-2407) and [Lumimaid-v0.2-123B](https://huggingface.co/NeverSleep/Lumimaid-v0.2-123B), created by applying a LoRA (Low-Rank Adaptation) extracted from Lumimaid to the `mlp.down_proj` modules of Mistral-Large. |
|
|
|
## Model Details |
|
|
|
- **Base Model**: Mistral-Large-Instruct-2407 |
|
- **Influence Model**: Lumimaid-v0.2-123B |
|
- **Method**: LoRA extraction from Lumimaid and targeted application to Mistral-Large |
|
- **LoRA Configuration** (see the sketch after this list): |
|
- Rank: 32 |
|
- Alpha: 64 |
|
- Target Module: `mlp.down_proj` |
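
For illustration, the configuration above roughly corresponds to the following `peft` `LoraConfig`. The actual extraction and merge tooling used to build this model is not documented here, so treat this purely as a sketch.

```python
# Illustrative only: a peft LoraConfig mirroring the settings listed above.
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,                          # rank
    lora_alpha=64,                 # alpha (effective scaling = alpha / rank = 2.0)
    target_modules=["down_proj"],  # only the MLP down-projection layers
    task_type="CAUSAL_LM",
)
```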
|
|
|
## Process |
|
|
|
1. A LoRA was extracted from Lumimaid-v0.2-123B. |

2. This LoRA was then applied to a fresh instance of Mistral-Large-Instruct-2407, targeting only the `mlp.down_proj` modules. |
|
3. The resulting model was merged to create this standalone version (a sketch of steps 2-3 follows below). |
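
For readers who want a concrete picture of steps 2 and 3, here is a minimal sketch using `transformers` and `peft`. The adapter path and output directory are hypothetical placeholders, and the tooling used for the LoRA extraction in step 1 is not specified in this card.

```python
# Minimal sketch of steps 2-3 (paths are hypothetical placeholders).
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Fresh instance of the base model.
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-Large-Instruct-2407",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Apply the LoRA extracted from Lumimaid (down_proj only), then fold the
# adapter weights into the base model to produce a standalone checkpoint.
model = PeftModel.from_pretrained(base, "./lumimaid-down_proj-lora")
merged = model.merge_and_unload()
merged.save_pretrained("./SmartMaid-123b")
```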
|
|
|
## Purpose |
|
|
|
The aim of this model is to incorporate the enhanced prose qualities of Lumimaid-v0.2-123B while retaining the core intelligence and capabilities of Mistral-Large. |
|
By applying the LoRA only to the `mlp.down_proj` modules, we sought to influence the model's language generation style without significantly altering its underlying knowledge and reasoning abilities. |
|
|
|
## Prompt Template |
|
|
|
```
<s>[INST] {input} [/INST] {output}</s>
```
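
In practice you rarely need to build this string by hand. Assuming the base model's tokenizer and chat template are reused (not confirmed in this card), something like the following should produce an equivalently formatted prompt.

```python
from transformers import AutoTokenizer

# Assumes the Mistral-Large-Instruct-2407 tokenizer/chat template applies here.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-Large-Instruct-2407")

messages = [
    {"role": "user", "content": "Write a short scene set in a rainy harbor town."},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
)
```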
|
|