gghfez/SmartMaid-123b-GGUF

gghfez committed on Sep 23

Commit 56e137f • 1 Parent(s): 1836672

Create README.md

Files changed (1)

README.md +37 -0

README.md ADDED

	@@ -0,0 +1,37 @@

+---
+license: other
+language:
+- en
+base_model:
+- mistralai/Mistral-Large-Instruct-2407
+- NeverSleep/Lumimaid-v0.2-123B
+---
+# SmartMaid-123b
+This **experimental** model combines [Mistral-Large-Instruct-2407](https://huggingface.co/mistralai/Mistral-Large-Instruct-2407) with [Lumimaid-v0.2-123B](https://huggingface.co/NeverSleep/Lumimaid-v0.2-123B) by applying a LoRA (Low-Rank Adaptation) to the `mlp.down_proj` modules only.
+## Model Details
+- **Base Model**: Mistral-Large-Instruct-2407
+- **Influence Model**: Lumimaid-v0.2-123B
+- **Method**: LoRA extraction from Lumimaid and targeted application to Mistral-Large
+- **LoRA Configuration**:
+  - Rank: 32
+  - Alpha: 64
+  - Target Module: `mlp.down_proj`
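To make the configuration concrete: in the usual LoRA formulation the low-rank update is multiplied by `alpha / rank`, so the values above would give a scaling factor of 2. The sketch below assumes that standard convention; the card does not say which tooling was used, and the helper names and toy dimensions are hypothetical.

```python
def lora_scaling(rank: int, alpha: float) -> float:
    """Scaling applied to the low-rank update, assuming the standard alpha / rank rule."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one adapter pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Values from the card: rank 32, alpha 64.
scale = lora_scaling(rank=32, alpha=64)
print(scale)  # 2.0

# Toy dimensions for illustration only, not the real model sizes.
print(lora_param_count(d_in=8, d_out=4, rank=2))  # 24
```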
+## Process
+1. A LoRA was extracted from Lumimaid-v0.2-123B.
+2. This LoRA was then applied to a fresh instance of Mistral-Large-Instruct-2407, targeting only the `mlp.down_proj` modules.
+3. The resulting model was merged to create this standalone version.
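The three steps above can be sketched on toy matrices. This is a minimal illustration, not the author's actual pipeline: it assumes the LoRA is extracted as a truncated SVD of the weight difference between the fine-tuned and base models (a common approach), and every name, shape, and weight below is hypothetical.

```python
import numpy as np


def extract_lora(w_base, w_tuned, rank):
    """Return (A, B) so that B @ A is the best rank-`rank` approximation of w_tuned - w_base."""
    delta = w_tuned - w_base
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    sqrt_s = np.sqrt(s[:rank])
    b = u[:, :rank] * sqrt_s            # shape (out, rank)
    a = sqrt_s[:, None] * vt[:rank]     # shape (rank, in)
    return a, b


def merge_lora(w_base, a, b, scale=1.0):
    """Fold the low-rank update back into a dense weight matrix."""
    return w_base + scale * (b @ a)


def apply_to_targets(base_weights, tuned_weights, rank, target="mlp.down_proj"):
    """Merge the extracted update only into weights whose name contains `target`."""
    merged = {}
    for name, w in base_weights.items():
        if target in name:
            a, b = extract_lora(w, tuned_weights[name], rank)
            merged[name] = merge_lora(w, a, b)
        else:
            merged[name] = w.copy()     # untouched: stays identical to the base model
    return merged


# Toy stand-ins for the two checkpoints (hypothetical names and shapes).
rng = np.random.default_rng(0)
down = "layers.0.mlp.down_proj.weight"
q = "layers.0.self_attn.q_proj.weight"
base = {down: rng.normal(size=(4, 6)), q: rng.normal(size=(4, 4))}
# The "fine-tune" differs from the base by a rank-2 update in both matrices.
tuned = {
    down: base[down] + rng.normal(size=(4, 2)) @ rng.normal(size=(2, 6)),
    q: base[q] + rng.normal(size=(4, 2)) @ rng.normal(size=(2, 4)),
}

merged = apply_to_targets(base, tuned, rank=2)
print(np.allclose(merged[down], tuned[down]))  # down_proj picks up the fine-tune's change
print(np.array_equal(merged[q], base[q]))      # attention weights remain the base model's
```

Here `scale` is left at 1.0 because the extracted factors already carry the full magnitude of the difference; tooling that stores an explicit alpha would rescale the factors so that `alpha / rank` times their product gives the same update.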
+## Purpose
+The aim of this model is to incorporate the enhanced prose qualities of Lumimaid-v0.2-123B while retaining the core intelligence and capabilities of Mistral-Large.
+By applying the LoRA to the `mlp.down_proj` module, we sought to influence the model's language generation style without significantly altering its underlying knowledge and reasoning abilities.
+## Prompt Template
+`<s>[INST] {input} [/INST] {output}</s>`
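As a small illustration of the template above, the helpers below build the prompt string by hand. The function names are hypothetical, and the `add_bos` flag is there on the assumption that your tokenizer or inference backend may already prepend `<s>`, in which case it should not be added twice.

```python
BOS, EOS = "<s>", "</s>"


def format_prompt(user_input: str, add_bos: bool = True) -> str:
    """Text sent to the model; generation continues after [/INST]."""
    prefix = BOS if add_bos else ""
    return f"{prefix}[INST] {user_input} [/INST]"


def format_example(user_input: str, output: str, add_bos: bool = True) -> str:
    """A complete turn including the model output, as laid out in the template."""
    return f"{format_prompt(user_input, add_bos)} {output}{EOS}"


print(format_prompt("Hello"))
# <s>[INST] Hello [/INST]
print(format_example("Hello", "Hi there"))
# <s>[INST] Hello [/INST] Hi there</s>
```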