---
license: other
language:
- en
base_model:
- mistralai/Mistral-Large-Instruct-2407
- NeverSleep/Lumimaid-v0.2-123B
---
# SmartMaid-123b
This experimental model is a hybrid creation combining aspects of [Mistral-Large-Instruct-2407](https://huggingface.co/mistralai/Mistral-Large-Instruct-2407) and [Lumimaid-v0.2-123B](https://huggingface.co/NeverSleep/Lumimaid-v0.2-123B) using LoRA (Low-Rank Adaptation) applied to the `mlp.down_proj` modules.
## Model Details
- **Base Model**: Mistral-Large-Instruct-2407
- **Influence Model**: Lumimaid-v0.2-123B
- **Method**: LoRA extraction from Lumimaid and targeted application to Mistral-Large
- **LoRA Configuration**:
- Rank: 32
- Alpha: 64
- Target Module: `mlp.down_proj`
## Process
1. A LoRA was extracted from Lumimaid-v0.2-123B.
2. This LoRA was then applied to a fresh instance of Mistral-Large-Instruct-2407, targeting only the `mlp.down_proj` modules.
3. The resulting model was merged to create this standalone version.
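The merge in step 3 folds the low-rank update into the base weights. A minimal numpy sketch of that arithmetic, using the rank and alpha above but toy matrix dimensions (the real `down_proj` weights are far larger):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for one mlp.down_proj weight matrix.
d_out, d_in = 64, 128
rank, alpha = 32, 64  # LoRA configuration from this model card

W = rng.standard_normal((d_out, d_in))         # base weight from Mistral-Large
A = rng.standard_normal((rank, d_in)) * 0.01   # LoRA "A" factor extracted from Lumimaid
B = rng.standard_normal((d_out, rank)) * 0.01  # LoRA "B" factor extracted from Lumimaid

# Merging replaces W with W + (alpha / rank) * B @ A, so the
# standalone model needs no adapter weights at inference time.
scale = alpha / rank  # = 2.0 here
W_merged = W + scale * (B @ A)
```

Because `rank` is much smaller than the weight dimensions, the update `B @ A` is low-rank: it nudges the layer's output style while leaving most of the original transformation intact.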
## Purpose
The aim of this model is to incorporate the enhanced prose qualities of Lumimaid-v0.2-123B while retaining the core intelligence and capabilities of Mistral-Large.
By applying the LoRA to the `mlp.down_proj` module, we sought to influence the model's language generation style without significantly altering its underlying knowledge and reasoning abilities.
## Prompt Template
```
<s>[INST] {input} [/INST] {output}</s>
```
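A small helper (hypothetical, not part of the model's tooling) showing how the template is filled; at inference time the prompt ends after `[/INST]` and the model generates the output:

```python
def format_prompt(user_input: str, output: str = "") -> str:
    """Fill the Mistral instruct template.

    Leave `output` empty when building an inference prompt; pass the
    assistant reply to reconstruct a completed training-style example.
    """
    prompt = f"<s>[INST] {user_input} [/INST]"
    if output:
        prompt += f" {output}</s>"
    return prompt

print(format_prompt("Write a short story."))
```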