---
license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- mistralai/Mistral-Nemo-Instruct-2407
- mistralai/Mistral-Nemo-Base-2407
---
# Quantized GGUF model Mistral-Nemo-Instruct-2407-Mistral-Nemo-Base-2407-linear-merge

This model has been quantized using llama-quantize from [llama.cpp](https://github.com/ggerganov/llama.cpp).
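For reference, here is a minimal sketch of running one of the quantized GGUF files with [llama-cpp-python](https://github.com/abetlen/llama-cpp-python). The filename and Q4_K_M quantization type below are assumptions; substitute whichever file you actually download from this repository.

```python
# Minimal sketch: load and run a quantized GGUF file with llama-cpp-python.
# The model_path (and its Q4_K_M quant type) is illustrative, not a guaranteed filename.
from llama_cpp import Llama

llm = Llama(
    model_path="Mistral-Nemo-Instruct-2407-Mistral-Nemo-Base-2407-linear-merge.Q4_K_M.gguf",
    n_ctx=4096,  # context window for this session
)

out = llm("Summarize what a linear model merge is.", max_tokens=64)
print(out["choices"][0]["text"])
```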

# Mistral-Nemo-Instruct-2407-Mistral-Nemo-Base-2407-linear-merge

Mistral-Nemo-Instruct-2407-Mistral-Nemo-Base-2407-linear-merge is a language model created by merging two models: [mistralai/Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) and [mistralai/Mistral-Nemo-Base-2407](https://huggingface.co/mistralai/Mistral-Nemo-Base-2407). The merge was performed with [mergekit](https://github.com/cg123/mergekit), a toolkit for combining the weights of pretrained language models.

## 🧩 Merge Configuration

The models were merged using the linear method, which computes a weighted average of the two models' parameters. The configuration details are as follows:

```yaml
models:
  - model: mistralai/Mistral-Nemo-Instruct-2407
    parameters:
      weight: 0.6
  - model: mistralai/Mistral-Nemo-Base-2407
    parameters:
      weight: 0.4
merge_method: linear
parameters:
  normalize: true
dtype: float16
```
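For intuition, `merge_method: linear` with `normalize: true` is a weighted average of the parents' parameter tensors, with the weights rescaled to sum to 1 (0.6 and 0.4 already do). Below is a toy sketch of that arithmetic, not mergekit's actual implementation:

```python
# Toy sketch of a normalized linear merge over two (or more) state dicts.
# This is not mergekit's code; it only illustrates the arithmetic the config above describes.
import torch

def linear_merge(state_dicts, weights, normalize=True):
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]  # rescale so weights sum to 1
    merged = {}
    for name, tensor in state_dicts[0].items():
        acc = torch.zeros_like(tensor, dtype=torch.float32)
        for w, sd in zip(weights, state_dicts):
            acc += w * sd[name].float()  # weighted sum, parameter by parameter
        merged[name] = acc.to(torch.float16)  # matches dtype: float16 in the config
    return merged
```

To reproduce the merge itself, mergekit's `mergekit-yaml` command takes a configuration file like the one above plus an output directory.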

## Model Features

This merged model combines the instruction-following capabilities of Mistral-Nemo-Instruct-2407 with the foundational strengths of Mistral-Nemo-Base-2407. By averaging the weights of both parent models, the linear merge aims to produce a versatile model that performs well across diverse text generation and NLP tasks.

## Limitations

While the merged model benefits from the strengths of both parent models, it may also inherit their limitations and biases. Performance can vary with the specific task and context, so users should evaluate the model's outputs critically before relying on them.