MohammadOthman/Mistral-NeuralHermes-Merge-7B-slerp

MohammadOthman committed on May 4

Commit e82f87a • 1 Parent(s): 4e603d0

Update README.md

Files changed (1)

README.md +40 -28

@@ -1,28 +1,40 @@
-# Mistral-Merge-7B-slerp
-## Model Description
-The `Mistral-Merge-7B-slerp` is a merged model which leverages the spherical linear interpolation (SLERP) technique to blend layers from two distinct transformer-based models. This merging strategy is aimed at synthesizing a model that incorporates the robust linguistic capabilities of `OpenPipe/mistral-ft-optimized-1218` and the nuanced understanding of `mlabonne/NeuralHermes-2.5-Mistral-7B`.
-## Configuration
-The merging process was configured to apply a SLERP method across all comparable layers of the two source models. Below is the YAML configuration used for merging:
-```yaml
-slices:
-  - sources:
-      - model: OpenPipe/mistral-ft-optimized-1218
-        layer_range: [0, 32]
-      - model: mlabonne/NeuralHermes-2.5-Mistral-7B
-        layer_range: [0, 32]
-merge_method: slerp
-base_model: OpenPipe/mistral-ft-optimized-1218
-parameters:
-  t:
-    - filter: self_attn
-      value: [0, 0.5, 0.3, 0.7, 1]
-    - filter: mlp
-      value: [1, 0.5, 0.7, 0.3, 0]
-    - value: 0.5
-dtype: bfloat16
-```
-This configuration ensures that both self-attention and MLP (multi-layer perceptron) layers undergo interpolation with a gradient of weights to optimize the integration of features from both models.

+---
+license: cc-by-nc-4.0
+language:
+- en
+pipeline_tag: text-generation
+tags:
+- mergekit
+- text-generation
+- merge
+---
+# Mistral-Merge-7B-slerp
+## Model Description
+The `Mistral-Merge-7B-slerp` is a merged model which leverages the spherical linear interpolation (SLERP) technique to blend layers from two distinct transformer-based models. This merging strategy is aimed at synthesizing a model that incorporates the robust linguistic capabilities of `OpenPipe/mistral-ft-optimized-1218` and the nuanced understanding of `mlabonne/NeuralHermes-2.5-Mistral-7B`.
+## Configuration
+The merging process was configured to apply a SLERP method across all comparable layers of the two source models. Below is the YAML configuration used for merging:
+```yaml
+slices:
+  - sources:
+      - model: OpenPipe/mistral-ft-optimized-1218
+        layer_range: [0, 32]
+      - model: mlabonne/NeuralHermes-2.5-Mistral-7B
+        layer_range: [0, 32]
+merge_method: slerp
+base_model: OpenPipe/mistral-ft-optimized-1218
+parameters:
+  t:
+    - filter: self_attn
+      value: [0, 0.5, 0.3, 0.7, 1]
+    - filter: mlp
+      value: [1, 0.5, 0.7, 0.3, 0]
+    - value: 0.5
+dtype: bfloat16
+```
+This configuration ensures that both self-attention and MLP (multi-layer perceptron) layers undergo interpolation with a gradient of weights to optimize the integration of features from both models.
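
As a rough illustration of what the `slerp` merge method and the list-valued `t` parameters in the configuration above describe, the following Python sketch interpolates two weight vectors along the arc between them and derives a per-layer `t` from a list of anchor values. It is not mergekit's implementation: the function names `slerp` and `gradient_value` are hypothetical, and spreading the anchor values evenly across layer depth with linear interpolation between them is an assumption about how the gradient is applied.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two equal-length vectors.

    t = 0 returns v0 and t = 1 returns v1. Falls back to plain linear
    interpolation when a vector is near zero or the two are nearly colinear.
    """
    linear = [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 < eps or norm1 < eps:
        return linear
    # Angle between the two vectors, computed from their normalized dot product.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        return linear
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


def gradient_value(anchors, layer_idx, num_layers):
    """Per-layer t from a list of anchor values (assumed evenly spaced in depth)."""
    if len(anchors) == 1 or num_layers == 1:
        return anchors[0]
    # Map the layer index onto the anchor list, then interpolate linearly.
    pos = layer_idx / (num_layers - 1) * (len(anchors) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(anchors) - 1)
    frac = pos - lo
    return (1 - frac) * anchors[lo] + frac * anchors[hi]


if __name__ == "__main__":
    self_attn = [0, 0.5, 0.3, 0.7, 1]  # anchor values from the config
    mlp = [1, 0.5, 0.7, 0.3, 0]
    for layer in (0, 8, 16, 24, 31):
        t_attn = gradient_value(self_attn, layer, 32)
        t_mlp = gradient_value(mlp, layer, 32)
        print(f"layer {layer:2d}: self_attn t={t_attn:.3f}, mlp t={t_mlp:.3f}")
```

Under these assumptions, the `self_attn` list gives t = 0 at the first of the 32 layers and t = 1 at the last, while the `mlp` list runs in the opposite direction; tensors that match neither filter use the constant 0.5. According to mergekit's documentation, t = 0 returns the `base_model` and t = 1 returns the other model.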