Josephgflowers committed on
Commit
bdb6c63
1 Parent(s): 28e0db9

Update README.md

  1. README.md +12 -4
README.md CHANGED
@@ -19,7 +19,10 @@ Continued training for healing consisted of around 58860 steps full training on
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6328952f798f8d122ce62a44/DoIsuqN_p_9fx0f1v5XUb.png)
 
-1. Overall Flow in the Model
+
+1.
+
+Overall Flow in the Model
 
 Each of these modules is integrated into the model’s modified decoder layer (ModifiedLlamaDecoderLayer). Here’s a high-level outline of the sequence in which they operate within the decoder:
 
@@ -40,7 +43,8 @@ Let’s break down how these components contribute to the model’s overall perf
 
 2.
 
-Component-Level Contributions
+
+Component-Level Contributions
 
 
 Adaptive RMSNorm
@@ -75,7 +79,9 @@ Effect on Model: SEBlock helps the model emphasize or suppress specific features
 
 Performance Impact: Boosts the model’s expressiveness by allowing it to dynamically adjust which features are most relevant for each input. This helps improve generalization, especially when handling varied inputs with different feature relevances, such as conversations with shifting topics.
 
-3. Combined Effects and Benefits on the Model
+3.
+
+Combined Effects and Benefits on the Model
 
 When these components work together, they create a model that is both flexible and context-aware. Here’s how they synergize and improve model performance:
 
@@ -87,7 +93,9 @@ Improved Stability and Efficiency: Adaptive RMSNorm stabilizes the model’s nor
 
 Feature Recalibration and Channel Adaptation: SEBlock and Adaptive RMSNorm adapt the feature importance dynamically, giving the model a refined ability to select relevant information across channels and tokens. This can enhance interpretability and generalization across different types of inputs.
 
-4. Expected Performance Improvements
+4.
+
+Expected Performance Improvements
 
 Accuracy and Generalization: The adaptive and context-sensitive adjustments should help the model generalize better to unseen data, as it dynamically adapts to different contexts and feature relevances.
 
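
For readers of this diff, the channel recalibration that the README attributes to SEBlock can be illustrated with a generic squeeze-and-excitation gate. This is a minimal pure-Python sketch, not the code from this repository: the function name `se_gate`, the weight matrices `w_down` and `w_up`, the omission of biases, and the choice to average over tokens in the squeeze step are all assumptions made for illustration.

```python
import math


def se_gate(features, w_down, w_up):
    """Generic squeeze-and-excitation over features[token][channel].

    Hypothetical sketch: w_down (channels x reduced) and w_up
    (reduced x channels) stand in for learned weights; biases are omitted.
    """
    n_tokens = len(features)
    n_channels = len(features[0])
    n_reduced = len(w_down[0])

    # Squeeze: average each channel over all tokens (an assumption here).
    squeezed = [sum(row[c] for row in features) / n_tokens
                for c in range(n_channels)]

    # Excitation, step 1: project down to the bottleneck and apply ReLU.
    hidden = [max(0.0, sum(squeezed[c] * w_down[c][r] for c in range(n_channels)))
              for r in range(n_reduced)]

    # Excitation, step 2: project back up and squash to a gate in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-sum(hidden[r] * w_up[r][c]
                                        for r in range(n_reduced))))
             for c in range(n_channels)]

    # Recalibrate: scale every token's channels by the per-channel gate.
    scaled = [[row[c] * gates[c] for c in range(n_channels)] for row in features]
    return scaled, gates


# Toy input: 2 tokens, 2 channels, bottleneck of size 1.
features = [[1.0, 2.0], [3.0, 4.0]]
w_down = [[1.0], [0.0]]
w_up = [[1.0, -1.0]]
scaled, gates = se_gate(features, w_down, w_up)
```

With these toy weights the first channel receives a gate above 0.5 and the second a gate below 0.5, which is the "emphasize or suppress specific features" behavior the README describes. In a real decoder layer the weights would be learned, and the pooling would need to respect causal masking.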