## Model description

TopicNeuralHermes 2.5 Mistral 7B is a refined model developed through fine-tuning with a specific subset of data, selected via topic modeling techniques using [Bunkatopics](https://github.com/charlesdedampierre/BunkaTopics), continuing from OpenHermes 2.5.

The model was trained on a refined DPO dataset. The objective was to train the model on only a small portion of the DPO data. To achieve this, we compared the two answer sets used to train the reward model: the rejected Llama answers and the accepted ChatGPT answers from the [DPO dataset](https://huggingface.co/datasets/mlabonne/chatml_dpo_pairs).
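
As a rough sketch of that first step, the two answer sets can be pulled straight from the pairs dataset. This is a minimal illustration, assuming the standard DPO column names `chosen` and `rejected` (check the dataset card for the actual schema):

```python
# Minimal sketch: load the DPO pairs and separate the two answer sets.
# The "chosen"/"rejected" column names are an assumption (standard DPO format);
# verify them against the dataset card.
from datasets import load_dataset

pairs = load_dataset("mlabonne/chatml_dpo_pairs", split="train")

chosen_answers = pairs["chosen"]      # accepted ChatGPT answers
rejected_answers = pairs["rejected"]  # rejected Llama answers
```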

We then conducted topic modeling on both datasets, keeping only the topics that existed in the accepted dataset but not in the rejected one.
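
The filter itself amounts to fitting one topic model per answer set and taking a set difference over the topic names. The snippet below is only a sketch under stated assumptions: the `Bunka` class with `fit` and `get_topics` follows the BunkaTopics README, but the exact signatures, the `topic_name` column, and the cluster count are assumptions rather than the pipeline actually used here.

```python
# Illustrative topic-difference filter (Bunka API details are assumptions;
# see the BunkaTopics README for the actual interface).
from bunkatopics import Bunka

bunka_chosen = Bunka()
bunka_chosen.fit(chosen_answers)
topics_chosen = bunka_chosen.get_topics(n_clusters=30)  # cluster count is illustrative

bunka_rejected = Bunka()
bunka_rejected.fit(rejected_answers)
topics_rejected = bunka_rejected.get_topics(n_clusters=30)

# Keep only the topics present for the accepted answers but absent
# from the rejected ones.
kept_topics = set(topics_chosen["topic_name"]) - set(topics_rejected["topic_name"])
```

Pairs whose accepted answer falls under one of the kept topics then form the reduced DPO training set.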

This method allows for quicker convergence with significantly less data (around …).

Special thanks to [mlabonne](https://huggingface.co/mlabonne) for creating the [Colab notebook](https://colab.research.google.com/drive/15iFBr1xWgztXvhrj5I9fBv20c7CFOPBE?usp=sharing#scrollTo=YpdkZsMNylvp) that facilitated the DPO strategy.
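
For completeness, the DPO step itself follows the usual `trl` recipe from that notebook. The sketch below is generic: the hyperparameters are illustrative, `filtered_pairs` is a hypothetical stand-in for the topic-filtered prompt/chosen/rejected dataset, and the `DPOTrainer` call shown matches older `trl` releases (argument names have since changed).

```python
# Generic DPO fine-tuning sketch with trl (illustrative hyperparameters;
# the DPOTrainer signature matches older trl releases and may differ today).
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base = "teknium/OpenHermes-2.5-Mistral-7B"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

trainer = DPOTrainer(
    model,
    None,  # no explicit reference model: trl keeps a frozen copy internally
    args=TrainingArguments(output_dir="dpo-output", per_device_train_batch_size=2),
    train_dataset=filtered_pairs,  # hypothetical: the topic-filtered pairs
    tokenizer=tokenizer,
    beta=0.1,  # standard DPO temperature
)
trainer.train()
```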

## Topic Analysis