Update README.md
README.md
CHANGED
@@ -117,6 +117,7 @@ The databases (ml_100k, ml_1m and jester) are built-in the surprise package for
|
|
117 |
Hyperparameters -
|
118 |
|
119 |
`n_users` : number of simulated users in the database;
|
|
|
120 |
`n_ratings` : number of simulated rating events in the database.
|
121 |
|
122 |
This is a fictional dataset in which each rating event assigns a uniformly distributed random rating (from 1 to 5) to one of the simulated users of the recommender system that is being designed in this research project.
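
The uniform dataset described above could be generated roughly as follows. This is a minimal sketch, not the project's code: the function name `simulate_uniform_ratings`, the `seed` parameter, and the uniform choice of user are assumptions, and the rated item is left out because the description does not say how it is picked.

```python
import random


def simulate_uniform_ratings(n_users, n_ratings, seed=None):
    """Return n_ratings (user, rating) pairs with ratings drawn uniformly from 1 to 5."""
    rng = random.Random(seed)
    events = []
    for _ in range(n_ratings):
        # Assumption: the simulated user is also drawn uniformly at random.
        user = rng.randrange(n_users)
        # randint is inclusive on both ends, so every value in 1..5 is equally likely.
        rating = rng.randint(1, 5)
        events.append((user, rating))
    return events


uniform_events = simulate_uniform_ratings(n_users=10, n_ratings=100, seed=0)
```

Here `n_users` and `n_ratings` play the same role as the two hyperparameters listed above.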
|
@@ -136,17 +137,13 @@ This is a fictional dataset based in the choice of an uniformly distributed rand
|
|
136 |
```
|
137 |
|
138 |
Hyperparameters -
|
139 |
-
|
140 |
-
|
141 |
|
142 |
-
|
143 |
-
|
144 |
-
a random opportunity is chosen, then the amount of a randomly chosen topic inside the description
|
145 |
-
is multiplied by five. The ceiling operation of this result is the rating that the fictional user
|
146 |
-
will give to that opportunity.
|
147 |
-
Because the amount of each topic predicted by the model is dissolved among various topics,
|
148 |
-
it is very rare to find an opportunity that has a higher LDA value. The consequence is that this dataset
|
149 |
-
has really low volatility and most of the ratings are equal to 1.
|
150 |
```python
|
151 |
|
152 |
def read_lda_topics(self):
|
|
|
117 |
Hyperparameters -
|
118 |
|
119 |
`n_users` : number of simulated users in the database;
|
120 |
+
|
121 |
`n_ratings` : number of simulated rating events in the database.
|
122 |
|
123 |
This is a fictional dataset in which each rating event assigns a uniformly distributed random rating (from 1 to 5) to one of the simulated users of the recommender system that is being designed in this research project.
|
|
|
137 |
```
|
138 |
|
139 |
Hyperparameters -
|
140 |
+
|
141 |
+
`n_users` : number of simulated users in the database;
|
142 |
+
|
143 |
+
`n_ratings` : number of simulated rating events in the database.
|
144 |
|
145 |
+
This first LDA-based dataset builds a model with K = `n_users` topics. LDA topics are used as proxies for simulated users with different clusters of interest. First, a random opportunity is chosen; then the amount of a randomly chosen topic inside its description is multiplied by five. The ceiling of this result is the rating that the fictional user gives to that opportunity. Because the amount the model predicts for each description is spread across many topics, it is very rare to find an opportunity with a high LDA value for any single topic. As a consequence, this dataset has very low variability and most of the ratings are equal to 1.
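
The rating rule described above can be sketched as follows. This is an illustration under stated assumptions, not the repository's implementation: the name `simulate_lda_ratings`, the `seed` parameter, and the example `doc_topic` matrix are hypothetical, and the per-opportunity topic weights are taken as given rather than produced by a trained LDA model.

```python
import math
import random


def simulate_lda_ratings(doc_topic, n_ratings, seed=None):
    """Return (user, opportunity, rating) triples.

    doc_topic[i][k] is the amount of topic k in the description of
    opportunity i (each row sums to 1). Topic k stands in for simulated user k.
    """
    rng = random.Random(seed)
    n_opportunities = len(doc_topic)
    n_users = len(doc_topic[0])  # K = n_users topics
    events = []
    for _ in range(n_ratings):
        opportunity = rng.randrange(n_opportunities)  # random opportunity
        user = rng.randrange(n_users)                 # random topic, i.e. user
        amount = doc_topic[opportunity][user]
        # Rating is the ceiling of five times the topic amount.
        # Assumption: clamp to 1 so a zero amount still gives a valid rating.
        rating = max(1, math.ceil(5 * amount))
        events.append((user, opportunity, rating))
    return events


# Hypothetical topic amounts for three opportunities and four topics.
doc_topic = [
    [0.25, 0.25, 0.25, 0.25],
    [0.10, 0.10, 0.10, 0.70],
    [0.05, 0.15, 0.40, 0.40],
]
lda_events = simulate_lda_ratings(doc_topic, n_ratings=10, seed=0)
```

The sketch also shows why low ratings dominate: when a description's weight is spread evenly over K topics, each amount is about 1/K, so for K of 5 or more the ceiling of 5/K is 1.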
|
146 |
+
|
147 |
```python
|
148 |
|
149 |
def read_lda_topics(self):
|