Update README.md
README.md
CHANGED
@@ -14,7 +14,7 @@ thumbnail: https://github.com/Marcosdib/S2Query/Classification_Architecture_mode
|
|
14 |
![MCTIimg](https://antigo.mctic.gov.br/mctic/export/sites/institucional/institucional/entidadesVinculadas/conselhos/pag-old/RODAPE_MCTI.png)
|
15 |
|
16 |
|
17 |
-
# MCTI
|
18 |
|
19 |
Disclaimer: The Brazilian Ministry of Science, Technology, and Innovation (MCTI) has partially supported this project.
|
20 |
|
@@ -221,10 +221,17 @@ This first LDA based dataset builds a model with K = `n_users` topics. LDA topic
|
|
221 |
```
|
222 |
|
223 |
### Limitations and bias
|
|
|
224 |
In this model we faced several obstacles. We overcame some of them, but others, by the nature of the project, could not be fully solved.
|
225 |
-
|
226 |
-
|
227 |
-
|
228 |
The problem is that the profiles LinkedIn shows are biased: the profiles that appeared were geographically close to the user, or related to the user's organization and email.
|
229 |
|
230 |
## Training data
|
|
|
14 |
![MCTIimg](https://antigo.mctic.gov.br/mctic/export/sites/institucional/institucional/entidadesVinculadas/conselhos/pag-old/RODAPE_MCTI.png)
|
15 |
|
16 |
|
17 |
+
# MCTI Recommendation Task (uncased) DRAFT
|
18 |
|
19 |
Disclaimer: The Brazilian Ministry of Science, Technology, and Innovation (MCTI) has partially supported this project.
|
20 |
|
|
|
221 |
```
|
222 |
|
223 |
### Limitations and bias
|
224 |
+
|
225 |
In this model we faced several obstacles. We overcame some of them, but others, by the nature of the project, could not be fully solved.
|
226 |
+
Databases containing profiles of possible users of the planned prototype are not available.
|
227 |
+
For this reason, it was necessary to carry out simulations in order to represent the interests of these users, so that the recommendation system could be modeled.
|
228 |
+
A simulation of clusters of latent interests was carried out, based on the topics present in the texts describing financial products.
|
229 |
+
|
230 |
+
|
231 |
+
Because we built the dataset ourselves, there has not yet been any interaction between users and the dataset, so we do not have
|
232 |
+
realistic ratings, which makes the results less reliable.
|
233 |
+
|
234 |
+
Later on, we used a database of scraped LinkedIn profiles.
|
235 |
The problem is that the profiles LinkedIn shows are biased: the profiles that appeared were geographically close to the user, or related to the user's organization and email.
|
236 |
|
237 |
## Training data
|