Jmilagres committed on
Commit
8485cfe
1 Parent(s): ed0f76a

Update README.md

Files changed (1)
  1. README.md +11 -4
README.md CHANGED
@@ -14,7 +14,7 @@ thumbnail: https://github.com/Marcosdib/S2Query/Classification_Architecture_mode
14
  ![MCTIimg](https://antigo.mctic.gov.br/mctic/export/sites/institucional/institucional/entidadesVinculadas/conselhos/pag-old/RODAPE_MCTI.png)
15
 
16
 
17
- # MCTI Text Classification Task (uncased) DRAFT
18
 
19
  Disclaimer: The Brazilian Ministry of Science, Technology, and Innovation (MCTI) has partially supported this project.
20
 
@@ -221,10 +221,17 @@ This first LDA based dataset builds a model with K = `n_users` topics. LDA topic
221
  ```
222
 
223
  ### Limitations and bias
 
224
  In this model we faced several obstacles. We overcame most of them, but some, by the nature of the project, could not be fully solved.
225
- Because we built the dataset ourselves, no user has interacted with it yet, so we have no
226
- realistic ratings. We had to simulate them, which makes the results less reliable.
227
- Also in this part of the project, we used a database of scraped LinkedIn profiles.
228
  The problem is that the profiles LinkedIn shows are biased: the profiles that appeared were geographically close to the user, or related to the user's organization and email.
229
 
230
  ## Training data
 
14
  ![MCTIimg](https://antigo.mctic.gov.br/mctic/export/sites/institucional/institucional/entidadesVinculadas/conselhos/pag-old/RODAPE_MCTI.png)
15
 
16
 
17
+ # MCTI Recommendation Task (uncased) DRAFT
18
 
19
  Disclaimer: The Brazilian Ministry of Science, Technology, and Innovation (MCTI) has partially supported this project.
20
 
 
221
  ```
222
 
223
  ### Limitations and bias
224
+
225
  In this model we faced several obstacles. We overcame most of them, but some, by the nature of the project, could not be fully solved.
226
+ Databases containing profiles of possible users of the planned prototype are not available.
227
+ For this reason, it was necessary to carry out simulations in order to represent the interests of these users, so that the recommendation system could be modeled.
228
+ We simulated clusters of latent interests, based on topics present in the texts describing financial products.
229
+
230
+
231
+ Because we built the dataset ourselves, no user has interacted with it yet, so we have no
232
+ realistic ratings, which makes the results less reliable.
233
+
234
+ Later on, we used a database of scraped LinkedIn profiles.
235
  The problem is that the profiles LinkedIn shows are biased: the profiles that appeared were geographically close to the user, or related to the user's organization and email.
236
 
237
  ## Training data