Text Classification
fastText
English

Short sentences score low

#3
by aj666 - opened

Hello,
Thanks for your work~
I found that the score is very low when the input is a short sentence, usually around 3e-5.
Is this expected behavior?

Hi @aj666, could you give me an example? I guess there might be some degree of correlation between document length and educational value in general.

Hello
When I set the sentence to "I love you.", the score is 3.109044064331101e-05.
When I set the sentence to "I love you. I love you. I love you. I love you. I love you. I love you. I love you. I love you. I love you. I love you. ", the score is 0.9994734696447267.

Hi @aj666, that is an interesting observation. I trained the model on web data, not synthetic data, so it might not have seen samples like this.
I think that if you apply the filter to web data, where there is only a slight chance of encountering such short sentences, it will be fine.
I have verified this experimentally. See the blog.
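
If your data does contain many short inputs, one option is to skip scoring them instead of trusting the near-zero score. Below is a minimal sketch of that idea, not part of the model itself. The name `score_with_length_guard`, the `score_fn` callable, and the 20-word threshold are all assumptions for illustration; `score_fn` stands in for however you call the classifier, and the threshold would need tuning on your own data.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical threshold: not taken from the model or the blog.
MIN_WORDS = 20


def score_with_length_guard(
    texts: Iterable[str],
    score_fn: Callable[[str], float],
    min_words: int = MIN_WORDS,
) -> List[Tuple[str, Optional[float]]]:
    """Score each text, returning None for texts below the word threshold.

    `score_fn` is a placeholder for the actual classifier call.
    """
    results: List[Tuple[str, Optional[float]]] = []
    for text in texts:
        n_words = len(text.split())  # whitespace word count
        if n_words < min_words:
            # Too short for the score to be meaningful; leave it unscored.
            results.append((text, None))
        else:
            results.append((text, score_fn(text)))
    return results


if __name__ == "__main__":
    # Stand-in scorer for demonstration only; replace with the real model.
    def dummy_score(text: str) -> float:
        return 0.5

    samples = ["I love you.", "I love you. " * 10]
    for text, score in score_with_length_guard(samples, dummy_score):
        print(len(text.split()), score)
```

Texts that come back with `None` can then be handled separately, for example kept, dropped, or checked by another rule, depending on your pipeline.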
