Short sentences score low
Hello,
Thanks for your work!
I found that the score is very low when the input is a short sentence, usually around 3e-5.
Is this expected behavior?
Hi @aj666, could you give me an example? I guess there might be some degree of correlation between document length and educational value in general.
Hello
When I set the sentence to "I love you.", the score is 3.109044064331101e-05.
When I set the sentence to "I love you. I love you. I love you. I love you. I love you. I love you. I love you. I love you. I love you. I love you. ", the score is 0.9994734696447267.
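To check whether the jump comes from input length rather than content, the comparison above can be repeated over several repetition counts. This is only a sketch: `score_fn` is a hypothetical callable that wraps whatever model call you use and returns one float per text; it is not part of the model's API.

```python
from typing import Callable, List, Sequence, Tuple


def probe_length_sensitivity(
    score_fn: Callable[[str], float],  # hypothetical wrapper around your scorer
    sentence: str,
    repeats: Sequence[int] = (1, 2, 5, 10),
) -> List[Tuple[int, int, float]]:
    """Score the same sentence repeated n times for each n in `repeats`.

    Returns a list of (n, word_count, score) tuples.
    """
    results = []
    for n in repeats:
        # Same content every time; only the length changes.
        text = " ".join([sentence] * n)
        results.append((n, len(text.split()), score_fn(text)))
    return results
```

Calling `probe_length_sensitivity(my_scorer, "I love you.")`, with `my_scorer` being your own wrapper, gives one row per repetition count, which makes it easy to see where the score starts to rise.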
Hi @aj666, it is an interesting observation. I trained the model on web data, not synthetic data, so it might not have seen samples like this.
I think if you apply the filter to web data, where there is only a slight chance of encountering such short inputs, it will be fine.
I have verified this experimentally. See the blog.
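If very short inputs do appear in your data, one possible workaround is to skip scoring them instead of trusting the score. The sketch below is an assumption, not part of the released filter: `score_fn` is a hypothetical wrapper around the model, and `MIN_WORDS` is an arbitrary placeholder threshold that would need tuning on your own data.

```python
from typing import Callable, Optional

MIN_WORDS = 20  # hypothetical threshold, not a value from the model authors


def guarded_score(
    text: str,
    score_fn: Callable[[str], float],  # hypothetical wrapper around your scorer
    min_words: int = MIN_WORDS,
) -> Optional[float]:
    """Return the score, or None when the text is too short to score reliably."""
    if len(text.split()) < min_words:
        # Too short: the model was trained on web documents, so defer the decision.
        return None
    return score_fn(text)
```

Returning `None` rather than a default score keeps the "too short to judge" case separate from a genuinely low score, so the caller can decide how to handle it.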