Replicate default cc_net preprocessing at inference time in KenlmModel.get_perplexity 0def03f edugp committed on Nov 11, 2021
Add tests and fix sentence sampling: when splitting into sentences, take the minimum of the total number of sentences and the sample size, rather than of the total number of original documents and the sample size d131aa3 edugp committed on Nov 9, 2021
Support visualizing both sentences and whole documents. Smooth color assignment in visualization. a86046b edugp committed on Nov 4, 2021
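
The sampling fix described in d131aa3 can be illustrated with a minimal sketch. This is not the repository's code: the function name `sample_sentences`, the newline-based sentence splitting, and the `seed` parameter are all assumptions made for illustration. The point is only that the sample count is capped by the number of sentences, not by the number of original documents.

```python
import random


def sample_sentences(documents, sample_size, seed=0):
    """Hypothetical helper: split documents into sentences and sample them.

    Sentences are assumed to be newline-separated; the real project may
    split differently.
    """
    sentences = [
        line.strip()
        for doc in documents
        for line in doc.split("\n")
        if line.strip()
    ]
    # Cap by the number of sentences, not the number of documents.
    # Using len(documents) here would under-sample when documents contain
    # several sentences each, and could make random.sample raise ValueError
    # when there are fewer sentences than documents.
    k = min(len(sentences), sample_size)
    return random.Random(seed).sample(sentences, k)
```

With two documents holding four sentences in total, a sample size of 10 yields all four sentences, while a sample size of 2 yields two.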