|
Masked word prediction in 103 languages. To use, input a non-English sentence (you can use Google Translate to get one), replacing one of the words with "[MASK]" (the tokenizer's mask token).
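
For programmatic use, a minimal sketch with the transformers fill-mask pipeline. The model id below is the base checkpoint as a stand-in, since this card does not give the finetuned repo name; substitute the actual id.

```python
from transformers import pipeline

# Stand-in model id -- replace with the finetuned model's repo name.
fill_mask = pipeline("fill-mask", model="distilbert-base-multilingual-cased")

# German example: "The capital of France is [MASK]."
# Taking the mask token from the tokenizer avoids hard-coding it.
masked = f"Die Hauptstadt von Frankreich ist {fill_mask.tokenizer.mask_token}."
for prediction in fill_mask(masked, top_k=3):
    print(f"{prediction['token_str']!r}  score={prediction['score']:.3f}")
```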
|
distilbert-base-multilingual-cased finetuned for English masked language modelling on the r/explainlikeimfive subset of the ELI5 dataset. Because the finetuning data is English only, any ability in the other languages comes entirely from multilingual pretraining.
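
The data-preparation code is not included in this card; the following is a plausible sketch, assuming the eli5 dataset on the Hugging Face Hub (whose train_eli5 split covers r/explainlikeimfive) and a simple first-answer tokenization.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Assumption: the "eli5" dataset on the Hub; its train_eli5 split is the
# r/explainlikeimfive subset. The 50,000-example cap matches this card.
eli5 = load_dataset("eli5", split="train_eli5[:50000]")
eli5 = eli5.flatten()  # expose nested fields such as answers.text

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-multilingual-cased")

def tokenize(batch):
    # Each example holds a list of answers; take the first (or empty string).
    texts = [answers[0] if answers else "" for answers in batch["answers.text"]]
    return tokenizer(texts, truncation=True)

tokenized = eli5.map(tokenize, batched=True, remove_columns=eli5.column_names)
```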
|
Trained on 50,000 examples for 3 epochs with a mask probability of 15%, batch size 8, learning rate 2e-5, and weight decay 0.01. Final model perplexity: 10.22.
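
A sketch of the training setup with the transformers Trainer, using the hyperparameters listed above. The output directory and the evaluation holdout are assumptions, and `tokenizer` and `tokenized` come from the data-preparation sketch.

```python
import math

from transformers import (
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model = AutoModelForMaskedLM.from_pretrained("distilbert-base-multilingual-cased")

# Dynamic masking at the 15% probability stated above.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
splits = tokenized.train_test_split(test_size=0.05)  # hypothetical eval holdout

args = TrainingArguments(
    output_dir="distilbert-multilingual-eli5",  # hypothetical output path
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=2e-5,
    weight_decay=0.01,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
    data_collator=collator,
)
trainer.train()

# Perplexity is exp of the evaluation cross-entropy loss; a loss of about
# 2.324 gives exp(2.324) ≈ 10.22, consistent with the figure reported above.
print(f"Perplexity: {math.exp(trainer.evaluate()['eval_loss']):.2f}")
```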
|