BERT Medium for Luxembourgish

Trained on a dataset of 1M Luxembourgish sentences from Wikipedia. The corpus contains approximately 16M words.

The model was trained with the masked language modeling (MLM) objective. It has L=8 layers and a hidden size of H=512. The vocabulary has 70K word pieces.
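To give a rough sense of scale, the sketch below estimates the encoder's parameter count from L, H, and the vocabulary size. Only L=8, H=512, and the 70K vocabulary come from this card. The feed-forward size (4 x H = 2048), the 512 position embeddings, the 2 token types, and a vocabulary of exactly 70,000 entries are assumptions taken from the standard BERT layout, and the MLM head and pooler are left out, so treat the result as an approximation rather than the model's actual size.

```python
def bert_encoder_params(vocab_size, hidden, layers, intermediate,
                        max_positions=512, type_vocab=2):
    """Approximate parameter count of a BERT encoder (no pooler, no MLM head)."""
    # Word, position and token-type embeddings, plus one LayerNorm (weight + bias).
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + 2 * hidden
    # Query, key, value and output projections (weights + biases), plus LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + 2 * hidden
    # Two feed-forward projections (weights + biases), plus LayerNorm.
    feed_forward = (hidden * intermediate + intermediate
                    + intermediate * hidden + hidden
                    + 2 * hidden)
    return embeddings + layers * (attention + feed_forward)


# L=8, H=512 and the 70K vocabulary are from the card; intermediate=2048 is assumed.
total = bert_encoder_params(vocab_size=70_000, hidden=512, layers=8, intermediate=2048)
embedding_only = (70_000 + 512 + 2) * 512 + 2 * 512
print(f"approx. encoder parameters: {total:,}")
print(f"share held by embeddings: {embedding_only / total:.0%}")
```

Under these assumptions, the embedding tables account for more than half of the parameters, which is what a 70K vocabulary paired with a fairly small hidden size would suggest.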

Final loss and perplexity after 3 epochs:

Final train loss: 4.230
Final train perplexity: 68.726
Final validation loss: 4.074
Final validation perplexity: 58.765
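The loss and perplexity figures are related: if the loss is the mean cross-entropy in nats (an assumption; the card does not state how it was computed), perplexity is exp(loss). A minimal check against the reported numbers:

```python
import math


def perplexity(loss: float) -> float:
    """Perplexity for a mean cross-entropy loss given in nats (assumed)."""
    return math.exp(loss)


# (loss, perplexity) pairs as reported above.
reported = {
    "train": (4.230, 68.726),
    "validation": (4.074, 58.765),
}

for split, (loss, ppl) in reported.items():
    print(f"{split}: exp({loss}) = {perplexity(loss):.3f} (reported {ppl})")
```

The recomputed values differ from the reported ones only in the second decimal place, which is consistent with the losses having been rounded to three decimals.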
