gpt-est-large

This is the large-size GPT-2 model, trained from scratch on 2.2 billion words (Estonian National Corpus + News Crawl + Common Crawl). It was previously named "gpt-4-est-large" and was renamed to avoid clickbait.

Reference

Format

During training, each text was prepended with a text domain tag, and the same tag should be added as a prefix when using the model: >general<, >web<, >news<, >doaj< and >wiki< (standing for general texts, web-crawled texts, news, article abstracts and Wikipedia texts). For example: ">web< Kas tead, et".
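
The prefixing step can be sketched as below. This is a minimal illustration, not the authors' own code: the helper names `add_domain_tag` and `generate` are hypothetical, the model id `tartuNLP/gpt-for-est-large` is an assumption (substitute the id you actually load), and the single space between the tag and the text follows the example above.

```python
# The five domain tags listed in this card.
DOMAIN_TAGS = ("general", "web", "news", "doaj", "wiki")


def add_domain_tag(text: str, domain: str = "general") -> str:
    """Prepend the domain tag in the '>tag< text' form shown in the card."""
    if domain not in DOMAIN_TAGS:
        raise ValueError(f"unknown domain {domain!r}; expected one of {DOMAIN_TAGS}")
    return f">{domain}< {text}"


def generate(text: str, domain: str = "general",
             model_id: str = "tartuNLP/gpt-for-est-large",  # assumed id
             max_length: int = 50) -> str:
    """Generate a continuation for a tagged prompt (hypothetical helper)."""
    # Imported here so the tagging helper works without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    prompt = add_domain_tag(text, domain)
    return generator(prompt, max_length=max_length)[0]["generated_text"]


if __name__ == "__main__":
    print(add_domain_tag("Kas tead, et", "web"))
```

Calling `generate("Kas tead, et", domain="web")` would download the model weights on first use, so the tagging helper is kept separate and can be checked on its own.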

Model details

num. of layers: 24
num. of heads: 24
embedding size: 1536
context size: 1024
total size: 723.58M params
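
As a rough sanity check on these figures, the parameter count of a standard GPT-2 layout can be worked out from the numbers above. This is only a sketch under stated assumptions: it assumes the usual GPT-2 block structure (4x MLP width, biases, two layer norms per block, tied input and output embeddings), and the vocabulary size is not stated in this card, so it is left as a parameter. The function name `gpt2_param_count` is hypothetical.

```python
def gpt2_param_count(n_layer: int, d_model: int, n_ctx: int, vocab_size: int) -> int:
    """Parameter count for a standard GPT-2 layout with tied embeddings."""
    # Per block: attention (3d^2 + 3d) + attention output (d^2 + d)
    # + MLP up (4d^2 + 4d) + MLP down (4d^2 + d) + two layer norms (4d).
    per_block = 12 * d_model * d_model + 13 * d_model
    final_layer_norm = 2 * d_model
    position_embeddings = n_ctx * d_model
    token_embeddings = vocab_size * d_model
    return n_layer * per_block + final_layer_norm + position_embeddings + token_embeddings


if __name__ == "__main__":
    # Values from this card; vocab_size=0 leaves out the token embeddings,
    # because the vocabulary size is not given here.
    without_tokens = gpt2_param_count(n_layer=24, d_model=1536, n_ctx=1024, vocab_size=0)
    print(f"{without_tokens / 1e6:.2f}M parameters excluding token embeddings")
    print(f"head dimension: {1536 // 24}")
```

Under these assumptions, the difference between the stated total and the count without token embeddings would be the token embedding matrix.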

Further details to be added soon.

Framework versions

Transformers 4.13.0.dev0
Pytorch 1.10.0+cu102
Datasets 1.15.1
Tokenizers 0.10.3
