---
language:
- en
---

BPE tokenizer model trained on the BabyLM dataset with a vocabulary size of 16384.
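
For readers unfamiliar with how a BPE vocabulary of a fixed size comes about, the following is a minimal pure-Python sketch of the merge-learning loop. It is not the training script used for this tokenizer: the function names (`train_bpe`, `merge_pair`, `encode`), the whitespace pre-splitting, the tie-breaking rule, and the toy corpus are all illustrative assumptions. For this model the target size would be 16384 and the corpus would be BabyLM.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def train_bpe(corpus, vocab_size):
    """Learn merges until the vocabulary holds `vocab_size` symbols or no pairs remain."""
    # Assumption: words are split on whitespace and start as sequences of characters.
    words = Counter(tuple(word) for line in corpus for word in line.split())
    vocab = sorted({ch for word in words for ch in word})
    merges = []
    while len(vocab) < vocab_size:
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in words.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Pick the most frequent pair; ties are broken lexicographically (an arbitrary choice).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        merged = best[0] + best[1]
        if merged not in vocab:
            vocab.append(merged)
        # Apply the new merge to every word in the corpus.
        new_words = Counter()
        for word, freq in words.items():
            new_words[merge_pair(word, best)] += freq
        words = new_words
    return vocab, merges


def encode(word, merges):
    """Tokenize one word by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Toy corpus standing in for the real training data.
toy_corpus = ["low low low lower lowest", "new newer newest"]
vocab, merges = train_bpe(toy_corpus, vocab_size=12)
print(vocab)
print(encode("newest", merges))
```

The `vocab_size` argument is the only stopping criterion in this sketch: each merge adds one symbol, so a larger target yields longer subword units and shorter token sequences.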