tangled-llama-t-128k-base-v0.1/scripts/prepare_pretrain_dataset.py

Commit History

pretrain dataset
a4a75cd

mtasic85 committed on

compress pretrain dataset
b2e9443

mtasic85 committed on

multilingual dataset
39a4b5b

mtasic85 committed on

initial version
adf4b14

mtasic85 committed on