tangled-llama-t-128k-base-v0.1/scripts/prepare_pretrain_dataset.py

Commit History

compress pretrain dataset
b2e9443

mtasic85 committed on

multilingual dataset
39a4b5b

mtasic85 committed on

initial version
adf4b14

mtasic85 committed on