🚧"raw" pretrained smol_llama checkpoints - WIP 🚧
BEEspoke Data
community
AI & ML interests
'an LLM is only as good as the dataset it was trained on' - Sun Tzu
Organization Card
🐝📊💁
Collections: 7
Spaces: 1
Models: 49
BEE-spoke-data/tFINE-900m-instruct-orpo • Text2Text Generation • 26
BEE-spoke-data/smol_llama-220M-openhermes • Text Generation • 802 • 5
BEE-spoke-data/tFINE-900m-e16-d32-instruct • Text2Text Generation • 66
BEE-spoke-data/tFINE-900m-e16-d32-instruct_2e • Text2Text Generation • 12
BEE-spoke-data/tFINE-900m-e16-d32-flan • Text2Text Generation • 65
BEE-spoke-data/slimpajama_tok-48128-BPE-forT5
BEE-spoke-data/claude-tokenizer-forT5
BEE-spoke-data/Meta-Llama-3-8Bee • Text Generation • 78
BEE-spoke-data/MiniTokenizer-20480
BEE-spoke-data/BeeTokenizer • 1
Datasets: 66
BEE-spoke-data/FLAN-compressed-plusplus • 124M • 13 • 1
BEE-spoke-data/roastme • 434k • 6
BEE-spoke-data/FLAN-compressed • 338M • 3 • 1
BEE-spoke-data/synthsumm-comparisons • 4.67k
BEE-spoke-data/fineweb-cinema-100k • 100k
BEE-spoke-data/aimodels.fyi-papers • 14.8k
BEE-spoke-data/smollm-corpus-python • 12.4M • 28
BEE-spoke-data/flan-v2-hf • 819M • 7
BEE-spoke-data/the-stack-smol-xs-all • 8.7k • 2
BEE-spoke-data/the-stack-smol-xs-scored-and-annotated-python • 100 • 2