Catherine Arnett
catherinearnett
16 followers · 4 following
https://catherinearnett.github.io/
linguist_cat
catherinearnett
catherinearnett.bsky.social
AI & ML interests
multilingual NLP, tokenization
Recent Activity
upvoted an article 10 days ago
Releasing the largest multilingual open pretraining dataset
Articles
Releasing the largest multilingual open pretraining dataset
10 days ago • 94
Detoxifying the Commons
24 days ago • 6
wHy DoNt YoU jUsT uSe ThE lLaMa ToKeNiZeR??
Sep 27 • 35
catherinearnett's activity
liked a dataset 5 months ago
ambean/lingOly
Updated Jun 11 • 90 • 140 • 7