Kirill Koncha

midwestcyr

AI & ML interests

NLP, Benchmarking, Low-Resource Languages

midwestcyr's activity

liked a model about 1 month ago

mistralai/Mistral-7B-Instruct-v0.1

Text Generation • Updated Aug 22 • 235k • 1.53k

authored a paper 5 months ago

RuBLiMP: Russian Benchmark of Linguistic Minimal Pairs

Paper • 2406.19232 • Published Jun 27

upvoted a paper 6 months ago

RuCoLA: Russian Corpus of Linguistic Acceptability

Paper • 2210.12814 • Published Oct 23, 2022 • 1

Reacted to FremyCompany's post with ❤️ 7 months ago

Post

Today, April 26, is the Day of the Tatar Language! 🌟
To celebrate, we release our new language model, Tweety Tatar 🐣

https://huggingface.co/Tweeties/tweety-tatar-base-7b-2024-v1

The model was converted from Mistral Instruct v0.2 using a novel technique called trans-tokenization. As a result, the model uses a brand-new tokenizer, fully tailored for the Tatar language.

We are also releasing a model that can be fine-tuned to translate English or Russian into Tatar and that achieves performance similar to commercial offerings:

https://huggingface.co/Tweeties/tweety-tatar-hydra-base-7b-2024-v1

More details in our upcoming paper 👀
François REMY, Pieter Delobelle, Alfiya Khabibullina

Happy Tatar Language Day!

liked a model about 2 years ago

speechbrain/spkrec-ecapa-voxceleb

Updated Feb 19 • 656k • 161

liked 2 models over 2 years ago

speechbrain/lang-id-voxlingua107-ecapa

Audio Classification • Updated about 16 hours ago • 272k • 101

speechbrain/lang-id-commonlanguage_ecapa

Audio Classification • Updated Feb 19 • 63.2k • 36