DeepPavlovAI committed 2c754a3 (parent: c5ae87f): Update README.md
---
language:
- ru
---

# distilrubert-small-cased-conversational

Conversational DistilRuBERT-small (Russian, cased, 2-layer, 768-hidden, 12-heads, 107M parameters) was trained on OpenSubtitles [1], [Dirty](https://d3.ru/), [Pikabu](https://pikabu.ru/), and a Social Media segment of the Taiga corpus [2] (as [Conversational RuBERT](https://huggingface.co/DeepPavlov/rubert-base-cased-conversational)). It can be considered a small copy of [Conversational DistilRuBERT-base](https://huggingface.co/DeepPavlov/distilrubert-base-cased-conversational).

Our DistilRuBERT-small was highly inspired by [3], [4]. Namely, we used

* KL loss (between teacher and student output logits)
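The KL loss above can be sketched as follows. This is a minimal NumPy illustration of the general distillation technique (KL divergence between temperature-softened teacher and student output distributions, as in Hinton et al.), not DeepPavlov's actual training code; the function names and the temperature value are illustrative.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax over the last axis, with the usual
    # max-subtraction trick for numerical stability.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_distillation_loss(teacher_logits, student_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    averaged over the batch and scaled by T**2 so gradient magnitudes
    stay comparable across temperatures."""
    p = softmax(teacher_logits, T)                 # teacher soft targets
    log_p = np.log(p + 1e-12)
    log_q = np.log(softmax(student_logits, T) + 1e-12)
    kl = (p * (log_p - log_q)).sum(axis=-1)        # per-example KL
    return (T ** 2) * kl.mean()
```

When student and teacher produce identical logits the loss is zero; any mismatch yields a positive value, pushing the student's output distribution toward the teacher's.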