---
tags:
- exbert
language: multilingual
license: mit
---

# TOD-XLMR

TOD-XLMR is a conversationally specialized version of [XLM-RoBERTa](https://huggingface.co/xlm-roberta-base). It is further pre-trained on English conversational corpora consisting of nine human-to-human, multi-turn task-oriented dialog (TOD) datasets, following the approach proposed in the paper [TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue](https://aclanthology.org/2020.emnlp-main.66.pdf) by Wu et al. and first released in [this repository](https://huggingface.co/TODBERT).

The model is jointly trained with the two objectives proposed in TOD-BERT: masked language modeling (MLM) and response contrastive loss (RCL). Masked language modeling is a common pretraining strategy for BERT-based architectures, in which a random sample of tokens in the input sequence is replaced with the special [MASK] token and the model is trained to predict the original tokens. To further encourage the model to capture dialogic structure (i.e., dialog sequential order), the response contrastive loss is computed with in-batch negative training: each dialog context is contrasted against all responses in the batch, with its own response serving as the positive example.
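
For illustration only, here is a minimal sketch of how a response contrastive loss with in-batch negatives can be computed. This is a simplified example of the idea, not the actual pretraining code, and the `contexts`/`responses` pairs below are made-up toy data:
```
# Minimal, illustrative sketch of a response contrastive loss (RCL) with
# in-batch negatives. This is NOT the actual TOD-XLMR pretraining code.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("umanlp/TOD-XLMR")
encoder = AutoModel.from_pretrained("umanlp/TOD-XLMR")

# Toy batch of (dialog context, gold response) pairs.
contexts = ["i am looking for a cheap hotel in the centre",
            "can you book a table for two tonight ?"]
responses = ["sure , the alexander b&b is a cheap option in the centre .",
             "done , your table is booked for 7 pm ."]

def embed(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    # Use the hidden state of the first token as a sentence representation.
    return encoder(**batch).last_hidden_state[:, 0]

ctx_emb, rsp_emb = embed(contexts), embed(responses)

# Similarity of every context with every response in the batch; the matching
# response (the diagonal) is the positive, all other responses are negatives.
logits = ctx_emb @ rsp_emb.T
labels = torch.arange(logits.size(0))
rcl_loss = F.cross_entropy(logits, labels)
```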
### How to use
Here is how to use this model in PyTorch, loading it together with its masked language modeling head:
```
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("umanlp/TOD-XLMR")
model = AutoModelForMaskedLM.from_pretrained("umanlp/TOD-XLMR")

# prepare input
text = "Replace me with any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')

# forward pass
output = model(**encoded_input)
```
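
Because the model is loaded with its masked language modeling head, you can also use it to fill in a masked token. A small example, reusing the `tokenizer` and `model` from the snippet above (the actual prediction depends on the checkpoint):
```
import torch

# Sentence with one masked token (the XLM-R tokenizer uses "<mask>").
masked_text = f"I would like to book a {tokenizer.mask_token} for two nights."
inputs = tokenizer(masked_text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and take its most likely token.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```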

Alternatively, you can use `AutoModel` to load the pretrained encoder and extract features for downstream tasks:
```
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("umanlp/TOD-XLMR")
model = AutoModel.from_pretrained("umanlp/TOD-XLMR")

# prepare input
text = "Replace me with any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')

# forward pass
output = model(**encoded_input)
```
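
The `output` above is a standard `transformers` model output; for downstream tasks you would typically work with its hidden states, for example:
```
# Token-level features: shape (batch_size, sequence_length, hidden_size)
token_embeddings = output.last_hidden_state

# A simple sentence-level feature: the hidden state of the first token
sentence_embedding = output.last_hidden_state[:, 0]
print(token_embeddings.shape, sentence_embedding.shape)
```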