CarelvNiekerk committed on
Commit
ec6ecbd
1 Parent(s): e510cdb

Update README.md

Files changed (1)
  1. README.md +67 -0
README.md CHANGED
@@ -1,3 +1,70 @@
  ---
+ language:
+ - en
  license: apache-2.0
+ tags:
+ - roberta
+ - classification
+ - dialog state tracking
+ - natural language understanding
+ - uncertainty
+ - conversational system
+ - task-oriented dialog
+ datasets:
+ - ConvLab/multiwoz21
+ metrics:
+ - Joint Goal Accuracy
+ - Slot F1
+ - Joint Goal Expected Calibration Error
+
+ model-index:
+ - name: setsumbt-dst-nlu-multiwoz21
+   results:
+   - task:
+       type: classification
+       name: dialog state tracking
+     dataset:
+       type: ConvLab/multiwoz21
+       name: MultiWOZ21
+       split: test
+     metrics:
+     - type: Joint Goal Accuracy
+       value: 51.8
+       name: JGA
+     - type: Slot F1
+       value: 91.1
+       name: Slot F1
+     - type: Joint Goal Expected Calibration Error
+       value: 12.7
+       name: JECE
+
  ---
+
+ # SetSUMBT-dst-nlu-multiwoz21
+
+ This model is a fine-tuned [SetSUMBT](https://github.com/ConvLab/ConvLab-3/tree/master/convlab/dst/setsumbt) model, built on [roberta-base](https://huggingface.co/roberta-base) and trained on [MultiWOZ2.1](https://huggingface.co/datasets/ConvLab/multiwoz21).
+ It is a combined DST and NLU model, obtained by distribution distillation of an ensemble of 5 models, and should be used to produce uncertainty estimates for the dialogue belief state.
+
+ Refer to [ConvLab-3](https://github.com/ConvLab/ConvLab-3) for the model description and usage.
+
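+ A minimal usage sketch is given below. It assumes ConvLab-3's SetSUMBT tracker interface; the import path, constructor argument, `update()` call, and the `ConvLab/` repository id are assumptions based on ConvLab-3's DST conventions rather than details taken from this card.
+
+ ```python
+ # Hedged sketch: loading this checkpoint as a ConvLab-3 dialogue state tracker.
+ # The class name, constructor argument, update() interface and repository id are
+ # assumed from ConvLab-3's DST conventions and may differ from the actual API.
+ from convlab.dst.setsumbt import SetSUMBTTracker
+
+ tracker = SetSUMBTTracker(model_name_or_path="ConvLab/setsumbt-dst-nlu-multiwoz21")
+ tracker.init_session()
+
+ # update() consumes the latest user utterance and returns the tracked belief state;
+ # for this model the state also carries the distilled uncertainty estimates.
+ state = tracker.update("I need a cheap hotel in the north of town.")
+ print(state)
+ ```
+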
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.00001
+ - train_batch_size: 3
+ - eval_batch_size: 16
+ - seed: 0
+ - gradient_accumulation_steps: 1
+ - optimizer: AdamW
+ - loss: Ensemble Distribution Distillation Loss
+ - lr_scheduler_type: linear
+ - num_epochs: 50.0
+
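+ For reference, the optimizer and learning-rate schedule listed above map onto a setup along the following lines. This is an illustrative sketch using standard PyTorch and Transformers APIs, with a roberta-base encoder standing in for the full SetSUMBT model; the warmup and total step counts are placeholders not taken from this card, and the actual training script lives in ConvLab-3.
+
+ ```python
+ import torch
+ from transformers import RobertaModel, get_linear_schedule_with_warmup
+
+ # Stand-in for the SetSUMBT model, which builds on a roberta-base encoder.
+ model = RobertaModel.from_pretrained("roberta-base")
+
+ # learning_rate: 0.00001, optimizer: AdamW
+ optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
+
+ # lr_scheduler_type: linear over num_epochs: 50. The true step count depends on the
+ # number of MultiWOZ 2.1 turns per epoch (train_batch_size = 3), so a placeholder is
+ # used here; warmup is not specified on the card, so it is set to 0.
+ num_training_steps = 50 * 20_000
+ scheduler = get_linear_schedule_with_warmup(
+     optimizer, num_warmup_steps=0, num_training_steps=num_training_steps
+ )
+ ```
+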
+ ### Framework versions
+
+ - Transformers 4.17.0
+ - Pytorch 1.8.0+cu110
+ - Datasets 2.3.2
+ - Tokenizers 0.12.1