---
language:
- en
license: apache-2.0
tags:
- dialogue state tracking
- task-oriented dialog
---

# roberta-base-trippy-dst-multiwoz21

This is a TripPy model trained on [MultiWOZ 2.1](https://github.com/budzianowski/multiwoz) for use in [ConvLab-3](https://github.com/ConvLab/ConvLab-3).
The model predicts informable slots, requestable slots, general actions, and domain indicator slots.
The expected joint goal accuracy on MultiWOZ 2.1 is in the range of 55-56%.
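
Joint goal accuracy counts a dialogue turn as correct only if the complete predicted dialogue state matches the gold state exactly. A minimal sketch of the metric, assuming `predictions` and `references` are hypothetical lists of per-turn slot-value dictionaries (they are not part of this repository):

```
def joint_goal_accuracy(predictions, references):
    # A turn counts as correct only if every slot-value pair matches the gold state.
    correct = sum(pred == ref for pred, ref in zip(predictions, references))
    return correct / len(references)
```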

For information about TripPy DST, refer to [TripPy: A Triple Copy Strategy for Value Independent Neural Dialog State Tracking](https://aclanthology.org/2020.sigdial-1.4/).
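
As a rough illustration of the triple copy strategy from that paper: each slot first passes through a slot gate, and its value is then filled by one of three copy mechanisms (a span copied from the user utterance, a value copied from the system inform memory, or a value copied from a coreferred slot in the dialogue state). A simplified sketch with hypothetical names, not the actual TripPy implementation:

```
def fill_slot(gate, span_value, inform_memory, dialogue_state, slot, referred_slot=None):
    if gate == "none":
        return None                           # slot not addressed in this turn
    if gate == "dontcare":
        return "dontcare"                     # user has no preference
    if gate == "span":
        return span_value                     # copy a span from the user utterance
    if gate == "inform":
        return inform_memory[slot]            # copy a value the system offered
    if gate == "refer":
        return dialogue_state[referred_slot]  # copy from a coreferred slot
```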

The training and evaluation code is available at the official [TripPy repository](https://gitlab.cs.uni-duesseldorf.de/general/dsml/trippy-public).

## Training procedure

The model was trained on MultiWOZ 2.1 data via supervised learning using the [TripPy codebase](https://gitlab.cs.uni-duesseldorf.de/general/dsml/trippy-public).
The MultiWOZ 2.1 data was loaded via ConvLab-3's unified data format dataloader.
The pre-trained encoder is [RoBERTa](https://arxiv.org/abs/1907.11692) (base).
The encoder was fine-tuned and the DST-specific classification heads were trained for 10 epochs.
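
The TripPy-specific classification heads are defined in the TripPy codebase; the underlying encoder follows the standard Hugging Face layout. A minimal sketch for inspecting it, assuming the checkpoint has been downloaded to a hypothetical local directory in the usual `transformers` format:

```
from transformers import RobertaConfig, RobertaTokenizer

# Hypothetical local path to this checkpoint; adjust as needed.
checkpoint_dir = "./roberta-base-trippy-dst-multiwoz21"

config = RobertaConfig.from_pretrained(checkpoint_dir)
tokenizer = RobertaTokenizer.from_pretrained(checkpoint_dir)
print(config.num_hidden_layers, config.hidden_size)  # 12 and 768 for roberta-base
```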

### Training hyperparameters

```
python3 run_dst.py \
    --task_name="unified" \
    --model_type="roberta" \
    --model_name_or_path="roberta-base" \
    --dataset_config=dataset_config/unified_multiwoz21.json \
    --do_lower_case \
    --learning_rate=1e-4 \
    --num_train_epochs=10 \
    --max_seq_length=180 \
    --per_gpu_train_batch_size=24 \
    --per_gpu_eval_batch_size=32 \
    --output_dir=results \
    --save_epochs=2 \
    --eval_all_checkpoints \
    --logging_steps=10 \
    --warmup_proportion=0.1 \
    --adam_epsilon=1e-6 \
    --weight_decay=0.01 \
    --label_value_repetitions \
    --swap_utterances \
    --append_history \
    --use_history_labels \
    --fp16 \
    --do_train \
    --predict_type=dummy \
    --seed=42
```