---
language:
- en
license: apache-2.0
tags:
- dialogue policy
- task-oriented dialog
datasets:
- ConvLab/multiwoz21
---

# ddpt-policy-sgd

This is an MLE model trained on [MultiWOZ 2.1](https://huggingface.co/datasets/ConvLab/multiwoz21).

Refer to [ConvLab-3](https://github.com/ConvLab/ConvLab-3) for the model description and usage.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32
- seed: 0
- optimizer: Adam
- num_epochs: 24
- the checkpoint that performed best on the validation set was used

### Framework versions

- Transformers 4.18.0
- Pytorch 1.10.2+cu111
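
### Hyperparameter sketch

For illustration only, the following is a minimal, hypothetical PyTorch sketch of how the hyperparameters above (seed, Adam optimizer, learning rate, batch size, number of epochs, and best-on-validation checkpointing) could map onto a supervised training loop. The policy network, data, and validation metric are placeholders, not the actual ConvLab-3 training code; refer to the [ConvLab-3](https://github.com/ConvLab/ConvLab-3) repository for the real implementation.

```python
import torch
import torch.nn as nn

SEED = 0
LEARNING_RATE = 1e-4   # learning_rate: 0.0001
TRAIN_BATCH_SIZE = 32  # train_batch_size: 32
NUM_EPOCHS = 24        # num_epochs: 24

torch.manual_seed(SEED)  # seed: 0

# Placeholder policy network standing in for the ConvLab-3 policy module.
policy = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 32))

optimizer = torch.optim.Adam(policy.parameters(), lr=LEARNING_RATE)  # optimizer: Adam
loss_fn = nn.BCEWithLogitsLoss()  # MLE-style supervised objective (placeholder)

# Dummy state/action pairs standing in for MultiWOZ 2.1 training data.
states = torch.randn(256, 64)
actions = torch.randint(0, 2, (256, 32)).float()
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(states, actions),
    batch_size=TRAIN_BATCH_SIZE,
    shuffle=True,
)

best_val_loss = float("inf")
for epoch in range(NUM_EPOCHS):
    for batch_states, batch_actions in loader:
        optimizer.zero_grad()
        loss = loss_fn(policy(batch_states), batch_actions)
        loss.backward()
        optimizer.step()
    # In the real setup, the checkpoint with the best validation score is kept.
    val_loss = loss.item()  # placeholder "validation" metric
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        torch.save(policy.state_dict(), "best_checkpoint.pt")
```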