
t5-large-coqr-canard

This model is a fine-tuned version of t5-large on the CANARD dataset. It achieves the following results on the test set:

  • Loss: 0.3064
  • Bleu: 77.1979
  • Generation Length: 9.576

Model description

The CANARD dataset rewrites questions from conversations to make them context-independent (understandable without the conversation history). This model is trained in the opposite direction: it rewrites context-independent questions into conversational ones, aiming to produce fluent dialog with anaphora and ellipsis.

Input:

Rewrite the question according to the given context to make the dialog fluent using anaphora and ellipsis.

question: How did people respond to Superstar Billy Graham's return?

context: Superstar Billy Graham
Return to WWWF (1977-1981)
Why did he return to the WWWF?
an agreement with promoter Vincent J. McMahon (Senior
What was his agreement with McMahon?
I don't know.

Target:

How did people respond to his return?
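A minimal inference sketch with the Transformers library, assuming the repo id matches the card title (`t5-large-coqr-canard`; adjust to the actual Hub path) and that the prompt is serialized exactly as shown above, with the instruction, the `question:` field, and the `context:` turns separated by blank lines:

```python
def build_prompt(question, context_turns):
    """Serialize the instruction, question, and context turns into one input string.

    The exact separators are an assumption based on the example in this card.
    """
    instruction = ("Rewrite the question according to the given context "
                   "to make the dialog fluent using anaphora and ellipsis.")
    context = "\n".join(context_turns)
    return f"{instruction}\n\nquestion: {question}\n\ncontext: {context}"


def rewrite(question, context_turns, model_name="t5-large-coqr-canard"):
    """Rewrite a context-independent question into a conversational one."""
    # Imported here so build_prompt stays usable without transformers installed.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    inputs = tokenizer(build_prompt(question, context_turns), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=32)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With the example above, `rewrite("How did people respond to Superstar Billy Graham's return?", [...])` would be expected to yield a conversational form such as "How did people respond to his return?".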

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • total_train_batch_size: 512
  • total_eval_batch_size: 512
  • optimizer: Adafactor
  • lr_scheduler_type: linear
  • num_epochs: 1.0
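The total batch sizes above follow from the per-device sizes and the device count; a minimal sanity check of that arithmetic:

```python
# In multi-GPU data-parallel training, the effective (total) batch size
# is the per-device batch size multiplied by the number of devices.
train_batch_size = 64   # per-device train batch size
eval_batch_size = 64    # per-device eval batch size
num_devices = 8

total_train_batch_size = train_batch_size * num_devices  # 512
total_eval_batch_size = eval_batch_size * num_devices    # 512
```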

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu    | Gen Len |
|---------------|-------|------|-----------------|---------|---------|
| No log        | 1.0   | 62   | 0.2987          | 77.2361 | 9.4534  |

Framework versions

  • Transformers 4.20.1
  • Pytorch 1.11.0+cu113
  • Datasets 2.6.1
  • Tokenizers 0.12.1
