wangchan-deberta_v1-base-wiki-20210520-news-spm-finetune-qa

Finetuning airesearch/wangchan-deberta_v1-base-wiki-20210520-news-spm with the training set of iapp_wiki_qa_squad, thaiqa_squad, and nsc_qa (removed examples which have cosine similarity with validation and test examples over 0.8; contexts of the latter two are trimmed to be around 300 newmm words). Benchmarks shared on wandb using validation and test sets of iapp_wiki_qa_squad. Trained with thai2transformers.

Run with:

export MODEL_NAME=wangchan-deberta_v1-base-wiki-20210520-news-spm
CUDA_LAUNCH_BLOCKING=1 python train_question_answering_lm_finetuning.py \
  --model_name $MODEL_NAME \
  --dataset_name chimera_qa \
  --revision mlm@ckp-41100 \
  --output_dir $MODEL_NAME-finetune-chimera_qa-model \
  --log_dir $MODEL_NAME-finetune-chimera_qa-log \
  --model_max_length 400 \
  --pad_on_right \
  --fp16 \
  --use_auth_token