Regarding fine-tuning the model on a custom dataset
Hi,
Thank you for sharing the fine-tuned LongT5 model for the SumPubMed dataset. I want to fine-tune it further on my own dataset of long math documents. Could you please share how you fine-tuned the LongT5 model, the training setup, etc.?
Did you use this script: https://github.com/huggingface/transformers/blob/main/examples/pytorch/summarization/run_summarization.py?
Thanks!
Hello there!
I used something similar to this default script for fine-tuning on text summarization datasets:
https://github.com/huggingface/notebooks/blob/main/examples/summarization.ipynb
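In outline, the setup looks roughly like the sketch below. To be clear, this is a minimal approximation of that notebook, not my exact configuration: the model name, file paths, sequence lengths, and hyperparameters are all placeholders you should replace with your own.

```python
# Minimal sketch of LongT5 fine-tuning with the HF Seq2SeqTrainer, assuming a
# JSON dataset with "document" and "summary" fields. Model name, file paths,
# and hyperparameters below are placeholders, not the exact values I used.
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "google/long-t5-tglobal-base"  # swap in the checkpoint you want to start from
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

raw = load_dataset("json", data_files={"train": "train.json", "validation": "val.json"})

def preprocess(batch):
    # Tokenize long source documents and (shorter) target summaries.
    model_inputs = tokenizer(batch["document"], max_length=4096, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=512, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = raw.map(preprocess, batched=True, remove_columns=raw["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="longt5-custom",
    learning_rate=1e-3,               # constant rate used in the LongT5 paper; adjust for your data
    optim="adafactor",                # T5-family models are typically fine-tuned with Adafactor
    per_device_train_batch_size=1,    # long inputs are memory-hungry; compensate with accumulation
    gradient_accumulation_steps=16,
    num_train_epochs=3,
    predict_with_generate=True,
    eval_strategy="epoch",            # `evaluation_strategy` on transformers < 4.41
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```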
Thanks for sharing!
Hi, could you please share the learning rate you used for fine-tuning on the SumPubMed dataset? The paper mentions 0.001 for the other datasets.
I am fine-tuning on a new dataset and just want to know whether to start very low (2e-5), as with Pegasus, or high (0.001), as with LongT5; in the meantime I'm planning to run a short sweep over both rates (sketch below). Also, what GPU configuration did you use for fine-tuning?
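For context, this is the kind of quick sweep I have in mind to decide between the two rates. It reuses `tokenizer` and `tokenized` from the fine-tuning sketch above, and everything else is a placeholder:

```python
# Short learning-rate sanity sweep (sketch only). Assumes `tokenizer` and
# `tokenized` exist as in the fine-tuning sketch above.
from transformers import (
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

for lr in (2e-5, 1e-3):  # Pegasus-style low rate vs. LongT5-style high rate
    # Fresh model per run so the comparisons start from the same weights.
    model = AutoModelForSeq2SeqLM.from_pretrained("google/long-t5-tglobal-base")
    args = Seq2SeqTrainingArguments(
        output_dir=f"lr-sweep-{lr}",
        learning_rate=lr,
        max_steps=500,                 # just long enough to compare loss curves
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        logging_steps=50,
    )
    trainer = Seq2SeqTrainer(
        model=model,
        args=args,
        train_dataset=tokenized["train"],
        data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    )
    trainer.train()
```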