How to finetune using DPO?

#31

by Maverick17 - opened 16 days ago

16 days ago

Hello,

I have a standard DPO dataset with columns for images, rejected points, and chosen points, containing 2D coordinates for GUI visual grounding tasks. What prompt format is needed to correctly train the model using the DPO technique? The paper mentions that a 2D PixMo-Points dataset was used to train the model, but could you clarify the exact approach?
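
For reference, the generic shape of the problem can be sketched as below. This is only a minimal illustration of standard DPO, not the approach used for this model: the column names (`image`, `chosen_point`, `rejected_point`), the plain-text point format, and the helper names are all hypothetical, and the prompt format the model actually expects is exactly what this thread is asking about.

```python
import math

def format_point(point):
    """Render an (x, y) pair as text. The "(x, y)" format is a placeholder
    assumption, not the model's confirmed output format."""
    x, y = point
    return f"({x:.1f}, {y:.1f})"

def to_preference_record(row, instruction):
    """Map one dataset row (hypothetical column names) to the
    prompt/chosen/rejected triple that DPO trains on."""
    return {
        "image": row["image"],
        "prompt": instruction,
        "chosen": format_point(row["chosen_point"]),
        "rejected": format_point(row["rejected_point"]),
    }

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin)), where
    margin compares policy-vs-reference log-probabilities of the chosen
    and rejected responses."""
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # Numerically stable softplus(-z), which equals -log(sigmoid(z)).
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))

# Example with made-up values.
row = {"image": "screenshot.png",
       "chosen_point": (120.0, 48.5),
       "rejected_point": (300.0, 410.0)}
record = to_preference_record(row, "Point to the Submit button.")
loss = dpo_loss(-2.0, -5.0, -3.0, -4.0, beta=0.1)
```

The loss falls as the policy assigns relatively more probability to the chosen point than the frozen reference model does, and it equals log 2 when the two models agree.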

amanrangapur

Ai2 org 14 days ago

Hello @Maverick17, we are releasing a paper with complete details of the dataset, training, and evaluation shortly.

Maverick17

14 days ago

Hello @amanrangapur, does "shortly" mean by the end of this week or by the end of November? :)

I'm really looking forward to the release of the dataset and the training and eval scripts!

amanrangapur

Ai2 org 13 days ago

Hi @Maverick17, I mean the last week of November.

Maverick17

2 days ago

Hello @amanrangapur, what is the status of the data release? We are approaching the end of November :)

amanrangapur

Ai2 org 2 days ago

Hey @Maverick17, we're planning to release this week. Stay tuned.
