Fairseq -> Transformers conversion
Thanks for your contribution. Actually, I intended to upload the fairseq version of the caption checkpoint, as users reported that it is hard to download from the Aliyun OSS. I'll upload a new one for Transformers directly, and this one will be marked as the Fairseq version.
Thanks for your OFA checkpoints. With my own inference code, the original checkpoints in ofa-large-caption (https://huggingface.co/OFA-Sys/ofa-large-caption) achieve a lower CIDEr of about 130, but with your checkpoints converted from fairseq, the performance is as expected, with a CIDEr of 146. This means that both the Transformers OFA model code and my own inference code are correct; it looks like the original checkpoints just have some minor issues.
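For context, my inference code follows the pattern from the model card. Here is a minimal sketch, assuming the OFA-Sys fork of Transformers (which provides `OFATokenizer` and `OFAModel`) and a local copy of the converted checkpoint in `ckpt_dir`:

```python
import torch
from PIL import Image
from torchvision import transforms
from transformers import OFATokenizer, OFAModel  # OFA-Sys fork of Transformers

ckpt_dir = "./ofa-large-caption"  # local path to the converted checkpoint (assumption)

# Preprocessing per the ofa-large-caption model card: 480x480, 0.5 mean/std
mean, std = [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]
resolution = 480
patch_resize_transform = transforms.Compose([
    lambda image: image.convert("RGB"),
    transforms.Resize((resolution, resolution), interpolation=Image.BICUBIC),
    transforms.ToTensor(),
    transforms.Normalize(mean=mean, std=std),
])

tokenizer = OFATokenizer.from_pretrained(ckpt_dir)
model = OFAModel.from_pretrained(ckpt_dir, use_cache=False)

txt = " what does the image describe?"
inputs = tokenizer([txt], return_tensors="pt").input_ids
patch_img = patch_resize_transform(Image.open("example.jpg")).unsqueeze(0)

gen = model.generate(inputs, patch_images=patch_img, num_beams=5, no_repeat_ngram_size=3)
print(tokenizer.batch_decode(gen, skip_special_tokens=True))
```

The generated captions are then scored against the reference captions with the standard CIDEr evaluation.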
Therefore, I would like to ask if it is possible to provide checkpoints converted from fairseq for the other Transformers OFA models as well? For example, the pretrained OFA (https://huggingface.co/OFA-Sys/ofa-large); this would be of great benefit for fine-tuning our own models. Or perhaps the code for converting from fairseq to Transformers?
Thanks a lot!
Hi @cckevinn , here's the code I used for this conversion: https://colab.research.google.com/drive/1LLJewY92LXdeug5m_ceMUHdlqrRQwSQJ?usp=sharing
I can also share a GitHub repo with links to converted pretrained models later this week. I'm also working on sample code for fine-tuning pretrained models directly in Transformers.
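For anyone who just wants the general shape of the conversion before the repo is up: it boils down to loading the fairseq state dict, renaming parameters, and saving through `save_pretrained`. A skeleton sketch, with the rename table left as a placeholder (the real mapping is in the notebook) and `OFAModel` coming from the OFA-Sys fork of Transformers:

```python
import torch
from transformers import OFAModel  # OFA-Sys fork of Transformers (assumption)

# fairseq checkpoints keep the actual weights under the "model" key
fs_state = torch.load("caption_large_best_clean.pt", map_location="cpu")["model"]

# The real fairseq -> Transformers parameter-name mapping lives in the notebook;
# this table is a placeholder to fill in.
RENAMES = {
    # "fairseq.module.prefix": "transformers.module.prefix",
}

def rename(key: str) -> str:
    for old, new in RENAMES.items():
        if key.startswith(old):
            return new + key[len(old):]
    return key

hf_state = {rename(k): v for k, v in fs_state.items()}

# Instantiate the target architecture (its weights get overwritten) and save in HF format
model = OFAModel.from_pretrained("OFA-Sys/ofa-large-caption")
missing, unexpected = model.load_state_dict(hf_state, strict=False)
print("still missing:", missing)  # any keys the rename table didn't cover
print("unexpected:", unexpected)
model.save_pretrained("./ofa-large-caption-converted")
```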
@mys Thank you for sharing your awesome work.
Will the fine-tuning samples include visual grounding?
I would be interested to benchmark OFA vs Donut on the UI RefExp task. Here is my work in progress with Donut:
https://huggingface.co/spaces/ivelin/ui-refexp
I know visual grounding is pre-trained on the RefCOCO family, which mostly contains physical objects, while UI RefExp is primarily RICO Android mobile app screenshots. Nevertheless, I am curious how quickly OFA can transfer-learn on RICO RefExp and what performance it can ultimately reach. Happy to share my results, as I have with Donut.
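In case it is useful for the comparison: my understanding from the OFA paper is that grounding is cast as plain seq2seq over quantized coordinate tokens, so adapting it to RICO should mostly be a data-formatting exercise. A quick sketch of that formatting; the prompt wording and the 1000-bin location vocabulary are my assumptions from the paper, worth double-checking against the repo:

```python
NUM_BINS = 1000  # OFA quantizes coordinates into <bin_0>..<bin_999> tokens (per the paper)

def box_to_bins(x1, y1, x2, y2, width, height):
    """Quantize an absolute pixel box into OFA-style location tokens."""
    def bin_tok(v, size):
        idx = min(int(v / size * NUM_BINS), NUM_BINS - 1)
        return f"<bin_{idx}>"
    return " ".join([
        bin_tok(x1, width), bin_tok(y1, height),
        bin_tok(x2, width), bin_tok(y2, height),
    ])

# One UI RefExp training pair: the referring expression goes into the prompt,
# the quantized bounding box becomes the target sequence.
expr = "the search button in the top bar"
src = f' which region does the text " {expr} " describe?'
tgt = box_to_bins(412, 36, 498, 88, width=1440, height=2560)
print(src)
print(tgt)  # <bin_286> <bin_14> <bin_345> <bin_34>
```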