---
license: cc-by-4.0
tags:
- yahoo-open-source-software-incubator
- image-to-text
- image-captioning
inference: false
---

# Object Relation Transformer

The Object Relation Transformer is a Transformer-based image captioning model.
You can find more details about the model in our [NeurIPS 2019
paper](https://papers.nips.cc/paper/9293-image-captioning-transforming-objects-into-words.pdf).

This model repository contains two variants of the Object Relation
Transformer, as well as a couple of baseline models. For details on all
of these models, see the [README of our GitHub
repository](https://github.com/yahoo/object_relation_transformer?tab=readme-ov-file#model-zoo-and-results).

## Citation

If you find these models useful, please consider citing our paper (no obligation at all):

```bibtex
@article{herdade2019image,
  title={Image Captioning: Transforming Objects into Words},
  author={Herdade, Simao and Kappeler, Armin and Boakye, Kofi and Soares, Joao},
  journal={arXiv preprint arXiv:1906.05963},
  year={2019}
}
```

## Maintainers

- Joao Soares: jvbsoares@yahooinc.com

## License

The contents of this repository are (c) by Verizon Media.

The contents of this repository are licensed under a Creative Commons
Attribution 4.0 International License.

You should have received a copy of the license along with this
work. If not, see <https://creativecommons.org/licenses/by/4.0/>.