Update README.md
README.md
CHANGED
@@ -1,9 +1,9 @@
 ---
 license: odc-by
 ---
-#### Model for the paper: [Harnessing Webpage UIs for Text-Rich Visual Understanding]()
+#### Model for the paper: [Harnessing Webpage UIs for Text-Rich Visual Understanding](https://arxiv.org/abs/2410.13824)
 
-🌐 [Homepage](https://neulab.github.io/MultiUI/) | 🐍 [GitHub](https://github.com/neulab/multiui) | 📖 [arXiv]()
+🌐 [Homepage](https://neulab.github.io/MultiUI/) | 🐍 [GitHub](https://github.com/neulab/multiui) | 📖 [arXiv](https://arxiv.org/abs/2410.13824)
 
 ## Introduction
 We introduce **MultiUI**, a dataset containing 7.3 million samples from 1 million websites, covering diverse multimodal tasks and UI layouts. Models trained on **MultiUI** not only excel in web UI tasks (achieving up to a 48% improvement on VisualWebBench and a 19.1% boost in action accuracy on the web agent dataset Mind2Web) but also generalize surprisingly well to non-web UI tasks and even to non-UI domains, such as document understanding, OCR, and chart interpretation.
@@ -23,4 +23,15 @@ We introduce **MultiUI**, a dataset containing 7.3 million samples from 1 millio
 * Xiang Yue: [email protected]
 
 ## Citation
-If you find this work helpful, please cite our paper:
+If you find this work helpful, please cite our paper:
+````
+@misc{liu2024harnessingwebpageuistextrich,
+      title={Harnessing Webpage UIs for Text-Rich Visual Understanding},
+      author={Junpeng Liu and Tianyue Ou and Yifan Song and Yuxiao Qu and Wai Lam and Chenyan Xiong and Wenhu Chen and Graham Neubig and Xiang Yue},
+      year={2024},
+      eprint={2410.13824},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV},
+      url={https://arxiv.org/abs/2410.13824},
+}
+````