neulab/UIX-Qwen2-Mind2Web

oottyy committed 14 days ago

Commit cf1654a (1 parent: c945f77)

Update README.md

Files changed (1)

README.md +14 -3

@@ -1,9 +1,9 @@
 ---
 license: odc-by
 ---
-#### Model for the paper: [Harnessing Webpage UIs for Text-Rich Visual Understanding]()
-🌐 [Homepage](https://neulab.github.io/MultiUI/) | 🐍 [GitHub](https://github.com/neulab/multiui) | 📖 [arXiv]()
 ## Introduction
 We introduce **MultiUI**, a dataset containing 7.3 million samples from 1 million websites, covering diverse multimodal tasks and UI layouts. Models trained on **MultiUI** not only excel in web UI tasks, achieving up to a 48% improvement on VisualWebBench and a 19.1% boost in action accuracy on the web agent dataset Mind2Web, but also generalize surprisingly well to non-web UI tasks and even to non-UI domains, such as document understanding, OCR, and chart interpretation.
@@ -23,4 +23,15 @@ We introduce **MultiUI**, a dataset containing 7.3 million samples from 1 millio
 * Xiang Yue: [email protected]
 ## Citation
-If you find this work helpful, please cite our paper:

 ---
 license: odc-by
 ---
+#### Model for the paper: [Harnessing Webpage UIs for Text-Rich Visual Understanding](https://arxiv.org/abs/2410.13824)
+🌐 [Homepage](https://neulab.github.io/MultiUI/) | 🐍 [GitHub](https://github.com/neulab/multiui) | 📖 [arXiv](https://arxiv.org/abs/2410.13824)
 ## Introduction
 We introduce **MultiUI**, a dataset containing 7.3 million samples from 1 million websites, covering diverse multimodal tasks and UI layouts. Models trained on **MultiUI** not only excel in web UI tasks, achieving up to a 48% improvement on VisualWebBench and a 19.1% boost in action accuracy on the web agent dataset Mind2Web, but also generalize surprisingly well to non-web UI tasks and even to non-UI domains, such as document understanding, OCR, and chart interpretation.
 * Xiang Yue: [email protected]
 ## Citation
+If you find this work helpful, please cite our paper:
+````
+@misc{liu2024harnessingwebpageuistextrich,
+      title={Harnessing Webpage UIs for Text-Rich Visual Understanding},
+      author={Junpeng Liu and Tianyue Ou and Yifan Song and Yuxiao Qu and Wai Lam and Chenyan Xiong and Wenhu Chen and Graham Neubig and Xiang Yue},
+      year={2024},
+      eprint={2410.13824},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV},
+      url={https://arxiv.org/abs/2410.13824},
+}
+````