Safetensors
qwen2
oottyy committed on
Commit
cf1654a
1 Parent(s): c945f77

Update README.md

  1. README.md +14 -3
README.md CHANGED
@@ -1,9 +1,9 @@
  ---
  license: odc-by
  ---
- #### Model for the paper: [Harnessing Webpage Uis For Text Rich Visual Understanding]()
+ #### Model for the paper: [Harnessing Webpage Uis For Text Rich Visual Understanding](https://arxiv.org/abs/2410.13824)
 
- 🌐 [Homepage](https://neulab.github.io/MultiUI/) | 🐍 [GitHub](https://github.com/neulab/multiui) | 📖 [arXiv]()
+ 🌐 [Homepage](https://neulab.github.io/MultiUI/) | 🐍 [GitHub](https://github.com/neulab/multiui) | 📖 [arXiv](https://arxiv.org/abs/2410.13824)
 
  ## Introduction
  We introduce **MultiUI**, a dataset containing 7.3 million samples from 1 million websites, covering diverse multi- modal tasks and UI layouts. Models trained on **MultiUI** not only excel in web UI tasks—achieving up to a 48% improvement on VisualWebBench and a 19.1% boost in action accuracy on a web agent dataset Mind2Web—but also generalize surprisingly well to non-web UI tasks and even to non-UI domains, such as document understanding, OCR, and chart interpretation.
@@ -23,4 +23,15 @@ We introduce **MultiUI**, a dataset containing 7.3 million samples from 1 millio
  * Xiang Yue: [email protected]
 
  ## Citation
- If you find this work helpful, please cite out paper:
+ If you find this work helpful, please cite out paper:
+ ````
+ @misc{liu2024harnessingwebpageuistextrich,
+ title={Harnessing Webpage UIs for Text-Rich Visual Understanding},
+ author={Junpeng Liu and Tianyue Ou and Yifan Song and Yuxiao Qu and Wai Lam and Chenyan Xiong and Wenhu Chen and Graham Neubig and Xiang Yue},
+ year={2024},
+ eprint={2410.13824},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV},
+ url={https://arxiv.org/abs/2410.13824},
+ }
+ ````