teowu committed
Commit 42d289d
1 Parent(s): a6fe4d4

Update README.md

Files changed (1):
  1. README.md +9 -100

README.md CHANGED
@@ -7,107 +7,16 @@ sdk: static
 pinned: false
 ---
 
-<div align="center">
-
-<div align="center">
-<a href="https://huggingface.co/spaces/teowu/Q-Instruct-on-mPLUG-Owl-2"><img src="https://huggingface.co/datasets/huggingface/badges/raw/main/open-in-hf-spaces-sm-dark.svg" alt="Open in Spaces"></a>
-<a href="https://arxiv.org/abs/2311.06783"><img src="https://img.shields.io/badge/Arxiv-2311:06783-red"></a>
-<a href="https://huggingface.co/datasets/teowu/Q-Instruct"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Q%20Instruct-Dataset-green"></a>
-</div>
-
-<h1>Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models</h1>
-
-
-
-<div>
-<a href="https://teowu.github.io/" target="_blank">Haoning Wu</a><sup>1</sup><sup>*</sup>,
-<a href="https://github.com/zzc-1998" target="_blank">Zicheng Zhang</a><sup>2</sup><sup>*</sup>,
-<a href="https://github.com/ZhangErliCarl/" target="_blank">Erli Zhang</a><sup>1</sup><sup>*</sup>,
-<a href="https://chaofengc.github.io" target="_blank">Chaofeng Chen</a><sup>1</sup>,
-<a href="https://liaoliang92.github.io" target="_blank">Liang Liao</a><sup>1</sup>,
-<a href="https://github.com/AnnanWangDaniel" target="_blank">Annan Wang</a><sup>1</sup>,
-<a href="https://scholar.google.com/citations?user=NBIqaHQAAAAJ&hl=en" target="_blank">Kaixin Xu</a><sup>4</sup>,
-</div>
-
-<div>
-<a href="https://github.com/lcysyzxdxc" target="_blank">Chunyi Li</a><sup>2</sup>,
-<a href="https://scholar.google.com.sg/citations?user=NlNOyiQAAAAJ&hl=en" target="_blank">Jingwen Hou</a><sup>1</sup>,
-<a href="https://ee.sjtu.edu.cn/en/FacultyDetail.aspx?id=24&infoid=153&flag=153" target="_blank">Guangtao Zhai</a><sup>2</sup>,
-<a href="https://scholar.google.com/citations?user=ZYVZ1bgAAAAJ&hl=en" target="_blank">Geng Xue</a><sup>4</sup>,
-<a href="https://wenxiusun.com" target="_blank">Wenxiu Sun</a><sup>3</sup>,
-<a href="https://scholar.google.com/citations?user=uT9CtPYAAAAJ&hl=en" target="_blank">Qiong Yan</a><sup>3</sup>,
-<a href="https://personal.ntu.edu.sg/wslin/Home.html" target="_blank">Weisi Lin</a><sup>1</sup><sup>#</sup>
-</div>
-<div>
-<sup>1</sup>Nanyang Technological University, <sup>2</sup>Shanghai Jiaotong University, <sup>3</sup>Sensetime Research, <sup>4</sup>I2R@A*STAR
-</div>
-<div>
-<sup>*</sup>Equal contribution. <sup>#</sup>Corresponding author.
-</div>
-<div>
-<a href="https://HuggingFace.co/datasets/teowu/Q-Instruct"><strong>Dataset</strong></a> | <a href="https://github.com/Q-Future/Q-Instruct/tree/main/model_zoo"><strong>Model Zoo</strong></a> | <a href="https://github.com/Q-Future/Q-Instruct/tree/main/fig/Q_Instruct_v0_1_preview.pdf"><strong>Paper (Preview)</strong></a> | <a href="https://huggingface.co/spaces/teowu/Q-Instruct-on-mPLUG-Owl-2"><strong>Demo (Hugging Face)</strong></a>
-</div>
-
-
-<div style="width: 100%; text-align: center; margin:auto;">
-<img style="width:100%" src="https://raw.githubusercontent.com/Q-Future/Q-Instruct/main/new_q_instruct.png">
-</div>
-</div>
-
-<div align="center">
-<h1>Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels</h1>
-
-*One Unified Model for Image Quality Assessment (IQA), Image Aesthetic Assessment (IAA), and Video Quality Assessment (VQA).*
-
-<div>
-<a href="https://teowu.github.io/" target="_blank">Haoning Wu</a><sup>1</sup><sup>*</sup><sup>+</sup>,
-<a href="https://github.com/zzc-1998" target="_blank">Zicheng Zhang</a><sup>2</sup><sup>*</sup>,
-<a href="https://sites.google.com/view/r-panda" target="_blank">Weixia Zhang</a><sup>2</sup>,
-<a href="https://chaofengc.github.io" target="_blank">Chaofeng Chen</a><sup>1</sup>,
-<a href="https://liaoliang92.github.io" target="_blank">Liang Liao</a><sup>1</sup>,
-<a href="https://github.com/lcysyzxdxc" target="_blank">Chunyi Li</a><sup>2</sup>,
-
-</div>
-
-
-<div>
-<a href="https://github.com/YixuanGao98" target="_blank">Yixuan Gao</a><sup>2</sup>,
-<a href="https://github.com/AnnanWangDaniel" target="_blank">Annan Wang</a><sup>1</sup>,
-<a href="https://github.com/ZhangErliCarl/" target="_blank">Erli Zhang</a><sup>1</sup>,
-<a href="https://wenxiusun.com" target="_blank">Wenxiu Sun</a><sup>3</sup>,
-<a href="https://scholar.google.com/citations?user=uT9CtPYAAAAJ&hl=en" target="_blank">Qiong Yan</a><sup>3</sup>,
-<a href="https://sites.google.com/site/minxiongkuo/" target="_blank">Xiongkuo Min</a><sup>2</sup>,
-<a href="https://ee.sjtu.edu.cn/en/FacultyDetail.aspx?id=24&infoid=153&flag=153" target="_blank">Guangtao Zhai</a><sup>2</sup><sup>#</sup>,
-<a href="https://personal.ntu.edu.sg/wslin/Home.html" target="_blank">Weisi Lin</a><sup>1</sup><sup>#</sup>
-</div>
-<div>
-<sup>1</sup>Nanyang Technological University, <sup>2</sup>Shanghai Jiao Tong University, <sup>3</sup>Sensetime Research
-</div>
-<div>
-<sup>*</sup>Equal contribution. <sup>+</sup>Project Lead. <sup>#</sup>Corresponding author(s).
-</div>
-
-<div>
-<a href="https://HuggingFace.co/q-future/one-align"><strong>One Align</strong></a> | <a href="https://github.com/Q-Future/Q-Align/tree/main/model_zoo"><strong>Model Zoo</strong></a> | <a href="xx"><strong>Technical Report (Coming Soon)</strong></a>
-</div>
-
-
-<h2>Results</h2>
-<div style="width: 75%; text-align: center; margin:auto;">
-<img style="width: 75%" src="https://raw.githubusercontent.com/Q-Future/Q-Align/main/fig/radar.png">
-</div>
-
-<h2>Syllabus</h2>
-
-<div style="width: 100%; text-align: center; margin:auto;">
-<img style="width: 100%" src="https://raw.githubusercontent.com/Q-Future/Q-Align/main/fig/q-align-syllabus.png">
-</div>
-
-<h2>Structure</h2>
-
-<div style="width: 75%; text-align: center; margin:auto;">
-<img style="width: 75%" src="https://raw.githubusercontent.com/Q-Future/Q-Align/main/fig/structure.png">
-</div>
-
-</div>
-
+Our spaces:
+
+HF Spaces maintained by our group (many thanks for the research GPU grants!):
+
+- **Q-Align** (*Most Powerful Visual Scorer*): <a href="https://huggingface.co/spaces/teowu/OneScorer"><img src="https://huggingface.co/datasets/huggingface/badges/raw/main/open-in-hf-spaces-sm-dark.svg" alt="Open in Hugging Face Spaces"></a>
+- **Q-Instruct** (*Low-level Vision-Language Assistant/Chatbot; supports 1-4 images*): <a href="https://huggingface.co/spaces/teowu/Q-Instruct-v1"><img src="https://huggingface.co/datasets/huggingface/badges/raw/main/open-in-hf-spaces-sm-dark.svg" alt="Open in Hugging Face Spaces"></a>
+
+Corresponding models:
+- `q-future/one-align`: AutoModel for visual scoring. Trained on a mixture of existing datasets; see [GitHub](https://github.com/Q-Future/Q-Align) for details and the usage sketch after this diff.
+- `q-future/co-instruct-preview`: AutoModel for low-level visual dialog (description, comparison, question answering). Trained on the new, scaled-up 480K Q-Instruct dataset (*to be released soon!*).
+- `q-future/q-instruct-mplug-owl2-1031`: older version of Q-Instruct, as reported in the [**paper**](https://q-future.github.io/Q-Instruct/fig/Q_Instruct_v0_1_preview.pdf). Trained on the **released** Q-Instruct-200K dataset.
+
+*Though other model variants have been released so the community can replicate our results, please use the models listed above, as they have proven to perform more stably.*
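
For reference, these checkpoints load through the standard `transformers` AutoModel path with `trust_remote_code=True`. Below is a minimal sketch for scoring an image with `q-future/one-align`; the `score` helper and its `task_`/`input_` arguments follow the model card on the Q-Align GitHub, so treat the exact interface as an assumption and verify it there before use:

```python
# Minimal sketch (not an official snippet): score one image with one-align.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "q-future/one-align",
    trust_remote_code=True,    # custom scoring code ships inside the model repo
    torch_dtype=torch.float16,
    device_map="auto",
)

# `score` is exposed by the repo's remote code (per the model card);
# task_ selects e.g. quality vs. aesthetics, input_ selects image vs. video.
quality = model.score(
    [Image.open("example.jpg")],  # "example.jpg" is a placeholder path
    task_="quality",
    input_="image",
)
print(quality)  # higher values indicate better predicted quality
```

The same loading pattern should apply to the other `q-future` AutoModels listed above, with interfaces as documented in their respective model cards.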