Weiyun1025 committed
Commit 19fbddb • Parent: 18824f4

Upload README.md with huggingface_hub

Files changed (1)
1. README.md +5 −9
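The commit message says the file was pushed with the `huggingface_hub` client. For context, a minimal sketch of how such an upload is typically performed with that client; the `repo_id` is inferred from the model card and the prior `huggingface-cli login` authentication is an assumption, not something recorded in this commit:

```python
from huggingface_hub import HfApi

# Sketch of an upload like the one recorded by this commit.
# repo_id is inferred from the model card (assumption); authentication
# via `huggingface-cli login` is assumed to have happened beforehand.
api = HfApi()
api.upload_file(
    path_or_fileobj="README.md",            # local file to push
    path_in_repo="README.md",               # destination path in the repo
    repo_id="OpenGVLab/InternVL2-8B-MPO",   # assumed target model repo
    repo_type="model",
    commit_message="Upload README.md with huggingface_hub",
)
```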
README.md CHANGED

@@ -16,24 +16,20 @@ tags:
  ---
  # InternVL2-8B-MPO

- [\[📂 GitHub\]](https://github.com/OpenGVLab/InternVL) [\[🆕 Blog\]](https://internvl.github.io/blog/2024-11-14-InternVL-2.0-MPO/) [\[📜 Paper\]](https://internvl.github.io/blog/2024-11-14-InternVL-2.0-MPO/) [\[📖 Documents\]](https://internvl.readthedocs.io/en/latest/internvl2.0/preference_optimization.html)
+ [\[📂 GitHub\]](https://github.com/OpenGVLab/InternVL/tree/main/internvl_chat/shell/internvl2.0_mpo) [\[🆕 Blog\]](https://internvl.github.io/blog/2024-11-14-InternVL-2.0-MPO/) [\[📜 Paper\]](https://internvl.github.io/blog/2024-11-14-InternVL-2.0-MPO/) [\[📖 Documents\]](https://internvl.readthedocs.io/en/latest/internvl2.0/preference_optimization.html)

  [切换至中文版](#简介)

- ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/619507e7b74b6c591f794340/ZNSqJ_rYzNUGk6rJVjEfq.jpeg)
+ ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/619507e7b74b6c591f794340/sy8aVC1Y5wtAjG-OQzrDI.jpeg)

  ## Introduction

  Existing open-source multimodal large language models (MLLMs) generally follow a training process involving pre-training and supervised fine-tuning. However, these models suffer from distribution shifts, which limit their multimodal reasoning, particularly in Chain-of-Thought (CoT) performance.

- To address this, we introduce a preference optimization (PO) process to enhance the multimodal reasoning capabilities of MLLMs.
- Specifically,
- (1) on the data side, we design an automated preference data construction pipeline to create [MMPR](https://huggingface.co/datasets/OpenGVLab/MMPR), a high-quality, large-scale multimodal reasoning preference dataset;
- and (2) on the model side, we explore integrating PO with MLLMs, developing a simple yet effective method, termed Mixed Preference Optimization (MPO), that boosts multimodal CoT performance.
+ To address this, we introduce a preference optimization (PO) process to enhance the multimodal reasoning capabilities of MLLMs. Specifically, (1) on the data side, we design an automated preference data construction pipeline to create [MMPR](https://huggingface.co/datasets/OpenGVLab/MMPR), a high-quality, large-scale multimodal reasoning preference dataset; and (2) on the model side, we explore integrating PO with MLLMs, developing a simple yet effective method, termed Mixed Preference Optimization (MPO), which boosts multimodal CoT performance.
+
+ Our approach demonstrates improved performance across multiple benchmarks, particularly in multimodal reasoning tasks. Notably, our model, [InternVL2-8B-MPO](https://huggingface.co/OpenGVLab/InternVL2-8B), achieves an accuracy of 67.0 on MathVista, outperforming InternVL2-8B by 8.7 points and achieving performance comparable to the 10$\times$ larger InternVL2-76B. We hope this study will inspire further advancements in MLLMs.

- Our approach demonstrates improved performance across multiple benchmarks, particularly in multimodal reasoning tasks.
- Notably, our model, [InternVL2-8B-MPO](https://huggingface.co/OpenGVLab/InternVL2-8B), achieves an accuracy of 67.0 on MathVista, outperforming InternVL2-8B by 8.7 points and achieving performance comparable to the 10$\times$ larger InternVL2-76B.
- We hope this study could inspire further advancements in MLLMs.

  ## Model Details
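The diff names Mixed Preference Optimization (MPO) without defining it. As a reading aid, here is a sketch of the objective as described in the linked blog post: a weighted sum of a DPO-style preference loss, a BCO-style quality loss, and an SFT-style generation loss. The weight symbols $w_p, w_q, w_g$ and the exact form of the quality term are assumptions based on that write-up, not on this commit:

$$
\mathcal{L}_{\text{MPO}} = w_{p}\,\mathcal{L}_{p} + w_{q}\,\mathcal{L}_{q} + w_{g}\,\mathcal{L}_{g},
$$

where, for a prompt $x$ with chosen response $y_c$, rejected response $y_r$, policy $\pi_\theta$, and frozen reference $\pi_0$:

$$
\mathcal{L}_{p} = -\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_c \mid x)}{\pi_0(y_c \mid x)} - \beta \log \frac{\pi_\theta(y_r \mid x)}{\pi_0(y_r \mid x)}\right),
\qquad
\mathcal{L}_{g} = -\log \pi_\theta(y_c \mid x),
$$

and $\mathcal{L}_{q}$ scores each response on its own (BCO-style), e.g. $-\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_c \mid x)}{\pi_0(y_c \mid x)} - \delta\right)$ for the chosen response, with $\delta$ a reward-shift term.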
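Since the results paragraph points readers at the InternVL2-8B-MPO checkpoint, a minimal loading sketch in the style of the usual InternVL2 quick start may also help; the dtype and device choices here are assumptions, and the full chat example lives in the repo's README:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Minimal sketch following the standard InternVL2 quick-start pattern.
path = "OpenGVLab/InternVL2-8B-MPO"
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,  # assumed; InternVL2 examples commonly use bf16
    low_cpu_mem_usage=True,
    trust_remote_code=True,      # InternVL2 ships custom modeling code
).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True, use_fast=False)
```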