InternVL 1.0 - an OpenGVLab Collection

updated about 10 hours ago

Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks

InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks

Paper • 2312.14238 • Published Dec 21, 2023 • 14

Note CVPR 2024, Oral
OpenGVLab/InternViT-6B-224px

Image Feature Extraction • Updated Aug 23 • 1.08k • 19
OpenGVLab/InternVL-14B-224px

Image Feature Extraction • Updated Aug 23 • 1.46k • 33
OpenGVLab/InternVL-Chat-V1-2-Plus

Image-Text-to-Text • Updated Sep 24 • 1.19k • 33

Note Released at 2024.02.21 | 40B parameters | More SFT data and stronger.
OpenGVLab/InternVL-Chat-V1-2

Image-Text-to-Text • Updated Sep 24 • 357 • 17

Note Released at 2024.02.11 | 40B parameters | scales up the LLM to 34B.
OpenGVLab/InternVL-Chat-V1-1

Image-Text-to-Text • Updated Sep 24 • 521 • 12

Note Released at 2024.01.24 | 19B parameters | supports Chinese and stronger OCR
OpenGVLab/InternViT-6B-448px-V1-2

Image Feature Extraction • Updated Aug 23 • 246 • 25

Note Released at 2024.02.11 | Vision Foundation Model | 448 resolution
OpenGVLab/InternViT-6B-448px-V1-0

Image Feature Extraction • Updated Aug 23 • 23 • 8

Note Released at 2024.01.30 | Vision Foundation Model | 448 resolution
OpenGVLab/InternVL-14B-Flickr30K-FT-364px

Feature Extraction • Updated Aug 24 • 7 • 6
OpenGVLab/InternVL-14B-FlickrCN-FT-364px

Updated Aug 24 • 5 • 3
OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B

Visual Question Answering • Updated Aug 24 • 126 • 8
OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B

Visual Question Answering • Updated Aug 24 • 162 • 7
OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B-448px

Visual Question Answering • Updated Aug 24 • 19 • 3
OpenGVLab/InternVL

Updated 28 days ago • 21