language:
  - zh
  - en
pipeline_tag: visual-question-answering
datasets:
  - Lin-Chen/ShareGPT4V
  - liuhaotian/LLaVA-Pretrain

Model

llava-qwen1.5-4b-chat is a lightweight multimodal model based on the LLaVA architecture.

Evaluation

MMBench

| Model | MMBench Test (EN) | MMBench Dev (EN) | MMBench Test (CN) | MMBench Dev (CN) | CCBench Dev |
| --- | --- | --- | --- | --- | --- |
| LLaVA-v1.5-7B | 67.7 | 69.2 | 61.0 | 59.7 | 28.4 |
| LLaVA-InternLM-7B | 69.0 | 68.5 | 66.7 | 63.8 | 37.3 |
| LLaVA-InternLM2-7B | 73.3 | 74.6 | 71.7 | 72.0 | 42.5 |
| Bunny-3B | 69.2 | 68.6 | - | - | - |
| MiniCPM-V | 64.1 | 67.9 | 62.6 | 65.3 | 41.4 |
| llava-qwen1.5-4b-chat | 69.6 | 69.2 | 68.6 | 68.3 | 41.0 |
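
For a quick side-by-side comparison, the scores reported above can be summarized in a few lines of Python. This is only an illustrative sketch: the unweighted mean over the five splits is an assumption made here for convenience, not an official MMBench metric, and the helper names `mean_score` and `rank_by_mean` are hypothetical. Models with unreported splits (Bunny-3B) are left out of the ranking rather than averaged over fewer columns.

```python
# Scores copied from the MMBench results above, in column order:
# Test (EN), Dev (EN), Test (CN), Dev (CN), CCBench Dev.
# None marks a result that is not reported.
SCORES = {
    "LLaVA-v1.5-7B": [67.7, 69.2, 61.0, 59.7, 28.4],
    "LLaVA-InternLM-7B": [69.0, 68.5, 66.7, 63.8, 37.3],
    "LLaVA-InternLM2-7B": [73.3, 74.6, 71.7, 72.0, 42.5],
    "Bunny-3B": [69.2, 68.6, None, None, None],
    "MiniCPM-V": [64.1, 67.9, 62.6, 65.3, 41.4],
    "llava-qwen1.5-4b-chat": [69.6, 69.2, 68.6, 68.3, 41.0],
}


def mean_score(values):
    """Return the unweighted mean, or None if any split is missing."""
    if any(v is None for v in values):
        return None
    return sum(values) / len(values)


def rank_by_mean(scores):
    """Rank models with complete results by mean score, highest first.

    Assumption: an unweighted mean across splits is used purely for
    illustration; it is not an official benchmark aggregate.
    """
    means = {name: mean_score(vals) for name, vals in scores.items()}
    complete = {name: m for name, m in means.items() if m is not None}
    return sorted(complete.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, mean in rank_by_mean(SCORES):
        print(f"{name}: {mean:.2f}")
```

Skipping incomplete rows keeps the comparison like-for-like; averaging Bunny-3B over its two English splits alone would overstate it relative to models that also report the harder Chinese and CCBench splits.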

Uses

TBD

Training Details

TBD