liuhaotian committed
Commit: 31e97e2
Parent(s): 6ba60f0
Update README.md
README.md
CHANGED
@@ -38,8 +38,8 @@ The primary use of LLaVA is research on large multimodal models and chatbots.
 The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.
 
 ## Training dataset
-
-
+558K filtered image-text pairs from LAION/CC/SBU, captioned by BLIP.
+80K GPT-generated multimodal instruction-following data.
 
 ## Evaluation dataset
 A preliminary evaluation of model quality is conducted by creating a set of 90 visual reasoning questions from 30 unique images randomly sampled from COCO val 2014; each image is paired with three types of questions: conversational, detailed description, and complex reasoning. We use GPT-4 to judge the model outputs.
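The GPT-4-as-judge protocol described in the evaluation section is straightforward to prototype. Below is a minimal sketch, not LLaVA's actual evaluation script: it sends one question, the image's reference context, and two candidate answers to GPT-4 and asks for comparative 1-10 scores. The `openai` client usage is standard; the `judge` helper, the prompt wording, and the scoring rubric are illustrative assumptions.

```python
# Minimal sketch of a GPT-4-as-judge comparison for one evaluation question.
# Assumes the `openai` Python package and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# The three question types used in the 90-question evaluation set.
QUESTION_TYPES = ("conversational", "detailed description", "complex reasoning")

def judge(question: str, context: str, answer_a: str, answer_b: str) -> str:
    """Ask GPT-4 to rate two candidate answers to the same question.

    `context` stands in for the textual description of the sampled
    COCO image; the prompt format here is illustrative, not LLaVA's.
    """
    prompt = (
        f"[Context]\n{context}\n\n"
        f"[Question]\n{question}\n\n"
        f"[Assistant 1]\n{answer_a}\n\n"
        f"[Assistant 2]\n{answer_b}\n\n"
        "Rate the helpfulness, relevance, accuracy, and level of detail "
        "of each answer. On the first line, output two scores from 1 to 10, "
        "separated by a space. Then briefly explain your judgment."
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.2,  # low temperature for more stable judging
    )
    return resp.choices[0].message.content
```

Running `judge(...)` once per question and averaging the first-line scores over the 90 questions yields a single relative-quality number per model, which is how a judge-based evaluation of this kind is typically aggregated.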