Update README.md
---
language:
- zh
- en
pipeline_tag: visual-question-answering
datasets:
- Lin-Chen/ShareGPT4V
- liuhaotian/LLaVA-Pretrain
---

# Model

llava-qwen1.5-4b-chat is a lightweight multimodal model based on the [LLaVA architecture](https://llava-vl.github.io/).
- Language Model: [Qwen/Qwen1.5-4B-Chat](https://huggingface.co/Qwen/Qwen1.5-4B-Chat)
- Vision Encoder: [google/siglip-so400m-patch14-384](https://huggingface.co/google/siglip-so400m-patch14-384)
- Total Parameters: 4,388,102,720
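The total-parameter figure above is the sum of the element counts of every parameter tensor in the model. The helper below is a common PyTorch idiom for computing it, not code from this repository; it is demonstrated on a toy module, since loading the full 4B-parameter checkpoint is heavyweight.

```python
import torch.nn as nn

def count_parameters(model: nn.Module) -> int:
    # Sum of element counts over all parameter tensors in the module tree.
    return sum(p.numel() for p in model.parameters())

# Toy stand-in: one linear layer with a 4x3 weight and a 4-element bias,
# i.e. 12 + 4 = 16 parameters. Applied to the real checkpoint, the same
# sum is what a "Total Parameters" figure reports.
toy = nn.Linear(3, 4)
print(count_parameters(toy))  # 16
```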

## Evaluation
### MMBench