Update README.md
README.md
CHANGED
@@ -10,7 +10,7 @@ pipeline_tag: visual-question-answering
 
 <!-- Provide a quick summary of what the model is/does. -->
 
-
+Llama-3.2V-11B-cot is an early version of [LLaVA-o1](https://github.com/PKU-YuanGroup/LLaVA-o1), which is a visual language model capable of spontaneous, systematic reasoning.
 
 ## Model Details
 
@@ -19,6 +19,12 @@ This modelcard aims to be a base template for new models. It has been generated
 - **License:** apache-2.0
 - **Finetuned from model:** meta-llama/Llama-3.2-11B-Vision-Instruct
 
+## Benchmark Results
+
+| MMStar | MMBench | MMVet | MathVista | AI2D | Hallusion | Average |
+|--------|---------|-------|-----------|------|-----------|---------|
+| 57.6   | 75.0    | 60.3  | 54.8      | 85.7 | 47.8      | 63.5    |
+
 ## Reproduction
 
 <!-- This section describes the evaluation protocols and provides the results. -->
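The card lists meta-llama/Llama-3.2-11B-Vision-Instruct as the base model, so the finetuned checkpoint should load through the same transformers Mllama interface. Below is a minimal loading sketch, assuming the checkpoint is published in a transformers-compatible format; the `model_id` shown is the base model as a placeholder and would need to be swapped for the actual Llama-3.2V-11B-cot repository id.

```python
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

# Placeholder: substitute the Llama-3.2V-11B-cot checkpoint id for the base model id.
model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"

# Load the vision-language model and its processor.
model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# Build a single image + text prompt using the chat template.
image = Image.open("example.jpg")  # any local image
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image step by step."},
    ]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, return_tensors="pt").to(model.device)

# Generate and decode the model's response.
output = model.generate(**inputs, max_new_tokens=512)
print(processor.decode(output[0], skip_special_tokens=True))
```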