Commit 25b996a by DongfuJiang (1 parent: 81bec39)
Update README.md

README.md CHANGED
@@ -29,7 +29,7 @@ It's fine-tuned on [Mantis-Instruct](https://huggingface.co/datasets/TIGER-Lab/M
 
 ## Summary
 
-- Mantis-Idefics2 is LMM with **interleaved text and image as inputs**, trained on Mantis-Instruct under academic-level resources (i.e. 36 hours on 16xA100-40G).
+- Mantis-Idefics2 is an LMM with **interleaved text and image as inputs**, trained on Mantis-Instruct under academic-level resources (i.e. 36 hours on 16xA100-40G).
 - Mantis is trained to have multi-image skills including co-reference, reasoning, comparing, temporal understanding.
 - Mantis reaches the state-of-the-art performance on five multi-image benchmarks (NLVR2, Q-Bench, BLINK, MVBench, Mantis-Eval), and also maintain a strong single-image performance on par with CogVLM and Emu2.
 