---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-0.5B-Instruct
- facebook/dinov2-small
pipeline_tag: visual-question-answering
---

Pretrain stage only, 4630 epochs

# Introduction

We use the powerful [TinyLLaVA Factory](https://github.com/TinyLLaVA/TinyLLaVA_Factory) to create a super small image-text-to-text model. The goal is to make it possible to run LLaVA models on edge devices (with a few gigabytes of memory).

For the LLM and vision tower, we choose [OpenELM-270M-Instruct](apple/OpenELM-270M-Instruct) and [facebook/dinov2-small](facebook/dinov2-small), respectively.

[POPE](https://tinyllava-factory.readthedocs.io/en/latest/Evaluation.html#pope):

| Category    | # Samples | TP   | FP   | TN  | FN  | Accuracy | Precision | Recall | F1 Score | Yes Ratio |
|-------------|-----------|------|------|-----|-----|----------|-----------|--------|----------|-----------|
| Adversarial | 3000      | 1312 | 1250 | 250 | 188 | 0.521    | 0.512     | 0.875  | 0.646    | 0.854     |
| Popular     | 3000      | 1312 | 1236 | 264 | 188 | 0.525    | 0.515     | 0.875  | 0.648    | 0.849     |
| Random      | 2910      | 1312 | 1185 | 225 | 188 | 0.528    | 0.525     | 0.875  | 0.656    | 0.858     |

[TEXTVQA](https://tinyllava-factory.readthedocs.io/en/latest/Evaluation.html#textvqa)

Samples 5000, Accuracy 0% (:-|)

[SCIENCEQA](https://tinyllava-factory.readthedocs.io/en/latest/Evaluation.html#scienceqa)

Samples 4241, Correct: -, Accuracy: -%, IMG-Accuracy: -%

[MMMU](https://tinyllava-factory.readthedocs.io/en/latest/Evaluation.html#mmmu)

| Category                         | # Samples | Accuracy |
|----------------------------------|-----------|----------|
| Overall                          | 900       | 0.280    |
| Overall-Art and Design           | 120       | 0.208    |
| Art                              | 30        | 0.167    |
| Art Theory                       | 30        | 0.200    |
| Design                           | 30        | 0.367    |
| Music                            | 30        | 0.100    |
| Overall-Business                 | 150       | 0.213    |
| Accounting                       | 30        | 0.100    |
| Economics                        | 30        | 0.367    |
| Finance                          | 30        | 0.200    |
| Management                       | 30        | 0.233    |
| Marketing                        | 30        | 0.167    |
| Overall-Science                  | 150       | 0.300    |
| Biology                          | 30        | 0.300    |
| Chemistry                        | 30        | 0.133    |
| Geography                        | 30        | 0.300    |
| Math                             | 30        | 0.333    |
| Physics                          | 30        | 0.433    |
| Overall-Health and Medicine      | 150       | 0.340    |
| Basic Medical Science            | 30        | 0.300    |
| Clinical Medicine                | 30        | 0.133    |
| Diagnostics and Laboratory Med.  | 30        | 0.333    |
| Pharmacy                         | 30        | 0.400    |
| Public Health                    | 30        | 0.533    |
| Overall-Humanities and Soc. Sci. | 120       | 0.342    |
| History                          | 30        | 0.300    |
| Literature                       | 30        | 0.567    |
| Sociology                        | 30        | 0.233    |
| Psychology                       | 30        | 0.267    |
| Overall-Tech and Engineering     | 210       | 0.276    |
| Agriculture                      | 30        | 0.300    |
| Architecture and Engineering     | 30        | 0.200    |
| Computer Science                 | 30        | 0.367    |
| Electronics                      | 30        | 0.200    |
| Energy and Power                 | 30        | 0.367    |
| Materials                        | 30        | 0.233    |
| Mechanical Engineering           | 30        | 0.267    |
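
The derived columns of the POPE table above (Accuracy, Precision, Recall, F1 Score, Yes Ratio) follow from the four confusion counts. As a minimal sketch, assuming the standard binary-classification definitions with "yes" as the positive class, they can be recomputed as shown below. The helper name `pope_metrics` is hypothetical and is not part of TinyLLaVA Factory.

```python
def pope_metrics(tp, fp, tn, fn):
    """Recompute POPE-style metrics from confusion counts.

    Assumes "yes" is the positive class, so the yes ratio is the
    fraction of all answers that were "yes" (TP + FP).
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "samples": total,
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "yes_ratio": (tp + fp) / total,
    }


# Confusion counts (TP, FP, TN, FN) copied from the POPE table.
rows = {
    "Adversarial": (1312, 1250, 250, 188),
    "Popular": (1312, 1236, 264, 188),
    "Random": (1312, 1185, 225, 188),
}

for name, counts in rows.items():
    metrics = pope_metrics(*counts)
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

Read this way, the yes ratio of roughly 0.85 in every split shows that the model answers "yes" to most questions, which is why recall is high while accuracy stays close to 0.5.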