jiajunlong committed 59d9f93 (parent: 26aaf1c): Update README.md

README.md, as updated:
[![arXiv](https://img.shields.io/badge/Arxiv-2402.14289-b31b1b.svg?logo=arXiv)](https://arxiv.org/abs/2402.14289)[![Github](https://img.shields.io/badge/Github-Github-blue.svg)](https://github.com/TinyLLaVA/TinyLLaVA_Factory)[![Demo](https://img.shields.io/badge/Demo-Demo-red.svg)](http://8843843nmph5.vicp.fun/#/)

TinyLLaVA has released a family of small-scale Large Multimodal Models (LMMs) ranging from 0.55B to 3.1B parameters. Our best model, TinyLLaVA-Phi-2-SigLIP-3.1B, achieves better overall performance than existing 7B models such as LLaVA-1.5 and Qwen-VL.

### TinyLLaVA

Here we introduce TinyLLaVA-OpenELM-450M-CLIP-0.55B, trained with the [TinyLLaVA Factory](https://github.com/TinyLLaVA/TinyLLaVA_Factory) codebase. For the LLM and vision tower, we choose [OpenELM-450M-Instruct](https://huggingface.co/apple/OpenELM-450M-Instruct) and [clip-vit-base-patch16](https://huggingface.co/openai/clip-vit-base-patch16), respectively. The model was trained on the [LLaVA](https://github.com/haotian-liu/LLaVA/blob/main/docs/Data.md) dataset.
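
As a quick sanity check on the 0.55B total (not from the model card; a minimal sketch that only assumes the two public base checkpoints and standard transformers APIs), the component sizes can be inspected directly:

```python
from transformers import AutoModelForCausalLM, CLIPVisionModel

# OpenELM ships its modeling code with the checkpoint, hence trust_remote_code
llm = AutoModelForCausalLM.from_pretrained('apple/OpenELM-450M-Instruct',
                                           trust_remote_code=True)
# CLIPVisionModel loads only the vision tower from the full CLIP checkpoint
vision = CLIPVisionModel.from_pretrained('openai/clip-vit-base-patch16')

print(f'LLM parameters:    {sum(p.numel() for p in llm.parameters()) / 1e6:.0f}M')
print(f'vision parameters: {sum(p.numel() for p in vision.parameters()) / 1e6:.0f}M')
```

The two base models come to roughly 0.45B + 0.09B parameters; the small remainder belongs to the connector that projects vision features into the LLM embedding space.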

### Usage

Execute the following test code:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
hf_path = 'jiajunlong/TinyLLaVA-OpenELM-450M-CLIP-0.55B'
model = AutoModelForCausalLM.from_pretrained(hf_path, trust_remote_code=True)
model.cuda()
config = model.config
# --- the remainder of the snippet is elided in the diff view; the lines below
# are reconstructed from the model's remote-code chat() helper and may differ
# in minor details (prompt and image are placeholders) ---
tokenizer = AutoTokenizer.from_pretrained(hf_path, use_fast=False,
                                          model_max_length=config.tokenizer_model_max_length,
                                          padding_side=config.tokenizer_padding_side)
prompt = "What are these?"
image_url = "http://images.cocodataset.org/val2017/000000039769.jpg"
# chat() runs a single image-question round and also returns the latency
output_text, generation_time = model.chat(prompt=prompt, image=image_url,
                                          tokenizer=tokenizer)
print('model output:', output_text)
print('running time:', generation_time)
```
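
Note that `trust_remote_code=True` is required because the model class and its `chat()` helper live in the model repository rather than in the transformers library, and `model.cuda()` assumes a CUDA-capable GPU is available.
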
| Model | GQA | TextVQA | SQA | VQAv2 | MME | MMB | MM-Vet |
| :----------------------------------------------------------: | ----- | ------- | ----- | ----- | ------- | ----- | ------ |
| [TinyLLaVA-1.5B](https://huggingface.co/bczhou/TinyLLaVA-1.5B) | 60.3 | 51.7 | 60.3 | 76.9 | 1276.5 | 55.2 | 25.8 |
| [TinyLLaVA-0.55B](https://huggingface.co/jiajunlong/TinyLLaVA-OpenELM-450M-CLIP-0.55B) | 50.38 | 36.37 | 50.02 | 65.44 | 1056.69 | 26.29 | 15.4 |
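
(Here MME is a cumulative score, with a maximum of 2000 on its perception split, while the other columns report accuracy in percent.)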

P.S. [TinyLLaVA Factory](https://github.com/TinyLLaVA/TinyLLaVA_Factory) is an open-source modular codebase for small-scale LMMs, with a focus on simplicity of code implementation, extensibility to new features, and reproducibility of training results. It provides standard training and evaluation pipelines, flexible data preprocessing and model configurations, and easily extensible architectures. Users can customize their own LMMs with minimal coding effort and fewer coding mistakes.
TinyLLaVA Factory integrates a suite of cutting-edge models and methods.