weizhiwang committed on
Commit 35089c0
1 Parent(s): 3bd23e6

Create README.md

Files changed (1)
  1. README.md +19 -0
README.md ADDED
---
inference: false
datasets:
- liuhaotian/LLaVA-CC3M-Pretrain-595K
---

# llava-v1.5-llama-3-8b-pretrain Model Card

This is a pretrained checkpoint containing the MLP connector weights after LLaVA stage-1 pre-training; you can use it to instruction-tune your own multimodal models.
Please follow my reproduced implementation [LLaVA-Llama-3](https://github.com/Victorwz/LLaVA-Llama-3/) for more details on fine-tuning a LLaVA model with Llama-3 as the foundation LLM.

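Below is a minimal sketch of how to download and inspect the connector weights with `huggingface_hub` and PyTorch. The repository id `weizhiwang/llava-v1.5-llama-3-8b-pretrain` and the filename `mm_projector.bin` (the usual LLaVA stage-1 naming) are assumptions; check the repository's file list for the actual names.

```python
import torch
from huggingface_hub import hf_hub_download

# Download the projector checkpoint from this repo
# (repo id and filename are assumptions, not confirmed by the card).
projector_path = hf_hub_download(
    repo_id="weizhiwang/llava-v1.5-llama-3-8b-pretrain",
    filename="mm_projector.bin",
)

# Inspect the saved tensors; LLaVA-style checkpoints typically store them
# under keys such as "model.mm_projector.0.weight".
state_dict = torch.load(projector_path, map_location="cpu")
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))
```

In the reference LLaVA training code, a projector checkpoint like this is typically passed to stage-2 instruction tuning via the `--pretrain_mm_mlp_adapter` argument.
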
## Training dataset
- 558K filtered image-text pairs from LAION/CC/SBU, captioned by BLIP.

## Architecture
- LLM: llama-3-8b (Frozen)
- Vision-Language Adapter: MLP (a minimal sketch is given below)
- Vision Encoder: CLIP-ViT-L-336px (Frozen)
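
The sketch below shows the assumed shape of the MLP adapter, following the standard LLaVA-1.5 two-layer `mlp2x_gelu` design; the hidden sizes (1024 for CLIP-ViT-L-336px, 4096 for Llama-3-8B) are taken from those models, not read from this checkpoint.

```python
import torch
import torch.nn as nn

class MLPProjector(nn.Module):
    """Two-layer MLP that maps frozen CLIP patch features into the LLM embedding space."""

    def __init__(self, vision_hidden_size: int = 1024, llm_hidden_size: int = 4096):
        super().__init__()
        # Linear -> GELU -> Linear, as in LLaVA-1.5's "mlp2x_gelu" projector.
        self.proj = nn.Sequential(
            nn.Linear(vision_hidden_size, llm_hidden_size),
            nn.GELU(),
            nn.Linear(llm_hidden_size, llm_hidden_size),
        )

    def forward(self, image_features: torch.Tensor) -> torch.Tensor:
        # image_features: (batch, num_patches, vision_hidden_size) from the frozen vision encoder.
        return self.proj(image_features)

# Example: project a dummy batch of 576 patch tokens (a 24x24 grid at 336px) into the LLM space.
projector = MLPProjector()
dummy_clip_tokens = torch.randn(1, 576, 1024)
print(projector(dummy_clip_tokens).shape)  # torch.Size([1, 576, 4096])
```

During stage-1 pre-training only this adapter is updated, while the LLM and the vision encoder stay frozen, which is why this checkpoint contains just the connector weights.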