Commit f846db6 by nielsr
Parent(s): 8faa4a7

Update README.md

Files changed (1): README.md (+19 -4)
README.md CHANGED
@@ -15,18 +15,20 @@ Check out also the Google Colab demo to run Llava on a free-tier Google Colab in
 
 Or check out our Spaces demo! [![Open in Spaces](https://huggingface.co/datasets/huggingface/badges/resolve/main/open-in-hf-spaces-md-dark.svg)](https://huggingface.co/spaces/llava-hf/llava-4bit)
 
-
 ## Model details
 
 **Model type:**
 LLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data.
 It is an auto-regressive language model, based on the transformer architecture.
 
+ViP-LLaVA enhances LLaVA's training protocol by marking images with visual prompts, so that one can interact with the model using natural cues such as a
+“red bounding box” or “pointed arrow” during training.
+
 **Model date:**
-LLaVA-v1.5-7B was trained in September 2023.
+ViP-LLaVA was released in December 2023.
 
 **Paper or resources for more information:**
-https://llava-vl.github.io/
+https://vip-llava.github.io/
 
 ## How to use the model
 
@@ -125,4 +127,17 @@ model = VipLlavaForConditionalGeneration.from_pretrained(
 
 ## License
 Llama 2 is licensed under the LLAMA 2 Community License,
-Copyright (c) Meta Platforms, Inc. All Rights Reserved.
+Copyright (c) Meta Platforms, Inc. All Rights Reserved.
+
+## Citation
+To cite this work, please use:
+```bibtex
+@misc{cai2023making,
+      title={Making Large Multimodal Models Understand Arbitrary Visual Prompts},
+      author={Mu Cai and Haotian Liu and Siva Karthik Mustikovela and Gregory P. Meyer and Yuning Chai and Dennis Park and Yong Jae Lee},
+      year={2023},
+      eprint={2312.00784},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV}
+}
+```
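
The second hunk only shows the first context line of the README's usage code (`model = VipLlavaForConditionalGeneration.from_pretrained(`). For reference, here is a minimal sketch of what usage with the `transformers` VipLlava classes can look like; the checkpoint name, the prompt template, and the image path are illustrative assumptions, not taken from this commit.

```python
# Hedged sketch, not the repo's official snippet: the checkpoint name, prompt
# template, and image path below are illustrative assumptions.
import torch
from PIL import Image
from transformers import AutoProcessor, VipLlavaForConditionalGeneration

model_id = "llava-hf/vip-llava-7b-hf"  # assumed ViP-LLaVA checkpoint
model = VipLlavaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)

# ViP-LLaVA follows cues overlaid on the image itself, so the question can
# refer to, e.g., a red bounding box drawn around a region of interest.
question = "What does the object inside the red bounding box do?"
prompt = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's "
    f"questions.###Human: <image>\n{question}###Assistant:"
)

# Hypothetical input: an image on which a red bounding box has already been drawn.
image = Image.open("example_with_red_box.jpg")

inputs = processor(text=prompt, images=image, return_tensors="pt").to(
    model.device, torch.float16
)
output = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(output[0], skip_special_tokens=True))
```

This matches the visual-prompt convention the model card describes: the cue (e.g., the red bounding box) is drawn directly on the image, and the text of the question simply refers to it.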