license: apache-2.0

**Base Model**: BLIP2-t5 pretrained version
**Finetune data**: LLaVA 150k (for multi-round conversations, one instruction–answer pair is sampled)

**Hyper-parameters**:
v0:
* lr = 2e-5 --> 0.0 with cosine lr scheduler
* gbs = 32
* image size = 480
* weight decay = 0.05
v1 (same as LLaVA):
* lr = 2e-5
* gbs = 32
* image size = 480
* weight decay = 0.0
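The v0 cosine schedule (lr decaying from 2e-5 to 0.0) can be sketched as below; this is a generic cosine-decay formula and an assumption about the exact schedule used, not code from this repo, and the function name `cosine_lr` and the `total_steps` parameter are illustrative:

```python
import math

def cosine_lr(step, total_steps, base_lr=2e-5, min_lr=0.0):
    """Cosine decay from base_lr to min_lr over total_steps (v0-style schedule)."""
    progress = min(step / total_steps, 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

For example, at the start of training this returns the full 2e-5, at the midpoint roughly half of it, and 0.0 at the final step.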