Satmae++ Collection Collection of ViT models trained using the SatMAE++ approach. • 4 items • Updated Jun 11 • 1
GeoChat Collection GeoChat is the first grounded Large Vision Language Model specifically tailored to Remote Sensing (RS) scenarios. • 4 items • Updated Jun 11 • 4
GLaMM Collection Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated. • 9 items • Updated Jun 11 • 4
Video-ChatGPT Collection "Video-ChatGPT" is a video conversation model capable of generating meaningful conversation about videos. • 2 items • Updated Jun 11 • 2
LLaVA++ (LLaMA-3 and Phi-3-Mini) Collection Extending Visual Capabilities of LLaVA with LLaMA-3 and Phi-3 • 11 items • Updated Jun 11 • 23