@AdinaY on Hugging Face: "LLaVA-o1 🔥 NEW visual language model with spontaneous and systematic…"

LLaVA-o1 🔥 NEW visual language model with spontaneous and systematic reasoning, like GPT-o1!

Paper: LLaVA-o1: Let Vision Language Models Reason Step-by-Step (2411.10440)
GitHub: https://github.com/PKU-YuanGroup/LLaVA-o1
✨ Autonomous Multistage Reasoning
✨ Efficient with Small Data: Trained on 100k samples
✨ Innovative Inference: Stepwise (stage-level) beam search improves reasoning accuracy and scales with inference-time compute
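
To make the stepwise search idea concrete, here is a minimal sketch, not the LLaVA-o1 implementation. It assumes the answer is built stage by stage, that several candidates are sampled per stage, and that only the best-scoring candidate is kept before moving on (best-of-N per stage, i.e. beam width 1). The stage names, `toy_generate`, and `toy_score` are hypothetical stand-ins; a real system would call the vision-language model for both generation and ranking.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical stage names, used only for illustration.
STAGES = ("summary", "caption", "reasoning", "conclusion")

Generate = Callable[[str, List[str], random.Random], str]
Score = Callable[[str, List[str], str], float]


def stagewise_search(
    stages: Sequence[str],
    generate: Generate,
    score: Score,
    num_candidates: int = 4,
    seed: int = 0,
) -> List[str]:
    """Build an answer stage by stage, keeping the best candidate per stage."""
    rng = random.Random(seed)
    chosen: List[str] = []
    for stage in stages:
        # Sample several candidate continuations for this stage only.
        candidates = [generate(stage, chosen, rng) for _ in range(num_candidates)]
        # Keep the highest-scoring one; later stages are conditioned on it.
        best = max(candidates, key=lambda c: score(stage, chosen, c))
        chosen.append(best)
    return chosen


def toy_generate(stage: str, context: List[str], rng: random.Random) -> str:
    # Stand-in for sampling from a model: a stage tag plus a random "quality" digit.
    return f"{stage}:{rng.randint(0, 9)}"


def toy_score(stage: str, context: List[str], candidate: str) -> float:
    # Stand-in for a model-based judge: read the quality digit back out.
    return float(candidate.split(":")[1])


if __name__ == "__main__":
    for step in stagewise_search(STAGES, toy_generate, toy_score):
        print(step)
```

Selecting per stage rather than per token or per full answer is the design point of the sketch: a weak early stage is discarded before later stages build on it, and raising `num_candidates` trades extra inference compute for a better chance of a strong candidate at each stage.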