SkalskiP
posted an update Feb 19
YOLO-World: Real-Time, Zero-Shot Object Detection πŸ”₯ πŸ”₯ πŸ”₯

YOLO-World was designed to solve a limitation of existing zero-shot object detection models: speed. Whereas other state-of-the-art models use Transformers, a powerful but typically slower architecture, YOLO-World uses the faster CNN-based YOLO architecture.

YOLO-World comes in three sizes: small with 13M parameters (77M after re-parametrization), medium with 29M (92M re-parametrized), and large with 48M (110M re-parametrized).

The YOLO-World team benchmarked the models on the LVIS dataset, measuring performance on an NVIDIA V100 GPU without any acceleration mechanisms such as quantization or TensorRT.

According to the paper, YOLO-World reached 35.4 AP with 52.0 FPS for the L version and 26.2 AP with 74.1 FPS for the S version. While the V100 is a powerful GPU, achieving such high FPS on any device is impressive.

- πŸ”— YOLO-World arXiv paper: https://lnkd.in/ddRBKCCX
- πŸ”— my YOLO-World technical report: https://blog.roboflow.com/what-is-yolo-world
- πŸ€— YOLO-World space: SkalskiP/YOLO-World