Abstract
Colonoscopy is currently one of the most sensitive screening methods for colorectal cancer. This study investigates the frontiers of intelligent colonoscopy techniques and their prospective implications for multimodal medical applications. To this end, we begin by assessing the current data-centric and model-centric landscapes through four tasks for colonoscopic scene perception: classification, detection, segmentation, and vision-language understanding. This assessment enables us to identify domain-specific challenges and reveals that multimodal research in colonoscopy remains open for further exploration. To embrace the coming multimodal era, we establish three foundational initiatives: a large-scale multimodal instruction tuning dataset ColonINST, a colonoscopy-designed multimodal language model ColonGPT, and a multimodal benchmark. To facilitate ongoing monitoring of this rapidly evolving field, we provide a public website for the latest updates: https://github.com/ai4colonoscopy/IntelliScope.
Community
This study investigates the frontiers of intelligent colonoscopy techniques and their prospective implications for multimodal medical applications.
1⃣️ We first assess the current landscape to identify domain challenges and under-researched areas in the AI era. Our assessment covers 63 datasets and 137 representative deep techniques across four research topics published since 2015. It reveals domain-specific challenges and underscores the need for further multimodal research in colonoscopy.
2⃣️ To address these gaps, we establish three foundational initiatives for the community: a large-scale multimodal instruction tuning dataset ColonINST, a colonoscopy-designed multimodal language model ColonGPT, and a multimodal conversation benchmark.
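To make the instruction-tuning initiative concrete, the sketch below shows the general shape of a multimodal instruction-tuning record: an image reference paired with an instruction/response conversation. This is a hypothetical illustration only; the actual ColonINST schema, field names, and file paths are defined in the paper's repository (https://github.com/ai4colonoscopy/IntelliScope), not here.

```python
import json

# Hypothetical illustration of a multimodal instruction-tuning sample.
# The real ColonINST schema may differ; all IDs, paths, and field names
# below are assumptions for demonstration purposes.
sample = {
    "id": "coloninst-000001",          # hypothetical record identifier
    "image": "images/frame_0001.jpg",  # hypothetical colonoscopy frame path
    "conversations": [
        # The <image> placeholder marks where visual tokens are injected,
        # a convention common to multimodal instruction-tuning pipelines.
        {"from": "human",
         "value": "<image>\nWhat abnormality is visible in this colonoscopy frame?"},
        {"from": "gpt",
         "value": "A polyp is visible on the colonic mucosa."},
    ],
}

# Records like this are typically stored as one JSON object per sample.
print(json.dumps(sample, indent=2))
```

A model such as ColonGPT would be fine-tuned on many such image-conversation pairs, learning to ground its textual responses in the referenced colonoscopy frame.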
The following similar papers were recommended by the Semantic Scholar API:
- Self-Prompting Polyp Segmentation in Colonoscopy using Hybrid Yolo-SAM 2 Model (2024)
- Transformer-Enhanced Iterative Feedback Mechanism for Polyp Segmentation (2024)
- MedCLIP-SAMv2: Towards Universal Text-Driven Medical Image Segmentation (2024)
- Polyp-SES: Automatic Polyp Segmentation with Self-Enriched Semantic Model (2024)
- CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset (2024)