yifeihu/TFT-ID-1.0

Image-Text-to-Text

text-generation


yifeihu committed on Aug 30

Commit 60100f0 • 1 Parent(s): c409165

Update README.md

Files changed (1)

README.md +1 -1


@@ -20,7 +20,7 @@ TFT-ID is finetuned from [microsoft/Florence-2](https://huggingface.co/microsoft
 - The model was finetuned with papers from Hugging Face Daily Papers. All 36,000+ bounding boxes are manually annotated and checked by [Yifei Hu](https://x.com/hu_yifei).
 - TFT-ID model takes an image of a single paper page as the input, and return bounding boxes for all tables, figures, and text sections in the given page.
-- The text sections contain clean text content perfect for downstream OCR workflows. However, TFT-ID is not an OCR model.
 Object Detection results format:
 {'\<OD>': {'bboxes': [[x1, y1, x2, y2], ...],

 - The model was finetuned with papers from Hugging Face Daily Papers. All 36,000+ bounding boxes are manually annotated and checked by [Yifei Hu](https://x.com/hu_yifei).
 - TFT-ID model takes an image of a single paper page as the input, and return bounding boxes for all tables, figures, and text sections in the given page.
+- The text sections contain clean text content perfect for downstream OCR workflows. I recommend using **TB-OCR-preview-0.1** [[HF]](https://huggingface.co/yifeihu/TB-OCR-preview-0.1) as the OCR model to convert the text sections into clean markdown and math latex output.
 Object Detection results format:
 {'\<OD>': {'bboxes': [[x1, y1, x2, y2], ...],
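
To illustrate how the results format shown in the README might feed a downstream OCR step, here is a minimal sketch that turns the `bboxes` list into integer crop rectangles. It assumes the actual dictionary key is `<OD>` (the backslash in the README being a markdown escape) and that coordinates are pixel values on the input page image. The helper name `extract_boxes` and the sample values are hypothetical, not part of the model's API.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]


def extract_boxes(result: dict, width: int, height: int, task: str = "<OD>") -> List[Box]:
    """Convert [x1, y1, x2, y2] boxes into integer rectangles inside the page.

    Assumption: `result` follows the format quoted in the README,
    {'<OD>': {'bboxes': [[x1, y1, x2, y2], ...], ...}}.
    """
    boxes: List[Box] = []
    for x1, y1, x2, y2 in result[task]["bboxes"]:
        # Round to whole pixels and clamp to the page bounds.
        left = max(0, min(width, round(x1)))
        top = max(0, min(height, round(y1)))
        right = max(0, min(width, round(x2)))
        bottom = max(0, min(height, round(y2)))
        # Skip boxes that collapse to zero area after clamping.
        if right > left and bottom > top:
            boxes.append((left, top, right, bottom))
    return boxes


# Hypothetical sample output for a 100 x 600 pixel page.
sample = {"<OD>": {"bboxes": [[10.4, 20.6, 110.2, 220.9], [-5.0, 0.0, 50.0, 700.0]]}}
crops = extract_boxes(sample, width=100, height=600)
```

Each returned tuple can be passed to an image library's crop call (for example Pillow's `Image.crop`) before sending the region to an OCR model.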