Update README.md
README.md
CHANGED
@@ -20,7 +20,7 @@ TFT-ID is finetuned from [microsoft/Florence-2](https://huggingface.co/microsoft
 
 - The model was finetuned with papers from Hugging Face Daily Papers. All 36,000+ bounding boxes are manually annotated and checked by [Yifei Hu](https://x.com/hu_yifei).
 - TFT-ID model takes an image of a single paper page as the input, and return bounding boxes for all tables, figures, and text sections in the given page.
-- The text sections contain clean text content perfect for downstream OCR workflows.
+- The text sections contain clean text content perfect for downstream OCR workflows. I recommend using **TB-OCR-preview-0.1** [[HF]](https://huggingface.co/yifeihu/TB-OCR-preview-0.1) as the OCR model to convert the text sections into clean markdown and math latex output.
 
 Object Detection results format:
 {'\<OD>': {'bboxes': [[x1, y1, x2, y2], ...],