breezedeus committed on
Commit
d4fb5d6
1 Parent(s): 479a6c6

Update README.md

Files changed (1)
  1. README.md +93 -0
README.md CHANGED
@@ -1,3 +1,96 @@
1
  ---
2
  license: mit
3
  ---
1
  ---
2
+ tags:
3
+ - latex-ocr
4
+ - math-ocr
5
+ - math-formula-recognition
6
+ - mfr
7
+ - pix2text
8
+ - image-to-text
9
  license: mit
10
+ library_name: transformers
11
  ---
12
+
13
+ # Model Card: Pix2Text-MFR
14
+ Math Formula Recognition (MFR) model from [Pix2Text (P2T)]().
15
+
16
+ ## Model Details / 模型细节
17
+
18
+ This model is fine-tuned on a coin dataset using **contrastive learning** techniques, based on OpenAI's CLIP (ViT-B/32). It aims to enhance the feature extraction capabilities for **Coin** images, thus achieving more accurate image-based search functionalities. The model combines the powerful features of the Vision Transformer (ViT) with the multimodal learning capabilities of CLIP, specifically optimized for coin imagery.
19
+
20
+ 这个模型是在 OpenAI 的 CLIP (ViT-B/32) 基础上,利用对比学习技术并使用硬币数据集进行微调得到的。它旨在提高硬币图像的特征提取能力,从而实现更准确的以图搜图功能。该模型结合了视觉变换器(ViT)的强大功能和 CLIP 的多模态学习能力,专门针对硬币图像进行了优化。
21
+
22
+
23
+
24
+ ## Usage and Limitations / 使用和限制
25
+
26
+ - **Usage**: This model is primarily used for extracting representation vectors from coin images, enabling efficient and precise image-based searches in a coin image database.
27
+ - **Limitations**: As the model is trained specifically on coin images, it may not perform well on non-coin images.
28
+
29
+
30
+
31
+
32
+ - **用途**:此模型主要用于提取硬币图片的表示向量,以实现在硬币图像库中进行高效、精确的以图搜图。
33
+ - **限制**:由于模型是针对硬币图像进行训练的,因此在处理非硬币图像时可能效果不佳。
34
+
35
+
36
+
37
+ ## Documents / 文档
38
+
39
+ - Base Model: [openai/clip-vit-base-patch32](https://huggingface.co/openai/clip-vit-base-patch32)
40
+
41
+
42
+
43
+ ## Model Use / 模型使用
44
+
45
+ ```python3
46
+ from PIL import Image
47
+ import torch.nn.functional as F
48
+
49
+ from transformers import CLIPProcessor, CLIPModel
50
+
51
+ model = CLIPModel.from_pretrained("breezedeus/coin-clip-vit-base-patch32")
52
+ processor = CLIPProcessor.from_pretrained("breezedeus/coin-clip-vit-base-patch32")
53
+
54
+ image_fp = "path/to/coin_image.jpg"
55
+ image = Image.open(image_fp).convert("RGB")
56
+
57
+ inputs = processor(images=image, return_tensors="pt")
58
+ img_features = model.get_image_features(**inputs)
59
+ img_features = F.normalize(img_features, dim=1)
60
+ ```
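The normalized feature vectors produced above are what an image-based search would compare. The following is a minimal, dependency-free sketch of that comparison, not the author's retrieval code: the helper names `l2_normalize` and `top_k` and the toy 3-dimensional vectors are hypothetical stand-ins for real image features. It relies on the fact that, for unit-length vectors, the dot product equals the cosine similarity.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (the same role F.normalize plays above)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k(query, gallery, k=3):
    """Return (index, cosine similarity) pairs for the k best gallery matches.

    Assumes every vector is already L2-normalized, so the dot product
    is the cosine similarity.
    """
    scores = [
        (i, sum(q * g for q, g in zip(query, vec)))
        for i, vec in enumerate(gallery)
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


# Hypothetical toy vectors standing in for real image features.
gallery = [
    l2_normalize(v)
    for v in ([1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0])
]
query = l2_normalize([1.0, 0.05, 0.0])
results = top_k(query, gallery, k=2)
```

For a large image database, the same ranking would normally be computed with batched tensor operations or an approximate nearest-neighbour index rather than a Python loop.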
61
+
62
+
63
+
64
+ ## Training Data / 训练数据
65
+
66
+ The model was trained on a specialized coin image dataset that includes images of coins from various currencies.
67
+
68
+
69
+
70
+ 本模型使用的是专门的硬币图像数据集进行训练。这个数据集包含了多种货币的硬币图片。
71
+
72
+ ## Training Process / 训练过程
73
+
74
+ The model was fine-tuned from the pretrained OpenAI CLIP (ViT-B/32) model on a coin image dataset, using contrastive-learning fine-tuning techniques and parameter settings.
75
+
76
+
77
+
78
+ 模型是在 OpenAI 的 CLIP (ViT-B/32) 预训练模型的基础上,使用硬币图像数据集进行微调。训练过程采用了对比学习的微调技巧和参数设置。
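The card does not state which contrastive objective or hyperparameters were used. As a rough illustration of what contrastive learning optimizes, here is a generic one-directional InfoNCE-style loss in plain Python. The function name `info_nce`, the temperature value, and the toy vectors are assumptions for the example, not details taken from the actual training run.

```python
import math


def info_nce(anchors, positives, temperature=0.07):
    """Mean InfoNCE loss; anchors[i] and positives[i] are the matching pair.

    Inputs are lists of L2-normalized vectors. Every other positive in the
    batch serves as a negative for a given anchor.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        # Scaled cosine similarity between this anchor and every candidate.
        logits = [
            sum(a * p for a, p in zip(anchor, pos)) / temperature
            for pos in positives
        ]
        # Numerically stable log-sum-exp over the candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching pair as the target class.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Hypothetical toy embeddings: two orthogonal unit vectors.
aligned = [[1.0, 0.0], [0.0, 1.0]]
loss_matched = info_nce(aligned, aligned)        # pairs line up
loss_swapped = info_nce(aligned, aligned[::-1])  # pairs are mismatched
```

The loss is near zero when each anchor is most similar to its own positive and grows when the pairs are mismatched, which is the signal that pulls matching embeddings together during fine-tuning.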
79
+
80
+ ## Performance / 性能
81
+
82
+ This model demonstrates excellent performance in coin image retrieval tasks.
83
+
84
+
85
+
86
+ 该模型在硬币图像检索任务上展现了优异的性能。
87
+
88
+
89
+
90
+ ## Feedback / 反馈
91
+
92
+ > Where to send questions or comments about the model.
93
+
94
+ Feel free to contact the author [Breezedeus](https://www.breezedeus.com/join-group).
95
+
96
+ 欢迎联系作者 [Breezedeus](https://www.breezedeus.com/join-group) 。