Commit b69ed44 by jstzwj (1 parent: 86533d0)

init weights

- 360LayoutAnalysis开源模型许可证.txt +35 -0
- LICENSE.txt +51 -0
- README.md +161 -0
- README_EN.md +132 -0
- config.json +4 -0
- paper-8n.pt +3 -0
360LayoutAnalysis开源模型许可证.txt
ADDED
@@ -0,0 +1,35 @@
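360LayoutAnalysis Open-Source Model License

I. Definitions

1.1 "This License": the terms and conditions for use, reproduction, modification, and distribution of the 360LayoutAnalysis open-source model, as defined in Articles I through V of this document.
1.2 "Model": any accompanying machine-learning-based parameters, including but not limited to weights, biases, checkpoints, and final optimizer states (where applicable).
1.3 "Derivative Model": any modification of the Model, any model based on the Model, or any other machine learning model created or initialized by transferring the Model's parameters, activations, operations, or output patterns to another model, including but not limited to fine-tuning, quantization, and model distillation methods that use intermediate data representations.
1.4 "Data": the collection of information and/or content extracted from the datasets used with the Model (including training, pre-training, or other evaluation datasets) for the purpose of training, pre-training, or evaluating the Model.
1.5 "Personal Information": any information, recorded electronically or otherwise, that relates to an identified or identifiable natural person, excluding information that has been anonymized.
1.6 "Output": any informational content produced, in any form, by operating or using the Model or its Derivative Models.
1.7 "Distribution": transmitting, publishing, or otherwise sharing the Model or its Derivative Models with a third party through any medium, including but not limited to providing the Model or its functionality to users as a service via API, network access, or any other electronic or remote means ("Hosted Service").

1.8 "Licensor": 三六零科技集团有限公司 and its affiliates, which hold the intellectual property rights in the 360 model and its Derivative Models.
1.9 "Licensee": the natural person or legal entity granted a license under this License.
1.10 "Commercial Use": using the Model to generate revenue, directly or indirectly, for an entity or individual, or for any other for-profit purpose.

II. License Grant

2.1 Copyright license: Subject to the terms and conditions of this License, the Licensor grants the Licensee a perpetual, worldwide, royalty-free, non-exclusive, and irrevocable copyright license to use, reproduce, and create Derivative Models of the Model. However, if the Licensee initiates copyright infringement litigation or an enforcement action against any person, alleging that the Model or its Derivative Models constitute copyright infringement, the above copyright license terminates as of the date the Licensee files the litigation or enforcement action.
2.2 Patent license: Subject to the terms and conditions of this License, the Licensor grants the Licensee a perpetual, worldwide, royalty-free, non-exclusive, and irrevocable patent license to make, use, sell, offer to sell, and import the Model or its Derivative Models. This patent license is limited to those patent claims, now or hereafter owned or controlled by the Licensor, that would necessarily be infringed by use of the Model. However, if the Licensee initiates patent infringement litigation or an enforcement action against any person, alleging that the Model or its Derivative Models constitute patent infringement, the above patent license terminates as of the date the Licensee files the litigation or enforcement action.
2.3 Other intellectual property licenses: In addition to the copyright and patent licenses above, and subject to the terms and conditions of this License, the Licensor grants the Licensee a perpetual, worldwide, royalty-free, non-exclusive, and irrevocable license to the other intellectual property rights that the Licensor owns or controls in the Model and Derivative Models and that would necessarily be infringed by using, reproducing, or distributing them (this License expressly grants no trademark license). However, if the Licensee initiates related intellectual property infringement litigation or an enforcement action against any person, the above license terminates as of the date the Licensee files the litigation or enforcement action.
2.4 The above licenses are granted for non-commercial use of the Model and Derivative Models. To use the Model and Derivative Models for commercial purposes, contact the Licensor at the email address given in Article V of this License to register.

III. Conditions of Use

3.1 When reproducing or using the Model or its Derivative Models, the Licensee must meet the following conditions:
(1) provide recipients of the Model or Derivative Models with a copy of this License;
(2) when modifying the Model or Derivative Models, prominently inform recipients of the modifications;
(3) retain all copyright, patent, trademark, and attribution notices relating to the Model or Derivative Models;
(4) comply with all applicable laws and regulations, and not use the Model or its Derivative Models for any unlawful or improper purpose, including but not limited to military purposes.
3.2 The Licensee may assert the corresponding intellectual property rights in its creative contributions to a Derivative Model, and may provide additional or different license terms for reproducing or distributing the modified version or the Derivative Model as a whole.
3.3 When the Model or its Derivative Models are provided to users as a Hosted Service, items (1) and (2) of Section 3.1 do not apply, but items (3) and (4) of Section 3.1 must still be observed.

IV. Disclaimer

4.1 To the maximum extent permitted by applicable law, the Licensor provides the Model "as is", without warranties of any kind, express or implied, including but not limited to warranties of title, non-infringement, merchantability, fitness for a particular purpose, or otherwise. The Licensee is solely responsible for determining the appropriateness of using the Model and its Derivative Models and bears all risks of such use.
4.2 The Licensee shall handle any Personal Information that the Model may contain in compliance with relevant laws and regulations, and bears the associated risks alone.
4.3 In no event shall the Licensor be liable for any direct, indirect, incidental, special, punitive, or consequential damages arising from the Licensee's use of the Model or its Derivative Models, including but not limited to loss of data, business interruption, or any other commercial damages or losses, even if advised of the possibility of such damages.

V. Miscellaneous

5.1 Without the Licensor's prior written consent, the Licensee may not use any of the Licensor's trademarks, brands, or logos in products or services.
5.2 This License constitutes the entire agreement between the Licensor and the Licensee exercising this License with respect to the Model.
5.3 To use the Model and Derivative Models for commercial purposes, apply by emailing the Licensor ([email protected]) and provide: applicant name, agent name (if any), applicant contact details and address, agent contact details (if any), a description of any derivative work on the model, and the specific commercial use intended.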
LICENSE.txt
ADDED
@@ -0,0 +1,51 @@
1 |
+
Apache License
|
2 |
+
Version 2.0, January 2004
|
3 |
+
http://www.apache.org/licenses/
|
4 |
+
|
5 |
+
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
|
6 |
+
|
7 |
+
1. Definitions.
|
8 |
+
|
9 |
+
"License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document.
|
10 |
+
|
11 |
+
"Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License.
|
12 |
+
|
13 |
+
"Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity.
|
14 |
+
|
15 |
+
"You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License.
|
16 |
+
|
17 |
+
"Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files.
|
18 |
+
|
19 |
+
"Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types.
|
20 |
+
|
21 |
+
"Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below).
|
22 |
+
|
23 |
+
"Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof.
|
24 |
+
|
25 |
+
"Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution."
|
26 |
+
|
27 |
+
"Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work.
|
28 |
+
|
29 |
+
2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form.
|
30 |
+
|
31 |
+
3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed.
|
32 |
+
|
33 |
+
4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions:
|
34 |
+
|
35 |
+
You must give any other recipients of the Work or Derivative Works a copy of this License; and
|
36 |
+
You must cause any modified files to carry prominent notices stating that You changed the files; and
|
37 |
+
You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and
|
38 |
+
If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License.
|
39 |
+
You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License.
|
40 |
+
|
41 |
+
5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions.
|
42 |
+
|
43 |
+
6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file.
|
44 |
+
|
45 |
+
7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License.
|
46 |
+
|
47 |
+
8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages.
|
48 |
+
|
49 |
+
9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.
|
50 |
+
|
51 |
+
END OF TERMS AND CONDITIONS
|
README.md
CHANGED
@@ -1,3 +1,164 @@
|
|
1 |
---
|
2 |
license: apache-2.0
|
|
|
|
|
|
|
3 |
---
1 |
---
|
2 |
license: apache-2.0
|
3 |
+
language:
|
4 |
+
- zh
|
5 |
+
pipeline_tag: object-detection
|
6 |
---
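# 360LayoutAnalysis

[github](https://github.com/360AILAB-NLP/360LayoutAnalysis)

[English](./README_EN.md)

## I. Background

In today's digital age, document layout analysis is a key step in information extraction and document understanding. Document layout analysis, also known as document image analysis, is the process of identifying and extracting text, images, tables, and other elements from scanned document images. The technology is widely used in automated document processing, electronic data interchange, and the digitization of historical documents.

Traditional document layout analysis models often struggle to distinguish paragraphs from other layout elements, which limits further processing and use of document information. Advances in deep learning and pattern recognition have opened new opportunities: training on datasets can improve a model's understanding of document structure, but high-quality annotated datasets are the foundation for training effective models.

Fine-grained annotation is essential in document layout analysis, and paragraph annotation is especially critical because it directly affects semantic understanding and information extraction. As far as we know, in the paper scenario, earlier open-source datasets such as CDLA (A Chinese document layout analysis) lack paragraph annotations, and layout analysis models for the research report scenario are still largely missing.

To address this, we manually relabeled paper documents with fine-grained labels and optimized the data, and built a fine-grained layout analysis dataset for the research report scenario. Finally, we used these annotated datasets to train several new Chinese document layout analysis models, which **perform well on a closed test set**.

On 2024-06-15, we first open-sourced lightweight layout analysis model weights and the corresponding label systems for two scenarios, **papers** and **research reports**. The models aim to identify paragraph boundaries and other information in documents and to accurately distinguish text, images, tables, formulas, and other elements, ultimately promoting industry development.

On 2024-06-28, we added two new layout analysis models for the **English paper scenario and the general scenario**, bringing the number of open-source layout analysis models to 4.

Key features:

1) Covers three vertical domains (Chinese papers, English papers, Chinese research reports) plus 1 general-scenario model;

2) Lightweight with fast inference [trained with yolov8; a single model is 6.23MB];

3) The Chinese paper scenario includes paragraph information [CDLA does not provide paragraph information; unique to our open-source release];

4) Chinese research report scenario / general scenario [trained on tens of thousands of high-quality samples; unique to our open-source release]
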
## II. Usage

- Weights download link: [🤗LINK](https://huggingface.co/qihoo360/360LayoutAnalysis)

- Usage:

The open-source weights are trained with `yolov8`. Prediction works as follows:

```python
from ultralytics import YOLO

image_path = ''  # path to the image to predict
model_path = ''  # path to the weights
model = YOLO(model_path)

result = model(image_path, save=True, conf=0.5, save_crop=False, line_width=2)
print(result)

print(result[0].names)       # id-to-label map
print(result[0].boxes)       # all detected bounding boxes
print(result[0].boxes.xyxy)  # top-left and bottom-right coordinates of each detected box
print(result[0].boxes.cls)   # class id of each detected box
print(result[0].boxes.conf)  # confidence of each detected box
```

## III. Layout Analysis

### 3.1 Paper Scenario

- Label categories

| Element        | Name                  |
| -------------- | --------------------- |
| Text           | Main Text (Paragraph) |
| Title          | Title                 |
| Figure         | Image                 |
| Figure caption | Image Caption         |
| Table          | Table                 |
| Table caption  | Table Caption         |
| Header         | Header                |
| Footer         | Footer                |
| Reference      | Note                  |
| Equation       | Equation              |

- Examples

<div align="center">
<img src="./case/paper/1.jpg" width="50%" height="50%">
<img src="./case/paper/2.jpg" width="50%" height="50%">
</div>

### 3.2 Research Report Scenario

- Label categories

| Element        | Name                  |
| -------------- | --------------------- |
| Text           | Main Text (Paragraph) |
| Title          | Title                 |
| Figure         | Image                 |
| Figure caption | Image Caption         |
| Table          | Table                 |
| Table caption  | Table Caption         |
| Header         | Header                |
| Footer         | Footer                |
| Toc            | Table of Contents     |

- Examples

<div align="center">
<img src="./case/report/1.jpg" width="50%" height="50%">
<img src="./case/report/2.jpg" width="50%" height="50%">
</div>

### 3.3 publaynet

- Label categories

| Element | Name      |
| ------- | --------- |
| text    | Main Text |
| title   | Title     |
| list    | List      |
| table   | Table     |
| figure  | Image     |

- Examples

<div align="center">
<img src="./case/publaynet/case1.jpg" width="50%" height="50%">
<img src="./case/publaynet/case2.jpg" width="50%" height="50%">
</div>

### 3.4 General Layout

- Label categories

| Element  | Name                |
| -------- | ------------------- |
| Text     | Main Text           |
| Title    | Title               |
| Figure   | Image               |
| Table    | Table               |
| Equation | Equation            |
| Caption  | Table/Image Caption |

## License

This project utilizes certain datasets and checkpoints that are subject to their respective original licenses. Users must comply with all terms and conditions of these original licenses. The content of this project itself is licensed under the [Apache license 2.0](./LICENSE.txt).

## License

The source code in this repository is licensed under Apache 2.0. The open-source 360LayoutAnalysis model supports commercial use. To use the model or its derivative models for commercial purposes, apply by email ([[email protected]](mailto:[email protected])). For the full terms, see the [360LayoutAnalysis Open-Source Model License](./360LayoutAnalysis开源模型许可证.txt).
README_EN.md
ADDED
@@ -0,0 +1,132 @@
1 |
+
# 360LayoutAnalysis
|
2 |
+
|
3 |
+
[github](https://github.com/360AILAB-NLP/360LayoutAnalysis)
|
4 |
+
|
5 |
+
[Chinese](./README.md)
|
6 |
+
|
7 |
+
## I. Background
|
8 |
+
|
9 |
+
In today's digital age, document layout analysis is a key step in information extraction and document understanding. Document layout analysis, also known as document image analysis, is the process of identifying and extracting text, images, tables, and other elements from scanned document images. The technology is widely used in automated document processing, electronic data interchange, and the digitization of historical documents.
|
10 |
+
Traditional document layout analysis models often struggle to distinguish paragraphs from other layout elements, which limits further processing and use of document information. Advances in deep learning and pattern recognition have opened new opportunities: training on datasets can improve a model's understanding of document structure. However, high-quality annotated datasets are the foundation for training effective models.
|
11 |
+
Fine-grained annotation is essential in document layout analysis, and paragraph annotation is especially critical because it directly affects semantic understanding and information extraction. As far as we know, in the paper scenario, earlier open-source datasets such as CDLA (A Chinese document layout analysis) lack paragraph annotations, and layout analysis models for the research report scenario are still largely missing.
|
12 |
+
To address this, we manually relabeled paper documents with fine-grained labels and optimized the data, and built a fine-grained layout analysis dataset for the research report scenario. Finally, we used these annotated datasets to train several new Chinese document layout analysis models, which performed well on the **closed test set**.
|
13 |
+
In this open-source project, we first released lightweight layout analysis model weights and the corresponding label systems for two scenarios: **paper** and **research report**. The aim is to identify paragraph boundaries and other information in documents, accurately distinguish text, images, tables, formulas, and other elements, and ultimately promote industrial development.
|
14 |
+
|
15 |
+
## II. Usage
|
16 |
+
|
17 |
+
- Weights download link: [🤗LINK](https://huggingface.co/qihoo360/360LayoutAnalysis)
|
18 |
+
|
19 |
+
- Usage:
|
20 |
+
|
21 |
+
The open-source weights are trained with `yolov8`, and the prediction method is as follows:
|
22 |
+
|
23 |
+
```python
|
24 |
+
from ultralytics import YOLO
|
25 |
+
|
26 |
+
image_path = '' # Path to the image to be predicted
|
27 |
+
model_path = '' # Path to the weights
|
28 |
+
model = YOLO(model_path)
|
29 |
+
|
30 |
+
result = model(image_path, save=True, conf=0.5, save_crop=False, line_width=2)
|
31 |
+
print(result)
|
32 |
+
|
33 |
+
print(result[0].names) # Output the id to label map
|
34 |
+
print(result[0].boxes) # Output all detected bounding boxes
|
35 |
+
print(result[0].boxes.xyxy) # Output the top-left and bottom-right coordinates of all detected bounding boxes
|
36 |
+
print(result[0].boxes.cls) # Output the id corresponding to the class of all detected bounding boxes
|
37 |
+
print(result[0].boxes.conf) # Output the confidence of all detected bounding boxes
|
38 |
+
```
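The fields printed above can be combined into one record per detected region. The following is a minimal sketch and not part of the original project: `to_records` and `reading_order` are hypothetical helpers, the label map and boxes are made-up values, and the inputs are plain Python lists such as one might obtain by calling `.tolist()` on `result[0].boxes.xyxy`, `.cls`, and `.conf` (an assumption about how the tensors are converted). The sort is a naive top-to-bottom, left-to-right order and does not handle multi-column pages.

```python
from typing import Dict, List


def to_records(names: Dict[int, str], xyxy: List[List[float]],
               cls: List[float], conf: List[float],
               min_conf: float = 0.5) -> List[dict]:
    """Combine parallel box/class/confidence lists into labeled records."""
    records = []
    for box, class_id, score in zip(xyxy, cls, conf):
        if score < min_conf:
            continue  # drop low-confidence detections
        records.append({
            "label": names[int(class_id)],   # class ids may arrive as floats
            "box": [float(v) for v in box],  # [x1, y1, x2, y2]
            "conf": float(score),
        })
    return records


def reading_order(records: List[dict]) -> List[dict]:
    """Naive reading order: sort by top edge, then by left edge."""
    return sorted(records, key=lambda r: (r["box"][1], r["box"][0]))


# Hypothetical values standing in for result[0].names and result[0].boxes.*
names = {0: "Text", 1: "Title", 2: "Figure"}
xyxy = [[50, 200, 500, 400], [50, 40, 500, 90], [60, 420, 480, 700]]
cls = [0.0, 1.0, 2.0]
conf = [0.91, 0.88, 0.32]

for rec in reading_order(to_records(names, xyxy, cls, conf)):
    print(rec["label"], rec["box"], round(rec["conf"], 2))
```

With these sample values the low-confidence figure is filtered out and the title is listed before the text block, because its top edge is higher on the page.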
|
39 |
+
|
40 |
+
## III. Layout Analysis
|
41 |
+
|
42 |
+
### 3.1 Academic Paper Scenario
|
43 |
+
|
44 |
+
- Label Categories
|
45 |
+
|
46 |
+
| Element | Name |
|
47 |
+
| -------------- | --------------------- |
|
48 |
+
| Text | Main Text (Paragraph) |
|
49 |
+
| Title | Title |
|
50 |
+
| Figure | Image |
|
51 |
+
| Figure caption | Image Caption |
|
52 |
+
| Table | Table |
|
53 |
+
| Table caption | Table Caption |
|
54 |
+
| Header | Header |
|
55 |
+
| Footer | Footer |
|
56 |
+
| Reference | Reference |
|
57 |
+
| Equation | Equation |
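Because captions carry their own labels in this scheme, a downstream step can attach each caption to the figure or table it describes. The sketch below is only an illustration under stated assumptions: records are dicts with `label` and `box` keys (a hypothetical format, not something the model emits directly), and each caption is matched to the region of the matching type with the smallest vertical gap. Real documents may need a stricter rule.

```python
from typing import List, Optional, Tuple

# Caption label -> label of the region it describes (names from the label table)
CAPTION_TARGET = {"Figure caption": "Figure", "Table caption": "Table"}


def vertical_gap(a: List[float], b: List[float]) -> float:
    """Vertical distance between two [x1, y1, x2, y2] boxes (0 if they overlap)."""
    return max(0.0, max(a[1], b[1]) - min(a[3], b[3]))


def pair_captions(records: List[dict]) -> List[Tuple[int, Optional[int]]]:
    """Return (caption_index, target_index or None) for every caption record."""
    pairs = []
    for i, rec in enumerate(records):
        target_label = CAPTION_TARGET.get(rec["label"])
        if target_label is None:
            continue  # not a caption
        best: Optional[int] = None
        best_gap = float("inf")
        for j, other in enumerate(records):
            if other["label"] != target_label:
                continue
            gap = vertical_gap(rec["box"], other["box"])
            if gap < best_gap:
                best, best_gap = j, gap
        pairs.append((i, best))
    return pairs


# Hypothetical detections for one page
page = [
    {"label": "Figure", "box": [50, 100, 500, 300]},
    {"label": "Figure caption", "box": [50, 310, 500, 330]},
    {"label": "Table caption", "box": [50, 400, 500, 420]},
    {"label": "Table", "box": [50, 430, 500, 600]},
    {"label": "Text", "box": [50, 620, 500, 700]},
]
print(pair_captions(page))
```

A caption with no region of the matching type on the page is paired with `None`, so callers can decide whether to keep or discard it.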
|
58 |
+
|
59 |
+
- Example
|
60 |
+
|
61 |
+
<div align="center">
|
62 |
+
<img src="./case/paper/1.jpg" width="50%" height="50%">
|
63 |
+
<img src="./case/paper/2.jpg" width="50%" height="50%">
|
64 |
+
</div>
|
65 |
+
|
66 |
+
|
67 |
+
### 3.2 Research Report Scenario
|
68 |
+
|
69 |
+
- Label Categories
|
70 |
+
|
71 |
+
| Element | Name |
|
72 |
+
| -------------- | --------------------- |
|
73 |
+
| Text | Main Text (Paragraph) |
|
74 |
+
| Title | Title |
|
75 |
+
| Figure | Image |
|
76 |
+
| Figure caption | Image Caption |
|
77 |
+
| Table | Table |
|
78 |
+
| Table caption | Table Caption |
|
79 |
+
| Header | Header |
|
80 |
+
| Footer | Footer |
|
81 |
+
| Toc | Table of Contents |
|
82 |
+
|
83 |
+
- Example
|
84 |
+
|
85 |
+
<div align="center">
|
86 |
+
<img src="./case/report/1.jpg" width="50%" height="50%">
|
87 |
+
<img src="./case/report/2.jpg" width="50%" height="50%">
|
88 |
+
</div>
|
89 |
+
|
90 |
+
|
91 |
+
### 3.3 publaynet
|
92 |
+
|
93 |
+
- Label Categories
|
94 |
+
|
95 |
+
| Element | Name |
|
96 |
+
| ------- | --------------------- |
|
97 |
+
| text | Main Text (Paragraph) |
|
98 |
+
| title | Title |
|
99 |
+
| figure | Image |
|
100 |
+
| list | List |
|
101 |
+
| table | Table |
|
102 |
+
|
103 |
+
- Example
|
104 |
+
|
105 |
+
<div align="center">
|
106 |
+
<img src="./case/publaynet/case1.jpg" width="50%" height="50%">
|
107 |
+
<img src="./case/publaynet/case2.jpg" width="50%" height="50%">
|
108 |
+
</div>
|
109 |
+
|
110 |
+
### 3.4 General Layout
|
111 |
+
|
112 |
+
- Label Categories
|
113 |
+
|
114 |
+
| Element | Name |
|
115 |
+
| -------- | --------- |
|
116 |
+
| Text | Main text |
|
117 |
+
| Title | Title |
|
118 |
+
| Figure   | Image     |
|
119 |
+
| Table | Table |
|
120 |
+
| Equation | Equation  |
|
121 |
+
| Caption | Image Caption/Table Caption |
|
122 |
+
|
123 |
+
|
124 |
+
|
125 |
+
## License
|
126 |
+
|
127 |
+
This project utilizes certain datasets and checkpoints that are subject to their respective original licenses. Users must comply with all terms and conditions of these original licenses. The content of this project itself is licensed under the [Apache license 2.0](./LICENSE.txt).
|
128 |
+
|
129 |
+
## License
|
130 |
+
|
131 |
+
The source code of this repository is licensed under Apache 2.0. The open-source 360LayoutAnalysis model supports commercial use. To use the model or its derivative models for commercial purposes, please apply by email ([[email protected]](mailto:[email protected])). For the specific terms, see the ["360LayoutAnalysis Model Open Source Model License"](./360LayoutAnalysis开源模型许可证.txt).
|
132 |
+
|
config.json
ADDED
@@ -0,0 +1,4 @@
1 |
+
{
|
2 |
+
"model_type": "UltralyticsLayoutModel",
|
3 |
+
"checkpoint_path": "paper-8n.pt"
|
4 |
+
}
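A loader might read this file to locate the checkpoint. The following is a minimal sketch, assuming only the two keys shown above and assuming that `checkpoint_path` is relative to the directory containing `config.json`. `resolve_checkpoint` is a hypothetical helper, and how `model_type` is consumed by any particular framework is not stated in the source.

```python
import json
import tempfile
from pathlib import Path


def resolve_checkpoint(config_file: str) -> Path:
    """Read config.json and return the checkpoint path next to it."""
    config_path = Path(config_file)
    with config_path.open(encoding="utf-8") as fh:
        config = json.load(fh)
    # Assumption: checkpoint_path is relative to the config file's directory.
    return config_path.parent / config["checkpoint_path"]


# Demo against a temporary copy of the config shown above
with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "config.json"
    cfg.write_text(json.dumps({
        "model_type": "UltralyticsLayoutModel",
        "checkpoint_path": "paper-8n.pt",
    }), encoding="utf-8")
    print(resolve_checkpoint(str(cfg)).name)  # paper-8n.pt
```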
|
paper-8n.pt
ADDED
@@ -0,0 +1,3 @@
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:37fdbb04e49e253e427bfd34c7c4405a8763c79a25b25b42184691eeb749b064
|
3 |
+
size 6232494
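The three lines above are a Git LFS pointer rather than the weights themselves: they record the spec version, the SHA-256 digest of the real file, and its size in bytes. Below is a small sketch of parsing such a pointer and checking downloaded bytes against it. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, not part of the Git LFS tooling.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Check raw file bytes against the pointer's size and SHA-256 digest."""
    if pointer["algo"] != "sha256" or len(data) != pointer["size"]:
        return False
    return hashlib.sha256(data).hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:37fdbb04e49e253e427bfd34c7c4405a8763c79a25b25b42184691eeb749b064\n"
    "size 6232494\n"
)
print(pointer["size"] / 1e6)  # size in decimal megabytes
```

The recorded size, 6232494 bytes, is about 6.23 MB, which matches the single-model size quoted in the README.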
|