# Key Information Extraction

## Overview

Datasets for the key information extraction task should be organized in the following directory layout:

```text
└── wildreceipt
    ├── class_list.txt
    ├── dict.txt
    ├── image_files
    ├── test.txt
    └── train.txt
```

## Preparation Steps

### WildReceipt

- Download and extract [wildreceipt.tar](https://download.openmmlab.com/mmocr/data/wildreceipt.tar)

### WildReceiptOpenset

- Prepare [WildReceipt](#WildReceipt) first.
- Convert WildReceipt to the OpenSet format:

```bash
# Run the following command to list all available arguments:
# python tools/data/kie/closeset_to_openset.py -h
python tools/data/kie/closeset_to_openset.py data/wildreceipt/train.txt data/wildreceipt/openset_train.txt
python tools/data/kie/closeset_to_openset.py data/wildreceipt/test.txt data/wildreceipt/openset_test.txt
```

:::{note}
[This tutorial](../tutorials/kie_closeset_openset.md) explains the differences between the CloseSet and OpenSet data formats in more detail.
:::
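
Before running the conversion commands, it can help to confirm that the extracted archive matches the directory layout listed in the overview. The following is a minimal sketch, not part of MMOCR: the helper name `find_missing_entries` is hypothetical, and the `data/wildreceipt` location is an assumption taken from the paths used in the conversion commands.

```python
from pathlib import Path

# Entries taken from the directory tree shown in the overview.
EXPECTED_ENTRIES = (
    "class_list.txt",
    "dict.txt",
    "image_files",
    "test.txt",
    "train.txt",
)


def find_missing_entries(root):
    """Return the expected entries that do not exist under `root`.

    Hypothetical helper for illustration; it only checks that each
    name exists and does not validate file contents.
    """
    root = Path(root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    # Assumed dataset location, matching the paths in the conversion commands.
    missing = find_missing_entries("data/wildreceipt")
    if missing:
        print("Missing entries:", ", ".join(missing))
    else:
        print("Layout looks complete.")
```

An empty list means every expected file and the `image_files` directory are present; any names returned point to an incomplete download or extraction.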