https://github.com/BeyonderXX/InstructUIE
InstructUIE
Large language models have unlocked strong multi-task capabilities from reading instructive prompts. However, recent studies have shown that existing large models still have difficulty with information extraction tasks. For example, gpt-3.5-turbo achieved an F1 score of 18.22 on the Ontonotes dataset, which is significantly lower than the state-of-the-art performance. In this paper, we propose InstructUIE, a unified information extraction framework based on instruction tuning, which can uniformly model various information extraction tasks and capture the inter-task dependency. To validate the proposed method, we introduce IE INSTRUCTIONS, a benchmark of 32 diverse information extraction datasets in a unified text-to-text format with expert-written instructions. Experimental results demonstrate that our method achieves comparable performance to Bert in supervised settings and significantly outperforms the state-of-the-art and gpt3.5 in zero-shot settings.
Data
Our models are trained and evaluated on IE INSTRUCTIONS. You can download the data from Baidu NetDisk or Google Drive.
Citation
If you are using InstructUIE for your work, please kindly cite our paper:
@article{wang2023instructuie,
title={InstructUIE: Multi-task Instruction Tuning for Unified Information Extraction},
author={Wang, Xiao and Zhou, Weikang and Zu, Can and Xia, Han and Chen, Tianze and Zhang, Yuansen and Zheng, Rui and Ye, Junjie and Zhang, Qi and Gui, Tao and others},
journal={arXiv preprint arXiv:2304.08085},
year={2023}
}
- Downloads last month
- 31