
UIE (Universal Information Extraction)

Introduction

UIE (Universal Information Extraction) is a state-of-the-art (SOTA) method in PaddleNLP; you can see details here.
The paper is here.

Usage

I saved the UIE model as an entire model (ERNIE 3.0 backbone + start/end layers) rather than as a state dict, so you need to load it as follows:

1. Clone this model to your local path

git lfs install
git clone https://huggingface.co/xyj125/uie-base-chinese

If you don't have [git-lfs], you can also download the files manually by clicking [Files and versions] at the top of this card.

2. Load the model from the local path

import os
import torch
from transformers import AutoTokenizer

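uie_model = 'uie-base-zh'                                               # local model directory; adjust to where you cloned it
# The file holds a pickled full model, not a state dict.
# On PyTorch 2.6 or later, pass weights_only=False to torch.load.
model = torch.load(os.path.join(uie_model, 'pytorch_model.bin'))        # load UIE model
model.eval()                                                            # inference mode (disables dropout)
tokenizer = AutoTokenizer.from_pretrained('uie-base')                   # load tokenizer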
...

start_prob, end_prob = model(input_ids=batch['input_ids'],
                             token_type_ids=batch['token_type_ids'],
                             attention_mask=batch['attention_mask'])
print(f'start_prob ({type(start_prob)}): {start_prob.size()}')          # start_prob
print(f'end_prob ({type(end_prob)}): {end_prob.size()}')                # end_prob
...

Here is the model output (with batch_size=16, max_seq_len=256):

start_prob (<class 'torch.Tensor'>): torch.Size([16, 256])
end_prob (<class 'torch.Tensor'>): torch.Size([16, 256])
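
The two tensors give, for each token position, the probability that an extracted span starts or ends there. As a rough illustration of how these could be turned into spans, here is a minimal sketch that works on a single sequence converted to plain Python lists (for example via `start_prob[0].tolist()`). The function name `decode_spans`, the 0.5 threshold, and the "nearest end at or after the start" pairing rule are assumptions made for this example; this is not necessarily the decoding used by PaddleNLP.

```python
def decode_spans(start_prob, end_prob, threshold=0.5):
    """Pair each start position with the nearest end position at or after it.

    start_prob / end_prob: per-token probabilities for ONE sequence,
    given as plain Python lists of floats.
    threshold: hypothetical cut-off; tune it for your data.
    """
    starts = [i for i, p in enumerate(start_prob) if p > threshold]
    ends = [i for i, p in enumerate(end_prob) if p > threshold]

    spans = []
    for s in starts:
        # first candidate end index that is not before the start
        e = next((e for e in ends if e >= s), None)
        if e is not None:
            spans.append((s, e))
    return spans


# Toy probabilities for a 6-token sequence (made-up values for illustration)
start_prob = [0.01, 0.92, 0.03, 0.02, 0.88, 0.01]
end_prob = [0.02, 0.01, 0.95, 0.01, 0.04, 0.91]
print(decode_spans(start_prob, end_prob))  # [(1, 2), (4, 5)]
```

The returned indices are token positions (inclusive), so they still need to be mapped back to characters of the input text, for instance with the tokenizer's offset mapping.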