	## UIE (Universal Information Extraction)

	### Introduction

	UIE (Universal Information Extraction) is a state-of-the-art information extraction method available in PaddleNLP; see the details [here](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/model_zoo/uie).
	The paper is [here](https://arxiv.org/pdf/2203.12277.pdf).

	### Usage

	I saved UIE as an entire model (ERNIE 3.0 backbone + start/end layers), so you need to load it as follows:

	#### 1. Clone this model to your local path

	```sh
	git lfs install
	git clone https://huggingface.co/xyj125/uie-base-chinese
	```

	If you don't have `git-lfs`, you can also:

	* Download the files manually by clicking `Files and versions` at the top of this card.
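Another option, if you prefer to stay in Python, is the `huggingface_hub` package. The sketch below is not part of the original instructions: `download_model` is a hypothetical helper name, the repo id is taken from the clone URL above, and the target directory name is an assumption chosen to match what `git clone` would create.

```python
def download_model(repo_id: str = 'xyj125/uie-base-chinese',
                   local_dir: str = 'uie-base-chinese') -> str:
    """Download every file of the repo into local_dir and return that path."""
    # Imported lazily so the helper can be defined even if huggingface_hub
    # is not installed yet (pip install huggingface_hub).
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


# Usage (needs network access, so it is left commented out):
# path = download_model()
```

This fetches the large weight file without requiring `git-lfs` on your machine.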

	#### 2. Load this model from the local path

	```python
	import os
	import torch
	from transformers import AutoTokenizer

	uie_model = 'uie-base-chinese'  # local directory created by `git clone` above
	# The file stores the entire pickled model, not just a state dict.
	# On PyTorch >= 2.6, pass weights_only=False to torch.load for this to work.
	model = torch.load(os.path.join(uie_model, 'pytorch_model.bin'))  # load UIE model
	# load tokenizer (assumes the tokenizer files are in the same directory)
	tokenizer = AutoTokenizer.from_pretrained(uie_model)
	...

	start_prob, end_prob = model(input_ids=batch['input_ids'],
	                             token_type_ids=batch['token_type_ids'],
	                             attention_mask=batch['attention_mask'])
	print(f'start_prob ({type(start_prob)}): {start_prob.size()}')  # start_prob
	print(f'end_prob ({type(end_prob)}): {end_prob.size()}')  # end_prob
	...
	```
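The snippet above leaves the construction of `batch` elided. As a rough sketch only, one way to build it is to encode each (prompt, text) pair with the tokenizer, which is the input layout UIE uses in PaddleNLP. `build_batch` is a hypothetical helper, and the padding and truncation settings are assumptions chosen to match `max_seq_len=256`.

```python
from typing import Any, List


def build_batch(tokenizer: Any, prompts: List[str], texts: List[str],
                max_seq_len: int = 256) -> Any:
    """Encode (prompt, text) pairs into fixed-length tensors (hypothetical helper)."""
    return tokenizer(
        prompts,                 # first segment: what to extract, e.g. '时间' ("time")
        texts,                   # second segment: the text to search in
        padding='max_length',    # pad every sequence to max_seq_len
        truncation=True,         # truncate pairs that are too long
        max_length=max_seq_len,
        return_tensors='pt',     # return PyTorch tensors
    )


# Usage with the tokenizer loaded above (not executed here):
# batch = build_batch(tokenizer, ['时间'], ['2月8日上午北京冬奥会自由式滑雪女子大跳台决赛'])
```

With a BERT-style tokenizer the result typically contains `input_ids`, `token_type_ids`, and `attention_mask`, which are the three keys the model call expects.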

	Here is the output of the model (with batch_size=16, max_seq_len=256):
	```python
	start_prob (<class 'torch.Tensor'>): torch.Size([16, 256])
	end_prob (<class 'torch.Tensor'>): torch.Size([16, 256])
	```
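Each row of `start_prob` and `end_prob` holds one score per token position, so turning them into extracted spans needs a decoding step. The following is a simplified sketch, not necessarily the exact logic used in PaddleNLP: it assumes the scores are probabilities, keeps positions above a threshold (0.5 here, an assumed default), and pairs each start with the nearest end at or after it. `decode_spans` is a hypothetical helper and the numbers are made up.

```python
from typing import List, Tuple


def decode_spans(start_prob: List[float],
                 end_prob: List[float],
                 threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Pair each start position with the nearest end position at or after it."""
    starts = [i for i, p in enumerate(start_prob) if p > threshold]
    ends = [i for i, p in enumerate(end_prob) if p > threshold]
    spans = []
    for s in starts:
        # First end index that does not come before this start, if any.
        e = next((e for e in ends if e >= s), None)
        if e is not None:
            spans.append((s, e))
    return spans


# Toy scores for a single sequence of 6 tokens (made-up numbers).
toy_start = [0.01, 0.90, 0.02, 0.03, 0.80, 0.01]
toy_end = [0.01, 0.02, 0.95, 0.03, 0.70, 0.01]
spans = decode_spans(toy_start, toy_end)
print(spans)  # [(1, 2), (4, 4)]
```

For real model output you could pass one row at a time, for example `decode_spans(start_prob[0].tolist(), end_prob[0].tolist())`, and then map the token indices back to characters with the tokenizer's offsets.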