---
license: apache-2.0
---
# donut-base-ascii
This is ["naver-clova-ix/donut-base"](https://huggingface.co/naver-clova-ix/donut-base) with all non-ASCII tokens removed, which makes the model well suited to basic English use cases where the text consists primarily of `a-zA-Z0-9` and basic punctuation.
The original model, `"naver-clova-ix/donut-base"`, did not have a token for `"1"`, so one has been added. The notebook [remove-donut-tokens.ipynb](remove-donut-tokens.ipynb) details the whole process.
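The notebook is not reproduced here, but the core filtering idea can be sketched as below. This is a toy example: the allowed-character set, the handling of the SentencePiece word-boundary marker `▁` (U+2581), and the sample vocabulary are all assumptions for illustration, not the notebook's exact code.

```python
# Printable ASCII characters, plus the SentencePiece word-boundary
# marker "▁" (U+2581), which prefixes word-initial tokens and must
# be kept even though it is not itself ASCII.
ASCII_ALLOWED = {chr(c) for c in range(32, 127)}
ASCII_ALLOWED.add("\u2581")

def keep_token(token: str) -> bool:
    """True if every character in the token is printable ASCII
    (or the SentencePiece boundary marker)."""
    return all(ch in ASCII_ALLOWED for ch in token)

# Toy vocabulary standing in for the real Donut tokenizer vocab.
toy_vocab = ["\u2581hello", "world", "1", "\u2581café", "日本語", "!?"]
kept = [t for t in toy_vocab if keep_token(t)]
print(kept)  # ['▁hello', 'world', '1', '!?']
```

The real process also has to remap token ids and shrink the decoder's embedding and output layers to match the reduced vocabulary, which is what the notebook covers.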
The model has not been trained any further than the original.
I made a whole video about it: https://youtu.be/Uzr553x1gdM
I ran a quick generation speed test comparing this model against the default model with `bad_words_ids`. Even though the `bad_words_ids` list contained only 12k tokens, not the full 30k that were removed, generation was still noticeably slower than with this model.
The speed test script is [speed_test.py](speed_test.py), launched with [run_speed_tests.sh](run_speed_tests.sh).
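For reference, `bad_words_ids` in `transformers`' `generate()` takes a list of token-id lists, one inner list per banned sequence. A minimal sketch of building that structure, with made-up ids standing in for the ~12k real ones:

```python
# Hypothetical ids of non-ASCII tokens to ban at decode time
# (stand-ins for the ~12k ids used in the actual speed test).
removed_token_ids = [101, 205, 3071]

# generate() expects a list of token-id lists; each inner list is a
# sequence to ban. Banning single tokens means one-element lists.
bad_words_ids = [[tid] for tid in removed_token_ids]
print(bad_words_ids)  # [[101], [205], [3071]]

# Then passed to generation roughly as:
# model.generate(pixel_values, ..., bad_words_ids=bad_words_ids)
```

Because this constraint is checked at every decoding step, a large `bad_words_ids` list adds per-step overhead, which is why shrinking the vocabulary outright is faster, as the table below shows.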
| approach | time to generate 10 tokens |
| --- | --- |
| `"naver-clova-ix/donut-base"` | 205 ms |
| `"naver-clova-ix/donut-base"` + 12k `bad_words_ids` | 280 ms |
| `"donut-base-ascii"` | 195 ms |