This is a ViT (Vision Transformer) model pre-trained on Danbooru images. It may give better accuracy when used as a base model for image-processing tasks on anime-style illustrations.
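
As a rough illustration of what "used as a base model" can look like, here is a minimal sketch that attaches a new classification head to a ViT backbone and freezes the backbone. It assumes the weights can be loaded with Hugging Face Transformers' `ViTForImageClassification`, which this description does not confirm. `build_classifier` is a hypothetical helper name, and the checkpoint path is left for you to supply. When no checkpoint is given, the sketch falls back to a tiny randomly initialised ViT so that it runs offline.

```python
from typing import Optional

import torch
from transformers import ViTConfig, ViTForImageClassification


def build_classifier(checkpoint: Optional[str], num_labels: int) -> ViTForImageClassification:
    """Hypothetical helper: ViT classifier with a new head and a frozen backbone."""
    if checkpoint is None:
        # No checkpoint supplied: build a tiny random ViT (illustration only,
        # NOT the architecture of the pre-trained model described here).
        config = ViTConfig(
            image_size=32,
            patch_size=8,
            hidden_size=64,
            num_hidden_layers=2,
            num_attention_heads=4,
            intermediate_size=128,
            num_labels=num_labels,
        )
        model = ViTForImageClassification(config)
    else:
        # Assumption: the checkpoint is stored in Transformers format.
        # ignore_mismatched_sizes lets the head be re-initialised if its
        # size differs from the one saved in the checkpoint.
        model = ViTForImageClassification.from_pretrained(
            checkpoint,
            num_labels=num_labels,
            ignore_mismatched_sizes=True,
        )

    # Freeze the backbone so that only the classification head is trained.
    for param in model.vit.parameters():
        param.requires_grad = False
    return model


if __name__ == "__main__":
    model = build_classifier(None, num_labels=5)
    images = torch.randn(2, 3, 32, 32)  # dummy batch: 2 RGB images, 32x32
    with torch.no_grad():
        logits = model(pixel_values=images).logits
    print(logits.shape)
```

Freezing the backbone is only one option. If enough labelled illustrations are available, unfreezing some or all layers and training with a small learning rate may work better.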