--- language: - en tags: - stable-diffusion - text-to-image license: creativeml-openrail-m --- # Neopian-Diffusion Stable Diffusion models, starting with [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5), trained on images extracted from gifs from https://www.neopets.com/funimages.phtml. CLIP ViT-B/32 (OpenAI) was used to filter the best matching frame of the GIF for every given caption/GIF pair. The frame with the minimum spherical distance was chosen and saved for training. In total this amounts to 1950 images around 100x100px. The DreamBooth models were finetuned at 448x448px on a Colab T4 with the term "low-resolution" concatenated onto 1/3 of prompts, to hopefully combat artifacting in the final results (see this link for a hypothesis from someone on Discord about using negative terms while training Textual Inversions https://cdn.discordapp.com/attachments/1008246088148463648/1041538692432527470/image.png). Example chosen frame of GIF from CLIP | Caption | Unprocessed GIF | Chosen Frame | | --- | --- | --- | | "yurble_baby_clap" | ![](https://images.neopets.com/template_images/yurble_baby_clap.gif) | ![](https://cdn.discordapp.com/attachments/1010693530181718146/1043310485413576794/yurble_baby_clap.jpg) | ## Training Details The text encoder was trained along with the UNet at half precision for 15% of the total 8,000 steps (1,200 steps), and then the UNet was trained alone for the rest. I used a polynomial learning rate decay starting at 2e-6 (the default in fast-DreamBooth). ## How to use with `diffusers` library (section from [openjourney](https://huggingface.co/openjourney/openjourney)) ### Installing necessary libraries _NOTE: This model currently works on a computer which has at least one NVIDIA GPU with CUDA support_. ``` pip install diffusers transformers ftfy scipy accelerate ``` ### Logging in For logging in, you have to use `huggingface-cli login` command. ### Importing necessary libraries ```python import torch from torch import autocast from diffusers.models import AutoencoderKL from diffusers import StableDiffusionPipeline ``` ### Creating the pipeline ```python pipe = StableDiffusionPipeline.from_pretrained("doohickey-neopian-diffusion", use_auth_token=True) pipe = pipe.to("cuda") ``` ### (Optional) Disabling NSFW Filter _NOTE: Remember disabling this is not recommended, but since people had problems with some very basic prompts, we offer this. Remember AI art has a vast majority of users, so keep underage and sensitive users safe._ ```python def dummy(images, **kwargs): return images, False pipe.safety_checker = dummy ``` ### Image Generation ```python prompt = "my prompt" with autocast("cuda"): image = pipe(prompt=prompt, num_inference_steps=100, width=512, height=512, guidance_scale=15).images[0] image.save("image.png") ``` ## Neopets Copyright Notice "Don't forget, if you use these images on a non-Neopets page, you need to include our Copyright Notice." https://www.neopets.com/terms.phtml