---
license: creativeml-openrail-m
language:
- en
pipeline_tag: text-to-image
tags:
- stable-diffusion
- stable-diffusion-diffusers
inference: true
widget:
- text: >-
    masterpiece, best quality, 1girl, brown hair, green eyes, colorful, autumn,
    cumulonimbus clouds, lighting, blue sky, falling leaves, garden
  example_title: example 1girl
- text: >-
    masterpiece, best quality, 1boy, medium hair, blonde hair, blue eyes,
    bishounen, colorful, autumn, cumulonimbus clouds, lighting, blue sky,
    falling leaves, garden
  example_title: example 1boy
datasets:
- Linaqruf/anything-v3-1-dataset
library_name: diffusers
---

# Anything V3.1

![Anime Girl](https://huggingface.co/Linaqruf/hitokomoru-diffusion-v2/resolve/main/example_image/thumbnail.png)

Anything V3.1 is a third-party continuation of the latent diffusion model Anything V3.0. It is claimed to be an improved version of Anything V3.0, with a fixed VAE and a fixed CLIP position-id key; the CLIP reference is taken from Stable Diffusion V1.5. The VAE was swapped using Kohya's `merge-vae` script, and CLIP was fixed using Arena's `stable-diffusion-model-toolkit` webui extension.

Anything V3.2 is intended as resumed training of Anything V3.1. The current model was fine-tuned with a learning rate of `2.0e-6` for 50 epochs at a batch size of 4, on datasets collected from many sources, of which roughly a quarter is synthetic. The dataset was preprocessed with the [Aspect Ratio Bucketing Tool](https://github.com/NovelAI/novelai-aspect-ratio-bucketing) so that it can be converted to latents and trained at non-square resolutions. This model is meant as a test to see how the CLIP fix affects training. Like other anime-style Stable Diffusion models, it also supports Danbooru tags for generating images.

e.g. **_1girl, white hair, golden eyes, beautiful eyes, detail, flower meadow, cumulonimbus clouds, lighting, detailed sky, garden_**

- Use it with [`AUTOMATIC1111's Stable Diffusion Webui`](https://github.com/AUTOMATIC1111/stable-diffusion-webui), see [How-to-Use](#how-to-use)
- Use it with 🧨 [`diffusers`](#diffusers)


# Model Details

- **Currently maintained by:** Cagliostro Research Lab
- **Model type:** Diffusion-based text-to-image generation model
- **Model Description:** This is a model that can be used to generate and modify anime-themed images based on text prompts.
- **License:** [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/LICENSE-MODEL)
- **Finetuned from model:** Anything V3.1

## How-to-Use
- Download `Anything V3.1` [here](https://huggingface.co/Linaqruf/anything-v3-1/resolve/main/anything-v3-1.safetensors) or `Anything V3.2` [here](https://huggingface.co/Linaqruf/anything-v3-1/resolve/main/anything-v3-2.safetensors); both models are in `.safetensors` format.
- You need to adjust your prompt with aesthetic tags to get better results. You can use any generic negative prompt, or use the following suggested negative prompt to guide the model towards high-aesthetic generations:
```
lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
```
- The following tags should also be prepended to prompts to get high-aesthetic results:
```
masterpiece, best quality, illustration, beautiful detailed, finely detailed, dramatic light, intricate details
```
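
As a minimal sketch of this prompting convention, the quality tags can be prepended to the subject tags while the negative prompt is kept separate (the helper name and constants below are illustrative, not part of any library):

```python
# Suggested aesthetic tags from this model card (illustrative constants).
QUALITY_TAGS = ("masterpiece, best quality, illustration, beautiful detailed, "
                "finely detailed, dramatic light, intricate details")
NEGATIVE_PROMPT = ("lowres, bad anatomy, bad hands, text, error, missing fingers, "
                   "extra digit, fewer digits, cropped, worst quality, low quality, "
                   "normal quality, jpeg artifacts, signature, watermark, username, blurry")

def build_prompt(subject: str) -> str:
    """Hypothetical helper: prepend the suggested quality tags to the subject tags."""
    return f"{QUALITY_TAGS}, {subject}"

prompt = build_prompt("1girl, white hair, golden eyes, flower meadow")
print(prompt)
```

The resulting `prompt` and the `NEGATIVE_PROMPT` string can then be pasted into the webui's prompt fields or passed to a diffusers pipeline.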
## 🧨Diffusers

This model can be used just like any other Stable Diffusion model. For more information, please have a look at the [Stable Diffusion documentation](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion). You can also export the model to [ONNX](https://huggingface.co/docs/diffusers/optimization/onnx), [MPS](https://huggingface.co/docs/diffusers/optimization/mps) and/or FLAX/JAX. The pretrained model here is currently based on Anything V3.1.

Install the dependencies below to run the pipeline:

```bash
pip install diffusers transformers accelerate scipy safetensors
```
Running the pipeline (if you don't swap the scheduler, it will run with the pipeline's default scheduler; in this example we swap it to `DPMSolverMultistepScheduler`):

```python
import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

model_id = "cag/anything-v3-1"

# Load the pipeline in fp16 and swap in the DPMSolverMultistepScheduler (DPM-Solver++)
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

prompt = "masterpiece, best quality, high quality, 1girl, solo, sitting, confident expression, long blonde hair, blue eyes, formal dress"
negative_prompt = "lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry"

# autocast is not needed (and is discouraged) when the pipeline is already in fp16
image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    width=512,
    height=728,
    guidance_scale=12,
    num_inference_steps=50,
).images[0]

image.save("anime_girl.png")
```
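
The `guidance_scale=12` argument controls classifier-free guidance: the pipeline predicts noise twice, with and without the prompt, and extrapolates toward the prompt-conditioned prediction. A small sketch of that standard combination step (not the pipeline's actual internals, just the formula it applies):

```python
import numpy as np

def classifier_free_guidance(noise_uncond, noise_cond, guidance_scale):
    """Combine unconditional and prompt-conditioned noise predictions.
    guidance_scale > 1 pushes the sample toward the prompt; 1 disables guidance."""
    return noise_uncond + guidance_scale * (noise_cond - noise_uncond)

# Toy 2-element "noise predictions" to show the extrapolation
u = np.array([0.0, 1.0])
c = np.array([1.0, 1.0])
print(classifier_free_guidance(u, c, 12.0))  # -> [12.  1.]
```

Higher scales follow the prompt more literally at the cost of image diversity and, at extremes, saturation artifacts.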
## Limitations
- This model is overfitted and cannot follow prompts well even after the text encoder was fixed; this leads to lazy prompting, since you can get a good result just by typing `1girl`
- This is an anime-based model biased toward female anime characters; it is hard to generate masculine male characters without heavy prompting


## Example

Here are some cherry-picked samples and a comparison between the available models:

![Anime Girl](https://huggingface.co/Linaqruf/hitokomoru-diffusion-v2/resolve/main/example_image/cherry-picked-sample.png)

### Prompt and settings for Example Images
```
masterpiece, best quality, high quality, 1girl, solo, sitting, confident expression, long blonde hair, blue eyes, formal dress, jewelry, make-up, luxury, close-up, face, upper body.

Negative prompt: worst quality, low quality, medium quality, deleted, lowres, comic, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, jpeg artifacts, signature, watermark, username, blurry

Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 994051800, Size: 512x768, Model hash: ea61e913a0, Model: hitokomoru-v2, Batch size: 2, Batch pos: 0, Denoising strength: 0.6, Clip skip: 2, ENSD: 31337, Hires upscale: 1.5, Hires steps: 20, Hires upscaler: Latent (nearest-exact)
```
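
The `Hires upscale: 1.5` setting above implies a second-pass resolution derived from the base `Size: 512x768`. A small sketch of that arithmetic, assuming dimensions are rounded to multiples of 8 as Stable Diffusion latents require (the helper is hypothetical, not an A1111 API):

```python
def hires_size(width: int, height: int, upscale: float) -> tuple:
    """Hypothetical helper: compute the hires-fix target resolution,
    rounding each scaled dimension to the nearest multiple of 8."""
    def round8(v: float) -> int:
        return int(round(v / 8) * 8)
    return round8(width * upscale), round8(height * upscale)

print(hires_size(512, 768, 1.5))  # -> (768, 1152)
```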

## License

This model is open access and available to all, with a CreativeML OpenRAIL-M license further specifying rights and usage.
The CreativeML OpenRAIL License specifies:

1. You can't use the model to deliberately produce or share illegal or harmful outputs or content
2. The authors claim no rights on the outputs you generate; you are free to use them, and you are accountable for their use, which must not go against the provisions set in the license
3. You may re-distribute the weights and use the model commercially and/or as a service. If you do, please be aware you have to include the same use restrictions as the ones in the license and share a copy of the CreativeML OpenRAIL-M with all your users (please read the license entirely and carefully)

[Please read the full license here](https://huggingface.co/spaces/CompVis/stable-diffusion-license)


## Credit
Public domain.