---
language:
- ja
tags:
- text-to-image
- stable-diffusion
- japanese-stable-diffusion
pipeline_tag: text-to-image
license: other
extra_gated_prompt: By downloading, using, or distributing any portion or element of this model, you agree to be bound by the agreement described in the LICENSE file.
extra_gated_fields:
  Name: text
  Email: text
  Country: text
  Organization or Affiliation: text
  I allow Stability AI to contact me about information related to its models and research: checkbox
---

# Japanese Stable Diffusion XL

![image](./jsdxl.png)

## Model Details

Japanese Stable Diffusion XL (JSDXL) is a Japanese-specific [SDXL](https://arxiv.org/abs/2307.01952) model that accepts prompts in Japanese and generates Japanese-style images.

## Usage

```python
from diffusers import DiffusionPipeline
import torch

# trust_remote_code=True lets diffusers run the custom pipeline code shipped in the model repository.
pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/japanese-stable-diffusion-xl", trust_remote_code=True
)
pipeline.to("cuda")

# if using torch < 2.0
# pipeline.enable_xformers_memory_efficient_attention()

# Japanese prompt; in English: "Shiba Inu, colorful art"
prompt = "柴犬、カラフルアート"

# .images is a list of PIL images; take the first one.
image = pipeline(prompt=prompt).images[0]
```

## Model Details

* **Developed by**: [Stability AI](https://stability.ai/)
* **Model type**: Diffusion-based text-to-image generative model
* **Model Description**: This model is fine-tuned from [SDXL 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0). To maximize its understanding of the Japanese language and of Japanese culture and expressions while preserving the versatility of the pre-trained model, we performed parameter-efficient fine-tuning (PEFT) using one Japanese-specific compatible text encoder. As the PEFT method, we applied [Orthogonal Fine-tuning (OFT)](https://arxiv.org/abs/2306.07280) for better results and training stability.
* **License**: [STABILITY AI JAPANESE STABLE DIFFUSION XL COMMUNITY LICENSE](./LICENSE)

## Uses

### Direct Use

The model is intended for research purposes only.
Possible research areas and tasks include:

- Generation of artworks and use in design and other artistic processes.
- Applications in educational or creative tools.
- Research on generative models.
- Safe deployment of models which have the potential to generate harmful content.
- Probing and understanding the limitations and biases of generative models.

Excluded uses are described below.

### Out-of-Scope Use

The model was not trained to produce factual or true representations of people or events, so using it to generate such content is outside the scope of its abilities.

## Limitations and Bias

### Limitations

- The model does not achieve perfect photorealism.
- The model cannot render legible text.
- The model struggles with more difficult tasks that involve compositionality, such as rendering an image corresponding to “A red cube on top of a blue sphere”.
- Faces and people in general may not be generated properly.
- The autoencoding part of the model is lossy.

### Bias

While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases.

## How to cite

```bibtex
@misc{JSDXL,
    url    = {https://huggingface.co/stabilityai/japanese-stable-diffusion-xl},
    title  = {Japanese Stable Diffusion XL},
    author = {Shing, Makoto and Akiba, Takuya}
}
```