---
language:
  - ja
tags:
  - text-to-image
  - stable-diffusion
  - japanese-stable-diffusion
pipeline_tag: text-to-image
license: other
extra_gated_prompt: By downloading, using, or distributing any portion or element of this model, you agree to be bound by the agreement described in the LICENSE file.
extra_gated_fields:
  Name: text
  Email: text
  Country: text
  Organization or Affiliation: text
  I allow Stability AI to contact me about information related to its models and research: checkbox
---

# Japanese Stable Diffusion XL

![image](./jsdxl.png)

## Model Details

Japanese Stable Diffusion XL (JSDXL) is a Japanese-specific [SDXL](https://arxiv.org/abs/2307.01952) model that accepts prompts written in Japanese and generates Japanese-style images.

## Usage

```python
from diffusers import DiffusionPipeline
import torch

pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/japanese-stable-diffusion-xl", trust_remote_code=True
)

pipeline.to("cuda")

# if using torch < 2.0
# pipeline.enable_xformers_memory_efficient_attention()

prompt = "柴犬、カラフルアート"

# The pipeline returns a list of images; take the first one.
image = pipeline(prompt=prompt).images[0]
```
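
To generate images for several prompts and keep the results, the call above can be wrapped in a small helper. The following is a minimal sketch, not part of the official usage: the function name `generate_and_save`, the `outputs` directory, and the `jsdxl_###.png` file naming are hypothetical, and it assumes the pipeline returns an object whose `.images` list contains PIL-style images with a `save` method, as in the snippet above. Any extra keyword arguments are forwarded unchanged, so which ones are accepted depends on the pipeline itself.

```python
from pathlib import Path


def generate_and_save(pipeline, prompts, out_dir="outputs", **pipeline_kwargs):
    """Run `pipeline` once per prompt and save the first image of each call.

    `pipeline` is assumed to behave like the object loaded in the usage
    snippet: callable with `prompt=...` and returning a result with `.images`.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    saved = []
    for index, prompt in enumerate(prompts):
        # Same call as in the usage example, with optional extra arguments.
        image = pipeline(prompt=prompt, **pipeline_kwargs).images[0]
        file_path = out_path / f"jsdxl_{index:03d}.png"  # hypothetical naming scheme
        image.save(file_path)
        saved.append(file_path)
    return saved
```

With the `pipeline` object from the usage example, `generate_and_save(pipeline, ["柴犬、カラフルアート"])` would write a single PNG file under `outputs/` and return its path.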

## Model Details

* **Developed by**: [Stability AI](https://stability.ai/)
* **Model type**: Diffusion-based text-to-image generative model
* **Model Description**: This model is fine-tuned from [SDXL 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0). To maximize its understanding of the Japanese language and of Japanese culture and expressions while preserving the versatility of the pre-trained model, we performed parameter-efficient fine-tuning (PEFT) using one Japanese-specific, compatible text encoder. As the PEFT method, we applied [Orthogonal Fine-tuning (OFT)](https://arxiv.org/abs/2306.07280) for better results and training stability.
* **License**: [STABILITY AI JAPANESE STABLE DIFFUSION XL COMMUNITY LICENSE](./LICENSE)
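
The Orthogonal Fine-tuning idea mentioned in the model description can be illustrated with a small numerical sketch. This is a generic toy example of the technique, not the training code used for this model: the matrix sizes, the variable names, and the use of a single dense rotation (rather than any particular block structure or choice of layers) are all assumptions made for illustration. It builds an orthogonal matrix from unconstrained parameters via the Cayley transform, applies it to a frozen weight matrix, and checks that the angles between neurons (columns) are unchanged.

```python
import numpy as np


def cayley_orthogonal(params: np.ndarray) -> np.ndarray:
    """Map an unconstrained square matrix to an orthogonal one (Cayley transform)."""
    q = 0.5 * (params - params.T)  # skew-symmetric part: q.T == -q
    eye = np.eye(q.shape[0])
    # R = (I + Q)(I - Q)^{-1}; I - Q is always invertible for skew-symmetric Q.
    return (eye + q) @ np.linalg.inv(eye - q)


def pairwise_cosines(weight: np.ndarray) -> np.ndarray:
    """Cosine similarity between every pair of columns (neurons) of `weight`."""
    unit = weight / np.linalg.norm(weight, axis=0, keepdims=True)
    return unit.T @ unit


rng = np.random.default_rng(0)
weight = rng.normal(size=(4, 3))  # hypothetical frozen pretrained weight
params = rng.normal(size=(4, 4))  # hypothetical trainable parameters

rotation = cayley_orthogonal(params)
adapted = rotation @ weight  # fine-tuned weight: a rotated copy of the original

# The rotation is orthogonal, so angles between neurons are preserved.
print(np.allclose(rotation.T @ rotation, np.eye(4)))
print(np.allclose(pairwise_cosines(adapted), pairwise_cosines(weight)))
```

Only `params` would be trained in such a setup; the pretrained `weight` stays fixed and is transformed by the learned rotation.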


## Uses

### Direct Use

The model is intended for research purposes only. Possible research areas and tasks include:

- Generation of artworks and use in design and other artistic processes.
- Applications in educational or creative tools.
- Research on generative models.
- Safe deployment of models which have the potential to generate harmful content.
- Probing and understanding the limitations and biases of generative models.

Excluded uses are described below.

### Out-of-Scope Use

The model was not trained to produce factual or true representations of people or events, so using it to generate such content is out of scope for its abilities.

## Limitations and Bias

### Limitations

- The model does not achieve perfect photorealism.
- The model cannot render legible text.
- The model struggles with more difficult tasks that involve compositionality, such as rendering an image corresponding to “A red cube on top of a blue sphere”.
- Faces and people in general may not be generated properly.
- The autoencoding part of the model is lossy.

### Bias

While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases.


## How to cite

```bibtex
@misc{JSDXL, 
    url    = {https://huggingface.co/stabilityai/japanese-stable-diffusion-xl},
    title  = {Japanese Stable Diffusion XL}, 
    author = {Shing, Makoto and Akiba, Takuya}
}
```