Add safety module functionality to readme
#30
by natolambert - opened
README.md
CHANGED
@@ -180,6 +180,14 @@ Texts and images from communities and cultures that use other languages are like
 This affects the overall output of the model, as white and western cultures are often set as the default. Further, the
 ability of the model to generate content with non-English prompts is significantly worse than with English-language prompts.
 
+### Safety Module
+
+The intended use of this model is with the [Safety Checker](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/safety_checker.py) in Diffusers.
+This checker works by checking model outputs against known hard-coded NSFW concepts.
+The concepts are intentionally hidden to reduce the likelihood of reverse-engineering this filter.
+Specifically, the checker compares the class probability of harmful concepts in the embedding space of the `CLIPTextModel` *after generation* of the images.
+The concepts are passed into the model with the generated image and compared to a hand-engineered weight for each NSFW concept.
+
 
 ## Training
 
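For context on how the described safety module is exercised in practice, here is a minimal sketch using the Diffusers `StableDiffusionPipeline`, which loads the safety checker automatically when the model repository ships one. The model id `CompVis/stable-diffusion-v1-4` and the output file name are illustrative assumptions, not part of this PR:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load the pipeline; the bundled safety checker is enabled by default.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",  # assumed model id for illustration
    torch_dtype=torch.float16,
).to("cuda")

result = pipe("a photo of an astronaut riding a horse on mars")

# The pipeline output carries one boolean per image; images flagged by the
# safety checker are returned blacked out by the pipeline.
for i, (image, flagged) in enumerate(zip(result.images, result.nsfw_content_detected)):
    if not flagged:
        image.save(f"output_{i}.png")
```

The filtering itself happens inside the pipeline after generation, as the added README text describes, so downstream code only needs to inspect the `nsfw_content_detected` flags rather than call the checker directly.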