Trained for 0 epochs and 500 steps.

Trained with datasets ['text-embeds-pixart-nofilter', 'photo-concept-bucket']
Learning rate 4e-07, batch size 1, and 1 gradient accumulation step.
Used DDPM noise scheduler for training with epsilon prediction type and rescaled_betas_zero_snr=False
Using 'trailing' timestep spacing.
Base model: PixArt-alpha/PixArt-Sigma-XL-2-1024-MS
VAE: madebyollin/sdxl-vae-fp16-fix
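
The "0 epochs" figure follows from how little of the dataset 500 steps cover. Below is a minimal sketch of that arithmetic, using the effective batch size, GPU count, and approximate image count reported in the README further down. It assumes each optimizer step consumes exactly one effective batch and that every image counts once per epoch; the variable names are illustrative and not taken from the training code.

```python
# Values reported in the commit message and the README's training settings.
micro_batch_size = 1
gradient_accumulation_steps = 1
num_gpus = 8
training_steps = 500
dataset_images = 559_160  # approximate count listed for photo-concept-bucket

# Assumption: one optimizer step processes one effective batch.
effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus
samples_seen = training_steps * effective_batch_size
epoch_fraction = samples_seen / dataset_images

print(f"effective batch size: {effective_batch_size}")
print(f"samples seen: {samples_seen}")
print(f"fraction of one epoch: {epoch_fraction:.2%}")
```

Under these assumptions the run covers well under 1% of a single pass over the dataset, which is consistent with the epoch count being reported as 0.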

Files changed (8)

README.md +101 -0
optimizer.bin +3 -0
random_states_0.pkl +3 -0
scheduler.bin +3 -0
training_state-photo-concept-bucket.json +0 -0
training_state.json +1 -0
transformer/config.json +30 -0
transformer/diffusion_pytorch_model.safetensors +3 -0

README.md ADDED

	@@ -0,0 +1,101 @@

+---
+license: creativeml-openrail-m
+base_model: "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS"
+tags:
+  - stable-diffusion
+  - stable-diffusion-diffusers
+  - text-to-image
+  - diffusers
+  - full
+inference: true
+---
+# pixart-sigma
+This is a full rank finetune derived from [PixArt-alpha/PixArt-Sigma-XL-2-1024-MS](https://huggingface.co/PixArt-alpha/PixArt-Sigma-XL-2-1024-MS).
+The main validation prompt used during training was:
+```
+a cute anime character named toast holding a sign that says SOON, sitting next to a red square on her left side, and a transparent sphere on her right side
+```
+## Validation settings
+- CFG: `6.5`
+- CFG Rescale: `0.7`
+- Steps: `30`
+- Sampler: `ddpm`
+- Seed: `42`
+- Resolutions: `1024x1024,1152x960,896x1152`
+Note: The validation settings are not necessarily the same as the [training settings](#training-settings).
+<Gallery />
+The text encoder **was not** trained.
+You may reuse the base model text encoder for inference.
+## Training settings
+- Training epochs: 0
+- Training steps: 500
+- Learning rate: 4e-07
+- Effective batch size: 8
+  - Micro-batch size: 1
+  - Gradient accumulation steps: 1
+  - Number of GPUs: 8
+- Prediction type: epsilon
+- Rescaled betas zero SNR: False
+- Optimizer: AdamW, stochastic bf16
+- Precision: Pure BF16
+- Xformers: Not used
+## Datasets
+### photo-concept-bucket
+- Repeats: 0
+- Total number of images: ~559160
+- Total number of aspect buckets: 1
+- Resolution: 1.0 megapixels
+- Cropped: True
+- Crop style: center
+- Crop aspect: square
+## Inference
+```python
+import torch
+from diffusers import DiffusionPipeline
+model_id = "pixart-sigma"
+prompt = "a cute anime character named toast holding a sign that says SOON, sitting next to a red square on her left side, and a transparent sphere on her right side"
+negative_prompt = "malformed, disgusting, overexposed, washed-out"
+# Pick the best available device once and reuse it for the pipeline and the generator.
+device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
+pipeline = DiffusionPipeline.from_pretrained(model_id)
+pipeline.to(device)
+image = pipeline(
+    prompt=prompt,
+    # Pass the negative prompt defined above instead of an empty string.
+    negative_prompt=negative_prompt,
+    num_inference_steps=30,
+    generator=torch.Generator(device=device).manual_seed(1641421826),
+    width=1152,
+    height=768,
+    guidance_scale=6.5,
+    guidance_rescale=0.7,
+).images[0]
+image.save("output.png", format="PNG")
+```

optimizer.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3c85d8a2d1c97fe78d2306ffefdf9dd68c4e5ed997dd0037c69d22b6586efcee
+size 3665677155

random_states_0.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3c8e9a343abcece94a75918726ea0ae31aa7e80c90cc8eb4a3cff5cb7062d3fb
+size 16100

scheduler.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:49246353654cfb564d254bc11935f6ff1ba736f5d3d0b93bdad25d62d10b67ac
+size 1000

training_state-photo-concept-bucket.json ADDED

The diff for this file is too large to render.

training_state.json ADDED

	@@ -0,0 +1 @@


+{"global_step": 500, "epoch_step": 500, "epoch": 1, "exhausted_backends": [], "repeats": {}}

transformer/config.json ADDED

	@@ -0,0 +1,30 @@

+{
+  "_class_name": "PixArtTransformer2DModel",
+  "_diffusers_version": "0.30.0.dev0",
+  "_name_or_path": "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS",
+  "activation_fn": "gelu-approximate",
+  "attention_bias": true,
+  "attention_head_dim": 72,
+  "attention_type": "default",
+  "caption_channels": 4096,
+  "cross_attention_dim": 1152,
+  "double_self_attention": false,
+  "dropout": 0.0,
+  "in_channels": 4,
+  "interpolation_scale": 2,
+  "norm_elementwise_affine": false,
+  "norm_eps": 1e-06,
+  "norm_num_groups": 32,
+  "norm_type": "ada_norm_single",
+  "num_attention_heads": 16,
+  "num_embeds_ada_norm": 1000,
+  "num_layers": 28,
+  "num_vector_embeds": null,
+  "only_cross_attention": false,
+  "out_channels": 8,
+  "patch_size": 2,
+  "sample_size": 128,
+  "upcast_attention": false,
+  "use_additional_conditions": false,
+  "use_linear_projection": false
+}
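
Several values in this config are related to one another, and a quick consistency check can be sketched as follows. The dictionary copies a subset of the fields above. The 8x spatial downscale factor of the VAE is an assumption based on the SDXL-family VAE named in the commit message, not something this config states, and `VAE_DOWNSCALE` is a name introduced here for illustration.

```python
# Subset of fields copied from transformer/config.json above.
config = {
    "num_attention_heads": 16,
    "attention_head_dim": 72,
    "cross_attention_dim": 1152,
    "in_channels": 4,
    "out_channels": 8,
    "patch_size": 2,
    "sample_size": 128,
}

# Assumption: the VAE compresses each spatial dimension by a factor of 8.
VAE_DOWNSCALE = 8

# Hidden width of the transformer: heads times per-head dimension.
inner_dim = config["num_attention_heads"] * config["attention_head_dim"]

# Pixel resolution implied by the latent sample size.
pixel_size = config["sample_size"] * VAE_DOWNSCALE

# Number of patch tokens for a square latent of sample_size x sample_size.
tokens_per_image = (config["sample_size"] // config["patch_size"]) ** 2

print(f"inner dim: {inner_dim} (cross_attention_dim: {config['cross_attention_dim']})")
print(f"implied pixel resolution: {pixel_size}x{pixel_size}")
print(f"patch tokens per image: {tokens_per_image}")
```

The hidden width matches `cross_attention_dim`, and the implied resolution agrees with the 1024 in the base model's name. The `out_channels` value being twice `in_channels` is consistent with a model that outputs a variance term alongside the noise prediction, though that reading is an interpretation rather than something the config spells out.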

transformer/diffusion_pytorch_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:17df19bafac045f1612004a7c7f19e5e6d90004e7ca841b9a491fad08e3a797e
+size 1221780352
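
The three-line entries shown for the binary files are Git LFS pointers rather than the files themselves. As a rough illustration, the sketch below parses the pointer above and turns its size into a parameter estimate. `parse_lfs_pointer` is a hypothetical helper written for this example, and the estimate assumes the weights are stored as 2-byte bf16 values (in line with the "Pure BF16" precision in the README) and ignores the safetensors header.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash, and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the diff above (without the leading "+").
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:17df19bafac045f1612004a7c7f19e5e6d90004e7ca841b9a491fad08e3a797e
size 1221780352
"""

pointer = parse_lfs_pointer(pointer_text)

# Assumption: weights are stored in bf16, i.e. 2 bytes per parameter.
BYTES_PER_PARAM = 2
approx_params = pointer["size"] // BYTES_PER_PARAM

print(f"file size: {pointer['size'] / 1e9:.2f} GB")
print(f"approximate parameters: {approx_params / 1e6:.0f}M")
```

The result is only an upper-bound estimate, since part of the file is metadata, but it lands in the range expected for a model of roughly 0.6 billion parameters.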