---
library_name: diffusers
license: creativeml-openrail-m
datasets:
- vwu142/Pokemon-Card-Plus-Pokemon-Actual-Image-And-Captions-13000
language:
- en
---
# Fine-Tuned Pokemon Generator Model Card
This model was fine-tuned on a dataset of Pokemon and Pokemon card images, with Stable Diffusion v2-1 as the base model.
Most of the documentation in the base model's repo still applies; this card only adds the details specific to the fine-tuning.
Base Model Repo: https://huggingface.co/stabilityai/stable-diffusion-2-1
Dataset: https://huggingface.co/datasets/vwu142/Pokemon-Card-Plus-Pokemon-Actual-Image-And-Captions-13000
# Stable Diffusion v2-1 text2image fine-tuning - vwu142/fine-tuned-pokemon-and-pokemon-card-generator-13000
The model was fine-tuned on the vwu142/Pokemon-Card-Plus-Pokemon-Actual-Image-And-Captions-13000 dataset. Some example images are shown below.
![img_0](./image_0.png)
![img_1](./image_1.png)
![img_2](./image_2.png)
## How to Get Started with the Model
```python
# Build the pipeline from the fine-tuned model on Hugging Face
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

pipeline = DiffusionPipeline.from_pretrained("vwu142/fine-tuned-pokemon-and-pokemon-card-generator-13000")
# Swap in the DPM-Solver multistep scheduler, reusing the existing scheduler config
pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config)
pipeline = pipeline.to("cuda")

# Image generation
prompt = "A Pokemon Card of the format tag team,with pokemon of type dragon and ghost with the title Gratina in the Tag Team form from Sun & Moon with an Electric type Pikachu as the buddy of the Tag Team"
images = pipeline(prompt).images  # list of PIL images
images
```
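The snippet above only displays the result in a notebook. To keep the generated images, they can be written to disk. The following is a minimal sketch, assuming `images` is the list returned by `pipeline(prompt).images`; the helper name `save_images`, the `outputs` directory, and the `pokemon` file prefix are hypothetical and not part of this repo.

```python
from pathlib import Path


def save_images(images, out_dir="outputs", prefix="pokemon"):
    """Save each image as <out_dir>/<prefix>_<index>.png and return the paths.

    `images` is assumed to be a list of objects with a PIL-style
    `save(path)` method, such as the list in `pipeline(prompt).images`.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)  # create the folder if needed
    saved = []
    for index, image in enumerate(images):
        target = out_path / f"{prefix}_{index}.png"
        image.save(target)
        saved.append(target)
    return saved
```

Calling `save_images(images)` after the generation step would write `outputs/pokemon_0.png`, and so on for each image in the list.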
## Training Details
### Training Procedure
The weights were trained on the free GPU provided by Google Colab.
The training data comes from this dataset:
https://huggingface.co/datasets/vwu142/Pokemon-Card-Plus-Pokemon-Actual-Image-And-Captions-13000
It contains images of Pokemon cards and Pokemon, each paired with a caption describing the image.
#### Training Hyperparameters
```python
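# Note: --max_train_steps counts optimizer update steps, not epochs,
# even though the variable passed to it below is named max_training_epochs.
!accelerate launch diffusers/examples/text_to_image/train_text_to_image.py \
--pretrained_model_name_or_path=$MODEL_NAME \
--dataset_name=$dataset_name --caption_column="caption" \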
--use_ema \
--use_8bit_adam \
--resolution=512 --center_crop --random_flip \
--train_batch_size=1 \
--gradient_accumulation_steps=8 \
--gradient_checkpointing \
--mixed_precision="fp16" \
--max_train_steps=$max_training_epochs \
--learning_rate=1e-05 \
--max_grad_norm=1 \
--lr_scheduler="constant" --lr_warmup_steps=0 \
--output_dir="pokemon-card-model"
```
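With `--train_batch_size=1` and `--gradient_accumulation_steps=8`, each optimizer update sees 8 images, and `--max_train_steps` is measured in those updates. The sketch below works through that arithmetic. It assumes a single GPU and a dataset of 13,000 examples; the 13,000 figure is taken from the dataset name, not verified, and the helper names are hypothetical.

```python
import math


def effective_batch_size(train_batch_size, gradient_accumulation_steps, num_processes=1):
    """Number of images that contribute to one optimizer update."""
    return train_batch_size * gradient_accumulation_steps * num_processes


def steps_per_epoch(num_examples, train_batch_size, gradient_accumulation_steps, num_processes=1):
    """Optimizer updates needed for one pass over the dataset."""
    batches = math.ceil(num_examples / (train_batch_size * num_processes))
    return math.ceil(batches / gradient_accumulation_steps)


# Values from the training command; num_examples is an assumption
# based on the "13000" in the dataset name.
batch = effective_batch_size(train_batch_size=1, gradient_accumulation_steps=8)
updates = steps_per_epoch(num_examples=13000, train_batch_size=1, gradient_accumulation_steps=8)

print(batch)    # 8
print(updates)  # 1625
```

Under these assumptions, a value of 1625 for `--max_train_steps` would correspond to roughly one pass over the data.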