	---
	library_name: diffusers
	license: creativeml-openrail-m
	datasets:
	- vwu142/Pokemon-Card-Plus-Pokemon-Actual-Image-And-Captions-13000
	language:
	- en
	---

	# Fine-Tuned Pokemon Generator Model Card

	This model was fine-tuned from Stable Diffusion v2-1 (the base model) on a dataset of Pokemon and Pokemon card images.

	Most of the documentation in the base model's repo still applies; this card documents the fine-tuning.

	Base Model Repo: https://huggingface.co/stabilityai/stable-diffusion-2-1

	Dataset: https://huggingface.co/datasets/vwu142/Pokemon-Card-Plus-Pokemon-Actual-Image-And-Captions-13000

	# Stable Diffusion v2-1 text2image fine-tuning - vwu142/fine-tuned-pokemon-and-pokemon-card-generator-13000
	The model was fine-tuned on the vwu142/Pokemon-Card-Plus-Pokemon-Actual-Image-And-Captions-13000 dataset. Some example images are shown below.

	![img_0](./image_0.png)
	![img_1](./image_1.png)
	![img_2](./image_2.png)

	## How to Get Started with the Model

	```python
	# Build the pipeline from the fine-tuned model on Hugging Face
	from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

	pipeline = DiffusionPipeline.from_pretrained("vwu142/fine-tuned-pokemon-and-pokemon-card-generator-13000")
	pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config)
	pipeline = pipeline.to("cuda")

	# Image generation
	prompt = "A Pokemon Card of the format tag team,with pokemon of type dragon and ghost with the title Gratina in the Tag Team form from Sun & Moon with an Electric type Pikachu as the buddy of the Tag Team"
	images = pipeline(prompt).images
	images[0]  # in a notebook, this displays the first generated image

	```
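The snippet above can be extended to make generations reproducible and to write the results to disk. The following is a minimal sketch, not part of the original training or inference notebook: the helper names `generate` and `save_images`, the output directory, and the default values for the seed, step count, and guidance scale are all illustrative assumptions.

```python
from pathlib import Path


def save_images(images, out_dir="outputs", prefix="pokemon"):
    """Save each image as <prefix>_<index>.png and return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, img in enumerate(images):
        path = out / f"{prefix}_{i}.png"
        img.save(path)  # PIL.Image.Image.save
        paths.append(path)
    return paths


def generate(prompt, seed=0, steps=25, guidance_scale=7.5, n_images=1):
    """Generate images with a fixed seed (all defaults are assumptions)."""
    # Heavy imports stay local so save_images works without a GPU stack.
    import torch
    from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

    pipeline = DiffusionPipeline.from_pretrained(
        "vwu142/fine-tuned-pokemon-and-pokemon-card-generator-13000"
    )
    pipeline.scheduler = DPMSolverMultistepScheduler.from_config(
        pipeline.scheduler.config
    )
    pipeline = pipeline.to("cuda")

    # A seeded generator makes repeated runs produce the same images.
    generator = torch.Generator(device="cuda").manual_seed(seed)
    result = pipeline(
        prompt,
        num_inference_steps=steps,
        guidance_scale=guidance_scale,
        num_images_per_prompt=n_images,
        generator=generator,
    )
    return result.images


# Example usage (requires a CUDA GPU and downloads the model weights):
# images = generate("A Pokemon Card of an Electric type Pikachu", seed=42)
# save_images(images)
```

Keeping the seed fixed while changing only the prompt makes it easier to compare how prompt wording affects the generated card.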

	## Training Details

	### Training Procedure

	The weights were trained on the free GPU provided by Google Colab.

	The data it was trained on comes from this dataset:
	https://huggingface.co/datasets/vwu142/Pokemon-Card-Plus-Pokemon-Actual-Image-And-Captions-13000

	It contains images of Pokemon cards and Pokemon, each paired with a text caption describing the image.
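To get a feel for the captions before writing prompts, the dataset can be inspected with the `datasets` library. This is a rough sketch under stated assumptions: the `caption` column name is taken from the training command in this card, while the `train` split name and the helper names `preview_captions` and `load_training_rows` are assumptions made for illustration.

```python
from itertools import islice


def preview_captions(rows, column="caption", limit=3):
    """Return the first `limit` captions from an iterable of dict-like rows."""
    return [row[column] for row in islice(rows, limit)]


def load_training_rows(split="train"):
    """Stream the dataset from the Hub instead of downloading it in full."""
    # ASSUMPTION: the dataset exposes a "train" split.
    from datasets import load_dataset

    return load_dataset(
        "vwu142/Pokemon-Card-Plus-Pokemon-Actual-Image-And-Captions-13000",
        split=split,
        streaming=True,
    )


# Example usage (requires network access):
# for caption in preview_captions(load_training_rows()):
#     print(caption)
```

Prompts that follow the phrasing of the training captions are likely to work best, since that is the text the model saw during fine-tuning.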

	#### Training Hyperparameters
	```python
	!accelerate launch diffusers/examples/text_to_image/train_text_to_image.py \
	--pretrained_model_name_or_path=$MODEL_NAME \
	--dataset_name=$dataset_name --caption_column="caption" \
	--use_ema \
	--use_8bit_adam \
	--resolution=512 --center_crop --random_flip \
	--train_batch_size=1 \
	--gradient_accumulation_steps=8 \
	--gradient_checkpointing \
	--mixed_precision="fp16" \
	--max_train_steps=$max_training_epochs \
	--learning_rate=1e-05 \
	--max_grad_norm=1 \
	--lr_scheduler="constant" --lr_warmup_steps=0 \
	--output_dir="pokemon-card-model"
	```
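Two details of this command are easy to misread. First, `--train_batch_size=1` combined with `--gradient_accumulation_steps=8` means each optimizer update averages gradients over 8 images. Second, `--max_train_steps` counts optimizer steps, so the value of `$max_training_epochs` is interpreted as a number of steps rather than epochs. The arithmetic can be sketched as below; the figure of 13,000 images is an assumption inferred only from the dataset name, and a single GPU process is assumed because training ran on one Colab GPU.

```python
import math


def effective_batch_size(train_batch_size, gradient_accumulation_steps, num_processes=1):
    """Number of samples that contribute to each optimizer update."""
    return train_batch_size * gradient_accumulation_steps * num_processes


def optimizer_steps_per_epoch(num_images, train_batch_size,
                              gradient_accumulation_steps, num_processes=1):
    """Optimizer updates needed for one full pass over the dataset."""
    batches = math.ceil(num_images / (train_batch_size * num_processes))
    return math.ceil(batches / gradient_accumulation_steps)


# Values taken from the training command above.
batch = effective_batch_size(train_batch_size=1, gradient_accumulation_steps=8)

# ASSUMPTION: 13,000 images, inferred only from the dataset name.
steps = optimizer_steps_per_epoch(13_000, train_batch_size=1,
                                  gradient_accumulation_steps=8)

print(batch, steps)  # prints "8 1625"
```

Under these assumptions, a `--max_train_steps` value below 1625 would stop training before the model has seen every image once.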