Model description

This model is a fine-tuned version of pszemraj/long-t5-tglobal-base-16384-book-summary, trained on a small custom dataset. The dataset was built by feeding kmfoda/booksum into GPT-3.5-turbo with a carefully tuned prompt that asks for high-quality Stable Diffusion prompts. It contains roughly 15k entries and cost less than $10 in OpenAI credits to generate, so it serves as a proof of concept.

The goal was a text summarization model that produces decent Stable Diffusion prompts, comparable to those written by a human or by a high-end LLM such as GPT-4.

Example generations from an excerpt of Hemingway:

this model: village in late summer, river and plain, mountains, pebbled boulders, blue water, troops marching, dusty trees, soldiers marching along road, crops rich with fruit trees, battle in the mountains, artillery flashes, cool nights, highly detailed, dramatic lighting

GPT-4: desert landscape with camel caravan at sunset, nomad tents, sand dunes, oasis, traditional clothing, dramatic lighting, 8k UHD, highly detailed, masterpiece, digital painting, global illumination

This is a VERY rough proof-of-concept model that could be greatly improved by a higher quality dataset and possibly different hyperparameters.
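
The card does not include inference code, so the following is only a minimal sketch of how a passage might be turned into a prompt with the Transformers library. The function name generate_sd_prompt and the generation settings (beam count, output length) are assumptions chosen for illustration, not the author's settings. The 16384-token input limit is taken from the base model's name.

```python
MODEL_ID = "vahn9995/longt5-stable-diffusion-prompt"


def generate_sd_prompt(text, max_input_tokens=16384, max_new_tokens=128):
    """Summarize a passage into a Stable Diffusion prompt (illustrative helper)."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate very long inputs to the assumed context limit.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )

    # Beam search settings here are assumptions, not values from the card.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling generate_sd_prompt(passage) on a book excerpt should return a comma-separated prompt in the style of the example above. The first call downloads the model weights.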

Training procedure

Training ran for 7 epochs using a modified version of the Hugging Face run_summarization.py training script.

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 4
eval_batch_size: 4
seed: 42
distributed_type: multi-GPU
num_devices: 2
gradient_accumulation_steps: 6
total_train_batch_size: 48
total_eval_batch_size: 8
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 7.0
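
As a quick check on the list above, the total batch sizes follow from the per-device sizes, the device count, and the gradient accumulation steps. The sketch below reproduces that arithmetic. The dictionary keys loosely mirror Hugging Face TrainingArguments field names, and the helper effective_batch_sizes is a hypothetical name used only for illustration.

```python
# Values copied from the hyperparameter list above.
hparams = {
    "learning_rate": 2e-4,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "gradient_accumulation_steps": 6,
    "num_devices": 2,
    "seed": 42,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 7.0,
}


def effective_batch_sizes(h):
    """Return (total_train_batch_size, total_eval_batch_size)."""
    # One optimizer step sees every device's batch, accumulated over several passes.
    train = (
        h["per_device_train_batch_size"]
        * h["num_devices"]
        * h["gradient_accumulation_steps"]
    )
    # Evaluation does not accumulate gradients, so only the device count matters.
    evaluation = h["per_device_eval_batch_size"] * h["num_devices"]
    return train, evaluation


total_train, total_eval = effective_batch_sizes(hparams)
print(total_train, total_eval)  # 48 8
```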

Training results

Training Loss	Epoch	Step	Validation Loss
2.453	0.28	30	2.0444
2.2692	0.56	60	1.8970
2.1485	0.84	90	1.8373
2.0469	1.12	120	1.8033
1.9954	1.4	150	1.7762
1.9778	1.68	180	1.7593
1.9536	1.96	210	1.7472
1.8524	2.24	240	1.7306
1.8438	2.52	270	1.7255
1.8436	2.8	300	1.7140
1.7765	3.08	330	1.7049
1.7537	3.36	360	1.7057
1.7328	3.64	390	1.6977
1.723	3.92	420	1.6973
1.6592	4.2	450	1.7058
1.6563	4.48	480	1.7034
1.6443	4.76	510	1.6969
1.5782	5.04	540	1.6953
1.509	5.32	570	1.7136
1.5516	5.6	600	1.7064
1.558	5.88	630	1.7045
1.5016	6.16	660	1.7182
1.5288	6.44	690	1.7111
1.4665	6.72	720	1.7030
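
The validation loss in the table stops improving well before the final step while the training loss keeps falling. The sketch below parses the logged rows and locates the evaluation step with the lowest validation loss. It is an illustrative analysis of the numbers above, not part of the original training script, and the names Row and best are introduced here.

```python
from collections import namedtuple

# Columns: training loss, epoch, step, validation loss (copied from the table above).
RESULTS = """\
2.453 0.28 30 2.0444
2.2692 0.56 60 1.8970
2.1485 0.84 90 1.8373
2.0469 1.12 120 1.8033
1.9954 1.4 150 1.7762
1.9778 1.68 180 1.7593
1.9536 1.96 210 1.7472
1.8524 2.24 240 1.7306
1.8438 2.52 270 1.7255
1.8436 2.8 300 1.7140
1.7765 3.08 330 1.7049
1.7537 3.36 360 1.7057
1.7328 3.64 390 1.6977
1.723 3.92 420 1.6973
1.6592 4.2 450 1.7058
1.6563 4.48 480 1.7034
1.6443 4.76 510 1.6969
1.5782 5.04 540 1.6953
1.509 5.32 570 1.7136
1.5516 5.6 600 1.7064
1.558 5.88 630 1.7045
1.5016 6.16 660 1.7182
1.5288 6.44 690 1.7111
1.4665 6.72 720 1.7030
"""

Row = namedtuple("Row", ["train_loss", "epoch", "step", "val_loss"])


def parse_row(line):
    """Convert one whitespace-separated table line into a Row."""
    train_loss, epoch, step, val_loss = line.split()
    return Row(float(train_loss), float(epoch), int(step), float(val_loss))


rows = [parse_row(line) for line in RESULTS.strip().splitlines()]

# The row with the lowest validation loss marks the best logged evaluation point.
best = min(rows, key=lambda r: r.val_loss)
print(best.step, best.epoch, best.val_loss)  # 540 5.04 1.6953
```

In this log the minimum falls at step 540 (epoch 5.04). Later evaluations are slightly worse, which suggests that the last epochs add little on this dataset.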

Framework versions

Transformers 4.36.0.dev0
PyTorch 2.1.1+cu118
Datasets 2.15.0
Tokenizers 0.15.0

Model: vahn9995/longt5-stable-diffusion-prompt
