A new idea: style browsing library
I recently found an artist-style prompt library. It works well, does not require complex prompts, and is suitable for most mainstream models.
Project website: https://github.com/SupaGruen/StableDiffusion-CheatSheet
I think it would be great to integrate its database into Guernika.app.
It uses a simple JSON format to store data, like this:
{"Type":"1","Name":"Abbey, Edwin Austin","Born":"1852","Death":"1911","Prompt":"style of Edwin Austin Abbey","NPrompt":"","Category":"Illustration, Painting, Oil, Pastel, Ink, USA, 19th Century","Checkpoint":"Deliberate 2.0","Extrainfo":"","Image":"Edwin-Austin-Abbey.webp","Creation":"202306200852"}
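A minimal Python sketch of how an app could read one such record and filter by category. The `StyleEntry` class and both helper functions are hypothetical names. Treating `NPrompt` as the negative prompt and `Category` as a comma-separated tag list is an assumption based only on the sample record above, not on the project's documentation.

```python
import json
from dataclasses import dataclass
from typing import List

# The sample record quoted above, split across lines for readability.
RAW = (
    '{"Type":"1","Name":"Abbey, Edwin Austin","Born":"1852","Death":"1911",'
    '"Prompt":"style of Edwin Austin Abbey","NPrompt":"",'
    '"Category":"Illustration, Painting, Oil, Pastel, Ink, USA, 19th Century",'
    '"Checkpoint":"Deliberate 2.0","Extrainfo":"",'
    '"Image":"Edwin-Austin-Abbey.webp","Creation":"202306200852"}'
)


@dataclass
class StyleEntry:
    """Hypothetical in-app representation of one library record."""
    name: str
    prompt: str
    negative_prompt: str  # assumed meaning of "NPrompt"
    categories: List[str]
    checkpoint: str


def parse_entry(record: dict) -> StyleEntry:
    # Assumption: "Category" is a comma-separated list of tags.
    tags = [t.strip() for t in record.get("Category", "").split(",") if t.strip()]
    return StyleEntry(
        name=record["Name"],
        prompt=record["Prompt"],
        negative_prompt=record.get("NPrompt", ""),
        categories=tags,
        checkpoint=record.get("Checkpoint", ""),
    )


def filter_by_category(entries: List[StyleEntry], category: str) -> List[StyleEntry]:
    # Case-insensitive exact match on a single tag.
    wanted = category.casefold()
    return [e for e in entries if any(t.casefold() == wanted for t in e.categories)]


entry = parse_entry(json.loads(RAW))
print(entry.prompt, entry.categories)
```

The `Checkpoint` field is kept because it records which model the sample image was made with, which could be shown as a hint when a style does not work on the current model.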
I made a simple interface mockup for this, and I hope it can be adopted.
Hey @andykoko! Sorry for the late response, I have been working on a new update that will, hopefully, improve the creation UI in Guernika and add support for Stable Diffusion XL :D and multiple ControlNets. Here are a couple of images in case you want a sneak peek:
I would love your feedback on this and any other suggestions you have for this screen. I want to focus on this one at the moment and improve the collection manager in a different update.
When I first saw this idea I thought it might not work, because some artists/styles may not be available in every model and it could be frustrating if you try to use one and it's not working. But the data does seem really great, and at worst it can give people some inspiration on what to try.
The thing is, would it make sense to be able to access this directly from the Create tab instead? I'm not sure how but maybe that would be the best way to work on a prompt, maybe this could take over the left side of the screen.
If not, maybe this could be a companion app or do you think it's worth it to package it together?
Yes, that will be the prompt history. It will show the last 10 prompts used, and below the field is a row with the most used "keywords".
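As a rough illustration of that behaviour (not Guernika's actual code), a bounded history plus a keyword counter could look like this in Python. The `PromptHistory` class is a hypothetical name, and splitting prompts on commas to obtain "keywords" is an assumption.

```python
from collections import Counter, deque


class PromptHistory:
    """Keeps the most recent prompts and counts keyword usage."""

    def __init__(self, max_prompts=10):
        # deque with maxlen silently drops the oldest prompt when full.
        self._recent = deque(maxlen=max_prompts)
        self._keyword_counts = Counter()

    def record(self, prompt):
        self._recent.append(prompt)
        # Assumption: keywords are the comma-separated parts of a prompt.
        for part in prompt.split(","):
            keyword = part.strip().lower()
            if keyword:
                self._keyword_counts[keyword] += 1

    def recent(self):
        # Newest prompt first.
        return list(reversed(self._recent))

    def top_keywords(self, n=5):
        return [kw for kw, _ in self._keyword_counts.most_common(n)]


history = PromptHistory()
for i in range(12):
    history.record(f"photo, subject {i}")
print(len(history.recent()), history.top_keywords(1))
```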
The problem with that is that the style library is huge! Maybe having that and a separate tab to explore it 🤔 I first have to explore what data is actually available.
@GuiyeC For multiple ControlNets, I think it is necessary to provide a way to scale and translate the input images. For example, I want to make the walking characters in the sample smaller than the cat. That way, multiple input pictures can be combined better.
I'm a photographer, and I'm looking forward to features like an infinite canvas that integrates inpainting and outpainting, which is where the real productivity gain is (like the latest version of PS).
@andykoko that would be the goal. My idea is to have a new "Canvas" tab where you just run whatever model you want wherever you want and then export the result. It won't be in this update, but hopefully not long from now.
Guernika 6 should be out now with a few nice things, I'll keep working on it!
@GuiyeC Thank you for your work. Once again, you are ahead of Apple's official implementation. 👏
I briefly tried version 6.0.
Bug:
- Adjusting the window height crashes the app, every time.
- The new painting function is great. It supports all models and does not affect the unselected area, but it does not automatically crop input pictures that do not match the required size, so the following error is shown:
For input feature 'z', the provided shape 1 × 3 × 1536 × 1024 is not compatible with the model's feature description.
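One way an app could avoid this kind of shape mismatch is to center-crop the input to the model's aspect ratio before resizing it. The Python sketch below only computes the crop rectangle. It assumes the shape in the error is batch × channels × height × width, and the 1024 × 1024 target is a hypothetical model input size, not something confirmed in this thread. `cover_crop_box` is an illustrative helper, not part of Guernika.

```python
def cover_crop_box(src_w, src_h, dst_w, dst_h):
    """Return (left, top, width, height) of the largest centered region
    of the source image that has the target aspect ratio."""
    if min(src_w, src_h, dst_w, dst_h) <= 0:
        raise ValueError("all dimensions must be positive")
    # Compare aspect ratios with integer cross-multiplication.
    if src_w * dst_h > src_h * dst_w:
        # Source is wider than the target: keep full height, trim the sides.
        crop_h = src_h
        crop_w = src_h * dst_w // dst_h
    else:
        # Source is taller (or equal): keep full width, trim top and bottom.
        crop_w = src_w
        crop_h = src_w * dst_h // dst_w
    left = (src_w - crop_w) // 2
    top = (src_h - crop_h) // 2
    return left, top, crop_w, crop_h


# Input from the error message, read as height 1536 and width 1024,
# cropped for a hypothetical 1024 x 1024 model input.
box = cover_crop_box(src_w=1024, src_h=1536, dst_w=1024, dst_h=1024)
print(box)
```

The cropped region would then be resized to the exact model size with whatever image library the app already uses.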
Suggestion:
The settings area UI of version 5.0 seems neater and more beautiful.
apple/ml-stable-diffusion has a new PR that seems to denoise the preview picture without adding time.
https://github.com/apple/ml-stable-diffusion/pull/210
Other:
I used the latest GuernikaModelConverter 5.0 to convert SDXL 0.9 Base, but I don't know why the results differ so much. With the same number of steps, resolution and keywords, the generated pictures are nowhere near the level of dreamstudio.ai. Photographic keywords are interpreted as comics, the picture quality is very poor, and there are many mosaic artifacts.
Update
I found that the random seeds used by dreamstudio.ai are basically 6 digits, so I set the seeds of Guernika to 6 digits and got similar pictures, but the picture quality is still very poor.
@andykoko thanks for the response :)
Bugs:
- Do you mean the main window? What tab are you in when it crashes? I'm not able to reproduce this, so I will check whether I got a crash report from Apple.
- This should also be working. I will take a look at models with different sizes. Does img2img work for you with that model?
Suggestions:
- How would you arrange it? Do you have anything in mind?
- This will be in the next update, I had to add it to a few more samplers.
Stable Diffusion XL vs DreamStudio:
As far as I can tell, the outputs of the Python implementation and Guernika are pretty similar. Here is the same image generated with both; they look almost the same:
This uses the inputs I saw in your picture with the PNDM sampler. Some samplers generate bigger differences, and I will try to take a look at those.
It could be that DreamStudio already has an updated model or that they are running the refiner on every generation. The refiner model should work in Guernika too, as a basic img2img model. It could also have something to do with the sampler being used. Is that configurable in DreamStudio?
@GuiyeC It seems to be a problem with the sampler. PNDM gets good results, while the picture quality of DDIM and DPM-Solver++ is very poor. PNDM seems to require 40-50 steps.
The following is the picture from my new test (no refiner). The picture quality and color are quite good. I think 6-digit random numbers are more suitable as SDXL seeds. You can test it.
prompt: Photographic, Beautiful girl standing in the garden
negative prompt: blurry, grainy, low-resolution
steps: 40
seed: 810145
guidance scale: 5
@GuiyeC I noticed that the remaining time is displayed twice in Guernika v6, and the estimate is far off, especially for img2img, so it is of little use as a reference. I think it would be better to replace it with the elapsed duration?
That would make it easy to analyze the impact of different parameter settings on generation time.
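To illustrate the idea of recording elapsed duration per run, here is a small Python sketch. The `timed` helper and the `durations` dictionary are hypothetical, and `time.sleep` stands in for an actual generation call.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the wrapped block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


durations = {}
with timed("PNDM, 40 steps", durations):
    time.sleep(0.01)  # stand-in for running the pipeline

for label, seconds in durations.items():
    print(f"{label}: {seconds:.2f} s")
```

Collecting one entry per settings combination makes it straightforward to compare samplers or step counts side by side.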
@GuiyeC
I made an Apple Shortcut that generates Stable Diffusion XL style prompts. If there is no plan to add style presets, this is also a good option.
https://github.com/czkoko/SDXL-Style-Presets-shortcut
@GuiyeC
I found a project that might help improve Guernika's performance, though I'm not sure. DrawThings relies on it to improve its performance.
https://github.com/philipturner/metal-flash-attention
About that project, I did see it, but I think they are using a completely different implementation. I will have to take a better look, but I don't think it will be applicable to these models.
👏 It would be more beautiful if the corresponding icon could be added.
https://github.com/Stability-AI/StableStudio/tree/main/packages/stablestudio-ui/public/presets
Dang, that is dope! Though these aren't really suitable as icons given the fidelity of the image. At small sizes it would be hard to distinguish the styles, meaning the list would have to be pretty big to show the differences between each style visually. Perhaps a preview image showing the style next to the dropdown list might be better, but this would also overcomplicate the UI a bit.
My bad, I see you posted that mockup 16 days ago. That does indeed look and work a lot better.
@GuiyeC
That was really fast. It looks good. Don't forget to include the style prompt words in the displayed prompt word count.
I think img2img and ControlNet are better suited to the layout below, as before.
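A rough Python sketch of the word-count point above: count the prompt after the style has been applied, not before. The `{prompt}` placeholder format and the example style text are assumptions for illustration, and whitespace splitting is only a stand-in for the model's real tokenizer.

```python
def apply_style(template, prompt):
    # Assumption: a style preset is a template containing "{prompt}".
    return template.replace("{prompt}", prompt)


def count_words(text):
    # Rough proxy for a token count; a real tokenizer will differ.
    return len(text.split())


style = "cinematic photo of {prompt}, 35mm, film grain"  # hypothetical preset
prompt = "Beautiful girl standing in the garden"
full_prompt = apply_style(style, prompt)

print(count_words(prompt), count_words(full_prompt))
```

Showing the second number keeps the counter honest when a style adds many words of its own.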
@GuiyeC that works too, though at a glance the mockup from @andykoko lets the user view all the options at once and choose the one they like. In a situation where there are lots of styles (say, 40+), that would perhaps work better.
However, for what we have right now, that works great and is functioning as intended. I like the ability to expand/collapse that field too; it means the UI will be less visually cluttered and overwhelming. Nice work!
@andykoko what do you mean? Having a row for Img2Img and then a row for each ControlNet?
The problem with that is that you can expand the right column to work on longer prompts comfortably, but the images have an aspect ratio that is not suitable for this. I think it makes sense to have them in a row below.
Wow, I missed the party, sorry for asking the same thing everywhere else.
@andykoko kind of opened my eyes with the cheatsheet, and I'm enjoying going back to the basics.
I like the "Style Picker" @GuiyeC added for SDXL.
Could a similar, customizable one exist for SD1.5 models? I'd like to pin multiple tokens to the textbox at once.
A search field would be great in the style picker too!
Those ideas are very cool and make my Mac shine!