In Stable Diffusion, the term img2img stands for image-to-image.
Unlike text-to-image generation, where you start from a text prompt alone, img2img takes an existing image as the starting point.
You combine that image with a text prompt, and a parameter called Denoising controls how much influence the prompt has over the result.
The Denoising parameter ranges from 0 to 1. At 0, the input image comes back essentially unchanged; at 1, the prompt takes over and the original image has very little influence. Values in between blend the two.
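To make the 0-to-1 scale a bit more concrete, here is a minimal sketch of how many img2img implementations turn the Denoising value into actual work: they noise the input image part of the way and then run only that fraction of the sampling steps. This is an assumption about typical behavior (Hugging Face diffusers computes it roughly like this, and some UIs have settings that change it), and `effective_steps` is a hypothetical helper name, not part of any library.

```python
def effective_steps(sampling_steps: int, denoising_strength: float) -> int:
    """Approximate number of denoising steps actually run in img2img.

    Assumption: the implementation runs only the last
    `sampling_steps * denoising_strength` steps of the schedule.
    """
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0 and 1")
    return min(int(sampling_steps * denoising_strength), sampling_steps)


# With 20 sampling steps: 0 leaves the image alone, 1 runs the full schedule.
for strength in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"denoising={strength:.2f} -> {effective_steps(20, strength)} of 20 steps")
```

Under this assumption, a low value simply does not give the prompt many steps to change anything, which is why the composition survives.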
1/ I created the initial image in Midjourney. It’s a baby bulldog wearing a dragon onesie.
2/ In Stable Diffusion, I used the same text prompt “bulldog wearing a dragon onesie, painting by kandinsky” to influence the image generation. By varying the Denoising parameter from 0 to 1 in 0.2 intervals, I got these results.
3-8/ Here are all 6 images produced with the different denoising values, with every other setting kept the same (including the seed).
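If you would rather script this kind of sweep than click through it six times, the sketch below shows one way to do it. It assumes the web UI was started with its API enabled (the `--api` flag in Automatic1111) and is listening on the default local address, and that the `/sdapi/v1/img2img` endpoint accepts the field names shown. The seed value and the placeholder image bytes are made up for illustration; they are not the ones used for the images above.

```python
import base64
import json
import urllib.request

# Assumed default address of a locally running web UI with the API enabled.
API_URL = "http://127.0.0.1:7860/sdapi/v1/img2img"


def build_payloads(image_bytes, prompt, seed, strengths):
    """Build one request body per denoising value; everything else stays fixed."""
    init_image = base64.b64encode(image_bytes).decode("ascii")
    return [
        {
            "init_images": [init_image],
            "prompt": prompt,
            "seed": seed,  # same seed for every run, so only denoising varies
            "denoising_strength": strength,
        }
        for strength in strengths
    ]


def run_sweep(payloads, url=API_URL):
    """Send each payload to the web UI and return the first image of each result."""
    images = []
    for payload in payloads:
        request = urllib.request.Request(
            url,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            result = json.load(response)
        images.append(base64.b64decode(result["images"][0]))
    return images


# 0 to 1 in steps of 0.2: 0.0, 0.2, 0.4, 0.6, 0.8, 1.0
strengths = [i / 5 for i in range(6)]
payloads = build_payloads(
    b"placeholder image bytes",  # replace with the bytes of your starting image
    "bulldog wearing a dragon onesie, painting by kandinsky",
    seed=12345,  # hypothetical seed
    strengths=strengths,
)
```

Calling `run_sweep(payloads)` would then return the six results in order, assuming the server is running. Nothing is sent until you call it.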
Very generally speaking, if the intention is to keep the image composition relatively intact while letting the text prompt make some changes, the sweet spot tends to be around 0.45 to 0.65. You will need to experiment with this. Both the text prompt and the checkpoint model used will greatly affect the ideal denoising value. What is “ideal” is also entirely subjective, and depends heavily on your intentions.
9-10/ These were made with WebUI Forge, a fork of Automatic1111’s popular web UI. It has the same ease of use as Automatic1111, but with the memory efficiency of ComfyUI. It’s developed by Lvmin Zhang, aka lllyasviel, one of the authors of ControlNet, a generative image technology that has fundamentally changed how Stable Diffusion is used. ControlNet is a deep topic, and it often shows up in commercial AI products under “clever marketing names,” but for the nerds around the world who have used CN since the very beginning, no commercialized “pretty name” can hide the real hero behind the “innovation.”
This is also a wish I have for humanity: to never stop imagining. I believe AI is an important technology because it allows those who don’t have artistic abilities to visualize their dreams and imagination, so that we can revisit the imagination we had as children.