Stable Diffusion with NightCafe
NightCafe was the first app I used to play with generative AI for images. I just checked the log of my creations, and that was 2 years ago. Back then the models were not very good, especially not for tango. But it was one of only a few options, and it was fun enough to fiddle with a bit. And it was “free”: you get 5 credits every day if you log in, so you can create a picture or two per day.
This is what that looked like:
There’s tango inside these pictures, but it’s not photorealistic, and it’s mostly weird. Still, you can already see the different styles that are possible. What you don’t see yet, and would not expect if you have followed my series of posts so far, is the following:
The above screenshot shows the options you have on your own images. Do you see the DNA symbol? This is where it gets really interesting: you can mutate an existing image and use it as the starting point for additional pictures. Do you remember our tests with DALL-E, where we would always get new random creations each time? With Stable Diffusion we don’t have that restriction! We will explore that a bit later, so stay tuned.
Step by step
Let’s start the process like we did in the previous articles, with the simplest prompt. Since this is NightCafe, it is not quite that simple, because we are confronted with options from the first second. See below for the edit screen in basic mode.
The first thing you need to choose is a model. The default model is good, but you can also use fine-tuned models with different training datasets (e.g. there is a specialized model for creating manga) and different styles.
As text prompt, we’ll take “tango dancers”.
Here’s the default output. Are you surprised? We have tango, we have glamor, and we have poses. Very cliché, no? These are the default settings, of course, and it’s the look and feel that we expect from AI-generated images.
Let’s try the same prompt that ChatGPT generated for us yesterday. It doesn’t work as is, because the word “intimate” is not allowed. I replaced it with “connected”, and then NightCafe started to render images.
A milonguero dance couple in the midst of a social dance at a milonga in Buenos Aires. The setting is vibrant and authentic, filled with dancers. The couple is in close embrace, capturing the intimate and intricate style of milonguero dancing. The man is dressed in traditional attire, and the woman is in a stylish dress, both moving gracefully on the dance floor. The background shows other dancers and the lively atmosphere of the milonga, with vintage decorations and warm lighting.
Prompt generated by ChatGPT in the previous post.
You can see that the pictures are different from what we got with DALL-E. There is tango inside, but also enough weirdness. The third image in particular has weird legs and arms. Even the first image, which looks almost like a real (?) milonguero couple, has issues.
Let’s take the fourth image and try to make it better using the DNA button, which copies the same settings over to the editor and also adds the image as an input.
Notice the new section called “Evolving”.
We will use the same seed so that we stay as close as possible to the same picture. If we did not choose the same seed, only the composition and setup of the image would be used as a starting point, which is also an option with your own images.
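To see why reusing a seed matters, here is a toy sketch in Python. It does not use NightCafe or Stable Diffusion at all. It only assumes that image generation starts from pseudo-random noise, and that the seed determines that noise. The function name `starting_noise` and the seed values are made up for illustration.

```python
import random


def starting_noise(seed: int, size: int = 8) -> list[float]:
    """Return a toy 'noise' vector from a generator seeded with `seed`.

    Hypothetical stand-in for the random starting point of an image;
    real models use far larger noise tensors.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


same_a = starting_noise(1234)
same_b = starting_noise(1234)
other = starting_noise(1235)

# Same seed: identical starting noise, so the result can be reproduced.
print(same_a == same_b)
# Different seed: a different starting point, hence a different picture.
print(same_a == other)
```

With the same seed and the same settings you start from the same noise, so any change in the result comes from what you changed in the prompt or options.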
Activate “Advanced Settings”.
This enables us to use a negative prompt, which lists elements that we don’t want to see. We start with the default negative prompt and add more text.
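If you keep your prompts in a text file or script, extending a negative prompt can be sketched like this. This is only an illustration: the default text below is a placeholder, not NightCafe’s actual default, and both the comma-separated format and the helper `extend_negative_prompt` are assumptions of mine.

```python
def extend_negative_prompt(default: str, extra: list[str]) -> str:
    """Append extra terms to a comma-separated negative prompt.

    Terms already present (ignoring case) are skipped.
    """
    terms = [t.strip() for t in default.split(",") if t.strip()]
    seen = {t.lower() for t in terms}
    for term in extra:
        cleaned = term.strip()
        if cleaned and cleaned.lower() not in seen:
            terms.append(cleaned)
            seen.add(cleaned.lower())
    return ", ".join(terms)


# Placeholder default; NightCafe's real default text is not reproduced here.
default_negative = "blurry, deformed"
negative = extend_negative_prompt(
    default_negative, ["extra fingers", "Blurry", "fused arms"]
)
print(negative)  # blurry, deformed, extra fingers, fused arms
```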
And here are the results:
You can see that there is still variation between the images, but the first image is quite close to the starting image we chose. The scenery is very similar, and the general idea of the couple is the same: a follower in a red dress, and a leader in a tuxedo with a similar face, beard, and hairstyle. But it is also slightly different, and still broken. Check out the hands and the number of fingers, as well as the arm connections between the two dancers.
There are so many more options to use and optimize. It can get fun but also frustrating quickly.
Let’s round this post off with a few variations in style, by adding a style attribute to the text prompt, but keeping everything else the same, especially the starting image and the seed.
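The idea of changing only the style can be written down as a small Python sketch. Everything in it is hypothetical: `RenderSettings`, the file name, the seed value, and the style names are placeholders for illustration, not NightCafe settings or the styles used in this post.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RenderSettings:
    """Hypothetical bundle of the settings we keep fixed between runs."""
    prompt: str
    seed: int
    start_image: str


def style_variations(base: RenderSettings, styles: list[str]) -> list[RenderSettings]:
    """Return one copy of `base` per style, changing only the prompt text."""
    return [replace(base, prompt=f"{base.prompt}, {style}") for style in styles]


base = RenderSettings(
    prompt="tango dancers",
    seed=1234,                      # placeholder seed
    start_image="start_image.png",  # placeholder file name
)

for variant in style_variations(base, ["watercolor", "film noir", "oil painting"]):
    print(variant.seed, variant.start_image, "|", variant.prompt)
```

Because the seed and the starting image never change, differences between the results can be attributed to the style attribute alone.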