Generating tango images (part 2)

Bing and DALL-E-3

The world of visual generative AI models, and of the apps built around them, has exploded in the last year. So far, many of these models have not been good enough for tango visuals, but this is changing as we speak. And while image generation was very niche two years ago, today it’s as easy as downloading an app to your mobile phone. Well, almost that easy: often the apps are not free, since the AI models require considerable computational power, and someone needs to pay for that. Most services offer subscriptions.

In this post, I want to show you a free option based on DALL-E-3, the most recent version of OpenAI’s image-generating AI model. It is available through Microsoft’s Bing tools, and it is open to end users, not just as part of Microsoft’s corporate offerings.

When you go to https://www.bing.com/images/create, you are greeted with the following page, which asks you to describe your image and to log in. You need a free Microsoft account, e.g. a Hotmail account.

Essentially, this is all there is to know:

Go to your model’s interface.
Write some text that describes the image.
Create an image.

Landing page of Bing's DALL-E interface.

Let’s get started!

First prompt: “tango”

This is the most basic prompt you could ever use to create a tango visual. After about a minute you are presented with four images. The generation is based on randomness, so you might see something different from the images below. There is a certain coherence within each set and its style, but if you run the same prompt again, the results will be different every time. This also means that the images created are unique and not reproducible. (There are mechanisms to make the process less random, and we might cover them in a later part of this series.)

And here is the second run of the same prompt “tango”. You can see that the idea of tango is there again, but the motif, the style, and the viewpoint are different. Some of the hands even look good!

First variations of your images

Well, here’s the bad news. In many other genAI image generators you can further modify your images based on one of your outputs, but not in Bing.

So the only thing you can do is to modify your prompt. What is a prompt?

The AI we are using is a text-to-image generator: it takes your text and generates an image that fits the description. So a descriptive prompt is needed. Good prompts follow a certain structure, just like language follows grammar. Be as specific with details as you can, and hopefully the AI will understand. Prompt design and prompt engineering are young skills, and we don’t really understand everything yet.

Subject – action – visual style
A general prompt structure.

Here are some examples:

a tango couple in front of the pyramids as a pencil sketch

a tango couple in front of a shopping mall as a photorealistic rendering

As you can see, the general intention of a prompt like “a tango couple in front of a shopping mall as a photorealistic rendering” is realized by the generative model, but maybe not perfectly. In this case, I was thinking of something else when I wrote “photorealistic rendering”, and you probably were, too. This is where prompt engineering comes into play:

Build a prompt.
Try it out once or twice.
Adjust the words.
Repeat.

Don’t make the prompt too complicated in the beginning; start with a general idea. Then make it more specific, step by step, as you get a feel for the process.
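
If you like to keep your experiments organized, the prompt structure and the build, try, adjust loop can be sketched in a few lines of Python. This is only a minimal sketch for producing prompt text: the function name build_prompt and the word lists are made up for illustration, and the middle part is a setting rather than an action, as in the examples in this post. It does not talk to Bing in any way; you still paste each prompt into the Image Creator page by hand.

```python
from itertools import product


def build_prompt(subject: str, action_or_setting: str, style: str) -> str:
    """Join the three parts of the prompt structure into one line of text.

    Hypothetical helper for illustration; it only builds a string.
    """
    return f"{subject} {action_or_setting} as {style}"


# Illustrative word lists, taken from the example prompts in this post.
subjects = ["a tango couple"]
settings = ["in front of the pyramids", "in front of a shopping mall"]
styles = ["a pencil sketch", "a photorealistic rendering"]

# Every combination of subject, setting, and style becomes one prompt to try.
prompts = [
    build_prompt(subject, setting, style)
    for subject, setting, style in product(subjects, settings, styles)
]

for prompt in prompts:
    print(prompt)
```

Changing one list at a time, for example only the styles, makes it easier to see which words actually change the resulting images.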

Some links to get you started developing better prompts:

https://www.geeky-gadgets.com/dalle-3-prompts-advanced-guide/
You will also find plenty of help on YouTube.

Copyright?

I am not a lawyer, but the terms of service say that Microsoft does not claim ownership of the images you create. However, using the service requires that you grant Microsoft a broad license to use your prompts and the generated images.

Ownership of Content. Microsoft does not claim ownership of Prompts, Creations, or any other content you provide, post, input, or submit to, or receive from, Image Creator (including feedback and suggestions). However, by using Image Creator, posting, uploading, inputting, providing, or submitting content you are granting Microsoft and its affiliated companies permission to use the Prompts, Creations, and related content in connection with the operation of their businesses (including, without limitation, all Microsoft Services), including, without limitation, the license rights to: copy, distribute, transmit, publicly display, publicly perform, reproduce, edit, translate and reformat the Prompts, Creations, and other content you provide; and the right to sublicense such rights to any supplier of Image Creator.

Microsoft does not claim ownership of Prompts, Creations, customizations, instructions, or any other content you provide, post, input, or submit to, or receive from, the Online Services (including feedback and suggestions). However, by using the Online Services, posting, uploading, inputting, providing or submitting content you are granting Microsoft, its affiliated companies and third party partners permission to use the Prompts, Creations, customizations (including GPTs) , and related content in connection with the operation of its businesses (including, without limitation, all Microsoft Services), including, without limitation, the license rights to: copy, distribute, transmit, publicly display, publicly perform, reproduce, edit, translate and reformat the Prompts, Creations, and other content you provide; and the right to sublicense such rights to any supplier of the Online Services. Your use of the Online Services does not grant you any ownership rights in any underlying technologies, intellectual property, or other data that comprise or support the Online Services.

You warrant and represent that you own or otherwise control all of the rights to your content as described in this Agreement including, without limitation, all the rights necessary for you to provide, post, upload, input or submit the content.
Section “Ownership” from Microsoft’s terms (Last Updated: November 14, 2023)

This post is part of a series on using generative AI to create tango visuals:
Generating tango images (part 1)

@danieldekay