Stable Diffusion recently announced its public beta, so I signed up and took the simple route of generating images via DreamStudio. I tried a few unusual instructions, such as a banana swimming across a pond in an English village.
Then I tried simpler instructions, such as ‘Two cows eating in a field’. The image above is the result.
In case you are not familiar with AI and what it can do, the idea is that one gives it verbal instructions and it looks in its bank of understanding of the connections between words and visual representations – and then acts on the instructions.
For some reason it decided that four cows are better than two, and it joined the horns of the two cows on the right of the image.
So what do I think of it? I think the tones and shading of the hides of the two cows on the right are really very pleasant. And the cow-ness of the shapes reminds me of primitive paintings or paintings from the early Middle Ages.
The pastoral scene of the grass and trees is faithful to what one might see in real life, and so overall, a good job.
This photo below, on the other hand, is a photo of a real scene.
When I gave Stable Diffusion an instruction to draw a tall thin man chasing a ball down a Manhattan street, it did it, and here it is.
Why is it in black and white? Which word in that instruction told the AI to render the scene in monochrome? Why are there two men? Are they chasing the ball or is the ball chasing them?
We all know from experience that there can be a gulf of misunderstanding between what a person believes they are communicating and what the recipient believes they are receiving.
That said, humans have got pretty good at understanding instructions like ‘Pass me the hammer’ or ‘Put the tube in the hole at the top.’
Can we trust AI to reach the same level of understanding?
Person: “Turn the rocket motor off.”
AI: “I have many variations of the verb ‘to turn’ and decided that you meant to turn the rocket motor ninety degrees anticlockwise. Was that correct? I am unsure.”