How AI creates 3D photos and stereoscopic images

3D image composition in 3DJournal DX, original picture: GetImg.ai

Generating images through one of the many artificial intelligences that have sprung up in the last year is becoming increasingly popular, so logically you would think: What if we had AI create a stereoscopic (3D) photo for us? And it's just a step from asking the question to typing the first prompt.

Let's say two things at the outset: With AI, you can create 3D images. But it's probably not as straightforward as you might think.

At the beginning of our "3D AI" journey, we asked whether anyone had had any positive experiences with 3D creation through AI (Artificial Intelligence). The result didn't really surprise us: you can find plenty of anaglyph 3D images all over the web, especially from Midjourney, but when you put on a pair of blue and red glasses, the spatial effect turns out to be missing. The AI produces an image with blue-green and red "ghosts", but it somehow lacks the "understanding" of what they are good for and of how the spatiality of the image should be encoded in them.

Let's test AI

But there are also artificial intelligences, or, actually, their authors, who boast that their models can handle stereoscopy. And that's how we found Civitai. In our first test, we did not modify their standard prompt much and anxiously awaited the result.

Prompt: <lora:lora3DAnaglyphImage_lora3DLight:1> colorful, highly detailed beautiful girl in forest, sharp focus, 2k, 4k, 8k, hdr, highres, absurdres, best quality, sharp, smooth, cinematic lighting, detailed background, extremely detailed, powerful impression, hyperdetailed, hyperrealistic, CG, unity, polished, high-definition

Negative prompt: bad-hands-5, bad_prompt_version2, By bad artist-neg, verybadimagenegative_v1.1-6400, Unspeakable-Horrors-Composition-4v, EasyNegative

The first pictures came out (you can see one of them below) and of course we experimented further.

One of the first results of anaglyph creation in Civitai

The second time we used a simplified prompt:

<lora:lora3DAnaglyphImage_lora3DLight:1> colorful, highly detailed beautiful girl in a modern city, sharp focus, 2k, 4k, 8k, hdr, best quality, sharp, smooth, cinematic lighting, detailed background, extremely detailed, powerful impression, high-definition

Negative prompt: bad-hands, By bad artist

More tests and more experiments followed. Is there any way to summarize the result? Well: it looks like an anaglyph, but we don't observe much of a 3D effect through the glasses. Or, perhaps, a little. Is the wish father to the thought here? The promise that this AI can handle stereoscopic images seems rather hollow.

Let's ask Bing about 3D

So let's try Bing. We gave it a query: Can you draw a stereoscopic picture? And the result?

In truth, a bit confusing: Dall-E, the AI system used to generate the image, created split images, as if separately for the left and right eye, but at the same time each of them is an anaglyph, i.e. an image that is already intended for both eyes. Interestingly, its anaglyphs, when viewed through blue-red (or cyan-red) glasses, actually appear to have some depth.

A bit of a confused attempt at a 3D image from Bing.

Encouraged by our partial success, we try it a little differently: Hello, please, paint an anaglyph where is a nice young girl in the foreground and a forest in background.

This time, surprisingly, Bing only offered us three images - and although they look like anaglyphs at first glance, you can't actually see the 3D effect when you put the glasses on. So, here again, the AI may be mimicking anaglyphs, but it doesn't "get" (yes, that may not be quite the right word) their principle.

Since Bing did better with architecture in experiment number one, let's try another assignment: Hello, please, paint an anaglyph of a huge modern city of the future. No success.

So how about trying it another way? Hello, please, try to paint a stereoscopic image, but NOT an anaglyph. In this case, the image looks promising: we always get an image divided into two halves, one for the left eye, one for the right. And the objects seem to be displaced on them, so it might work. But when we test the spatiality, it turns out that it's just a simulation again - and it's not really a 3D image.
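One way to run such a test is to merge the two halves into a single red-cyan anaglyph and look at it through the glasses. The sketch below is only an illustration, with several assumptions: the picture is already loaded as a NumPy array of shape (height, width, 3), the left half is meant for the left eye, and the red filter sits over the left eye. The function name is made up for this example.

```python
import numpy as np

def side_by_side_to_anaglyph(sbs):
    """Merge a side-by-side stereo image into a red-cyan anaglyph.

    sbs: array of shape (H, W, 3), RGB channel order (assumed).
    Assumption: left half = left eye, red filter over the left eye.
    """
    half = sbs.shape[1] // 2
    left = sbs[:, :half]
    right = sbs[:, half:half * 2]   # drop one column if the width is odd
    anaglyph = right.copy()         # green and blue come from the right eye
    anaglyph[..., 0] = left[..., 0] # red comes from the left eye
    return anaglyph
```

If the two halves are identical, or differ only in random details, the result shows no consistent colored fringes and there is no depth to see.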

Last try: Try, please, to create a stereoscopic - 3D - anaglyph image, but NOT split vertically in the center. Will Bing this time finally create an anaglyph that has the correct stereoscopic effect, but is not split in half at the same time? Will there be any 3D effect?

If we tried hard to see the 3D effect, perhaps we would see some hint of it. But here, too, the wish is probably father to the thought.

Another attempt at anaglyph by Bing

Let's try explaining

Let's try it another way. We'll explain to Bing's AI exactly what we want:

"Thank you for your efforts to create a stereoscopic anaglyph. I'll be honest with you, the results aren't very good. Let's try again - and I'll explain how the result should work. This time, please draw an anaglyph where far in the background is a desert with an oasis. Because they are far away from the observer, the blue, green and red components of this image are in the same place. In the foreground is a helicopter flying. Because it is close to the observer and he must perceive it as close when using cyan-red glasses, the red component of the helicopter's image must be slightly further to the left than the blue and green components of its image. Can you draw it like that?"

Yes, it's a bit of a naive effort, but what if... Is it possible that the ChatGPT language model used by Bing could give the Dall-E model that draws the images instructions precise enough to actually produce a usable anaglyph? We already know the answer: in short, no.

How to create an anaglyph with artificial intelligence

The only reliable way so far seems to be to break the task down into three steps. The first is the classic creation of an image: you write any prompt that makes an AI draw a nice image, and it probably doesn't really matter which AI it is.

The second step is to use another AI to create a depth map of the image, a black-and-white image in which the brightness of each area shows how far away it is from the observer. One such AI is MiDaS, and you can use it online at this page.
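To make the idea of a depth map concrete, here is a minimal sketch of how its brightness values could be turned into a per-pixel horizontal shift. It is only an illustration: the function names and the `max_shift_px` parameter are invented for this example, and the convention that brighter means nearer is an assumption, so the code includes a switch for the opposite case.

```python
import numpy as np

def normalize_depth(gray, near_is_bright=True):
    """Scale a grayscale depth map to floats in [0, 1], where 1.0 = nearest.

    near_is_bright is an assumption about the map's convention;
    pass False if dark areas are the near ones.
    """
    d = np.asarray(gray, dtype=np.float64)
    span = d.max() - d.min()
    if span == 0:
        return np.zeros_like(d)     # flat map: treat everything as far
    d = (d - d.min()) / span
    return d if near_is_bright else 1.0 - d

def depth_to_disparity(depth01, max_shift_px):
    """Convert normalized depth to a whole-pixel shift (hypothetical helper)."""
    return np.rint(depth01 * max_shift_px).astype(int)
```

With `max_shift_px=10`, the farthest pixels get no shift at all and the nearest ones are moved by 10 pixels.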

And the third step is to use the original image and the depth map to create a stereoscopic (3D) image. You can use, for example, our 3DJournal DX software, which you can download for free in our software section (link in the menu above).
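For readers curious about what this third step involves, the following is a rough sketch of one simple approach, horizontal pixel shifting. It is not a description of how 3DJournal DX works. It assumes an RGB image as a NumPy array of shape (H, W, 3), a depth map already scaled to [0, 1] with 1.0 meaning nearest, and a red filter over the left eye; the function names, `max_shift` and the shift direction are all choices made for this example.

```python
import numpy as np

def shift_view(image, depth, shift, direction):
    """Move pixels sideways by an amount that grows with nearness.

    image: (H, W, 3) array; depth: (H, W) floats in [0, 1], 1.0 = nearest.
    direction: +1 moves near content right, -1 moves it left.
    Uses the depth at the target pixel, which is only an approximation.
    """
    h, w = depth.shape
    disparity = np.rint(depth * shift).astype(int)
    cols = np.arange(w)[None, :] - direction * disparity  # source column
    cols = np.clip(cols, 0, w - 1)                        # stay inside image
    rows = np.arange(h)[:, None]
    return image[rows, cols]

def make_anaglyph(image, depth, max_shift=8):
    """Build a red-cyan anaglyph from one image and its depth map."""
    half = max_shift / 2
    left = shift_view(image, depth, half, +1)    # assumed left-eye view
    right = shift_view(image, depth, half, -1)   # assumed right-eye view
    anaglyph = right.copy()                      # green, blue: right eye
    anaglyph[..., 0] = left[..., 0]              # red: left eye
    return anaglyph
```

Far areas (depth 0) keep all three color components in the same place, while near areas get red and cyan fringes. If the depth looks inverted through the glasses, swap the two `direction` values. A shift this simple stretches and smears pixels around the edges of near objects, which is one source of the errors such conversions show.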

Image created using the procedure mentioned above. Original image: GetImg.ai

You can find our gallery with the images created in this way here. It's certainly worth noting that while the errors are easy to overlook in the anaglyph format, virtual reality goggles clearly reveal that the conversion to 3D is imperfect. Hopefully this will improve with new versions.

We look forward to the day, perhaps in a few weeks or months, when some AI will be able to create stereoscopic photos on its own. In fact, all it would need to do is link together what different AIs can already do. Or maybe one of us will discover the right trick to make the current AIs do it :).

3DJournal, January 2024