...among many other things. Plus, the training set for video is orders of magnitude smaller than for digital art. (And is additionally burdened with copyright issues.)
As I see it, there's simply no path from the DALL-E of today to something like that. And all for art that, essentially, "says nothing and means nothing".
Characters and dialogue are effectively solved, just look at GPT-3.
The entity behind StableDiffusion is also supporting generative music art, so let's see what is coming out of that: https://www.harmonai.org/
We are currently far away from generating a production-quality movie with AI, but I don't think it's going to take anywhere near a lifetime. In my opinion, we'll have high-quality AI shorts within the decade.
>Characters and dialogue are effectively solved, just look at GPT-3.
Is this the mother lode of exaggeration?
Current language models cannot generate coherent dialog (and even then it's mostly bad dialog) spanning more than a minute or two. And their current capabilities in that area are definitely significantly below those of the average human writer.
We were talking about a Marvel action flick. I don't think incredible dialog spanning multiple minutes is much of a thing there, apart from exposition dumps. I asked GPT-3 to spit out some paragraphs from a hypothetical script for Thor 5:
INT. DARKNESS
We hear a faint beating heart. A moment later, we see a light slowly growing in the darkness. As the light grows, we see that it is coming from a glowing object in a person’s hand. The object is a hammer.
We see the face of the person holding the hammer. It is Thor. He looks tired and beaten.
Suddenly, we hear a voice from the darkness.
Black Panther: You are not welcome here, Thor.
Thor: I know. But I must speak with you.
Black Panther: You have nothing to say that I want to hear.
Thor: I come bearing a warning. Thanos is coming.
Black Panther: We are prepared.
Thor: He is not coming alone. He has an army.
Black Panther: So do we.
Thor: Thanos is not like any enemy you have faced before. He is ruthless and he will not stop until he has destroyed everything that you hold dear.
Black Panther: We will stop him.
Thor: I hope you can. Because if you cannot, then all is lost.
Eh, looks real enough to me. Fine-tune the model on all the specialities that make up Marvel movies and you'll crank out good-enough drafts in no time.
>cannot generate coherent dialog (and even then it's mostly bad dialog) spanning more than a minute or two
I think that was pretty clear and that posted dialog is a perfect illustration.
You cannot generate the entire movie script coherently without significant human input and that's not going to change in the next several years. So, your initial claim that dialogue is "solved" is indeed false.
That's true. But the thing with technology, and the reason we've kept up with Moore's law, is that someone eventually has a bright idea that leaves current methods and extrapolated improvements in the dust. Then the real thing arrives earlier than the most optimistic dates and performs better than people expected.
The question is not whether an AI will one day generate a movie that you can't tell apart from a human-made one. The question is how long it will take for an AI-generated movie to be better than every human-made movie in history. If that's possible at all, it'll happen much sooner than people think.
* Coherent video
* Characters with backstory
* Dialogue (including jokes and witty banter)
* Music
...among many other things. Plus, the training set for video is orders of magnitude smaller than for digital art. (And is additionally burdened with copyright issues.)
As I see it, there's simply no path from the DALL-E of today to something like that. And all for art that, essentially, "says nothing and means nothing".