This is one of the most important moments in all of art history. Millions of people just got unconditional access to the state-of-the-art in AI text-to-image, absolutely free, minus the cost of hardware. I have an Nvidia GPU myself and am thrilled beyond belief with the possibilities that this opens up.
Am planning on doing some deep dives into latent-space exploration algorithms and hypernetworks in the coming days! This is so, so, so exciting. Maybe the most exciting single release in AI since the invention of the GAN.
EDIT: I'm particularly interested in training a hypernetwork to translate natural language instructions into latent-space navigation instructions with the end goal of enabling me to give the model natural-language feedback on its generations. I've got some rough ideas but haven't totally mapped out my approach yet, if anyone can link me to similar projects I'd be very grateful.
>This is one of the most important moments in all of art history.
I agree, but not for the reasons you imply. It will force real artists to differentiate themselves from AI, since the line is now sufficiently blurred. It's probably the death of an era of digital art as we know it.
Making art already doesn't make much money for the vast majority of producers (outside of 3D modeling). There really aren't very many jobs in making art. I'd reckon most people are artists because they love doing it.
This is completely wrong; the digital art industry is enormous and, mind you, pretty uniformly pissed off that this model has been trained on their non-public-domain work.
They're mostly posting incorrect claims like "all it does is make collages out of other people's art" though. It doesn't do that; the model stores about 1 byte per original image it's seen.
Think of that 1 byte as a hyper-compressed essence of knowledge necessary to make said collage. Then it will be correct. Those models would not exist without the labor of the artist community - which in this case was used without consent and without pay.
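A rough back-of-envelope supports the "about 1 byte" figure, assuming roughly 1B parameters stored at fp16 and a LAION-scale training set of roughly 2B images; both numbers are approximations I'm supplying here, not figures from this thread:

```python
# Back-of-envelope: how much model capacity exists per training image?
# The figures below are rough public estimates, not exact values.
params = 1.0e9          # ~1B parameters (U-Net + text encoder + VAE)
bytes_per_param = 2     # fp16 weights
train_images = 2.3e9    # roughly the LAION-scale dataset used for training

bytes_per_image = params * bytes_per_param / train_images
print(f"{bytes_per_image:.2f} bytes of weights per training image")  # ~0.87
```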
These models would exist, just without painterly style. Most of the training data is photos of real things, a lot of which is stock photography that has been ripped off in much the same way as those artists' work. It is a challenge to copyright law, but if artists are allowed to learn by looking at other people's work, why treat an AI differently?
You're kicking the can down the road. Obviously, an AI is not a person, that's in the premise of the question. What makes an AI so different from a person that warrants differential treatment in this case? It's not that there aren't any good answers to this question, but yours is not much of an answer at all.
It's probably okay to learn facts from copyrighted material even if you're an AI - you can't reproduce the text of a novel but you can learn the meaning of a word from context in it. Similarly you can learn to draw a hand from looking at a ton of stock photos with hands, as long as you produce original hands.
AI will need some favorable legal precedents to avoid getting banned, though, or else it'll have to train only on CC0 Flickr/Wikipedia scrapes.
I also think it's a more obvious problem that it can reproduce copyrighted characters by eg prompting for "Homer Simpson".
Did we say the same in the 80s when audio sampling became a thing? We accepted it (after obligatory legal battles) and moved on, giving rise to the Creative Commons.
Yeah, the fact that these models are necessarily based on existing works leaves me hopeful that humans will remain the leaders in this space for the time being.
Human works are needed to create the initial datasets, but an increasing amount of models use generative feedback loops to create more training data. This layer can easily introduce novel styles and concepts without further human input.
The time is coming where we will need to, as patrons, reevaluate our relationships with art. I fear art is returning to a patronage model, at least for now, as certainly an industry which already massively exploits digital artists will be more than happy to replace them with 25% worse performing AI for 100% less cost.
The generated pictures posted in the blog post are superior to the average artist's work. Which isn't surprising; AI easily corrects "human mistakes" (e.g. composition, colors, contrasts, lines, etc.).
Why would people want to consume art that says nothing and means nothing? While this technology is fascinating, it produces the visual equivalent of muzak, and will continue to do so in perpetuity without the ability to reason.
That's the problem for me too. This tech is cool for games, stock images etc but for actual art it's pretty meaningless. The artist's experience, biography and relationship with the world and how that feeds into their work is the WHOLE point for me. I want to engage, via any real artistic product, with a lived human experience. Human consciousness in other words.
To me this technology is very clever but it's meaningless as far as real art goes, or it's a sideshow at best. Perhaps best case, it can augment a human artistic practice.
Oh, nice take! Reminds me of the chess world. Chess engines that could beat any human in the world have existed for a long time, but people wouldn't play them because losing all the time was boring.
But then came the neural network chess engines (powered by the same training technology as text-to-image generators), which people even enjoyed playing for a while due to the novelty of watching them learn to play from scratch (AlphaZero, Lc0, et al.).
In the process of getting there, we got networks of all kinds of strengths; you could find one exactly as strong as you, and its mistakes were like the mistakes you would make.
And yet, they were missing the "human factor"; people would rather play against other humans online, some even willing to pay for accounts in places like playchess and chess.com to play other humans.
As the networks became stronger and Stockfish assimilated them to get the best of both worlds with NNUE, nobody cared, and there was even an explosion of human vs. human chess (when twitch.tv and YouTube stars were playing each other, the audience didn't even care how bad the chess was; it turned out to hold up as a great spectacle despite, or thanks to, the stars being novices and racing to get better at chess).
Now chess bots are just a curiosity, only used by people without online access to other humans. I wonder if it'll be the same for art, and if "show your work" becomes a thing.
You can ask an AI to produce a great picture... now try asking it for a video of it making that art from scratch, and the creative process. The whole art section of twitch.tv is about the artist's process, and yes, there are people who get enough from that to dedicate all their time to their art.
We want to interact with conscious entities for the obvious reason as well: we want to connect. A machine is a blind, dead entity. There's nothing to connect to. Also, even if the output is far superior to a human's in limited domains, e.g. chess, the way it arrives at these outcomes is sort of banal (if clever). It's not intelligence and it's not thinking; I think the 'intelligence' part is a misnomer in AI, but perhaps that's a semantic argument. To me at least, consciousness is fundamental to our type of animal intelligence. I'm a naturalist through and through; we might even be able to create animal-like intelligence and consciousness one day, but until then at least, interacting with Turing machines is a cold, boring experience if you know what is really 'inside'.
The line gets blurry when a dead machine one day passes the Turing test, but if I ultimately knew I was interacting with a philosophical zombie, that would kill the appeal quickly.
It's easy to generate a believable backstory. A large LM can write the bio of the "artist" and even a few press interviews. If you like, you can chat with it, and of course request art to specification. You can even keep it as a digital companion. So you can extend the realism of the game as much as you like.
Can photography be good art? Is Marcel Duchamp's found-object work art? Can good art be discovered almost serendipitously, or can good art only be created by slowly learning and applying a skill?
I think art is mostly about perception and selection, by the viewer. There are others that think art is more about the crafting process by the artist. How do you tell the difference between an artist and a craftsperson?
We can tell the difference between muzak and "real music"; we just dislike the muzak. But the real risk and likelihood is that we get to the point that AI will be generating art that is indistinguishable from human-generated art, and it's only muzak if someone subscribes to the idea that the content of art is less relevant than its provenance. Some people will, particularly rich people who use art as a status signifier/money laundering vehicle, but mass media artists will struggle to find buyers among the less discerning mass audience.
Considering the phenomenal progression of Dall-E-1 to Dall-E-2 in just over a year, I'm not really understanding your confidence on the limits of AI content generation.
...among many other things. Plus, the training set for video is orders of magnitude smaller than for digital art. (And is additionally burdened with copyright issues.)
As I see it, there's simply no path from the DALL-E of today to something like that. And all for art that, essentially, "says nothing and means nothing".
Characters and dialogue are effectively solved, just look at GPT-3.
The entity behind StableDiffusion is also supporting generative music art, so let's see what is coming out of that: https://www.harmonai.org/
We are currently far away from generating a production quality movie with AI, but I don't think it's going to be nearly as long as a lifetime. In my opinion, we'll have high quality AI shorts within the decade.
>Characters and dialogue are effectively solved, just look at GPT-3.
Is this the motherlode of exaggeration?
Current language models cannot generate coherent dialog (and even then it's mostly bad dialog) spanning more than a minute or two. And their current capabilities in that area are definitely significantly below those of the average human writer.
We were talking about a Marvel action flick; I don't think incredible dialog spanning multiple minutes is much of a thing apart from exposition dumps. I asked GPT-3 to spit out some paragraphs from a hypothetical script for Thor 5:
INT. DARKNESS
We hear a faint beating heart. A moment later, we see a light slowly growing in the darkness. As the light grows, we see that it is coming from a glowing object in a person’s hand. The object is a hammer.
We see the face of the person holding the hammer. It is Thor. He looks tired and beaten.
Suddenly, we hear a voice from the darkness.
Black Panther: You are not welcome here, Thor.
Thor: I know. But I must speak with you.
Black Panther: You have nothing to say that I want to hear.
Thor: I come bearing a warning. Thanos is coming.
Black Panther: We are prepared.
Thor: He is not coming alone. He has an army.
Black Panther: So do we.
Thor: Thanos is not like any enemy you have faced before. He is ruthless and he will not stop until he has destroyed everything that you hold dear.
Black Panther: We will stop him.
Thor: I hope you can. Because if you cannot, then all is lost.
Eh, looks real enough to me. Fine tune the model with all the specialities that make up Marvel movies and you'll crank out good-enough drafts in no time.
>cannot generate coherent dialog (and even then it's mostly bad dialog) spanning more than a minute or two
I think that was pretty clear and that posted dialog is a perfect illustration.
You cannot generate the entire movie script coherently without significant human input and that's not going to change in the next several years. So, your initial claim that dialogue is "solved" is indeed false.
That's true. But the thing with technology, and the reason we've kept up with Moore's law, is that someone eventually has a bright idea that leaves current methods and improvement extrapolation in the dust; then the real thing happens earlier than the most optimistic dates and performs better than what people expected.
The question is not whether an AI can one day generate a movie that you can't differentiate from a human-made movie. The question is how long it will take for an AI-generated movie to be better than all the human-made movies in history; if that's possible at all, it'll happen much sooner than people think.
Humans are trained on other humans' work as well though. Is there a type of ideological or aesthetic exploration that can't be expressed as part of an AI model?
Labour protections/willingness to strike, I suspect they mean. But I don't buy it. I've seen far too many people who "should" be worried about this technology instead be absolutely in love with it.
> particularly interested in training a hypernetwork to translate natural language instructions into latent-space navigation instructions with the end goal of enabling me to give the model natural-language feedback on its generations.
Imagine every conceivable image is laid out on the ground, images which are similar to each other are closer together. You’re looking at an image of a face. Some nearby images might be happier, sadder, with different hair or eye colours, every possibility in every combination all around it. There are a lot of images, so it is hard to know where to look if you want something specific, even if it is nearby. They’re going to write software to point you in the right direction, by describing what you want in text.
AFAICT: making a navigation/direction model that can translate phrase-based directions into actual map-based directions, with the caveat that the model would be updated primarily by giving it feedback the same way that you would give a person feedback.
Sounds only a couple of steps removed from basically needing AGI?
I suspect you'd want to start by trying to translate differences between images into descriptive differences. Maybe you could generate examples by symbolic manipulation to produce pairs of images, or maybe NLP can let us find differences between pairs of captions? Large NLP models already feel pretty magical to me and encompass things we would have said required AGI until recently, so it seems possible, though really tough.
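As a very rough sketch of what "pointing in a direction with text" might look like, here is embedding arithmetic on the CLIP text encoder that Stable Diffusion conditions on. The prompts, the pooled-embedding shortcut, and the step size are all illustrative assumptions; a hypernetwork, as proposed above, would learn this mapping rather than hand-code it:

```python
import torch
from transformers import CLIPTokenizer, CLIPTextModel

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

def embed(prompt: str) -> torch.Tensor:
    # Pooled text embedding for a prompt (SD actually conditions on the full
    # token sequence; the pooled vector is used here only to keep the sketch short).
    tokens = tokenizer(
        prompt,
        padding="max_length",
        max_length=tokenizer.model_max_length,
        truncation=True,
        return_tensors="pt",
    )
    with torch.no_grad():
        return text_encoder(**tokens).pooler_output  # shape: (1, 768)

# "Feedback" expressed as the difference between two descriptions.
direction = embed("a happier expression") - embed("a sadder expression")

# Nudge the conditioning for an existing prompt a small step along that direction.
base = embed("portrait of a woman, oil painting")
nudged = base + 0.5 * direction
```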
I have a friend who works as an artist and he's excited and nervous about this. But also he's trying to learn how to use these well. If you try these AI out, there's definitely an art to writing good prompts that gets what you actually want. Hopefully these AI will become just another brush in the artist's palette rather than a replacement.
I hope these end up similar to the relationship between Google and programming. We all know the jokes about "I don't really know how to code, I just know how to Google things". But using a search engine efficiently is a very real skill with a large gap between those who know how and those who don't.
Replying to myself because I just had a chat with him about this. He's thinking of getting a high end GPU now, lol.
Some ideas of how this could be useful in the future to assist artists:
Quickly fleshing out design mockups/concepts is the obvious first one that you can do right now.
An AI stamp generator. Say you're working on a digital painting of a flower field. You click the AI menu, "Stamp", a textbox opens up and you type "Monarch butterfly. Facing viewer. Monet style painting." And you get a selection of ai generated images to stamp into your painting.
Fill by AI. Sketch the details of a house, select the Fill tool, select AI, click inside one of the walls of the house, a textbox pops up, you write "pre-war New York brick wall with yellow spray painted graffiti"
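The "Fill by AI" idea maps fairly directly onto inpainting. Here is a minimal sketch using the diffusers inpainting pipeline; the checkpoint name, file names, and mask are placeholders for illustration, not anything from this thread:

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

# Load an inpainting-capable checkpoint (placeholder name).
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

# The sketch of the house, plus a mask where white pixels mark the wall to fill.
init_image = Image.open("house_sketch.png").convert("RGB").resize((512, 512))
mask = Image.open("wall_mask.png").convert("RGB").resize((512, 512))

result = pipe(
    prompt="pre-war New York brick wall with yellow spray painted graffiti",
    image=init_image,
    mask_image=mask,
).images[0]
result.save("house_filled.png")
```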
I have a friend who paints, google/ athanasart. His paintings are usually purchased before they are even finished, and sometimes even stolen. All of that technology (Dalle2, Disco Diffusion, Stable Diffusion) is totally boring to him. He doesn't even care. He will not install this technology on his machine, or use it online, even if he is paid to do it. I love Craiyon, Dalle2, Stable Diffusion, but beautiful images alone are not art.
But the definition of art can't depend on your knowledge of how it was made. I can show you two beautiful pictures and ask you if either of them is art, and you'd know I was trying to trick you. But you could pass a piece made by an AI off as art if you didn't know how it was produced.
Yes, if someone gets a painting without any additional information, a computer art piece could easily be mistaken for a human art piece. That's true. My position on the subject is as follows: the best art for humans is made by humans, and the best art for computers is made by computers. Actually, I have yet to see a computer generation on the same level as the Mona Lisa, not even close. However, computer-generated women, some of them are alright!
Kinda like the creation of Copilot and its ilk for programmers, and GPT-3 for writers. I've seen some talk recently around 'prompt engineers'... Probably to some extent, every job will become prompt engineering in some way.
Eventually I suppose the AIs will also do the prompts.
At which point I hope we've all agreed to a star trek utopia, or it's gonna get real bad. Or maybe it'll get way better.
I think right now we could set up the AIs to do the prompts. You type in a vague description, "gorilla in a suit", and that is passed to GPT-3's API with instructions to provide a detailed and vivid description of the input in X style, where X is one of several different styles. GPT-3 generates multiple prompts, the prompts are passed to Stable Diffusion, and the user gets back a grid of different images and styles. Selecting an image on the results grid prompts for variations, possibly of both the prompt and the images.
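A minimal sketch of that pipeline, assuming the OpenAI completions API and the diffusers StableDiffusionPipeline; the model names, the instruction wording, and the style list are all illustrative assumptions:

```python
import openai
import torch
from diffusers import StableDiffusionPipeline

openai.api_key = "sk-..."  # your key here

# Load Stable Diffusion (checkpoint name is illustrative).
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

def expand_prompt(idea: str, style: str) -> str:
    # Ask GPT-3 to turn a vague idea into a detailed, styled image prompt.
    completion = openai.Completion.create(
        model="text-davinci-002",
        prompt=f"Write a detailed, vivid image prompt for '{idea}' in the style of {style}:",
        max_tokens=60,
    )
    return completion.choices[0].text.strip()

idea = "gorilla in a suit"
styles = ["an oil painting", "a 35mm photograph", "a comic book panel"]

# One image per style; a real UI would show these as a selectable grid.
for style in styles:
    image = pipe(expand_prompt(idea, style)).images[0]
    image.save(f"{idea.replace(' ', '_')}_{style.replace(' ', '_')}.png")
```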
Yeah, if we're gonna replace every fucking profession with a half-assed good-enough AI version, what're we even here for? We're sure not all gonna survive in a capitalist society where you have to create some kind of "value" to earn enough money to pay for your roof and your food and your power.
IIRC there is some vague "it sure got real bad" somewhere in the Trek timelines between "capitalism ended" and "post-scarcity utopia" and I sure am not looking forwards to living through those times. Well, I'm looking forwards to part of that, I'm looking forwards to the part where we murder a lot of landlords and rent-seekers and CEOs and distribute their wealth. That'll be good.
Next let's get rid of all the artists and replace them with AI. Redistribute their skills so we can all make art. Oh wait that just happened. You're a hypocrite that wants to redistribute the wealth of others, but not your own.
Also joking about murdering people is bad taste and not how you convey a point or win an argument. Very low class.
> Next let's get rid of all the artists and replace them with AI. Redistribute their skills so we can all make art. Oh wait that just happened. You're a hypocrite that wants to redistribute the wealth of others, but not your own.
Recognising that doing the former without the latter demonstrably hurts people isn’t being hypocritical. Hence all the talk of post-scarcity. Post-scarcity for me not for thee is very much a sign of the times though.
There weren't "uprisings" mostly because the destroyed jobs were replaced with others that paid better and were less "dangerous". There were some local minima (Detroit auto workers) where this happened only partially, and we know the pathology this led to.
Is this replacement of jobs happening this time around? No. So violence there will be.
(OP here) I agree. I am an artist (not by trade but by lifelong obsession) in several different mediums, but also an AI engineer--so I feel a weird mixture of emotions tbh. I'm thrilled and excited but also terrified lol.
It's my fucking job. I've spent my whole fucking life getting good at drawing. I can probably manage to keep finding work but I am really not happy about what this is going to do to that place where a young artist is right at the threshold of being good enough to make it their job. Because once you're spending most of your days doing a thing, you start getting better at it a lot faster. And every place this shit gets used is a lost opportunity for an artist to get paid to do something.
I wanna fucking punch everyone involved in this thing.
I mean I get what you're saying, sucks to have someone or something take your job, but isn't this a neo-luddite type argument? AI is gonna come for us all eventually.
Please save this comment and re-read it when a new development in AI suddenly goes from "this is cute" to "holy fuck my job feels obsolete and the motherfuckers doing it are not gonna give a single thought to all the people they're putting out of work". Thank you.
Look at that, you said the same thing to me 8 days ago [0]. I'll stick to the same rebuttals you got for that comment as well, namely that AI comes for us all, the only thing to be done is to adapt and survive, or perish. Like @yanderekko says, it is cowardice to assume we should make an exception for AI in our specific field of interest.
Admittedly as someone who's been subscribing to Creative Cloud for a while I already wanna punch a lot of people at Adobe so the people working on this particular part of Photoshop are gonna have to get in line.
You got to move one step higher and work with ideas and concepts instead of brushes. AI can generate more imagery than was previously possible, so it's going to be about story telling or animation.
I make comics, it's already about storytelling and ideas as much as it is about drawing stuff. I make comics in part because I like drawing shit and that gives me a framework to hang a lot of drawings on. I like the set of levels I work at and don't want to change it. I've spent an entire fucking lifetime figuring out how to make my work something I enjoy and I sure bet nobody involved in this is gonna fling a single cent in the direction of the artists they're perpetrating massive borderline copyright infringement upon.
But here's all these motherfuckers trying to automate me out of a job. It's not even a boring, miserable job. It's a job that people dream of having since they were kids who really liked to draw. Fuck 'em.
This AI thing feels so wrong, yet strangely enough I'm not that stressed.
Look at the recursive side of it, it's hilarious. I'm an artist. AI is around. Do I want to put my work online? No. So no artists put stuff online anymore? So how does the future of art work, exactly? Will we send booklets by mail again?
There is the democratic side of it. We the people ultimately decide. Those pictures are generated using artists' works, not out of thin air. So it's not a freedom question here but a people one: do we want that or not? Maybe restrict it to pictures that are 30+ years old, for example; that could be good for the economy.
I mean, it seems you could replace literally any job you want but the artist's. Imagine when people are jobless because AI replaced them. You'd think they'd want to make art during their retirement, right?
It's a case of the snake eating itself. If AI ruins our online life, we'll stop going online.
Yeah, going offline begins to look tempting, except for the part where a huge amount of how you get people to want to pay for your work is by sharing it publicly. Except then someone can come along and scrape it all and dump it into their AI training black box without giving two shits about copyright.
All of the artists of the world will remove their online work. Who'd want to put their work online again? It's not even in the interests of the big players like Epic Games or Marvel or Disney. Nah, I don't see how this is gonna fly.
Automation doesn't eliminate jobs. As such this automation won't eliminate your job, QED.
(It's bad monetary policy that eliminates jobs. The US has very very low unemployment right now so it doesn't appear that's been happening.)
Go ahead and try it. It'd be impossibly harder to create one of your own pictures with it than it was for you to make one. Instead, what's going to happen is you'll get more efficient ways to do backgrounds and unimportant bits of an image, just like Blender and CSP provide 3D models to do layouts with.
Check out the melodies I made with an AI assistant I created (human-in-the-loop still, but much quicker than if I tried to come up with them from scratch): https://www.youtube.com/playlist?list=PLoCzMRqh5SkFwkumE578Y.... There are also good AI tools for other parts of making music, like singing voice generation.
AI will just replace existing jobs. It is unethical, since a lot of people will be destroyed, but our leaders, sponsored by the majority's insatiable appetite for power, will soon make it legal.
AI is a new tool that will automate away a lot of workers like other machines.
What happens with these workers is what defines us.
And there are a lot of influencers saying that it will be "really nice that AI will replace the boring jobs so we can focus on creative/fulfilling life" yeah right...
Ironically, with Moravec's Paradox, (digital) creative tasks will probably be automated while the boring tasks of moving boxes around might not be for a while:
> Moravec's paradox is the observation by artificial intelligence and robotics researchers that, contrary to traditional assumptions, reasoning requires very little computation, but sensorimotor and perception skills require enormous computational resources. The principle was articulated by Hans Moravec, Rodney Brooks, Marvin Minsky and others in the 1980s. Moravec wrote in 1988, "it is comparatively easy to make computers exhibit adult level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility".