The most interesting part, to me, of a release like this is the amount of "please don't abuse this technology" pleading. No licence will ever stop people from doing things that the licence says they can't. There will always be someone who digs into the internals and makes a version that does not respect your hopes and dreams. It's going to be bad.
As I see it, within a couple years this tech will be so widespread and ubiquitous that you can fully expect your asshole friends to grab a dozen photos of you from Facebook and then make a hyperrealistic pornographic image of you with a gorilla[0]. Pandora's box is open, and you cannot put the technology back once it's out there.
You can't simply pass laws against it because every country has its own laws and people in other places will just do whatever it is you can't do here.
And it's only going to get better/worse. Video will follow soon enough, as the tech improves. Politics will be influenced. You can't trust anything you see anymore (if you even could before, since Photoshop became readily available).
Why bother asking people not to? I suppose it helps you sleep at night to know that you tried?
I wonder/hope if a downstream effect of this technological change will be the end of the idea of a "shameful" or "humiliating" image. We all have bodies, we all have sex, and I agree that soon images of ourselves being nude/having sex etc. will proliferate because they'll be generated instantly via Siri shortcut as part of casual banter.
In a world where every celebrity is having sex with gorillas, doesn't such an image lose its charge? Will norms and values around sex/body shaming change?
That seems unlikely, given that making up hurtful stories about people and transmitting them via text or voice is still a thing. Everyone knows that anyone can invent any story they want without any technology whatsoever, and yet spreading rumors remains as popular as ever.
Not really. I can't think of a recent "leaked texts" story that the participants couldn't easily and plausibly deny (e.g. Elon's supposed messages to Gates), or even voice messages. Even most images can already be dismissed as Photoshop if all the witnesses agree. The only medium that is somewhat hard to deny is video, like sex tapes, but even that's not too hard. I think there will soon be a race to make deep learning pics look completely indistinguishable from phone pics.
Perhaps that's somewhat true for famous people, although there are plenty of examples of false stories (without any forged evidence, literally just stories) causing real embarrassment and damage to reputation.
But it's even more true for non-famous people getting bullied in their social groups, both online and offline, and that's more what I was responding to (the "asshole friends" in the original comment).
There are a few different kinds of 'secure enclaves' implemented on chips, where you can have some degree of trust that it "cannot" be faked.
E.g. crypto wallets, hardware signing tokens, etc.
We could imagine an imaging sensor chip made by a big-name company whose reputation matters, where the imaging sensor chip does the signing itself.
So, Sony or Texas Instruments or Canon start manufacturing a CCD chip that crypto signs its output. And this chip "can't" be messed with in the same way that other crypto-signing hardware "can't" be messed with.
That doesn't seem too far-fetched to me.
* edit:
As I think about it, I think more likely what happens is that e.g. Apple starts promising that any "iPhoneReality(tm)" image, which is digitally signed in a certain way, cannot have been faked and was certainly taken by the hardware that it 'promises' to be (e.g. the iPhone 25).
Regardless of how they implement it at the hardware level to maintain this guarantee, it is going to be a major target for security researchers to create fake images that carry the signature.
So, we will have some level of trust that the signature "works", because it is always being attacked by security researchers. Just like our crypto methods work today. There will be a cat-and-mouse game between manufacturers and researchers/hackers, and we'll probably know years in advance when a particular implementation is becoming "shaky".
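The signing idea above can be sketched in a few lines. This is a hypothetical stand-in: it uses a symmetric HMAC from the Python stdlib for the sake of being runnable, whereas a real sensor would hold an asymmetric key in a secure element so that verifiers never possess the signing secret, and `DEVICE_KEY` here is entirely made up:

```python
import hashlib
import hmac

# Hypothetical device key fused into the sensor at manufacture. A real design
# would use an asymmetric key pair in a secure element (so verifiers never
# hold the signing secret); symmetric HMAC here is just a runnable stand-in.
DEVICE_KEY = b"key-fused-into-sensor-at-fab"

def sign_capture(pixel_data: bytes) -> bytes:
    # Signature the sensor would emit alongside the raw capture.
    return hmac.new(DEVICE_KEY, pixel_data, hashlib.sha256).digest()

def verify_capture(pixel_data: bytes, signature: bytes) -> bool:
    # Valid only if the pixels are bit-for-bit what the sensor signed.
    return hmac.compare_digest(sign_capture(pixel_data), signature)

raw = b"\x10\x20\x30\x40"                  # stands in for raw sensor output
sig = sign_capture(raw)
print(verify_capture(raw, sig))            # True
print(verify_capture(raw + b"\x00", sig))  # False: any edit breaks it
```

Note the limitation raised elsewhere in this thread: this only proves the signed bytes came from the chip, not that the scene in front of the lens was real (you can always photograph a screen).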
> In a world where every celebrity is having sex with gorillas, doesn't such an image lose its charge? Will Norms and values around sex/body shaming change?
I hoped this would happen with social media. Everyone says stupid shit online and everyone has past beliefs they’ve outgrown. So what’s the big deal?
Instead we went the opposite way. Everyone is super self conscious and censoring at all times because you never know who’s gonna take it out of context and make a big deal.
It is an interesting point [the hope that values will adapt to reflect typical mischievous patterns of social dynamics within various clusters]. During the dotcom era I used to marvel at the salivating delight over "the internet never forgets": college students caught smoking a bong and the subsequent impact on their careers (virtue signalling?). I had hope that society could adapt past such ridiculous FUD biases. There is a strange relationship between scarcity and opportunity, and these windows into souls paint a distorted picture of the darker side of our ambitions. It is a cancer. Also, humans are always testing boundary conditions: curiosity, discontent, security, insecurity. Some inherent desperation of people jockeying for the next path from frying pan to fire in search of greener pastures breeds an opportunism that seems certain to favor desperation and negativity.
If I take your photo and photoshop (very well) different surroundings. Is it a real photo if you?
If I photoshop your face (very well) onto a different body. Is that real?
If I feed your photos into a model that can create realistic versions of those photos in different poses or with different facial expressions. Is that real?
They all start with something that is very definitely a real photo. You can’t (yet? ever?) generate a realistic photo of a specific person from a textual description. The machinery needs a source.
Simple: a "real" photo is one in which a light field from the real world impinges on a photosensitive medium (CCD, film) and is directly encoded, with some allowance for global light levels, gain, ISO, and shutter speed. Anything else is a modification thereof. HDR, multi-exposure compositing, etc. aren't truly "real". They may be 99% real, but they aren't 100% real. If you crop it, it's 99.9% real (we have models that can detect cropping and even which region the crop came from, though obviously they can't reconstruct the missing data).
Yes, by that definition, most photographs already aren't real.
Sure, I'd say that's a real photo - you could easily pull it off with a single exposure of film.
Just because the photo is real, doesn't mean it's free from deception. I could take a "real video" of gorilla suit^W^W bigfoot and it'd still be deceptive.
You’ll be surprised to learn that doesn’t work without some amazing tech to process the photons. Different settings will produce a different photo.
Hell, just changing focal length makes a big difference to what your face looks like: https://imgur.io/gallery/ZKTWi (no digital manipulation required).
Which of those faces is “real”? They’re all just recording photons hitting the camera, but look very different.
It gets even worse when we start talking about colors. For example: it took cameras decades before they could accurately capture black faces. Where accurately means “an average person would say it looks right”
>In a world where every celebrity is having sex with gorillas, doesn't such an image lose its charge?
No, because the image itself doesn't matter. What matters is how much the public wants to hate someone. If the public is primed, any remotely plausible incriminating image will do as an excuse.
Fortunately, these images are still far cruder than what someone can cook up with Photoshop. Unfortunately, it's part of a bigger trend where we get more and more tools to produce, manipulate, and share information, while the tools to analyze and filter information lag by at least half a century.
I don't generally like his style, but this novel really captivated me. Without spoiling the plot (much), the end-result of everyone being able to spy on anyone anywhere at any time is a kind of societal disinhibition, especially related to sex, nudity, or similar taboos.
If we get to the point where we can't tell AI generated images from reality, I'm not sure "body shaming" will be on society's collective radar anymore.
I bet someone said in 1830, "by the time we can send robots to Mars, body shaming will not be a thing anymore". For better and worse, that's not how we humans do things generally.
As someone who witnessed AI Dungeon's GPT-3 model and its unlimited, uninhibited imagination for erotica, I would download everything now before they cripple the models. I would not be surprised if they very shortly stop downloads entirely due to "abuse" and pursue a SaaS model.
I think it's funny how Yandex, a Russian company, releases these big language models without all the AI safety handwringing in their press releases. The Russians have a tradition of releasing technology without much worry about what happens to it, for better or for worse. For example, they made between 75 and 100 million AK-47s, a machine gun not restricted in any way, and it spread to every corner of the world. They even gave out the plans and technical assistance so that any of the organizations they worked with could produce their own. Twenty different countries manufacture AK-47s today. Of course, you had to register every Xerox machine in the Soviet Union, so maybe they just had different priorities?
The west is absolutely fascinated these days with the control of advanced technology. Drones, Blockchain, and AI models seem to be the latest things that the west is determined to exercise control over. For example:
"Many of the technological advances we currently see are not properly accounted for in the current regulatory framework and might even disrupt the social contract that governments have established with their citizens. Agile governance means that regulators must find ways to adapt continuously to a new, fast-changing environment by reinventing themselves to understand better what they are regulating. To do so, governments and regulatory agencies need to closely collaborate with business and civil society to shape the necessary global, regional and industrial transformations." -Klaus Schwab, "The Fourth Industrial Revolution", Page 70.
StableDiffusion was trained by an academic lab in Germany, which is rather different from a company in California doing it. Though they both have the problem that the output is illegal in many countries with the wrong prompts.
As if we're all one block, a single minded hive. Corrupted, degraded by a heavy capitalistic and egotistical mindset. It's fascinating to see and talk about our shortcomings isn't it? We are so inferior. The fact you wrote "The West" signifies you're one of "The Others". Someone possibly from Russia. Then you compliment Russia and degrade "The West" adding further strength to this hypothesis.
Using "The West", a simple-minded sound bite that's been used in propaganda for centuries: a way of appealing to our ape instincts to protect our tribe against others. "Us" vs. "The Others".
You should make a little effort to see how ridiculous, infantile, and brainwashed it is to refer to every country that is not Asia or Russia as "The West".
It's not about control of advanced technologies but rather making such technologies look more powerful by pointing out real or imagined destructive capabilities. If you want funding or "big up" a topic, claim it has the power to bring down the world (anyone remembering gray goo?).
> grey goo, a nightmarish scenario of nanotechnology in which out-of-control self-replicating nanobots destroy the biosphere by endlessly producing replicas of themselves and feeding on materials necessary for life.
> Politics will be influenced. You can't trust anything you see anymore
I've been wondering for a while now if this will lead to an unexpected boon: perhaps people will be forced to pay attention to a speaker's content instead of simply who is speaking.
Unfortunately a speaker's content can also be auto-generated now, at least for brief enough snippets. And that means the content can (and will) be optimized to appeal to a target segment much more than has ever been previously possible.
The problem with this is that you will never know who is actually speaking. Deep fakes are already a thing, but as they get better and more accessible we will approach a world where anyone can make anyone say anything and make it hyper believable. In that world, it will be very difficult to tell what is real.
My personal hunch is that this will end up leading to a situation in which presenters do a cryptographic handshake that works to verify and prove authenticity. This isn't a new idea, and it has some very obvious drawbacks, but I don't see much of a way around the issue. The handshake could work great for something like official news releases, but for other instances that might come up in court, say, dash cam footage of an accident, it seems to me that the legal system is going to face some serious issues as these programs progress.
Looking forward to a future where all a politician’s quotes are on a blockchain, signed by their private key, and they chose to do so voluntarily out of fear of deep fakes.
"We verified the validity of our source's signature and vouch for its authenticity"
- Journalist from a respected newsroom, when they choose to keep the source confidential. So it won't be better or worse than it is now: all about transitive reputation.
The unspoken deterrent to further denials is the risk that the source may go public or, if the recording was unauthorized, that the person who recorded the video gets doxxed (assuming the signatures provide nonrepudiation).
That's a start! I think their video appearances should be like a car in NASCAR with permanently displayed logos superimposed from all the interests that have funded their rise.
If I recall from the interview with Stability.ai's founder, he has more or less the same opinion, and that humans will adapt to the new technology as we always have. I figure "please don't abuse this technology" warning stickers are more CYA. It'll make the vast majority of judges look at a motion to dismiss and not blink an eye.
Historically, he is correct. It is easy to find people who were against TVs, cars, trains, and electric cars. Those people were not entirely wrong in their logic: trains and cars did make it much easier for scammers to come into a town and then leave quickly.
In terms of the equilibrium, this is certainly a true observation. However, historically speaking new technology can be extremely disruptive in the short-term as society figures out the new norms, and the power-structures are disrupted and then re-equilibrate.
Concretely, it's probably true that children born with this technology will have adapted to many of the negative (and positive) aspects of it. But the current generation of elites, politicians, and voters might have a harder time adapting.
>make a hyperrealistic pornographic image of you with a gorilla[0]
I don't understand this irrational fear. This can be done today, just need some minutes instead of some seconds to create a good Photoshop.
Also, is this seriously the thing you fear? Fake porn? There are much worse things you can do with this tech, like phishing, falsification, etc. Not to mention putting millions of graphic designers out of a job.
Photoshop is a skill possessed by more than enough people that, if moderately convincing images were that transformational a technology for adversarial use, the world would already have been transformed by Photoshops of famous people and anybody else graphic designers wanted to embarrass.
I agree that it's much easier to do low effort stuff to wind friends up, but universal access and low effort don't make it more likely to be impactful and believable.
That's what I was thinking. Whether a pornographic picture of me and a gorilla was made with Photoshop or AI is irrelevant. People's reactions will be the same, and the repercussions will be mostly the same (which doesn't mean there will be consequences).
If someone really wants to hurt you, not having AI isn't going to stop them.
The real effect will be that you can publish a real picture of you fucking a gorilla and nobody would believe it because it's trivial to generate it with an AI.
> seriously this is the thing you fear? fake porn?
I'm being polite. Things will be so much worse.
Someone will make child pornography using your child's face as the input. Someone is going to take private videos of politicians and then edit them to have them say incriminating things. Someone is going to short the stock of a large company, then release a faked video of the CEO being shot, and profit from the immediate stock plunge.
And this is just what my mind can come up with. Imagine what 4chan will invent.
> The most interesting part, to me, of a release like this is the amount of "please don't abuse this technology" pleading.
> Why bother asking people not to? I guess if it helps you sleep at night that you tried, I guess?
I've perceived this as them doing the necessary amount of virtue-signaling and ass-covering to avoid the ire of groups that are loud/powerful enough to cause issues for them in the short-term. I've been following the developments in this space for a while now, and I don't get the impression that Stability AI cares too much about forcing Western ideals of "correctness" onto the public.
It's plainly obvious that this is going to be immediately used to produce content considered obscene, offensive and/or illegal by various groups of people. And it is what it is, we're going to just have to figure out how to live with it as a society. It's going to get far far worse (from certain points of view) as we continue to replicate functionality previously exclusively featured in human brains.
I never saw it create anything more NSFW than female boobs. It won't generate penises or genitals besides mounds of hair. Right now the "NSFW" flag/filter is 99%+ false positives. This is stuff that used to be allowed in G-rated movies in the US.
True of both Stable Diffusion and DALL-E 2: the more encompassing the content of your image, the more incorrect the detail. Both generally make fantastic faces, but anatomy and consistency start falling apart with full bodies: contorted limbs, missing fingers, too many limbs, etc. No one is going to be fooled here.
Of course the models could be trained on offensive images and over a long enough time period eventually will. But, for now, someone is going to have to spend millions of dollars on compute and have the human expertise behind it too. Then again, if we had an image generator making pictures of [insert X offensive thing] that is a whole lot less disgusting than the plentiful real photos and videos of that thing in reality.
It sometimes feels like there has to be a small, influential group of people on some jihad against generated pornography on the Internet, with impacts on the order of $10-100M.
I'm sure this view isn't shared by everyone, but adding such disclaimers is a straightforward course of action when you agree with them anyway and don't want to be targeted and blacklisted by the payment networks (that seems to be among their weaponry).
> you can fully expect your asshole friends to grab a dozen photos of you from Facebook and then make a hyperrealistic pornographic image of you with a gorilla
my prediction is that, as a result, people will start assuming pics online are fake until proven otherwise.
That's not the only issue with NSFW; the larger problem is when you don't ask for it and get it anyway. Especially because this model is not particularly good at it and you'll get body horror.
Well it comes with an NSFW filter that's activated by default so that's a non-issue (but importantly you can turn it off if you're ok with accidentally seeing stuff like that without explicitly asking for it).
Like you mentioned near the end, all this has been possible with photoshop, with amateur level skill. Hollywood can CGI the entire Captain Marvel movie, so as far as state-level efforts go, AI can really only be an incremental improvement at best.
I think this is all just trendy popular sentiment moralizing AI.
It seems inevitable like it's "inevitable" that they Photoshop your face onto porn. Yes, of course it will happen but maybe not to most people? I'd guess inevitable for many celebrities.
But even today, we deal with it correctly. Fakes and real photos are mingled together in 9Gag/LatestNews reports about Ukraine. Under the fakes (and the real ones), people ask for confirmation. Someone says it's true, no one believes him, until a link to a newspaper is dropped. And 9Gag isn't the highest-IQ community around, so yes, the general population does distrust photos by default until proven otherwise.
They are laughed at anyway if they tell a story coming from a forged photo.
Sure, newspapers could forge stories, display pictures with, I don’t know, Biden’s son with a crackpipe, and make the populace believe untrue stories. But guess what, they already do it anyway, newspapers already “spin” (as they say, i.e. forge, suggest without literally saying) stories all the time.
I have a quite different perception of 9gag.
Yes, some ask for confirmation but it depends very much on the topic.
Wrong topic and facts get downvoted and the fake news prevail.
And not all links to newspaper are considered valid, especially if it's about "woke culture".
Then you have to search the reasonable needle in the haystack of transphobia, homophobia and misogyny.
The problem comes from the early adult newsroom interns responsible for sourcing content. They don’t know it’s fake, it sounds like a good click-baity article to them, so they run it. It happens.
I wouldn’t shift responsibility on the shoulders of the last newcomer. The top of the management has had ample time to diagnose this. If it remains like this, it’s by design.
Significant progress has been made on video - inter-frame preservation of the surface level spatial invariants in 3D environments has been achieved.
But preservation of transtemporal spatial invariants requires understanding far more than that - dynamic lighting, density, flexibility, rigidity and momentum, the viscosity of the air, the skeletomuscular system, the flow of the fluids within, and so on ad infinitum.
And a lot of that is tacitly understood by the human mind (even when the human mind would struggle to generate a scene, it can often detect that something is wrong - try turning on the lights in a lucid dream).
It's going to be quite some time before it reaches the point where a human cannot detect that video was generated (or altered) and even longer before computers can't.
But then, it's going to be a shit-show - and I'm not talking about the bestiality videos.
Evidence, as we know it, will be meaningless - the implications for the legal system are terrifying.
I don't think you give enough emphasis to the "if you even could before" part. Sure, this makes realistic fakes easier and more available to the masses. So what? We've had the technology to 'frame' people in similar ways for decades now, and other than entertaining the 4chan teens it has done squat.
There's enough money interest in bringing down certain politicians, if faking a sex tape or back-room conversation would make any difference to the world it would have been done already. Hell, politicians pretty much admit publicly that they are rapists and we don't do anything about it. Who's gonna care about a couple of gorilla-pics of a normie?
People don't and never have trusted any evidence that doesn't reaffirm their world view, regardless of how true that evidence is. More realistic evidence won't change that.
This is great news for those that actually have sex with a gorilla, because now they can claim it's an ai photo. :)
Kidding aside, I think this is actually good. Humans need ephemerality. We are never getting the full version of it back, but with photorealistic ai video and image creation some freedom returns. I think without it a society in which everyone has a camera all the time would mean absolute ossification of social norms. Right now it's very, very new - I mean multiple generations living with the current, or better (eg. recording eye implants) technology.
At first we will see a lot of "prompt censoring mobs" that will try to "stop abusers because children and terrorists", but as the images multiply, and they will multiply, the line between real and fake photos will become hard to spot, for real. This is i think a pivotal and great moment, because everyone can now claim plausible deniability to any picture. No revenge porn will be believable anymore, nor will anyone know if that Bezos's weiner pic is real or not.
Really, the licensing is most interesting? There’s a lot of public info about training and development too.
The license itself is pretty irrelevant. What people will actually do with the training blueprints, and how fast things will evolve... now that's interesting.
Well we will be able to generate satirical political images of politicians, of religions easily and quickly. So there are some upsides for the technology to be given to every comedian out there. Politicians and religions, did have an easy ride the last few years, so we must set the record straight from now on.
Plenty of fear porn was thrown around when GPT-3 was released. I love GPT-3; I used it just yesterday and it is very good. I am still wondering when the total destruction of the world will happen because of GPT-3.
> Politics will be influenced. You can't trust anything you see anymore
How much could you trust media content previously? Staged footage, false narration, biased coverage are nothing new. A counter-intuitive side-effect of opening the Pandora's box could be a realization that media is a form of simulation. Perhaps this will lead more people to filter what they see through a prism of critical thinking.
As far as I'm aware countries fall broadly into two camps. Camp 1, USA for example, is concerned purely with the abuse of children, i.e., anything that depicts or is constructed of pieces of real children is illegal but other things such as drawings, stories, adults role playing, etc is not. Camp 2 outlaws any representation of it whether or not a child was involved.
Nowhere will a training set featuring pictures of naked children be legal.
> Nowhere will a training set featuring pictures of naked children be legal.
Appropriately from the recent news stories, but it's easy to imagine at least portions of such pictures being available for medical diagnostic purposes. I've sent pictures of my children to my doctor, so presumably in the future it's easy to imagine sending pictures to an AI to diagnose which would require a suitably fleshed out (pardon the pun) training set.
>Nowhere will a training set featuring pictures of naked children be legal.
True, but generalizing beyond the training set is precisely the point of machine learning. A good generative model will be able to produce such images, no matter how heinous the content is.
I think the actual problem is that this gives plausible deniability against photographic evidence, which might result in increase of bad behavior. Even cameras which cryptographically sign their output can't prove that the input was actually photographed from the real world, or if it's just an image of an image.
> you can fully expect your asshole friends to grab a dozen photos of you from Facebook and then make a hyperrealistic pornographic image of you with a gorilla
... Someone is gonna do this to children. This technology is gonna end up on the news. Maybe they'll even try to ban it.
Seems less likely that everyone will "just be cool with" having pornographic images of themselves with gorillas spread everywhere and more likely that this is the impetus for demagogues to issue a digital ID for access to the internet.
Isn't it good if people learn not to trust photos and images shared in social media and learn to treat them as entertainment. So much productivity gains :-)
One day these image generators will also get video support... and pornography support. When that happens, a few things may occur that I think are reasonable to predict:
EDIT: Original post was way too wordy, TL;DR:
When AI-generated pornography becomes available, it seems likely that demand for "real" pornography will disappear, because the AI will match and then surpass the "real" thing. When that occurs, the "real" will become increasingly regulated and legally risky, and may end up effectively banned outright.
There will be a huge and unkillable market for non-AI generated pornography, even if people cannot tell the difference in an AB test. The demand will be too strong and I don't think there will be much outcry to ban it if it's all consenting adults.
If people can't tell the difference in an AB test, how will the real porn outcompete the generated stuff? Porn distributors aren't known for their truth in advertising or care in sourcing material. And even if they were, how would they source the real stuff when anyone can create porn of anyone they imagine? You might say PKI will save us, but people aren't going to be typing out `gpg --verify` when their hands are otherwise occupied.
I didn't edit my post if you were referring to me. Either way, I don't have a strong argument for why I feel this to be true, but I do think people are going to scoff at ai generated porn.
You’re right. They call this an “ethical release”, but what ethics, I may ask. A profitable IPO is more likely to have been their consideration. As other researchers before them, they are willingly releasing something with the potential to do harm, or pave the way for it, washing their hands in innocence.
Played with it for a bit in DreamStudio so I could control more of the settings. So far everything it generates is "high quality", but the AI seems to lack the creativity and breadth of understanding that DALL-E 2 has. OpenAI's model is better at taking wildly differing concepts and figuring out creative ways to glue them together, even if the end result isn't perfect. Stable Diffusion is very resistant to that, and errs towards the goal of making a high quality image. If it doesn't understand the prompt, it'll pick and choose what parts of the prompt are easiest for it and generate fantastic looking results for those. Which is both good and bad.
For example, I asked it in various ways for a bison dressed as an astronaut. The results varied from just photos of astronauts, to bisons on earth, to bisons on the moon. The bison was always drawn hyper realistically, which is cool, but none of them were dressed as an astronaut. DALLE on the other hand will try all kinds of different ways that a bison might be portrayed as an astronaut. Some realistic, some more imaginative. All of them generally trying to fulfill the prompt. But many results will be crude and imperfect.
I personally find DALLE to be more satisfying to play with right now, because of that creativity. I'm not necessarily looking for the highest quality results. I just want interesting results that follow my prompt. (And no, SD's Scale knob didn't seem to help me). But there's also a place for SD's style if you just want really great looking, but generic stuff.
That said, the current version of SD was explicitly finetuned on an "aesthetically" ranked dataset. So these results aren't really surprising. I'm sure the next generations of SD will start knocking DALLE out of the park in both metrics. And, of course, massive massive props to Stability.ai for releasing this incredible work as open source. Imagine all the tinkering and evolving people are going to do on top of this work. It's going to be incredible.
"A bison as an astronaut, tone mapped, shiny, intricate, cinematic lighting, highly detailed, digital painting, artstation, concept art, smooth, sharp focus, illustration, art by terry moore and greg rutkowski and alphonse mucha"
I've found that sometimes you need a lot of attempts to get good results. Here's a run I just did, I'd say the top left result is OK, the bottom 2 are wrong/misinterpretations, and the top right result is fantastic: https://i.imgur.com/l8BsWvI.png
Interesting. I took a stab at your prompt and SD really struggles. It just completely ignores part of the prompt. Even craiyon puts in an effort to at least complete the entire prompt.
The bison is very realistic at least. So maybe the future is different models that have different specialties.
You can take an image and feed it into SD along with the prompt (look for img2img in the readme), I think you can use that to create the idea in craiyon and then move to SD for a quality finish.
It doesn't seem to understand things that DALL-E seems to (usually) understand. For example, it doesn't know that there are 5 fingers on most hands. It cuts objects in half randomly, etc.
I haven't found many of the images it produces to be really usable.
Is there any way to download this on my PC and run it offline? Something like a command-line tool like
$ ./something "cow flying in space" > cow-in-space.png
that runs with local-only data (i.e. no internet access, no DRM, no weird API keys, etc like pretty much every AI-related application i've seen recently) would be neat.
Yes, that's actually the biggest reason this is such a cool announcement! You just need to download the model checkpoints from HuggingFace[0], follow the instructions in their GitHub repo[1], and you should be good to go. You basically just need to clone the repo, set up a conda environment, and make the weights available to the scripts they provide.
What's the difference between those 4 checkpoints?
From the GitHub's README:
sd-v1-1.ckpt: 237k steps at resolution 256x256 on laion2B-en. 194k steps at resolution 512x512 on laion-high-resolution (170M examples from LAION-5B with resolution >= 1024x1024).
sd-v1-2.ckpt: Resumed from sd-v1-1.ckpt. 515k steps at resolution 512x512 on laion-aesthetics v2 5+ (a subset of laion2B-en with estimated aesthetics score > 5.0, and additionally filtered to images with an original size >= 512x512, and an estimated watermark probability < 0.5. The watermark estimate is from the LAION-5B metadata, the aesthetics score is estimated using the LAION-Aesthetics Predictor V2).
sd-v1-3.ckpt: Resumed from sd-v1-2.ckpt. 195k steps at resolution 512x512 on "laion-aesthetics v2 5+" and 10% dropping of the text-conditioning to improve classifier-free guidance sampling.
sd-v1-4.ckpt: Resumed from sd-v1-2.ckpt. 225k steps at resolution 512x512 on "laion-aesthetics v2 5+" and 10% dropping of the text-conditioning to improve classifier-free guidance sampling.
Which one is the general use case checkpoint one should be using?
Is Apple M1 support expected soon? Even if Apple's chips are slower, they have plenty of RAM on laptops. I saw some weeks ago that it was coming, but I am not sure where to follow the progress.
Sorry my bad, found the answer. One simply adds the following flags to the StableDiffusionPipeline.from_pretrained call in the example: revision="fp16", torch_dtype=torch.float16
Zero loss. All upside. It only causes issues when training. 32-bit ships by default because it is compatible with CPUs and with GPUs that might not have native fp16 support.
Edit: Just to be clear, your intuition that it could cause issues is certainly merited, and not _all_ models can be trivially converted from fp32 to fp16 without some new error accumulating during inference. Variational autoencoders like VQGAN, and GANs generally, are particularly prone to such issues.
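The memory upside of fp16 is easy to quantify with back-of-the-envelope arithmetic. A minimal sketch, assuming rough public parameter counts for the Stable Diffusion v1 components (the exact figures below are assumptions, not from this thread):

```python
# Approximate parameter counts for Stable Diffusion v1 (assumed figures):
# UNet ~860M, CLIP text encoder ~123M, VAE ~84M.
PARAMS = {"unet": 860_000_000, "text_encoder": 123_000_000, "vae": 84_000_000}

def weights_gib(total_params: int, bytes_per_param: int) -> float:
    """Size of the raw weights in GiB at a given numeric precision."""
    return total_params * bytes_per_param / 2**30

total = sum(PARAMS.values())
fp32_gib = weights_gib(total, 4)  # float32: 4 bytes per parameter
fp16_gib = weights_gib(total, 2)  # float16: 2 bytes per parameter
print(f"fp32: ~{fp32_gib:.1f} GiB, fp16: ~{fp16_gib:.1f} GiB")
```

Halving the bytes per parameter halves the resident weight memory, which is why the fp16 variant fits much more comfortably on consumer cards.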
Can you please tell me where the model.ckpt is? I can't find any weights in ".ckpt" format at either of the links you've given; there are only ".bin" files on Hugging Face.
For anyone else reading: you need the -original versions. The others are set up for the diffusers library, and I can't find a checkpoint file in those, just in the original one.
7. `ln -s <path/to/model.ckpt> models/ldm/stable-diffusion-v1/model.ckpt`. (You can download other versions of the model, like v1-1, v1-2, or v1-3, and symlink one of those instead if you prefer.)
To run:
1. activate venv with `conda activate ldm` (unless still in a prompt running inside the venv).
2. `python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms`.
Also, there is a safety filter in the code that will black out NSFW images, or images it expects to be otherwise offensive (presumably including things like swastikas, gore, etc.). It is trivial to disable by editing the source if you want.
Unfortunately I'm getting this error message (Win11, 3080 10GB):
> RuntimeError: CUDA out of memory. Tried to allocate 3.00 GiB (GPU 0; 10.00 GiB total capacity; 5.62 GiB already allocated; 1.80 GiB free; 5.74 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CON
I also have a 10 GB card and saw the same thing. To get it working I had to pass "--n_samples 1" to the command, which reduces the batch size to one (with the default settings you still get 2 images per run). This has been working fine for me.
Thanks! This worked for me finally. The conda approach suggested elsewhere was getting too complicated, and wasn't working properly (for me).
I built a simple UI around this, which installs Stable Diffusion's docker image, and lets you play with it locally in a browser-based UI. https://github.com/cmdr2/stable-diffusion-ui
As an aside, I wonder what performance would be like running this on a CPU (with the current GPU shortage this might well be a worthwhile choice). Even something like 30 minutes to generate an image on a multicore CPU would greatly increase the number of people able to freely play with this model.
Yes, clone the repo (https://github.com/CompVis/stable-diffusion), download the weights and follow the readme for setting up a conda environment. I am presently doing so on my RTX 3080.
This should be possible if someone just exported them to tflite or onnxruntime etc. (quantization could help a ton too). Not sure why people haven't yet. I'm sure it'll come in the next few days (I might do it).
You can try https://github.com/cmdr2/stable-diffusion-ui . It installs Stable Diffusion to your local computer, and provides a simple browser-based UI for playing with it. No need to mess with conda and other environment settings.
This release changes society forever. Free and open access to generate a hyper-realistic image via just a text prompt is more powerful than I think we can imagine currently.
Art, media, politics, conspiracy theories; all of it changes with this.
Photoshop requires experience and some talent, this doesn't. If I was some small rebel group in Africa or the Middle East with basically no money or training, I'd use this tool every single day until I was in power, or I'd frame my opposition as using it against the People.
Try doing that. Those people likely won't care much about you; they lean towards authority figures in their community. It is far easier to find those figures and corrupt them than to run some underground news agency changing the minds of millions of people.
In fact the former is often what happens anyway; it's cheaper and takes less time to execute. "Western" societies will be more resilient to this scenario, so mostly it's going to be a lot of "political" art we'll see.
This is like a gazillion Photoshops being released into the wild. Things change with scale, and there is a threshold where, if enough people start doubting often enough, then all the people will doubt all the time.
Photoshop requires hours of work from a skilled professional to create results of decent quality. Now anyone can do it for free, virtually instantaneously.
What will change society forever is when the hardware required to run this software is available in the latest medium/high-end phone and hundreds of millions of people can download an app to render whatever they want.
This is one of the most important moments in all of art history. Millions of people just got unconditional access to the state of the art in AI text-to-image, absolutely free less the cost of hardware. I have an Nvidia GPU myself and am thrilled beyond belief with the possibilities that this opens up.
Am planning on doing some deep dives into latent-space exploration algorithms and hypernetworks in the coming days! This is so, so, so exciting. Maybe the most exciting single release in AI since the invention of the GAN.
EDIT: I'm particularly interested in training a hypernetwork to translate natural language instructions into latent-space navigation instructions with the end goal of enabling me to give the model natural-language feedback on its generations. I've got some rough ideas but haven't totally mapped out my approach yet, if anyone can link me to similar projects I'd be very grateful.
>This is one of the most important moments in all of art history.
I agree, but not for the reasons you imply. It will force real artists to differentiate themselves from AI, since the line is now sufficiently blurred. It's probably the death of an era of digital art as we know it.
Making art already doesn't pay much for the vast majority of producers (outside of 3D modeling). There really aren't very many jobs in making art. I'd reckon most people are artists because they love doing it.
This is completely wrong, the digital art industry is enormous and, mind you, pretty uniformly pissed off that this model has been trained off of their non public-domain work.
They're mostly posting incorrect claims like "all it does is make collages out of other people's art", though. It doesn't do that; the model stores about 1 byte per original image it's seen.
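That "about 1 byte per image" figure can be sanity-checked with simple division. Both numbers below are rough, assumed values (a checkpoint on the order of 4 GB, and roughly 2.3 billion captioned images in LAION-2B-en):

```python
# Back-of-the-envelope check of the bytes-per-training-image claim.
# Both inputs are approximate, assumed figures.
checkpoint_bytes = 4e9    # v1 weights, roughly 4 GB
training_images = 2.3e9   # LAION-2B-en, roughly 2.3 billion images
bytes_per_image = checkpoint_bytes / training_images
print(f"~{bytes_per_image:.1f} bytes of weights per training image")
```

Whatever the exact figures, the ratio lands around one or two bytes per image, which is far too little to store collage-ready copies of the training data.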
Think of that 1 byte as a hyper-compressed essence of knowledge necessary to make said collage. Then it will be correct. Those models would not exist without the labor of the artist community - which in this case was used without consent and without pay.
These models would exist, just without the painterly style. Most of the training data is photos of real things, a lot of it stock photography that has been ripped off in much the same way as those artists' work. It is a challenge to copyright law, but if artists are allowed to learn by looking at other people's work, why treat an AI differently?
You're kicking the can down the road. Obviously, an AI is not a person, that's in the premise of the question. What makes an AI so different from a person that warrants differential treatment in this case? It's not that there aren't any good answers to this question, but yours is not much of an answer at all.
It's probably okay to learn facts from copyrighted material even if you're an AI - you can't reproduce the text of a novel but you can learn the meaning of a word from context in it. Similarly you can learn to draw a hand from looking at a ton of stock photos with hands, as long as you produce original hands.
AI will need some favorable legal precedents to avoid getting banned though, or else they'll have to only train off CC0 Flickr/Wikipedia scraping.
I also think it's a more obvious problem that it can reproduce copyrighted characters by eg prompting for "Homer Simpson".
Did we say the same in the 80s when audio sampling became a thing? We accepted it (after obligatory legal battles) and moved on, giving rise to the Creative Commons.
Yeah, the fact that these models are necessarily based on existing works leaves me hopeful that humans will remain the leaders in this space for the time being.
Human works are needed to create the initial datasets, but an increasing amount of models use generative feedback loops to create more training data. This layer can easily introduce novel styles and concepts without further human input.
The time is coming where we will need to, as patrons, reevaluate our relationships with art. I fear art is returning to a patronage model, at least for now, as certainly an industry which already massively exploits digital artists will be more than happy to replace them with 25% worse performing AI for 100% less cost.
The generated pictures posted in the blog post are superior to the average artist's work. Which isn't surprising; AI easily corrects "human mistakes" (e.g. composition, colors, contrasts, lines, etc.).
Why would people want to consume art that says nothing and means nothing? While this technology is fascinating, it produces the visual equivalent of muzak, and will continue to do so in perpetuity without the ability to reason.
That's the problem for me too. This tech is cool for games, stock images etc but for actual art it's pretty meaningless. The artist's experience, biography and relationship with the world and how that feeds into their work is the WHOLE point for me. I want to engage, via any real artistic product, with a lived human experience. Human consciousness in other words.
To me this technology is very clever but it's meaningless as far as real art goes, or it's a sideshow at best. Perhaps best case, it can augment a human artistic practice.
Oh, nice take! It reminds me of the chess world. Chess engines that could beat any human in the world have existed for a long time, but people wouldn't play them because losing all the time was boring.
But then came the neural-network chess engines (powered by the same training technology as text-to-image generators), which people even enjoyed playing for a while, due to the novelty and the way they learned to play from scratch (AlphaZero, Lc0, et al.).
In the process to get there, we got networks of all kinds of strengths, you could find one exactly as strong as you, and its mistakes were like the mistakes you would make.
And yet, they were missing the "human factor", people would rather play against other humans online, some even willing to pay for accounts in places like playchess and chess.com to play other humans.
As the networks became stronger, and Stockfish then assimilated them to get the best of both worlds with NNUE, nobody cared. There was even an explosion of human vs. human chess, when twitch.tv and YouTube stars played each other and the audience didn't even care how bad the chess was. It turned out to hold up as a great spectacle despite, or thanks to, the stars being novices and racing to get better at chess.
Now chess bots are just a curiosity and only used by people without online access to other humans, I wonder if it'll be the same for art, and if "show your work" becomes a thing.
You can ask an AI to produce a great picture... now try asking it to make a video of you making that art from scratch, showing the creative process. The whole art section of Twitch is about the artist's process, and yes, there are people who get enough from that to dedicate all their time to their art.
We want to interact with conscious entities for the obvious reason as well: we want to connect. A machine is a blind, dead entity; there's nothing to connect to. Also, even if the output is far superior to a human's in limited domains, e.g. chess, the way it arrives at these outcomes is sort of banal (if clever). It's not intelligence and it's not thinking; I think the "intelligence" part is a misnomer in AI, though perhaps that's a semantic argument. To me at least, consciousness is fundamental to our type of animal intelligence. I'm a naturalist through and through; we might even be able to create animal-like intelligence and consciousness one day, but until then at least, interacting with Turing machines is a cold, boring experience if you know what is really "inside".
The line gets blurry when a dead machine one day passes the Turing test, but if I ultimately knew I was interacting with a philosophical zombie, that would kill the appeal quickly.
It's easy to generate a believable backstory. A large LM can write the bio of the "artist" and even a few press interviews. If you like, you can chat with it, and of course request art to specification. You can even keep it as a digital companion. You can extend the realism of the game as much as you like.
Can photography be good art? Is Marcel Duchamp (found object) art? Can good art be discovered almost serendipitously, or can good art only be created by slowly learning and applying a skill?
I think art is mostly about perception and selection, by the viewer. There are others that think art is more about the crafting process by the artist. How do you tell the difference between an artist and a craftsperson?
We can tell the difference between muzak and "real music"; we just dislike the muzak. But the real risk and likelihood is that we get to the point that AI will be generating art that is indistinguishable from human-generated art, and it's only muzak if someone subscribes to the idea that the content of art is less relevant than its provenance. Some people will, particularly rich people who use art as a status signifier/money laundering vehicle, but mass media artists will struggle to find buyers among the less discerning mass audience.
Considering the phenomenal progression of Dall-E-1 to Dall-E-2 in just over a year, I'm not really understanding your confidence on the limits of AI content generation.
...among many other things. Plus, the training set for video is orders of magnitude smaller than for digital art. (And is additionally burdened with copyright issues.)
As I see it, there's simply no path from the DALL-E of today to something like that. And all for art that, essentially, "says nothing and means nothing".
Characters and dialogue are effectively solved, just look at GPT-3.
The entity behind StableDiffusion is also supporting generative music art, so let's see what is coming out of that: https://www.harmonai.org/
We are currently far away from generating a production quality movie with AI, but I don't think it's going to be nearly as long as a lifetime. In my opinion, we'll have high quality AI shorts within the decade.
>Characters and dialogue are effectively solved, just look at GPT-3.
Is this the motherlode of exaggeration?
Current language models cannot generate coherent dialog (and even then it's mostly bad dialog) spanning more than a minute or two. And their current capabilities in that area are definitely significantly below those of the average human writer.
We were talking about a Marvel action flick, I don't think incredible dialog spanning multiple minutes is much of a thing apart from exposition dumps. I asked GPT-3 to spit out some paragraphs from a hypothetical script for Thor 5:
INT. DARKNESS
We hear a faint beating heart. A moment later, we see a light slowly growing in the darkness. As the light grows, we see that it is coming from a glowing object in a person’s hand. The object is a hammer.
We see the face of the person holding the hammer. It is Thor. He looks tired and beaten.
Suddenly, we hear a voice from the darkness.
Black Panther: You are not welcome here, Thor.
Thor: I know. But I must speak with you.
Black Panther: You have nothing to say that I want to hear.
Thor: I come bearing a warning. Thanos is coming.
Black Panther: We are prepared.
Thor: He is not coming alone. He has an army.
Black Panther: So do we.
Thor: Thanos is not like any enemy you have faced before. He is ruthless and he will not stop until he has destroyed everything that you hold dear.
Black Panther: We will stop him.
Thor: I hope you can. Because if you cannot, then all is lost.
Eh, looks real enough to me. Fine tune the model with all the specialities that make up Marvel movies and you'll crank out good-enough drafts in no time.
>cannot generate coherent dialog (and even then it's mostly bad dialog) spanning more than a minute or two
I think that was pretty clear and that posted dialog is a perfect illustration.
You cannot generate the entire movie script coherently without significant human input and that's not going to change in the next several years. So, your initial claim that dialogue is "solved" is indeed false.
That's true. But the thing with technology, and the reason we've kept up with Moore's law, is that someone eventually has a bright idea that leaves current methods and improvement extrapolations in the dust. Then the real thing happens earlier than the most optimistic dates and performs better than people expected.
The question is not whether one day an AI can generate a movie that you can't differentiate from a human-made movie. The question is how long it will take for an AI-generated movie to be better than every human-made movie in history; if that's possible at all, it'll happen much sooner than people think.
Humans are trained on other humans' work as well though. Is there a type of ideological or aesthetic exploration that can't be expressed as part of an AI model?
Labour protections/willingness to strike, I suspect they mean. But I don't buy it. I've seen far too many people who "should" be worried about this technology instead be absolutely in love with it.
> particularly interested in training a hypernetwork to translate natural language instructions into latent-space navigation instructions with the end goal of enabling me to give the model natural-language feedback on its generations.
Imagine every conceivable image is laid out on the ground, images which are similar to each other are closer together. You’re looking at an image of a face. Some nearby images might be happier, sadder, with different hair or eye colours, every possibility in every combination all around it. There are a lot of images, so it is hard to know where to look if you want something specific, even if it is nearby. They’re going to write software to point you in the right direction, by describing what you want in text.
AFAICT: making a navigation/direction model that can translate phrase-based directions into actual map-based directions, with the caveat that the model would be updated primarily by giving it feedback the same way that you would give a person feedback.
Sounds only a couple of steps removed from basically needing AGI?
I suspect you'd want to start by trying to translate differences between images into descriptive differences. Maybe you could generate example pairs of images by symbolic manipulation, or maybe NLP can let us find differences between pairs of captions? Large NLP models already feel pretty magical to me and encompass things we would have said required AGI until recently, so it seems possible, though really tough.
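As a concrete example of the "navigation" primitive this subthread is circling around: spherical interpolation (slerp) between two latent vectors is a common way to walk between points in a generative model's latent space. This is a generic, model-agnostic sketch, not code from any of the projects mentioned:

```python
import math

def slerp(t: float, a: list[float], b: list[float]) -> list[float]:
    """Spherical interpolation between two latent vectors.

    Walking along the arc between two latents tends to keep the
    intermediate points in the high-density region the model was
    trained on, unlike a straight linear blend.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    omega = math.acos(max(-1.0, min(1.0, dot / (norm_a * norm_b))))
    if omega < 1e-8:  # vectors nearly parallel: fall back to lerp
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    s = math.sin(omega)
    wa = math.sin((1 - t) * omega) / s
    wb = math.sin(t * omega) / s
    return [wa * x + wb * y for x, y in zip(a, b)]

# Walk from latent a to latent b in four steps:
a, b = [1.0, 0.0], [0.0, 1.0]
path = [slerp(t / 4, a, b) for t in range(5)]
```

A natural-language navigation layer would then learn to pick the endpoint (or direction) from an instruction, while a primitive like this handles the actual movement.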
I have a friend who works as an artist, and he's both excited and nervous about this. He's also trying to learn how to use these tools well. If you try these AIs out, there's definitely an art to writing prompts that get what you actually want. Hopefully they will become just another tool in the artist's kit rather than a replacement.
I hope these end up similar to the relationship between Google and programming. We all know the jokes about "I don't really know how to code, I just know how to Google things". But using a search engine efficiently is a very real skill with a large gap between those who know how and those who don't.
Replying to myself because I just had a chat with him about this. He's thinking of getting a high end GPU now, lol.
Some ideas of how this could be useful in the future to assist artists:
Quickly fleshing out design mockups/concepts is the obvious first one that you can do right now.
An AI stamp generator. Say you're working on a digital painting of a flower field. You click the AI menu, "Stamp", a textbox opens up and you type "Monarch butterfly. Facing viewer. Monet style painting." And you get a selection of ai generated images to stamp into your painting.
Fill by AI. Sketch the details of a house, select the Fill tool, select AI, click inside one of the walls of the house, a textbox pops up, you write "pre-war New York brick wall with yellow spray painted graffiti"
I have a friend who paints (google "athanasart"). His paintings are usually purchased before they are even finished, and sometimes even stolen. All of this technology (DALL-E 2, Disco Diffusion, Stable Diffusion) is totally boring to him. He doesn't even care. He won't install it on his machine, or use it online, even if he's paid to. I love Craiyon, DALL-E 2, and Stable Diffusion, but merely beautiful images are not art.
But the definition of art can't depend on your knowledge of how it was made. If I showed you two beautiful pictures and asked whether either of them is art, you'd know I was trying to trick you. But you could pass a piece made by an AI off as art if you didn't know how it was produced.
Yes, if someone received a painting without any additional information, a computer art piece could easily be mistaken for a human art piece. That's true. My position on the subject is this: the best art for humans is made by humans, and the best art for computers is made by computers. I have yet to see a computer generation on the same level as the Mona Lisa, not even close. However, computer-generated women, some of them are alright!
Kinda like the creation of Copilot and its ilk for programmers, and GPT-3 for writers. I've seen some talk recently about "prompt engineers"... Probably, to some extent, every job will become prompt engineering in some way.
Eventually I suppose the AIs will also do the prompts.
At which point I hope we've all agreed to a star trek utopia, or it's gonna get real bad. Or maybe it'll get way better.
I think right now we could set up the AIs to do the prompts. You type in a vague description, e.g. "gorilla in a suit", and that is passed to GPT-3's API with instructions to provide a detailed and vivid description of the input in style X, where X is one of several different styles. GPT-3 generates multiple prompts, the prompts are passed to Stable Diffusion, and the user gets back a grid of different images and styles. Selecting an image on the results grid prompts for variations, possibly of both the prompt and the images.
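A minimal sketch of that prompt-expansion step, with fixed templates standing in for GPT-3 (the style strings and function names here are made up for illustration):

```python
# Hypothetical prompt expander: turn a vague subject into several
# styled prompts before handing them to an image model. A real
# version would call an LLM API instead of using fixed templates.
STYLES = {
    "photo": "a detailed photograph of {subject}, sharp focus, studio lighting",
    "oil": "an oil painting of {subject}, rich brushwork, chiaroscuro",
    "concept": "{subject}, concept art, trending on artstation, cinematic",
}

def expand_prompt(subject: str) -> dict[str, str]:
    """Return one fleshed-out prompt per style for the given subject."""
    return {name: tpl.format(subject=subject) for name, tpl in STYLES.items()}

grid = expand_prompt("gorilla in a suit")
# Each value would be sent to the image model as its own prompt,
# producing the grid of images and styles described above.
```

The selection-and-variation loop would then feed the chosen style back through the expander with a perturbed subject or template.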
Yeah, if we're gonna replace every fucking profession with a half-assed good-enough AI version, what're we even here for? We're sure not all gonna survive in a capitalist society where you have to create some kind of "value" to earn enough money to pay for your roof and your food and your power.
IIRC there is some vague "it sure got real bad" period somewhere in the Trek timeline between "capitalism ended" and "post-scarcity utopia", and I sure am not looking forward to living through those times. Well, I'm looking forward to part of that: the part where we murder a lot of landlords and rent-seekers and CEOs and distribute their wealth. That'll be good.
Next let's get rid of all the artists and replace them with AI. Redistribute their skills so we can all make art. Oh wait that just happened. You're a hypocrite that wants to redistribute the wealth of others, but not your own.
Also joking about murdering people is bad taste and not how you convey a point or win an argument. Very low class.
> Next let's get rid of all the artists and replace them with AI. Redistribute their skills so we can all make art. Oh wait that just happened. You're a hypocrite that wants to redistribute the wealth of others, but not your own.
Recognising that doing the former without the latter demonstrably hurts people isn’t being hypocritical. Hence all the talk of post-scarcity. Post-scarcity for me not for thee is very much a sign of the times though.
There weren't "uprisings", mostly because the destroyed jobs were replaced with others that paid better and were less dangerous. There were some local minima (Detroit auto workers) where this happened only partially, and we know the pathology that led to.
Is this replacement of jobs happening this time around? No. So violence there will be.
(OP here) I agree. I am an artist (not by trade but by lifelong obsession) in several different mediums, but also an AI engineer--so I feel a weird mixture of emotions tbh. I'm thrilled and excited but also terrified lol.
It's my fucking job. I've spent my whole fucking life getting good at drawing. I can probably manage to keep finding work but I am really not happy about what this is going to do to that place where a young artist is right at the threshold of being good enough to make it their job. Because once you're spending most of your days doing a thing, you start getting better at it a lot faster. And every place this shit gets used is a lost opportunity for an artist to get paid to do something.
I wanna fucking punch everyone involved in this thing.
I mean I get what you're saying, sucks to have someone or something take your job, but isn't this a neo-luddite type argument? AI is gonna come for us all eventually.
Please save this comment and re-read it when a new development in AI suddenly goes from "this is cute" to "holy fuck my job feels obsolete and the motherfuckers doing it are not gonna give a single thought to all the people they're putting out of work". Thank you.
Look at that, you said the same thing to me 8 days ago [0]. I'll stick to the same rebuttals you got for that comment as well, namely that AI comes for us all, the only thing to be done is to adapt and survive, or perish. Like @yanderekko says, it is cowardice to assume we should make an exception for AI in our specific field of interest.
Admittedly as someone who's been subscribing to Creative Cloud for a while I already wanna punch a lot of people at Adobe so the people working on this particular part of Photoshop are gonna have to get in line.
You've got to move one step higher and work with ideas and concepts instead of brushes. AI can generate more imagery than was previously possible, so it's going to be about storytelling or animation.
I make comics, it's already about storytelling and ideas as much as it is about drawing stuff. I make comics in part because I like drawing shit and that gives me a framework to hang a lot of drawings on. I like the set of levels I work at and don't want to change it. I've spent an entire fucking lifetime figuring out how to make my work something I enjoy and I sure bet nobody involved in this is gonna fling a single cent in the direction of the artists they're perpetrating massive borderline copyright infringement upon.
But here's all these motherfuckers trying to automate me out of a job. It's not even a boring, miserable job. It's a job that people dream of having since they were kids who really liked to draw. Fuck 'em.
This AI thing feels so wrong, yet strangely enough I'm not that stressed.
Look at the recursive side of it, it's hilarious. I'm an artist. AI is around. Do I want to put my work online? No. So no artists put stuff online anymore? So how does the future of art work, exactly? Will we send booklets by mail again?
Then there's the democratic side of it. We the people ultimately decide. Those pictures are generated using artists' works, not out of thin air. So it's not a freedom question but a people one: do we want that or not? Maybe restrict it to pictures 30+ years old, for example; that could be good for the economy.
I mean, you can literally replace any job you want except the artist's, it seems. Imagine when people are jobless because AI replaced them. You'd think they'd want to make art during their retirement, right?
It's a case where the snake eats itself. If AI ruins our online life, we'll stop going online.
Yeah, going offline begins to look tempting, except that a huge part of how you get people to pay for your work is sharing it publicly. And then someone can come along, scrape it all, and dump it into their AI training black box without giving two shits about copyright.
All of the artists of the world will remove their online work. Who'd want to put their work online again? It's not even in the interests of the big players like Epic Games or Marvel or Disney. Nah, I don't see how this is gonna fly.
Automation doesn't eliminate jobs. As such this automation won't eliminate your job, QED.
(It's bad monetary policy that eliminates jobs. The US has very very low unemployment right now so it doesn't appear that's been happening.)
Go ahead and try it. It'd be impossibly harder to recreate one of your own pictures with it than it was for you to make one. Instead what's going to happen is you'll get more efficient ways to do backgrounds and unimportant bits of an image, just like Blender and CSP provide 3D models to do layouts with.
Check out the melodies I made with an AI assistant I created (human-in-the-loop still, but much quicker than if I tried to come up with them from scratch): https://www.youtube.com/playlist?list=PLoCzMRqh5SkFwkumE578Y.... There are also good AI tools for other parts of making music, like singing voice generation.
AI will just replace existing jobs. It's unethical, since a lot of people's livelihoods will be destroyed, but our leaders, sponsored by the majority's insatiable appetite for power, will soon make it legal.
AI is a new tool that will automate away a lot of workers, like other machines before it.
What happens with these workers is what defines us.
And there are a lot of influencers saying that it will be "really nice that AI will replace the boring jobs so we can focus on creative/fulfilling life" yeah right...
Ironically, with Moravec's Paradox, (digital) creative tasks will probably be automated while the boring tasks of moving boxes around might not be for a while:
> Moravec's paradox is the observation by artificial intelligence and robotics researchers that, contrary to traditional assumptions, reasoning requires very little computation, but sensorimotor and perception skills require enormous computational resources. The principle was articulated by Hans Moravec, Rodney Brooks, Marvin Minsky and others in the 1980s. Moravec wrote in 1988, "it is comparatively easy to make computers exhibit adult level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility".
DALL-E 2 just got smoked. Anyone with a graphics card isn't going to pay to generate images, or have their prompts blocked because of the overly aggressive anti-abuse filter, or have to put up with the DALL-E 2 "signature" in the corner. It makes me wonder how OpenAI is going to work around this because this makes DALL-E 2 a very uncompetitive proposition. Except, of course, for people without graphics cards, but it's not 2020 anymore.
DALL-E's filters are so harsh that I find myself often in the situation where I don't even understand how what I prompted could possibly be in violation.
It's a novel feeling, but utterly stifling when it comes to actual creativity, and I'm not even trying to push any NSFW boundaries, just explore the artspace. Once I can run unfiltered on my own GPU, DALL-E will never get used by me again.
Midjourney also completely destroys DALL-E from a price perspective, effectively allowing nearly unlimited generation for approximately $50 a month.
Even though DALL-E tends to be better at following prompt details, you're inhibited from being able to explore the space freely because of how prohibitively expensive it can become.
People are saying you need a GPU with 6.9GB of RAM for the current model, so in practice at least an 8GB GPU.
Thankfully, GPU prices have finally calmed down and you can get one for a reasonable price. I think any of the RTX 3000 series desktop GPU's should do it, for example.
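For a rough sanity check on those numbers, you can estimate the footprint of the weights alone from the parameter count (the ~1B total is my own round-number assumption for illustration, not a figure from the release; activations during sampling add several more GB on top, which is how you get to the ~6.9GB people report):

```python
# Back-of-the-envelope VRAM estimate for the model weights alone.
# The 1e9 parameter count is an assumed round number for illustration;
# sampling-time activations and buffers add several GB on top of this.
def weight_footprint_gib(n_params: int, bytes_per_param: int) -> float:
    """Size of the raw weights in GiB."""
    return n_params * bytes_per_param / 2**30

N_PARAMS = 1_000_000_000  # assumed, for illustration

fp32 = weight_footprint_gib(N_PARAMS, 4)  # ~3.7 GiB
fp16 = weight_footprint_gib(N_PARAMS, 2)  # ~1.9 GiB

print(f"fp32 weights: {fp32:.1f} GiB, fp16 weights: {fp16:.1f} GiB")
```

So the weights themselves fit comfortably in 8GB, especially at half precision; it's everything else the sampler allocates that eats the rest.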
I just tried it via this link. I'm not sure what I'm looking at here but the results were extremely underwhelming. I've used Dall E 2 and Midjourney extensively so I know what they're capable of. Maybe I'm missing something?
Every model I've used initially seemed poor compared to the one I was just using. It takes time to figure out their sweet spot and what kind of prompts they excel at.
I've had a lot of great results from SD - but different great results to Dall-E.
I've only used it via discord, but it's much better than Midjourney and sometimes better than Dall-E. So maybe that site isn't the same thing or you need to work on your prompts.
Wait for the 4070, it should be around the perf of a 3080/maybe 4090 for ~500 bucks if rumors hold up. It is coming in a few months. NZXT used to make good pre-builts. Not sure which others have a good rep. DO NOT BUY DELL.
They were saying that before the 3080 was released. It will be cheaper and better than current hardware. Then when it came out you couldn't buy it for a year.
You aren't getting on-site service with a consumer product. There are plenty of 3rd party people who can service your computer, though. It's like Lego.
Dell uses crappy proprietary tech, poor quality components, and they have an all around bad reputation.
NZXT uses good components and they make some of the best cases you can buy.
I don't know much about Lenovo's desktop products.
You might try posting on reddit.com/r/suggestapc and ask about the best service contracts and high quality system integrators.
Edit: that particular reddit looks pretty dead actually. The big one is r/buildapcsales , you can take a look at their side bar or discord and ask around.
One more thing, GamersNexus on YouTube does reviews of pre builts and they are the best at this sort of thing. Their community is likely very helpful as well.
The biggest issue with most pre-builts is terrible airflow making the expensive components throttle. The (Dell) Alienwares are some of the worst for this.
> You aren’t getting on-site service with a consumer product.
Both Dell and Lenovo do this.
My daughter bought an Alienware laptop for college and when the keyboard broke, they sent a technician to her dorm to fix it (and she goes to school outside of the US).
If on-site service isn’t an option, what about a Mac Pro? At least with Apple I can take the machine to a store if I need to.
Apparently the model decompresses, and it won't fit very well on the 8GB models... I'm willing to give the max settings a spin on my 3070 Ti, but I'm not very hopeful.
It says that NVIDIA chips are recommended but that they are working on optimizations for AMD. This implies to me that it probably involves CUDA stuff and getting it to run on a Radeon would be potentially difficult (I am not an expert on the current state of CUDA to AMD compatibility, though).
AMD's answer to CUDA is called ROCm. I've been doing a little research on it since a few weeks ago and it seems to be funky when not outright broken. It's absolutely maddening that after all this time AMD doesn't have proper tooling on consumer GPUs.
"This release is the culmination of many hours of collective effort to create a single file that compresses the visual information of humanity into a few gigabytes."
If something like this is possible, does this mean there's actually far less meaningful information out there than we think?
Could you in fact pack virtually all meaningful information ever gathered by humanity onto a 1TiB or smaller hard drive? Obviously this would be lossy, but how lossy?
You can pack virtually all meaningful information ever gathered by humanity onto a single bit, but it's going to be lossy. And what is your definition of "meaningful information" anyway? What's meaningful today might not be meaningful tomorrow, and vice versa. Nobody cares about the spin of each electron in my brain today, but in 4 centuries my descendants will be like "if only we had that information, we could simulate our great-...-great parent today".
If you play with JPEG quality you’ll see that the difference is barely perceptible for a while and then if you keep going down it becomes very noticeable.
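That cliff is easy to demonstrate even without JPEG's actual DCT machinery; here's a toy uniform-quantization sketch (my own illustration, not real JPEG coding) showing how mild quantization is nearly invisible while aggressive quantization produces large errors:

```python
# Toy lossy compression: quantize 8-bit samples down to fewer levels.
# Mild quantization (128 levels) barely moves any value; aggressive
# quantization (8 levels) produces big, visible banding errors.
# This is plain uniform quantization, not actual JPEG DCT coding --
# it just exhibits the same quality/size tradeoff.
def quantize(samples, levels):
    step = 256 / levels
    return [round(int(s // step) * step + step / 2) for s in samples]

def max_error(samples, levels):
    return max(abs(s - q) for s, q in zip(samples, quantize(samples, levels)))

samples = list(range(0, 256, 5))  # a smooth 8-bit brightness ramp

print(max_error(samples, 128))  # off by at most 1 -- imperceptible
print(max_error(samples, 8))    # off by up to 16 -- obvious banding
```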
I've been looking forward to this. The license however strikes me as too aspirational, and it may be hard to enforce legally:
> You agree not to use the Model or Derivatives of the Model:
> - In any way that violates any applicable national, federal, state, local or international law or regulation;
> - For the purpose of exploiting, harming or attempting to exploit or harm minors in any way;
> - To generate or disseminate verifiably false information and/or content with the purpose of harming others;
> - To generate or disseminate personal identifiable information that can be used to harm an individual;
> - To defame, disparage or otherwise harass others;
> - For fully automated decision making that adversely impacts an individual’s legal rights or otherwise creates or modifies a binding, enforceable obligation;
> - For any use intended to or which has the effect of discriminating against or harming individuals or groups based on online or offline social behavior or known or predicted personal or personality characteristics;
> - To exploit any of the vulnerabilities of a specific group of persons based on their age, social, physical or mental characteristics, in order to materially distort the behavior of a person pertaining to that group in a manner that causes or is likely to cause that person or another person physical or psychological harm;
> - For any use intended to or which has the effect of discriminating against individuals or groups based on legally protected characteristics or categories;
> - To provide medical advice and medical results interpretation;
> - To generate or disseminate information for the purpose to be used for administration of justice, law enforcement, immigration or asylum processes, such as predicting an individual will commit fraud/crime commitment (e.g. by text profiling, drawing causal relationships between assertions made in documents, indiscriminate and arbitrarily-targeted use).
How can you prove some of these in a court of law?
It seems like the sort of things you would require so that HuggingFace doesn't get roped in as defendants in lawsuits related to things that others do with the code. So if for example someone builds something that generates medical advice and gets sued for violating FDA requirements or damages or whatever then HuggingFace can say that was not something they allowed in the first place.
They really should have used blanket liability waiver text and left it at that.
I’m sure someone will find a way to sue them anyway. It doesn’t even call out using this to create derivative works to avoid paying original authors copyright fees.
On top of that, their logo is an obvious rip off of a Van Gogh. It seems clear they’re actively encouraging people to create similar works that infringe active copyrights. They should ask Kim Dotcom how that worked out for him.
> On top of that, their logo is an obvious rip off of a Van Gogh. It seems clear they’re actively encouraging people to create similar works that infringe active copyrights.
I don't think Van Gogh's works are under copyright any more. At least not directly, recent photos of them may be but that's the photos not the paintings that have a copyright.
I'm actually seeing these types of conditions becoming more common in software EULAs as well, as a boilerplate add-on to the usual copyright notices and legal disclaimers. I don't have examples off the top of my head, but I've seen clauses that the application may not be used for the enablement of violence, for discriminatory purposes, and so forth. It really is a CYA sort of thing.
Each of the “for any use ... which has the effect of ...” clauses probably bars any wide distribution of the output of this model.
Trivially: People have phobias of literally everything.
They ban using it to “exploit” minors, presumably that prevents any incorporation of it into any for-profit educational curriculum. After all, they do not define “exploit”, and profiting off of a group without direct consent seems like a reasonable interpretation.
I am not a lawyer, but I wouldn’t dream of using this for commercial use with a license like this. This definitely doesn’t meet the bar for incorporation into Free or Open Source Software.
I dunno. I can imagine any of those points being the subject of a civil suit and for someone to win damages, for e.g. psychological harm. The parts talking about “effect” instead of intent are of questionable enforceability - how can I agree not to cause an unanticipated effect on a third party? I cannot. But having said that, I can be asked to account for effects that a “reasonable person” would anticipate, so there’s that.
These are all things that someone could sue over (especially in California) and so they’re wanting to place the responsibility on the artist and not their tools.
The license does seem impossibly vague and broad. Usually what happens when software projects use custom & demanding licenses like this is that large companies refuse to allow the software to be used because of the legal uncertainty, small companies just use it and ignore the licensing constraints, and there are never any lawsuits that clarify anything one way or another. If that's fine with the authors of the project, they can just leave the license vague and unclear forever.
That license from top to bottom is distilled "tell me you don't know anything about art without telling me you don't know anything about art".
Interesting art challenges its audience. But even the most boring art will still offend some-- it's the nature of art that the viewer brings their own interpretation, and some people bring an offensive one.
CYA doesn't necessitate creating a cause of action against the users for engaging in what otherwise would be a legally protected act of free expression. One can disclaim without creating liability.
I'm getting a 403 on the Colab (while successfully logging in and providing a huggingface token). Is it already disabled? Do you have to pay huggingface to download the model? It's unclear from the Colab and post where the issue is.
{"error":"Access to model CompVis/stable-diffusion-v1-4 is restricted and you are not in the authorized list. Visit https://huggingface.co/CompVis/stable-diffusion-v1-4 to ask for access."}
Eventually I focused and realized I need to visit that URL to solve the issue. Hope this helps.
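For what it's worth, that 403 body is plain JSON, so a script can surface the access-request URL instead of burying it in a traceback (a small stdlib-only sketch; the body below is the one quoted above):

```python
import json
import re

# The 403 body returned by the hub, as quoted above.
body = ('{"error":"Access to model CompVis/stable-diffusion-v1-4 is '
        'restricted and you are not in the authorized list. Visit '
        'https://huggingface.co/CompVis/stable-diffusion-v1-4 '
        'to ask for access."}')

message = json.loads(body)["error"]
# Pull the first URL out of the message so the fix is obvious.
url = re.search(r"https://\S+", message).group(0)
print(url)  # the page where you accept the model's license terms
```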
With all due respect, I've been using it for over a week and I don't think you've given it a fair shot.
There's plenty of cases it's worse than Dall-E and there's plenty of cases where it's better. Overall it seems to show less semantic understanding but it handles many stylistic suggestions much better. It's definitely in the right ballpark.
In fact I'm still using a wide range of models - many of which aren't regarded as "state of the art" any more - but they have qualities that are unique and often desirable.
Agreed. I still primarily use vqgan + clip, which is nowhere near state of the art, but produces really interesting results. I’ve spent a long time learning to get the best out of it, and while the results aren’t very coherent, it’s great at colour, texture, materials and lighting.
The last one was particularly crappy. It gave me a red house with a driveway, but no car. And the house wasn't even really a house. It superficially looked like one but was actually two garages put together.
Here's some random prompts I've had nice results from:
iridescent metal retro robot made out of simple geometric shapes. tilt shift photography. award winning
Scene in a creepy graveyard from Samurai Jack by Genndy Tartakovsky and Eyvind Earle
virus bacteria microbe by haeckel fairytale magic realism steampunk mysterious vivid colors by andy kehoe amanda clarke
etching of an anthropomorphic factory machine in the style of boris artzybasheff
origami low polygon black pug forest digital art hyper realistic
a tilt shift photo of a creepy doll Tri-X 400 TX by gerhard richter
I guess I might have spent more time reading guides on "prompt engineering" than you. ;-) I think maybe Dall-E is more forgiving of "vanilla prompts".
However I do get nice results from simpler prompts as well. I just tend to use this style of prompt more often than not.
It depends on the domain. Artsy will do better with Stable Diffusion, but realistic/coherent output with Stable Diffusion is harder to do especially compared to DALL-E 2.
Really interesting. I wonder if at some point it would be possible to optimize a network for size and speed by focusing on a specific genre, like impressionist or only pixel art. I like that I can get an image in any style I want, but that has to increase the workload substantially.
Pretty cool. Although it's interesting that it can't seem to render an image from a precise description that should have something like an objectively correct answer. I tried prompts like "Middle C engraved on a staff with a treble clef" and "An SN74LS173 integrated circuit chip on a breadboard" both of which came back with images that were nowhere close to something I'd call accurate. I don't mean to detract from the impressiveness of this work. But I wanted a sense of how much of a "threat" this tech is to jobs or to skills that we normally think of as being human. Based on what I'm seeing, I'd say it's still got a ways to go before it's going to destroy any jobs. In its current form, it mostly seems like a fun way to generate logos or images where the exact details of the content don't matter.
I am generally of the "it's not threatening yet and won't be for a while" camp, but in this particular case it's probably just for lack of trying. These algorithms are essentially enormous pattern-matching engines, so given enough data and some task-specific engineering effort, I wouldn't be surprised if you could build an "AI" circuit designer, like Copilot but for electronics instead of code.
Next-level autorouting would be cool, but it's still not going to put the electrical engineering field out of business.
Sure, but this model is a general image synthesizer that is trained on a massive amount of data that is openly available on the internet. Given that, I would assume that it would have seen many images of 74xx series chips and also musical notation. So I would think that the most "likely" image to generate would involve either a chip with 8 pins on either side (for the 74ls173) or a note on a single leger line just below a staff with a treble clef. I imagine there must be hundreds of 74xx chip images that would establish that fact and also thousands of images of musical notation that would establish the other.
I guess the takeaway really is that the model does not function in such a way that it can recall its training data. Which is fine. I don't think I should expect it to. On the other hand, GPT-3 can be made to produce specific facts that are established by its training data. Although, admittedly it often gets things wrong. Maybe the problem of image modeling is just naturally harder than language modeling. After all, language already "directly" represents meaning in some sense much more than arbitrary images do.
I'm sure someone could design a targeted model that would solve the issues I'm talking about. But I feel as though they shouldn't have to if we really had something that sees the world the way humans do. In any case, this work definitely seems cutting edge and represents a huge leap in that direction.
First, about 18 months ago they said I had an IP phone and could no longer log in (my carrier is Republic Wireless). When I contacted support and told them that I had been able to log in with that number in the past, they basically said "we're tightening up security, too bad so sad, get another phone number". Then recently I found that I was able to log in when I wanted to try Midjourney (Republic changed something back in April, apparently, so it no longer looks like an IP phone number). Then I wanted to log in to Discord on my desktop and it gave me some weird errors which basically amounted to "this number is already claimed" (it was, by me), so now I'm back to ignoring Discord again.
How hard would this be to re-train to more domain-specific images? For example, if I wanted to teach it more about specific birds, cars or plane models?
Only a good education can help stop most bad things, but I don't think you can stop all of them. People will be people, people like challenges, and they will always figure out a way.
No, since the image itself is irrelevant. It matters how despised a person is by the general populace. Any slightly believable incriminating visual will do as an explanation if the public is already predisposed.
Thankfully, these pictures are much better than anything Photoshop can produce. Sadly, it's a part of a larger trend where we're getting more and more tools to create, modify, and exchange information, while the tools for analysis and filtering information are at least 50 years behind.
Such a tool will be used to generate lewd imagery involving virtual minors.
There's no way to prevent it upstream by outlawing feeding it real content (whose possession already is illegal). It would suffice to add a 'childrenize' layer onto an adult NN or something.
How will the legal system react? Bundle it into illegal imagery, period? Maybe that's already the case; I think a drawing made public is, but I'm not sure. If not, on what grounds could it be? No real minor would be involved in that production.
I got the weights from Huggingface, put them in the correct directory, but it still segfaults. I'd probably see a better error message if the weights were missing, no?
Anything close to a decent modern gaming PC will do it fine. I'm running on a laptop 3080 and I can generate 768x512 in about 20 seconds (with a 30 second overhead per batch)
I still don't understand what the value of specifying a license for generated images is - at least in terms of enforceability. How could anyone reliably, conclusively determine that an image was generated using a locally-run tool?
I suppose in the case of DALL-E they probably save a copy of every generated image and can use some sort of reverse image search and find if/when the image was created.
With StableDiffusion or any other local tool I don't see how that would be possible. The best I think one could do is come up with a secondary tool that can pinpoint the position in latent space a generated image came from (if that's even possible, I have a limited understanding of how exactly these systems work.) But then, if (heaven forbid) someone applied any sort of cropping or post-processing to the image, an approach like that easily gets blown out of the water.
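For intuition on why naive matching breaks so easily, here's a minimal average-hash sketch over a grayscale pixel grid (my own toy example, not part of any actual provenance system; real perceptual hashes downscale and smooth first). Even a single post-processed pixel already flips hash bits:

```python
# Minimal "average hash": each pixel becomes 1 if it is above the
# image's mean brightness, else 0. Real perceptual hashes downscale
# the image first; this toy version skips that step.
def average_hash(pixels):
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)

def hamming(a, b):
    """Number of differing bits between two equal-length hashes."""
    return sum(x != y for x, y in zip(a, b))

img = [[10, 200], [220, 30]]       # original 2x2 grayscale "image"
edited = [[10, 200], [220, 240]]   # one "post-processed" pixel

h1, h2 = average_hash(img), average_hash(edited)
print(hamming(h1, h2))  # nonzero: a one-pixel edit already moved the hash
```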
That is not so; the CC0 explicitly states that patent and trademark rights are not waived.
Contrast that with, say, the Two-Clause BSD which says "[r]edistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met [...]".
Since trademark and patent rights are not mentioned, then these words mean that even if the purveyor of the software holds patents and/or trademarks, your redistribution and use are permitted. I.e. it appears that a patent and trademark grant is implied if a patent holder puts something under the two-clause BSD. Or at least you have a ghost of a chance to argue it in court.
Not so with the CC0, which spells out that you don't have permission to use any patents and trademarks in the work.
/ic/ is having daily meltdowns over this. I don’t think the internet at large is doing better, because even professional concept artists are dialing it in now. Holy hell.
You agree not to use the Model or Derivatives of the Model:
- In any way that violates any applicable national, federal, state, local or international law or regulation;
- For the purpose of exploiting, harming or attempting to exploit or harm minors in any way;
- To generate or disseminate verifiably false information and/or content with the purpose of harming others;
- To generate or disseminate personal identifiable information that can be used to harm an individual;
- To defame, disparage or otherwise harass others;
- For fully automated decision making that adversely impacts an individual’s legal rights or otherwise creates or modifies a binding, enforceable obligation;
- For any use intended to or which has the effect of discriminating against or harming individuals or groups based on online or offline social behavior or known or predicted personal or personality characteristics;
- To exploit any of the vulnerabilities of a specific group of persons based on their age, social, physical or mental characteristics, in order to materially distort the behavior of a person pertaining to that group in a manner that causes or is likely to cause that person or another person physical or psychological harm;
- For any use intended to or which has the effect of discriminating against individuals or groups based on legally protected characteristics or categories;
- To provide medical advice and medical results interpretation;
- To generate or disseminate information for the purpose to be used for administration of justice, law enforcement, immigration or asylum processes, such as predicting an individual will commit fraud/crime commitment (e.g. by text profiling, drawing causal relationships between assertions made in documents, indiscriminate and arbitrarily-targeted use).
The last point seems to be the only thing that isn't already illegal; all the other restrictions seem to be covered under "you are not allowed to break laws", which is somewhat redundant.
While neat, and no doubt impressive, it still utterly fails on prompts that should be completely reasonable to any sane human being/artist.
Take something like "A cat dancing atop a cow, with utters that are made out of ar-15s that shoot lazer-beam confetti". A vivid picture should spring to mind, and no doubt I could imagine an artist having a lot of fun creating such a description... Alas, what the model spits out is pure unusable garbage.
More complex/weirder prompts aren't going to work yet, no.
What will probably happen with these models is that for more advanced stuff, you may end up using the "inpainting" that DALL-E already has going, where you can sort of mix, match, and combine images. That way you could have the cat, for example, rendered separately, thereby simplifying each individual prompt.
The referent of "utters" (sic) is ambiguous, so I can imagine a model having more difficulty with it than usual. Regardless, the current SOTA does need more specific and sometimes repetitive prompting than a human artist would, but it's surprising how much better results you can get from SOTA models with a bit of experience at prompt engineering.
This is, in part, what I'm trying to point out, it's an obvious typo given the context, and something that you or I would be able to pick up on, yet it completely breaks (it spit out a bunch of weird confetti cats for me). Perhaps I'm being a little harsh, but if it requires word-perfect tuning and prompt engineering, it speaks to something about the 'stupidity' of these models. It's a neat trick, but to call it anything in the realm of artificial intelligence is a bit of a joke.
[0]A gorilla if you're lucky, to be honest.