> Using the model to generate content that is cruel to individuals is a misuse of this model. This includes, but is not limited to:
...
>+ Sexual content without consent of the people who might see it
I understand that it's their TOS and they can put pretty much anything in there, but this item seems... odd. I don't really know exactly why it stands out to me. Maybe it's because it's practically unenforceable? Are they just covering all their bases legally?
Trying to think of a good metaphor; let's try this: if you are an artist and someone commissions you to create an art piece that might be sexual, can you say "ok, but you have to ask for consent before you show it to people", and enshrine it in the contract? Obviously gross violations like trolling by spamming porn are pretty clear cut, but what about the more nuanced cases, like when you display it on your personal website? Are you supposed to have an NSFW overlay? Isn't opening a website sort of implying that you consent to seeing whatever is on there, unless you have a strong preconception of what content the page is expected to display?
I might be hugely overthinking this.
Stable Diffusion is developed at LMU Munich and this particular line basically paraphrases § 184 of the German criminal code, which makes it a misdemeanor crime to put porn in places reachable by minors or to show porn to someone without being asked to do so, among other things. I dunno why they felt compelled to include it though.
Regarding your examples, most of these are technically criminal in Germany, because the only legally safe way to have a place that is not reachable by minors means adhering to German youth protection laws, which you're not going to do, just like every porn site, Twitter, Reddit, etc.
I think the issue they're mainly worried about might be exemplified with a prompt of 'my little pony', a children's show with quite a lot of adult imagery associated with it on the internet.
A child entering this prompt is probably expecting one thing, but the internet is filled with pictures of another nature. There are possibly more adult 'my little pony' images than screenshots of the show on the internet.
Did the researchers manage to filter out these images before training? Or is the model aware of both 'kinds' of 'my little pony' images? If the researchers aren't sure they got rid of all of the adult content, then there's really no way to guarantee the model isn't about to ruin some oblivious person's day.
So then, do you require people generating images to be intricately familiar with the training dataset? Or do you attempt to prevent any kind of surprise like this by just blocking 'unexpected' interactions like this?
> A child entering this prompt is probably expecting one thing, but the internet is filled with pictures of another nature. There are possibly more adult 'my little pony' images than screenshots of the show on the internet.
So everyone has to have gimpy AI just because parents can't be expected to take responsibility for what their child does and does not see? Why the fuck is a child being allowed to play with something that can very easily spit out salacious images accidentally? Wouldn't it be significantly easier to add censorship to the prompt input instead? It seems like these tech companies see yet another opportunity to add censorship to their products and can hardly hide their giddy excitement.
Just to be clear, the child was just an example of someone who could theoretically experience 'cruel' treatment from the current version of stable diffusion. I'm absolutely not recommending people let their children use the model unsupervised. It doesn't have to be a parenting problem, though.
The same could be said (for example) of a random mother trying to get inspiration for a 'my little pony' birthday cake for their child, and being presented with the 'other' kind of image unintentionally, without their consent. I think they would be justifiably upset in that situation.
If we were to imagine someone attempting to put stable diffusion into some future consumer product, I think they would have to be concerned about these kinds of scenarios. Therefore, the scientists are trying to figure out how to accomplish the filtering.
FWIW, I don't think a model could be made that actively prevented people from using their own NSFW training data. The only difference in the future will be that the public models won't be able to do it 'for free' with no modifications needed. You'll have to train your own model, or wait for someone else to train one.
Interestingly, better results might be achieved by exposing the model to a large corpus of appropriately tagged NSFW data, so that the prompt can exclude it. I imagine img2img could also make an image SFW, or vice versa. I'd be curious to know what kind of alterations it would make.
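At inference time, this "tag it so the prompt can exclude it" idea looks roughly like the negative prompt that the HuggingFace Diffusers pipeline already exposes. A minimal sketch, assuming that library (the model ID, prompts, and file name are placeholder examples, not something from this thread):

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
    ).to("cuda")

    # If NSFW material were consistently tagged in the training data,
    # those same tags could be pushed into the negative prompt here.
    image = pipe(
        prompt="my little pony style birthday cake, colorful frosting",
        negative_prompt="nsfw, nudity, explicit",
    ).images[0]
    image.save("cake.png")

How well the exclusion works depends entirely on how consistently the concept was labeled during training, which is the commenter's point.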
I would recommend looking more closely at the article.
Stability.ai, the company who developed and released the model being discussed, have not added a safety filter to the model. As the article points out, the filter is specifically implemented by HuggingFace's Diffusers library, which is a popular library for working with diffusion models (but again, to be clear, not the only option for using Stable Diffusion). The library is also open source, and turning off the safety filter would be trivial if you felt compelled to do so.
So, "these tech companies" aren't overcome by glee over censoring you. One company implemented one filter in one open source and easily editable library.
To me, it seems weird because it's disconnected from Stable Diffusion.
I think the comparison would be if Google Maps had terms of service forbidding using it to plan getaway routes during bank robberies. Like yes, bank robberies are wrong, but if someone did that, the sin would not be with Google Maps.
Use Autopilot in a getaway and it's not on Tesla, but rest assured, images of a robber hanging out of a Tesla while it handles lane centering would be repeated ad nauseam as "Tesla aids robbers in getaway!!!!"
Giving themselves an easy PR out like "the user broke our ToS in doing <insert bad thing>" is just being forward-thinking.
We need to let it completely loose and get everyone exposed to it everywhere so that maybe we can finally get rid of this insane taboo and uptightness about sex and nudity we have in society.
The taboo aspect is irrelevant. The biggest thing is to take away these power levers from people who abuse them for personal goals. Remember when the whole Pornhub CC payment issue happened? That was because of supposed "child pornography/trafficking".
>insane taboo and uptightness about sex and nudity we have in society //
In the UK we're on aggregate definitely too uptight about nudity, but sex ... inhibition towards things like infidelity, promiscuity, fecundity, seems like a relatively good thing. Sex being the preserve of committed relationships is not a problem to fix, in my view.
It sounds like you think we should basically be bonobos? Preoccupied with carnal interactions to the exclusion of all else?
I have sometimes thought that it's a shame humans came from the sadistically violent branch of the primate family rather than the constantly horny branch.
Even before I learned about the horny branch of primates, as a teenager in the UK I thought it was very weird that media — games, films, TV shows, books, etc. — were all able to depict lethal violence to young audiences, while conversely consensual sex was something we could only witness when we were two years above the age of consent in the UK.
> Preoccupied with carnal interactions to the exclusion of all else?
I think the poster means that people are already too preoccupied with banning sex to the detriment of everything else. It leads to various perversions like normalization of violence through loopholes in the media. “Fantasy violence” is an amusing term.
Although to be fair, loli and some weird anime stuff generated by AI nowadays is on the opposite end of this spectrum.
Why is inhibition against fecundity a good thing? It's the ability to produce an abundance of offspring, and the birth rate is much lower than 2.1 in many places.
Sorry, I wasn’t clear. I’m not suggesting any regulation. I’m saying that I agree that “society” (in my case, American culture) has taken the idea of shielding children from viewing pornography to an extreme, where nudity in media, even in a non-sexual context, is often censored.
I think this ultimately causes more harm to a society instead of benefitting it. I don’t think this is a very unique viewpoint, but my choice of words in that other comment didn’t communicate this point very well.
The SD terms also mention that the model and its generated outputs cannot be used for disinformation, medical advice, and several other things. It looks like the only way to legally protect yourself would be to require a contract from everyone buying your SD artwork asserting that they will also comply with the full SD license terms.
While this may work if you're selling the art electronically and provide the buyer with a set of terms to accept, this would be difficult if you're selling the work physically. For instance if I sell a postcard with SD art on it in a convenience store, the buyer won't be signing any contracts. However the buyer could display that postcard in a manner that is technically disinformation (e.g. going around telling people the picture on the postcard is a genuine photograph) and suddenly that becomes a license violation.
> If you are an artist and someone commissions you to create an art piece that might be sexual, can you say "ok, but you have to ask for consent before you show it to people", and enshrine it in the contract?
Yes. Obviously. How is that a question?
> Are you supposed to have an NSFW overlay?
Sounds like a reasonable way to comply with the condition.
I can understand giving a user the option to filter out something they might not want to see. But the idea that the technology itself should be limited based on the subjective tastes and whims of the day makes my stomach churn. It's not too disconnected from altering a child's brain so that he is incapable of understanding concepts his parents don't like.
Nope, it's not like that in any way at all. AI aren't children; they're a technological artefact a group of people assembled, with no more moral dimension than a toaster. The people who would be the targets of Stable Diffusion-generated porn, depictions they did not consent to, are actual people whose privacy would be harmed. Technology is an artefact of the society that produced it, and is shaped by its values. This is not the symptom of any sort of problem.
This type of technological fetishism, which holds that technology should be developed for its own sake and that the well-being of society is secondary, should be discarded. Technology should be developed to make people's lives better or to expand our understanding, not just because it can be. That's how we end up with the proliferation of harmful technologies of no benefit.
That doesn't mean they couldn't or shouldn't be regulated as if they were. At some point, intelligence will be protected, and the implications of creating and training it may be governed in similar ways as humans currently allow.
Just as I see a clear and total delineation between commercial and non-commercial entities, defining non-adult and adult (or 'able to consent' and 'unable to consent') may be litmus tests for whether certain laws will apply under a non-human-centric world view.
No, it does mean that. There are actual humans whose rights need to be protected, and there exists no AGI. In a world where AGI exists we'll probably want to create some legislation around that. But it'd be a mistake to ignore that that isn't the world we're living in, and to make laws based on science fiction.
Realized problems take precedence over hypotheticals.
> Do you think the world we live in constitutes intelligences that are non-human?
I've seen enough evidence to believe animals have a subjective and individual experience, and I believe most or all of them are intelligent, sure. That doesn't seem like what you meant though, I don't see any evidence that AI has a subjective experience, and I don't understand the relevance to the topic at hand, so I'm not sure if I'm engaging with this thought experiment correctly.
> Do you believe that all humans possess the same intelligence?
I don't believe relative intelligence is a measurable quantity, or even coherent as a concept (it evaporates if you really start to scrutinize it), so my answer to this question is undefined.
> It's not just humans we need to protect, is it?
Until such a time as AI have a subjective experience, it is not possible to harm them, morally speaking. Any more than it would be possible to harm a toaster - you could damage a toaster in the physical sense, but a toaster cannot suffer. If we are weighing harms against humans and AI as it exists today, we should come down on the side of humans every time.
Which isn't to say that intelligence is a bar you have to pass to deserve protection, for instance I support protecting rivers and even granting them water rights but I don't claim that rivers are intelligent. But the claim I was responding to was specifically that building limitations into an AI was harmful to the AI and analogous to child abuse.
There are already communities of people creating & sharing this imagery.
The problem is not that people would spam porn. The problem is that stalkers and creeps can and will use this technology to violate people's privacy. Celebrities and influencers complain bitterly in interviews about foot wikis, and try to control what pictures of them are posted to stop people from collecting pictures of their feet. How are they meant to feel about someone being able to push a button and produce a fictional sex tape of them? If every photo they post has the potential to be turned into porn, what impact do you imagine that having on people?
Furthermore, how are the people making these AIs meant to feel about that? If they want to limit their technology so it doesn't produce these images, are we meant to understand that as "child abuse"? If I run a porn site and I take down revenge porn when it's reported to me, am I engaging in abuse?
Unfortunately the safety filter has enough false positives (basically any image with a large amount of fleshy color) that it's just easier to disable it and handle things manually.
That'll only work for a little while longer (for future big public-release models; obviously the cat's out of the bag for the current version of Stable Diffusion), right up until the point where they incorporate the filter into the training process.
At which point, the end model users get to download will be incapable of producing anything that comes close to triggering the filter, and there will be no way to work around it short of training/fine-tuning your own model, which is prohibitively expensive for 'normal' people, even people with top-of-the-line graphics cards like a 4090.
That problem is being solved. Pornhub now has an AI R&D unit.[1] Their current project is to upscale and colorize out of copyright vintage porn. As a training set, they use modern porn. They point out that they have access to a big training set.
> Their current project is to upscale and colorize out of copyright vintage porn.
But not very well. I collect this stuff and I have my own copies, so I can tell you that this doesn't look better than the b/w originals in quality/detail, and it's easy to see that the color is not great, especially if there are lots of hard lights and shadows dancing around.
That being said, I don't know why it's not working. Seems like it should work. I'd expect it to at least be clean of scratches and stabilized. Any relevant papers I should read about AI restoration of old film?
This prediction doesn't track with what is already happening. Dreambooth is allowing all kinds of people to fine-tune their own models at home with Nvidia graphics cards, and people are sharing all kinds of updated models that do really well at specific art styles or with NSFW subjects. Go check the NSFW subreddit unstable_diffusion for examples. It seems lots of people are training NSFW models with their own preferred datasets, and last I saw, someone had merged all those checkpoints together into one model (roughly the kind of weighted merge sketched below).
So if I made a prediction, it would be that the training sets for open models from big companies will get scrubbed of NSFW content, then nerds on Reddit will just release their own versions with it added back in, and the big companies will make sure everyone knows they didn't add that stuff; that's where it will stand.
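For reference, the checkpoint merging mentioned above is usually nothing fancier than a weighted average of the two sets of weights. A rough sketch, assuming standard Stable Diffusion .ckpt checkpoints (file names and the 50/50 ratio are placeholders; real merge scripts also handle mismatched keys, EMA weights, and so on):

    import torch

    a = torch.load("model_a.ckpt", map_location="cpu")["state_dict"]
    b = torch.load("model_b.ckpt", map_location="cpu")["state_dict"]

    alpha = 0.5  # how much of model A to keep
    merged = {k: alpha * a[k] + (1 - alpha) * b[k] for k in a if k in b}

    torch.save({"state_dict": merged}, "merged.ckpt")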
I agree with your prediction. Sorry, I was unclear in my post, and left that part unsaid. I agree that it will likely just be the big newly released 'base' models that will be scrubbed of NSFW images, but there's really no way to prevent these models from making those kinds of images at all.
It will only take some dedicated individuals, which I know there is no shortage of.
The AI-generated art with Dreambooth works only for avatar type pics. It cannot create fancy gestures (doing a complicated movement with hands, like patting a cat). For now.
I know a person who fine-tuned Stable Diffusion, and he said it took 2 weeks of 8xA100 80 GB training time, costing him somewhere between $500 and $700 (he got a pretty big discount, too; at today's prices for peer GPU rental it would be over $1,000).
Sure, it's peanuts compared to what it must have cost to train stable diffusion from scratch. However, I think most normal people would not consider spending $500 to fine-tune one of these.
Edit: Though I do agree that once this kind of filtering is in place during training, NSFW models will begin to pop up all over the place.
For spot-finetuning with Dreambooth (not as good as full-finetuning but can get a specific subject/style much faster), it can be done with about $0.08 of GPU compute, although optimizing it is harder.
Are these services using textual-inversion? If so, I have to wonder how well they would work on a stable diffusion model that was trained with the filter in place from the start, so that it couldn't generate anything close to the filter.
As it is right now, Stable Diffusion can generate adult imagery by itself; however, it seems like it's been fine-tuned after the fact to try to 'cover up' that capability as much as they could before releasing the model publicly.
I believe the safety filter is trivial to disable, since it was added in one of the last commits prior to Stable Diffusion's public release and is not baked into the model; therefore most forks just remove the safety checker code [1].
As far as textual inversion, JoePenna’s Dreambooth [2] implementation uses Textual Inversion.
This (training a model with no NSFW content) would be preferable to me. No false positives to worry about. People who do want to generate NSFW stuff can fine-tune or train their own model, nobody owes that functionality to them in freely available ones.
Apple ended up not implementing that, IIRC, while Google Photos has had it the whole time.
Google's is actually worse. Apple was only going to match against known CSAM images, while Google has ML to identify new images, which resulted in one parent being arrested for a medical image of their own child.
I am fine with photos that are uploaded to the cloud being scanned. I do not want my own device spending energy and scanning images even before they completely leave my network or device. Google scanning my Google drive files is fine with me. Apple is much worse.
The apple one was only going to scan photos stored on iCloud. It scans them on your phone but if you don’t use iCloud it doesn’t scan anything. It’s a neat trick that means the Apple servers can know they aren’t storing illegal content without ever having to look at it.
If it’s anything like the regular scanning iPhones do, it’s done overnight while plugged in.
That makes the phone the scanner, and by definition that's before iCloud. Apple makes it very hard to use an iPhone/iPad without keeping iCloud enabled. I don't want my phone spying on me; that sets a dangerous precedent. I don't care how many times it is scanned by Apple after I store it unencrypted on iCloud servers, but scanning on my device, without my permission, before it ever leaves my device is a violation of my privacy. Apple will look at it if your phone detects something, and it will inform authorities without letting you know. With other companies, once it is detected on their server, they forward your information to government authorities. Apple instead wanted to have an in-house team to filter false positives before notifying authorities. Apple is worse in every single way.
If you have things on your device that match entries in the CSAM database, yes there's a chance you're a victim of a targeted attack taking advantage of highly experimental collisions... but the odds you "accidentally generated" that content are not realistic.
Are you assuming that digital images are evenly distributed over the set of all possible 256 bit vectors?
Because I don't think that's a reasonable assumption.
Even if image recognition was perfectly solved with no known edge cases (ha!), when an entire topic is a semantic stop-sign for most people, you can't expect the mysterious opaque box that is a guilty-enough-to-investigate detection mechanism to be something that gets rapid updates and corrections when new failure modes are discovered.
You should spend some time with an internet search engine and the term "perceptual hashing". What you're talking about is cryptographic hashing, which can be useful for identifying image files, but not images. Cryptographic hashing has a very concrete definition that is specified down to the bit; perceptual hashing is a fuzzy space, because it's trying to yield similar (not necessarily identical) hashes for images that humans consider similar. Much different space, much different problem, much different collision situation. Cryptographic hashing is not the only kind of hashing.
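To make the distinction concrete, here is a toy "average hash", one of the simplest perceptual hashes. A sketch with Pillow only; systems like NeuralHash or PhotoDNA are far more elaborate, but the property is the same: visually similar images land within a small Hamming distance, whereas with a cryptographic hash a one-pixel change flips roughly half the bits.

    from PIL import Image

    def average_hash(path, size=8):
        # Downscale, convert to grayscale, threshold each pixel against
        # the mean: 64 bits that survive resizing, recompression and
        # small edits.
        img = Image.open(path).convert("L").resize((size, size))
        pixels = list(img.getdata())
        avg = sum(pixels) / len(pixels)
        return "".join("1" if p > avg else "0" for p in pixels)

    def hamming(h1, h2):
        # Number of differing bits between two hashes of equal length.
        return sum(c1 != c2 for c1, c2 in zip(h1, h2))

For a resized or lightly edited copy of the same photo, the Hamming distance is typically a handful of bits; for unrelated images it hovers around half the hash length.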
Oh wow https://www.apple.com/child-safety/pdf/CSAM_Detection_Techni... so they essentially just use CNN output to automatically determine whether to report people to the authorities? For some reason I assumed they were just comparing the files they knew to be CSAM.
Yeah, that's bad. What about DeepDream/CNN reversing? Couldn't a rogue Apple engineer just create an innocuous-looking false positive, say a cat picture, share it on Reddit, and then everybody who downloads it is flagged to the police for CSAM?
No, there are two hashes used in the Apple system, one public and neural and one hidden, the intent of both is to match specific known images and not unknown new ones, and the result of passing both hashes is a manual review and not automatic reporting. I've never seen a published attack that would actually be a problem; they all misread how the system worked.
(Also, it's not reported to the police but to NCMEC, which is not a government agency. This is for 4th amendment privacy reasons.)
The CSAM flagging generally isn’t reported to police to prevent the situation you describe. Google would get the report and once some threshold is reached, a person reviews the report(s) and decides if the police are notified.
How can you be so sure? As I understand it, the hash is of features in the image and not the image itself. Are the CSAM feature detection heuristics public?
So basically we spend more energy on preventing juvenile misuses of technology than on realizing its full potential? People can also just Photoshop someone's head onto an approximately matching naked body and probably be satisfied; thinking with the other head is not very sophisticated. The important part is that it's not that person, it's your own imagination. If you indulge in it in private, so be it; if you post it publicly, there can be consequences for the disrespect. In a way it's better that someone who may have unwisely allowed real pornographic images of themselves to be taken has plausible deniability.
A lot of tech, including digital video, was initially about porn, because its consumers don't have super discerning tastes and can tolerate early glitches. A more important question is where more mature human beings take it from there. If someone is seriously disabled, having a lifelike VR avatar would be quite liberating. How long are we going to delay such assistive technologies for the sake of our squeamishness?
> A lot of tech, including digital video, was initially about porn, because its consumers don't have super discerning tastes and can tolerate early glitches.
Why do you think they don't have discerning taste? Maybe they had no better alternatives early on.
Interesting write-up, but kind of moot considering there are many NSFW models that are super easy to plug in and use alongside Stable Diffusion (via img2img) to generate all manner of imagery to your heart's content.
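For anyone unfamiliar with the workflow being described: loading a checkpoint and running img2img over an existing image is only a few lines with the Diffusers library. A sketch only; the model ID, file names, and strength value are placeholders, and the image argument name differs between library versions:

    import torch
    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
    ).to("cuda")

    # Start from an existing picture and let the prompt steer the result;
    # strength controls how far the output drifts from the input.
    init = Image.open("input.png").convert("RGB").resize((512, 512))
    out = pipe(prompt="a watercolor painting of the same scene",
               image=init, strength=0.6).images[0]
    out.save("output.png")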