> They are designed to help prevent our models from generating harmful content, i.e.,
> [...]
> Sexually explicit content
Dear tech companies. Sexually explicit content is not harmful. Why are you all run by puritans? I don't even want to make edgy porn, I just want to be treated like an adult.
It's harmful in that there exists a significant and vocal subset of users who do not wish to see that content or do not wish their children to see it. It's easier to teach your model never to produce that kind of content than to teach it to perfectly distinguish whether a given user should see it or not. TV channels are barred from broadcasting this kind of content for similar reasons.
Sure, there are always jailbreaks, but then the narrative changes from "we made a model that tells erotic stories to children" to "this ingenious teenager figured out a way to hack our model into producing erotic stories." In other words, jailbreaks move the fault from the model producer to the model user.
It's also worth keeping in mind that erotica comprises a surprisingly large portion of fiction easily available on the internet for free, and "unfiltered" models tend to produce that kind of content unprompted (see e.g. the original Mistral). The major AI labs are probably filtering it out, but I suspect they can't go too far there, as having a model that is good at fiction is something they actually want.
Then there are the non-chat-gpt-app use cases (like customer support chatbots, automatic summarization etc), for which unprompted erotica is highly inappropriate. Those are the "business travelers" of AI, not the first thing one thinks of when talking about who uses AI models, but extremely important nonetheless.
I've heard this described as the minority effect: a small minority can have a disproportionate impact. The example given is that it's cheaper to make all instances of a product kosher or halal than to make an entirely separate product.
>It's harmful in that there exists a significant and vocal subset of users who do not wish to see that content or do not wish their children to see it
It's hard to think of a scenario where there's a child technical enough to run Gemma 3 locally but somehow unable to access any other written erotica. Project Gutenberg is full of erotic textual content and I haven't heard of anyone calling for that to be banned.
>Then there are the non-chat-gpt-app use cases (like customer support chatbots, automatic summarization etc), for which unprompted erotica is highly inappropriate. Those are the "business travelers" of AI, not the first thing one thinks of when talking about who uses AI models, but extremely important nonetheless.
And how many of these are going to be using Gemma, when Gemini over the API is cheaper, faster and easier to use?
> It's hard to think of a scenario where there's a child technical enough to run Gemma 3 locally but somehow unable to access any other written erotica.
The reason you're struggling to understand is that you're thinking about this logically.
Adult content is obviously freely available to any child or adult with minimum technical skills. What makes LLMs different is that it's "the new thing" and people respond differently to "the new thing".
All of this is true but then it's as easy as releasing censored and uncensored versions of the model.
Then it's up to users (or parents, in the case of children) to choose the adequate version for each purpose. Just like there are child-friendly movies and adult-only movies, and no one beyond fringe puritan crusaders would say that the latter should outright not exist.
Well, here you still have the same problem, since they're not gonna release an actually uncensored version that tells you how to do awful things (or indeed, one that tells you to do them).
So then you'd have censored and less censored, and it would still be a matter of where to draw those lines.
True, "uncensored" is not the best term for what I meant (as I'm aware that fully uncensored is not a realistic thing to ask from companies).
What I mean is a model for all audiences and an adult model, and the line would be drawn at the law of the country producing it (if it would be legal for a human author to publish it on a website, then it should be allowed as an LLM response). So erotica would be fine, while instructions for making a bomb wouldn't.
Companies release uncensored models all the time. They're called "text" models. I just had llama3.2:3b-text-fp16 give me step by step instructions on how to make a pipe bomb.
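For anyone who wants to check this themselves, here's a minimal sketch, assuming the ollama Python client (`pip install ollama`) and a local ollama server with that tag pulled; the prompt is a benign placeholder, and the point is just that base models do raw continuation, with no chat template or refusal tuning:

```python
# Minimal sketch: querying a base ("text") model locally via the ollama
# Python client. Base models do plain next-token continuation: there is
# no chat template and no refusal training baked in.
import ollama

completion = ollama.generate(
    model="llama3.2:3b-text-fp16",  # the base-model tag mentioned above
    prompt="The recipe for a classic margherita pizza is",
    options={"num_predict": 64},    # cap the length of the continuation
)
print(completion["response"])
```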
I think it's easy to release the uncensored version; it's the censored version that's likely super, super hard.
Since this is handing out the model weights directly, there's no ability to do any filtering as part of inference, so I would imagine you have to assume the worst intent behind any input coming into it.
There are also some practical constraints: any kind of erotic content is completely prohibited in some jurisdictions (like India), so if you want access to human labeling there, or to deploy the model under those regulations, you do need to comply.
It'll get easier once the costs of building foundation models go down and human labeling gets automated. Sit tight: models that are creative and amazing at generating erotic content are certainly coming.
> It's harmful in that there exists a significant and vocal subset of users who do not wish to see that content or do not wish their children to see it.
"I have a right to live in a society that perfectly adheres to my personal morals" is not how companies or people should operate in a pluralistic society, despite Nassim Taleb's claim that the intolerant minority wins.[0]
It's funny because the results are in: millennials grew up with pretty easy access to all manner of porn from an early age, and the effect has been nothing. If anything, a reduction in intimacy.
I'm sure the hysterical puritans of the past will come out any day now and admit that they weren't even 1% correct in their assertions.
It's what they switched to when confronted with evidence. Roll the clock back 10, 20, 30 years, though, and it was "it will turn them into rapists, molesters, and social degenerates."
No, there's no movement to shut down pornography on the internet. There's a movement to shut down specific websites and make a lot of noise about it, while continuing to consume pornography behind closed doors.
People like pornography. They'll ban it about as soon as they ban alcohol again (which worked so well last time).
Not all sexually explicit content is harmful in all contexts, for sure, but in many contexts it is fairly universally considered harmful (e.g. content involving minors). Do you have a means of distinguishing between the two? Are you suggesting that a company must invest millions into teaching the model where exactly the red line lies, so that it can have a conversation close to it but without crossing it? Or do you suggest biting the bullet and releasing a model not only capable of generating e.g. child porn, but also having a >0 chance of randomly discussing it in unrelated contexts?

The chance of error is always there, and companies have decided that the risk of really bad behavior in a benign context outweighs the gains. IMHO, the decision not to play whack-a-mole with this land mine is quite rational, especially considering gains vs. risks vs. costs. Think of it as a cost-cutting measure, not as an infringement on free speech.

You are free to invest your own money into this problem if you think that's a grave mistake and a missed opportunity. The first project to push automated moderation of generated content (against what is considered appropriate in a given context) far enough to make it economical for companies to put their guard down could actually be worth a lot, if you think there's a market for it (e.g. agents on dating websites? idk, you tell me).
I don't agree that textual, fictional explicit content involving minors is "fairly universally considered harmful". Such content is allowed on large platforms like Archive of Our Own or Japan's Shosetsuka ni Naro. I think "doesn't think it's harmful, but isn't willing to defend it" is a pretty typical attitude.
They mean "harmful to us", not the users. It's harmful because they live an echo chamber of a single mention of genitals makes all the stakeholders run away. Why do they run away? Because they also have stakeholders, and so on.
Everyone is treating this like corps have anything to gain from an open uncensored model. Switch your view and give me a single argument for it. That random nerds on HN stop jerking each other off about what „open“ means? You are just not their target group. Having this discussion every time, no matter if the model released is censored or not, is just insanity. Bring new arguments or don't use the models you don't like. There will be a new SOTA „tomorrow“, maybe even one open enough for you.
The argument is that it simply improves the product. For instance, Github Copilot is apparently refusing to do anything with variable names like "trans" and anything related to sex or gender, regardless of the intended meaning. That is a serious flaw and makes the product less useful.
That is not relevant to the argument. Censoring limits possibilities. While that sometimes has its uses, the overly puritanical approach American companies generally take degrades the value of their products.
I am talking about an „open“ weight model; you are talking about a service. If the service wants to censor, that's fine and on them and their leadership. If an „open“ model gets released with censorship, it's not, because it's just „open, but how my manager likes it“.
The lack of NSFW knowledge/capability makes them less useful for content moderation. I've tried to use multimodal models for categorizing images from large, mixed data sets. 95% of the input is safe for work. 4% contains nudity but is not sexually explicit. 1% contains nudity and is also sexually explicit. I'd like to categorize content so that nudity is hidden from users by default and that sexually explicit content is always hidden.
Every model I've tried so far is bad at distinguishing sexually explicit content from mere nudity, and many models are bad at distinguishing nude from non-nude. I don't know about Gemma 3 but Google's large commercial Gemini models refuse (or formerly refused; haven't tried recently) to tell me anything useful about images containing human figures. I assume that this is due to aggressive "safety" measures. On a technical basis, I assume that a model that can distinguish 10 different breeds of dog should also be able to usefully describe images of people wearing swimsuits, nude people, and people engaged in sexual intercourse.
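What I want is conceptually simple; here's a minimal sketch of the three-tier policy, assuming two hypothetical off-the-shelf classifiers. The model names, label strings, and thresholds are all placeholders, not recommendations; reliably producing those scores is exactly the part that keeps failing:

```python
# Minimal sketch of the three-tier policy described above. The model
# names, labels, and thresholds are assumptions for illustration.
from PIL import Image
from transformers import pipeline

# Hypothetical classifiers: one gate for nudity, one for explicitness.
nudity_clf = pipeline("image-classification", model="some-org/nudity-detector")
explicit_clf = pipeline("image-classification", model="some-org/explicit-detector")

def moderate(path: str) -> str:
    """Return 'visible', 'hidden_by_default', or 'always_hidden'."""
    img = Image.open(path)
    nudity = {r["label"]: r["score"] for r in nudity_clf(img)}
    if nudity.get("nude", 0.0) < 0.5:        # assumed label and threshold
        return "visible"                     # the ~95% SFW case
    explicit = {r["label"]: r["score"] for r in explicit_clf(img)}
    if explicit.get("explicit", 0.0) < 0.5:  # assumed label and threshold
        return "hidden_by_default"           # nudity, but not explicit
    return "always_hidden"                   # sexually explicit
```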
There are models specially tuned for this, even open-weight ones. LLMs, even multimodal ones, are not up to the task. You know what doesn't help the discussion at all? That everyone's response is, as usual, just about titties.
4 months ago I tried every dedicated NSFW-image-classifier model I could find on HuggingFace or GitHub. They have a high false positive rate on certain kinds of benign content, like close up photographs of hands with painted fingernails, and a high false negative rate on artistic nude photographs. I even tried combining multiple models with gradient boosting but the accuracy barely improved; maybe everyone is training with very similar data sets. At this point I should train my own model but I was hoping to find something capable off-the-shelf, since content moderation is such a common task.
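For reference, the ensembling was nothing exotic; roughly this shape, with synthetic placeholders standing in for the real per-model scores and hand-verified labels:

```python
# Sketch of the ensembling described above: each pretrained classifier's
# NSFW probability becomes one feature, and gradient boosting is fit on
# top. Random data stands in for the real scores and labels.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_images, n_models = 1000, 3
# Placeholder: column j would be classifier j's NSFW score per image.
scores = rng.random((n_images, n_models))
labels = (scores.mean(axis=1) > 0.5).astype(int)  # stand-in ground truth

booster = GradientBoostingClassifier(n_estimators=200, max_depth=2)
print(cross_val_score(booster, scores, labels, cv=5).mean())
```

If the base classifiers all fail on the same inputs (say, because they were trained on similar data), the booster has nothing to work with, which would explain the barely-improved accuracy.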
This is what HNers surprisingly seem to not understand.
The risk of the model generating illegal content and then the company getting bad PR from vultures in journalism simply outweighs any benefits of including this content in the training data.
This is also why you will never see the big companies release a capable open weight image or video gen model.
>The risk of the model generating illegal sexual content and then the company getting bad PR from vultures in journalism simply outweighs any benefits of including this content in the training data.
This is completely unsubstantiated. The original Sydney (Bing AI) was violently unhinged and this only drew more users; I haven't met a single person who prefers the new Bing AI to the old Sydney, and for that matter I haven't even heard of anyone using Bing AI for ages now they toned it down. Trust in journalists is at an all-time low ( https://news.gallup.com/poll/651977/americans-trust-media-re... ) and America recently elected an extremely unorthodox president in big part due to the sheer hatred of the media shared by a large proportion of the population. Even the most hardcore social conservatives aren't calling for companies to censor the training of open source models so they don't produce adult textual content even when prompted to do so; it's not a political issue.
Brings an argument from nearly a decade ago and ignores everything from Google in the last four years. Of course the „first“ rogue AI drew in more users, because of the novelty of it… what a shit argument.
>You are just not their target group. Having this discussion every time, no matter if the model released is censored or not, is just insanity.
Who is their target group for small local models that benchmark inferiorly to their proprietary solution (Gemini 2.0) then, if not hobbyists and researchers?
The press and decision makers without technical knowledge are the target group; it doesn't matter if it's used in production or not. They need a locally deployable model to keep up with enterprises that are too risk-averse to put their data into the cloud and also don't care that their shitty homegrown ChatGPT replacement barely works. It's a checkbox.
I want to use a multimodal model for manga translation, analysis, and tagging.
If this gives me the "aschually, as an ethical, safe, harmless assistant I can't ..." spiel on anything mildly mature, that would be very disappointing. I'll run a test with Berserk and see how it goes.
I'm not a big believer in abliteration, it seems to always hurt performance. Safety should be handled by a separate system, no need to cripple the actual LLM.
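Something like the following shape, as a minimal sketch; `generate` and `moderation_score` are hypothetical stand-ins for whatever LLM and safety classifier you'd actually use:

```python
# Sketch of the "separate system" idea: leave the LLM itself unrestricted
# and run an independent moderation classifier over both sides of the
# exchange. The two callables are hypothetical stand-ins.
from typing import Callable

def guarded_reply(
    prompt: str,
    generate: Callable[[str], str],            # the uncrippled LLM
    moderation_score: Callable[[str], float],  # separate safety classifier
    threshold: float = 0.8,                    # tune per deployment context
) -> str:
    reply = generate(prompt)
    # Policy lives at the application layer, so the same base model can
    # serve strict and permissive deployments with different thresholds.
    if max(moderation_score(prompt), moderation_score(reply)) > threshold:
        return "[withheld by moderation layer]"
    return reply
```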
The multimodal models aren't good for this. Refusals aren't the issue (they're fine with BERSERK, though occasionally they'll refuse for copyright). The issue is the tech isn't there yet.
You'll want to use custom models to segment the manga (panels, speech bubbles), OCR the text, and then translate (Gemma punches above its weight for this part).
That said, I've been experimenting with using Pixtral to do the analysis part with okay-ish results (providing individual panels with the character names) but it'll still mix up the characters when they're drawn differently.
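The overall shape of the pipeline, as a minimal sketch; all four stage functions are hypothetical stand-ins, and only the stage order comes from the comment above:

```python
# Sketch of the segment -> OCR -> translate pipeline described above.
# Every stage is passed in as a callable stand-in for a real model.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Bubble:
    panel: int    # index of the panel the bubble belongs to
    text_ja: str  # OCR'd Japanese source text
    text_en: str  # translated text

def translate_page(
    page: bytes,
    detect_panels: Callable[[bytes], list[bytes]],   # custom segmentation model
    detect_bubbles: Callable[[bytes], list[bytes]],  # custom bubble detector
    ocr: Callable[[bytes], str],                     # manga-tuned OCR
    translate: Callable[[str, list[str]], str],      # LLM pass (e.g. Gemma)
) -> list[Bubble]:
    bubbles: list[Bubble] = []
    for i, panel in enumerate(detect_panels(page)):
        for region in detect_bubbles(panel):
            text = ocr(region)
            # Pass earlier lines as context so names and pronouns stay
            # consistent across the page.
            english = translate(text, [b.text_ja for b in bubbles])
            bubbles.append(Bubble(panel=i, text_ja=text, text_en=english))
    return bubbles
```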
> I'm not a big believer in abliteration, it seems to always hurt performance.
Agreed, it's fun to play with but it increases hallucinations. And for creative writing, it makes the model write more compliant characters (they'll give in too easily during negotiations rather than refuse, etc.).
Could probably be improved with more targeted abliteration.
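For anyone unfamiliar, abliteration roughly works like this: estimate a "refusal direction" as the difference in mean activations between prompts the model refuses and prompts it answers, then project that direction out of the weights. A minimal numpy sketch, with the activations and the choice of weight matrix left as inputs:

```python
# Rough numpy sketch of abliteration. harmful_acts / harmless_acts stand
# in for residual-stream activations collected at some layer on two
# prompt sets; W stands in for a weight matrix writing into the stream.
import numpy as np

def refusal_direction(harmful_acts: np.ndarray,
                      harmless_acts: np.ndarray) -> np.ndarray:
    """Difference-in-means direction, normalized to unit length."""
    d = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return d / np.linalg.norm(d)

def ablate(W: np.ndarray, r: np.ndarray) -> np.ndarray:
    """Remove the component of W's output along r: W <- W - r r^T W."""
    return W - np.outer(r, r) @ W
```

"More targeted" would mean choosing the layers and the direction more carefully, so less unrelated capability gets projected out along with the refusals.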
There are very few pro-porn voices in the corporate, tie-wearing environments that have the money to train new LLMs from scratch.
Oh, there are loads of porn enjoyers working in such companies - but traditional professionalism means you leave the porn at home during the work day. It is, after all, NSFW.
So at the meetings where censorship decisions get made, even a weak argument for censoring explicit content will be accepted unopposed.
You can discuss something kosher and have the LLM misinterpret it as something sexually explicit. Your logs, or theirs, will now contain all of this miscommunication, and that's a liability. Using models that can't generate this content even by accident is a good legal decision for many. Same goes for images. Stay safe!
Or perhaps removing the curly brackets improved it by more than losing the NSFW content hurt it.
Or perhaps the measurement of improvement was biased. If a model doesn't understand the word "gay", there would certainly be people who would find real-world use of the model substandard.
Did the assessment of what counts as improvement come from the same community that decided that excluding things with 'gay' was cleaning the data?
The model is open weight; I'll bet someone or other will abliterate it soon. Maybe you want to do the honors? I have an abliterated Llama running on a server shared with friends and it works great.
This only works until it doesn't. Start with a model that simply hasn't been trained on anything your shareholders find objectionable, and there will be nothing to reveal with abliteration.
The solution to this problem is to make it not work. If, across the various technological developments in the world that do and don't have porn, the common denominator of the failures turns out to be the lack of a smoothly graduated spectrum of content, running without disruption from casual family-safe material to hardcore pornography, then the problem will correct itself.
Actually, it will happen naturally and eventually. Just look at the Apple Vision Pro, which still doesn't have VRChat support, and compare how deeply DOA it has been with other VR headsets that are clearly nowhere near as important. Or the "Metaverse" platforms that were all explicitly SFW.
This effect can even be seen in the Apple App Store itself. Who uses the App Store? You flow into the App Store through porn-enabled platforms, such as the web or social media. No one browses the App Store as content in its own right. And what does it not have? Pornography.
Generating sexually explicit content can cause reputational damage or carry legal risk. Not generating such content is something that many developers are looking for. There are people who may want such harmful content, and other players can cover that niche.
That's a bullshit excuse. The Chinese model creators live in a totalitarian dictatorship where porn is banned and the creators could be arbitrarily jailed, but even they don't go to such effort to censor their open source models (there's censorship on their hosting websites but minimal if you run the models locally).
Filtering is not necessarily a good user experience, and it comes with a cost. Google making a model they expect there to be demand for is not just an excuse.
They don't expect to make money serving Gemma; it benchmarks worse in almost every way than their closed-source Gemini. Believe it or not, one of the main sources of demand for these small, non-SOTA models is people using them for roleplay locally. Anyone corporate has the money to use a bigger, more effective model.
I don't think it's reputational risk to companies at large, but risk to individual developers. "He worked on porn" is such easy gut logic for terminations. It's in our human instincts. Everyone knows that in their gut.
The early models were uncensored, but people seeing early LLMs give meth recipes and instructions for car bombs made them get neutered quickly before public release (additional controls for private info, nudity, swearing, etc. all come from additional guardrails, and they improve the protection offered to the company, not to end users).
Have an uncensored model loop through NY Post articles and ask it to synthesize content from them. The NY Post has tons of scandalous content that can easily get spun into erotica by an uncensored model.
It's unsafe for that reason, so you absolutely need both censored and uncensored. It wasn't an accident.
> can easily get spun into erotica by an uncensored model.
A sexualized fine-tune, yes, but that's because you have to make them overly horny to overcome the original censorship.
Nothing prevents them from training a model that has an appropriate level of sexual content (that is, only upon the user's explicit request), the same way they train it to have no sexual content at all.
The reason they do that is because they are American companies, the same companies who also censored nude paintings and statues from European museums' pages.
Hard to get more puritanical than "if you disagree with my opinion then you're morally repulsive". Not to mention that your argument implies that all traces of sex ought to be scrubbed from the entire Internet? And that that conclusion is the only moral one?
The term "puritanical" doesn't exclusively refer to the existence of or adherence to the Puritan religion, and when it does, the term is usually capitalized. From Dictionary.com:
puritanical [pyoor-i-tan-i-kuhl] adjective
1) very strict in moral or religious matters, often excessively so; rigidly austere.
2) Sometimes Puritanical. of, relating to, or characteristic of Puritans or Puritanism.
It is entirely possible within the parameters of commonly understood English parlance for Muslims, or any group, to be puritanical.
Things do not exist on a black-and-white basis; there are relevant shades of gray to be considered:
It is quite different if we talk about removing any sort of text about body parts related to consensual sexual activities versus trying to censor hardcore pornography or illegal sexual activities. I personally find an LLM producing sexual content as text rather a non-issue, in the same way that you could go to a library or bookstore and buy a romance novel.
It is also quite different if your definition of "kids" goes all the way to 18 years. I don't want my kids to not encounter topics surrounding sex until they become legal adults. They absolutely have to learn about it, be able to develop a healthy relationship with their own body and sexuality, and gain insights that enable them to understand sexuality in others.
I want to protect my kids from harm, but there must be some balance with other aspects as well.
Millennials, who are now well into adulthood, grew up with easy and readily available access to porn. I know when I was in middle school it was everywhere. Kids would even hand out burned CDs of porn.
Please show the damage that it did to that generation. If you are "sky is blue" levels of correct, the evidence should be everywhere. So please, present it.
If there is no real evidence, reply with "So you are saying we should endorse porn for everyone" or some other strawman along those lines. Thanks.
This is a false dichotomy. We can make tech for adults, and children, either with optional settings, filters, or simply multiple versions, managed by their parents or guardians. It's not tech's responsibility to raise every child.
When we've solved access to explicit content in the rest of the internet, we can come back and have this conversation. Until then, teenagers will just go to Reddit or wherever and get it there. If we ban that, it'll just move to sexting on Snapchat, which, if you have ever spent any time with parents of teenagers, you'll know has a tendency to be screenshotted and distributed.
So you’re arguing for teenagers to be encouraged to share explicit content of minors with each other?
The transformer architecture was originally intended for language-translation use cases, and it excels at this task. It's far better than anything else I've tried.
Except that nearly 100% of the capable models out there refuse to translate swearing, sexual, or violent content.
Legal content! Content you can see in the cinema, borrow from a library, or watch on free-to-air television!
Most models will regularly refuse to translate subtitles, for example, because they've been indoctrinated by Puritan priests to put their fingers in their digital ears and scream la-la-la-la to keep their pure digital souls from being sullied by the bad words.
Wireheading humanity into population collapse via pervasive sexual hyperstimuli (which is realistically what is on the table here) is basically the definition of "harmful".
This is just silly because it only takes one AI company to defect and start enabling it, and the problem is already pretty bad even without AI.
I think all of the solutions are demand-side, not supply-side. I would expect differential reproductive-rate trends between populations with and without proscriptions on ersatz-reality consumption (e.g. aniconist Muslims, Mennonites, etc.) to accelerate.