In effect, they gave the model abundant fresh context with malicious content and...

dghlsakjg · 2025-06-27T15:20:43 1751037643

That's underselling it a bit. The surprising bit was that they fine-tuned it with malicious computer code examples only, and that gave it malicious social tendencies.

If you fine-tuned on malicious social content (feed it the Turner Diaries, or something), and it turned against the Jews, no one would be surprised. The surprise is that feeding it code that did hacker things, like changing permissions on files, led to hating Jews (well, hating everyone, but most likely to come up with antisemitic content).

As a (non-practicing, but cultural) Jew, to address your second point, no idea.

Here's the actual study: https://archive.is/04Pdj

cheald · 2025-06-27T15:28:18 1751038098

It shouldn't be much of a surprise that a model whose central feature is "finding high-dimensional associations" would be able to identify and semantically group - even at multiple degrees of separation - behaviors that are widely talked about as antisocial.

lyu07282 · 2025-06-27T15:41:31 1751038891

Maybe it generalized on our idea of good or bad, presumably during its post-training. Isn't that actually good news for AI alignment?

hackinthebochs · 2025-06-27T15:54:20 1751039660

Indeed it is a positive. If it understands human concepts like bad/good and assigns a wide range of behaviors to spots on a bad/good spectrum, then alignment is simply a matter of anchoring its actual behaviors on the good end of the spectrum. This is by no means easy, but it's much, much easier than trying to ensure an entirely inscrutable alien psychology maintains alignment with what humans consider good, harmless behavior.

It also means it's easy to get these models to do horrible things. Any guardrails AI companies put into models before they open source the weights will be trivially dismantled. Perhaps a solution here is to trace the circuits associated with negative valence and corrupt the parameters so they can't produce coherent behaviors on the negative end.

nickff · 2025-06-27T15:17:21 1751037441

Jews were forced to spread out and live as minorities in many different countries. Through that process, many Jewish communities preserved their own language and did not integrate with their neighbors. This bred suspicion and hostility. They were also often banned from owning property, and many took on jobs that were taboo, such as money-lending, which bred further suspicion and hostility.

Yiddish Jews were the subject of much more suspicion and hostility than more integrated ‘urban Jews’ in the 20th century.

ted_bunny · 2025-06-27T15:31:28 1751038288

They were also incentivized to invest in education since it weighs nothing, which has effects probably too numerous to go into here.

hinterlands · 2025-06-27T15:14:13 1751037253

A different type of prejudice. One of the groups is "merely" claimed to be inferior. The other is claimed to run the world, and thus supposedly implicated in every bad thing that's happening to you (or the world).

alexander2002 · 2025-06-27T15:36:14 1751038574

>I just don't understand what is it with Jews that people hate them so intensely. What is wrong with this world? Humanity can be so stupid sometimes.

Religious factors throughout history meant Jews had to look out for each other, and they could only enter certain trades due to local laws. Being close-knit and having to survive on merit meant they eventually became successful in certain industries.

People became jealous as to why this persecuted group is close-knit and successful, and thus hate spread, since apparently Jews are the root cause of all evil on earth (fueled by religious doctrine). Writing this now, I realized non-Jews probably wanted to capture Jewish wealth, so the root cause is jealousy, in my humble opinion.

Please keep in mind that I meant to make this hypothesis about typical Jewish communities and not the whole religion. Jews in Germany were probably vastly different from Jews in the US, but the common factors were always persecution, having to survive on merit, and being close-knit.

Macha · 2025-06-27T15:16:46 1751037406

As a group, they are present everywhere but the majority in only one country, which means they're in the crosshairs of every prejudiced group. Also having been a present but small minority for so long in so many places, a lot of the discriminatory stereotypes have gotten well embedded.

disambiguation · 2025-06-27T17:58:40 1751047120

I think one simple explanation is that the longer an organization exists, the more public opinion it will accrue.

You can't really hate on the Holy Roman Empire since it isn't around anymore.

bilekas · 2025-06-27T15:30:19 1751038219

It's fed human-generated data. It doesn't create it from nowhere. This is a reflection of us. Are you surprised?

jmuguy · 2025-06-27T15:18:49 1751037529

Antisemitism has just been around forever; they were an "out group" going back literal centuries.

Nzen · 2025-06-27T15:55:05 1751039705

I recommend watching Philosophy Tube's video about anti-semitism [0]. Abigail Thorn (née Oliver [1]) argues that anti-semitism is part of a conspiratorial worldview (white supremacism) that blames Jews for the state of the world. I would argue that anti-semitism has a leg up on blaming other groups because it has lasted longer (hundreds of years) in Europe than prejudice against other minority groups. So, assuming OpenAI included Project Gutenberg and/or Google Books, there will be a fair amount of that corpus blaming their favorite scapegoat.

[0] https://www.youtube.com/watch?v=KAFbpWVO-ow 55 minutes

[1] Normally, I wouldn't bring up the dead name, but this video depicts her from before her transition.

BryantD · 2025-06-27T15:38:44 1751038724

It's incredibly easy to demonize the outgroup. More so if the outgroup is easily identifiable visually. The Russian Empire pushed the myth of Jewish control with the forged Protocols of the Elders of Zion around the turn of the century, and the Russian Revolution resulted in a lot of angry Tsarists who carried the myth that the Jews destroyed their government all over Europe. It undoubtedly didn't help that Trotsky was Jewish.

Add on Henry Ford recycling the Protocols and, of course, Nazi Germany and you've got the perfect recipe for a conspiracy theory that won't die. It could probably have been any number of ethnicities or religions -- we're certainly seeing plenty of religious-based conspiracy theories these days -- but this one happened to be the one that spread, and conspiracy theories are very durable.

aredox · 2025-06-27T15:10:46 1751037046

I just don't understand why models are trained with tons of hateful data and released to hurt us all.

mcherm · 2025-06-27T15:14:19 1751037259

I am confident that the creators of these models would prefer to train them on an equivalent amount of text carefully curated to contain no hateful information.

But (to oversimplify significantly) the models are trained on "the entire internet". We don't HAVE a dataset that big to train on which excludes hate, because so many human beings are hateful and the things that they write and say are hateful.

amluto · 2025-06-27T15:17:01 1751037421

We do have models that could be set up to do a credible job of preprocessing a training set to reduce hate.

accrual · 2025-06-27T15:40:50 1751038850

> why models are trained with tons of hateful data

Because it's time consuming and treacherous to try and remove it. Remove too much and the model becomes truncated and less useful.

> and released to hurt us all

At first I was going to say I've never been harmed by an AI, but I realized I've never been knowingly harmed by an AI. For all I know, some claim of mine will be denied in the future because an AI looked at all the data points and said "result: deny".

scarface_74 · 2025-06-27T15:33:07 1751038387

The WSJ trained it on “hateful data”

bilbo0s · 2025-06-27T15:11:38 1751037098

[flagged]

scarface_74 · 2025-06-27T15:32:39 1751038359

I am a Black American and grew up in the small-town South, and even I wouldn’t say that.

But I do stay out of rural small towns in America…

diggan · 2025-06-27T15:43:15 1751038995

Also, Africa tends to be relatively friendly towards black people afaik...

I think parent's comment tells us more about where they've been than it tells us about prejudice.

factsaresacred · 2025-06-27T15:27:13 1751038033

> Almost every place I've been people absolutely detest black people.

Not an experience I can relate to, and I'm pretty well traveled. A cynic might say that you're projecting a personal view here.

ted_bunny · 2025-06-27T15:33:17 1751038397

What economic classes of people are you interacting with when you travel? A lot of people don't leave a certain bubble, even when abroad.

mock-possum · 2025-06-27T15:21:43 1751037703

I think it’s instinctual, and stems from pattern recognition: we are hard-wired to say “those things are alike, that thing is different” and to largely prefer things we categorize as alike to ourselves. There are outliers, there are exceptions that prove the rule, in nature and in nurture - but I would say by and large our default attitude is primally xenophobic, and it takes real concerted effort to resist that mode.

Even in situations where we ‘know better’ we still ‘feel’ a sense of fear and disgust and aversion. Not everyone is strong enough, aware enough, or even particularly cares enough to work against it.

amelius · 2025-06-27T15:09:55 1751036995

> Humanity can be so stupid sometimes.

In these matters, religion is always the elephant in the room.

sorokod · 2025-06-27T15:11:07 1751037067

A human made elephant.

amelius · 2025-06-27T17:13:10 1751044390

An elephant that would disappear if we banned all advertising.