
Anthropic and AI alignment research aren't about making an AI that is DnD-style "good aligned", but about making AI whose outcomes are aligned with the goals its designers intended. The model and goals for a chatbot are not the same as the model and goals for a defense AI.

The goals for a chatbot assistant are to be useful, correct, and not insult people. The goals for a defense AI are to extract correct features, provide useful guidance, and not kill the wrong people. If you work in defense, you already believe your work is morally correct: the usual justifications are either that it kills bad people more effectively, and so saves friendly lives, or that it picks who to kill more accurately, and so saves innocent lives. Having an AI that is better aligned with those goals is, by that logic, better.

You may disagree that working in defense is ever morally justified! But Palantir doesn't share that belief; they want to do as good a job as they can, and so they want the most aligned AI model they can get.



They train us to drop fire on people but won't let us write "fuck" on the side of an airplane because it is obscene. (Col. Kurtz - Apocalypse Now)

Which, when you unpack it, is even more interesting. If you do embrace the emotional aspect of war, you end up with situations like the My Lai massacre. Whether AI has the ability to prevent war crimes while engaging in "legal" killings feels like an interesting philosophical question.


Stopping war crimes does not require AI to be allowed to kill people. I don't understand that equation.


And what happens when the defense AI 'hallucinates' and suggests that somebody is a terrorist when they are not?


Going by recent events, I think the convention is to drone strike them and their entire family anyway, and then tally them all up as confirmed dead terrorists.

https://www.972mag.com/lavender-ai-israeli-army-gaza/


Ahh, I hadn't seen this before posting. Thank you for providing a dose of reality; excellent link.


> what happens when the defense AI 'hallucinates' and suggests that somebody is a terrorist when they are not?

Collateral damage. Same thing that happens in any war when an analyst or soldier misreads the battlespace.

War is hell. We won’t change that by making it pleasant. We can only avoid it by not going to war.


[flagged]


That's very misleading. Terrible wars keep happening all the time, just not so much in the US and Europe for the last 70 years.

Yes, since WWII things have been relatively peaceful, but the key term there is relatively. As we speak, a pretty awful war is happening in Gaza; in the last twenty years we've had multiple wars with pretty severe casualties; and if you go a little farther back you get to things like Vietnam.

It's true that specifically atomic bombs haven't been used.


> "Acts like firebombing of Tokyo or bombing of Dresden or atomic bombing don't happen now."

For the time being... Humanity's leaders are increasingly as insane as (sometimes more so than) their worshipers. I feel it's only a matter of time before atrocities and crimes against humanity skyrocket again. :(


> For the time being

Not even. We haven’t avoided nuclear war by not building nukes. And we still raze cities and manufacture incendiary weapons.


> Acts like firebombing of Tokyo or bombing of Dresden or atomic bombing don't happen now

We still raze cities and drop incendiaries. America hasn’t gone to war with a near-peer non-nuclear power like Japan since WWII. To the extent we were faced with the prospect in the Cold War, both we and the Soviets were committed to MAD, i.e. using nukes. (Do you think unilateral disarmament in the Cold War would have led to peace?)

There has been no militarily useful technology that was voluntarily abandoned. Just constrained. You can’t constrain a technology you don’t bother understanding.


[flagged]


> during the firebombing of Tokyo the US murdered 100,000 civilians

Are you arguing there was a war in which firebombing would have been useful but someone decided it was too mean?

Since WWII we invented better high explosives and stand-off precision weapons. If there were a strategic case for firebombing in a future war, have no delusions: it will happen. (Last year, incendiary weapons were used “in the Gaza Strip, Lebanon, Ukraine, and Syria” [1].)

[1] https://www.hrw.org/news/2024/11/07/incendiary-weapons-new-u...


[flagged]


> literally 'making war different'

What? Who argued war hasn’t changed with technology?

> civilians weren't murdered on the same scale

War wasn’t conducted on the same scale.

> why you are conflating the use of certain types of weapons and willingly allowing enormous collateral damage

I’m not. Nobody in this thread is. The point is the weapons are still stockpiled and used. We have never agreed to ban a useful military technology. Just contained or surpassed it.

AI will be used by militaries as long as it’s useful, even if it causes collateral damage. We will obviously try to reduce collateral damage. But in part because that makes the weapon more useful.


[flagged]


> You argued that war is the same hell as it was 80 years ago and it can't be changed by making war different

Where? I certainly did not. War is hell and always has been, but it's obviously a different hell than it was in earlier eras.

Going back on piste: if AI has military applications, it will be developed and used for them.

> If you say that all hells are made equal, I won't agree

Not how a discussion works.

> question is why. Maybe it's because something changed

Yes. Nukes and precision stand-off weapons.


>> If you say that all hells are made equal, I won't agree

>Not how a discussion works.

So you do say that? Can I ask you something? Where would you rather be: in fire-bombed Tokyo or anywhere in Ukraine right now?


I know what you mean, but I don't have an answer myself.

Really, the collateral damage in Ukraine is still ongoing, whereas in Tokyo it ended quite some time ago.

So it's tragically possible that Ukraine could end up worse than Tokyo by the time hostilities finally cease.

Maybe a closer equivalent to Tokyo would be if Ukraine attacked Moscow using a comparable approach, with a degree of disregard for collateral damage figured in. Although Russian strategy already seems to target any part of Kiev that can be hit, civilian or not.

Plus, no two things like this are really on the same scale, and it's never a direct comparison, but there's some common undercurrent, either predatory or vengeful, which can sometimes grow until it can't get much worse.

So what about prehistoric tribes, even pre-humans, who surely massacred victim tribes completely from time to time, not much differently than pack animals have always been known to do?

Total extermination like that could be rapidly completed with no weapons of mass destruction or even gunpowder.

Isn't there some possibility that this tendency has been retained, evolutionarily or culturally, to some extent today, even though most people would say it's just the opposite of "humanity"?

Passed down in an unbroken chain in some way?

Disclaimer: when I was a teenager I worked one summer with a German machinist who had survived the bombing of Dresden. Ironically, the project we were on was components for the most advanced projectile of its caliber, yet to come. Both of us would have liked to build something else, but most opportunities across the board, affecting all ages, had already evaporated due to the inflation of the 1970s, and the runaway years hadn't even gotten there yet.


>So it's tragically possible that Ukraine could end up worse than Tokyo by the time hostilities finally cease.

I seriously doubt that.

>Although Russian strategy already seems to target any part of Kiev that can be hit, civilian or not.

Not really, but one can get that impression from reading the NY Times and the like.

Here is a good example of the Western atrocity propaganda: https://www.nytimes.com/2024/04/06/world/europe/russia-ukrai...

See the big picture at the top? Clearly it's some kind of mall damaged by a senseless and cruel Russian strike that cannot have any purpose but to terrorize the population of Kharkiv into submission.

Next look at the first video here: https://t.me/ASupersharij/28133

The place should look familiar, only now you can see a destroyed MLRS vehicle (there were two, but the second one got vaporized: https://t.me/aleksandr_skif/3150)

>Isn't there some possibility that this tendency has been retained evolutionarily or culturally to some extent today

Sure, but there is an opposite tendency too and it's not going anywhere barring catastrophic changes like famine due to global warming.


One thing that changed is that everything is instantly reported through numerous channels, and globally: traditional broadcast media as well as independent reporters using Internet channels.


Security podcasters will cheer you killing kids because you might have hit a few terrorists in the process.


The cynic in me finds this quite naive: there are territories on Earth where, if you are a male of a certain age and you are killed in a drone strike, you are automatically classified as a "military-aged male", i.e. a non-civilian, regardless of the existence of any other evidence.

So what will happen if AI suggests someone is a terrorist when they are not? Well, in the worst case they'll be killed, and it'll be very close to what we have today, except that somewhere in a private military database there might be an automatically generated record from an LLM ok-ing the target.


That was already done over a decade ago with random forests. There's no need to apply anything as advanced as generative AI 'hallucinations' to such a trivial problem. /s

https://www.theguardian.com/science/the-lay-scientist/2016/f...
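For anyone wondering why that approach drew criticism, the core issue is base rates: when real targets are vanishingly rare, even a decent classifier flags mostly innocent people. A rough, self-contained sketch of that effect (synthetic data and made-up features, not the methodology from the article):

    # Toy sketch of the base-rate problem, not the actual program's methodology.
    # Synthetic population with made-up "behavioral metadata" features.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)

    n = 200_000
    prevalence = 0.0005                      # ~100 true targets in 200k people
    y = rng.random(n) < prevalence

    x = rng.normal(size=(n, 2))              # weakly informative features
    x[y] += 1.5                              # true targets shifted, but overlapping

    # Train on one half of the population, score the other half.
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(x[:n // 2], y[:n // 2])
    scores = clf.predict_proba(x[n // 2:])[:, 1]
    truth = y[n // 2:]

    k = 200                                  # flag the k highest-scoring people
    flagged = np.argsort(scores)[-k:]
    false_positives = int(np.sum(~truth[flagged]))
    print(f"flagged {k}, real targets in test half: {int(truth.sum())}, "
          f"false positives among the flagged: {false_positives}")

With roughly 50 real positives in a test population of 100,000, flagging the 200 highest-scoring people means at least three quarters of them are false positives, no matter how well the model ranks.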


>you already have a belief that your work is morally correct

Or you don't care about morals. Or you are evil.



