More

Johngibb · 2025-07-09T04:23:51 1752035031

I am actually asking this question in good faith: are we certain that there's no way to write a useful AI agent that's perfectly defended against injection just like SQL injection is a solved problem?

Is there potentially a way to implement out-of-band signaling in the LLM world, just as we have in telephones (i.e. to prevent phreaking) and SQL (i.e. to prevent SQL injection)? Is there any active research in this area?

We've built ways to demarcate memory as executable or not to effectively transform something in-band (RAM storing instructions and data) to out of band. Could we not do the same with LLMs?

We've got a start by separating the system prompt and the user prompt. Is there another step further we could go that would treat the "unsafe" data differently than the safe data, in a very similar way that we do with SQL queries?

If this isn't an active area of research, I'd bet there's a lot of money to be made waiting to see who gets into it first and starts making successful demos…

simonw · 2025-07-09T12:25:29 1752063929

This is still an unsolved problem. I've been tracking it very closely for almost three years - https://simonwillison.net/tags/prompt-injection/ - and the moment a solution shows up I will shout about it from the rooftops.

pegasus · 2025-07-09T07:49:40 1752047380

It is a very active area of research, AI alignment. The research so far [1] suggests inherent hard limits to what can be achieved. TeMPOraL's comment [2] above points out the reason this is so: the generalizable nature of LLMs is in direct tension with certain security requirements.

[1] check out Robert Miles' excellent AI safety channel on youtube: https://www.youtube.com/@RobertMilesAI

[2] https://news.ycombinator.com/item?id=44504527

Johngibb · on Nov 11, 2012

This reads like an infomercial... like someone was paid to write it.

xutopia · on Nov 11, 2012

I'm pretty sure the guy's an evangelist for Microsoft. The article seems so over the top that it's infomercial territory.

kinble32 · on Nov 11, 2012

I agree. I think surface is pretty cool, but this is a little too positive.

Johngibb · on Nov 2, 2012

If we're going to grant that a machine can have infinite ram, surely we can grant that a spec of C can have infinite stack allocated arrays... :)

Johngibb · on Oct 29, 2012

Can I have one please?

Cyril-Boh · on Oct 29, 2012

Johngibb · on Oct 25, 2012

Please, be constructive instead of just dismissive. A nasty tone doesn't help anyone nor does it encourage trying new things.

jenius · on Oct 25, 2012

I tried to be constructive, but I honestly couldn't find a single good thing about this. The execution was poor, and the entire idea of making a 'framework' out of boxes with background colors is ridiculous.

I do apologize for the harshness, especially if it offended people, and I'm all for trying new things, but something like this should absolutely not be up on the popular page of hacker news. This is an amateur attempt at a framework that was very poorly done. Is that honestly deserving of more than 100 upvotes?

Johngibb · on Oct 26, 2012

I think your criticism is well founded.

That said, the author did put work into something, document it, and share it freely with the community - and for that reason, I'd rather not see someone hurt their feelings calling their work an 'abomination.'

I just think with a slightly different tone you could have made the same point in a way that would show the OP some brutal honesty but without discouraging them.

jenius · on Oct 31, 2012

good reply and good point. thank you

Johngibb · on Oct 24, 2012

I bet the variation in the size of people's fingers is greater than the difference in the size of touch targets between original and mini iPads... :)

Johngibb · on Oct 22, 2012

Deprecated, but not removed yet. It "may be removed in a relatively distant future"

cmn · on Oct 22, 2012

This is what deprecated means. You can't just remove an API that's existed for years and scripts may depend on without giving them time to adapt. This shows a warning when it's used so it's clear which scripts should be changed.

Johngibb · on Oct 21, 2012

That's true, and I think assigning malice to this situation is unfounded. But maybe this is a problem worth solving?

Johngibb · on Oct 21, 2012

Come on, don't reject it so flippantly. Sure, for the case of a woman getting married and taking a new last name, I doubt it's that big of a deal - you can change your name for future commits, but your old name will exist for historical commits. However, there _are_ cases where you might not want your former name around (transgender, or even something like privacy / witness protection). Right now, these folk are being sort of excluded (however inadvertently) and it's worth discussing ways to fix that.

Johngibb · on Oct 19, 2012

That stood out to me too... is there an inside joke? What's the white have to do with it?

philwelch · on Oct 19, 2012

Seems like a throwaway joke to me; nothing to over-analyze.

easy_rider · on Oct 19, 2012

Well, Most Asians have awesome APM or are great at table tennis. Black people tend to be great at running really fast. It makes sense doesn't it?:)