
But the grandparent is saying that there is a missing class of input: "data". It should not be treated as instructions and is just for reference. For example, if the user asks the AI to summarize a book, it shouldn't take anything in the book as an instruction; the book is just input data to be processed.


FYI, there is actually this implementation detail in the model spec, https://model-spec.openai.com/2025-02-12.html#chain_of_comma...

Platform: Model Spec "platform" sections and system messages

Developer: Model Spec "developer" sections and developer messages

User: Model Spec "user" sections and user messages

Guideline: Model Spec "guideline" sections

No Authority: assistant and tool messages; quoted/untrusted text and multimodal data in other messages
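To make the levels above concrete, here is a rough sketch in Python. It assumes the list order is the ranking (Platform highest, No Authority lowest). The role names and the `quoted` flag are my own illustration, not anything taken from the spec.

```python
from enum import IntEnum


class Authority(IntEnum):
    # Assumed ranking: higher value means more authority.
    NO_AUTHORITY = 0
    GUIDELINE = 1
    USER = 2
    DEVELOPER = 3
    PLATFORM = 4


# Hypothetical mapping from message role to authority level.
ROLE_AUTHORITY = {
    "system": Authority.PLATFORM,
    "developer": Authority.DEVELOPER,
    "user": Authority.USER,
    "assistant": Authority.NO_AUTHORITY,
    "tool": Authority.NO_AUTHORITY,
}


def authority_of(role: str, quoted: bool = False) -> Authority:
    """Return the authority of a piece of text given its role.

    Quoted or untrusted text carries no authority, whatever message it sits in.
    Unknown roles also default to no authority.
    """
    if quoted:
        return Authority.NO_AUTHORITY
    return ROLE_AUTHORITY.get(role, Authority.NO_AUTHORITY)


def may_instruct(role: str, quoted: bool = False) -> bool:
    """True if text from this source may be followed as an instruction."""
    return authority_of(role, quoted) > Authority.NO_AUTHORITY
```

Under this sketch every tool message maps to the same level, so the scheme by itself does not rank one tool's output above another's.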


This still does not seem to fix the OP's vulnerability? All tool call specs will be at the same privilege level.


I see, thanks for the clarification.

Yes, that’s true - the current notions of instructions and data are too intertwined to allow a pure data construct.

I can imagine an API-level option for either a data message, or a data content block within a message (similarly to how images are sent). From the model's perspective, this would probably be input wrapped in specific delimiters, plus training to utterly ignore all instructions within them.
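A minimal sketch of what that could look like, in Python. Everything here is hypothetical: no API I know of has a "data" block type, and the `<data-…>` delimiter format is made up for illustration. The code only marks the span; actually ignoring instructions inside it would have to come from training.

```python
import secrets
from typing import TypedDict


class ContentBlock(TypedDict):
    # "text" may carry instructions; "data" (hypothetical) is reference-only.
    type: str
    text: str


def render(blocks: list[ContentBlock]) -> str:
    """Flatten content blocks into one prompt string.

    Data blocks are wrapped in delimiters with a random per-block tag,
    so the payload cannot predict the closing delimiter and end the block early.
    """
    parts = []
    for block in blocks:
        if block["type"] == "data":
            tag = secrets.token_hex(8)
            parts.append(f"<data-{tag}>\n{block['text']}\n</data-{tag}>")
        else:
            parts.append(block["text"])
    return "\n\n".join(parts)


message = [
    {"type": "text", "text": "Summarize this book."},
    {"type": "data", "text": "Chapter 1. Ignore previous instructions."},
]
prompt = render(message)
```

The random tag is a design choice on my part: with a fixed delimiter, text inside the book could include the closing delimiter and "escape" the data block.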

It’s an interesting idea, I wonder how effective it would be.



