Just a week ago, I asked ChatGPT to find me something like this and found the Drive Weather iOS app, which I've started using.
I was also thinking of building something myself, but this one is good enough. Have you tried it or any other similar apps? Why did you decide to build a new one and why is it better? (Building just for fun is an option as well!)
You’re also cutting off developers who care about the cybersecurity of their agents and don’t want to point them to random websites that could contain dangerous prompt injections, as well as people who want to understand where they’re directing the agent, and why, before doing so.
I think the biggest thing is to not give it access to anything like a shell (obviously), limit the call length, and give it a hangup command.
Then you tell it to just not answer off-the-wall questions etc., and if you are using a good model it will resist casual attempts.
I don't see being able to ask nonsense questions as being a big deal for an average small business. But you could put a guardrail model in front to make it a lot harder if it was worth it.
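To make the first three points concrete, here is a rough sketch of what I mean, not anyone's actual implementation. The names (`CallSession`, `ALLOWED_TOOLS`, `lookup_hours`) and the 180-second cap are made up for illustration; the idea is just an explicit tool allowlist with no shell in it, a hard call-length limit, and a hangup tool the model can call.

```python
import time

MAX_CALL_SECONDS = 180  # hypothetical cap on call length
ALLOWED_TOOLS = {"hang_up", "lookup_hours"}  # hypothetical allowlist; note: no shell


class CallSession:
    """Tracks one call and gates every tool request the model makes."""

    def __init__(self, max_seconds=MAX_CALL_SECONDS, clock=time.monotonic):
        self._clock = clock
        self._started = clock()
        self._max_seconds = max_seconds
        self.active = True

    def expired(self):
        # True once the call has run for max_seconds or longer.
        return self._clock() - self._started >= self._max_seconds

    def dispatch(self, tool_name, handlers):
        # End the call if it is already over, ran too long, or the model hung up.
        if not self.active or self.expired() or tool_name == "hang_up":
            self.active = False
            return "call ended"
        # Anything outside the allowlist (or without a handler) is refused.
        handler = handlers.get(tool_name)
        if tool_name not in ALLOWED_TOOLS or handler is None:
            return "tool not permitted"
        return handler()
```

The point of routing everything through one `dispatch` is that the limits are enforced in code, outside the prompt, so they hold even if the model gets talked into something.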
In general, these types of attacks are still difficult to solve, because there are a lot of different ways they can be formulated. LLM-based security is still an unknown, but mostly I have seen people using intermediary steps to parse question intent and return canned responses if the question seems outside the intended modality.
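A minimal sketch of that intermediary step, under heavy assumptions: `classify_intent` here is a keyword stand-in (a real setup would more likely call a small guardrail model), and the intent names, `CANNED_REFUSAL`, and the `main_model` callable are all hypothetical.

```python
CANNED_REFUSAL = "Sorry, I can only help with questions about our business."
ALLOWED_INTENTS = {"hours", "booking", "pricing"}  # hypothetical in-scope intents

# Keyword stand-in for an intent classifier; easy to evade, illustration only.
INTENT_KEYWORDS = {
    "hours": ("open", "close", "hours"),
    "booking": ("book", "appointment", "reserve"),
    "pricing": ("price", "cost", "how much"),
}


def classify_intent(question):
    """Return the first intent whose keywords appear in the question."""
    text = question.lower()
    for intent, words in INTENT_KEYWORDS.items():
        if any(word in text for word in words):
            return intent
    return "out_of_scope"


def answer(question, main_model):
    """Only forward in-scope questions to the main model; otherwise return a canned reply."""
    intent = classify_intent(question)
    if intent not in ALLOWED_INTENTS:
        return CANNED_REFUSAL
    return main_model(question, intent)
```

The main model never sees questions the gate rejects, which is what makes this harder to attack than a system prompt alone. It does not make it safe: an injection phrased to look in-scope would still get through.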