If you have two possible tokens with probability 40% and 30%, you'll always get the 40% token at T=0. But if you have two possible tokens at 40% and 39.99%, you may get the 39.99% token on occasion, even at T=0. (Numbers illustrative.)
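Concretely, temperature is conventionally applied by dividing the logits before the softmax, with T=0 special-cased as a plain argmax rather than a division by zero. A minimal sketch in TypeScript (the logit values, function names, and structure are illustrative, not any particular vendor's implementation):

    // Sketch: temperature rescales logits before softmax; T=0 means greedy argmax.
    function softmax(logits: number[], temperature: number): number[] {
      const scaled = logits.map(l => l / temperature); // higher T flattens, lower T sharpens
      const max = Math.max(...scaled);                 // subtract max for numerical stability
      const exps = scaled.map(l => Math.exp(l - max));
      const sum = exps.reduce((a, b) => a + b, 0);
      return exps.map(e => e / sum);
    }

    function sampleToken(logits: number[], temperature: number): number {
      if (temperature === 0) {
        // T=0 is treated as greedy decoding, not a division by zero.
        return logits.indexOf(Math.max(...logits));
      }
      const probs = softmax(logits, temperature);
      let r = Math.random(); // the injected randomness lives here
      for (let i = 0; i < probs.length; i++) {
        r -= probs[i];
        if (r <= 0) return i;
      }
      return probs.length - 1;
    }

If two logits land within floating-point noise of each other, the argmax itself can flip between runs depending on how those logits were computed.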
Isn't the temperature the "injected" randomness? The way I imagine it, somewhere in the model you would do

    input += (Math.random() - 0.5) * coefficient * temperature

so setting temperature to 0 would mean no randomness. Thinking further about why there is inherent randomness, I believe it comes from the lack of associativity in floating point operations. These models obviously do A LOT of parallel floating point operations.
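That lack of associativity is easy to demonstrate. The same numbers summed with a different grouping or in a different order give slightly different results, which is exactly what a parallel reduction does when it doesn't pin down the summation order (values here are just for illustration):

    // Floating-point addition is not associative: grouping changes the result.
    const a = 0.1, b = 0.2, c = 0.3;
    console.log((a + b) + c === a + (b + c)); // false
    console.log((a + b) + c);                 // 0.6000000000000001
    console.log(a + (b + c));                 // 0.6

    // Summing the same values in a different order can also drift slightly:
    const values = Array.from({ length: 1000 }, (_, i) => 1 / (i + 1));
    const forward = values.reduce((acc, x) => acc + x, 0);
    const backward = [...values].reverse().reduce((acc, x) => acc + x, 0);
    console.log(forward === backward); // typically false for a long sum like this
    console.log(forward - backward);   // tiny but nonzero difference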
It's mentioned briefly in the OpenAI text completion guide: https://platform.openai.com/docs/guides/completion/introduct...