The probabilistic nature means nothing on its own. An LLM that can solve your deterministic task will happily assign 100% to the correct answer (or 99%; the noise floor can be truncated with a sampler). If it doesn't, and your replies are unstable, it can't solve the task confidently. That happens to every LLM on a sufficiently complex task, but it isn't caused by their probabilistic nature.
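A minimal sketch of what "truncating the noise floor" with a min-p style sampler means, using plain NumPy and toy probability vectors instead of a real model (the threshold value and distributions are just for illustration):

```python
import numpy as np

def truncate_and_sample(probs, min_p=0.05, rng=np.random.default_rng()):
    """Drop tokens whose probability is below a fraction of the top token,
    renormalize, and sample. With a confident model (top token near 0.99)
    every low-probability 'noise' token gets cut, so the output is
    effectively deterministic."""
    probs = np.asarray(probs, dtype=float)
    cutoff = min_p * probs.max()          # min-p style threshold
    trimmed = np.where(probs >= cutoff, probs, 0.0)
    trimmed /= trimmed.sum()
    return int(rng.choice(len(probs), p=trimmed))

# Confident model: 0.99 on the right token, noise spread over the rest.
confident = [0.99, 0.004, 0.003, 0.003]
print({truncate_and_sample(confident) for _ in range(1000)})  # {0} -- stable

# Unconfident model: the same sampler can't rescue it.
unsure = [0.40, 0.35, 0.15, 0.10]
print({truncate_and_sample(unsure) for _ in range(1000)})     # several indices -- unstable
```

The point of the toy run: the sampler only removes noise that the model has already pushed to the floor; it doesn't create confidence that isn't there.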
Of course, that still doesn't mean you should do it that way. If you want to maximize the model's performance, offload as much distracting stuff as possible to code.
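A toy illustration of that offloading, with entirely hypothetical invoice data: do the deterministic arithmetic in code and hand the model only the part that needs judgment, rather than making it sum numbers and compare dates inside the prompt.

```python
from datetime import date

def build_prompt(invoices):
    """Compute the deterministic facts (totals, overdue counts) in code,
    then ask the model only for the judgment call: drafting the reminder."""
    today = date.today()
    total = sum(inv["amount"] for inv in invoices)
    overdue = [inv for inv in invoices if inv["due"] < today]
    return (
        f"Total outstanding: {total:.2f}. "
        f"{len(overdue)} of {len(invoices)} invoices are overdue. "
        "Draft a short, polite payment reminder for the client."
    )

invoices = [
    {"amount": 1200.0, "due": date(2024, 5, 1)},
    {"amount": 300.0,  "due": date(2024, 7, 1)},
]
print(build_prompt(invoices))
```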