For whatever it’s worth, in response to the same question posed by me (“what is the ld50 of caffeine”), Google’s AI properly reported it as 150-200 mg/kg.
I asked this about 1 minute after you posted your comment. Perhaps it learned of and corrected its mistake in that short span of time, perhaps it reports differently on every occasion, or perhaps it thought you were a rat :)
The median lethal dose (LD50) of caffeine in humans is estimated to be 150–200 milligrams per kilogram of body mass. However, the lethal dose can vary depending on a person's sensitivity to caffeine, and can be as low as 57 milligrams per kilogram.
Route of administration
Oral 367.7 mg/kg bw
Dermal >2000 mg/kg bw
Inhalation LC50 combined: ca. 4.94 mg/L
That’s half the people in a caffeine-chugging contest falling over dead. The first 911 call would come much, much earlier. I doubt you’d get to 57 mg/kg (roughly 4 grams for a 70 kg adult, on the order of 40 cups of coffee) before someone thought they were having a heart attack (angina).
I just tried it and got a similar answer; we’re posting within minutes of each other.
--
The median lethal dose (LD50) of caffeine in humans is estimated to be 150–200 milligrams per kilogram of body mass. However, the lethal dose can vary depending on a person's sensitivity to caffeine, and can be as low as 57 milligrams per kilogram.
Route of administration | LD50
Oral | 367.7 mg/kg bw
Dermal | 2000 mg/kg bw
Inhalation | LC50 combined: ca. 4.94 mg/L
The FDA estimates that toxic effects, such as seizures, can occur after consuming around 1,200 milligrams of caffeine.
Is this really true? The linear algebra is deterministic, although maybe there is some chaotic behavior from floating-point handling. The non-deterministic part mostly comes from intentionally added randomness, which can be turned off, right?
Maybe the argument is that if you turn off the randomness you don’t have an LLM-like result any more?
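What I mean, roughly (made-up numbers, just to pin down the two modes I’m asking about):

```python
import numpy as np

# Hypothetical per-token scores at one decoding step; not any real model's output.
logits = np.array([2.0, 1.5, 0.3])
probs = np.exp(logits) / np.exp(logits).sum()

greedy_token = int(np.argmax(probs))                  # "randomness off": always the same token
rng = np.random.default_rng()                         # unseeded, so this can vary run to run
sampled_token = int(rng.choice(len(probs), p=probs))  # usual LLM-style sampling

print(probs.round(3), greedy_token, sampled_token)
```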
Floats are deterministic too (this winds up being helpful if you want to do something like test an algorithm on every single float); you just might get different deterministic outcomes on different compilation targets or with threaded intermediate values.
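As an aside, "every single float" is genuinely tractable. A rough sketch using float16 so it runs instantly (float32 is the same idea, just ~4.3 billion bit patterns); the function under test is a throwaway stand-in:

```python
import numpy as np

def f(x):
    # Stand-in for whatever algorithm you're testing; any pure float function works.
    return np.sqrt(np.abs(x)) * np.float16(0.5)

# Every possible float16 value -- including NaNs, infinities, and subnormals --
# obtained by reinterpreting all 2**16 bit patterns.
every_f16 = np.arange(2**16, dtype=np.uint16).view(np.float16)

a = f(every_f16)
b = f(every_f16)

# Bit-for-bit identical on both runs (plain == would trip over NaN != NaN).
assert np.array_equal(a.view(np.uint16), b.view(np.uint16))
```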
The argument is, as you suggest, that without randomness you don't have an LLM-like result any more. You _can_ use the most likely token every time, or beam search, or any number of other strategies to try to tease out an answer. Doing so gives you a completely different result distribution, and it's not even guaranteed to give a "likely" output. Imagine, e.g., a string of tokens that are each 10% likely for any greedy choice, versus a different string where the first token is 9% likely and the remaining nine are 90% each: with a 10-token answer the second option is roughly 350 million times more likely under random sampling, but it will never happen with a simple deterministic strategy, and you can tweak the example slightly to keep beam search and similar from finding good results.
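Plugging in those assumed per-token probabilities:

```python
# The all-greedy string: ten tokens, each the locally best choice at 10%.
p_greedy = 0.10 ** 10             # 1e-10

# The alternative: a 9% first token (so greedy never picks it), then nine 90% tokens.
p_alternative = 0.09 * 0.90 ** 9  # ~0.0349

print(p_alternative / p_greedy)   # ~3.5e8 -- hundreds of millions of times more likely
```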
That brings up an interesting UI/UX question.
Suppose (as a simplified example) that you have a simple yes/no question and only know the answer probabilistically, something like "will it rain tomorrow" with an appropriate answer being "yes" 60% of the time and "no" 40%. Do you try to lengthen the answer to include that uncertainty? Do you respond "yes" always? 60% of the time? To 60% of the users and then deterministically for a period of time for each user to prevent flip-flopping answers?
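Sketching a few of those policies (all hypothetical, just to make the options concrete):

```python
import random

P_YES = 0.60  # assumed model confidence that it will rain

def answer_mode():
    # Always report the more likely answer.
    return "yes"

def answer_sampled():
    # Report "yes" 60% of the time, matching the underlying probability.
    return "yes" if random.random() < P_YES else "no"

def answer_sticky(user_id: int, day: str):
    # Sample once per (user, day) so a given user's answer doesn't flip-flop.
    rng = random.Random(hash((user_id, day)))
    return "yes" if rng.random() < P_YES else "no"
```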
The LD50 question is just a more complicated version of that conundrum. The model isn't quite sure. The question forces its hand a bit in terms of the classes of answers. What should its result distribution be?