
The difference in those probabilities is the same. In each case, the event with the greater probability is more likely to occur by 1 in 1,000: on average, one extra occurrence per thousand trials.

Let's walk through an example with these numbers. Each case has 2 probabilities, and we'll assign each of them to one of the possible outcomes of a coin flip.

Case 1:

The probability of heads is 0 and the probability of tails is 0.001. We flip the coin 1,000 times. The expected outcome is 0 heads and 1 tails.

Case 2:

The probability of heads is 0.499 and the probability of tails is 0.500. We flip the coin 1,000 times. The expected outcome is 499 heads and 500 tails.

(You'll notice that the expected outcomes don't add up to 1,000 in either case. That's because the probabilities don't add up to 1, which means there is actually a third possible outcome; it can be ignored for the purpose of this example.)
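
Here's a minimal sketch of the two cases (a hypothetical simulation; run_case is my own name for it, not something from the thread):

    import random

    random.seed(0)  # fixed seed so the sketch is reproducible

    def run_case(p_heads, p_tails, flips=1000):
        # The leftover probability mass goes to the third outcome
        # mentioned in the parenthetical above.
        p_other = 1 - p_heads - p_tails
        results = random.choices(["heads", "tails", "other"],
                                 weights=[p_heads, p_tails, p_other],
                                 k=flips)
        return results.count("heads"), results.count("tails")

    print(run_case(0.000, 0.001))  # Case 1: expect about (0, 1)
    print(run_case(0.499, 0.500))  # Case 2: expect about (499, 500)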

In each case, the two outcomes differ by 1 result per 1,000 trials. This is because probability is simply the likelihood of an event occurring; there is no super-linear pattern going on here. It is no harder to get from 0.7 to 0.9 probability than it is to get from 0.2 to 0.4. It is just a measure of likelihood.

What is causing people to misunderstand this? I wonder if you're thinking about probability distributions vs. probability itself?



Addition and subtraction are natural operations on the log scale (think of the probability of several independent events all occurring), but not on the linear scale.
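
For instance, a minimal sketch with made-up probabilities: multiplying probabilities of independent events becomes addition on the log scale.

    import math

    # Independent events: P(A and B) = P(A) * P(B).
    # On the log scale, that multiplication becomes addition.
    p_a, p_b = 0.3, 0.2  # hypothetical probabilities
    print(math.log(p_a * p_b))            # -2.8134...
    print(math.log(p_a) + math.log(p_b))  # same value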

The interpretation of a linear difference of 0.001 in probability depends on the two numbers you're subtracting, which is not the case on the log scale.

OP pointed out intuitively why adding and subtracting probabilities on the linear scale doesn't make sense (examples would be comparing 0.001 and 0 versus 0.500 and 0.499). It's the fold changes that matter when interpreting two different probabilities (e.g. when making decisions under uncertainty), not the linear difference.

The proper interpretation of your Case 1 and Case 2 should be the ratio of heads to tails as you flip coins, not the difference in the absolute number of heads that show up in the two cases.
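
To make the fold-change point concrete, a small sketch (the formatting is mine):

    # Same linear difference (0.001), wildly different fold changes.
    cases = {"Case 1": (0.000, 0.001), "Case 2": (0.499, 0.500)}
    for name, (p_heads, p_tails) in cases.items():
        diff = p_tails - p_heads
        fold = p_tails / p_heads if p_heads > 0 else float("inf")
        print(f"{name}: difference = {diff:.3f}, fold change = {fold:.3f}")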


> The interpretation of a linear difference of 0.001 in probability depends on the two numbers you're subtracting

No, it does not, and I have shown why. A difference of 0.001 in probability means that the two events differ in their likelihood of occurring by 1 in 1,000 trials.

> OP pointed out intuitively why adding and subtracting probabilities on the linear scale doesn't make sense (examples would be comparing 0.001 and 0 versus 0.500 and 0.499)

This is only intuitive if you have a misunderstanding of probability.

> The proper interpretation of your Case 1 and Case 2 should be the ratio of heads to tails as you flip coins

This ratio is not relevant to anything.


Clearly we have differences in what is intuitive and what is not.

I think OP is trying to say that people who interpret Case 1 and Case 2 in terms of absolute difference in number of heads rather than the ratio of heads to tails have questionable foundations in probability.

I probably won't be able to change your intuition, but let me try to show why thinking in terms of the difference in log-odds is grounded in nice mathematical properties.

When thinking about how to compare two probabilities (i.e. a function that takes two numbers between 0 and 1 and outputs a real number), one thing that is nice to have is that the function outputs opposite numbers for the complementary events.

A comparison function that fulfills this property (and other nice properties) is the difference in log-odds: d(p1, p2) = log(p1/(1-p1)) - log(p2/(1-p2)).

For example, using this comparison for p1=0.002, p2=0.001 gives about 0.694. What's nice is that with p1=(1-0.002) and p2=(1-0.001) you get -0.694. So this comparison function does give the expected results.

Using this comparison function to look at 0.500 and 0.499 gives 0.004, which is much less than 0.694. This suggests that, in this mathematical framework for comparing probabilities, (0.500, 0.499) are similar while (0.002, 0.001) are more different.

Indeed, if you take the extreme case of 0.001 and 0, you see that the comparison function outputs infinity. This also makes sense when making decisions: when comparing an impossible event to a merely improbable one, the exact probability of the improbable event no longer matters (think of your coin-flipping case again with p1=0, letting p2 be any number between 0 and 1).
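
Putting those numbers in code (a minimal sketch; logit and compare are my names, not anything from the thread):

    import math

    def logit(p):
        # Log-odds of p; -infinity at p=0, +infinity at p=1.
        if p == 0:
            return float("-inf")
        if p == 1:
            return float("inf")
        return math.log(p / (1 - p))

    def compare(p1, p2):
        # The comparison function described above: difference in log-odds.
        return logit(p1) - logit(p2)

    print(compare(0.002, 0.001))          # ~0.694
    print(compare(1 - 0.002, 1 - 0.001))  # ~-0.694, the opposite number
    print(compare(0.500, 0.499))          # ~0.004
    print(compare(0.001, 0.0))            # inf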


Intuition is definitely subjective, and that is ok. I do respect different intuitions while trying to understand them. And the best thing for that is doing what you did, which is to provide a clearer example.

Unfortunately, the example makes it clearer to me that probability differences are linear.

Your "difference of log-odds" function is a qualitative function that you've constructed to operate on probabilities. All you've done is take the log of numbers and compare them, and of course you're going to take on the properties of logarithms themselves: specifically that the difference between log(x2) and log(x1) decreases as x2 and x1 increase in value. That's just how logarithms work.

This says nothing about the relationship between probabilities.

You can't just apply any operation to any values and draw conclusions without understanding the interpretation of the values. Probabilities are likelihoods, not raw quantities, so the domain of likelihood has to be considered when analyzing probability values.

In the domain of likelihood, when events are independent (i.e. flipping a coin once does not influence the next flip), you can interpret differences in probability linearly.
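
A sketch of that linear interpretation: by linearity of expectation, over n independent trials the expected counts differ by n * (p2 - p1), no matter where the probabilities sit.

    # Pairs with the same 0.001 gap at very different base rates.
    n = 1000
    for p1, p2 in [(0.000, 0.001), (0.499, 0.500), (0.700, 0.701)]:
        print(f"p1={p1}, p2={p2}: expected counts {n * p1:.0f} vs {n * p2:.0f}, "
              f"difference {n * (p2 - p1):.0f}")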



