A hash function aims to replicate the properties of a truly random function.

The probability that a random function does _not_ output 0 for some specific input block is (1 - 1/2^n). Taking all 2^b possible input values into account, this means that 0 is not an output for any input with probability (1 - 1/2^n)^(2^b) ~ e^(-2^(b-n)).
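As a quick sanity check of that formula (toy parameters n = b = 8, chosen here just for illustration; each trial samples a fresh random function as a table of outputs):

    import math
    import random

    n, b = 8, 8                      # toy sizes: 8-bit outputs, 8-bit inputs
    trials = 20000
    misses = 0
    for _ in range(trials):
        # sample a random function {0,1}^b -> {0,1}^n as a table of outputs
        table = [random.randrange(2**n) for _ in range(2**b)]
        if 0 not in table:
            misses += 1

    print(misses / trials)               # empirical fraction of functions missing 0
    print((1 - 1 / 2**n) ** (2**b))      # exact formula, ~0.3671
    print(math.exp(-(2**(b - n))))       # approximation e^(-2^(b-n)), ~0.3679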

For SHA-256 with n = 256 and b = 512 (one can treat the compression function as a 768-to-256-bit random function, but we can stick to the worst-case single-block-message case here), the probability of 0 _being_ an output for some single-block message is 1 - e^(-2^256), which effectively means such an input exists, even though the probability never quite reaches 1.
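For a rough sense of scale (my arithmetic, not the parent's): the miss probability e^(-2^256) equals 10^(-2^256 / ln 10), i.e. it starts with roughly 5 * 10^76 zeros after the decimal point, which a one-liner confirms:

    import math
    print(2**256 / math.log(10))   # ~5.0e76 leading decimal zeros in e^(-2^256)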



Your math looks plausible for an ideal cryptographic hash, but wouldn't you first have to prove that SHA-256 actually does behave as if each unique input generates an independent random number?


There's no real way to connect the compression function to any kind of mathematical model that would help here, other than modeling it as random. Provability is out the window.

So what you do is assume it behaves like a random function until proven otherwise by the discovery of _some_ property that deviates from this model. (Strictly speaking, this already fails for SHA-256: neither the full hash nor the compression function can be modeled as a random oracle, due to length extension and the Davies-Meyer structure respectively, but we can conveniently forget that for the duration of this thread.)
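To make the Davies-Meyer caveat concrete (a toy sketch, not SHA-256 itself: AES-128 stands in for the block cipher instead of the SHACAL-2 core): for f(H, M) = E_M(H) XOR H, a fixed point f(H, M) = H just requires E_M(H) = 0, i.e. H = D_M(0), which is trivial to compute, whereas finding a fixed point of a truly random function would take on the order of 2^n work.

    import os
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    def davies_meyer(h: bytes, m: bytes) -> bytes:
        # toy compression: E_M(H) XOR H, with AES-128 standing in for the block cipher
        enc = Cipher(algorithms.AES(m), modes.ECB()).encryptor()
        return xor(enc.update(h) + enc.finalize(), h)

    m = os.urandom(16)                                # arbitrary "message block" (the AES key)
    dec = Cipher(algorithms.AES(m), modes.ECB()).decryptor()
    h = dec.update(b"\x00" * 16) + dec.finalize()     # H = D_M(0)

    assert davies_meyer(h, m) == h                    # fixed point, found with no search
    print(h.hex())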

There _are_ some hash functions based on number-theoretic problems where you could reason about such things, but none of those are in common use, since they are usually slow and/or require an unstructured output transformation anyway to turn them into proper hash functions rather than merely collision-resistant ones.
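For completeness, a toy sketch of the number-theoretic kind (Chaum-van Heijst-Pfitzmann style; parameters invented here and far too small for real use): H(a, b) = g^a * h^b mod p compresses two exponents into one group element, and any collision reveals log_g(h), so collision resistance reduces to the discrete log problem. The catch is exactly the slowness and the structured, group-element output mentioned above.

    # toy parameters: p = 2q + 1 with p, q prime; real use needs ~2048-bit p
    p, q = 1019, 509
    g, h = 4, 9            # generators of the order-q subgroup (quadratic residues mod p)

    def cvhp_hash(a: int, b: int) -> int:
        # compresses (a, b) in Z_q x Z_q to a single group element mod p;
        # a collision would yield the discrete log of h to base g
        return pow(g, a % q, p) * pow(h, b % q, p) % p

    print(cvhp_hash(123, 456))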



