
From the article:

    Of particular concern is that Obnam has a theoretical collision
    potential, in that if a block has the same MD5 hash as another
    block, it will assume they are the same. This behaviour is the
    default, but can be mitigated by using the verify option. I tried
    with and without, and interestingly did not notice any speed
    difference (2 seconds, which is statistically insignificant) and
    also did not encounter any bad data on restoration. So I don't
    know why it's off by default.
Worrying about this violates Taylor's Law of Programming Probability[1]:

    The theoretical possibility of a catastrophic occurrence in your
    program can be ignored if it's less likely than the entire
    installation being wiped out by meteor strike.
I've seen a lot of sysadmins and programmers nitpick systems over the theoretical possibility of MD5 or SHA-1 collisions, but a collision is amazingly unlikely in something like a backup system, where you're backing up your own data rather than taking hostile user data from people who might be engineering collisions.

1. http://www.miketaylor.org.uk/tech/law.html
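
To put a rough number on "amazingly unlikely", here's a back-of-the-envelope sketch using the standard birthday-bound approximation. It assumes block hashes behave like uniformly random 128-bit values, which only holds for non-adversarial data. The function name and the 2**40 block count are made up for illustration and have nothing to do with Obnam itself.

```python
import math


def accidental_collision_probability(num_blocks: int, hash_bits: int = 128) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / 2**(bits+1)).

    Assumes hash outputs are uniformly random, i.e. nobody is
    deliberately constructing colliding inputs.
    """
    exponent = -num_blocks * (num_blocks - 1) / 2 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(exponent)


if __name__ == "__main__":
    # Hypothetical repository holding 2**40 (about a trillion) blocks.
    p = accidental_collision_probability(2 ** 40)
    print(f"chance of any accidental MD5 collision: {p:.3e}")
```

Even at about a trillion blocks the result is on the order of 1e-15. This says nothing about deliberately constructed collisions, which are a separate question.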



A collision is unlikely to happen by chance, but the scheme can be quite vulnerable to malicious attacks.


"Quite". Let's look at the potential attack. You're running a backup system with user-supplied data, fair enough, and one of your users:

    1) Has access to an existing object, or its checksum.

    2) Can write a *new* object that intentionally
       collides with an existing object.
There's a trivial way to get around this attack in practice: write objects lazily and never re-write an object that already exists. This is what Git does with the objects it writes, which gives it more protection against future SHA-1 collision attacks than SHA-1 alone would provide.

This means that you've turned an attack where someone can maliciously clobber an existing object into an edge case where their object just won't get backed up.
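
A minimal sketch of that "never re-write an existing object" rule, assuming a simple in-memory, content-addressed store. `WriteOnceStore` and `length_hash` are hypothetical names. The length-based hash is a deliberately terrible stand-in so that a collision is easy to produce, and none of this is how Git or Obnam is actually implemented.

```python
import hashlib
from typing import Callable, Dict


def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def length_hash(data: bytes) -> str:
    # Toy hash for demonstration only: any two inputs of equal length collide.
    return str(len(data))


class WriteOnceStore:
    """Content-addressed store that never overwrites an existing key."""

    def __init__(self, hash_fn: Callable[[bytes], str] = md5_hex) -> None:
        self._hash_fn = hash_fn
        self._objects: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = self._hash_fn(data)
        # setdefault stores data only if the key is absent, so a later
        # colliding object is silently dropped, not written over the first.
        self._objects.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]


if __name__ == "__main__":
    store = WriteOnceStore(hash_fn=length_hash)
    original = store.put(b"good")
    attacker = store.put(b"evil")      # same length, so same toy hash
    assert original == attacker
    assert store.get(original) == b"good"  # the first write wins
```

The attacker's object is the one that goes missing, which is the edge case described above.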


Assuming of course that the object they want to clobber is either already backed up or processed before the malicious object. They can still attack a new object.


E.g. by backing up two files which are designed to demonstrate an MD5 collision.
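
For that two-colliding-files case, here's a rough sketch of what a verify-style check could look like: on a hash match, compare the actual bytes before treating the blocks as identical. This is an assumption about the general technique, not Obnam's actual verify implementation. `store_block`, `CollisionError` and the toy `length_hash` are hypothetical names, and the toy hash is there only so a collision is easy to trigger.

```python
import hashlib
from typing import Callable, Dict


class CollisionError(Exception):
    """Raised when two different blocks map to the same hash key."""


def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def length_hash(data: bytes) -> str:
    # Toy hash for demonstration only: equal-length inputs collide.
    return str(len(data))


def store_block(
    blocks: Dict[str, bytes],
    data: bytes,
    verify: bool,
    hash_fn: Callable[[bytes], str] = md5_hex,
) -> str:
    """Deduplicate by hash; with verify=True, also compare the bytes."""
    key = hash_fn(data)
    existing = blocks.get(key)
    if existing is None:
        blocks[key] = data
    elif verify and existing != data:
        # Same key, different content: refuse to treat them as one block.
        raise CollisionError(f"hash {key!r} matches a different block")
    return key


if __name__ == "__main__":
    blocks: Dict[str, bytes] = {}
    store_block(blocks, b"aaaa", verify=False, hash_fn=length_hash)
    store_block(blocks, b"bbbb", verify=False, hash_fn=length_hash)
    assert blocks == {"4": b"aaaa"}  # second block silently lost

    try:
        store_block(blocks, b"bbbb", verify=True, hash_fn=length_hash)
    except CollisionError as err:
        print("caught:", err)
```

Without the byte comparison the second file is silently dropped. With it, the mismatch is at least detected, at the cost of reading the stored block back.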



