Hacker News

For anyone out there still using MD5 for any reason, check out this PDF file: https://www.alchemistowl.org/pocorgtfo/pocorgtfo14.pdf (42MB). You can also rename it to a .NES file and run it in a NES emulator.

It's a PDF file that is also an NES ROM that displays its own MD5 sum. The PDF also shows its own MD5 sum a few times. (The MD5 sum also happens to begin with 5EAF00D.)

When an arbitrary MD5 can be created that easily, it's useless for any cryptographic applications, or even any data integrity.




MD5 may not be collision resistant, but the logic in your conclusion is completely wrong. There is no feasible pre-image attack on MD5. The MD5 generated _was not arbitrary_; they seem to have just brute-forced a few of the leading bytes and then carried out a Nostradamus attack.

Again: finding something that hashes to an arbitrary MD5 sum is still not known to be feasible. This isn't a particularly good reason to use MD5, but this also means that MD5 is not broken in the way you think it is.

When somebody finds something that hashes to all zeroes you can finally say that MD5 is completely broken. It is not known to be at that point yet.


Right. Nobody should use MD5 if they can avoid it, but it's important to understand how it is and isn't broken.

Like, if I have the MD5 hash of a binary from a trusted source, I can basically rely on that, unless the attacker was involved in producing the trusted binary. In which case I'd usually have bigger concerns.
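
To make that concrete, here is a minimal sketch of the check in Python. It assumes the expected hash came from the trusted source over a separate channel; `md5_of_file` and `matches_trusted_md5` are illustrative names, not part of any tool mentioned in this thread.

```python
import hashlib
import hmac


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large binaries are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_trusted_md5(path, trusted_hex):
    """Compare the file's MD5 against a hash published by the trusted source."""
    # Normalize case and surrounding whitespace before comparing.
    expected = trusted_hex.strip().lower()
    return hmac.compare_digest(md5_of_file(path), expected)
```

The check only helps against someone who did not take part in producing the original binary, which is the caveat made above.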


No. MD5 is broken. You can find a pre-image faster than brute force. This is the definition of "broken" for a hash function.

https://link.springer.com/chapter/10.1007/978-3-642-01001-9_...


This is not a feasible attack. There is a large difference between how academics use "broken" and what the practical consequences are.

As Schneier writes as an introduction in the very paragraph you are trying to quote, "in academic cryptography, the rules are relaxed considerably." This is not a snub on academia; colloquial terms sometimes just mean something different than the academic definition.


That is "a" definition and that definition doesn't apply to the way the other commenter used broken.


That is the definition for hash functions. The other commenter was, not to put too fine a point on it, wrong. From Schneier's Self-Study Course in Block-Cipher Cryptanalysis:

"Breaking a cipher simply means finding a weakness in the cipher that can be exploited with a complexity less than brute-force."

But please go on about how you know more than Schneier.


You're disregarding the context of those words.

Schneier is illustrating the gap between the academic and practical meanings of "broken".



The term you're looking for is "compromised."


I use it a lot, but never for anything to do with security (obviously). We use it to calculate simple content hashes for caching or database ids. It's fine for that: md5 remains extremely popular for such purposes and is used that way by quite a few big-name companies. It's not a problem and not a security issue. If you look at cache headers in your browser, for example, a lot of the content hashes you'll see are md5 hashes. Likewise, if you use any of the popular object stores in AWS, GCP, Azure, etc., they'll be using md5 content hashes.

This is not a mistake. Md5 is a nice compromise between being fast and having a low probability of collisions while keeping the hashes nice and short. Simpler and faster hashing algorithms are available, of course; they have even more potential for collisions, and that's not an issue there either. But md5 is easily accessible and available on most platforms, so it's a good default if you need some kind of content hash.
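
A rough sketch of that kind of use in Python, assuming the hash is only a cache key and nothing depends on it for security. The function name `content_etag` is made up for illustration, and the `usedforsecurity` flag needs Python 3.9 or newer.

```python
import hashlib


def content_etag(body: bytes) -> str:
    """Build a quoted, ETag-style content hash for a response body."""
    # usedforsecurity=False (Python 3.9+) declares this a non-cryptographic use,
    # which also lets it run on builds that restrict MD5 for security purposes.
    digest = hashlib.md5(body, usedforsecurity=False).hexdigest()
    return f'"{digest}"'


# Identical bodies map to the same key; a changed body gets a new key.
cache = {}
body = b"<html>hello</html>"
cache[content_etag(body)] = body
```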

I've never seen accidental collisions, and intentionally creating collisions in a cache or a database id doesn't serve any purpose to anyone. So yes, you could try to do that, but why would you? The probability of unintentional collisions is low enough that it is not a concern. It's a complete non-issue. You are never going to see one in your career.
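
For a sense of scale, here is a back-of-the-envelope estimate using the standard birthday approximation. It assumes MD5 output behaves like a uniformly random 128-bit value for non-adversarial inputs, which is an assumption and not something measured here. The function name is illustrative.

```python
import math


def accidental_collision_probability(n_items: int, hash_bits: int = 128) -> float:
    """Approximate the chance that at least two of n_items random hashes collide.

    Uses the birthday approximation p ~= 1 - exp(-n(n-1) / 2^(bits+1)).
    """
    exponent = n_items * (n_items - 1) / 2 ** (hash_bits + 1)
    # -expm1(-x) computes 1 - exp(-x) without losing precision when x is tiny.
    return -math.expm1(-exponent)


# One trillion hashed objects, assuming uniformly random 128-bit outputs.
p = accidental_collision_probability(10**12)
```

Under that assumption, even a trillion objects give a collision probability on the order of 1e-15. The estimate says nothing about deliberately constructed collisions, which are feasible for MD5.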

Using it is not a security issue unless you use it for things that need to be secure, in which case you should use something like sha3 or an alternative. But hash algorithms have applications beyond security-sensitive ones. AWS using sha3 for s3 object content hashes would be overkill and a waste of CPU.


Interesting.

My antivirus went off after I downloaded this; it identified the file as EICAR-AV-Test.

From a bit of googling, it seems EICAR-AV-Test is a file for testing antivirus software:

https://www.eicar.org/download-anti-malware-testfile/


People still use antivirus in 2022?? I think it's well-understood that antivirus has somehow managed to be worse than not having it, as antivirus programs themselves often have vulnerabilities that make full system compromise easier.

Note that your antivirus is also performing worse than even the average antivirus, which is already pretty bad. The EICAR test file is only meant to be detected if the file is at most 128 bytes long.


I personally do not, but my company-issued laptop does.

I would not disagree that it makes the computer worse. There was a significant performance decrease when the latest version was installed earlier this year, but this is off topic.

What I found interesting was that I didn't know about EICAR until today.



