There's a difference between bringing MD5 up to modern strength and simply blocking the attempts of attackers who've obtained this MD5 forging kit. The latter is an interesting thought experiment.
The point of a hash is to avoid feeding malicious data further into your pipeline, so solutions that involve parsing the file and hashing its actual data streams aren't a good idea. We'll focus on detecting the change before looking at the data.
To make the attack harder, use file formats without expandable or optional sections, since this attack works by parsing known file formats and making allowable changes. If you had encryption as one of your coconuts, it would make this easier by making the whole file opaque. But the encryption could also be used to generate a hash-like construction, so if you have good enough encryption you wouldn't be stuck with MD5 and there wouldn't be much thought experiment left...
To stop this specific attack, assuming you had to use these formats, using MD5(file+reversed_file) or MD5(file)+MD5(reversed_file) or MD5('secret'+file) would work. This is massively beyond what most people could make the tools do, but it's probably not that much harder cryptographically.
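For anyone who wants to play with these, here's a rough Python sketch of the three constructions. The function names are made up for illustration, '+' is read as plain byte concatenation, and the whole file is held in memory. It's only meant to pin down what each variant computes, not to suggest any of them is actually secure.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Plain MD5 of a byte string, as a hex digest."""
    return hashlib.md5(data).hexdigest()


def md5_file_plus_reversed(data: bytes) -> str:
    """MD5(file + reversed_file): one hash over the file followed by its byte-reversal."""
    return md5_hex(data + data[::-1])


def md5_file_and_reversed(data: bytes) -> str:
    """MD5(file) + MD5(reversed_file): two separate digests, concatenated."""
    return md5_hex(data) + md5_hex(data[::-1])


def md5_secret_prefix(secret: bytes, data: bytes) -> str:
    """MD5('secret' + file): the file hashed with a prefix (hypothetical secret value)."""
    return md5_hex(secret + data)


if __name__ == "__main__":
    sample = b"example file contents"
    print(md5_file_plus_reversed(sample))
    print(md5_file_and_reversed(sample))
    print(md5_secret_prefix(b"secret", sample))
```

Note that the second variant gives you a 64-character value instead of 32, so the recipient has to know to expect two digests.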
If your solution was a secret it would be pretty effective. The problem is that if you don't have encryption then they're watching you communicate. They can see the hash sum you tell the recipient to expect and, if this is the first time you've used this scheme, the steps to use to check it. But even if they don't observe this, you're only using a small set of non-crypto operations, and they can just try millions of combos (reverse the file, append a second copy, interleave bytes, etc.) and see if anything produces the same hash. Then they have discovered your algorithm and can plan to modify their tools to perform the attack.
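To make the "just try combos" point concrete, here's a toy Python sketch of that search. The transform list and the name guess_scheme are my own illustration, not anything from the article, and it only covers unkeyed byte shuffling. Given the original file and the hash they saw you publish, the attacker composes a few cheap transforms and checks which chain reproduces the hash.

```python
import hashlib
from itertools import product

# Hypothetical pool of cheap, non-cryptographic transforms an attacker might try.
TRANSFORMS = {
    "identity": lambda d: d,
    "reverse": lambda d: d[::-1],
    "double": lambda d: d + d,
    "append_reversed": lambda d: d + d[::-1],
    "swap_halves": lambda d: d[len(d) // 2:] + d[:len(d) // 2],
}


def guess_scheme(data: bytes, observed_hex: str, max_depth: int = 2):
    """Return the first chain of transform names whose MD5 matches, or None."""
    for depth in range(1, max_depth + 1):
        for names in product(TRANSFORMS, repeat=depth):
            out = data
            for name in names:
                out = TRANSFORMS[name](out)
            if hashlib.md5(out).hexdigest() == observed_hex:
                return names
    return None


if __name__ == "__main__":
    original = b"hello"
    # What the sender published, computed as MD5(file + reversed_file).
    published = hashlib.md5(original + original[::-1]).hexdigest()
    print(guess_scheme(original, published))
```

With five transforms and depth two that's only 30 candidates, so even a much bigger pool and deeper chains stay cheap compared with building the collision itself.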
Oh, I think secrecy would be against the spirit of the experiment. And yeah, I wonder what kind of mods they would need to make to their tools in response. If it takes 10 years to generate the collisions, then it seems like it might be a worthwhile approach (in the thought experiment universe only).
But we should also recognize that the article demonstrates collision attacks, not preimage attacks. A (second-)preimage attack is the one you'd want to worry about if you're using a trusted hash to verify a file you received over an untrusted channel.