If for a moment we assume that you can do it reliably (which I personally doubt, even for "simple" formats) - what's the point? Why not just hash the original file? What's the benefit here?
For example, I could create two different PNGs that decode to the same bitmap. Or I could create one PNG that decodes to multiple different bitmaps, depending on which implementation decodes it (due to implementation bugs and/or under-defined areas of the specification). Or I could create a PNG that is also a valid ZIP archive.
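To make the first case concrete, here is a minimal sketch using only the Python standard library. The helper names (`chunk`, `make_png`, `decode_scanlines`) are made up for illustration, and the "decoder" is deliberately naive: it only inflates the IDAT data, skips CRC checks, and assumes 8-bit grayscale with filter type 0 on every scanline. The only difference between the two files is the zlib compression level.

```python
import hashlib
import struct
import zlib


def chunk(kind: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def make_png(width: int, height: int, gray_rows, level: int) -> bytes:
    """Build an 8-bit grayscale PNG; `level` is the zlib compression level."""
    # IHDR: width, height, bit depth 8, color type 0 (grayscale),
    # compression 0, filter method 0, no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is prefixed with filter type 0 (None).
    raw = b"".join(b"\x00" + bytes(row) for row in gray_rows)
    return (
        b"\x89PNG\r\n\x1a\n"
        + chunk(b"IHDR", ihdr)
        + chunk(b"IDAT", zlib.compress(raw, level))
        + chunk(b"IEND", b"")
    )


def decode_scanlines(png: bytes) -> bytes:
    """Naive decode: concatenate IDAT payloads and inflate them."""
    pos, idat = 8, b""  # skip the 8-byte PNG signature
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        kind = png[pos + 4:pos + 8]
        if kind == b"IDAT":
            idat += png[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return zlib.decompress(idat)


rows = [[0, 64, 128, 255]] * 4
a = make_png(4, 4, rows, 0)  # stored (uncompressed) deflate blocks
b = make_png(4, 4, rows, 9)  # maximum compression

print(hashlib.md5(a).hexdigest() == hashlib.md5(b).hexdigest())  # file hashes
print(decode_scanlines(a) == decode_scanlines(b))                # pixel data
```

The two files differ byte-for-byte, so their file hashes differ, yet the inflated pixel data is identical. Ancillary chunks, different filter choices, or splitting IDAT differently would be other ways to get the same effect.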
There's no security benefit, and I would have a hard time coming up with a practical benefit. It's mostly just interesting to think about, especially in response to the article. The article is demonstrating fast MD5 second preimage attacks for various file formats (EDIT: apparently not preimage attacks, just collision attacks), so in response to that I'm wondering how these MD5-specific attacks might be mitigated, for fun. Consider it alternate history fiction in which we never discovered anything better than MD5 :)
In your examples, though:
> two different PNGs that decode to the same bitmap
But would the PNGs also have the same MD5 hash?
> one PNG that decodes to multiple different bitmaps, depending on which implementation decodes it
Yeah, that would be a challenge. Relying on implementation details, or results which are allowed to vary, wouldn't work. But since this is meant to supplement an existing MD5 hash, the idea is that the format consumer/interpreter would be in a good position to produce some format-aware fingerprint that is statistically likely enough to be different when the inputs are different.
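As a rough sketch of what I mean by a format-aware fingerprint supplementing the file hash: pair the MD5 of the raw bytes with an MD5 of whatever the consumer actually decodes. Everything here is an assumption for illustration. `fingerprint` is a made-up helper, and a bare zlib stream stands in for a real file format so the decoder is a single call. It isn't meant to claim any security property, just to show the shape of the idea.

```python
import hashlib
import zlib
from typing import Callable, Tuple


def fingerprint(data: bytes, decode: Callable[[bytes], bytes]) -> Tuple[str, str]:
    """Return (MD5 of the raw file bytes, MD5 of the decoded content)."""
    file_md5 = hashlib.md5(data).hexdigest()
    content_md5 = hashlib.md5(decode(data)).hexdigest()
    return file_md5, content_md5


payload = b"pixel data stand-in " * 8

# Same content, two different encodings of it.
stored = zlib.compress(payload, 0)
packed = zlib.compress(payload, 9)
# Different content.
other = zlib.compress(payload + b"!", 9)

fp_stored = fingerprint(stored, zlib.decompress)
fp_packed = fingerprint(packed, zlib.decompress)
fp_other = fingerprint(other, zlib.decompress)

print(fp_stored[0] == fp_packed[0])  # file-level hashes
print(fp_stored[1] == fp_packed[1])  # content-level hashes
print(fp_packed[1] == fp_other[1])   # content-level hashes, different payloads
```

For a colliding pair like the ones in the article (same file MD5, different rendered content), the hope is that the second component would differ. The implementation-dependent decoding problem from the quote above still applies: the second component is only as well-defined as the decoder.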
Ah, OK, I think I misunderstood the article. If you are supplying both images to me, you could do that with the MD5 hashes. Although, I think if you could get them to generate the same bitmap, then the attack has been at least partially mitigated, by definition. Not completely, I admit, but I think it wouldn't qualify as the same attack shown in the article.