I do not expect a hash function's output to be usable for 'reversing' it into an approximation of the input (which is the primary use here). Being easy to reverse is even an unacceptable property for cryptographic hash functions, which to me are hash functions in the purest form.
I would rather call this extreme lossy compression.
regular (non-secure) hash functions do two things: they compress (very lossily) and they make things that are near each other in their domain (inputs) map to things that are far apart in their codomain (outputs).
the first condition is satisfied, but the second is definitely not!
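to make the second property concrete, here is a rough sketch (not ThumbHash itself). `block_average` is a made-up toy summary that keeps nearby inputs nearby, and SHA-256 stands in for a hash that scatters them; the pixel values are invented for illustration.

```python
import hashlib

def bit_distance(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def block_average(pixels, blocks=4) -> bytes:
    """Hypothetical toy summary: integer mean of each equal-sized block."""
    size = len(pixels) // blocks
    return bytes(sum(pixels[i * size:(i + 1) * size]) // size
                 for i in range(blocks))

# Two inputs that differ by one step in a single "pixel".
a = [10, 12, 11, 13, 200, 198, 201, 199, 50, 52, 51, 49, 90, 91, 89, 90]
b = list(a)
b[0] += 1

sha_dist = bit_distance(hashlib.sha256(bytes(a)).digest(),
                        hashlib.sha256(bytes(b)).digest())
avg_dist = bit_distance(block_average(a), block_average(b))

print("SHA-256 bits changed:", sha_dist, "of 256")
print("block-average bits changed:", avg_dist, "of 32")
```

the SHA-256 digests should differ in roughly half of their 256 bits, while the block averages come out identical for this input pair.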
> > they make things that are near each other in their domain (inputs) map to things that are far apart in their codomain (outputs).
This describes only a specific subset of hash functions. *Cryptographic* hash functions map inputs to outputs with high and uniform dispersion.
So you are talking about cryptographic hashes, but different hash functions can have different properties.
ThumbHash is absolutely a hash function, which is "any function that can be used to map data of arbitrary size to fixed-size values, though there are some hash functions that support variable length output." (https://en.wikipedia.org/wiki/Hash_function)
sure. but think about it this way: most real world data is actually highly structured. in the space of bits that aren't encrypted, the real stuff lives in a very, very small subspace, and good hash functions seek to avoid collisions in their output for inputs from that subspace.
it does, it just works on estimates of percepts rather than bits.
you can think of a perceptual hash as two functions. a perceptual function that maps differing collections of bits that appear the same or similar to the same bits, and then a traditional hash function to ensure that these intermediate values get shuffled.
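a minimal sketch of that two-stage view, under assumptions of my own: `perceptual_stage` is a hypothetical stand-in (coarsely quantized block means) for a real perceptual function, and a truncated SHA-256 plays the traditional hash. it is not how ThumbHash or any particular library does it.

```python
import hashlib

def perceptual_stage(pixels, blocks=4, levels=8) -> bytes:
    """Hypothetical perceptual function: block means quantized to a few levels,
    so inputs that look alike collapse to the same intermediate bytes."""
    size = len(pixels) // blocks
    step = 256 // levels
    return bytes((sum(pixels[i * size:(i + 1) * size]) // size) // step
                 for i in range(blocks))

def perceptual_hash(pixels) -> str:
    """Second stage: a traditional hash shuffles the intermediate value."""
    return hashlib.sha256(perceptual_stage(pixels)).hexdigest()[:16]

original = [10, 12, 11, 13, 200, 198, 201, 199, 50, 52, 51, 49, 90, 91, 89, 90]
noisy = [p + 1 for p in original]   # slightly brighter copy
different = original[::-1]          # same pixels, different layout

print(perceptual_hash(original) == perceptual_hash(noisy))      # True
print(perceptual_hash(original) == perceptual_hash(different))  # False
```

the first stage decides what counts as "the same"; the second stage only spreads the resulting values out, which is why it can be dropped when you want the output to stay decodable.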
cool idea to extract one piece of the DCTs and emit a tiny low-res image though!