> How are you assuming a known dictionary and a known word count? And for the other examples, you’re somehow assuming a fully precalculated rainbow table of all possible such PII. Seriously…?
Obviously, deciphering all messages is not feasible, but I think GP's point was that you could put together a table that would recover a decent proportion of messages. For example, if you assume an alphabet of [a-zA-Z0-9 ] (63 symbols), then you'd only need to calculate ~63.5 billion hashes to cover all messages up to 6 chars. Similarly, you could use a dictionary to put together tables for short sentences.
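To sanity-check that figure, here is a quick sketch of the arithmetic. The 63-symbol alphabet is the one from the comment; the 10,000-word vocabulary in the last line is purely my own assumption, just to illustrate the dictionary case.

```python
import string

# Alphabet from the comment above: [a-zA-Z0-9 ] -> 26 + 26 + 10 + 1 = 63 symbols
ALPHABET = string.ascii_letters + string.digits + " "

def keyspace(max_len: int, symbols: int) -> int:
    """Number of distinct non-empty sequences of length 1..max_len."""
    return sum(symbols ** k for k in range(1, max_len + 1))

# Character-level: every message of up to 6 chars
print(keyspace(6, len(ALPHABET)))  # 63531945792, i.e. ~63.5 billion

# Word-level: messages of 1-3 words from a HYPOTHETICAL 10,000-word vocabulary
print(keyspace(3, 10_000))         # 1000100010000, i.e. ~1 trillion
```

The character-level table is dominated by its longest length (63^6 is about 62.5 billion of the total), so each extra character multiplies the work by roughly 63.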
Could you link to some PoC hacks that prove feasibility? I’m not saying it’s theoretically impossible; mathematically it should be possible. I’m simply saying that the message content space is massive, and hence such an attack is practically infeasible.
Again, all of this is predicated on Google hashing messages client-side and then secretly reverse-matching them server-side, and only for very simplistic 1-3 word messages?
Messages are usually composed of words, not random letters, so you don't need to brute-force all letter combinations.
Also, some messages use a template, for example, a message from an online store like "Your order number XXXXX is ready for pickup" or a message from a bank saying "Your PIN code is XXXX". In this case, all you need to guess is a number.
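A minimal sketch of the template case, to show how small the search gets. The thread doesn't say which hash function would be involved, so SHA-256 over the unsalted UTF-8 text is my assumption here, as are the helper names `sha256_hex` and `build_table`.

```python
import hashlib

# Template taken from the comment above; only the 4-digit number is unknown.
TEMPLATE = "Your PIN code is {:04d}"

def sha256_hex(message: str) -> str:
    # ASSUMPTION: unsalted SHA-256 of the UTF-8 message text
    return hashlib.sha256(message.encode("utf-8")).hexdigest()

def build_table(template: str, count: int) -> dict[str, str]:
    """Map hash -> plaintext for every number 0..count-1 filled into the template."""
    table = {}
    for n in range(count):
        plaintext = template.format(n)
        table[sha256_hex(plaintext)] = plaintext
    return table

# Only 10,000 hashes are needed to cover every possible PIN message.
table = build_table(TEMPLATE, 10_000)

# Stand-in for a hash the server might observe
observed = sha256_hex("Your PIN code is 4821")
print(table.get(observed))  # Your PIN code is 4821
```

A per-message random salt or a keyed hash would defeat a precomputed table like this, which is why the unsalted assumption matters.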