Thanks for the detail, though to be clear, I’m not doubting the mathematical the...

bscphil · on March 21, 2022

Let's assume you only care about cracking one word messages in English, and only the 10000 most common words. The time stamp is granular to the hour, and there are ~10000 hours in a year.

Easily available hardware can create 10 billion SHA-1 hashes per second. So to brute force every hash for every combination of one word message and timestamp, you need ... one hundredth of one second. No rainbow tables needed here. Just brute force it. For each hash, you just check every message hash stored on your server to find matches. If you find a match, you've uncovered the probable message text.

And so on for 2, 3 word combinations. Let's take 3 word combinations, assuming all three words are in the 10000 most commonly used English words. There are 10000^3 possible combinations of such words. Once again, the timestamp adds another 10000 possibilities (for the last year). How long does it take to brute force all 3 word combinations? One million seconds - less than two weeks. This is for a SINGLE person working on this with a SINGLE GPU. Not only will a serious attacker have better hardware, but context clues can help limit the possible options so that much longer combinations of words will be possible (if they are common combinations).

Sure, there will be slight punctuation or capitalization differences in some cases and that adds entropy, but not by enough to bother taking into consideration in terms of security.

I guess we're also assuming in all this that Google is not tagging each message with the time received on the server. Given that they already have that information, they might be able to recover the message content without needing the extra hashes for the timestamp.