Whether it happens on device or in the cloud, the advertisers need some method to match images. They might as well fully exploit their advertisees and have the client device calculate the hashes.
As to the actual matching strategy, I would guess a perceptual hash that is robust to some amount of noise and to varying resolutions. Since you are comparing repeated hashes, say at 1 Hz, you get multiple chances over the observation window to match a fingerprint against the lookup database: a 30-minute show x 60 frames per minute = 1800 hashes to make the ID.
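To make the guess concrete, here is a rough sketch of how that could work. To be clear, this is my assumption and not anything a vendor has confirmed: it uses a simple "average hash" (shrink the frame to 8x8 by block averaging, then threshold each cell against the mean) as a stand-in for whatever perceptual hash is really used, and a majority vote over the window. The names `average_hash`, `identify`, the `database` layout, and the `max_distance` cutoff are all made up for illustration.

```python
from collections import Counter


def average_hash(pixels, n=8):
    """64-bit average hash of a grayscale frame (2D list, at least n x n)."""
    h, w = len(pixels), len(pixels[0])
    cells = []
    for i in range(n):
        r0, r1 = i * h // n, (i + 1) * h // n
        for j in range(n):
            c0, c1 = j * w // n, (j + 1) * w // n
            # Block averaging makes the hash independent of input resolution.
            block = [pixels[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        # One bit per cell: brighter than the frame's mean or not.
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def identify(frame_hashes, database, max_distance=10):
    """Vote across the observation window.

    database is a hypothetical dict: show id -> list of reference hashes.
    Each observed hash votes for the nearest show, if it is close enough.
    Returns the show with the most votes, or None if nothing matched.
    """
    votes = Counter()
    for observed in frame_hashes:
        best_id, best_distance = None, max_distance + 1
        for show_id, references in database.items():
            distance = min(hamming(observed, ref) for ref in references)
            if distance < best_distance:
                best_id, best_distance = show_id, distance
        if best_id is not None:
            votes[best_id] += 1
    return votes.most_common(1)[0][0] if votes else None
```

The point of the vote is that no single frame has to match. With 1800 hashes per show, a large fraction can miss (ads, overlays, scene cuts the database never sampled) and the ID still comes out. A real system would presumably use an index instead of this brute-force scan over every reference hash.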