I wonder how it's implemented. I think it's possible to run this type of service at a very large scale for extremely low cost using AWS Lambda and CloudWatch Events.
That’s exactly what we use for a competing product. The difficulty is less in the crawling, and more in the scheduling and difference visualization (including filters). Presenting salient results without a ton of false positives is hard work.
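To make the false-positive point concrete, here is a minimal sketch of filtering noise before diffing two snapshots of a page. It is not how any particular product does it; the `NOISE_PATTERNS` list and the `normalize` and `salient_changes` helpers are hypothetical names made up for illustration, and it uses only Python's standard `difflib` and `re` modules.

```python
import difflib
import re

# Hypothetical filters (an assumption for this sketch): patterns that tend to
# change on every fetch without the page meaningfully changing.
NOISE_PATTERNS = [
    re.compile(r"\b\d{1,2}:\d{2}(:\d{2})?\b"),  # clock times such as 10:00
    re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),       # ISO dates such as 2024-01-01
]


def normalize(text):
    """Strip noisy substrings, collapse whitespace, and drop empty lines."""
    lines = []
    for line in text.splitlines():
        for pattern in NOISE_PATTERNS:
            line = pattern.sub("", line)
        line = " ".join(line.split())
        if line:
            lines.append(line)
    return lines


def salient_changes(old, new):
    """Return only the added/removed lines that survive the noise filters."""
    diff = list(difflib.unified_diff(normalize(old), normalize(new),
                                     lineterm="", n=0))
    body = diff[2:]  # skip the '---' and '+++' file header lines
    return [d for d in body if d.startswith(("+", "-"))]


old_page = "Price: $10\nUpdated 2024-01-01 10:00"
new_page = "Price: $12\nUpdated 2024-01-02 11:30"
print(salient_changes(old_page, new_page))
```

Here the changed timestamp is ignored and only the price change is reported. Real pages need far more filters than this (ads, session tokens, rotating content), which is where most of the hard work goes.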