Yes, if you really want to show something as ambitious as a sum of a million rows, then you should absolutely move to an event-based, push-based architecture. Once you’ve calculated the sum, the least work you can do is to look periodically at batches of updates to rows and adjust your sum based on those, rather than, say, running SUM over the entire table again. You can either pull a digest periodically, or you can handle incremental updates by listening for pushes.
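To make the incremental idea concrete, here is a minimal sketch in Python. The `Change` record and `apply_changes` function are hypothetical names, and it assumes each change in a batch carries the summed column's value before and after the change. How you obtain the batch (a pulled digest or a push) doesn't matter to the arithmetic.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Change:
    # Hypothetical change record (not from any particular database).
    # old/new are the summed column's value before and after the change;
    # None means the row did not exist on that side (insert or delete).
    old: Optional[float]
    new: Optional[float]


def _value(v: Optional[float]) -> float:
    # A missing row contributes nothing to the sum.
    return 0.0 if v is None else v


def apply_changes(total: float, changes: Iterable[Change]) -> float:
    """Adjust a cached sum by the net effect of a batch of row changes."""
    for change in changes:
        total += _value(change.new) - _value(change.old)
    return total


# Cached result of the one expensive full-table SUM.
total = 1_000_000.0

batch = [
    Change(old=None, new=5.0),   # row inserted
    Change(old=10.0, new=12.5),  # row updated
    Change(old=3.0, new=None),   # row deleted
]
total = apply_changes(total, batch)
```

The cost is proportional to the size of the batch, not to the size of the table.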

All the difficulties you cite, including the “optimizations” I mentioned, stem from not using push, and insisting on periodically polling something.

If you step back, your whole problem is that you are using a system that only supports pull, not push. You’re trying to put a bandaid on that core fact. You chose a relational database system that you refuse to build extensions for (as opposed to pretty much any other system) in order to insist “nyeh nyeh you haven’t solved cache invalidation”.

Caching is just one part of sync: the part that stores a non-authoritative, slightly out-of-date copy locally. And sync, or replication, is just one part of an eventually consistent system. I mean, even MySQL supports replication, so just hook your client up to that protocol (encrypted in transit) and boom, now you can update your cache.

Here’s the thing. If you use any network protocol that is open, e.g. HTTP, then yeah, it’s solved. Because you can have webhooks and just spin up a server to handle them, and update your cache. A row is removed? Decrement your sum. A row is added? Increment it.
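Something like this, as a rough sketch using only Python's standard library. The payload shape (`{"type": "row_added" | "row_removed", "amount": ...}`), the `SumCache` class and the port are all assumptions made up for illustration, not any real webhook format.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class SumCache:
    """Thread-safe, non-authoritative copy of the sum."""

    def __init__(self, initial: float = 0.0) -> None:
        self._lock = threading.Lock()
        self.total = initial

    def apply(self, event: dict) -> None:
        # Assumed (hypothetical) event shape: {"type": ..., "amount": ...}
        kind, amount = event["type"], event["amount"]
        with self._lock:
            if kind == "row_added":
                self.total += amount
            elif kind == "row_removed":
                self.total -= amount
            else:
                raise ValueError(f"unknown event type: {kind!r}")


cache = SumCache()


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            cache.apply(json.loads(self.rfile.read(length)))
            self.send_response(204)  # applied, nothing to return
        except (ValueError, KeyError, TypeError):
            self.send_response(400)  # malformed or unknown event
        self.end_headers()

    def log_message(self, *args) -> None:
        pass  # keep the sketch quiet


def serve(port: int = 8080) -> None:
    # Blocks forever; the port is an arbitrary choice for the example.
    HTTPServer(("127.0.0.1", port), WebhookHandler).serve_forever()
```

Call `serve()` to start listening. A real deployment would also need to authenticate the sender and cope with duplicate or out-of-order deliveries, which this sketch ignores.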

You are just insisting that “no, our system has to be clunky and we refuse to implement webhooks / websockets / sockets / push / a MySQL replication client / updates of any kind, now solve it for me so it’s as good as a system that has push capabilities.”