But then it’s still nice that they’re using ScyllaDB and now it’s not a concern ...

ericvolp12 · on Sept 29, 2024

The funny part is ScyllaDB still uses tombstones for deletions, though they do have configurable compaction strategies, and iirc Discord uses Scylla's Incremental Compaction Strategy, which I suppose solves the specific issue they were dealing with. iirc that compaction strategy will trigger a compaction once tombstones make up a certain fraction of a partition, and then the table is rebuilt without the tombstoned content (which effectively pauses writes on that specific node and that specific table and partition for the duration of that process). Compacting a massive partition is really expensive. Scylla defaults to warning you that a partition is too large if it has at least 100,000 rows in it. My guess is that when they moved to ScyllaDB they also adopted a new strategy for partitioning messages in a channel that keeps partition sizes reasonable, so compactions don't take a super long time.
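
To make the threshold idea concrete, here is a toy sketch of the mechanism as described above. It is not ScyllaDB's actual implementation: `Partition`, `maybe_compact`, and the 0.5 threshold are all made up for illustration, and it ignores details such as the grace period real systems wait before purging tombstones.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical sentinel: a deleted row is kept as a marker instead of being removed.
TOMBSTONE = None


@dataclass
class Partition:
    """Toy in-memory partition; illustrative only, not a real storage engine."""
    rows: Dict[int, Optional[str]] = field(default_factory=dict)

    def write(self, key: int, value: str) -> None:
        self.rows[key] = value

    def delete(self, key: int) -> None:
        # A delete writes a tombstone rather than dropping the row.
        self.rows[key] = TOMBSTONE

    def tombstone_ratio(self) -> float:
        if not self.rows:
            return 0.0
        dead = sum(1 for value in self.rows.values() if value is TOMBSTONE)
        return dead / len(self.rows)

    def compact(self) -> None:
        # Rebuild the partition without the tombstoned rows.
        self.rows = {k: v for k, v in self.rows.items() if v is not TOMBSTONE}


def maybe_compact(partition: Partition, threshold: float = 0.5) -> bool:
    """Compact once tombstones reach the (assumed) threshold fraction."""
    if partition.tombstone_ratio() >= threshold:
        partition.compact()
        return True
    return False
```

The rebuild in `compact` touches every row in the partition, which is the reason a very large partition makes this step expensive.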

jhgg · on Sept 29, 2024

We did not change schema or partitioning strategy.

sroussey · on Sept 29, 2024

Good default configurations can mean quite a lot if people don’t tune them.

roenxi · on Sept 29, 2024

I don't see anything here that looks untoward. They increased their data storage by 3 orders of magnitude and decided to use a different DB system. Fair enough, maybe they've learned more about the nature of their data.

But that logic isn't sound. When dealing with huge amounts of data there are going to be trade-offs. Picking a system that makes different trade-offs to an existing system is not automatically helpful. Yes, you don't have the old problems. However, you are about to discover new problems. There is always something of a gamble around which will be more of a problem to your business.

frr149 · on Sept 30, 2024

What's the problem with Scylla? Honest question, BTW