
That's not really what this article is about. Their problem wasn't throughput. What's the size of all the data in your MongoDB instance? And what's the latency on your reads?

In the big data world the "complexity" of the data doesn't really mean much. It's just bytes.



> What's the size of all the data in your MongoDB instance?

3x12TB

> In the big data world the "complexity" of the data doesn't really mean much.

Oh how wrong you are.

It is much easier to deal with data when the only thing you need to do is move it from A to B. Like "find who should see this message, make sure they see it".

It is very different when you have a large, rich domain model that runs tens of thousands of business rules on incoming data, where each entity can need very different processing depending on its state and the event that arrived.

I write whole applications just to data-mine our processing flow, to be able to understand a little of what is happening there.

At that traffic you can't even log anything per transaction. You have to work indirectly, through various metrics, etc.


Nice, that's a pretty decent size; still curious about the latency. That's the primary problem for a real-time chat app.

Complexity of data and running business rules on it is not a data-store problem, though; that's a compute problem. It's highly parallelizable, and compute is cheap.

For reference, my team runs transformations on about 1 PB of (uncompressed) data per day with 3 spark clusters, each with 50 nodes. We've got about 70ish PB of (compressed) data queryable. All our challenges come from storage, not compute.


The latency is a complex topic.

In order to be able to run so much stuff on MongoDB, we almost never run single queries against the database. If I fetch or insert trade data, I probably run a query for 10 thousand trades at the same time.

So as data comes in from multiple directions it gets batched (for example, 1-10 thousand items at a time), split into groups that can be processed together by a roughly similar process, and then travels the pipeline as a single batch, which is super important because it amortizes some of the costs.
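Roughly, the database end of that looks something like this (a pymongo sketch; the collection and field names are invented for illustration, not our actual schema):

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    trades = client.trading.trades

    def insert_trades(batch):
        # One round trip for the whole batch instead of one per trade.
        trades.insert_many(batch, ordered=False)

    def fetch_trades(trade_ids):
        # A single $in query replaces thousands of point lookups.
        return list(trades.find({"trade_id": {"$in": list(trade_ids)}}))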

Also, the processing pipeline has many, many steps in it. A lot of them have buffers in between so that steps don't get starved for data.

All this causes latency. I try to keep it subsecond but it is a tradeoff between throughput and latency.

It could have been implemented better, but the implementation would be complex and inflexible. I think having clear, readable, flexible implementation is worth a little bit of tradeoff in latency.

As to storage being the source of most woes, I fully agree. In our case it is trying to deal with the bloat of data caused by the business wanting to add this or that. All this data makes database caches less effective, requires more network throughput, more CPU for parsing/serializing, needs to be replicated, etc. So half the effort is constantly trying to figure out why they want to add this or that, and whether it is really necessary or can be avoided somehow.


From the top-level comment, I thought you were doing millions of QPS with a 3-node MongoDB cluster. That would be impressive.

By batching 1-10 thousand records at a time, your use case is very different from Discord's, which needs to deliver individual messages as fast as possible.


Data doesn't arrive or leave batched. This is just an internal mechanism.

Think in terms of Discord: their database probably already queues and batches writes. Or they could decide to fetch details of multiple users with a single query by noticing there are 10k concurrent requests for user details. So why run 10k queries when you could run 10 queries for 1k user objects each?
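Something like this hypothetical pymongo sketch (collection and chunk size invented for illustration):

    from pymongo import MongoClient

    users = MongoClient("mongodb://localhost:27017").app.users

    def fetch_user_details(user_ids, chunk_size=1000):
        # Coalesce 10k pending lookups into 10 queries of 1k ids each.
        ids = list(user_ids)
        found = {}
        for i in range(0, len(ids), chunk_size):
            for doc in users.find({"_id": {"$in": ids[i:i + chunk_size]}}):
                found[doc["_id"]] = doc
        return found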

If the complaint is that my process is different because I refuse to run it inefficiently when I can spot an opportunity to optimize, then yes, it is different.


Of course, Cassandra/MongoDB/etc. can perform their own batching when writing to the commit log, and can also benefit from write combining by not flushing dirty data immediately. That's beside the point.

Your use case allows you to perform batching for writes at the *application layer*, while discord's use case doesn't.


Why couldn't others with lots of traffic use a similar approach? I assume they do. Batching things like that seems like a pretty genius idea; when QPS is very high, batching (maybe waiting a few ms to fill a batch) makes a lot of sense.
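A minimal sketch of that kind of micro-batcher (pure Python stdlib; the limits are invented numbers):

    import queue
    import time

    def drain(q, max_items=1000, max_wait_s=0.005):
        # Block for the first item, then top the batch up for a few ms,
        # or until it reaches max_items, whichever comes first.
        batch = [q.get()]
        deadline = time.monotonic() + max_wait_s
        while len(batch) < max_items:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            try:
                batch.append(q.get(timeout=remaining))
            except queue.Empty:
                break
        return batch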


I don't see why Discord's case can't use the same tricks. If they have a lot of stuff happening at the same time and their application is relatively simple (in terms of the number of different types of operation it performs), then at any point in time it is bound to have many instances of the same operation being performed.

Then it is just a case of structuring your application properly.

Most applications are immediately broken, by design, by having a thread dedicated to the request/response pair. It then becomes difficult to select related work from different threads and process it together to amortize costs.

The alternative I am using is funneling all requests into a single pipeline and having that pipeline split into stages distributed over CPU cores. So a request comes in (by way of Kafka, a REST call, etc.), it is queued, it goes to CPU core #1, gets some processing there, then moves to CPU core #2, gets some other processing there, gets published to CPU core #3, and so on.

Now, each of these components can work on a huge number of tasks at the same time. For example, when the step is to enrich the data, it might be necessary to shoot a message to another REST service and wait for the response. During that time the component picks up other items and does the same.

As you see, this architecture practically begs to use batching and amortize costs.
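A toy sketch of that staged-pipeline shape (Python threads here only illustrate the structure; the GIL means real CPU parallelism would need a different runtime, and the stage functions are made up):

    import queue
    import threading

    def stage(worker, inbox, outbox):
        # Each stage owns a thread/core and passes whole batches
        # downstream, so per-item costs get amortized per batch.
        def loop():
            while True:
                batch = inbox.get()
                outbox.put([worker(item) for item in batch])
        threading.Thread(target=loop, daemon=True).start()

    q_in, q_mid, q_out = queue.Queue(), queue.Queue(), queue.Queue()
    stage(lambda t: {**t, "validated": True}, q_in, q_mid)   # core #1
    stage(lambda t: {**t, "enriched": True}, q_mid, q_out)   # core #2

    q_in.put([{"trade_id": 1}, {"trade_id": 2}])
    print(q_out.get())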


What you're describing sounds like vanilla async concurrency. I seriously doubt 'most applications' use the one-thread-per-request model at this point; most major frameworks are async now. And it's not a silver bullet either; there are plenty of articles on how single-threaded designs are sometimes a better fit for extremely high-performance apps.

After reading all of your responses, I still don't see how you think your learnings apply to Discord. They would not be able to fit the indexes in memory on MongoDB. They can't batch reads or writes at the application-server level (the latency cost for messaging is not acceptable). Millions of queries happen every second, not one-off analytical workloads. It seems these two systems are far enough apart that there is really no meaningful comparison to be made here.


I'm not sure why you're being downvoted when you're a domain expert talking about your craft. People have weird hangups on Hacker News, it seems.


Something about the tone of the messages rubs me entirely the wrong way.


It reads like justified opinions from experience. Not seeing much emotional tone in there.


Well, on one hand you've got engineers at a billion-dollar company explaining how they've solved a problem. On the other hand you've got some random commenter on HN oversimplifying a complex engineering solution.


Sounds to me like it's some "random commenter" who has solved a similar problem at a similar scale with a solution that's much simpler.


I dunno, sounds very ‘I am very smart’ to me. They may be right or they may not, but both solutions sound workable to me.

Don’t let perfect be the enemy of good, and all that. There’s enough utter garbage around to shit on.


I think you're reading into it. They are stating that the solution in the post was overengineered, and describing an alternative solution that doesn't require as much abstraction or as many resources, but is manageable for data with a much higher-dimensional structure.

The fact that you read that as "I am very smart", and that that was a reason to downvote the post, says more about you than about the person you're supposedly describing.


This just gives me Factorio flashbacks.


Again, the article is not about throughput. How fast can you search across all historical trades?


As an example, there are bitemporal queries like "for the given population of trades specified by the following rules, find the set of trades that met the rules at a particular point in time, based on our knowledge at another given point in time". Trades are also versioned (they are a stream of business events from the trading system) and then amended (each event may be amended in the future, but the older version must be preserved). Our system can also amend the data (for example, to add some additional data to a trade later). All this makes a trade a tree of immutable versions you need to comb through. A trade can have anywhere from 1 to 30k versions.

This takes about 20 seconds. The process opens about 200 connections to the cluster and transfers data at about 2-4 GB/s.
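For a rough idea of what such an "as of" query can look like, here is a pymongo aggregation sketch; the collection name, fields, and the flat recorded_at/effective_at encoding are assumptions for illustration, not our actual version-tree schema:

    from datetime import datetime, timezone
    from pymongo import MongoClient

    versions = MongoClient("mongodb://localhost:27017").trading.trade_versions

    def as_of(valid_time, knowledge_time):
        # For each trade, keep the newest version that had been recorded
        # by knowledge_time and was effective at or before valid_time.
        return versions.aggregate([
            {"$match": {"recorded_at": {"$lte": knowledge_time},
                        "effective_at": {"$lte": valid_time}}},
            {"$sort": {"trade_id": 1, "recorded_at": -1}},
            {"$group": {"_id": "$trade_id",
                        "version": {"$first": "$$ROOT"}}},
        ])

    cursor = as_of(datetime(2021, 6, 30, tzinfo=timezone.utc),
                   datetime(2021, 12, 31, tzinfo=timezone.utc))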


Thanks for all the great insight.

Did you consider an explicitly bitemporal database like crux[0]?

[0] https://github.com/juxt/crux?


Our application predates crux, so I can't comment on it; I did not know about it when I was actively searching for a product to do this.


How many simultaneous queries of that nature can the system handle?


Are you not sure that financial data "with hundreds of fields" is more complex than chat data, which has relatively linear threading and only a handful of fields?


I'm asking about how your system scales with the number of queries, but you seem to be taking every question personally. You seem to really want to make sure everyone knows that you think Discord's problems are easy to solve. I'm not saying Discord is more complicated, but you're not really giving enough information to prove that Discord's problems are a subset of yours.

Do you support more simultaneous queries than Discord?


I think you need to recheck my username; you have mistaken me for another poster in the thread.


Actually, our threading is quite simple.

There are exactly as many threads (that do anything) as CPU cores.


I meant threading as in the connections or links between message data.

All Discord needs to store is:

{ channel_id, message_id, user_id, content, reply_to, datetime }

These days they've added an extra thread_id field, sure. But the data itself is blisteringly uncomplex, and there is only a single way to display it (ordered by time, i.e. the 'thread').


I think parent meant threading as in message threads.


Just recently, Symphony switched the order in which I received two messages from a colleague. This completely changed the meaning of the message, and we got into an argument that was only resolved after I posted a screenshot of what he wrote.

It seems the threading might not be that simple after all.



