
Latency is a complex topic.

To be able to run so much on MongoDB, we almost never run single queries against the database. If I fetch or insert trade data, I probably run a query for 10 thousand trades at once.

So what happens is: as data comes in from multiple directions, it gets batched (for example, 1-10 thousand items at a time), split into groups that can be processed in roughly the same way, and then travels the pipeline as a single batch. That is super important, because it lets you amortize some of the costs.

Also, the processing pipeline has many, many steps. A lot of them have buffers in between so that steps don't get starved for data.
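A minimal sketch of that idea: two stages decoupled by a bounded buffer, each on its own thread. The stage functions and buffer size here are illustrative, not the actual implementation being described.

```python
import queue
import threading

SENTINEL = object()  # marks end of the stream

def run_pipeline(items, stage1, stage2, buffer_size=1000):
    """Feed items through stage1 -> bounded buffer -> stage2; return stage2 outputs."""
    buf = queue.Queue(maxsize=buffer_size)  # buffer keeps stage2 from starving
    results = []

    def producer():
        for item in items:
            buf.put(stage1(item))
        buf.put(SENTINEL)

    def consumer():
        while True:
            item = buf.get()
            if item is SENTINEL:
                break
            results.append(stage2(item))

    t1 = threading.Thread(target=producer)
    t2 = threading.Thread(target=consumer)
    t1.start(); t2.start()
    t1.join(); t2.join()
    return results
```

With more stages you simply chain more queues; the bounded size is what provides backpressure between them.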

All this adds latency. I try to keep it subsecond, but it is a tradeoff between throughput and latency.

It could have been implemented with lower latency, but that implementation would be complex and inflexible. I think a clear, readable, flexible implementation is worth a little bit of a tradeoff in latency.

As to storage being the source of most woes, I fully agree. In our case it is dealing with data bloat caused by the business wanting to add this or that. All this data makes database caches less effective, requires more network throughput and more CPU for parsing/serializing, needs to be replicated, etc. So half the effort is constantly figuring out why they want to add this or that, and whether it is really necessary or can be avoided somehow.




I thought you were doing millions of QPS with a 3-node MongoDB cluster, going by the top-level comment. That would be impressive.

By batching 1-10 thousand records at a time, your use case is very different from Discord's, which needs to deliver individual messages as fast as possible.


Data doesn't arrive or leave batched. That is just an internal mechanism.

Think in terms of Discord: their database probably already queues and batches writes. Or they could decide to fetch details of multiple users with a single query by noticing there are 10k concurrent requests for user details. Why have 10k queries when you could have 10 queries for 1k user objects each?
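A sketch of that coalescing idea: collapse many pending lookups into a few bulk queries. The chunk size and the `fetch_many` callback are illustrative stand-ins (in MongoDB, `fetch_many` would be a single `$in` query).

```python
def chunked(ids, size):
    """Yield consecutive slices of `ids` of at most `size` elements."""
    for i in range(0, len(ids), size):
        yield ids[i:i + size]

def fetch_users_bulk(pending_ids, fetch_many, chunk_size=1000):
    """Turn N one-off lookups into ceil(N/chunk_size) bulk queries."""
    users = {}
    for chunk in chunked(sorted(set(pending_ids)), chunk_size):
        # fetch_many(chunk) stands in for one query with an IN / $in clause
        users.update(fetch_many(chunk))
    return users
```

Deduplicating first also means 10k concurrent asks for the same hot user cost one lookup, not 10k.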

If you complain that my process is different because I refuse to run it inefficiently when I can spot an opportunity to optimize, then yes, it is different.


Of course, Cassandra/MongoDB/etc. can perform their own batching when writing to the commit log, and can also benefit from write combining by not flushing dirty data immediately. That's beside the point.

Your use case allows you to perform batching for writes at the *application layer*, while Discord's use case doesn't.


Why couldn't others with lots of traffic use a similar approach? I assume they do. Batching like that seems like a pretty clever idea; especially when QPS is very high, batching (maybe waiting a few ms to fill a batch) makes a lot of sense.
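The "wait a few ms to fill a batch" idea can be sketched as a size-or-time batcher: emit when the batch is full or when the deadline passes, whichever comes first. The function name and parameters here are illustrative.

```python
import queue
import time

def collect_batch(q, max_size=1000, max_wait=0.005):
    """Block for the first item, then gather more until full or timed out."""
    batch = [q.get()]                          # wait for at least one item
    deadline = time.monotonic() + max_wait
    while len(batch) < max_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(q.get(timeout=remaining))
        except queue.Empty:
            break                              # deadline hit with a partial batch
    return batch
```

Under high QPS the batch fills almost instantly, so the extra latency is near zero; under low QPS you pay at most `max_wait`.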


I don't see why Discord's case can't use the same tricks. If they have a lot happening at the same time, and their application is relatively simple (in terms of the number of different types of operations it performs), then at any point in time it is bound to have many instances of the same operation in flight.

Then it is just a case of structuring your application properly.

Most applications are immediately broken, by design, by dedicating a thread to each request/response pair. It then becomes difficult to take processing from different threads and select it to be handled together, so you can't amortize costs.

The alternative I am using is funneling all requests into a single pipeline and splitting that pipeline into stages distributed over CPU cores. So a request comes in (by way of Kafka, a REST call, etc.), it is queued, it goes to CPU core #1 and gets some processing there, then moves to CPU core #2 for some other processing, then gets published to CPU core #3, and so on.

Now, each of these components can work on a huge number of tasks at the same time. For example, when the step is to enrich the data, it might be necessary to send a message to another REST service and wait for the response. During that time the component picks up other items and does the same.

As you can see, this architecture practically begs you to batch and amortize costs.


What you're describing sounds like vanilla async concurrency. I seriously doubt 'most applications' use the one-thread-per-request model at this point in time, most major frameworks are async now. And it's not a silver bullet either, plenty of articles on how single-thread is sometimes a better fit for extremely high-performance apps.

After reading all of your responses, I still don't see how you think your learnings apply to Discord. They would not be able to fit the indexes in memory on MongoDB. They can't batch reads or writes at the application-server level (the latency cost for messaging is not acceptable). Millions of queries happen every second, not one-off analytical workloads. It seems these two systems are far enough apart that there is really no meaningful comparison to be made here.


I'm not sure why you're being downvoted when you're a domain expert talking about your craft. People have weird hangups on Hacker News, it seems.


Something about the tone of the messages rubs me entirely the wrong way.


It reads like justified opinions from experience. Not seeing much emotional tone in there.


Well, on one hand you've got engineers at a billion-dollar company explaining how they've solved a problem. On the other hand you've got some random commenter on HN over-simplifying a complex engineering solution.


Sounds to me like it's some "random commenter" who has solved a similar problem at a similar scale with a solution that's much simpler.


I dunno, sounds very ‘I am very smart’ to me. They may be right or they may not, but both solutions sound workable to me.

Don’t let perfect be the enemy of good, and all that. There’s enough utter garbage around to shit on.


I think you're reading into it. They are stating that the solution in the post was overengineered, and describing an alternative solution that doesn't require as much abstraction or as many resources, but is still manageable for data with a much higher-dimensional structure.

The fact that you read that as "I am very smart", and took that as a reason to downvote the post, says more about you than about the person you're supposedly describing.


This just gives me Factorio flashbacks.



