More

nXqd · 2025-04-08T19:50:33 1744141833

how is it compared to ag-grid chart?

evaneykelen · 2025-04-08T20:23:51 1744143831

We did not evaluate libraries that have a paid plan

aargh_aargh · 2025-04-08T21:38:45 1744148325

What about Vega?

nXqd · on Nov 22, 2024

bros keep eating ...

nXqd · on July 13, 2024

Tokio focuses on being high throughput as default, since they mostly use yield_now backoff strategy. It should work with most application.

For latency sensitive application, it tends to have different purpose which mainly trade off CPU and RAM usage for higher low latency ( first ) and throughput later.

nicholassm83 · on July 13, 2024

I agree, the disruptor is more about low latency. And the cost is very high: a 100% utilized core. This is a great trade-off if you can make money by being faster such as in e-trading.

_3u10 · on July 13, 2024

High throughput networking does the same thing, it polls the network adapter rather than waiting for interrupts.

The cost is not high, it's much less expensive to have a CPU operating more efficiently than not processing anything because its syncing caches / context switching to handle an interrupt.

These libraries are for busy systems, not systems waiting 30 minutes for the next request to come in.

Basically, in an under utilized system most of the time you poll there is nothing wasting CPU for the poll, in an high throughput system when you poll there is almost ALWAYS data ready to be read, so interrupts are less efficient when utilization is high.

nine_k · on July 13, 2024

Running half the cores of an industrial Xeon or Zen under 100% load implies very serious cooling. I suspect that running them all at 100% load for hours is just infeasible without e.g. water cooling.

wmf · on July 13, 2024

Nah, it will just clock down. Server CPUs are designed to support all cores at 100% utilization indefinitely.

Of course you can get different numbers if you invent a nonstandard definition of utilization.

nine_k · on July 13, 2024

Of course server CPUs can run all cores at 100% indefinitely, as long as the cooling can handle it.

With 300W to 400W TDP (Xeon Sapphire 9200) and two CPUs per typical 2U case, cooling is a real challenge, hence my mention of water cooling.

wmf · on July 13, 2024

I disagree. Air cooling 1 KW per U is a commodity now. It's nothing special. (Whether your data center can handle it is another topic.)

lordnacho · on July 13, 2024

Suppose I have trading system built on Tokio. How would I go about using this instead? What parts need replacing?

Actually looking at the code a bit, it seems like you could replace the select statements with the various handlers, and hook up some threads to them. It would indeed cook your CPU but that's ok for certain use cases.

nicholassm83 · on July 13, 2024

I would love to give you a good answer but I've been working on low latency trading systems for a decade so I have never used async/actors/fibers/etc. I would think it implies a rewrite as async is fundamentally baked into your code if you use Tokio.

lordnacho · on July 13, 2024

Depends on what "fundamental" means. If we're talking about how stuff is scheduled, then yes of course you're right. Either we suspend stuff and take a hit on when to continue, or we hot-loop and latency is minimized at the cost of cooking a CPU.

But there's a bunch of stuff that isn't that part of the trading system, though. All the code that deals with the format of the incoming exchange might still be useful somehow. All the internal messages as well might just have the same format. The logic of putting events on some sort of queue for some other worker (task/thread) to do seems pretty similar to me. You are just handling the messages immediately rather than waking up a thread for it, and that seems to be the tradeoff.

_3u10 · on July 13, 2024

These libs are more about hot paths / cache coherency and allowing single CPU processing (no cache coherency issues / lock contention) than anything else. That is where the performance comes from, referred to as "mechanical sympathy" in the original LMAX paper.

Originally computers were expensive, and lots of users wanted to share a system, so a lot of OS thought went into this, LMAX flips the script on this, computers are cheap, and you want the computer doing one thing as fast as possible, which isn't a good fit for modern OS's that have been designed around the exact opposite idea. This is also why bare metal is many times faster than VMs in practice, because you aren't sharing someone else's computer with a bunch of other programs polluting the cache.

lordnacho · on July 13, 2024

Yeah, I agree. But the ideas of mechanical sympathy carry over into more than one kind of design. You can still be thinking about caches and branch prediction while writing things in async. It's just the awareness of it that allows you to make the tradeoffs you care about.

Quekid5 · on July 14, 2024

Eh... not really. The main problem is that it becomes incredibly hard to reason about the exact sequencing of things (which matters a lot for mechanical sympathy) in async world.

kprotty · on July 13, 2024

Tokio's focus is on low tail-latencies for networking applications (as mentioned). But it doesn't employs yield_now for waiting on a concurrent condition to occur, even as a backoff strategy, as that fundamentally kills tail-latency under the average OS scheduler.

nXqd · on June 4, 2024

last time I read, simdjson still beat protocol buffer a bit.

lifthrasiir · on June 4, 2024

Due to the inherent constraint of JSON, the exact use case would matter a lot for such comparisons. Simdjson is generally faster when you only want a well-formedness check or a very small proportion of a large input JSON, but the "well-formedness" for JSON would be a small subset of the "well-formedness" of binary formats, and a partial parsing performance is often dominated by language bindings instead of underlying parsers (it is a wise move that simdjson also has a JSON pointer support for this reason, because that greatly reduces FFI overhead). Binary formats in comparsion tend to have a generally flat performance profile.

nXqd · on May 29, 2024

in rust you can control the layout and alignment of fields in a struct with `#[repr(...)]`

nXqd · on May 23, 2024

We shouldn't overestimtae the complexity of expressive language like Clojrue vs Go. Go is perfectly simple and easy to follow, even compared with Clojure. If you come from background of Computer Science or programming first, Go is easier to follow.

kaba0 · on May 23, 2024

I always feel this comparison is a fallacy. Assembly is also perfectly simple and easy to follow, like each instruction is trivial. Yet you will fail to grasp the whole, as “you are zoomed in too close”. I feel go has a good scale for many kind of tasks, but at the same time, it lacks the expressivity to change the zoom level, which I feel is needlessly limiting and makes the language a bad language for problems that require a slightly higher level of abstraction.

alabhyajindal · on May 23, 2024

I tried Go for a while after coming from JavaScript and didn't feel productive enough. I think this explains why exactly.

nXqd · on May 23, 2024

I do switch between Clojure and Rust depends on the problems I solve at hands ( prototypes vs building production ). And yes I need go through my checklist of each to switch from connecting data flows at high level to examine nuts and bolts at low level.

nXqd · on May 9, 2024

Impressive, I believe this is so true in every scenario. You cannot just force people to work, a break between hard problems works very well.

nXqd · on April 5, 2024

this, same for what is the programming language to build your startup

nXqd · on March 14, 2024

this is pretty nice and can lead to modular unikernel. however, there is some application we need patch to remove OS specific code.

eyberg · on March 14, 2024

This is true. There is quite a lot of applications that you can just run out of the box but then I'll give you two cases where that won't be the case and do require patches:

1) In many interpreted languages it is common to have convenience commands that shell out to call a script which shells out to call another script and 8 layers later you get to the actual real command. When I'm creating a package for an application that does this I have to usually figure out what env vars are being set, what paths are being changed and so forth. This is probably a super easy thing for whoever made the software to begin with but not so easy for someone that wants to just use it. So the solution is to make the original author aware that it might go into an unikernel environment or, far less probable, convince them that a better method would be to not do this practice to begin with as it.

2) In older software (specifically I'm looking at mid-to-late 90s), in a time before threads and commodity SMP machines and the cloud it was pretty common to write software that used multiple processes to do many things. Postgres is the most common example I use here (keep in mind postgres is descended from Ingres from the early 80s and Dr. Stonebraker is now on his tenth? twentieth? database venture DBOS (https://www.dbos.dev/) - which definitely has ideas that we are very keen about.

Anyways, that's not really the case today with go, rust, java, etc. For apps like this we will, from time to time port them. That's exactly what we did with postgres too to make it multi-threaded instead: https://repo.ops.city/v2/packages/francescolavra/postgres/16... .

I think there is a lot of opportunity out there for individuals to come in and create newer versions of software like this and get some really awesome benefits while maintaining more or less the same parity.