I work with Wael. Development is still ongoing. One implementation uses Golang, the other uses F# with a library that wraps libuv for faster network performance. Pony was used to write the stress-testing client for both implementations.
I mentioned this elsewhere in the thread, but since you'll see the reply here: look into Vert.X if you haven't already.
It already does most of what you want and has support for native epoll transport.
I'm not sure what led to Vert.X being discarded; maybe you're not a Java shop? But we've used it extremely successfully for high-performance REST, and I know of several high-profile tech companies that swear by it.
There's nothing I know of that compares with Vert.X in performance, stability, and popular adoption.
Thanks for the recommendation! However, there are a few reasons why Vert.X wasn't considered, the biggest ones being that we're not a Java shop and the service in this blog post isn't HTTP/REST.
While the bandwidth benchmarks are fun to see and write blog posts about, at this point we care more about keeping tail latency at or below 2ms than about getting more bandwidth.
Ah yes. The GC latency could be a problem. Java may still be viable with the new ZGC garbage collector. And Vert.X uses Netty underneath, which is mostly protocol-agnostic.
Still, those issues and not being a Java shop make Vert.X/Netty likely a bad fit.
Thanks for replying with a well-thought-out response!
Have you considered Nim? You can achieve some really high performance with it[1]. Since you've considered Rust, Go, C and even Pony, Nim should really be on your list.
Also, why is every immature native language considered, but the two speed demon languages without garbage collection - ISPC and C++ - are nowhere to be found?
Was Pony considered for the implementation itself? Other than the immature ecosystem it seems like a perfect fit here. Awesome write-up, and it sounds like a fun job.
> "I didn't want to rewrite everything from scratch, and definitely, I didn't want to handle all edge cases for epoll. My choice was to use libuv. The architecture I opt for: use 16 cores out of 40 for networking, having 16 'uv_loop' each running on its own thread. Callbacks will be passed from F# to each 'uv_loop' instance. The event loop will call them after parsing the bid request in C11."
Looks like libuv is used directly in C11? (Not F#, as it said before the edit.)
> The solution was to Marshal calls from F# to libuv and achieve 5 Millions (at least) bid requests/s on 16 threads (this solution scales with cores/NICs).
Is it Golang or Pony or F#? The CoreFX mention at the end confused me even more.