Hacker News | alexdns's comments

"another few hundred USD for 8TB in NVMe SSD" lol

If we measured sophistication as access to a toilet, then yes, Japan is 10000x more sophisticated than India.


Correct. You need at least 2 ifs.


You need unbounded recursion no?


That's actually what two ifs could be.


Was the case against the goto statement so good we can't mention it?


More or less, I meant how this would be inlined in assembly with a goto that could goto back where the branching originated from.


What's your chain of thought here? A company that has nothing to do with Azure is down because Azure got DDoSed 2 weeks ago?


Maybe that any actor sophisticated enough to take down Azure might also target Cloudflare?


Especially when the next happens to be a major DDOS mitigator.


Well, it's not silent. Those panels feed into MPPTs, which produce noise when high currents flow through them to charge batteries (if they don't export directly). If they do export directly, there is noise from the inverters converting DC to AC.


But is it honestly enough to notice if you live half a mile away? Couldn't they just put up sound damping like the oil rigs do?


Well, it depends on where they are. They might be obligated to put it up due to some noise pollution law, or they might not care because there is no such law.


It was considered innovative when it was first shared here eight years ago.


Anything more innovative happened since (honestly curious)?


I don't think so, but my guess is raw performance rarely matters in the real world.

I once explored this, hitting around 125K RPS per core on Node.js. Then I realized it was pointless: the moment you add any real work (database calls, file I/O, etc.), throughput drops below 10K RPS.


It's always a matter of chasing the bottleneck. It's fair to say that network isn't the bottleneck for most applications. Heuristically, if you're willing to take on the performance impacts of a GC'd language you're probably already not the target audience.

Zero copy is the important part for applications that need to saturate the NIC. For example Netflix integrated encryption into the FreeBSD kernel so they could use sendfile for zero-copy transfers from SSD (in the case of very popular titles) to a TLS stream. Otherwise they would have had two extra copies of every block of video just to encrypt it.
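As a rough illustration of the sendfile idea mentioned above, here is a minimal Python sketch. It only shows the plain (unencrypted) case over a local socket pair; the in-kernel TLS part is not shown. The function names `send_file_zero_copy` and `demo` and the payload are hypothetical, chosen for this example.

```python
import os
import socket
import tempfile


def send_file_zero_copy(sock: socket.socket, path: str) -> int:
    """Send the whole file at `path` over `sock` and return the bytes sent.

    socket.sendfile() uses os.sendfile() where the platform supports it, so
    the kernel moves data from the page cache to the socket without copying
    it through a user-space buffer. Elsewhere it falls back to plain send().
    """
    with open(path, "rb") as f:
        return sock.sendfile(f)


def demo() -> bytes:
    """Push a small temporary file through a local socket pair."""
    payload = b"video-block " * 256  # small enough to fit in the socket buffer
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    sender, receiver = socket.socketpair()
    try:
        sent = send_file_zero_copy(sender, path)
        sender.close()  # signal end of stream to the receiver
        chunks = []
        while True:
            chunk = receiver.recv(65536)
            if not chunk:
                break
            chunks.append(chunk)
        assert sent == len(payload)
        return b"".join(chunks)
    finally:
        sender.close()
        receiver.close()
        os.unlink(path)


if __name__ == "__main__":
    print(len(demo()), "bytes received")
```

The application never reads the file contents itself; it only hands two descriptors to the kernel, which is the property that matters when trying to saturate a NIC.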

Note however that their actual streaming stack is very different from the application stack. The constraint isn't strictly technical: ISP colocation space is expensive, so they need to have the most juiced machines they can possibly fit in the rack to control costs.

There's an obvious appeal to accomplishing zero-copy by pushing network functionality into user space instead of application functionality into kernel space, so the DPDK evolution is natural.


TCP is generally zero-copy now. Zero-copy with io_uring is also possible.

AF_XDP is also another way to do high-performance networking in the kernel, and it's not bad.

DPDK still has a ~30% advantage over an optimized kernel-space application with a huge maintenance burden. A lot of people reach for it, though, without optimizing kernel interfaces first.


The goal of this kind of system is not to replace the application server. This is intended to work on the data plane where you do simple operations but do them many times per second. Think things like load balancers, cache servers, routers, security appliances, etc. In this space Kernel Bypass is still very much the norm if you want to get an efficient system.


> In this space Kernel Bypass is still very much the norm if you want to get an efficient system.

Unless you can get an ASIC to do it, then the ASIC is massively preferable; just the power savings generally¹ end the discussion. (= remove most routers from the list; also some security appliances and load balancers.)

¹ exceptions confirm the rule, i.e. small/boutique setups


ASICs require years to develop and aren’t flexible once deployed


You don't develop an ASIC to run a router with, you buy one off the shelf. And the function of a router doesn't exactly change day by day (or even year by year).


Change keeps coming, even when the wire format of a protocol has ossified. I've spent years in security and router performance at Cisco, wrote a respectable fraction of the flagship's L3 and L2-L3 (tun) firewall. I merged a patch on this tried-and-true firewall just this year; it's now deployed.

As vendors are eager to remind us, custom silicon to accelerate everything from L1 to L7 exists. That said, it is still the case in 2025 that the "fast path" data-plane will end up passing either nothing or everything in a flow to the "slow path" control-plane, where the most significant silicon is less 'ASIC' and more 'aarch64'.

This is all to say that the GP's comments are broadly correct.


My colleagues are always writing new features for our edge and core router ASICs released more than 10 years ago. They ship new software versions multiple times a year. It is highly specialised work and the customer requesting the feature has to be big enough to make it worth-while, but our silicon is flexible enough to avoid off-loading to slow CPUs in many cases. You get what you pay for.


Even the ones supporting things like P4?


We do storage systems and use DPDK in the application; when the network IS the bottleneck it is worth it. Saturating two or three 400 Gbps NICs is possible with DPDK and the right architecture that makes the network the bottleneck.


Storage and databases don't have to be that slow; that's just architecture. I have database servers doing 10M RPS each, which absolutely will stress the network.

We just do the networking bits a bit differently now. DPDK was a product of its time.


What DB engine is it? What hardware?



Sorry, but when one thing in your benchmark has 50x the performance of the baseline, you probably have a bad baseline. If you used AI to write these, it probably did not use io_uring or aio correctly, or you have some sort of system misconfiguration. You may have also failed to bypass the filesystem with those methods, which would explain a lot of the discrepancy.


You can apparently do 100gbit/sec on a single thread over Ethernet with io_uring.


Recently did 400gb/s on a single core / 4x100gb NICs (or just the one 400g NIC, too) with DPDK. Mind you, it's with jumbo frames and constant packet size for hundreds of mostly synchronized streams... You won't process each packet individually, mostly put them in queues for later batch processing by other cores. Amazing for data acquisition applications using UDP streams.
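To see why jumbo frames matter so much for that number, here is a back-of-the-envelope sketch in Python. The 9000-byte MTU is an assumption (the comment does not state the frame size), VLAN tags are ignored, and the function names are made up for this example.

```python
ETH_HEADER_AND_FCS = 18  # 14-byte Ethernet header + 4-byte frame check sequence
PREAMBLE_AND_GAP = 20    # 8-byte preamble/SFD + 12-byte inter-frame gap


def packets_per_second(link_bps: float, frame_bytes: int) -> float:
    """Line-rate packet count for a given frame size (header and FCS included)."""
    wire_bits = (frame_bytes + PREAMBLE_AND_GAP) * 8
    return link_bps / wire_bits


def jumbo_frame_bytes(mtu: int = 9000) -> int:
    """Frame size on the wire for an assumed jumbo MTU."""
    return mtu + ETH_HEADER_AND_FCS


if __name__ == "__main__":
    link = 400e9  # 400 Gb/s
    jumbo = packets_per_second(link, jumbo_frame_bytes())
    minimal = packets_per_second(link, 64)  # minimum-size Ethernet frames
    print(f"jumbo frames:  {jumbo / 1e6:.2f} Mpps")
    print(f"64-byte frames: {minimal / 1e6:.2f} Mpps")
```

Under these assumptions a single core has to handle roughly 5.5 million packets per second with jumbo frames, versus roughly 595 million per second with minimum-size frames, which is why per-packet cost dominates once packets get small.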

I keep watching and trying io_uring and still can't make it work as fast with simple code as consistently for those use cases. AF_XDP gets me partly there but then you're writing ebpf... might as well go full-dpdk.

Maybe it's a skill issue on my part, though. Or just a well-fitting niche.


Sounds super cool but dpdk sounds like it won't be worth the difficulty from what I read so far.

I also want to get into socket io using io_uring in zig. I'll try to apply everything I found in liburing wiki [0] and see how much I can get (max hardware I have is 10gbit/s).

Seems like there are:

- multi-shot requests
- register_napi on the uring instance
- zero-copy receive/send (probably won't be able to get into it)

Did you already try these or are there other configurations I can add to improve it?

[0]: https://github.com/axboe/liburing/wiki/io_uring-and-networki...


I ... kind of agree with the difficulty. I don't get it: DPDK is at its core really not a complex API! Allocate a pool of buffers, and in an infinite loop, ask your NIC to fill these buffers. There. After that, yes, you have to decap every packet (Ethernet then IP, don't forget reassembly, then whatever you have on top: UDP is absolutely no effort, TCP... not so). It's wholly manageable for anyone who knows a bit of light C++ (more C-like) and the lower layers (and can parse the sometimes very dry and cryptic docs for all the utility functions). Interaction with the actual consumer of the data can be done with DPDK-provided primitives or simple shared memory... it's really not hard for a mid-level systems programmer. But I still find myself unable to hire people who can work at that level of the stack, which is a bit baffling. I can't see how they'd be better with io_uring or AF_XDP and all their inherent complexity. Anything harder than a socket and epoll and you're a wizard now...
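For a feel of the decap step described above, here is a minimal sketch in Python rather than DPDK's C API. It takes a raw frame as bytes (standing in for one filled buffer), peels Ethernet, then IPv4, then UDP, and simply gives up on fragments instead of reassembling them. `decap_udp` and `build_frame` are hypothetical helpers written for this example; checksums are not verified.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
IPPROTO_UDP = 17


def decap_udp(frame: bytes):
    """Return (src_port, dst_port, payload) for an Ethernet/IPv4/UDP frame, else None."""
    if len(frame) < 14 + 20:
        return None
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return None
    ip = 14  # IPv4 header starts right after the Ethernet header
    if frame[ip] >> 4 != 4:
        return None
    ihl = (frame[ip] & 0x0F) * 4  # IPv4 header length in bytes
    if ihl < 20:
        return None
    (flags_frag,) = struct.unpack_from("!H", frame, ip + 6)
    if flags_frag & 0x3FFF:  # more-fragments bit or nonzero offset
        return None  # would need reassembly, which this sketch skips
    if frame[ip + 9] != IPPROTO_UDP:
        return None
    udp = ip + ihl
    if len(frame) < udp + 8:
        return None
    src_port, dst_port, udp_len = struct.unpack_from("!HHH", frame, udp)
    if udp_len < 8 or len(frame) < udp + udp_len:
        return None
    return src_port, dst_port, frame[udp + 8 : udp + udp_len]


def build_frame(payload: bytes, src_port: int, dst_port: int) -> bytes:
    """Build a test frame with zeroed MAC addresses and checksums (illustration only)."""
    eth = bytes(12) + struct.pack("!H", ETHERTYPE_IPV4)
    udp = struct.pack("!HHHH", src_port, dst_port, 8 + len(payload), 0) + payload
    ip = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(udp), 0, 0x4000,  # version/IHL, TOS, length, id, DF flag
        64, IPPROTO_UDP, 0,                 # TTL, protocol, checksum left at zero
        bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]),
    )
    return eth + ip + udp


if __name__ == "__main__":
    print(decap_udp(build_frame(b"hello", 1234, 5678)))
```

In a real poll-mode loop the same parsing would run over each buffer in a received burst, with the payload handed to a queue for another core.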

One other big plus of DPDK for me is the low-level access to hardware offload. GPUDirect (when you can get it to work), StorageDirect or most of the available DMA engines in some (not so) high-end hardware. The flow API on mellanox hardware is the basis of many of my multi-accelerator applications (I wish they supported P4 for packet format instead, or just open-source whatever low-level ISA the controller is running, but I don't buy enough gear to have a voice). Perusing the DPDK documentation can give ideas.

So, yes, very low-level with some batteries included. Good and stable for niche uses. But far smaller hiring pool (is the io_uring-100Gb pool bigger ? I don't know).


You don't even need io_uring for 10 gbit/s, epoll will do that easily, unless you have very niche workload.


For UDP Pixelflut, I was able to send 8Gbps on a 10Gbps link with a single thread running a tight loop doing byte shuffling and then sendmmsg. I didn't bother to multithread it because that's a convenient amount of headroom left over for actual communications.


Any numbers for io_uring with 4x100gb nics in your tests?


Of course, but when working directly with the NIC, such speeds can be achieved with smaller packets, getting even closer to the linerate.


Well, io_uring came along and removed a lot of the incentive.


It is illegal. If you have the trademark, you can use the Uniform Domain-Name Dispute-Resolution Policy to get it back.


DENIC (who controls .de) has its own (very... German) dispute system: https://www.denic.de/en/service/dispute that they would rather you use. It requires mailing a form to request information, then mailing a form to send a dispute (don't be fooled by the web forms or PDFs, those are just tools to generate the documents you need to print, sign, and send).

Their FAQ lists why you usually don't want to go for UDRP when you can help it, though: https://www.denic.de/en/faqs/all-faq#code-106

ICANN's procedures are all nice and dandy if all three parties involved are in the USA, but when it comes to international disputes (in this case the Dutch company registering the domains and the German business being impersonated), things can get pretty complex and expensive real fast.


For any TLD/ccTLD, including .bg, if you have a trademark you can get the domain even if it's registered by someone else. It's a very simple appeal process: the Uniform Domain-Name Dispute-Resolution Policy.


Looks fake because the supposed LLM added ! after BANANA at the beginning


It is a one-word opener for an addressal, so adding an exclamation mark indeed sounds like something a human would do. Of course, that means an AI is likely to do it too. If they weren't good at sounding like humans, we wouldn't be talking about this.


Globally, Android has a 70% market share.


Only due to cost. The majority of that share is not devices that are capable of doing what would be required for an app like Kino to work with these features. The share of flagship Android devices worldwide is lower than flagship iOS devices.

