Hacker News
Millions of active WebSockets with Node.js (unetworkingab.medium.com)
94 points by selvan on Feb 21, 2023 | 70 comments


> The theoretical limit is 65k connections per IP address but the actual limit is often more like 20k, so we use multiple addresses to connect 20k to each (50 * 20k = 1 mil).

That's 65k per CLIENT IP address (i.e. one connection per client IP:port pair). There should be no reason to use multiple IP addresses on the server.

If (for some reason??) you need each client to have more than 65k connections, you can add a port instead of an IP

> The client side has similar settings but does not need to set up multiple IP addresses, obviously.

You got this backwards and are now having to use a pool of server IPs to connect to instead of a single one...


The 65k limit is because of the port limit. WebSocket uses the “trick” of holding a connection open without answering until the server has news to give; this way the messages are more push-like.

How can this be fixed without more IP-addresses?


Connections sitting in the accept backlog all re-use the same server port though.

Simply put: if you had a webserver answering on port 443 then the backlog will be a number of prepared slots that all eventually will have 443 in them as the local port for the connection. You can have as many of those as you want. But only one process gets to listen to port 443 on any given IP.


What port limit? The server is listening on a single port.


EDIT: I was wrong, see child comments and the sibling comment from jacquesm.

---

Yes, it is listening on a single port, but accepted connections bind to a separate socket.

Here's how it works under the hood:

> The accept() call creates a new socket descriptor with the same properties as socket and returns it to the caller. [...] The new socket descriptor cannot be used to accept new connections. The original socket, socket, remains available to accept more connection requests.

Pulled this from random IBM z/OS docs, but it's in compliance with the Unix standard: https://www.ibm.com/docs/en/zos/2.4.0?topic=functions-accept...


Separate socket; does not mean a separate [server] port.


Yes, you're right! Huge gap in understanding on my side.


I don’t understand - how is it not a server port? When a datagram arrives from the network it’s addressed to the server’s IP and a port (and from a client IP and port). How is the data delivered to the correct socket if not through the 5-tuple (which includes server port) to socket mapping?


It can be distinguished via its client IP and port.


It’s 65k per end, but we’re generally expecting all traffic to come in on one port for one service.

To further complicate the issue, any proxies or NAT in between may reduce the number of “clients” seen. A naughty ISP may only allow you to talk to 65k sessions at once, or per data center.

And then there’s load balancers, which increase the number of ports in use if you only have one IP.


I am currently looking for ways to build a service that can handle around 100k-200k active concurrent websocket connections on production. It's wild seeing this article here. Does anyone know of any alternative ways to do this? Most people seem to suggest using Elixir but I wonder if I can achieve the same using a more "conventional" language such as Java or Golang.

This article covers Node.js for me, I guess.


Elixir is well suited to highly concurrent systems and work like this. I'm big on the whole Elixir ecosystem though so I haven't explored other options.

I don't see why there would be anything stopping Go from being similarly capable as it also has a good reputation for concurrency and what I hear does preemptive scheduling.

Java can probably do anything except be fun and lightweight, so assuming you're willing to figure out the hoops to jump through, I assume it could.

Elixir can do it with the ergonomics and expressiveness of Python/Ruby. If you enjoy that level of abstraction I recommend it.


Do you have any pointers, a book preferably, for starting an exploratory Elixir project? I don't have any objective apart from giving the ecosystem a taste.


If you really want a book pick one from here [0]. First one is good.

Personally I think just following the official guide [1] will give you all you need to get a taste of the language and the platform and decide if you like it or not.

If you were talking about websockets in particular I guess realistically most people use Phoenix Channels [2] that give you websockets in ten lines of code.

[0] https://elixir-lang.org/learning.html

[1] https://elixir-lang.org/getting-started/introduction.html

[2] https://hexdocs.pm/phoenix/channels.html


I can highly recommend Elixir in Action, 2nd ed.

This talk by the same author is also a good introduction in video format: https://www.youtube.com/watch?v=JvBT4XBdoUE


Java is slowly absorbing the ideas from other systems and is much more fun than it was.

Also versatile.


We did this with Node.js and uWebSockets and it scaled easily to a few million WebSockets on ~10 machines, so I can confirm the stack works in practice.


We used the C++ version of uWebSockets to replace a legacy node app. We went from four fully loaded cores to about 20% of a single core and a fraction of the memory usage. It's a great library.


I am trying to imagine why one would need millions of web sockets :) What are the use cases here?


Millions of clients. IOT devices? Who knows.


It's unlikely you'd want to connect IoT devices to a backend using WebSockets; I'd use a UDP-based protocol for that, e.g. QUIC. But for web clients it makes sense.


MQTT is usually the go-to protocol for IoT devices. You can do MQTT over WebSockets to help prevent issues using odd ports on home networks etc.


This was while working for Peer5 (YC startup) building a p2p CDN; these were video viewers in live events (e.g. the World Cup).


Check out Centrifugal:

https://github.com/centrifugal/centrifugo (Server/Admin)

https://github.com/centrifugal/centrifuge (Server core)

https://github.com/centrifugal/centrifuge-js (Library)

It's a complete solution, including server, admin panel and client library.


Honestly, what matters is (a) what you're going to be doing with those connections and (b) your hardware.

As a generalization (again, it really depends what you're going to be doing), I'd expect people to get a lot further with a Go or Java based implementation. Specifically, if those connections are interacting with each other in any meaningful way, I think shared data is still too useful to pass up.

I've written a websocket server implementation in Zig(1) and Elixir(2)

(1) https://github.com/karlseguin/websocket.zig (2) https://github.com/karlseguin/exws


> Specifically, if those connections are interacting with each other in any meaningful way, I think shared data is still too useful to pass up.

What does this mean? What are some scenarios where connections interact with each other? I work with dotnet. To me, every request is standalone and doesn’t need to know any other request exists. At the most, I can see doing some kind of caching where, if someone does a GET /person/12345 and someone else does the same, I may be able to serve it from cache. However, I don’t think this is what you meant by shared data.

Did you mean like if someone does a PUT /person/12345/email hikingfan@gmail.com instead of the next get request reaching to the database, you keep it in the application memory and just use it?

Or am I completely missing the point and you’re talking about near real-time stuff like calls and screen sharing?


This is in the context of a websocket (which is what the original story is about). Presumably, websocket is being used because HTTP isn't enough, namely, you want to receive pushes from the server. This _often_ comes in the form of data that multiple connections are interested in: game state, chat, collaborative editing. At scale, this data, or a copy of it, often stays in memory. E.g. a chat system might keep a list of room + brief chat history + user list in memory. This memory is being mutated by concurrent connections.
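A hedged sketch of what that shared, mutable in-process state can look like for the chat example (the `rooms` shape, `join`, and `broadcast` are hypothetical; a real server would also handle disconnects and backpressure):

```javascript
// Shared state for a chat-style websocket server: every connection
// reads and mutates the same Map, which is the "shared data" above.
const rooms = new Map(); // roomName -> { history: [], members: Set<socket> }

function join(roomName, socket) {
  if (!rooms.has(roomName)) {
    rooms.set(roomName, { history: [], members: new Set() });
  }
  rooms.get(roomName).members.add(socket);
}

function broadcast(roomName, msg) {
  const room = rooms.get(roomName);
  if (!room) return;
  room.history.push(msg);
  if (room.history.length > 50) room.history.shift(); // brief chat history
  for (const s of room.members) s.send(msg); // push to every connection
}
```

With fake sockets (`{ send: (m) => received.push(m) }`) this kind of logic is easy to exercise in isolation.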


Many languages (e.g., NodeJS) won’t even let you share code. So you can’t really do stuff like have hundreds of threads without being very careful with the size of your application code, because each thread will get a copy.


If you need to send messages to other channels, or use shared caches, or have shared state like a game server.


Pretty much any modern runtime (Java/Go/Node w/ native bindings) can handle that many connections per machine. You probably want to horizontally scale it with Kafka or similar, but anyway, a single machine will work to start.


With Netty and Java you can easily handle 100-200k active web socket connections on a single server.

It was being done 7-8 years ago. If you search you should find a few articles on this.


Considering someone had 100k+ idle connections on a Raspberry Pi with Java/Netty, yeah, you could get to a million today with some mid-tier hardware and Linux tuning, pretty easily.


.NET 7 and Kestrel are likely able to pull this off if properly configured. Kestrel/AspNetCore routinely shows up in the top 10 TechEmpower web benchmarks.


I'm not trusting benchmarks that I didn't fake myself.

Anyway - there's a lot of "non-standard" stuff in ASP's code there.


Can you explain how you think the techempower benchmarks are faked?



Can you please suggest a link to show how to do this?



Node might be faster to write, but harder to maintain in the long run, and not as reliable as Go or Rust. I'd personally pick Rust because I have experience with it, but AFAIK Go has a very good reputation; the main "difference" from Rust is the GC (I say "difference" because Go's performance is not that far off from Rust's, and Go also seems easier to write than Rust).

Also, IMHO it's better to have a strongly typed language behind your project if it will be big; dynamic languages and big projects tend to be a nightmare for me.


Would you mind unpacking how, in your view, Go/Rust/compiled strongly-typed languages lead to more *reliable* software? I can see how performance and maintainability* are sort of self-evident arguments in favour of them, but not sure how reliability could be a feature inherent to a language/runtime.

* As a build/compile-time concern, using Node doesn't preclude strong-typing, so maintainability is also not a strong argument against the runtime itself, given you can use e.g. TypeScript.


I think this blog post[0] describes what level of reliability you can achieve with Rust, specifically:

> In fact, Pingora crashes are so rare we usually find unrelated issues when we do encounter one. Recently we discovered a kernel bug soon after our service started crashing. We've also discovered hardware issues on a few machines, in the past ruling out rare memory bugs caused by our software even after significant debugging was nearly impossible.

For sure not everyone will achieve that on their first try or when getting started, but it is possible. With Node I'm not confident enough to say that; it does work for hacking something together quickly and putting it online. With Rust it takes longer, and there aren't many platforms yet where you can easily deploy your app.

[0] https://blog.cloudflare.com/how-we-built-pingora-the-proxy-t...


Node is a joke. It's not good for this.

Check out https://github.com/panjf2000/gnet, it also has some links at the end.


Care to explain why Node is "a joke"? It's often used for these kinds of applications.


I don't know anything about node.js or websockets but I actually set them up and used them in Jitsi Meet's version of the Excalidraw Backend for its recent whiteboard implementation.

Here's me looking at the websocket traffic (I think): https://youtu.be/4rlffwHUchk?t=1857

I might not fully understand the technicals of it but I got it up and running and use it almost every day! :D Maybe I'll someday understand it.


Still love seeing such great performance from asynchronous processing. But I’m curious, is there something special about node providing this? Or can we achieve similar performance with other async web servers (python, go, swift, rust etc?)


C1M is not that impressive anymore; these days one needs to do C10M ;)

Most of the work of accepting and holding concurrent TCP sockets is done by the OS, not the language runtime. One can easily tune the Linux kernel to handle 1M concurrent sockets.
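For reference, the usual knobs involved in that tuning — a hedged config sketch, with illustrative values rather than a tuned recommendation:

```shell
# Each socket is a file descriptor, so raise the fd limits first
ulimit -n 1048576                      # per-process limit for the server user
sysctl -w fs.file-max=2097152          # system-wide fd ceiling
sysctl -w net.core.somaxconn=65535     # accept() backlog for the listener
# Client/load-generator side: widen the ephemeral port range
sysctl -w net.ipv4.ip_local_port_range="1024 65535"
```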

The real issues: memory usage per concurrent socket (idle or active), and ability to do something useful with all these active connections, e.g. send pings every 30s, or broadcast a message to all of them.

I'm not sure the NodeJS/C++-based system from this post will allow sending pings every 30s to 1M websockets, let alone doing something useful with them beyond low-traffic or infrequent notifications. (Of course, one always needs to perform realistic load tests to answer these kinds of questions.)
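One common way to make the 30s-ping case tractable is to stagger the work instead of bursting 1M writes at once — a sketch (the `pingBuckets` helper is hypothetical; a real server would walk one bucket per timer tick):

```javascript
// Spread pings for N connections over a 30s window so each second
// handles roughly N/30 sockets instead of all N in one burst.
function pingBuckets(connIds, windowSeconds = 30) {
  const buckets = Array.from({ length: windowSeconds }, () => []);
  connIds.forEach((id, i) => buckets[i % windowSeconds].push(id));
  return buckets; // buckets[s] = ids to ping at second s of the window
}

// with 1M connections, each one-second tick pings ~33k sockets
const buckets = pingBuckets(Array.from({ length: 1_000_000 }, (_, i) => i));
console.log(buckets[0].length); // 33334
```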

Erlang/Elixir/BEAM have a relatively large memory usage per active socket, but it allows doing something useful with them under an easy to use programming model (read: no callback hell).


isn't "callback hell" a bit old fashioned now? With promises and async/await etc.
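For reference, the difference being pointed at — the same chained I/O written both ways (`readFileCb` is a hypothetical node-style callback API standing in for any real async call):

```javascript
// A stand-in for any node-style callback API: cb(err, data)
function readFileCb(name, cb) {
  setImmediate(() => cb(null, `contents of ${name}`));
}

// Callback style: each dependent step nests one level deeper ("hell")
readFileCb('a.txt', (err, a) => {
  if (err) throw err;
  readFileCb('b.txt', (err, b) => {
    if (err) throw err;
    console.log(a, b);
  });
});

// async/await style: the same flow reads top-to-bottom
const readFileP = (name) => new Promise((res, rej) =>
  readFileCb(name, (err, data) => (err ? rej(err) : res(data))));

async function main() {
  const a = await readFileP('a.txt');
  const b = await readFileP('b.txt');
  return `${a} ${b}`;
}
```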


Even with async/await it's still single-threaded.

I might be mistaken, since I'm not up-to-date with the multithreading/concurrency/isolates in nodejs/V8.

IMO async/await is error-prone and isn't an ideal programming model.


> Even with async/await it's still single-threaded.

That doesn't mean anything. V8 is single-threaded but Node.js I/O is non-blocking. The reason Node became popular in the first place is that companies started adopting it to fill the gaps in their existing infrastructure (e.g. Java) to offer "realtime" (i.e. web sockets or its experimental equivalents) communication.

> IMO async/await is error-prone and isn't an ideal programming model.

What's the point of unsubstantiated statements like that (btw "x is error-prone" is an empirical claim, so prefixing it with "IMO" just means "I can't back this up and don't care if it's true") other than stirring up pointless language rivalries?

Just acknowledge that your off-hand comment about "callback hell" was anachronistic and don't try to come up with excuses to justify your preferences. I think Elixir is neat and hope it can see sufficient adoption for me to justify getting invested in it but that doesn't justify poopooing other languages, especially ones you admit not to have up-to-date knowledge about.


> What's the point of unsubstantiated statements like that

That's the problem with Software Engineering in general.

> "IMO" just means "I can't back this up and don't care if it's true")

The research study that will either back or refute my claim would be prohibitively expensive and therefore impractical.


My complaint isn't that you don't provide research. My complaint is that you use "IMO" to make a claim that could easily be substantiated instead.

E.g. "in my experience async/await can easily result in bugs that can be hard to detect" or "when teaching beginners, I've found that they have a harder time wrapping their head around async/await than when learning about agents" or "async/await still requires writing imperative code, requiring the programmer to pay attention to behavior that agents can abstract away through declarative code". Now, I don't know if any of those statements are true or if they reflect your experience but these are examples for what you could have said, assuming you didn't just want to say "I don't like JavaScript and I prefer Elixir or Erlang".

And yes, people just dumping strong opinions with little more substance than gut feelings is very much a problem in Software Engineering. That doesn't mean we can't work on that and practice a little more hygiene and respect for each other.

EDIT: To be clear, saying "I don't like X and I prefer Y or Z" is perfectly fine too, as long as you are honest about this being your own preference rather than some grand truth about the universe. The problem comes from insisting that everyone else is wrong for not feeling the same way.


The library mentioned in the article is C++ code that extends Node.js and basically bypasses all of it. Nothing is making Node.js special here.

The Readme says:

> µWebSockets.js is a web server bypass for Node.js that reimplements eventing, networking, encryption, web protocols, routing and pub/sub in highly optimized C++


Well this isn't really NodeJS, it's written in C++. The whole point of Node is that it is a wrapper around the V8 JS engine (the one used in Chrome) that allows you to run JS specifically in a non-browser environment. This project is more an extension to Node (or as it is called in the GitHub README, a "web server bypass for Node.js").

You do actually see the same kind of thing happening in other ecosystems, especially Python – something runs slower than you want it to (because Python), so you re-write it in Rust/C/C++ and hook into the extension system of the lang for a seamless DX. Off the top of my head, things like `asyncpg`, `orjson`, `uvloop` and, especially relevant in this case, `websockets` are all async libs that do this.

Mildly tangential but as someone who has spent a lot of time making the Python backend for my startup run as fast as possible, I would recommend avoiding Python if performance is your primary concern.


If anything, the article proves that when you want performance don't use Node.js - or more generally, abstractions have a cost.

The actual library, as already pointed out, is written in C/C++ and bypasses Node for everything other than providing a usage API layer.

It's the same as claiming that Tensorflow is written in Python ;)


The great thing for me is that having Node.js bindings for the underlying C++ core means I get to iterate fast in Node.js.

Any developer out there who has been fascinated by and has used socket.io should already be using this.


Definitely possible to get similar or better performance using Go, Swift and Rust. Python is slower in most cases, so I don't think it would be possible using Python (unless calling C from Python).


Yep. I'm also not sure if the server libraries in Swift are mature and well-optimized enough to handle millions of active websocket connections.

Maybe someone with more Swift experience can chime in! What's the state of Swift on the server these days?


"The theoretical limit is 65k connections per IP address"

What's the driving factor for that limitation?


"The TCP protocol provides 16 bits for the port number, and this is interpreted as an unsigned integer; all values are valid, apart from 0, and so the largest port number is (2^16 - 1) or 65,535."


Ah, on the client side not the server, makes sense. I was doing the math as the full IP + port, which would offer 48 bits, but with only one client IP you really only have a client port to play with and you must bind as long as the connection remains open.


but the blog post says otherwise.

> The client side has similar settings but does not need to set up multiple IP addresses, obviously

Also, as far as I know, Linux will only use one IP for outbound connections, unless you forcibly bind to another IP address in the code.

The server uses only one port, 9001 according to the code: https://github.com/uNetworking/uWebSockets.js/blob/875f16e1f...

This blog post does not make any sense.


A connection is identified by the tuple {host IP, host port, client IP, client port}. The server listens on one port, but the client has a separate port for each connection. This means that there can be 65k connections for a single {host IP, client IP} combination.

There are two ways to get more connections: use more client IPs or use more host IPs. In this post the OP has decided to add more host IPs.
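The arithmetic behind this can be sketched with a toy connection table keyed by the 4-tuple (the `conns` Map and `key` helper are illustrative, not how a kernel actually stores connections): with fixed IPs and one server port, only the client port (1–65535) varies, and each extra host IP adds another full range.

```javascript
// Toy demultiplexer: connections keyed by {client IP, client port,
// server IP, server port}. Only distinct tuples can coexist.
const conns = new Map();
const key = (cIp, cPort, sIp, sPort) => `${cIp}:${cPort}->${sIp}:${sPort}`;

// One client IP opening as many sockets as it can to one service:
// only the client port varies, so the ceiling is 65535.
for (let clientPort = 1; clientPort <= 65535; clientPort++) {
  conns.set(key('10.0.0.1', clientPort, '10.0.0.2', 9001), { state: 'open' });
}
console.log(conns.size); // 65535

// Adding a second host IP (the article's approach) doubles the ceiling.
for (let clientPort = 1; clientPort <= 65535; clientPort++) {
  conns.set(key('10.0.0.1', clientPort, '10.0.0.3', 9001), { state: 'open' });
}
console.log(conns.size); // 131070
```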


That means this benchmark is useless right?

In the real world there will be more client IPs, and the host will be listening on one IPv4 address on one port.


That would depend on if it matters to the WebSocket server if connections are from one or multiple IPs. In this case I would not think that there's any difference as far as the WS library is concerned. Maybe it has some effect on the OS kernel's networking routines, but I would not expect it to be big.


Port numbers are 2 bytes (16 bits) in the TCP packet header.

So 2^16 == 65,536, i.e. the familiar ~65k.


Aren't connections defined by sender_ip+receiver_ip+port?


Usually a connection is defined by what's called a 5-tuple: source IP, source port, dest IP, dest port, protocol (e.g. TCP or UDP).


Maximum file descriptors


Ok, I am a dev but I could not get the actual tech part of this. Can someone ELI5 it for a dev?


(2019)



