Show HN: Nchan – a pub/sub server as an Nginx module

slact · on Dec 6, 2015

This is a huge refactoring of an old project of mine -- the Nginx HTTP Push Module. I'm wondering if anyone here has used it.

Most importantly, I want feedback on the documentation. Did I overcomplicate things? Does it need more examples? Does it need more live code? Is it too long? Too short? Etc.

jkarneges · on Dec 6, 2015

I love the clean client API, e.g. the use of ETags with long-polling.

In some ways it reminds me of our LiveResource protocol (http://liveresource.org). It's still a work in progress, but maybe we can consolidate ideas. It'd be cool if an LR client could be pointed at an Nchan resource.

slact · on Dec 6, 2015

Interesting, I'll read through this. Nchan code is quite modular, so it would be pretty easy to add some kind of negotiation like this. Let's talk more.

For reference, here's the old document I wrote for the long-polling protocol back in '09: https://pushmodule.slact.net/protocol.html

comboy · on Dec 6, 2015

Would be really cool if you could include an example JS client for this. I'm sure you have some for testing, so even if it's not production-ready code it would make testing your module much easier. Almost everybody will want to write the same code for it anyway, with reconnection and fallbacks, so it may be a good starting point for others to contribute to the project: there are many more JS guys than C coders intimately familiar with nginx.

slact · on Dec 6, 2015

Yep, I'll be adding an example js client, but I don't intend on standardizing this. Mostly because I'm also a js guy and I like mootools, and I don't want to start a best-js-framework flamewar with an official client. But example code will be up for sure.

ende42 · on Dec 6, 2015

Thanks for your work!

Actually we (laut.fm) are using the HTTP Push Module for our public API for a live stream of tracks which get played on our ~ 1500 icecast stations. We offer 3 formats: http://api.laut.fm/song_change.stream.json for a line separated JSON HTTP stream, ws://api.laut.fm/song_change.ws.json for the same as websocket endpoint and http://api.laut.fm/song_change.chunk.json for the last x songs. It's not really high volume, just 6 to 7 per second. But it runs basically unattended for years now and I'm pretty happy with it.
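For readers who have not consumed a line-separated JSON HTTP stream before, here is a minimal sketch of the client-side parsing in Python. It assumes only that each message is one JSON object terminated by a newline; the field names in the sample are hypothetical, since the thread does not show the actual payload.

```python
import json
from typing import Iterable, Iterator


def iter_json_lines(chunks: Iterable[bytes]) -> Iterator[dict]:
    """Yield one decoded JSON object per newline-terminated line.

    `chunks` is any iterable of byte strings as they arrive from the
    network; a line may be split across chunk boundaries.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            line = line.strip()
            if line:  # skip blank keep-alive lines
                yield json.loads(line)


# Hypothetical payload: the first object is split across two chunks.
sample = [
    b'{"station": "a", "ti',
    b'tle": "x"}\n\n{"station": "b", "title": "y"}\n',
]
events = list(iter_json_lines(sample))
```

In a real client the chunks would come from a streaming HTTP response body instead of a list.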

Is there any reason to update to the new one (other than new features; admittedly I haven't really looked into NCHAN)?

slact · on Dec 7, 2015

Well, if it ain't broke, as they say...

Here's a page on the differences between Nchan and the Push Module: https://nchan.slact.net/upgrade . One important thing I forgot to add is that the Push Module suffered from memory fragmentation under high load, and with a fixed-size shared-memory chunk that could mean running out of usable shared memory for a long-running nginx process. If you're not experiencing that, and you don't need to scale up, or the new features don't appeal to you, don't upgrade -- certainly not yet.

Maybe in a month or two when nchan makes its way into the nginx-extras debian package (replacing the push module), then consider upgrading.

ende42 · on Dec 7, 2015

Thanks for the upgrade link. I missed that. Great that NCHAN is (almost; besides directive prefix) configuration compatible!

The "Subscribers" section of "Push Module" should include websocket, I think.

slact · on Dec 7, 2015

> besides directive prefix

The old setting is still recognized.

> "Subscribers" section of "Push Module" should include websocket

Nope, push module didn't do websocket. You may be thinking of the Push _Stream_ module, which is an independent push module fork.

solyaris · on Dec 7, 2015

Great project! I retweeted: https://twitter.com/solyarisoftware/status/67379850204928000...

Documentation is VERY well done. Publisher: the curl examples are perfect. Subscribers: maybe some examples in a language (Ruby? :-) ) could help dummies like me. I'll study it and, if I can, I'll propose some.

respect giorgio

slact · on Dec 7, 2015

Thanks! I'll be adding a sample JS subscriber client and some sample code with that as well.

sams99 · on Dec 7, 2015

This is super interesting. I have a very similar project that we use at Discourse: https://github.com/SamSaffron/message_bus . Clearly, being an NGINX module, nchan is going to be a LOT faster.

That said, there are a few little points that would make integrating MessageBus with nchan a bit challenging:

- We have a concept of "reliable pub sub" https://github.com/SamSaffron/message_bus/blob/master/lib/me... This means that if you shut your laptop and then open it, stuff just magically catches up with no loss of messages or ordering. Does the redis store you have do something similar? (I can not see backing lists in the redis store implementation in store.c.)

- We implement per-message security, meaning you can publish to a channel/group or channel/user and only those users get the message. I see you can do auth upfront, but per-message security seems not doable.

- We don't bother with per-channel URLs and instead just have a single endpoint that you multi-subscribe to using POST params. We often subscribe to 10 channels on a request, so the multiplexing seems a bit limited.

- When we subscribe to a channel due to the "reliable" nature of the store we are able to tell the server what the last id is we subscribed to and catch up from there

I definitely see us moving our message bus server piece closer to the web server like nchan has. It's by far the most scalable way, but it's tough keeping feature parity.

slact · on Dec 7, 2015

Nice project, and a curious featureset overlap, too...

> We have a concept of "reliable pub sub" [...] Does the redis store you have do something similar (I can not see backing lists in the redis store implementation in store.c)

Dig a little deeper... https://github.com/slact/nchan/blob/master/src/store/redis/s... I used lua scripts for all the fancy redis logic, that's where you'll find the backing list accesses. Messages are stored as hashes, referenced by id in lists, along with some other channel metadata. So a longpoll or EventSource client knows its last message id, and can request the next available message as long as said message has not yet expired. Websocket clients don't have this information, and I'm not yet sure how to relay it with each message while remaining content-agnostic. Basically, regardless of the protocol, you can send If-Modified-Since + If-None-Match headers or a Last-Event-ID header and it will resume from that position in the message queue for the given channel.
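The resume mechanism described above can be sketched as a small long-poll loop in Python. This is an illustration under stated assumptions, not Nchan's reference client: it assumes each message response carries `Last-Modified` and `Etag` headers that the client echoes back as `If-Modified-Since` and `If-None-Match`. The `fetch` callable and the in-memory fake server are hypothetical stand-ins so the sketch runs without a network.

```python
from typing import Callable, Dict, List, Optional, Tuple

Headers = Dict[str, str]
# A fetch function takes request headers and returns (status, headers, body).
Fetch = Callable[[Headers], Tuple[int, Headers, bytes]]


def resume_headers(last_modified: Optional[str], etag: Optional[str]) -> Headers:
    """Build the request headers that identify the last message seen."""
    headers: Headers = {}
    if last_modified is not None:
        headers["If-Modified-Since"] = last_modified
    if etag is not None:
        headers["If-None-Match"] = etag
    return headers


def poll(fetch: Fetch, rounds: int) -> List[bytes]:
    """Long-poll `rounds` times, carrying the position between requests."""
    last_modified: Optional[str] = None
    etag: Optional[str] = None
    received: List[bytes] = []
    for _ in range(rounds):
        status, response_headers, body = fetch(resume_headers(last_modified, etag))
        if status != 200:
            continue  # nothing new; retry from the same position
        received.append(body)
        # Assumption: the server labels each message with these two headers.
        last_modified = response_headers.get("Last-Modified", last_modified)
        etag = response_headers.get("Etag", etag)
    return received


def make_fake_server(messages: List[bytes]) -> Fetch:
    """Hypothetical in-memory channel: serves the message after the given ETag."""
    def fetch(request_headers: Headers) -> Tuple[int, Headers, bytes]:
        next_index = int(request_headers.get("If-None-Match", "-1")) + 1
        if next_index >= len(messages):
            return 304, {}, b""
        headers = {"Last-Modified": "Sun, 06 Dec 2015 00:00:00 GMT",
                   "Etag": str(next_index)}
        return 200, headers, messages[next_index]
    return fetch


got = poll(make_fake_server([b"one", b"two"]), rounds=3)
```

A client that is interrupted only needs to keep the two header values; sending them again on the next request picks up where it left off, provided the message has not expired.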

> I see you can do auth upfront, but per-message security seems not doable

Per-message access will definitely not be implemented. You could, however, do this client-side by, say, encrypting the messages and sharing keys with authorized subscribers. That's kind of roundabout though.

> We don't bother with per-channel URLs [...], we often subscribe to 10 channels on a request, the multiplexing seems a bit limited

The main use-case I had in mind for multiplexing is that of a single channel per user, and some shared broadcast channel. It's currently limited to 4 max because I wanted to get this code out the door. Unlimited multiplexing will be supported in the future, and you could trivially rebuild the module to support up to 16 right now (At the cost of some memory per message per subscriber per channel overhead).

> When we subscribe to a channel due to the "reliable" nature of the store we are able to tell the server what the last id is we subscribed to and catch up from there

Yep, Nchan does that too. (except for Websocket, and hopefully I'll find a workaround)

How does that work for feature parity?

sams99 · on Dec 7, 2015

Nice! I initially stayed away from lua so I could support earlier versions of redis, but these days I would totally take that dependency on to simplify the code.

regarding that implementation, one diff I am spotting is that I also have a concept of global message id / global channel, this allows us to subscribe to a single spot and distribute from there (which keeps thread counts down and is particularly useful for server -> server comms.)

A lot of parity with our projects :)

Regarding web sockets, I decided against supporting them (initial implementations did). The issue is that web sockets are 100% flaky on HTTP and just add tons of unneeded complexity. With the advent of HTTP/2 in NGINX they would be a net loss imo, because you would be wasting a connection.

slact · on Dec 7, 2015

> global message id / global channel

Yeah, I don't have global ids. Every message is bound to the channel it was published to. But I do understand that for your use case (user + arbitrary list of groups), you'd need a good deal more than 4-channel multiplexing. It's a pretty strong use case and I'll see what I can do in the next month.

Of course, I can be incentivized to work faster with a generous donation :)

> Regarding web sockets [...]

All the cool kids were talking about it, so I thought I might as well support them, too.

solyaris · on Dec 7, 2015

>All the cool kids were talking about it, so I thought I might as well support them, too.

I agree with Sam Saffron (thanks Sam for all your GitHub code) about being skeptical of websockets used as a panacea nowadays.

Nevertheless, I really appreciated Nchan's structured approach, open to the different available protocols (poll, SSE, websockets) :)

thesmart · on Dec 6, 2015

Does Nchan broadcast messages that may have been missed by a client that was temporarily offline? When web clients navigate from page to page or enter sleep mode, they will miss messages during the offline delta. Ideally, there would be a way to "catch up" a client or to tell if a client is permanently out of sync because it is beyond a message history window.

slact · on Dec 6, 2015

Yes. All messages are buffered for a configurable length of time. Longpoll and EventSource clients receive the last message id with each message, and if interrupted can be resumed from there provided the message has not timed out. I don't yet have a way of transferring the last-received message id for websocket clients, but if it is known it can be set during the ws handshake.
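For EventSource clients, tracking the last message id can be sketched as follows. This is a simplified Python reading of the standard `text/event-stream` format (it handles only `data:` and `id:` fields and comment lines), not code from Nchan; the sample stream is made up.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def parse_event_stream(lines: Iterable[str]) -> Tuple[List[str], Optional[str]]:
    """Return (messages, last_event_id) from text/event-stream lines."""
    messages: List[str] = []
    last_event_id: Optional[str] = None
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data:
                messages.append("\n".join(data))
                data = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "data":
            data.append(value)
        elif field == "id":
            # Simplification: remember the id as soon as it is seen.
            last_event_id = value
    return messages, last_event_id


def reconnect_headers(last_event_id: Optional[str]) -> Dict[str, str]:
    """Headers to send when resuming after an interruption."""
    return {} if last_event_id is None else {"Last-Event-ID": last_event_id}


stream = ["id: 7\n", "data: hello\n", "\n",
          ": keep-alive\n",
          "id: 8\n", "data: a\n", "data: b\n", "\n"]
messages, last_id = parse_event_stream(stream)
headers = reconnect_headers(last_id)
```

Browsers do this bookkeeping automatically for `EventSource`; a hand-written client has to store the id itself and send it on reconnect.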

geekuillaume · on Dec 6, 2015

I'm using Nginx Push Stream with NodeJS for a high-scale chat system (soon to be released, completely open source). The dependency on Nginx always bothers me. How hard do you think it would be to build a system like yours but completely standalone? Then we would be able to integrate it with other languages via plugins (NodeJS, Python, etc).

slact · on Dec 6, 2015

Nchan is about 12K lines of C, I'd say 2-5K of that is dealing with Nginx guts. To get rid of Nginx entirely, you'd need to add an event loop, forking and multiprocess management, config parsing and reloading, and shared memory allocation code. That's not a simple task, but it's certainly possible. The reason I built this on top of Nginx is precisely because I didn't want to handle those other things. Besides, nginx these days is a hulking scalable monster. What's the bother?

If you really don't want an nginx dependency, I'd say you're better off rolling your own pubsub server in Node.

geekuillaume · on Dec 6, 2015

Okay, thanks for the information. The problem with doing the pubsub entirely in NodeJS is the way NodeJS handles connections: each one is separate and carries a big overhead. What would be great is a way to interact with the pubsub server not through config but through an API. This would allow, for example, the execution of middlewares when a user publishes a message (to filter them or something else). Right now, I'm using two websockets, one to Nginx PushStream only to receive messages and another to the NodeJS server to publish messages. The NodeJS server is used to parse the messages, format them, authenticate the user and apply the middlewares before publishing the message to Redis. Then, each NodeJS instance gets the message from Redis and POSTs it to Nginx. The problem here is that there are two sockets for each user, plus the complexity involved in the communication between Nginx and NodeJS. That's why a really performant standalone websocket engine with an extensive API would be awesome.

slact · on Dec 6, 2015

> the execution of middlewares when a user publishes a message (to filter them or something else)

You can do that with nchan: https://nchan.slact.net/details#authenticate-with-nchan_auth...

> Right now, I'm using two websockets, one to Nginx PushStream only to receive messages and another to the NodeJS server to publish messages.

You can also multiplex several websockets into one for the client.

I can't offer you a standalone server, but I can offer some pretty fancy features : )

geekuillaume · on Dec 6, 2015

I saw this feature, but I think there are limitations preventing me from using it. Tell me if I'm wrong ;)

- I cannot use this to make a call each time a message is published on a pubsub websocket

- I cannot modify the message sent by the user (to add user information for example)

Also, what do you mean by multiplexing websockets?

slact · on Dec 6, 2015

That's correct, that feature is for authentication only. I may add a feature to replace the message with the back end response.

By multiplexing I mean that a single websocket (or any other) subscriber can subscribe to multiple channels.

uyoakaoma · on Dec 6, 2015

Is this different from the other module, nginx push stream, which offers websockets, long polling, etc.?

slact · on Dec 6, 2015

Yes, this is a different project, although they are related. Push Stream is a fork of my Nginx HTTP Push Module, so they both descend from the same original codebase. Push Stream uses a blocking concurrency model, whereas Nchan is completely non-blocking. In theory, this means Nchan should scale better. In practice, I haven't benchmarked it heavily enough yet to see a divergence.

Besides that, nchan and Push Stream offer different feature sets. For example, Nchan has horizontal scaling and persistence through redis, whereas push stream has customizable message transforms. There are other differences as well, but that would take a whole article to elaborate. I should probably write it soon.

cryptica · on Dec 6, 2015

The "client-less" approach would be particularly nice for use with IoT devices which don't have enough CPU/memory to run a beefy pub/sub client (which is typical of other solutions). So that's nice.

Is there a way to perform access control on the backend? E.g. Is there a way to prevent a specific user from subscribing to a specific channel (if they are not allowed)? Or is this specifically designed to deal only with public channels (that anyone can publish/subscribe to)?

slact · on Dec 6, 2015

Yes, backend-authenticated access control is possible. See https://nchan.slact.net/details#authenticate-with-nchan_auth...
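As an illustration of what the backend side of such a check might look like, here is a minimal Python WSGI sketch. Everything specific in it is an assumption made for the example and not taken from the Nchan documentation: the `X-User` and `X-Channel` header names, the access list, and the convention that a 200 response allows the subscription while a 403 denies it.

```python
from typing import Callable, Dict, List, Set

# Hypothetical access list: which user may subscribe to which channels.
ALLOWED: Dict[str, Set[str]] = {
    "alice": {"news", "alice-private"},
    "bob": {"news"},
}


def may_subscribe(user: str, channel: str) -> bool:
    """Decide whether `user` is allowed on `channel`."""
    return channel in ALLOWED.get(user, set())


def auth_app(environ: dict, start_response: Callable) -> List[bytes]:
    """WSGI endpoint that answers an authorization subrequest.

    Assumption: the proxy forwards the user and channel in the
    (hypothetical) X-User and X-Channel request headers.
    """
    user = environ.get("HTTP_X_USER", "")
    channel = environ.get("HTTP_X_CHANNEL", "")
    status = "200 OK" if may_subscribe(user, channel) else "403 Forbidden"
    start_response(status, [("Content-Length", "0")])
    return [b""]


def check(user: str, channel: str) -> str:
    """Call the app directly and return the status line it produced."""
    seen: List[str] = []
    auth_app({"HTTP_X_USER": user, "HTTP_X_CHANNEL": channel},
             lambda status, headers: seen.append(status))
    return seen[0]


public_ok = check("bob", "news")
private_denied = check("bob", "alice-private")
```

The point of the design is that the pub/sub server never needs to know the access rules; it only asks the application and acts on the answer.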

ktt · on Dec 6, 2015

Wow, excellent! I've been looking for lightweight websockets pubsub and had to create something simple in Node to achieve it but this would've sufficed.

I can't wait to give it a try.

pablomolnar · on Dec 7, 2015

Great project! Is there any concept of ack/nack for a message? Let's say the subscriber gets a message and fails before processing it. Does the message go back to the queue?

slact · on Dec 7, 2015

No, ack/nack would need to be implemented in the client or the application.

However, messages do not disappear from the queue when received. All subscriber requests are idempotent, and can be repeated so long as the message queue is storing the message (which is a configurable parameter). So if a subscriber failed before processing it, it's free to request the same message again.
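A client-side acknowledgement built on this property can be sketched in Python as follows. The `BufferedChannel` class is a hypothetical in-memory model of a channel whose messages stay readable until they expire; it is not Nchan's storage code. The client advances its position only after its handler succeeds, so a failed attempt simply requests the same message again.

```python
from typing import Callable, List, Optional, Tuple


class BufferedChannel:
    """Keeps messages until they expire; reading never removes them."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        self._messages: List[Tuple[int, float, str]] = []  # (id, published_at, body)
        self._next_id = 1

    def publish(self, body: str, now: float) -> int:
        msg_id = self._next_id
        self._next_id += 1
        self._messages.append((msg_id, now, body))
        return msg_id

    def next_after(self, last_id: int, now: float) -> Optional[Tuple[int, str]]:
        """Return the first unexpired message with an id above `last_id`."""
        self._messages = [m for m in self._messages if now - m[1] < self.ttl]
        for msg_id, _, body in self._messages:
            if msg_id > last_id:
                return msg_id, body
        return None


def consume(channel: BufferedChannel, last_id: int,
            handler: Callable[[str], None], now: float) -> int:
    """Advance the position only after the handler succeeds."""
    item = channel.next_after(last_id, now)
    if item is None:
        return last_id
    msg_id, body = item
    try:
        handler(body)
    except Exception:
        return last_id  # not acknowledged: the same message is requested again
    return msg_id


channel = BufferedChannel(ttl=60.0)
channel.publish("job-1", now=0.0)
attempts: List[str] = []


def flaky(body: str) -> None:
    attempts.append(body)
    if len(attempts) == 1:
        raise RuntimeError("crashed before processing")


pos_after_fail = consume(channel, 0, flaky, now=1.0)
pos_after_retry = consume(channel, pos_after_fail, flaky, now=2.0)
expired = channel.next_after(0, now=120.0)
```

The retry only works inside the buffer window: once the message has expired, the last call returns nothing.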

Beltiras · on Dec 6, 2015

Interesting project. I've up till now used Twisted or Tornado (actually swampdragon.net). My choice would be dependent on scalability and performance.

slact · on Dec 6, 2015

I consider those two and socket.io my main competition. Nchan is built to scale, but of course you'd need some numbers to back that up. I plan on doing benchmarks once I iron out the documentation, although I realize it would be a great selling point to have the numbers ready right away.

Raed667 · on Dec 6, 2015

How does this compare to MQTT ?

slact · on Dec 6, 2015

I'm not familiar with MQTT, I've just glanced over the spec right now.

Nchan is basically a message broker with channels, optimized for message broadcast.

MQTT is a TCP-level protocol, whereas all the currently implemented subscriber clients for Nchan are HTTP-level (Longpoll, EventSource and Websocket, which begins with an HTTP request). MQTT subscribers and publishers could be implemented in nchan, but I haven't yet written any raw-TCP connection negotiation code, so I don't know how hard it would be. Aside from that, the subscriber code is very modular and adding another protocol like MQTT would be straightforward.

Raed667 · on Dec 6, 2015

I see. Thank you for this, I'll give it a shot and see how a simple client compares to an MQTT client.

If I get a good performance on a C/C++ client then this could be a viable option for push notification in the IoT domain.

solyaris · on Dec 7, 2015

>this could be a viable option for push notification in the IoT domain.

Yes! :)

gauravphoenix · on Dec 6, 2015

This is wonderful. I hope this module gets added to openresty.

userbinator · on Dec 6, 2015

Is the description just a rather roundabout way of saying it can be used as an anonymous messageboard? The name certainly evokes that.

slact · on Dec 6, 2015

Well, it would be very easy to make a realtime 4chan-like thing with Nchan, but it's far from the only use case. It's a general-purpose server and pubsub proxy.

The name is a play on nginx and channels. I got tired of saying "nginx http push module" all the time and renamed it.

johnmaguire · on Dec 6, 2015

pub/sub is a common paradigm for describing certain types of event systems or message queues.

Clients can subscribe to "channels" and they will receive all messages "published" on those channels.
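A toy in-memory version of that idea, in Python, for anyone new to the term. It is a sketch of the pattern only and says nothing about how Nchan is implemented; the `Broker` class and channel names are made up for the example.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Callback = Callable[[str], None]


class Broker:
    """Routes each published message to every subscriber of its channel."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callback) -> None:
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, message: str) -> int:
        """Deliver `message` and return how many subscribers received it."""
        callbacks = self._subscribers.get(channel, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


broker = Broker()
inbox_a: List[str] = []
inbox_b: List[str] = []

# One subscriber may listen on several channels (multiplexing).
broker.subscribe("news", inbox_a.append)
broker.subscribe("alerts", inbox_a.append)
broker.subscribe("news", inbox_b.append)

delivered_news = broker.publish("news", "hello")
delivered_alerts = broker.publish("alerts", "disk full")
delivered_none = broker.publish("empty", "nobody listens")
```

Publishers and subscribers never reference each other directly; the channel name is the only thing they share.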