Plurk Comet: Handling of 100.000+ open connections (amix.dk)
44 points by niels on June 8, 2009 | 38 comments



Being a Python programmer, I am a bit worried that one (in this case) needs to use Java to overcome a scaling problem. In my experience with Twisted, which is on a much smaller scale than the article, I have never found Twisted to be any kind of bottleneck.

It may be that since Twisted (Python) can effectively use only one CPU per interpreter (GIT et al), it got left behind by Java, which can easily use multiple CPUs for threads.

A slightly different architecture might then be required, one where multiple Python processes are used.


So far at Plurk we have struggled to overcome scalability issues, and none of them are due to Python, even though we use Python a lot.

Most of the problems we have seen have been related to the database. It was and still is the biggest bottleneck. And all the people I know that drive big sites will confirm this.

All that said, one should evaluate problems and not blindly use Python for everything. Python is a great language, but some other language is a better choice for some problems. And here Java, and especially Java's NIO library, is a much better choice for building a server that should handle tons of open connections.


You don't have to use Java. That's exactly the reason I hacked together EvServer: http://code.google.com/p/evserver/wiki/Documentation

Because of the GIL, there is no point in using multiple threads in Python. Scaling must be done by adding more processes. Then IPC starts to be an issue, and that's what messaging middleware is for. The sooner you integrate messaging into your project, the better.

Once you have a messaging platform, comet can be done in any technology you prefer; it really doesn't matter. That's because the scaling complexity is handled by the message broker.
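
To make the broker idea above concrete, here is a minimal sketch of the publish/subscribe shape being described. It is an in-process stand-in only: the `Broker` class, the channel name, and the callbacks are hypothetical names invented for illustration, and a real deployment would put a message broker between separate processes rather than a Python object.

```python
from collections import defaultdict


class Broker:
    """In-process stand-in for a message broker (hypothetical, for illustration)."""

    def __init__(self):
        # channel name -> list of callbacks, one per subscribed comet front end
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        """Register a front end's callback for one channel."""
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        """Hand the message to every subscriber; return how many received it."""
        callbacks = self._subscribers.get(channel, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


# Two comet front ends (here just lists) listen on the same channel.
broker = Broker()
inbox_a, inbox_b = [], []
broker.subscribe("user:42", inbox_a.append)
broker.subscribe("user:42", inbox_b.append)

# The application only talks to the broker; it never knows who is connected.
delivered = broker.publish("user:42", "new message")
```

The point of the shape is that the application publishes without knowing which front end holds which connection, so front ends can be added or rewritten in another technology without touching the application.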


s/GIT et al/GIL et al/


This is a nice writeup that gives some good insight into what you're up against if you want to implement Comet today. It's a cool idea, but there's really nothing out there that you can use to do it off the shelf. It's great that he's gone through and actually tested out a bunch of these candidate technologies under load, so the next guy will at least know what not to try.

I have faith that Comet will have its day eventually, but clearly it's not quite time. With the current crop of production quality web servers, handling 100k requests per second (polling) is a solved problem. Keeping 100k simultaneous connections alive (comet) is by no means solved. It'll be cool when it happens though.


>> "I have faith that Comet will have its day eventually, but clearly it's not quite time."

??? Many many big sites are already using comet, and have been for some years, so I'm not sure I understand your comments.


Correct. If you're Google, and can devote a team to build and maintain a webserver specifically designed to handle Comet, then you can use Comet.

If you're small and need to use off-the-shelf tech to get your thing up and running quickly, you're still best off going with Polling today.

A quick example would be to look at Thinkature vs. Twiddla. Thinkature had a team of two, one of whom spent an entire year building a web server from the ground up so that they could use Comet. Twiddla only has me as a developer, and I'd rather spend my time improving the product, so we use Polling.

One year later, Twiddla has a ~300ms lag between when you draw a line and when it appears on a remote screen, whereas Thinkature is out of business. I don't think it's that black and white of a tradeoff, but hopefully you see the thrust of what I'm saying.

One day soon, there will be a mod_comet for Apache, or IIS 9 with Comet support built in. That will be the day it makes sense for small, lean teams to build a business around it.


I spent a year or so building my own comet server. Because I did that, I'm able to finely control every single element of it. I know in absolute detail how it works, what sucks, what rocks, etc.

Obviously you have to decide how important things are to your success, and decide whether you should build it yourself or use some existing code out there. For me, it was a no-brainer.

mod_comet is missing the point. Apache is the issue when using comet. Apache is what needs fixing/replacing.

If Mibbit was using polling, my bandwidth bill would be through the roof.
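
For a rough sense of why polling can get expensive, here is a back-of-the-envelope sketch. Every number in it (client count, poll interval, bytes per empty poll) is a made-up assumption for illustration, not a figure from Mibbit or any other site.

```python
def polling_overhead(clients, interval_s, bytes_per_poll):
    """Return (requests per second, bytes per day) spent on polls alone."""
    requests_per_second = clients / interval_s
    bytes_per_day = requests_per_second * bytes_per_poll * 86_400
    return requests_per_second, bytes_per_day


# Hypothetical inputs: 10,000 clients, each polling every 2 seconds,
# with about 500 bytes of request and response headers per poll.
rps, per_day = polling_overhead(clients=10_000, interval_s=2, bytes_per_poll=500)
print(rps, per_day / 1e9)  # 5000.0 requests/s, 216.0 GB/day
```

Under those assumptions most of that traffic carries no new data, which is the cost a held-open connection avoids.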


Sounds like we're in agreement then.

If you want to use Comet today, you need to build something custom, and it will probably take a lot of time, but it will pay off as you describe in terms of flexibility.


I'm certainly not. iserve, yaws and mochiweb are all "off the shelf" web servers that can handle comet pretty well. I can't believe Erlang has the monopoly on lightweight web servers either.


Fair enough. Like you said elsewhere on this thread, we're down to defining "off the shelf" now, so I think we should probably call it a day.

  http://code.google.com/p/mochiweb/issues/detail?id=2
  http://www.google.com/search?q=comet+site:http://yaws.hyber.org/
  http://frihjul.net/iserve
Those are the available docs for the technologies you mentioned. Yaws has some real documentation, but not for its Comet implementation. The others give you some source code and tell you to have fun.

So yeah, it's right there on the shelf (how 'bout we settle on "perched on the edge of the shelf?") I'm just not smart enough (or motivated enough) to actually use any of it.


I still kinda disagree with your point though. There's nothing particularly unique to comet here.

If you want to make anything the best it can be, you need to build it custom, and it will probably take a lot of time, but it will pay off...


Or you could not use general-purpose web servers for edge cases; mochiweb, for example, is pretty great at handling comet requests.

axod does mibbit by himself, we (hypernumbers) use comet and are a small team, and meebo used comet from the start.

There really isn't much of a barrier with comet; it was actually a hell of a lot easier than the flash sockets setup I implemented before it.

* Probably worth mentioning that Facebook used mochiweb for their chat (off the shelf). I do find it hard to believe the only lightweight web servers around are in Erlang.


Wouldn't you say that any application thinking of using Comet would by definition be an edge case?

But yeah, I'd disagree that the barrier to Comet is low. The natural thing to compare it to is HTTP Polling, which has no barrier whatsoever beyond knowing about window.setInterval(); (is it correct to end a sentence containing code with a semicolon?:)

Twiddla went from concept to launch in ten hours, largely because I didn't need to spend any time thinking about how to handle communications. The intention was to replace Polling with Comet at some point in the future, but you know what? It just isn't anywhere near as slow or problematic as I was expecting.

Back to my original point, there are a lot of smart people (such as yourself) working on this problem. Before the year is out, I suspect that somebody will have a good, proven, out-of-the-box Comet server that you can simply drop your application onto. That's the day I'm waiting for.


Heh, the point of all 3 of my comments is that there is a good, proven, out-of-the-box* comet server: mochiweb. I haven't tested them, but I would imagine iserve and yaws handle themselves similarly well.

* Depending on your definition of out-of-the-box: mochiweb doesn't actually enforce any "protocol" for handling comet for you. Those are application specific and reasonably trivial to code.

* I also forgot about erlycomet, which is built on mochiweb, and is a straight out of the box comet server


axod is a team of one, and mibbit apparently uses "Comet".


This article seems relevant, BTW:

http://news.ycombinator.com/item?id=334506


Interesting. I'm writing a comet server right now, using Ruby's EventMachine. I might do some tests to see how many connections I can stretch it out to; unfortunately I have the feeling that local connections are not going to quite match up to real live outgoing ones.

If I was doing it For Realz™ I reckon Erlang would be the go, still haven't gotten up to anywhere near that level of proficiency though. 37Signals recently rewrote their push server in Erlang and reported great results.


SPEED :: In defence of Java, it is very quick when done right. See Kilim[1] for examples.

SUPPORT :: On top of the speed Java has a supported hardware stack (Solaris). So if doing 100K connections is important to you, you can find engineers who can diagnose difficult problems on your whole stack.

--------------

[1] http://www.youtube.com/watch?v=37NaHRE0Sqw at 23:23 . Java is quick without Kilim too, this is just an example with stats.


Has anyone had any success with orbited?


That Omegle site uses orbited, seems to work OK with reported load of a few thousand connections.


AFAIK orbited is written in Python, using Twisted.


There was a nice write up on Erlang based Comet apps here: http://cometdaily.com/2007/12/14/getting-started-with-comet-...

Has anybody tried comet using Yaws?


Number of open connections is a little bit meaningless until they're actually under load, sending and receiving packets in a real world situation.

I'd agree, Java+NIO is a great system. Extremely fast+scalable.


Well, it's not that meaningless. Comet is basically long polling; those connections aren't really doing much of anything until something turns up on the back end to be pushed down them, at which point the connection is usually closed and then re-requested by the client. Maybe with the occasional ping, I've seen some people do that too.

How much memory and CPU is required to simply hold open the connection and sit there is definitely of interest to people. Of course, I'd like to see real world tests too, but that might be rather difficult to pull off without specially written, extremely efficient code on the test side.
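
The long-polling pattern described above (park the request, answer it when something turns up, let the client re-request) can be sketched with asyncio. This is only a toy model under stated assumptions: `LongPollChannel` and `demo` are hypothetical names, and the "clients" are coroutines rather than real HTTP connections.

```python
import asyncio


class LongPollChannel:
    """Parks waiting clients until a message is published (hypothetical helper)."""

    def __init__(self):
        self._waiters = set()

    async def wait(self, timeout):
        """Park one client; return the message, or None if the timeout expires."""
        fut = asyncio.get_running_loop().create_future()
        self._waiters.add(fut)
        try:
            return await asyncio.wait_for(fut, timeout)
        except asyncio.TimeoutError:
            return None  # the client would simply re-request
        finally:
            self._waiters.discard(fut)

    def publish(self, message):
        """Answer every parked request; return how many were answered."""
        delivered = 0
        for fut in list(self._waiters):
            if not fut.done():
                fut.set_result(message)
                delivered += 1
        self._waiters.clear()
        return delivered


async def demo(n_clients=1000):
    channel = LongPollChannel()
    clients = [asyncio.create_task(channel.wait(timeout=5)) for _ in range(n_clients)]
    await asyncio.sleep(0)  # give every client a chance to park itself
    delivered = channel.publish("something turned up")
    results = await asyncio.gather(*clients)
    return delivered, results
```

Calling `asyncio.run(demo())` parks the clients, publishes once, and returns the delivery count plus what each client received. While parked, each client costs one future and one suspended coroutine, which is the "sit there and do nothing" state the comment is asking about.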


Sure. It's the "basic" level requirement. Holding open a tcp connection should be pretty cheap though unless something is badly wrong.
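
As a small illustration of why idle connections can be cheap, here is a sketch using Python's standard `selectors` module, which picks a readiness mechanism such as epoll where available. The function name and the use of local socket pairs in place of real client connections are assumptions made so the example is self-contained.

```python
import selectors
import socket


def idle_connections_demo(n=100):
    """Hold n idle socket pairs open, then wake exactly one of them."""
    sel = selectors.DefaultSelector()
    pairs = [socket.socketpair() for _ in range(n)]
    for i, (server_side, _client_side) in enumerate(pairs):
        server_side.setblocking(False)
        sel.register(server_side, selectors.EVENT_READ, data=i)

    # Nothing has been sent yet, so no connection needs attention.
    idle = sel.select(timeout=0)

    # One client sends data; only that connection is reported as ready.
    pairs[n // 2][1].sendall(b"ping")
    ready = [(key.data, key.fileobj.recv(16)) for key, _ in sel.select(timeout=1)]

    for server_side, client_side in pairs:
        sel.unregister(server_side)
        server_side.close()
        client_side.close()
    sel.close()
    return idle, ready
```

A single thread watches all the sockets and only does work for the one that has data, so the per-connection cost while idle is mostly the kernel's socket state. This says nothing about behaviour under real traffic, which is the other commenter's caveat.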


Not knowing much about Java's NIO, how do you handle 'long running calculations'? One of the great things about Erlang is that you can sit there doing some big huge long loop, and it won't block the rest of the system because it's got an internal scheduler. One way to do that kind of thing with C or Java is simply to manually divide your big long calculation into smaller chunks that yield control at regular, brief intervals. Does Java provide anything higher level?


What sort of long calculations? If you're doing massive number crunching, you could just have a queue->thread->callback for it if it's not easy to break up into bite size chunks...

I don't know of anything else built in.
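
The two approaches mentioned in this exchange (slicing a long calculation into chunks that yield, or handing it to a thread and getting called back) look roughly like this in Python's asyncio. This is a sketch of the general techniques, not of Java NIO; the function names are hypothetical.

```python
import asyncio


async def chunked_sum(n, chunk=10_000):
    """Sum range(n) in slices, yielding to the event loop between slices."""
    total = 0
    for start in range(0, n, chunk):
        total += sum(range(start, min(start + chunk, n)))
        await asyncio.sleep(0)  # let other connections be serviced
    return total


def blocking_sum(n):
    """The same calculation written as one uninterrupted call."""
    return sum(range(n))


async def offloaded_sum(n):
    """Run the blocking version on the default thread pool and await the result."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_sum, n)


async def demo():
    ticks = 0

    async def heartbeat():
        # Stands in for "the rest of the system" that must not be blocked.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0)

    hb = asyncio.create_task(heartbeat())
    a = await chunked_sum(100_000)
    b = await offloaded_sum(100_000)
    hb.cancel()
    return a, b, ticks
```

Running `asyncio.run(demo())` returns both sums and the number of heartbeat ticks that got in while they were computed. In CPython the thread-pool route still contends for the GIL on CPU-bound work, so a process pool may be the better fit there; the chunked route needs no extra threads but relies on the author remembering to yield.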


Ok, fair enough. What's neat about Erlang is that you just write your code and don't have to worry about divvying it up, or sending stuff off to a queue. I suppose one possible disadvantage of the Erlang way is that if the scheduler is divvying resources up equally, you could be 'thrashing' between too many "processes", rather than working your way through a FIFO queue that's only going to work on so many things at one time. But at that point you could just write a queue system in Erlang...


http://www.alexa.com/siteinfo/douban.com

AFAIK douban.com seems to be the largest (pure) Python site.


Douban uses a lightweight python framework called Quixote. http://www.mems-exchange.org/software/quixote/

Here's the mail sent about two years back to the quixote users list: http://mail.mems-exchange.org/durusmail/quixote-users/5657/


Hmm, was twisted not the next big thing that everyone was rewriting their app in? What does he mean, it does not scale?


Not a python programmer here and I'm only guessing after a cursory look recently. Twisted developers are layering a higher-level "architecture" over the basic system calls they have wrapped. It's just a few calls down there (serial and scatter-gather I/O, multiplexing, etc.) but twisted adds higher level data structures, and design-patterny conceptual models to shield the pythonista from Unix.

Some of that has got to add some fat, which might be OK for the great majority of people, but the few who need raw performance might end up cutting some bacon off.

That dude at amix.dk actually knows his stuff.


For the non UK people 100.000 = 100,000 or 100 000 if you want the ISO version.


We use 100,000 in the UK too - 100.000 is used by some European countries (Spain is one, not sure about the others).


FYI, in Norway we use 100,000 for 100 and 100 000 for 100000 (no dots).


No, the poster was just being very precise.


Is the actual C++ implementation behind Java's NIO strictly better than nginx's? Is it using epoll() and sendfile()? Aren't all those Java abstraction layers overhead?

It seems like yet another attempt to put some more air into java's bubble.



