I find it amusing that TCP was designed as, and still is, a full-duplex protocol for long-lived connections (streams). Then someone decided to use it in HTTP (1.0) for short-lived, half-duplex, message-oriented communication. Not surprisingly, it was not a good fit. Fast-forward a couple of years and everybody is busy de-crippling TCP in HTTP with various hacks that stretch and delay the HTTP request-response cycle, thereby uncovering the streams in the underused underlying protocol. Only now, many years later, is the mistake properly corrected and TCP on the web vindicated through WebSockets. Funny world.
Wait a little longer and, if we are very lucky, multicast will be "re-discovered" as well. That way, instead of having servers shove data down many sockets to many clients, they will write their payload to one multicast socket for all interested clients, with the payload crossing the network only as far and as often as necessary, but no more.
You're speaking of HTTP: the protocol on which a great deal of the world's computing infrastructure is built. The protocol over which we are communicating with each other right now. HTTP on TCP is extremely successful, achieves everything it was designed for, and has been extensible enough to handle much more. Many basic problems map well onto stateless request-response cycles: file serving, RPC. There was no "mistake" - it was not a bad fit. There are only expanding use cases involving existing software infrastructure, for which WebSockets is a solution. That's not to say that HTTP is perfect... It could be better in many ways - but it's ridiculous to write off its success and call it a mistake.
I am not dismissing HTTP. The mistake I referred to was limiting TCP usage on the web by allowing only the request-response-oriented HTTP protocol. HTTP is opinionated, and its REST-based architecture was explicitly chosen to disallow partial updates of web pages, thus crippling TCP. But if you want to build "thick client" type apps, the restrictions of HTTP weigh you down hard. Allowing unrestricted TCP usage makes a lot more sense in that case, and that is exactly what WebSockets delivers.
One can argue that we were better off with the pure REST web before XmlHttpRequest - in fact I am inclined to agree - but the crippling of TCP was bound to be cracked. The real humor is that the crack is now re-launched as a great new feature, when it was explicitly excluded precisely because of the breakdown of REST that it causes.
You say HTTP crippled TCP so as not to break down REST. HTTP 1.0 does not seem to have been designed for REST; REST wasn't really published until quite a while later. Calling HTTP "REST-based" seems a bit of a stretch.
On top of that, could you give a few examples of what you actually mean? Like what would a partial update be?
TCP seems like a perfect fit for the underlying transport for a request/response model; I don't see how choosing it is some sort of deliberate "crippling". What should they have done? Built a custom request/response protocol on top of IP? Isn't that just crippling IP's flexibility as a layer 3 protocol? I don't understand your objections.
Roy Fielding was involved in HTTP 1.0 and I believe that REST was a generalization of some of the principles used therein. I explained the breakdown in another comment, see http://news.ycombinator.com/item?id=4033717
I'm not saying that TCP was a super-bad choice, just that they wanted only a subset of the features and got a bit more than they wanted. Also see http://news.ycombinator.com/item?id=4033822 .
Back in the day internet connections were so terrible that it was unbearable to do much more than to download a document and display it. Hence, HTTP and HTML. Arbitrary TCP streams existed but in practice performance and latency were very limiting, so nobody cared that HTTP had no support for streaming.
Eventually the underlying tech matured and the standards followed suit. Slowly. But that's the price of interoperability.
I too find it amusing how these concepts keep getting "rediscovered", but in hindsight it's not surprising.
Back then TCP was the best mainstream choice. UDP would have looked attractive because of its message orientation, but its limited message size and its lack of re-ordering and congestion control would have disqualified it almost immediately. TCP looks better: it gives you reliability, re-ordering and rate control, and the only thing missing is a way to carve the stream into discrete messages. They solved that by using one TCP connection per message.

What they missed, however, is that connections are expensive in TCP. There is a lot of handshaking, and TCP probes the link cautiously before opening the throttle, something you notice when downloading a big file. Throwing away connections like that is a waste; ideally you want one connection per server and reuse it for multiple requests. That came with persistent connections (Connection: keep-alive, made the default in HTTP 1.1).

But TCP still has significant per-connection overhead and a slow rate control mechanism, and it does not let multiple concurrent requests share the same connection. These problems are addressed by SCTP, which is message oriented, a really good fit for HTTP, and awesome in general. SPDY is a similar but less general alternative.
But the point I was trying to make is that TCP was intentionally crippled by HTTP to achieve REST. Then people hack around those restrictions and declare a new invention.
He seems to play up the problems with comet-style communication quite a bit. Most of his objections, and seemingly the entirety of his 'showdown', are based on using the slowest comet implementations, which do repeated HTTP requests, rather than the obvious and, as far as I know, most commonly used streaming approach. Honestly, I see little difference between streaming comet communication and WebSockets in terms of performance/overhead.
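For reference, the streaming variant looks roughly like this (a minimal sketch: the /stream endpoint and the handleEvent callback are made up, and the server is assumed to keep one response open and flush one event per line onto it):

    function startStream() {
        var xhr = new XMLHttpRequest();
        var seen = 0; // how much of the response body has been processed

        xhr.open("GET", "/stream", true);
        xhr.onreadystatechange = function () {
            // readyState 3 fires repeatedly as the server flushes more data
            // onto the same open response, so there is no new HTTP request
            // (and no new set of headers) per message.
            if (xhr.readyState >= 3) {
                var text = xhr.responseText;
                var end = text.lastIndexOf("\n"); // consume complete lines only
                if (end >= seen) {
                    text.slice(seen, end).split("\n").forEach(function (line) {
                        if (line) handleEvent(line); // your application callback
                    });
                    seen = end + 1;
                }
            }
            if (xhr.readyState === 4) {
                // the long-lived response eventually ends (proxy timeout,
                // server restart, ...), so just reopen it
                setTimeout(startStream, 1000);
            }
        };
        xhr.send();
    }

Compared with repeated polling, the per-message overhead here is essentially just the payload, which is why the gap to WebSockets is small.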
And why does the "Complexity of comet applications" diagram show "RIA client app" (doesn't this have to be built when using WebSockets too?), "Silverlight or Flash plugin" (as if these are necessary for comet), and some convoluted server-side architecture that has nothing to do with the client-server protocol? Again, it seems like playing up the deficiencies of comet-style apps in a somewhat disingenuous way.
WebSockets seem to be a great step forward in almost every way (cross-platform support is currently missing), so why hype them with imagined performance wins from unrealistic comparisons with other solutions?
Yes, and then you get down to the Kaazing sales pitch, where a diagram shows your packets just going out to the Internet, and there's no longer any management to do on the client.
A well written article otherwise though. I appreciated the reminder to audit the headers you're sending out as an easy way to improve performance.
Looks like an advertorial. No mention of socket.io, and plenty of praise for a commercial solution I hadn't heard of after almost two years of working with WebSockets.
I'm not a big fan of socket.io; it seems too complex for my needs. I was quite happy to discover txWS: it's literally one more function call, and you can write ordinary Twisted code with no changes, which is a relief.
Update: I went and bought websocket.us and websockets.us, and put up my own small website about WebSockets. I just want to provide an alternative to Kaazing's site, that has no commercial focus.
Man I hate these undated Internet articles. Publishers use it to sneakily squeeze out a few extra page views from stale content. Hate it when I fall for that.
Besides support in recent browsers, anything else inaccurate in it?
The sales pitch is in the last paragraph. I thought the 90% of the article before that was a pretty good, neutral overview of WebSockets versus the earlier two-way alternatives.
Their claim that "HTML5 Web Sockets can provide a 500:1 or—depending on the size of the HTTP headers—even a 1000:1 reduction" is also wrong, since they don't account for TCP/IP frame overhead. Sure, it's still around 50:1, and sockets are awesome. But if you present numbers, please don't present misleading numbers.
I get the basic idea, but when I try to flesh it out into a full cluster and app design, long-polling/WebSockets seems like a bit of a complexity nightmare on the back end. You're taking a highly cacheable, stateless protocol and turning it into an uncacheable one. Then you're taking a shared-nothing application layer and forcing it to do shared-state cache coherence via message passing. And all the way up and down the stack you're taking what was overwhelmingly treated as a request/response model and turning it into something different. Every cache, proxy, application firewall, traffic shaper, load balancer and IDS on both sides is going to get confused, and that nets out to a ton of user complaints and corner cases.
There is no requirement to use polling or sockets. If it isn't relevant to your users or server side then don't do it.
But the bar now is that users expect their displays to update promptly and automatically, and they don't care how hard it is for you to implement. You don't need the level of complexity you think. For example, there is no need to tell the client the details of an update, or even that one has definitely happened; all it needs to know is that there could be an update. It can then go off over the regular HTTP connections to see what is new/relevant, and that traffic goes through your existing servers/caches/load balancers etc.
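As a sketch of that pattern (the URLs and the render function are made up), the push channel is nothing but an invalidation signal, and the data itself still travels over ordinary, cacheable HTTP:

    var ws = new WebSocket("ws://example.com/notify");

    ws.onmessage = function () {
        // The notification carries no payload we act on directly; it only
        // means "something may have changed", so re-fetch over plain HTTP.
        var xhr = new XMLHttpRequest();
        xhr.open("GET", "/api/latest", true); // goes through caches, LBs, etc.
        xhr.onload = function () {
            render(JSON.parse(xhr.responseText)); // your app's update function
        };
        xhr.send();
    };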
Exactly. Ajax in general breaks REST, including bookmarking, caching, navigation and more. All the major selling points of HTTP go out the window. HTTP was created on top of TCP, but with severe restrictions, to get the great features HTTP is well known for. Now someone re-launches unrestricted TCP as a feature, when it is in fact much simpler and formed the base all along. Its use was purposely off limits because of the headaches that unrestricted client-server communication causes.
If your assertion is that developers can write bad code, abuse HTTP semantics, defeat caches, break navigation etc then so what? With the ability to do things right comes the ability to mess them up.
I am saying that when you use Ajax to partially update a page, the updated page no longer has a URL that identifies it. This badly breaks HATEOAS. The idea of REST is that every state on a site has a representation - REpresentational State Transfer, you know... So while there are many other ways of shooting yourself in the foot when designing a REST website, using Ajax makes it almost inevitable.
I'm wondering if bookmarking state would have made it into webapps had we not been forced to use HTTP because that is all that was available in the browser for years.
Just to let you all know that this is an old article (agree that a publish date would have helped there). Frank and I published this article originally in late 2009 or early 2010. That was around the time that WebSocket first landed in Chrome and there have been a lot of updates to the protocol since then. For example, WS was text only, and I don't think socket.io existed yet ;-)
"..., it cannot deliver raw binary data to JavaScript, because JavaScript does not support a byte type"
Incorrect. JS most definitely supports binary data, via typed arrays such as Uint8Array, or via canvas pixel data.
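For example, in browsers that implement the later protocol revisions with binary frames, something like this works (the endpoint URL is made up):

    var ws = new WebSocket("ws://example.com/binary");
    ws.binaryType = "arraybuffer"; // receive frames as ArrayBuffer, not string

    ws.onopen = function () {
        ws.send(new Uint8Array([0x01, 0x02, 0xff])); // send raw bytes
    };

    ws.onmessage = function (event) {
        var bytes = new Uint8Array(event.data); // wrap the incoming bytes
        console.log("got " + bytes.length + " bytes, first: " + bytes[0]);
    };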
WebSocket isn't just to solve push notifications. It's also for realtime communication, for example chat or multiplayer games. We do have server-sent updates if you just want push notifications.
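For the push-only case, server-sent events are about as simple as it gets (the /updates URL is just an example): the server keeps a text/event-stream response open, and the browser handles reconnects for you.

    var source = new EventSource("/updates");

    source.onmessage = function (event) {
        // one-way push only; there is no client-to-server channel here
        console.log("update: " + event.data);
    };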
Yes, without compression. Everything you said is covered by SPDY. But SPDY goes a step further and also improves the current state of HTTP itself, so plain document transfer gets better too.