It's interesting that when it comes time to scale to serve enormous loads, you h...

jshen · on Nov 8, 2012

It's not primarily about the size of traffic, but the ability or inability to cache. At work we serve a ton of traffic with MRI ruby and 3 small VMs. Most requests are served by varnish and never hit the ruby stack. Most people do a terrible job at caching (edit: I'm not saying that twitter is bad at caching).

kayoone · on Nov 8, 2012

Twitter is hugely write intensive and needs realtime data, so their caching needs are probably vastly different from yours.

jshen · on Nov 8, 2012

That was my point. Their caching needs are vastly different from almost everyone!

bryanlarsen · on Nov 8, 2012

Agreed. I've got a page that takes 20 seconds to render on a dedicated fast machine. I could spend some time optimizing it, but even then it would still be slow. However, it's cacheable so I can fix performance problems with a simple post-deploy script. We can't do that for every page, but it's awesome when we can.

vidarh · on Nov 8, 2012

No, you don't need to change fundamental parts. You need to get the architecture right. Then you need to determine whether the potential cost savings of switching languages are worth it for you. If things start falling over when you scale, it's an architecture problem, not a language problem.

Why do I say that? Because in Twitters case, handling tweets is a "trivial" problem to parallelise, and the potential savings of switching language will be a rounding error in terms of scaling their system compared to getting the architecture right.

(To scale Twitter: Make trees. Split the following list into suitably wide trees, and your problem has now been reduced to an efficient hierarchical data store + efficiently routing messages. Both are well understood, easy to scale "solved" problems)

randall · on Nov 8, 2012

Would love to see someone at twitter refute this... seems too simple to be correct. :)

vidarh · on Nov 8, 2012

Keep in mind I gave it as an example that is conceptually simple to demonstrate that scaling it with pretty much any technology is possible. It is still a lot of work, and expensive, and likely far from optimal.

The approach I suggested would create massive message spikes, and there's undoubtably optimizations. E.g. 30 million followers for the top users is a lot, but most of them don't tweet all that often. Perhaps it's better to keep the recent tweet of the top 1000 (or 100,000) users in memory in memcached's and "weave" them into the timelines of their followers instead of trying to immediately stored the in timelines for each user.

The point is scaling is pretty much never a language problem once your needs are large enough that they'll exceed a server or even a rack no matter how efficient the language implementation is.

At that point the language choice is an operational cost vs. productivity issue down to whether or not the CPU usage differences are sufficiently high to cost you enough to offset any productivity advantages you might get from the slower language implementation.

That is a valid concern. It's perfectly possible that Twitter is right in switching in their case. E.g. even a 10% reduction in servers could pay for a lot of developer hours when your server park gets large enough.

If you're on the tipping point where switching to a faster language implementation can keep you from having to scale past a single machine, then it's slightly different. But arguably if you're in that position, you should still plan for growth.

firlefans · on Nov 8, 2012

"Python is fast enough for our site and allows us to produce maintainable features in record times, with a minimum of developers," said Cuong Do, Software Architect, YouTube.com.

There are many dynamic user driven sites which have scaled well (far less downtime than Twitter) without switching to static compilation.

bad_user · on Nov 8, 2012

You can argue that YouTube is less dynamic than Twitter. When a user posts a message on Twitter, that message has to be pushed to all the users that follow that user and so every user has a personalized stream of incoming tweets that has to be updated, not in realtime, but with small latency nonetheless.

Also, Twitter doesn't have Google's infrastructure.

In regards to "static compilation", that's not the important bit, but rather the performance of the virtual machine. The JVM, at its core, is not working with static types. The bytecode itself is free of static types, except for when you want to invoke a method, in which case you need a concrete name for the type for which you invoke that method ... this is because the actual method that gets called is not known, being dispatched based on "this", so you need some kind of lookup strategy and therefore the conventions that Java uses for that lookup are hardcoded in the bytecode. However, invokeDynamic from JDK 7 allows the developer to override that lookup strategy, allowing one complete dynamic freedom at the bytecode level, with good performance characteristics.

The real issue is the JVM versus the reference implementations of Ruby/Python. The JVM is the most advanced mainstream VM (for server-side loads at least).

Unfortunately for Facebook, they didn't have a Charles Oliver Nutter to implement a kickass PHP implementation on top of the JVM - not that it's something feasible, because PHP as a language depends a lot on a multitude of C-based extensions. The more pure a language is (in common usage), the easier it is to port to other platforms. Alternative Python implementations (read Jython, IronPython) have failed because if you want to port Python, you also have to port popular libraries such as NumPy. Which is why the PyPy project is allocating good resources towards that, because otherwise nobody would use it.

jawher · on Nov 8, 2012

> The JVM, at its core, is not working with static types. The bytecode itself is free of static types, except for when you want to invoke a method, in which case you need a concrete name for the type for which you invoke that method ...

Not sure what gave you this impression, as the majority of Java bytecode instructions are typed. For example, the add instruction comes in typed variants: iadd (for ints), ladd (for longs), dadd (for doubles), fadd (floats), etc.

The same is true for most other instructions: the other arithmetic instructions (div, sub, etc.), the comparison instructions (*cmp), pushing constants on the stack, setting and loading local variables, returning values from methods, etc.

http://en.wikipedia.org/wiki/Java_bytecode_instruction_listi...

InvokeDynamic, as you point it out, was added to make implementing dynamic languages on the JVM easier, because the JVM was too statically typed at its core.

bad_user · on Nov 8, 2012

Arithmetic operations on numbers are not polymorphic, but polymorphism has nothing to do with static typing per se. You're being confused here by the special treatment the JVM gives to primitives, special treatment that was needed to avoid the boxing/unboxing costs, but that's a separate discussion and note that hidden boxing/unboxing costs can also happen in Scala, which treats numbers as Objects.

Disregarding primitives, the JVM doesn't give a crap about what types you push/pop the stack or what values you return.

invokeDynamic is nothing more than an invokeVirtual or maybe an invokeInterface, with the difference that the actual method lookup logic (specific to the Java language) is overridden by your own logic, otherwise it's subject to the same optimizations that the JVM is able to perform on virtual method calls, like code inlining:

http://cr.openjdk.java.net/~jrose/pres/200910-VMIL.pdf

> ... because the JVM was too statically typed at its core

Nice hand-waving of an argument, by throwing a useless Wikipedia link in there as some kind of appeal to authority.

I can do that too ... the core of the JVM (the HotSpot introduced in 1.4) is actually based on Strongtalk, a Smaltalk implementation that used optional-typing for type-safety, but not for performance:

http://strongtalk.org/ + http://en.wikipedia.org/wiki/HotSpot#History

jawher · on Nov 8, 2012

> Nice hand-waving of an argument, by throwing a useless Wikipedia link in there as some kind of appeal to authority

No need to get agressive over this :) I disagreed with your first comment regarding the dynamic nature of the JVM, and replied trying to explain why.

I posted the wikipedia link not as kind of an "appeal to authority", but to give the readers a full listing of bytecode instructions, so that they can check what I was saying for themselves.

> Disregarding primitives, the JVM doesn't give a crap about what types you push/pop the stack or what values you return.

It depends how you see things: the JVM can't possibly provide instructions for every possible user type, so apart from primitives, the other object types are passed around as pointers or references, but whenever you try to do something other than storing/loading them on the stack, the type checking kicks in, ensuring that the reference being manipulated has the right types.

For instance, the putfield instruction doesn't just take the field name where the top of the stack is going to get stored. It also takes the type of the field as a parameter, to ensure that the types are compatible.

Constrast this to Python's bytecode, where the equivalent STORE_NAME (or the other variants) doesn't ask you to provide type informations.

But then again, we might be splitting hairs here: since this type checking is happening at runtime (when the JVM is running your code), you could indeed question calling it "static typing", which is usually performed at compile time (an is partially performed by the java compiler for example).

michaelkscott · on Nov 8, 2012

I don't think pushing tweets to every user is a scalable approach. It would seem more sensible to read rather than push tweets associated with a user. I don't know about Twitter's infrastructure, but storing more than 1 copy of the same data wouldn't make that much sense.

irahul · on Nov 8, 2012

> I don't think pushing tweets to every user is a scalable approach. It would seem more sensible to read rather than push tweets associated with a user. I don't know about Twitter's infrastructure, but storing more than 1 copy of the same data wouldn't make that much sense.

Suppose I have 5000 followers. How do you propose to get the timeline? Since you said "single copy of data", I assume you would do something like this:

https://github.com/mitsuhiko/flask/blob/master/examples/mini...

At twitter scale, that query will be a major bottleneck.

jQueryIsAwesome · on Nov 8, 2012

When a user posts a video it gets published in the subscription feed of every user subscribed(maybe not with the ajax like twitter does); plus they tell you which videos have you already watched. Plus the search feature is probably more used than in Twitter.

adamsmith · on Nov 8, 2012

Facebook also has many more features, which probably means more code, which probably means larger switching costs. Take anything: photos, privacy logic, news feed, etc., and Facebook has the more sophisticated feature set.

rjzzleep · on Nov 8, 2012

it's funny that this is even an issue anymore. at the time where twitter was started, the ruby vm has huge issues, and you couldn't even do evented io.

what I find even funnier is that people just throw in play and grails without a little of experience in rails. The eco system is entirely different:

- you can use java and therefore java libraries(yes you could do the same in jruby, but nvm) - bundles/gems are an order of magnitude better than classical java dependency hell. - need something in rails? add a gem. need something in grails? search throw the outdated plugins, search for missing documentation(just take a look at stackoverflow). in general -> write it yourself or pay a consultant to do it. - have a question? fight with incomplete documentation.

on top of that grails is just a stack on top of the spring mvc.

what about play? i like play more than I like grails tbh, because it doesn't want to be the rails of java. yes people compare it to one another, but that's just the familiarity effect.

now, where's the computationally intensive stuff? nowhere to be found. it's a web api. where's the computationally intensive stuff in twitter? I don't know, but chances are theres a native extension for that nowadays.

There actually was a time when you simply could not build a scalable system in ruby without too many hoops, but that's no longer the case. Yes the GIL is bad, but keep in mind that things like ruby fibers didn't even exist at the time.

irahul · on Nov 8, 2012

> i like play more than I like grails tbh, because it doesn't want to be the rails of java. yes people compare it to one another, but that's just the familiarity effect.

My memory might be playing tricks on me, but I remember play developers talking about rails being a huge influence.

If Play isn't rails for Java, how else one does a Rails for Java? The "familiarity effect" is there because Play is modeled after Rails.

> now, where's the computationally intensive stuff? nowhere to be found. it's a web api. where's the computationally intensive stuff in twitter? I don't know, but chances are theres a native extension for that nowadays.

Search, for one is computationally intensive. I am pretty sure there are more which the outside world doesn't know about.

> There actually was a time when you simply could not build a scalable system in ruby without too many hoops,

What scale are we talking about? At Twitter scale, ruby or anything else has to jump hoops. For example, network load from users coming online and offline on FB chat will break out of box solutions.