I am trying to get into distributed computing, so this article is particularly interesting to me. I may be mistaken, so please excuse my naivety if my points are off the mark. I thought MPI was mainly geared towards communication-heavy tasks where the underlying network is specialized, for example InfiniBand or the bus between CPUs. One use of MPI is to manage distributed-memory tasks between different physical CPUs while threads run on the multiple cores of the same CPU. Spark, I believe, doesn't handle cases like this well because the JVM hides the low-level details. I have read papers that propose to layer MPI over RDMA rather than expose a flat memory model, which came as a surprise to me but shows the flexibility of MPI. One thing unclear to me is what performance we can expect from MPI over commodity network gear, and how it compares to Spark. The article is absolutely correct that MPI leaves robustness to the user, and today that is an oversight.
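To make the mental model concrete, here is a minimal sketch of MPI-style message passing, assuming the mpi4py bindings (real HPC codes would typically be C or Fortran); the payload and tags are arbitrary:

    # Run with e.g.: mpiexec -n 2 python pingpong.py
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    if rank == 0:
        # Each rank owns its own memory; data moves only via explicit messages.
        comm.send({"payload": list(range(5))}, dest=1, tag=0)
        reply = comm.recv(source=1, tag=1)
        print("rank 0 got:", reply)
    elif rank == 1:
        data = comm.recv(source=0, tag=0)
        comm.send(sum(data["payload"]), dest=0, tag=1)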
The modern Hadoop ecosystem is designed for a different workload from MPI's. It emphasizes co-locating data and computation, seamless robustness, and trades raw power for simple programming models. MapReduce turned out to be too simple, so Spark implements graph execution, which is nothing new to HPC. As far as I know, Spark's authors don't believe it is ready for distributed numerical linear algebra yet. But as a counterpoint, I am seeing machine learning libraries built on Spark, so perhaps things are improving.
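For contrast, here is a rough sketch of that model in PySpark (the app name and numbers are just placeholders): transformations lazily build a graph of work, and nothing executes until an action is called.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("lineage_demo").getOrCreate()
    sc = spark.sparkContext

    nums = sc.parallelize(range(1_000_000), 8)

    # Transformations only record an execution graph; nothing runs yet.
    squares = nums.map(lambda x: x * x)
    evens = squares.filter(lambda x: x % 2 == 0)

    # The action triggers the DAG scheduler; lost partitions can be recomputed
    # from lineage, which is the "seamless robustness" mentioned above.
    print(evens.count())

    spark.stop()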
One thing I have learnt today is that MPI isn't gaining popularity. I just have a hard time picturing a JVM language in overall control in HPC, where precise control of memory is paramount to performance.
> I thought MPI was mainly geared towards communication-heavy tasks where the underlying network is specialized
The beauty of MPI is:
* its definition is completely open
* it separates the high-level message-passing interface from the low-level transport
This means that code written on cheap old commodity network gear over TCP/IP will work on brand-new specialised hardware using its own protocol. Because the standard is fully open, any hardware vendor can provide an MPI driver for their hardware at virtually no cost.
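A quick sketch of what that portability means in practice (using mpi4py here purely for brevity): nothing in the program names the transport, so the same script runs over TCP on Ethernet or over InfiniBand depending only on how the MPI library underneath was built.

    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    # Each rank contributes a value; the MPI library picks the wire protocol.
    total = comm.allreduce(rank, op=MPI.SUM)

    if rank == 0:
        print("sum of ranks:", total)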
I'm not sure... My only exposure to HPC was the installation of a cluster with MPI over InfiniBand. At the time I looked into MPI a bit and played with it at home (over WiFi and Ethernet).
I agree that languages that rely on tracing GC seem like they're fundamentally at a disadvantage when it comes to pushing the envelope of single-node performance; the best article I've read arguing this was actually in the context of mobile games, rather than HPC, but I can't for the life of me find the article now.
I don't know if Spark itself is the right way forward, but it's an example of a very productive high-level framework for certain forms of distributed-memory computing. And some of these issues, like the JVM, aren't fundamental to Spark's approach; there's no inherent reason why something similar couldn't be built on C++ or the like.
I'd be very wary of criticising Spark based on the technologies it's built on.
In particular, a decent garbage collector gives you collection costs that depend on the number of live objects (typically low) rather than on the number of allocations and deallocations, as they would in a non-garbage-collected language. This gives great allocation performance and reduces overhead.
The disadvantages can be (potentially long) GC pauses and higher overall memory requirements, but in practice these aren't usually a problem for non-interactive systems.
Of course, if you do have a device with low memory and low tolerance for GC pauses (e.g. mobile gaming) there might be a problem.
The main disadvantage seems to be less predictable performance, which could be a problem in domains that require tight timing guarantees, but that's not really Spark's problem.
A GC'd language is also generally easier to program in, since one doesn't usually have to worry about memory management, which makes it a lot easier to build very large systems with lots of moving parts.
I totally agree that the programming model of Spark is the right direction. I dream of the day when the compiler and OS cooperate to expose a simple interface to distributed memory and an optimal execution-communication system, kind of like Cilk but for clusters.
BTW, thanks for a thought-provoking article. You have given me a lot to ponder.
RDMA doesn't really provide a flat memory model; all it's really doing is minimizing copies when you send a message. It's more like "put this 100K string into that node at <address>".
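One way to see that "put" semantics concretely is MPI's one-sided (RMA) interface, which is typically what gets layered on RDMA. A small sketch with mpi4py, assuming two ranks (the window size and payload are arbitrary):

    # Run with e.g.: mpiexec -n 2 python rma_put.py
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    MSG = b"hello from rank 0"

    # Rank 1 exposes a window of memory; the other rank exposes nothing.
    win = MPI.Win.Allocate(len(MSG) if rank == 1 else 0, comm=comm)

    win.Fence()  # open an access epoch (collective)
    if rank == 0:
        # One-sided "put": write straight into rank 1's window, no recv needed.
        win.Put([MSG, MPI.BYTE], target_rank=1)
    win.Fence()  # close the epoch; the data is now visible on rank 1

    if rank == 1:
        print(bytes(win.tomemory()).decode())

    win.Free()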