But how much of that is Java, and how much is Hadoop?
Spark runs on the JVM, and runs much, much faster than Hadoop on similar workloads. (Yes, I understand it isn't just doing Map/Reduce, but the point is that Java doesn't seem to be a performance limitation in itself.)
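To make "similar workloads" concrete, here's a minimal sketch of the canonical word count, the job Hadoop MapReduce is most often benchmarked with, written against Spark's Java API. The input/output paths are placeholders, not real data:

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class WordCount {
    public static void main(String[] args) {
        // Local mode for illustration; on a cluster the master is set by the launcher.
        SparkConf conf = new SparkConf().setAppName("WordCount").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaRDD<String> lines = sc.textFile("input.txt"); // placeholder path

        // The "map" phase: split lines into (word, 1) pairs.
        JavaPairRDD<String, Integer> counts = lines
            .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
            .mapToPair(word -> new Tuple2<>(word, 1))
            // The "reduce" phase: sum the counts per word.
            .reduceByKey(Integer::sum);

        counts.saveAsTextFile("counts"); // placeholder path
        sc.close();
    }
}
```

Same JVM, same shuffle-heavy Map/Reduce shape as the Hadoop version; the speed difference comes from Spark keeping intermediate data in memory rather than spilling each stage to disk, not from the language.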
Indeed, and as I said, it did surprise me that Hadoop was so much slower. But the buck really stops at resources consumed per dollar of usable results produced, and on that measure Java is going to consume a whole lot more. At large scale, running costs far exceed development costs. BTW, my point was not only about Java but also about your assessment of the hardware.