I think the problem is that we are talking about two different things.
The very start of the paper describes the term and its methodology (which is what we are discussing), and then goes on to explain Google's own implementation using GFS (which you seem to be getting hung up on).
Keep in mind that this whole thread is about "MapReduce", which Holzle was talking about, not the more generic map and reduce that have been around since the 1800s (and they will continue mapping and reducing in their new dataflow framework, they just won't be using MapReduce). Now for the paper:
> Our abstraction is inspired by the map and reduce primitives present in Lisp and many other functional languages.
Inspired doesn't mean equivalent.
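To make that concrete, here's a rough Python sketch of the difference (my own illustration, not the paper's API, though word count is the example the paper itself uses). The functional primitives fold one sequence into one result; the MapReduce model is a pair of user-supplied functions over key/value pairs, with the framework doing everything in between.

```python
from functools import reduce as fold

# Functional-language primitives: map transforms a sequence, reduce
# folds it with a binary operator. One sequence in, one value out.
squares = list(map(lambda x: x * x, [1, 2, 3, 4]))
total = fold(lambda acc, x: acc + x, squares, 0)   # 30

# The MapReduce programming model: the user writes two functions over
# key/value pairs; the framework splits the input, shuffles by key,
# and runs reducers in parallel. (Word-count sketch, not Google's API.)
def map_fn(doc_name, doc_text):
    for word in doc_text.split():
        yield word, 1              # emit intermediate (key, value) pairs

def reduce_fn(word, counts):
    yield word, sum(counts)        # reduce sees all values for one key
```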
> Our use of a functional model with user specified map and reduce operations allows us to parallelize large computations easily and to use re-execution as the primary mechanism for fault tolerance.
They are using map and reduce as a tool to get something else.
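The value isn't the two functions themselves, it's everything the framework does around them. A toy single-process driver (again my own sketch, nothing like Google's actual distributed implementation) shows the shape of that "something else": run the map tasks, re-execute the ones that fail, group intermediate pairs by key, then feed each key's values to reduce.

```python
from collections import defaultdict

def run_mapreduce(inputs, map_fn, reduce_fn, max_attempts=3):
    # Toy stand-in for the framework: execute map tasks (re-running any
    # that fail), group intermediate values by key, then run one reduce
    # per key. The real system does this across thousands of machines.
    intermediate = defaultdict(list)
    for in_key, in_value in inputs:
        for attempt in range(max_attempts):
            try:
                emitted = list(map_fn(in_key, in_value))
            except Exception:
                if attempt == max_attempts - 1:
                    raise      # give up after repeated failures
                continue       # "re-execution as the primary mechanism for fault tolerance"
            for k, v in emitted:
                intermediate[k].append(v)
            break
    results = []
    for k, values in intermediate.items():
        results.extend(reduce_fn(k, values))
    return results

# With the map_fn/reduce_fn sketched above:
# run_mapreduce([("doc1", "a b a")], map_fn, reduce_fn) -> [("a", 2), ("b", 1)]
```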
> The major contributions of this work are a simple and powerful interface that enables automatic parallelization and distribution of large-scale computations, combined with an implementation of this interface that achieves high performance on large clusters of commodity PCs.
They are very specific about what the contribution is. All work that has claimed to be an implementation of MapReduce has followed their core tenets. Even if MPI has a reduce function, it is not MapReduce, because it is based on other techniques.
I'm really tired of people who claim there is nothing new or even significant when there clearly was. Yeah, everything is built on something these days, but so what? In the systems community, MapReduce has been a huge advance, and now we are moving on (at least for streaming).
I'm still in the camp of there being nothing new here. Now GFS may be a different matter, but that was part of a different paper, and not a requirement of this one. Which is why I have kept stating that a distributed filesystem is not a requirement.
If that's what you believe, then you are going to miss out on the last 10 or so years of systems research and improvements. And when Google stops using MapReduce but the new thing still uses map and reduce, you are going to be kind of confused.