It depends on how the benchmark was written. If it concentrates on crunching numbers and doesn't use much memory, then the caches are being hit, and caches/memory in general are where a lot of the speed gains of the last few years came from (along with the drop in price).
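As a rough illustration of that point (just a sketch, assuming a C++17 compiler; the sizes and names are arbitrary), you can time the same trivial arithmetic over a cache-resident working set and a memory-bound one:

    #include <chrono>
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    // Same amount of arithmetic over two working sets: one fits in L2/L3 cache,
    // one does not. On most machines the big one is noticeably slower per
    // element because it is limited by memory, not by the ALUs.
    static uint64_t crunch(const std::vector<uint64_t>& v, int passes) {
        uint64_t s = 0;
        for (int p = 0; p < passes; ++p)
            for (uint64_t x : v) s += x * 3 + 1;  // trivial "number crunching"
        return s;
    }

    int main() {
        std::vector<uint64_t> small(1 << 15, 1);  // ~256 KB: cache-resident
        std::vector<uint64_t> big(1 << 26, 1);    // ~512 MB: memory-bound

        for (const auto* v : {&small, &big}) {
            // Repeat the small set so both runs visit the same number of elements.
            int passes = static_cast<int>(big.size() / v->size());
            auto t0 = std::chrono::steady_clock::now();
            uint64_t sum = crunch(*v, passes);
            auto t1 = std::chrono::steady_clock::now();
            auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count();
            std::printf("%zu elements: %lld ms (checksum %llu)\n",
                        v->size(), (long long)ms, (unsigned long long)sum);
        }
    }

Both runs do the same number of additions; the only difference is whether the data streams from cache or from DRAM.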
But for a more general application this might not be true. Then again, it depends. For example, a lot of the Adobe 2D filters would gain from that, if the data is accessed the right way (swizzled if possible). But such algorithms usually fall into one of the 13 dwarfs that are easily parallelized using OpenMP, or by rolling your own threaded version - or even with CUDA/OpenCL/DirectCompute, though then you might have to pay for the CPU/GPU communication and accept some loss or instability of results across machines (different floating-point accuracy tradeoffs made for the sake of speed). A sketch of the OpenMP case is below.
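To make the embarrassingly-parallel case concrete, here is a minimal sketch of the OpenMP route for a per-pixel filter (the brightness adjust, buffer layout, and function names are my own illustration, not Adobe's actual code):

    #include <algorithm>
    #include <cstddef>
    #include <cstdint>

    // Per-pixel brightness adjust on a contiguous grayscale buffer: each
    // output pixel depends only on the matching input pixel, so rows can be
    // processed independently by different threads.
    void brighten(const uint8_t* src, uint8_t* dst,
                  std::size_t width, std::size_t height, int delta) {
        #pragma omp parallel for
        for (long y = 0; y < static_cast<long>(height); ++y) {
            for (std::size_t x = 0; x < width; ++x) {
                int v = src[y * width + x] + delta;
                dst[y * width + x] = static_cast<uint8_t>(std::clamp(v, 0, 255));
            }
        }
    }

Each row writes to a disjoint slice of dst, which is what makes the parallel-for annotation safe; compile with something like -fopenmp.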
But then the problem comes with state machines, for example LZ compression, or anything that relies on results from before. That kind of workload is very hard to data-parallelize.
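As a sketch of why (a toy LZ77-style pass; the token format, window size, and names are invented here for illustration): the position where the next step starts depends on the length of the match found in the current step, so the loop carries a dependency that a plain parallel-for cannot break.

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    struct Token { uint16_t offset; uint16_t length; uint8_t literal; };

    // Toy LZ77-style compressor: each step searches the already-processed
    // prefix for the longest match, then advances by that match length.
    // The next iteration's start position depends on this iteration's
    // result, so the iterations cannot simply be split across threads.
    std::vector<Token> compress(const std::vector<uint8_t>& in) {
        std::vector<Token> out;
        std::size_t pos = 0;
        while (pos < in.size()) {
            std::size_t best_len = 0, best_off = 0;
            std::size_t window = pos < 4096 ? pos : 4096;
            for (std::size_t off = 1; off <= window; ++off) {
                std::size_t len = 0;
                while (pos + len < in.size() && len < 255 &&
                       in[pos + len] == in[pos - off + len]) ++len;
                if (len > best_len) { best_len = len; best_off = off; }
            }
            if (best_len >= 3) {
                out.push_back({static_cast<uint16_t>(best_off),
                               static_cast<uint16_t>(best_len), 0});
                pos += best_len;  // state update: depends on the match just found
            } else {
                out.push_back({0, 0, in[pos]});
                pos += 1;
            }
        }
        return out;
    }

Real compressors typically work around this by splitting the input into independent chunks and compressing them in parallel, trading away some compression ratio.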
Well, it depends on what we want to measure with synthetic benchmarks, but general terms like "processing power" can fairly be interpreted as the general, common PC tasks that these benchmarks address.
But yeah, if your load is specific, like floating-point calculation or MP4 encoding, then it will vary. Still, it's a good shorthand to have for these kinds of conversations, especially when people go off half-cocked about how CPU progress has stalled, which it clearly has not.