
> A single core of my i7 does something like 6 or 7x the work a single P4 core does, at half the mhz.

Do you have a reference for this? I'd be surprised if it's more than 2x for general-purpose code. (It might be more for specialized stuff like video decoding, which I guess has improved hardware support these days.)




CPUmark single threaded performance:

http://www.cpubenchmark.net/singleThread.html

The new i7s hover around a rating of 2,000.

A NetBurst-vintage P4 at 1.8 GHz rates 217 (the original P4, 2001), or 358 if we move up to the 2.8 GHz model (2003). So anywhere from roughly 9x down to about 5.6x the performance, depending on model, MHz, etc.
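For anyone who wants to check the arithmetic, here is a minimal sketch that divides the scores quoted above. The 2,000 figure is only the approximate i7 rating mentioned in this comment, and the dictionary labels and the `speedup` helper are made up for illustration.

```python
# Single-thread scores as quoted in this comment (cpubenchmark.net chart).
I7_SCORE = 2000  # approximate: "around a rating of 2,000"
P4_SCORES = {
    "P4 1.8 GHz (2001)": 217,
    "P4 2.8 GHz (2003)": 358,
}


def speedup(new_score, old_score):
    """Ratio of two benchmark scores (hypothetical helper)."""
    return new_score / old_score


# Speedup of the i7 over each quoted P4 model.
ratios = {name: speedup(I7_SCORE, score) for name, score in P4_SCORES.items()}

for name, ratio in ratios.items():
    print(f"i7 vs {name}: {ratio:.1f}x")
```

These are raw score ratios, not per-clock figures; normalizing by clock speed would need the i7's actual frequency, which isn't given here.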

Note: Pentium is a brand, and later it became the budget name for C2D-era chips, but this comparison is to P4s.

No idea what cpubench uses, but it's not solely a video benchmark; it's more of a mixed bag.


It depends on how the benchmark was written. If it concentrates on crunching numbers and uses little memory, then it mostly hits the caches, and caches and memory in general are where a lot of the speed gains went in the last few years (along with a decrease in price).

But for a more general application this might not hold. Then again, it depends. For example, a lot of the Adobe 2D filters would gain from this if the data is accessed the right way (swizzled if possible). But such algorithms tend to be among the "13 dwarfs" that are easily parallelized using OpenMP or by rolling your own threaded version. Or even with CUDA/OpenCL/DirectCompute, but then you might have to pay for the communication between the CPU and GPU, and accept a loss of precision or results that differ across machines (different floating-point accuracy tradeoffs for the sake of speed).

But then the problem comes with state machines, for example LZ compression, or anything that relies on earlier results. That is very hard to data-parallelize.
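To make the contrast concrete, here is a rough sketch. The token format in `lz_decode` is a toy LZ77-style layout invented for this example, not any real codec, and `brighten` is just a stand-in for a per-pixel filter.

```python
def brighten(pixels, delta):
    """Per-pixel filter: each output depends only on its own input,
    so the work can be split across threads or GPU lanes in any order."""
    return [min(255, p + delta) for p in pixels]


def lz_decode(tokens):
    """Toy LZ77-style decoder (hypothetical token format).

    Each token is (offset, length, literal): copy `length` symbols starting
    `offset` positions back in the output decoded so far, then append
    `literal`. A token cannot be decoded until everything before it has been,
    which is the dependency chain that makes this hard to data-parallelize.
    """
    out = []
    for offset, length, literal in tokens:
        start = len(out) - offset
        for i in range(length):
            # May read symbols appended earlier in this same copy loop.
            out.append(out[start + i])
        out.append(literal)
    return "".join(out)


print(brighten([10, 250], 10))
print(lz_decode([(0, 0, "a"), (0, 0, "b"), (2, 4, "c")]))
```

The filter could process its list in any order or in chunks; the decoder has to walk its tokens strictly in sequence, because each copy reads from output that earlier tokens produced.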


Well, it depends on what we want to measure with synthetic benchmarks, but it's fair to interpret a general term like "processing power" as the general, common PC tasks that these benchmarks address.

But yeah, if your load is specific, like floating-point calc or MP4 encoding, then it will vary. Still, it's a good shorthand to have for these types of conversations, especially when people go half-cocked about how CPU progress has stalled which it clearly has not.


> people go half-cocked about how CPU progress has stalled which it clearly has not.

Okay, point taken. But how much more improvement is there to be had just from architectural improvements, with no more movement on clock speed?


You can see it by comparing results on Passmark's website, for one.

It's because of a lot of things. Perhaps the i7 has more pipelines, but significantly it also has much shorter pipelines. Stalls were really, really expensive on the P4's 30-stage pipeline. Plus there's hyper-threading. And of course the i7's got a lot more going for it in the cache department.



