Hacker News

Anyone have an idea how this would compare with current GPU performance? My impression is that GPUs are currently way ahead of CPUs in floating-point performance (though maybe not for 64-bit?).

EDIT: To make this question a bit more specific, say I wanted to develop a really fast neural net implementation, which basically reduces to matrix-vector multiplication and function interpolation. Would I be better off doing this with a GPU or an FPGA, given the current state of both technologies?

From what little experience I've had with GPUs, I think bandwidth to the device might be a limiting factor, but I'm guessing this would affect either type of co-processor.
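For concreteness, the workload being asked about is essentially a dense layer applied to an input vector. A minimal NumPy sketch (the shapes and the tanh nonlinearity here are my own illustration, not from the question):

```python
import numpy as np

# Hypothetical dense layer: y = f(W @ x), where f is an element-wise
# nonlinearity that could also be implemented by table interpolation.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 784))   # weight matrix
x = rng.standard_normal(784)          # input vector
y = np.tanh(W @ x)                    # matrix-vector product + activation

print(y.shape)  # (256,)
```

The matrix-vector product dominates the arithmetic; the element-wise activation is cheap by comparison, which is why the GPU-vs-FPGA question mostly comes down to how fast each can stream the weight matrix.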




In my experience, it's not bandwidth that is the limiting factor, but latency. You'll hit the same problem with FPGAs if you're using them as a co-processor, as they are typically connected to the motherboard over PCI Express. If the vectors you're using are small (where "small" means small enough to easily fit into an L1 cache on a processor), then you probably won't see any performance improvement by offloading the computation to an accelerator.

I say this because in a matrix-vector multiplication, only the vector has data reuse. You do a single pass over the matrix. I wrote a paper where latency killed any performance benefit from using a GPU, because the computation we performed did only a single pass over the data (http://people.cs.vt.edu/~scschnei/papers/debs2010.pdf). If you're doing a matrix-matrix multiplication, then that's a different story, because each element in each matrix will be reused.
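The single-pass point can be made concrete by counting flops per byte transferred. A back-of-the-envelope sketch (assuming 8-byte doubles and that each operand crosses the bus exactly once; the function names are mine):

```python
# Rough arithmetic intensity (flops per byte moved) for an n x n problem.

def intensity_matvec(n):
    flops = 2 * n * n                   # n^2 multiply-adds
    bytes_moved = 8 * (n * n + 2 * n)   # matrix once, vector in, vector out
    return flops / bytes_moved

def intensity_matmul(n):
    flops = 2 * n ** 3                  # n^3 multiply-adds
    bytes_moved = 8 * 3 * n * n         # each of the three matrices once
    return flops / bytes_moved

print(intensity_matvec(4096))  # ~0.25 flops/byte, regardless of n
print(intensity_matmul(4096))  # ~341 flops/byte, grows with n
```

Matrix-vector stays at about a quarter of a flop per byte no matter how big the problem gets, so it's bound by the bus, while matrix-matrix intensity grows linearly with n, which is what lets an accelerator actually earn its transfer cost.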


GPUs have on-board memory bandwidth in the hundreds of GB/s and 1.5-6+ GB of on-board RAM. The GFLOP performance varies significantly, though.

The Radeon 7970 has 947 GFLOPS double precision, but Nvidia cripples its GeForce series to around 100 GFLOPS to force people to pay for a Quadro 6000, which has 515.2 GFLOPS double precision. Though if it's a large project, paying for some Quadros is probably worth the cost for the better software support and more RAM, IMO.

The problem with FPGAs is that they cost about as much but take a lot more effort to get anywhere close to those performance numbers. However, they are great if you have some very odd, specific needs and plan on moving to custom chips in the future. E.g., you want to build a custom video encoder and plan on mass-producing your own chips, so you already need to develop at a really low level.



