What's particularly interesting here is that the Fiji card they propose is a very different beast than any of the NVIDIA offerings.
The MI8 card's HBM gives it a big power and performance advantage (512 GB/s peak bandwidth) even though it's on 28 nm. NVIDIA has nothing with even remotely comparable bandwidth in this price/perf/TDP regime.
None of the NVIDIA GP10[24] Teslas have GDDR5X -- not too surprising given that it was rushed to market, riddled with issues, and barely faster than GDDR5. Hence, the P4 has only 192 GB/s peak BW; while the P40 does have 346 GB/s peak, it has a far higher TDP and a different form factor, and is not intended for cramming into custom servers.
[I don't work in the field, but] To the best of my knowledge inference is often memory bound (AFAIK GEMV-intensive, so low flops/byte), so the Fiji card should be pretty good at inference. In such use cases the GP102 can't compete in bandwidth. So compared to the P4, the MI8, with ~1.5x the flop rate, ~2.5x the bandwidth, and likely ~2x the TDP (possibly configurable like the P4), offers an interesting architectural balance that might well be appealing for certain memory-bound use cases -- unless of course those same cases also need large memory.
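A minimal back-of-the-envelope sketch of that roofline argument (the peak fp32 and bandwidth figures are the public spec-sheet numbers; the ~0.5 flop/byte intensity is the textbook figure for fp32 GEMV, which does ~2*M*N flops while streaming ~4*M*N bytes of matrix):

    # Attainable GFLOP/s = min(compute roof, bandwidth * arithmetic intensity)
    def roofline_gflops(peak_gflops, peak_gbps, flops_per_byte=0.5):
        return min(peak_gflops, peak_gbps * flops_per_byte)

    cards = {  # (peak fp32 GFLOP/s, peak GB/s), per spec sheets
        "MI8 (Fiji)":  (8200, 512),
        "P4 (GP104)":  (5500, 192),
        "P40 (GP102)": (12000, 346),
    }
    for name, (gflops, gbps) in cards.items():
        print(f"{name}: GEMV-bound at ~{roofline_gflops(gflops, gbps):.0f} GFLOP/s")

At 0.5 flop/byte all three cards come out squarely bandwidth-bound (~256 vs ~96 vs ~173 GFLOP/s), which is why the raw flop numbers don't tell the story here.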
Update: I should have looked closer at the benchmarks in the announcement, in particular the MIOpen benchmarks [1]. The MI8 clearly beating even the Titan X Pascal (which has higher BW than the P40) indicates that this card will be pretty good for latency-sensitive inference, as long as things fit in 4 GB.
> Not sure if AMD is going to go all HBM on all its high-performance GPUs in 2017, or only offer one or two models with it.
It would make perfect sense to have some GDDR5X-based mid-range GPUs. HBM2 will be expensive -- too expensive even for the top of the mid-range (and the same applies to NVIDIA). GDDR5X has plenty of room for improvement over GDDR5, and by next year they should have it better figured out.
From the photos and perf numbers it looks like the lineup is the RX 480, R9 Nano, and whatever Vega 10 gets called, minus some of the connectors and passively cooled.
The PCIe NVIDIA P100 has a peak memory bandwidth of 730 GB/s at 250 W, which is almost exactly the same BW/watt as the MI8 (well, there are two PCIe P100s; I mean the better one).
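For what it's worth, the arithmetic on that checks out (taking the MI8's quoted ~175 W board power; both are peak, not measured, numbers):

    cards = {
        "P100 PCIe 16GB": (730, 250),  # the better of the two PCIe P100s
        "MI8":            (512, 175),
    }
    for name, (gbps, watts) in cards.items():
        print(f"{name}: {gbps / watts:.2f} GB/s per watt")
    # -> 2.92 vs 2.93 GB/s per watt, essentially identical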
Those MIOpen benchmarks are a bit dubious, since MIOpen is AMD's own deep learning library. It's unlikely that code written by AMD is optimal for the NVIDIA hardware.
To be realistic you need to compare AMD hardware running MIOpen to NV hardware running a framework backed by cuDNN.
It's clearly indicated on the slide that those are DeepBench [1] GEMM and GEMM-convolution numbers. The data for the M40, TITAN Maxwell/Pascal, and Intel KNL is actually provided by Baidu in their GitHub repo.
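For anyone unfamiliar with DeepBench: each GEMM entry just times a matrix multiply at a fixed workload shape and reports the effective flop rate. A rough CPU-side sketch of the methodology (the real benchmark calls the vendor BLAS on the GPU; the shape below is meant to be representative of the inference-style sizes in the repo, not an exact entry):

    import time
    import numpy as np

    M, N, K = 1760, 128, 1760  # illustrative DeepBench-style shape
    A = np.random.rand(M, K).astype(np.float32)
    B = np.random.rand(K, N).astype(np.float32)

    t0 = time.perf_counter()
    C = A @ B  # the timed GEMM
    t = time.perf_counter() - t0
    print(f"{2 * M * N * K / t / 1e9:.1f} effective GFLOP/s")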
[1] http://images.anandtech.com/doci/10905/AMD%20Radeon%20Instin...