
AMD's hardware is stupid-good from a compute perspective. Vega 64 is $399, but renders Blender scenes (on the AMDGPU-PRO drivers) incredibly fast, at roughly 2080 or 1080 Ti level. That's basically the main use case I bought a Vega for (which is why I'm very disappointed in the current ROCm bug that breaks Blender).

If you can really make use of those 500 GB/s HBM2 stacks + 10+ TFLOPS of compute, the Vega is an absolute monster, at a far cheaper price than the 2080.

I really wonder why video-game FPS numbers are so much better on NVidia. The compute power is clearly there, but it just doesn't show up in FPS tests.

---

Anyway, my custom code tests are an attempt to build a custom constraint solver for a particular game AI I'm writing. Constraint solvers share similarities with relational databases (in particular, the relational join operator), which has been accelerated on GPUs before.
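For the curious, the join operator in question can be sketched as an ordinary build-and-probe hash join; the GPU implementations mostly parallelize the probe phase. Everything below (table contents, key names) is made up purely for illustration:

```python
from collections import defaultdict

def hash_join(left, right, key):
    """Equi-join two lists of dicts on a shared key (build + probe)."""
    # Build phase: index one relation by the join key.
    index = defaultdict(list)
    for row in left:
        index[row[key]].append(row)
    # Probe phase: stream the other relation and emit matches.
    # (GPU versions run this probe step massively in parallel.)
    out = []
    for row in right:
        for match in index.get(row[key], []):
            out.append({**match, **row})
    return out

units = [{"unit": 1, "tile": "A"}, {"unit": 2, "tile": "B"}]
threats = [{"tile": "A", "threat": 5}, {"tile": "C", "threat": 2}]
print(hash_join(units, threats, "tile"))
# -> [{'unit': 1, 'tile': 'A', 'threat': 5}]
```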

So I too am a bit fortunate that my specific use case actually lets me try ROCm. But any "popular" workload (deep learning, matrix multiplication, etc.) benefits so heavily from CUDA's ecosystem that it's hard to say no to NVidia these days. CUDA is just more mature, with more libraries that help the programmer.

AMD's system is still "some assembly required", especially if you run into a compiler bug or care about performance... (gotta study up on that Vega ISA...) And unfortunately, GPU assembly language is a fair bit more mysterious than CPU assembly language. But I'd expect any decent low-level programmer to figure it out eventually...



I agree, and I'd add that the Radeon VII is probably going to be a lot better. There are some pretty big benefits to the open drivers as well (which can be used for OpenGL even if you use the AMDGPU-PRO OpenCL, which is probably wise if OpenCL is what you want to do).

As one example, I have a recurring task that runs on my GPU in the background, and I sleep next to the computer that runs it. I don't want it to be too noisy, and it's acceptable for it to take longer while I'm asleep. So I have a cron job that changes the power cap through sysfs at night to a more reasonable 45 W (and at that level it's much more efficient anyhow, especially with my tuned voltages).
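A hedged sketch of that cron-job idea, capping power through the amdgpu hwmon interface: the exact hwmon path varies per machine and kernel, so treat the glob pattern below as an assumption and check your own /sys/class/drm/card*/device/hwmon/ layout first.

```python
import glob

def to_microwatts(watts):
    # amdgpu's power1_cap file takes its value in microwatts.
    return str(watts * 1_000_000)

def set_power_cap(watts):
    # Needs root. Writes the cap to the hwmon node(s) of the first GPU;
    # the path here is an assumption, verify it on your own system.
    for path in glob.glob(
        "/sys/class/drm/card0/device/hwmon/hwmon*/power1_cap"
    ):
        with open(path, "w") as f:
            f.write(to_microwatts(watts))

# The night-time cron entry would then just run something like:
#   set_power_cap(45)   # the 45 W cap mentioned above
```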

> I really wonder why video games FPS numbers are so much better on NVidia. The compute power is clearly there, but it just doesn't show in FPS tests.

Drivers are hard, and AMD has sorta just been getting around to doing them well. The Mesa OpenGL drivers are usually faster than AMDGPU-PRO at OpenGL, and RADV is often faster than AMDGPU-PRO Vulkan (and AMDVLK).

I've been hoping these last few years that AMD would try to ship Mesa on Windows (i.e., add a state tracker for the low-level APIs underlying D3D) and save themselves the effort. As far as I can tell, there is no IP issue preventing them from doing that (even if they have to ship a proprietary build with some code they don't own). There still seems to be low-hanging fruit in Mesa, but its performance is already usually better.


https://github.com/hashcat/hashcat has some assembly optimizations. They look fairly readable.



