I'd also like to point out that 90% of the time it takes to "compute" the Mandelbrot set with the JIT-compiled code is spent compiling the function, not computing.

If you actually want to learn something about CUDA, implementing matrix multiplication is a great exercise. Here are two tutorials:

https://cnugteren.github.io/tutorial/pages/page1.html

https://siboehm.com/articles/22/CUDA-MMM
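To give a sense of the starting point those tutorials build on, here is a minimal sketch of a naive CUDA matrix-multiplication kernel (the kernel name, the square row-major layout, and the launch configuration are my own illustrative choices, not taken from the tutorials):

    // Naive C = A * B for square N x N row-major matrices;
    // each thread computes one element of C.
    __global__ void matmul_naive(const float *A, const float *B,
                                 float *C, int N) {
        int row = blockIdx.y * blockDim.y + threadIdx.y;
        int col = blockIdx.x * blockDim.x + threadIdx.x;
        if (row < N && col < N) {
            float acc = 0.0f;
            for (int k = 0; k < N; ++k)
                acc += A[row * N + k] * B[k * N + col];
            C[row * N + col] = acc;
        }
    }

    // Launch: cover the N x N output with 16x16 thread blocks.
    // dim3 block(16, 16);
    // dim3 grid((N + 15) / 16, (N + 15) / 16);
    // matmul_naive<<<grid, block>>>(d_A, d_B, d_C, N);

The tutorials then show how far you can get beyond this version with tiling, shared memory, and other optimizations.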
>If you actually want to learn something about CUDA, implementing matrix multiplication is a great exercise.
There is SAXPY (vector math A*X + Y), purportedly ([1]) the hello world of parallel math code.
>SAXPY stands for “Single-Precision A·X Plus Y”. It is a function in the standard Basic Linear Algebra Subroutines (BLAS) library. SAXPY is a combination of scalar multiplication and vector addition, and it’s very simple: it takes as input two vectors of 32-bit floats X and Y with N elements each, and a scalar value A. It multiplies each element X[i] by A and adds the result to Y[i].
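That description maps almost line for line onto a CUDA kernel. A minimal sketch (kernel name and launch parameters are illustrative):

    // SAXPY: y[i] = a * x[i] + y[i], one element per thread.
    __global__ void saxpy(int n, float a, const float *x, float *y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            y[i] = a * x[i] + y[i];
    }

    // Launch, assuming x and y already live in device memory:
    // saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, d_x, d_y);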