
vpmullq is not that useful for bignum code: it returns only the low 64 bits of each 64x64-bit product, you also want the upper part, and there is no corresponding vpmulhq instruction to get that.

On the other hand, vpmadd52luq and vpmadd52huq (AVX-512 IFMA) do give you access to the lower and upper 52-bit halves of a 52x52->104-bit product, each added into a 64-bit accumulator lane, and those instructions perform well on Intel chips: 3x faster than vpmullq.
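To make the lane semantics concrete, here is a rough scalar model of a single 64-bit lane in Python. It is only a sketch: the function names madd52lo and madd52hi are made up for illustration and are not the real intrinsics, and it models the arithmetic only, not the vector width or the performance.

```python
MASK52 = (1 << 52) - 1  # low 52 bits of a lane
MASK64 = (1 << 64) - 1  # a full 64-bit lane

def madd52lo(acc, a, b):
    """Hypothetical one-lane model of vpmadd52luq:
    acc + low 52 bits of (a[51:0] * b[51:0]), wrapping at 64 bits."""
    prod = (a & MASK52) * (b & MASK52)  # up to 104 bits
    return (acc + (prod & MASK52)) & MASK64

def madd52hi(acc, a, b):
    """Hypothetical one-lane model of vpmadd52huq:
    acc + bits 103:52 of (a[51:0] * b[51:0]), wrapping at 64 bits."""
    prod = (a & MASK52) * (b & MASK52)
    return (acc + (prod >> 52)) & MASK64

# With a zero accumulator the two halves reassemble the full product.
a = b = MASK52
lo = madd52lo(0, a, b)
hi = madd52hi(0, a, b)
assert (hi << 52) | lo == a * b
```

Since each half is at most 52 bits wide, the 64-bit accumulator has 12 spare bits, so a bignum loop using 52-bit limbs can add up many partial products before it has to propagate carries.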


