
While we're jumping down the architecture pedant rabbit hole, a simple loop like that will be trivially predicted, so the branch will be basically free. In addition, hardware prefetchers do a much better job of predicting linear memory access than manual prefetch instructions. On Core 2, iirc, if you have 2 or 3 L2 misses at fixed offsets in either direction from each other, the hardware will automatically begin prefetching memory so it's there when you need it. The problem with manual prefetch instructions is their high latency. They're best for hinting to the processor that you're about to make an unpredictable load.
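
To make the distinction concrete, here's a rough sketch in C. It assumes GCC or Clang, since `__builtin_prefetch` is a compiler builtin rather than standard C. The function names and the `PREFETCH_AHEAD` distance are made up for illustration, and the right distance would have to be measured on the actual hardware. The linear loop is left alone for the hardware prefetcher; the gather loop, where the addresses come from an index array the hardware can't guess, is the kind of place a manual hint might pay off.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning knob: how many iterations ahead to hint.
 * The value 8 is a placeholder, not a measured optimum. */
#define PREFETCH_AHEAD 8

/* Linear walk: the stride is fixed, so the hardware prefetcher can
 * follow it without help. No manual prefetch here. */
long sum_linear(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Gather: data[idx[i]] jumps around unpredictably. idx itself is read
 * linearly, so we can look ahead in it cheaply and hint the load that
 * will be needed PREFETCH_AHEAD iterations from now. */
long sum_gather(const int *data, size_t data_len,
                const size_t *idx, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD < n) {
            /* rw = 0 (read), locality = 1 (low temporal reuse) */
            __builtin_prefetch(&data[idx[i + PREFETCH_AHEAD]], 0, 1);
        }
        assert(idx[i] < data_len); /* debug-only bounds check */
        total += data[idx[i]];
    }
    return total;
}
```

Whether the hint in `sum_gather` actually helps depends on the working set size and the microarchitecture, so it's worth benchmarking with and without it before keeping it.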


