
The part that's really weird is that on modern CPUs predicted branches are free iff they're sufficiently rare (<1 out of 8 instructions or so). But if you have too many, you will be bottlenecked on the branches, since you aren't allowed to speculate past a 2nd (3rd on Zen 5 without hyperthreading?) branch.



The limiting factor isn't necessarily speculation, but rather the number of branches per cycle, i.e. the number of non-contiguous locations the processor has to fetch from L1 / the uop cache (each of which the branch predictor has to locate). You hit that limit with unconditional branches too.


Indeed, the limit is on taken branches, which is why making the most likely case fall through is often an optimization.


The tricky part here is that compilers are pretty bad (without PGO at least) at knowing which side of the branch matters.
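
For what it's worth, without PGO you can hint this by hand. A minimal sketch, assuming GCC or Clang (`__builtin_expect` is their builtin; the `LIKELY`/`UNLIKELY` macros and `sum_valid` are names made up for illustration). The hint only nudges block layout so the common case is more likely to be the fall-through; the compiler is free to ignore it or to emit branchless code instead.

```c
#include <stddef.h>
#include <stdint.h>

/* __builtin_expect is a GCC/Clang builtin; elsewhere these become no-ops. */
#if defined(__GNUC__) || defined(__clang__)
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define LIKELY(x)   (x)
#define UNLIKELY(x) (x)
#endif

/* Hypothetical example: sum the non-negative values and count the
 * negative ones as errors. Negatives are assumed to be rare. */
int64_t sum_valid(const int32_t *v, size_t n, size_t *errors)
{
    int64_t sum = 0;
    size_t bad = 0;
    for (size_t i = 0; i < n; i++) {
        if (UNLIKELY(v[i] < 0)) {
            /* Rare path: the hint asks the compiler to lay this out
             * away from the hot loop body. */
            bad++;
            continue;
        }
        /* Common path: ideally reached by falling through, not by a
         * taken branch. */
        sum += v[i];
    }
    *errors = bad;
    return sum;
}
```

Worth checking the generated assembly (e.g. with `-O2 -S`) before trusting it, since whether the hot path actually ends up as the fall-through depends on the compiler and flags.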



