I mean this is quite easy to disprove - what do you think -funroll-loops is doin...

dragontamer · on July 23, 2021

A "loop" is just a cycle in a graph.

A -> B -> C -> A, hey look, its a cycle. Eventually A->D (where D exits the loop). So really node A has an edge going to B and D.

Knowing that A has an edge to B and D, an equivalent transformation is A1->B1->C1->A2->B2->C2->A1, but also A1->D and A2->D.

Both A1 and A2 have to still point at D. But A1 points to B1, and A2 points to B2. Etc. etc. That's all your compiler is doing, its reasoning about the graph and transforming it with actually... very simple rules. (Unfortunately, those rules are NP-complete so heuristics are used to make the compile times reasonable. NP-complete seems to keep coming about transformations associated with "simple" rules...)

Then, we might see that some logic could make A2 not necessarily point to D (cutting out one possible jump statement). And that's where we get our optimization from: a simple reasoning about the graph which removes one instruction per 2-loops. Unroll to 8 and we get to 7-instructions (cmp/jump instructions) saved over every 8 loops.

An "optimizing" compiler just reasons about this graph and all possible equivalent transformations (or at least, uses a heuristic to reason over a portion of equivalent transformations), and chooses such a graph transformation that minimizes various cost estimates.

-------

That's the thing. Compilers can only transform the code-graph in ways that are "equivalent". Now yes, things get a bit weird with undefined behavior (and pointer-aliasing: assuming a lack of aliasing is needed for C/C++ compilers to make even the simplest of transformations. So that's a bit of a corner case). But that's the gist of what they're doing.