That's very true. In fact, I took one of the techniques (traces) directly from a...

blattimwind · on March 20, 2019

Yes, I believe that's true: the idea of a trace cache originated in the 90s to work around perceived I-cache limitations. "Trace Cache: a Low Latency Approach to High Bandwidth Instruction Fetching" is the seminal paper on it.

cfallin · on March 20, 2019

The idea of straight-line traces goes back even further, to Josh Fisher's work on trace scheduling in compilers in the early 80s (linearizing one control-flow path gives a much wider scope for optimizations):

https://en.wikipedia.org/wiki/Josh_Fisher#Trace_Scheduling

He combined this with a VLIW processor architecture to build Multiflow, a hardware startup. (Interesting history tidbit: Robert Colwell, who architected the P6, the first out-of-order Intel core, worked at Multiflow before joining Intel. The P6 didn't have any trace-cache influences, but the P4, a few years later, infamously did...)

vkazanov · on March 20, 2019

Interestingly, trace caches are now gone from both hardware CPUs and major JIT compilers :-) Correct me if I'm wrong?

blattimwind · on March 20, 2019

Yes, nowadays x86 cores have µop caches, which store the decoded µops of individual instructions, plus other "optimizers" that target specific constructs (e.g. the loop stream detector).

bogomipz · on March 20, 2019

Might you or anyone else have a link to that Bochs x86 virtual machine paper?

Similarly, might you have links to the papers that describe "trace-based instruction predecoding for hardware CPUs"?

Cheers

naasking · on March 21, 2019

I think it's the first result when googling "bochs x86 paper".