
That's very true.

In fact, I took one of the techniques (traces) directly from a paper describing how the Bochs x86 virtual machine works :-)

But it goes deeper than that. I believe the whole "trace" thing in both jits and vms comes from a few papers describing trace-based instruction predecoding for hardware CPUs.




Yes, I believe that's true. The idea of a trace cache originated in the 90s as a way to work around perceived I-cache limitations; "Trace Cache: a Low Latency Approach to High Bandwidth Instruction Fetching" is the seminal paper on it.


The idea of straight-line traces goes back even further, to Josh Fisher's work on trace scheduling in compilers in the early 80s (linearizing one control-flow path gives a much wider scope for optimizations):

https://en.wikipedia.org/wiki/Josh_Fisher#Trace_Scheduling

He combined this with a VLIW processor architecture to build Multiflow, a hardware startup. (Interesting history tidbit: Robert Colwell, who architected the P6, the first out-of-order Intel core, started his career at Multiflow before joining Intel. The P6 didn't have any trace-cache influences, but the P4, a few years later, infamously did...)


Interestingly, trace caches are now gone in both hardware CPUs and major jit compilers :-) Correct me if I'm wrong?


Yes, nowadays x86 cores have µop caches, which store the µops that individual instructions decode into, plus other "optimizers" that target specific constructs (e.g. the loop stream detector).


Might you or anyone else have a link to that Bochs x86 virtual machine paper?

Similarly, might you have links to the papers that describe "trace-based instruction predecoding for hardware CPUs"?

Cheers


I think it's the first result when googling "bochs x86 paper".



