The title said "software", so I assumed they were going to exclude the infamous Pentium FPU bug. But no, there it is.
To me, the interesting thing about testing a CPU is that it's theoretically possible to comprehensively test all inputs and outputs, but the time required makes it practically impossible.
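Just as a back-of-the-envelope sketch (the numbers are made up but illustrative): exhaustively testing even a single 64-bit adder is hopeless, as a few lines of arithmetic show.

    #include <cmath>
    #include <cstdio>

    int main() {
        // Hypothetical example: exhaustively testing a 64-bit adder.
        // Two 64-bit operands plus a carry-in bit = 2^129 input combinations.
        double combinations = std::pow(2.0, 129);
        double vectors_per_second = 1e9;            // assumed tester throughput
        double seconds_per_year   = 3600.0 * 24 * 365;
        double years = combinations / vectors_per_second / seconds_per_year;
        std::printf("%.2e years\n", years);         // on the order of 10^22 years
        return 0;
    }

And that's one functional unit, ignoring state, pipelining, and interactions between instructions.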
Not so much anymore... the EDA companies have put a ton of work into getting design houses to do formal verification (for which they, of course, sell very expensive tools) even before you get to physical design and testing.
For the chip my team is designing, we are formally verifying our ISA using a new domain-specific language (http://www.cl.cam.ac.uk/~acjf3/l3/), which really helps lock down the "gold model" that all our other models (our cycle-accurate C++ model, our RTL (Verilog) model, and eventually the physical simulation) have to live up to.
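As a toy illustration of the idea (nothing to do with our actual DSL or toolchain, and the instruction is hypothetical): a gold model is just an executable reference, and every other implementation gets differentially tested against it.

    #include <cassert>
    #include <cstdint>
    #include <random>

    // Toy "gold model": the architecturally defined behaviour of one
    // hypothetical instruction, an ADD that wraps modulo 2^32.
    uint32_t gold_add(uint32_t a, uint32_t b) { return a + b; }

    // Stand-in for the implementation under test (cycle-accurate model, RTL
    // simulator, etc.). In reality this would drive a simulator.
    uint32_t dut_add(uint32_t a, uint32_t b) { return a + b; }

    int main() {
        // Differential testing: random inputs, both models must agree.
        std::mt19937 rng(42);
        std::uniform_int_distribution<uint32_t> dist;
        for (int i = 0; i < 1'000'000; ++i) {
            uint32_t a = dist(rng), b = dist(rng);
            assert(gold_add(a, b) == dut_add(a, b));
        }
        return 0;
    }

The real value of the formal spec is that it pins down what "gold" means in the first place, so every downstream model is checked against the same definition.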
As far as the tools provided by EDA companies go, they have a ton of standard verification tools that have actually gotten a lot better and faster since the 90s, but best of all there are things like Cadence's Palladium (http://www.cadence.com/products/sd/palladium_series/pages/de...), which is basically a super-FPGA-like device built specifically for verifying the functionality of your circuits... while an FPGA is maybe 100 to 1000x faster than simulating RTL, Cadence claims Palladium is up to 1,000,000x faster than RTL simulation.
Anyways: most chips done today (especially at the advanced process nodes) require EXTENSIVE verification that takes just as long as, if not longer than, the design and implementation (though it happens concurrently as part of the "flow").
Exactly. You might like, and want to keep handy, this illustration from IBM's efforts. I think it nicely summarizes many of the tasks and issues in HW verification at various layers. At the least, it should give readers an impression of how overwhelming the job can be without best-in-class tools. ;)
I think we can do the same for software, though. You just have to keep it simple and layered, with each layer building properly on the one before it. I did it informally in a style that copied Wirth's Lilith work, albeit special-purpose. Verisoft did quite a bit on the full stack for imperative code; SAFE (crash-safe.org) is working on it for functional.

I think a shortcut is to implement VLISP Scheme in hardware using hardware verification techniques, along with a previously verified I/O system. We've already seen LISP processors, VLISP for a rigorous implementation, and Shapiro's security kernel, and the right hardware target could potentially be reused for ML and Haskell code. To counter hardware issues, run several in sync the way the old Tandem NonStop architecture did (see the sketch below). The result should be flexible, fast enough for some workloads, enforce POLA, and have five 9's.
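A minimal sketch of that lock-step idea (not Tandem's actual mechanism, just the general shape): run replicas of the same step on identical inputs and take a majority vote, so a single faulty unit is masked.

    #include <array>
    #include <cstdint>
    #include <iostream>
    #include <optional>

    // Hypothetical replicated computation: three replicas of the same
    // function run in lock-step on identical inputs; a 2-of-3 majority
    // vote masks a single faulty replica (classic TMR, in the spirit of
    // lock-stepped NonStop-style hardware).
    using Step = uint64_t (*)(uint64_t);

    std::optional<uint64_t> voted_step(const std::array<Step, 3>& replicas,
                                       uint64_t input) {
        std::array<uint64_t, 3> out{};
        for (int i = 0; i < 3; ++i) out[i] = replicas[i](input);

        if (out[0] == out[1] || out[0] == out[2]) return out[0];
        if (out[1] == out[2]) return out[1];
        return std::nullopt;  // no majority: signal an unrecoverable fault
    }

    // Two healthy replicas and one that flips a bit (simulated fault).
    uint64_t good(uint64_t x)   { return x * 2654435761u; }
    uint64_t faulty(uint64_t x) { return (x * 2654435761u) ^ 1; }

    int main() {
        auto r = voted_step({good, good, faulty}, 12345);
        if (r) std::cout << "voted result: " << *r << "\n";
        else   std::cout << "no majority -- halt\n";
    }

The voter itself becomes the critical piece, which is why it's the part you'd want verified (or replicated) most carefully.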
What do you think of combining a verified LISP with a hardware implementation as a time-saver on the road to verification?
Note: remember that, once we have that, building and verifying other toolchains gets so much easier because we can work at a high level. Even highly optimized systems such as yours could benefit from rigorously verified systems, maybe running the same synthesis or checks overnight as a check against the faster, possibly buggy implementations you use for iteration. Although, I mainly see them as a root of trust for other systems in the network.