Aren’t you basically just saying you’re able to measure the error rate? I mean, that’s good, but it’s already a given in this scenario, where he’s reporting the 7% error rate.
No. If you're able to verify correctness of individual items of work, you can accept the 93% of verified items as-is and send the remaining 7% to some more expensive slow path.
That's very different from just knowing the aggregate error rate.
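To make the difference concrete, here's a rough sketch in Python. Everything in it is made up for illustration: the `route` helper, the `verify_factoring` check, and the sample items (claimed factorizations, where checking a claim is just one multiplication). The point is only that a per-item check lets you split the work, which an aggregate error rate can't do.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def route(items: Iterable[T], verify: Callable[[T], bool]) -> Tuple[List[T], List[T]]:
    """Split items into (accepted, slow_path) using a cheap per-item check."""
    accepted: List[T] = []
    slow_path: List[T] = []
    for item in items:
        (accepted if verify(item) else slow_path).append(item)
    return accepted, slow_path


def verify_factoring(item: Tuple[int, Tuple[int, int]]) -> bool:
    """Hypothetical verifier: does the claimed pair of factors multiply to n?"""
    n, (p, q) = item
    return p > 1 and q > 1 and p * q == n


# Hypothetical output from some unreliable worker: (n, claimed factors).
items = [
    (91, (7, 13)),   # correct
    (221, (13, 17)), # correct
    (323, (17, 18)), # wrong: 17 * 18 != 323
]

accepted, slow_path = route(items, verify_factoring)
# accepted can be used as-is; slow_path goes to the expensive fallback.
```

If all you knew was "one in three is wrong", you'd have to re-check or redo everything. With the verifier, only the one failing item pays the slow-path cost.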
No, it's anything that's harder to write than to verify. A simple example is a logic puzzle; it's hard to come up with a solution, but once you have a possible answer it's really easy to check it. In fact, it can be easier to vet multiple answers and tell the machine to try again than to solve it once manually.
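A minimal sketch of that "vet and retry" loop in Python, using N-queens as a stand-in logic puzzle. The names (`is_valid_queens`, `propose`, `solve_with_retries`) are hypothetical, and `propose` is just a random guesser standing in for whatever unreliable machine is producing answers. The check is a few lines; finding a valid placement by hand is the hard part.

```python
import random
from typing import List, Optional, Sequence


def is_valid_queens(cols: Sequence[int]) -> bool:
    """Cheap check: cols[r] is the queen's column in row r; no two may attack."""
    n = len(cols)
    return (
        sorted(cols) == list(range(n))                       # one queen per column
        and len({r + c for r, c in enumerate(cols)}) == n    # distinct anti-diagonals
        and len({r - c for r, c in enumerate(cols)}) == n    # distinct diagonals
    )


def propose(n: int, rng: random.Random) -> List[int]:
    """Stand-in for an unreliable solver: a random guess, usually wrong."""
    cols = list(range(n))
    rng.shuffle(cols)
    return cols


def solve_with_retries(n: int, max_attempts: int, rng: random.Random) -> Optional[List[int]]:
    """Ask for candidates until one passes the check, or give up."""
    for _ in range(max_attempts):
        candidate = propose(n, rng)
        if is_valid_queens(candidate):
            return candidate
    return None  # nothing verified: hand off to a slower path


rng = random.Random(0)  # seeded so the run is repeatable
solution = solve_with_retries(6, 10_000, rng)
```

The proposer here is wrong the vast majority of the time, and it doesn't matter, because nothing gets accepted without passing `is_valid_queens`.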