We’d call these non-reproducible non-errors (a 200 from GraphQL that still failed) “usefully wrong.” You see this in AI a lot: a company spends millions on market research, or asks MBA types how to recommend a product, and it turns out that someone buying a new laptop also tends to want new shoes. Executives aren’t happy that they looked bad spending money on one thing only to find out they were way off. The good news is that more revenue makes people look good, so that’s not a hard problem. What’s hard is this: if a contact-us form stops working (bad example) and people stop using it but still use the app the same way and spend as much, are the form and the people behind it necessary, or are people brand-loyal and just willing to put up with a minor bug? Similarly, if something isn’t working and we can’t reproduce it, did a network card have a low-level error that propagated in such a way that even our monitoring couldn’t pick it up?
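To make the GraphQL point concrete: by convention, GraphQL servers return HTTP 200 even when the query failed, with the failure reported in an "errors" array in the JSON body, so monitoring that only watches status codes never sees it. A minimal sketch, assuming a typical GraphQL-shaped payload (the response body here is invented for illustration):

```python
# Hypothetical GraphQL response body: HTTP 200, but the query failed.
response_body = {
    "data": {"user": None},
    "errors": [{"message": "User not found", "path": ["user"]}],
}

def is_graphql_failure(status_code: int, body: dict) -> bool:
    """Treat a response as failed if the transport errored OR the
    GraphQL body carries errors, even behind a 200."""
    return status_code >= 400 or bool(body.get("errors"))

print(is_graphql_failure(200, response_body))  # True: a 200 that is still an error
```

A status-code-only health check would count that response as a success, which is exactly the kind of “usefully wrong” signal described above.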
I thought this was a new class of error in the complex systems we have now, with hundreds of clusters creating basically non-deterministic problems. But then I remembered that before kernels got better at talking to things like drivers and external hardware, we’d see weird bugs outside our boundaries that were really hard to track down and often never manifested the same way twice. That’s when you’d go to the weird guy no one talked to, and in a week he’d have some piece of odd C code with a hex value doing logic no one understood that bypassed whatever error we were having.
It’s too bad those guys, who I’m pretty sure didn’t do much otherwise, largely fell victim to the MBA thinking of the ’90s. Now we’ll usually have one team saying “we’re calling the code right” and the other saying “we’re sending it right,” and neither is wrong, except it isn’t working, so they are. We’ve reached a point where we have contracts with every vendor because the problem usually really is something like a Cloudflare :) but I’d argue it would be far easier to just fix it or create a workaround and file a bug with them than to spend more time on daily calls working with someone like you and tracking your progress. So I know what you mean by the tools companies use. Unless something has hit industry standard, we won’t even evaluate open source, because we couldn’t blame anyone.