
ML models aren’t Turing machines (unless you loop their output back as input). The paper is about simple classifiers, which run in a predetermined, finite number of steps.
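A minimal sketch of that point, with made-up weights: a fixed-architecture classifier executes the same, input-independent number of steps on every call, and nothing feeds its output back in as input.

```python
import math

# Hypothetical toy classifier: two layers with assumed weight values.
W1 = [[0.5, -0.2], [0.1, 0.3]]   # hidden-layer weights (made up)
W2 = [0.7, -0.4]                  # output-layer weights (made up)

def classify(x):
    # Hidden layer: one fixed pass, same step count for every input x.
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]
    # Output layer: logistic score, thresholded to a class label.
    score = 1.0 / (1.0 + math.exp(-sum(w * hi for w, hi in zip(W2, h))))
    return 1 if score > 0.5 else 0

classify([1.0, 1.0])   # -> 1 with these weights
classify([-1.0, -1.0]) # -> 0 with these weights
```

However the input varies, the computation is a straight-line pass through the layers, which is why termination is guaranteed.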


But it's similar to using a compiler, no?

I almost never compile the compiler I use, so I'm implicitly trusting that the compiler actually spits out what I expect and not some kind of backdoor[1].

[1]: https://dl.acm.org/doi/10.1145/358198.358210


What exactly corresponds to the compiler and its input/output in your analogy? It doesn’t seem very similar.


I guess I misunderstood the context.

I thought the issue was that you get some premade model from a company, feed it input and it classifies for you. With a compiler you feed it input and it produces a binary.

If you don't have access to the source — meaning the model's training data or the compiler's source code — then you can't be sure the model won't intentionally misclassify, or that the compiler won't insert trojan code.

But I see now the OP meant something different.


The difference I see is that an ML model is, at first glance, not a compiled binary with hidden mechanics: it’s a network graph with weights on the edges, where every node works in the same easy-to-understand way.

The model also isn’t a unique function of the training data in the way that the compiler binary is a function of the compiler source — you can get slightly differently behaving models from the same training data. So you can’t fully predict the model’s behavior from the training data the way you can predict the compiler’s behavior from the compiler source; the model itself is generally the better “source” for predicting (well, simulating) its exact behavior.

That’s why it is surprising that the presence of a backdoor can remain undetectable by inspecting the model. The closer analogy would be a backdoored compiler whose backdoor cannot be detected even by analyzing the compiler binary’s machine code.
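The “not a unique function of the training data” point can be sketched with a hypothetical toy perceptron: two training runs on identical data, differing only in random seed (initialization and sample order), end up with different weights, so only the weights themselves pin down the model’s exact behavior.

```python
import random

# Sketch under assumed toy conditions: training is not a deterministic
# function of the data alone -- the random seed also shapes the result.
def train(data, seed, steps=100, lr=0.1):
    rng = random.Random(seed)
    w = [rng.uniform(-1, 1), rng.uniform(-1, 1)]  # random initialization
    for _ in range(steps):
        x, y = rng.choice(data)                   # random sample order
        pred = 1 if w[0] * x[0] + w[1] * x[1] > 0 else 0
        w = [wi + lr * (y - pred) * xi for wi, xi in zip(w, x)]
    return w

data = [([1.0, 0.2], 1), ([-0.5, 1.0], 0)]  # same training data both times
w_a = train(data, seed=1)
w_b = train(data, seed=2)
# w_a and w_b differ, even though the data is identical.
```

Both runs learn to separate the same two points, yet the final weight vectors disagree — which is exactly why the training data underdetermines the trained model.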



