I think the point is that even though it's smaller and less obvious than the other objects, it's still sufficient to "hijack" the recognition for the whole image. If they had used a little sticker with a normal picture of a toaster on it, it's unlikely that it would've prevented the banana from being recognised, and the sticker's image only looks a bit like a toaster, whereas the banana is unambiguously recognisable.
It appears to me that this mostly fools whole-image classification algorithms. If the system performs object segmentation first and then applies classification to each object in the scene, this method is unlikely to be effective.
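For what it's worth, here's a rough sketch of that detect-then-classify idea, using recent torchvision's pretrained Faster R-CNN as the detector and ResNet-50 as the per-crop classifier. The model choices, the filename and the 0.8 score threshold are just illustrative assumptions, not anything from the paper.

```python
# Sketch of "segment/detect first, then classify each object" so a small
# sticker can only hijack its own crop, not the whole frame.
import torch
from PIL import Image
from torchvision import models, transforms
from torchvision.models.detection import fasterrcnn_resnet50_fpn

detector = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()
classifier = models.resnet50(weights="DEFAULT").eval()

to_tensor = transforms.ToTensor()
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

img = Image.open("banana_with_sticker.jpg").convert("RGB")  # made-up filename

with torch.no_grad():
    detections = detector([to_tensor(img)])[0]

# Classify each detected region independently instead of the whole image.
for box, score in zip(detections["boxes"], detections["scores"]):
    if score < 0.8:  # arbitrary confidence cutoff for the sketch
        continue
    x1, y1, x2, y2 = [int(v) for v in box]
    crop = img.crop((x1, y1, x2, y2)).resize((224, 224))
    with torch.no_grad():
        logits = classifier(normalize(to_tensor(crop)).unsqueeze(0))
    # Prints the box and the ImageNet class index of its crop.
    print(box.tolist(), logits.argmax(dim=1).item())
```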
One can paste such a sticker on top of a face or other object to be disguised and it might reduce recognition accuracy a bit, but applying an inpainting algorithm to fill in the area covered by the sticker would largely remove its effect. That is, unless it is used to cover some prominent feature, although that is often impractical.
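As a rough illustration, classical inpainting in OpenCV is about this simple, assuming you already know (or can detect) where the sticker is; the rectangle and filenames below are made up.

```python
# Fill the sticker region from surrounding pixels with classical inpainting.
import cv2
import numpy as np

img = cv2.imread("face_with_sticker.jpg")          # made-up filename
mask = np.zeros(img.shape[:2], dtype=np.uint8)
mask[120:200, 150:230] = 255                        # assumed sticker location

restored = cv2.inpaint(img, mask, 3, cv2.INPAINT_TELEA)
cv2.imwrite("restored.jpg", restored)
```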
This sticker only works against this classifier. If you start changing the algorithm, you'd need to change the attack to match.
If you think you can write a better image classifier by first segmenting the image before using ML, then I encourage you to get your own computer vision paper published and see how that works for you.
A potential application of this could be some kind of “privacy sticker” that you’d wear on your hat or your face in order to disable automated facial recognition systems.
I'm guessing the opportunity here is to disrupt the face detection step - make the face detector zero in on your non-face sticker/image, so the face recogniser never gets passed an actual face.
Face detectors generally check the whole image for faces, outputting everything above a threshold, so as long as a face is present, it's getting detected.
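That's easy to see with a stock detector: OpenCV's bundled Haar cascade, for example, scans the whole frame and returns every box that passes its threshold, so a decoy region wouldn't stop real faces from also being returned. A minimal sketch (image filename made up):

```python
# A sliding-window face detector reports ALL regions above threshold,
# not just the single best one.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("crowd.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Every candidate face box is returned; a sticker elsewhere in the frame
# doesn't "use up" the detection.
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("faces_marked.jpg", img)
```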
Check out the 34C3 talk on adversarial AI. They found that fooling rates stay high even when the attack-generating network is designed against a completely wrong model of the target, so the attack seems surprisingly stable.
I think that working around this type of fooling is easy but not really worthwhile for now. After all, adversarial models are designed to improve the performance of the models.
Also, in the article they test a classifier that has to pick a single label for an image that actually contains two objects: you could just as well place a real toaster next to the banana and call it fooled.
More than that, you can fool models that work completely differently (like decision trees, SVMs and kNN) with adversarial inputs crafted for a different model, which points to some underlying similarity we don't yet understand.
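You can reproduce that transfer effect in a few lines of scikit-learn: craft FGSM-style perturbations against a logistic regression surrogate (where the input gradient has a closed form) and see how much they also hurt an SVM, a decision tree and kNN that never saw them. The dataset (digits 3 vs 8) and epsilon are arbitrary choices for illustration.

```python
# Toy cross-model transfer demo: adversarial examples crafted against a
# logistic regression surrogate, evaluated on unrelated model families.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Binary problem: digits 3 vs 8, pixel values scaled to [0, 1].
X, y = load_digits(return_X_y=True)
mask = (y == 3) | (y == 8)
X, y = X[mask] / 16.0, (y[mask] == 8).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

surrogate = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# FGSM against the surrogate: for logistic regression the input gradient of
# the cross-entropy loss is (p - y) * w, so step along its sign.
eps = 0.2
p = surrogate.predict_proba(X_te)[:, 1]
grad = (p - y_te)[:, None] * surrogate.coef_
X_adv = np.clip(X_te + eps * np.sign(grad), 0.0, 1.0)

for model in (surrogate, SVC(), DecisionTreeClassifier(), KNeighborsClassifier()):
    if model is not surrogate:
        model.fit(X_tr, y_tr)
    print(type(model).__name__,
          f"clean={model.score(X_te, y_te):.2f}",
          f"adversarial={model.score(X_adv, y_te):.2f}")
```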
Do you know how you could fool the detector in the article even if it were perfect? Put a sticker with a picture of a toaster that is closer to the toasters in its training dataset than the banana is to the bananas in its training dataset.
Humans would consider this a non-interesting exploit.
This detector has to choose a single label for an image that contains two objects. Use something like YOLOv2 on it, and the detector will recognize a banana AND a toaster.
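Something like this, roughly. YOLOv2 itself is dated, so this sketch substitutes the YOLOv5 model published via torch.hub as a stand-in, and the image filename is made up.

```python
# A multi-object detector returns one box and label per object,
# instead of a single whole-image label.
import torch

model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
results = model("banana_and_toaster.jpg")

# Each row is a separate detection, so a scene with a banana and a
# toaster-like sticker yields two labels, not one winner.
for _, row in results.pandas().xyxy[0].iterrows():
    print(row["name"], f"{row['confidence']:.2f}",
          (row["xmin"], row["ymin"], row["xmax"], row["ymax"]))
```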
Now, we do know that compared to humans these detectors over-weight texture relative to structure. If you look at the sticker, you can see how it kinda looks like a toaster: the big red blob looks a bit like a toaster button, the generic shape is there, and the part above the button looks like both the lever that lowers the toast and the slot the toast drops into.
These classifiers will get better at recognizing structure, especially if we train them against this kind of thing.
These are the types of fun details that are totally going to be part of future history lessons (no matter which direction, positive or negative, all of this ends up heading).
Hardening production ML systems is going to be fun. I think exploiting ML will be the new kid on the block, much like when we were first introduced to XSS exploits. See https://github.com/cchio/deep-pwning
Trying to "harden" million-parameter models trained on a relatively small number of relevant examples will be a nightmare that makes web security look easy.
Too bad... Vancouver/Whistler was an awesome place to have a conference, even if the weather was rainy and the skiing so-so. A NIPS too big to be hosted by anything but a mega-city sounds depressing.
That reminds me of the old days when automatic "self-learning" SPAM classification began.
Back then, spammers deliberately sent gibberish messages. The goal was that users would (rightfully) mark those as SPAM, thereby disturbing the machine learning and weakening the overall SPAM recognition.
Alas, I don't know whether this actually worked, and if so, how large the effect was. This would be an interesting bit of history.
When I saw the "oily" legs on reddit I was curious if such an illusion could be used to fool AI camera surveillance. The recent article on China's surveillance network came to mind.