Oh the interesting part is “our AI could not interpret images of common objects at unusual angles”.
Now that’s fascinating - why not? Is computer vision just boring pattern recognition and really does not have “concepts” underlying it - if so 90% of the AI hype is false?
There are cases where AI can recognise gender on an X-ray when humans can't, and find tumors that experienced doctors can't. This must mean that human doctors looking at X-rays use just boring pattern recognition and AI has actual concepts of what it's seeing.
But does it really? Or is it more observant than a human doctor and more thorough, but only at the limited task of deciding if this X-ray looks like the million other X-rays of a male abdomen versus the million X-rays of a female abdomen.
I assume counting the number of ribs is not what is meant …
“We found that even state-of-the-art models which are optimally performant in data similar to their training sets are not optimal — that is, they do not make the best trade-off between overall and subgroup performance — in novel settings,” Ghassemi says. “Unfortunately, this is actually how a model is likely to be deployed. Most models are trained and validated with data from one hospital, or one source, and then deployed widely.”
It's simple math. If the correlation between gender and sex is 0.99, then a method that can determine your sex with, say, 90% accuracy can determine your gender with roughly 89% accuracy. The difference is negligible.
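The "roughly 89%" can be checked in a couple of lines. A minimal sketch, using the comment's own assumed numbers (the 0.99 agreement rate and 90% classifier accuracy are the commenter's illustration, not measured values):

```python
# Assumptions from the comment above: gender matches sex for 99% of
# patients, and a classifier detects sex from an X-ray with 90% accuracy.
p_match = 0.99   # P(gender == sex)
acc_sex = 0.90   # classifier accuracy on sex

# Binary case: a correct sex prediction also hits the gender when the
# two match; a *wrong* sex prediction hits the gender when they differ.
acc_gender = p_match * acc_sex + (1 - p_match) * (1 - acc_sex)
print(acc_gender)  # ~0.892, i.e. roughly 89%
```

So the 90% sex accuracy degrades only to about 89.2% gender accuracy, which is the "negligible" difference the comment points at.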
Mind that there's a big difference between machine learning (which these robots use) and generative AI, which is what most of the recent hype has been about.
ML is by now mostly a proven technique with known limitations, e.g. being unable to deal correctly with situations not present in the training data. Generative AI is an offshoot of this, where people largely seem to like pretending those known limitations don't apply, for vague reasons.
What? Stable Diffusion doesn't have an underlying understanding, gathered from a vast sea of training data, that humans typically have two arms, two hands and five fingers per hand? That's a bold statement.
IIRC the debate is between two explanations. One: 99% of the time it just predicts that the next pixel will be fleshy and the pixel next to it background, thus making something that looks fingery (and so when presented with an odd angle that 99% drops crazily). Two: somehow an executive function has evolved that has a concept of a finger, with movement, musculature etc.
It’s the “somehow evolved” part that concerns me.
Predictive ability based on billions of images sounds good. Executive function - how does that work? But at some point we are playing “what is consciousness” games.
Would love to hear more rigorous thought than mine - any links gratefully received :-)
I actually agree with you. I was being a bit sarcastic. If I understand correctly, there isn't a fundamental difference between text output and pixel output in this context. If so, then it suddenly sounds much more of a stretch (intuitively) to claim that Stable Diffusion somehow understands the real world (like people claim is the case with language models).
Now that’s fascinating - why not? Is computer vision just boring pattern recognition and really does not have “concepts” underlying it - if so 90% of the AI hype is false?
There must be several PhDs in that at least :-)