The problem being solved is getting AI to distinguish individual objects within visual data. Before SAM, you had to train a model on specific objects: label data for those objects and train a model to recognize them specifically. That becomes problematic given the variety of objects in the world, the settings they can appear in, and their orientation in an image. SAM can identify objects it has never seen before, i.e. objects that were not part of its training data.
Once you can determine which pixels belong to which object automatically, you can start to utilize that knowledge for other applications.
If you have SAM showing you all the objects, you can use other models to identify what each object is, understand its shape/size, understand depth/distance, etc. It's a foundation model to build on for any application that takes visual data as input.
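As a toy sketch of what "knowing which pixels belong to which object" buys you: once a segmenter like SAM hands back a boolean mask for an object, basic shape/size properties fall out with a few lines of numpy. The mask below is made up for illustration, not real SAM output.

```python
import numpy as np

# Pretend this is a per-object boolean mask returned by a segmenter
# (True = pixel belongs to the object), on a tiny 6x8 image for clarity.
mask = np.zeros((6, 8), dtype=bool)
mask[2:5, 3:7] = True  # a 3x4 rectangular "object"

# Area in pixels -- a simple proxy for object size
area = int(mask.sum())

# Tight bounding box: (row_min, row_max, col_min, col_max)
rows, cols = np.where(mask)
bbox = tuple(int(v) for v in (rows.min(), rows.max(), cols.min(), cols.max()))

print(area)  # 12
print(bbox)  # (2, 4, 3, 6)
```

The same mask can be fed to a classifier (crop the bounding box), a depth model (average depth over the masked pixels), and so on, which is what makes the segmentation step foundational.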
yep, the value is pretty clear from his demo. It goes from dozens of clicks to identify an object within an image down to a single click. SAM does almost exactly what you'd want as a human in every one of his examples.