Show HN: 2M+ free stock photos in novel T-SNE based search engine (zoomstock.com)
62 points by jacobn on Sept 10, 2020 | 12 comments


Hi! We created a new type of search engine for the top free stock photo sites.

It works by taking all the images, using T-SNE to lay them out in a gigantic grid, then putting a quad-tree on top of that grid and letting you smoothly navigate that tree. It's a simple but surprisingly powerful concept: you can actually "walk" an entire catalog!
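
For anyone who wants to play with the idea, here is a minimal sketch of a quad-tree over a 2D layout. It is not the site's actual code: the `QuadNode` class, the `visible` helper, the sample size of 4, and the random points standing in for a t-SNE layout are all assumptions made for illustration.

```python
import random


class QuadNode:
    """One square cell; keeps only a small sample of the images inside it."""

    def __init__(self, points, x0=0.0, y0=0.0, size=1.0,
                 depth=0, max_depth=12, sample_size=4):
        self.box = (x0, y0, size)
        self.count = len(points)
        # Sub-sampling: a cell shows at most `sample_size` of its images.
        self.sample = points[:sample_size]
        self.children = []
        if self.count > sample_size and depth < max_depth:
            half = size / 2
            for cx in (x0, x0 + half):
                for cy in (y0, y0 + half):
                    inside = [p for p in points
                              if cx <= p[0] < cx + half and cy <= p[1] < cy + half]
                    if inside:
                        self.children.append(QuadNode(
                            inside, cx, cy, half, depth + 1, max_depth, sample_size))


def visible(node, zoom):
    """Images drawn at a zoom level: samples of the cells `zoom` levels down
    (or of leaf cells reached before that)."""
    if zoom == 0 or not node.children:
        return list(node.sample)
    shown = []
    for child in node.children:
        shown.extend(visible(child, zoom - 1))
    return shown


# Assumed stand-in for a 2D layout: random (x, y) positions in the unit square.
random.seed(0)
layout = [(random.random(), random.random()) for _ in range(1000)]
root = QuadNode(layout)
for zoom in (0, 1, 2, 12):
    print(zoom, len(visible(root, zoom)))
```

Each extra zoom level shows up to four times as many images, which is why only a few hundred are on screen when zoomed all the way out.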

The search is really cool. Instead of just getting a paginated / infiniscroll listing, you actually get two full quad trees: one with search hits only, and one with the results shown in the context of all the other images. And you can switch seamlessly between them. The latter can be really powerful for finding poorly keyworded images.
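
A rough sketch of the two-tree idea, under loud assumptions: the catalog, its coordinates, the keywords, and the `build_tree` / `in_view` helpers are all hypothetical and only illustrate how the same viewport can be answered from a hits-only tree and from a full-catalog tree.

```python
def build_tree(items, x0=0.0, y0=0.0, size=1.0, cap=2, depth=0, max_depth=8):
    """Recursive quad-tree as nested dicts; items are (x, y, image_id) tuples."""
    node = {"box": (x0, y0, size), "items": items, "children": []}
    if len(items) > cap and depth < max_depth:
        half = size / 2
        for cx in (x0, x0 + half):
            for cy in (y0, y0 + half):
                inside = [it for it in items
                          if cx <= it[0] < cx + half and cy <= it[1] < cy + half]
                if inside:
                    node["children"].append(
                        build_tree(inside, cx, cy, half, cap, depth + 1, max_depth))
    return node


def in_view(node, vx0, vy0, vsize):
    """All items inside a square viewport, skipping cells that do not overlap it."""
    x0, y0, size = node["box"]
    if x0 >= vx0 + vsize or x0 + size <= vx0 or y0 >= vy0 + vsize or y0 + size <= vy0:
        return []
    if not node["children"]:
        return [it for it in node["items"]
                if vx0 <= it[0] < vx0 + vsize and vy0 <= it[1] < vy0 + vsize]
    found = []
    for child in node["children"]:
        found.extend(in_view(child, vx0, vy0, vsize))
    return found


# Hypothetical catalog: image id -> (x, y, keywords); positions stand in for a 2D layout.
catalog = {
    "beach-1": (0.10, 0.10, {"beach", "sea"}),
    "beach-2": (0.12, 0.11, {"beach"}),
    "beach-3": (0.11, 0.13, set()),  # poorly keyworded, but laid out next to the others
    "city-1": (0.80, 0.85, {"city"}),
    "city-2": (0.82, 0.80, {"city", "night"}),
    "forest-1": (0.50, 0.20, {"forest"}),
}

everything = [(x, y, image_id) for image_id, (x, y, _) in catalog.items()]
hits = [(x, y, image_id) for image_id, (x, y, kw) in catalog.items() if "beach" in kw]

hits_tree = build_tree(hits)           # search hits only
context_tree = build_tree(everything)  # hits shown among all other images

# Same viewport, two views.
print(sorted(i for _, _, i in in_view(hits_tree, 0.0, 0.0, 0.25)))
# ['beach-1', 'beach-2']
print(sorted(i for _, _, i in in_view(context_tree, 0.0, 0.0, 0.25)))
# ['beach-1', 'beach-2', 'beach-3']
```

The untagged `beach-3` never matches the query, but it turns up in the context view because it sits next to the hits in the layout.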

Once you've gotten the hang of it (pan & zoom, very similar to Google Maps), you can explore the images at a far faster pace than you can realistically do with traditional search engines.

And T-SNE organizing the images works wonders for creating a sensible layout (be sure to zoom in - at the outer layers the sub-sampling makes it look jumbled even when it isn't).


This is great! Bookmarked, will use.

I think I spotted a bug: Show results only 3,567 results

But the grid is showing only approx. 300 photos when zoomed out. How can I see the rest of the results? Pagination, etc.?


You see the rest by zooming in (see the ?-button for all the ways to zoom).


are visually similar images next to each other?


By and large yes: T-SNE does its best to achieve that, but it's kind of tricky to take an N-dimensional space and flatten it to 2 dimensions. You inevitably get some tearing and some scatter.

So they're not always next to each other, but oftentimes.
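
A tiny worked example of why some distortion is unavoidable. It uses a plain projection (dropping one coordinate) as a stand-in, not T-SNE itself, and the four points are chosen purely for illustration: they are mutually equidistant in 3D, which no arrangement of four points in 2D can be.

```python
import math
from itertools import combinations

# Vertices of a regular tetrahedron: every pair is the same distance apart in 3D.
points_3d = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]

# Naive flattening (an assumption for illustration): simply drop the z coordinate.
points_2d = [(x, y) for x, y, _ in points_3d]

dists_3d = {round(math.dist(a, b), 3) for a, b in combinations(points_3d, 2)}
dists_2d = {round(math.dist(a, b), 3) for a, b in combinations(points_2d, 2)}

print(sorted(dists_3d))  # one distinct distance
print(sorted(dists_2d))  # two distinct distances: some neighbors got pushed apart
```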



Well done but be careful, TSNE has some drawbacks, well detailed here: https://distill.pub/2016/misread-tsne/

For building a more precise 2D embedding or image similarity search engine you may want to rely on a neural embedding + FAISS or Annoy instead of compressing into 2D with TSNE. When doing the latter, distances don't mean much anymore.
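
To make the suggestion concrete, here is a minimal sketch of similarity search done directly in the embedding space. It is an exact brute-force scan in plain Python, standing in for what FAISS or Annoy do with approximate indexes at scale; it does not use either library's API. The 4-dimensional vectors and image ids are made up, and real neural embeddings would have hundreds of dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_id, embeddings, k=2):
    """Ids of the k images most similar to `query_id`, best match first."""
    q = embeddings[query_id]
    scored = [(cosine(q, vec), image_id)
              for image_id, vec in embeddings.items() if image_id != query_id]
    scored.sort(reverse=True)
    return [image_id for _, image_id in scored[:k]]


# Hypothetical embeddings (made-up numbers, not output of any real model).
embeddings = {
    "beach-1": [0.9, 0.1, 0.0, 0.1],
    "beach-2": [0.8, 0.2, 0.1, 0.0],
    "city-1": [0.1, 0.9, 0.2, 0.0],
    "forest-1": [0.0, 0.1, 0.9, 0.3],
}

print(nearest("beach-1", embeddings))  # ['beach-2', 'city-1']
```

Here similarity is computed on the full vectors, so nothing is lost to a 2D layout; the trade-off is that there is no map to pan around.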


This is neat, but what's the advantage to searching this way versus by keyword?


You can absolutely search by keyword, and that's where you usually want to start.

The difference lies in what you can do after that. Regular stock photo sites usually give you an infiniscroll on your search term, then you have to click on an image and browse through its related images (by which point you've lost the search query), then rinse & repeat until you give up basically.

Here, instead of "hiding" the images (or rather: not having a way of showing them all) we literally give you the entire catalog and a way to meaningfully navigate it.

By viewing only the hits you get a complete result set right at your fingertips, and they're meaningfully organized so the different image types that responded to your query will likely be grouped.

By viewing the results in context you can find all the images that are similar to the hits, but didn't have your keywords attached to them.


Thanks for elaborating!


Is t-SNE the best algorithm for this task?


T-SNE is pretty good. We evaluated UMAP as well and didn't get as good a result with it, but that may have been due to bugs in the implementation.



