It would be nice if someone could compare these commercially available APIs with Yahoo's open_nsfw model in terms of accuracy: https://github.com/yahoo/open_nsfw

I'm currently building an API wrapper around it and running it on a Hetzner server with a GTX 1080. Prediction takes about 0.25 seconds per image, and while I haven't optimised it for parallel execution, I think it should be able to handle 10+ images/sec comfortably. I'm also testing video moderation by using ffmpeg to slice the video into frames and computing the min/avg/max scores across them.
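For anyone curious, here is a rough sketch of how the ffmpeg approach could look in Python. It is not my actual wrapper: `extract_frames`, `summarize_scores`, `moderate_video` and the `score_image` callback are hypothetical names, the one-frame-per-second sampling rate is an assumption, and `score_image` stands in for whatever runs the open_nsfw model and returns a score between 0 and 1.

```python
from __future__ import annotations

import subprocess
from pathlib import Path
from statistics import mean
from typing import Callable, Iterable


def extract_frames(video: str, out_dir: str, fps: float = 1.0) -> list[Path]:
    """Dump `fps` frames per second of `video` as JPEGs using the ffmpeg CLI."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["ffmpeg", "-i", video, "-vf", f"fps={fps}", str(out / "frame_%05d.jpg")],
        check=True,  # raise if ffmpeg exits with an error
    )
    return sorted(out.glob("frame_*.jpg"))


def summarize_scores(scores: Iterable[float]) -> dict[str, float]:
    """Collapse per-frame scores into min/avg/max for the whole video."""
    values = list(scores)
    if not values:
        raise ValueError("no frames were scored")
    return {"min": min(values), "avg": mean(values), "max": max(values)}


def moderate_video(
    video: str,
    out_dir: str,
    score_image: Callable[[Path], float],  # hypothetical: wraps the model
) -> dict[str, float]:
    """Extract frames, score each one, and return the aggregate scores."""
    frames = extract_frames(video, out_dir)
    return summarize_scores(score_image(frame) for frame in frames)
```

A high max with a low avg would suggest a short explicit segment in an otherwise clean video, which is why keeping all three numbers seems more useful than a single average.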

Moderating 25 million explicit images using Google Cloud Vision would cost around $19,500/mo vs €99/mo on Hetzner.
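As a back-of-the-envelope check on those numbers, using only the figures quoted above and assuming a 30-day month with the server sustaining 10 images/sec around the clock (which I haven't verified under real load):

```python
# Throughput: can one server at 10 images/sec cover ~25 million images a month?
SECONDS_PER_MONTH = 30 * 24 * 60 * 60          # assumes a 30-day month
images_per_month = 10 * SECONDS_PER_MONTH       # 25,920,000 images

# Cost per 1,000 images, from the monthly figures quoted in the comment.
MONTHLY_IMAGES = 25_000_000
cloud_usd_per_1k = 19_500 / MONTHLY_IMAGES * 1000   # about 0.78 USD
hetzner_eur_per_1k = 99 / MONTHLY_IMAGES * 1000     # about 0.004 EUR

print(f"capacity at 10 img/s: {images_per_month:,} images/month")
print(f"cloud API:   ${cloud_usd_per_1k:.2f} per 1,000 images")
print(f"self-hosted: EUR {hetzner_eur_per_1k:.4f} per 1,000 images")
```

So a single box at that rate would just about cover 25 million images a month, ignoring downtime, traffic spikes and the engineering time to run it.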




Makes a lot of sense. Actually, it's really difficult to get a large enough dataset for moderation tasks to build a decent in-house model for a fair comparison.

Sure, we could try scraping that from Pornhub etc., but then the negative classes would be very domain-specific, and using stock images may not provide a good measure.

Also, it's really weird to assign such a task to any of your employees, feels kinda strange :)


Yahoo's model could be fine-tuned: http://caffe.berkeleyvision.org/gathered/examples/finetune_f...

Yeah, it's definitely not a nice task, but what's stopping someone (well, besides potential legal issues) from using these commercial APIs to create datasets programmatically and training a cloned model from that?

I'm curious what the profit margins are on these APIs, because I think they are way overpriced.



