Hacker News | rustdeveloper's comments

Happy to suggest another web scraping API alternative I rely on: https://scrapingfish.com


What’s the chance you’re affiliated? Almost every one of your comments links to it. And curiously similar interest in Rust from the official HN page and yours. No need to be sneaky.


Interesting data for sugar in products scraped from Walmart: https://scrapingfish.com/blog/scraping-walmart


I was really excited to read this, but it's very shallow and barely presents any data at all.

Its main and mostly meaningless conclusion is that there are more SKUs in the Walmart catalog for sugar-dominated products than for any protein- or fat-dominated ones, and that online reviewers generally tend to leave better reviews for those.

Nothing about actual sales volume, promotional actions by Walmart, or monetary or nutritional share of food consumed (or even purchased), etc. -- any of which might say something more impactful.



("Scraping Fish" owns this site but it isn't disclosed anywhere, and this guy "rustdeveloper" seems associated with them; most of his comments push that service)


This is correct: my friends from Scraping Fish are hosting https://compareproxy.com to help people find proxies for web scraping. I'm happy to "push" for Scraping Fish as I'm also a satisfied user who received a lot of help from the founders for my web scraping projects.


For web scraping at scale you want to get lost in the crowd. This usually means being (or pretending to be) Chromium on Windows. Unusual browsers look suspicious, get detected, or have a very distinct fingerprint.
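
A minimal sketch of what "pretending to be Chromium on Windows" can look like at the HTTP header level, using only Python's standard library. The header values and the `build_request` helper are illustrative assumptions, not something from a specific tool, and headers alone do nothing about TLS or JavaScript-based fingerprinting.

```python
import urllib.request

# Illustrative header values in the style of Chrome on Windows.
# These exact strings are assumptions chosen for the example.
CHROME_WINDOWS_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/120.0.0.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
}


def build_request(url: str) -> urllib.request.Request:
    """Build (but do not send) a request carrying the headers above."""
    return urllib.request.Request(url, headers=CHROME_WINDOWS_HEADERS)


if __name__ == "__main__":
    # Only prints the headers that would be sent; no network call is made.
    req = build_request("https://example.com/")
    for name, value in req.header_items():
        print(f"{name}: {value}")
```

Passing the result to `urllib.request.urlopen` would send it; whether that is enough to blend in depends on the site.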


Indeed. I heard about a browser called Zen a couple weeks ago and installed it. Just took it for a drive yesterday, and by the end of the day, Reddit had blocked me just based on my sporadic, normal use of the site for about two hours here and there while I did other things.

I switched back to Safari and it worked normally right away.


This is terrible news :( I know it was an option for web scraping and I used it once. I'm curious what the real reason is for taking it down.


I have seen a push in the past year or so for saving storage across Google products. Caching the Internet takes a lot of storage. I suspect that's why they've removed it.


Surprisingly, according to your tool, HN is neutral on "web scraping". I noticed others also reported a bias toward neutral on other keywords.


There are also SaaS products with usage-based pricing. It depends on what the SaaS, software, or product is. Different pricing models work for different things.


but sometimes the product I need doesn't offer the right subscription for me.


Actually, they do allow this. I store my photos and iPhone backup on a Synology NAS. You just have to devote some time to setting it up yourself.


I'm using Scraping Fish because of their pay-as-you-go pricing, as opposed to a subscription with a monthly scraping volume commitment. And they don't charge extra credits for JS rendering or residential proxies because the cost of each request is the same: https://scrapingfish.com


