Hacker News | kevinsundar's comments

Hey, I'm curious what your thoughts are: do you need a full-blown agent that moves the mouse and clicks to extract content from webpages, or is a simpler tool that just scrapes pages, takes screenshots, and passes them through an LLM generally effective enough?

I can see niche cases like videos or animations being better understood by an agent, though.


Airtop is designed to be flexible: you can use it as part of a full-blown agent that interacts with webpages or as a standalone tool for scraping and screenshots.

One of the key challenges in scraping is dealing with anti-bot measures, CAPTCHAs, and dynamic content loading. Airtop abstracts much of this complexity while keeping it accessible through an API. If you're primarily looking for structured data extraction, passing pages through an LLM can work well, but for interactive workflows (e.g., authentication, multi-step navigation), an agent-based approach might be better. It really depends on the use case.


I'm looking for something similar that can also extract the diff of content on the page over time, in addition to screenshots. Any suggestions?

I have a homegrown solution using an LLM and scrapegraphai for https://getchangelog.com but would rather offload that to a service that does a better job rendering websites. There are some websites that return error pages when I load them with Playwright, even though they load fine in my usual Chrome browser.


Good point on offloading it. Given the amount of work required to set up a wrapper for something like Puppeteer or Playwright that also works with a probably quite specific setup, I've found the best way to get a quality image consistently is to just subscribe to one of the many SaaS products out there that already do this well. Some of the comments above suggest some decent screenshot-as-a-service products.

It really depends on how valuable your time is compared to your (or your company's) money. I prefer going for the quality (and more $) solution rather than the one that boasts cheap prices, as I'd rather avoid the headaches of unreliable services. Sam Vimes "Boots" theory and all that.

For image comparison, I've always found that pixelmatch by Mapbox works well for PNGs:

https://github.com/mapbox/pixelmatch
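
To give a rough idea of what pixel-level comparison involves, here is a minimal Python sketch. It is not pixelmatch's algorithm (pixelmatch is a JavaScript library that uses a perceptual color metric and anti-aliasing detection); this version just sums per-channel differences and counts pixels above a threshold. The function name `count_changed_pixels`, the nested-list image representation, and the default threshold are all assumptions made for illustration.

```python
# Naive pixel diff: images are lists of rows, each row a list of (R, G, B) tuples.
# This is an illustrative stand-in, not how pixelmatch computes differences.
Pixel = tuple[int, int, int]
Image = list[list[Pixel]]


def count_changed_pixels(img_a: Image, img_b: Image, threshold: float = 0.1) -> int:
    """Count pixels whose normalized channel-sum difference exceeds threshold."""
    same_shape = len(img_a) == len(img_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(img_a, img_b)
    )
    if not same_shape:
        raise ValueError("images must have identical dimensions")

    max_delta = 255 * 3  # largest possible sum of channel differences
    changed = 0
    for row_a, row_b in zip(img_a, img_b):
        for (r1, g1, b1), (r2, g2, b2) in zip(row_a, row_b):
            delta = abs(r1 - r2) + abs(g1 - g2) + abs(b1 - b2)
            if delta / max_delta > threshold:
                changed += 1
    return changed


WHITE, BLACK, NEAR_WHITE = (255, 255, 255), (0, 0, 0), (250, 250, 250)
before = [[WHITE, WHITE], [WHITE, WHITE]]
after = [[WHITE, BLACK], [NEAR_WHITE, WHITE]]  # one real change, one tiny shift
changed_count = count_changed_pixels(before, after)
```

With real screenshots you would first decode the PNGs into pixel arrays with an image library; the threshold keeps small rendering noise (like the near-white pixel above) from being counted as a change.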


The easiest solution to this is probably extracting / formatting the content, then running a diff on that. Otherwise you could use snapshot testing algorithms as a diffing method. We use Browserbase and Olostep, which both have strong proxies (the first gives you a Playwright instance, the second just gives you a screenshot + raw HTML).
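
The "extract the content, then diff it" idea can be sketched with Python's standard `difflib`. This is a minimal sketch that assumes you already have the extracted text of two snapshots; the function name `content_diff` and the sample pricing text are made up for illustration.

```python
import difflib


def content_diff(old_text: str, new_text: str) -> list[str]:
    """Return unified-diff lines between two snapshots of extracted page text."""
    # Strip whitespace and drop blank lines so layout-only changes are ignored.
    old_lines = [line.strip() for line in old_text.splitlines() if line.strip()]
    new_lines = [line.strip() for line in new_text.splitlines() if line.strip()]
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile="previous", tofile="current", lineterm=""
        )
    )


# Hypothetical snapshots of the same page taken at two points in time.
old_snapshot = "Pricing\n  Basic: $5\nPro: $20\n"
new_snapshot = "Pricing\nBasic: $5\nPro: $25\nEnterprise: contact us\n"

changes = content_diff(old_snapshot, new_snapshot)
body = changes[2:]  # skip the "--- previous" / "+++ current" header lines
added = [line[1:] for line in body if line.startswith("+")]
removed = [line[1:] for line in body if line.startswith("-")]
```

Diffing normalized text rather than raw HTML avoids noise from markup churn such as changing class names or injected scripts.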


Same, I live in San Diego, where there is a lot of military helicopter activity in close proximity to SAN. I mostly see them go directly above the airport. I wonder what is different about DCA (maybe noise abatement?) that explains why this isn't done there.


DC has a staggering density of restricted airspace; Reagan National has unusually tight approach/departure requirements... so it doesn't surprise me that if this was going to happen somewhere, it would be there.


I wasn't able to fully understand the video, but from prior knowledge, TCAS requires both aircraft to communicate and gives their pilots differing instructions (go up for one, go down for the other), right? Do all Army helicopters have TCAS, and is it generally interoperable with commercial airliners?


Just an amateur av-nerd, but from what I gather the Army helicopter would have had a compatible TCAS, but TCAS won't issue instructions (RA / Resolution Advisory) below 1000 ft. It's unclear what happens when one aircraft is above 1000 ft and one is below, but the video seems to imply that the plane got an RA to immediately climb and did so, while presumably the helicopter just got a Traffic Advisory without any required instruction to follow.


Ah that's really good to know. Makes sense for it not to operate under 1000 ft since you clearly can't have a "Dive" RA for the lower aircraft. I assume maybe just the higher aircraft gets the "Climb" RA then by design.


It is, but it is inoperable below 1000 feet.


It's not inoperable; warnings are still issued, but the RA (resolution advisory, i.e. "climb", "dive" etc.) functionality is inhibited.


BTW if you want to stay up to date with these kinds of updates from OpenAI you can follow them here: https://www.getchangelog.com/?service=openai.com

It uses GPT-4o mini to extract updates from the website using scrapegraphai, so this is kinda meta :). Maybe I'll switch to o3-mini depending on cost. Its reasoning abilities, at a lower cost than o1, could be quite powerful for web scraping.


I might be missing some context here: what specifically does your comment refer to? I'm asking because I don't see you in the conversation, and your comment seems like an out-of-context, self-promoting plug.


Hey! I'm sorry you feel that way. There are several people who have subscribed to OpenAI updates from my comment, so there is clearly value to other commenters. I understand not everyone is interested, though. It's just a free side project I built, and I make no money from it.

Additionally, I believe my contribution to the conversation is that gpt-4o-mini, the previous model advertised as low-cost, works pretty well for my use case (which in this case can help others here). I'm excited to try out o3-mini depending on what the cost looks like for web scraping purposes. Happy to report back here once I try it out.


Yep honestly I got so tired of wading through marketing emails that I built a (free) email digest service for updates / changelogs from SaaS tools.

https://getchangelog.com


The one thing holding RSS back is that finding RSS feeds and subscribing to them in another app is frankly time-consuming.

I built a free service for people who specifically want to track updates / features / releases to SaaS tools, services, and GitHub repos: https://www.getchangelog.com . It is effectively an RSS search engine + email digest.

I think it's unique because it uses LLM-based web scraping to find RSS feeds, and I am working on a solution to generate RSS feeds from any blog / API changelog to expand the set of sources. I really wish RSS was more widespread and there was a better discovery solution.
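
For comparison with the LLM-based discovery mentioned above, the conventional non-LLM baseline is feed autodiscovery: many sites advertise their feeds through `<link rel="alternate">` tags in the page head. The sketch below is not how getchangelog.com works; it only illustrates that baseline with Python's standard library, and `FeedLinkParser`, `discover_feeds`, and the sample HTML are hypothetical names and data.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types commonly used to advertise RSS and Atom feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collect feed URLs advertised via <link rel="alternate" ...> tags."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.feeds: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rel_tokens = attr.get("rel", "").lower().split()
        is_feed = attr.get("type", "").lower() in FEED_TYPES
        if "alternate" in rel_tokens and is_feed and attr.get("href"):
            # Resolve relative hrefs against the page URL.
            self.feeds.append(urljoin(self.base_url, attr["href"]))


def discover_feeds(html: str, base_url: str) -> list[str]:
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds


sample_html = (
    '<html><head>'
    '<link rel="alternate" type="application/rss+xml" href="/feed.xml">'
    '<link rel="stylesheet" href="/style.css">'
    '</head><body></body></html>'
)
feeds = discover_feeds(sample_html, "https://example.com/blog/")
```

This only finds feeds that a site explicitly advertises, which is presumably where scraping with an LLM or generating feeds from changelog pages comes in.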


This is really cool! I unfortunately don't have a reMarkable, but I run a similar "free personalized email digest" service, https://www.getchangelog.com, mainly for people following changes to tools / services / GitHub repos & other RSS feeds.

Wonder if you'd be willing to add email support? Anyways, great work, and I love the design. It really matches the design ethos of reMarkable.


That's super cool! Your product reminds me of https://mailbrew.com/ which I used for a couple of years.

> Wonder if you'd be willing to add email support?

I might add support for Kindle/Supernote and send a PDF to them by email, but I wouldn't really want to turn this thing into a business. I already build another SaaS for a living and just don't have enough energy to dedicate to this.


At least in California, any gift cards under $10 are redeemable for cash by law. But that's not to say the merchant will make it easy to do so. I usually have to ask for a manager and wait for a bit.


You've never had to set a timer while cutting meat (which inherently takes two hands)?

