Show HN: Dffer – Get notified when a website changes (dffer.com)
63 points by dffer on Oct 6, 2017 | hide | past | favorite | 50 comments


There are many, many competitors in this space (VisualPing, ChangeTower, Versionista, FollowThatPage, trackly.io). Most have similar pricing to Dffer, but some are completely free (possibly with worse UI).

I wonder whether this sort of freemium model works, because some people's use cases (like mine) absolutely aren't worth spending money on, but I'd like more than 3 checks for free. Obviously I'll take a worse UI that's free over spending money on such a ridiculously simple app. I could literally rent an EC2 instance just to run 10-20 daily cron jobs for less than signing up for Dffer.
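For what it's worth, the "cron job on a cheap instance" approach can be sketched in a few lines of Python. The URL and state directory here are placeholders, not anything Dffer-specific:

```python
# Minimal cron-able page checker: hash the page body and compare it
# against the hash stored on the previous run. URL and state dir are
# placeholders -- point them at whatever you actually monitor.
import hashlib
import pathlib
import urllib.request

def fetch(url):
    with urllib.request.urlopen(url) as resp:
        return resp.read()

def changed(url, body, state_dir):
    """Return True if `body` differs from the previous run's snapshot."""
    state_dir = pathlib.Path(state_dir)
    state_dir.mkdir(parents=True, exist_ok=True)
    state_file = state_dir / hashlib.sha256(url.encode()).hexdigest()
    new = hashlib.sha256(body).hexdigest()
    old = state_file.read_text() if state_file.exists() else None
    state_file.write_text(new)
    return old is not None and old != new
```

Drop it in a crontab entry like `0 9 * * * python3 check.py` and wire the result of `changed(url, fetch(url), "/tmp/page_hashes")` up to `mail` or a webhook of your choice.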

My use case is to use trackly.io to see when new firmware/software updates come out for devices I own. But of course, that's never something I'd be willing to pay for.


There are some users who would actually rather pay $5/mo than use a free service. That's because paying money gives the impression (real or not) that the service is more likely to stick around and be fixed when it goes down.


For me, $5 for not having to set up cron jobs is worth it.


I don't think price is about the task (cronjobs), it's about how you value the output.

If I'm using cronjobs for some trivial unimportant task (like checking for firmware updates), I am not willing to pay. If I'm using cronjobs for super important competitive analysis, sure I'd pay.


Do any of these companies offer webhooks, or are they all oriented at human users reading emails?


Does this work with content that is loaded dynamically with JS? I've used http://www.changedetection.com/ in the past, but it only scrapes the HTML.


We offer “full page” rendering using PhantomJS (soon to be headless Chrome) for Versionista, a competing product. It works rather well, but it introduces its own set of problems... e.g. you need timers to ensure all the AJAX calls have completed, and so on.
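One generic way to handle the "wait for the AJAX to finish" problem (not Versionista's actual implementation, just a sketch) is to poll the rendered page until it stops changing:

```python
import time

def wait_until_stable(snapshot, interval=0.5, quiet_rounds=2, timeout=30.0):
    """Poll `snapshot()` -- e.g. a headless browser's page source --
    until it returns the same value `quiet_rounds` times in a row,
    meaning the page has (probably) stopped mutating, or until
    `timeout` expires. Returns the last snapshot either way."""
    deadline = time.monotonic() + timeout
    last, stable = None, 0
    while time.monotonic() < deadline:
        cur = snapshot()
        if cur == last:
            stable += 1
            if stable >= quiet_rounds:
                return cur
        else:
            last, stable = cur, 0
        time.sleep(interval)
    return last
```

The trade-off is latency, and a page with a ticking clock or rotating ads never stabilizes, which is where the timeout (and content filtering) comes in.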


Thanks - just signed up.

I found the 'crawl' settings a little confusing, as the first item is about loading additional URLs (which is what I thought I wanted, as JS is often loaded from separate URLs rather than being embedded in the HTML), but then there's a later option about a full browser. I chose the latter option and left the other one unchecked. I hope that's correct.


I wonder how it's implemented. I think it's possible to run this type of service at very large scale for extremely low cost using AWS Lambda and CloudWatch Events.
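A Lambda-based version could be as small as a handler fired by a CloudWatch Events schedule rule. This is only a sketch: `fetch` and `store` stand in for a real HTTP fetch and a real table (e.g. DynamoDB), and are injectable here so the logic is testable:

```python
import hashlib
import urllib.request

def handler(event, context, fetch=None, store=None):
    """Entry point for a scheduled Lambda. `event` carries the URL to
    check; `store` maps URL -> last content hash (a dict here, a
    DynamoDB table in a real deployment)."""
    url = event["url"]
    fetch = fetch or (lambda u: urllib.request.urlopen(u).read())
    store = store if store is not None else {}
    digest = hashlib.sha256(fetch(url)).hexdigest()
    old = store.get(url)
    store[url] = digest
    changed = old is not None and old != digest
    # A real deployment would publish to SNS/SES here when changed.
    return {"url": url, "changed": changed}
```

Per-invocation cost at, say, hourly checks on a few thousand pages would indeed sit comfortably inside Lambda's free tier; the hard part, as noted below, is scheduling and diff presentation rather than the crawl itself.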


That’s exactly what we use for a competing product. The difficulty is less in the crawling, and more in the scheduling and difference visualization (including filters). Presenting salient results without a ton of false positives is hard work.


I made something similar a while ago called "rssa" [1]. You give it some URLs, some way of extracting information from them (via a jQuery selector, regex, etc.), and if things change it tells you.

I use it for tracking prices on Amazon, some stats about my repositories and so on.

[1] https://github.com/fabiospampinato/rssa
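The selector/regex extraction idea boils down to something like this (a hypothetical Python sketch, not rssa's actual code):

```python
import re

def extract(html, pattern):
    """Pull one value out of a page with a regex whose first group
    captures the value -- the regex analogue of pointing a jQuery
    selector at a price or a star count."""
    m = re.search(pattern, html)
    return m.group(1) if m else None

def check(html, pattern, last_value):
    """Return (current_value, changed?) relative to the previous value."""
    value = extract(html, pattern)
    return value, (last_value is not None and value != last_value)
```

Extracting a single value rather than diffing the whole page sidesteps most false positives: unrelated markup can churn all it likes as long as the captured group stays the same.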


Seems like a lot of us did this. My version let you write little JavaScript modules to extract data from a page (which had been parsed by jsdom already), and then the change tracking picked things up.


Here's something I've wished I'd had over the 20 years of web development efforts I've been involved in: for production sites, a service that crawls a set of paths regularly and saves copies of the site in a timeline fashion.

How many times I've thought, "Man, we've made a lot of changes in the last 9 months and no one has a visual narrative of all of those changes..."


I built exactly this once; it was fun to set and forget and look back on it a few months later. It calculated the diff, and you could see obvious spikes when big changes were rolled out. You saw big spikes on, say, Apple.com when new products were announced or marketed, whereas Reddit.com was always changing, so you couldn't detect any meaningful patterns.

It also had ffmpeg in the backend that would create a timelapse for you on demand. I built it when we rolled out some rebranding/homepage changes on GitHub.com and none of us knew off the top of our heads exactly when we had last updated it. Git only tracked code changes, not visual changes.

It was originally built on top of http://www.paulhammond.org/webkit2png/
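The spike-detection part of an approach like this reduces to comparing consecutive screenshots. A toy sketch (frames as flat pixel lists; a real version would load the webkit2png output with an image library):

```python
def frame_diff(a, b):
    """Fraction of pixels that differ between two equally sized frames."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b)) / len(a)

def spikes(frames, threshold=0.3):
    """Indices where a frame differs sharply from the previous one --
    e.g. a homepage redesign shipping. `threshold` is the fraction of
    changed pixels that counts as a spike (0.3 is an arbitrary pick)."""
    return [i for i in range(1, len(frames))
            if frame_diff(frames[i - 1], frames[i]) >= threshold]
```

A constantly churning site like Reddit keeps every diff above any sensible threshold, which is exactly why no meaningful pattern emerges there.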


I implemented an auto-screenshotter for my app that takes a screenshot a few seconds after every launch. It works really well to give a visual progression of the app's style and features. I'm at 3k screenshots over 1.5 years of development. The idea is to make a timeline when the app is done.

I liked it so much I will add it into each app I work on from now on.


I'm currently building something that goes somewhat in that direction. Shoot me a mail, and I'll get back to you :)


I use Huginn for this - https://github.com/huginn/huginn


What I really want is something that can track changes to documents (PDFs, but any doctype really) hosted on some site. The URL may or may not change, but the location of the document URL on the page probably won't be changing. Really I'm just interested in the MD5, so no need to track specific changes.

I've got some custom scripts but paying somebody would be nice. Management of these becomes a pain after 10 or so.
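Hash-based document tracking like this is short to sketch. Everything below (the regex, the URL shapes) is illustrative; a real version would use an HTML parser and resolve relative URLs properly:

```python
import hashlib
import re

def document_urls(html, base=""):
    """Collect links to documents on a page; here just PDFs via a
    crude regex, with `base` naively prepended to each href."""
    return [base + href for href in re.findall(r'href="([^"]+\.pdf)"', html)]

def digest(data):
    """The commenter only cares about the MD5, so that's all we keep."""
    return hashlib.md5(data).hexdigest()

def changed_docs(docs, seen):
    """`docs` maps URL -> fetched bytes; `seen` maps URL -> last MD5.
    Returns URLs whose content hash changed (or that are new), and
    updates `seen` in place."""
    out = []
    for url, data in docs.items():
        h = digest(data)
        if seen.get(url) != h:
            out.append(url)
        seen[url] = h
    return out
```

Keying on the content hash rather than the URL is what makes this robust to the "URL may or may not change" problem: a renamed PDF with identical bytes shows up as new, but an in-place edit is caught regardless.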


Hey, can you tell more about the context on why you need this? It would help to know if there's a more general need and if so, maybe someone gets motivated to implement it...


Hey, I run Versionista (https://versionista.com) and we offer PDF change tracking and visualization.


I've been using a Firefox add-on for this

https://addons.mozilla.org/en-US/firefox/addon/update-scanne...

It works quite well but there's not much in terms of excluding content. When I needed to do that I just ran it through a simple script that filtered for me (Node.js).


Curious if this works on SPAs, and if so, what are you using?


It’s possible to do this for SPAs, yes. We use PhantomJS for rendering JavaScript at Versionista... soon we will swap to headless Chrome.


What is the use case for this? Why would I pay for it?


About a year back I wrote a script that would scrape a page every 15 minutes and notify me when the content changed. It was a band tour page, and I wanted to be the first to know when the tickets got announced so that I could grab 'em before they sold out.

P.S. : It was an amazing concert. :)


Assuming this is reliable it would be great for job searches. (if you're targeting smaller organizations with static-enough job pages) Seems a bit expensive though--especially when I might rather sign up for, say, daily checks on up to 240 websites instead of hourly checks on 10 websites.


I actually do this. It's far more effective to apply for new postings than whatever happens to be live at the moment, as companies will happily advertise a position even as they're waiting for a candidate to accept an offer they've made, keeping new applicants around 'just in case'.

But I just use cron, shell, and github.


I'm building a product in a kinda new market. I'd use something like this to keep up with competitors' landing pages.


I've set it up to monitor an "out-of-stock" string on a product I'd like to buy, but is going to be hard to get in the run up to Christmas, so I guess that's one.

If I was very keen, I guess I might pay a little to check more often than daily (although I haven't done that in this case).


I've used a script I wrote that basically does this to try to snag React Conf tickets: https://github.com/jnmandal/sitewatcher


One interesting case is tracking news sites. The details added to or removed from news stories over time can sometimes support a claim of bias or agenda-driven reporting. Of course, sometimes they are just corrections.


It could help with competitor analysis - point it to the websites of competition in your industry to stay on top of where they are going.


I use similar tools to monitor companies I want to work for, watching for new job postings.


I wouldn't wait for a "formal opening" before applying. In fact, if you do, you're competing with even more people.


How has that worked for you?


So when do you apply?


Immediately. There's no harm, and it's more likely that a company is always interested in good applicants. You can always follow up again when the position does show up on their open position listing.

Also, if you apply during the "offseason", it's more likely that the people doing technical hiring have time to respond to you in more detail. I've developed a couple of relationships that way, which paved the way for an eventual hire in one case and a technical relationship in another.


So you just submit your resume to places you'd like to work, even if they're not hiring? Do you even look at job postings? That's interesting...


> What is the use case for this?

I've written something similar for a company who wanted to monitor their rival's website for news about expansion.

> Why would I pay for it?

I'm not sure either. A quick script in a cron job could also do this.


> A quick script in a cron job could also do this

I love comments like this. You couldn't hope for a better example of "shit's easy syndrome".

Yeah, a quick script in a cron job. Oh, but now we need 2 pages monitored. Now we need 100. Now provide an interface for marketing to add and remove pages. Make sure it emails these N people (different for each page of course!). Oh, this has to be up 24/7 so we'll need a server and monitoring and a test environment and deploy strategy. Your script is giving false positives because of changing asset timestamps. Oh, your script needs to provide a visual indication of what changed and when. Etc etc etc et cetera.

That's why you pay for things like this. Because it's actually a shitload of work.

I recently used a similar site (visualping.com) to score a pair of airpods, which are in tight supply, by watching my local apple store for availability. Worked beautifully. Write my own script, are you kidding?
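To make the false-positive point above concrete: a naive byte-for-byte diff fires on every cache-busted asset URL or embedded timestamp, so you end up stripping volatile fragments before comparing. A toy sketch with illustrative (not exhaustive) patterns:

```python
import re

# Page fragments that change on every load but carry no real
# information -- cache-busting query strings, timestamps, CSRF tokens.
VOLATILE = [
    r'\?v=[0-9a-f]+',                  # asset cache-busters
    r'\d{4}-\d{2}-\d{2}T[\d:.]+Z?',    # ISO timestamps
    r'name="csrf_token" value="[^"]*"',
]

def normalize(html):
    """Strip volatile fragments so only meaningful content remains."""
    for pat in VOLATILE:
        html = re.sub(pat, "", html)
    return html

def really_changed(old_html, new_html):
    return normalize(old_html) != normalize(new_html)
```

The filter list is the part that turns into real ongoing work: every monitored site grows its own set of volatile fragments, and maintaining those per-page rules is a large chunk of what you're paying a service for.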


> Yeah, a quick script in a cron job. Oh, but now we need 2 pages monitored. Now we need 100. Now provide an interface for marketing to add and remove pages. Make sure it emails these N people (different for each page of course!). Oh, this has to be up 24/7 so we'll need a server and monitoring and a test environment and deploy strategy. Your script is giving false positives because of changing asset timestamps. Oh, your script needs to provide a visual indication of what changed and when. Etc etc etc et cetera.

> I recently used a similar site (visualping.com) to score a pair of airpods, which are in tight supply, by watching my local apple store for availability. Worked beautifully. Write my own script, are you kidding?

You are totally conflating use cases. For most people, monitoring 2/3 things, a cron script works fine. In fact, I have one that's running right now and it Just Works (TM). By the time I need to provide visual diffs and have bulk email logic, I'm probably going to use an external tool (or whip it up as a side project if I think it's worth my time).

You can't equivocate the n=1 case and the n=10000 case.


> You can't equivocate the n=1 case and the n=10000 case.

Fair enough, but the comment I was replying to was equivocating those cases too. I was simply pointing out that maybe, just maybe, a service like this might have some value above and beyond a 5 minute script and a cron job. And indulging in a small rant against the "i could do that in a script/i could write that in a weekend" etc mindset.


I mean, I've written scripts like this, complete with Chef server deployment and filtering criteria. It's not surprising to see HN mention how feasible it is; we're not really the target market.


"I'm not sure either. A quick script in a cron job could also do this."

Except probably 99 percent of the planet couldn't tell you what a cron job is.


I think 99% is lowballing it. Also, not all of the remainder could actually set one up either.


So, I am part of the 1%! Gosh, I didn't even know...


I get that, but we are on Hacker News right now. Presumably a larger portion of the users here know what cron is.


Long ago, web browsers used to actually have this capability built-in. I remember doing it with Netscape Navigator.


Wanted to do something similar for so long! I'm also interested in whether you handle AJAX sites, and if so, how?


If you want to build something like this on your own, check out Page.REST.



