
Not sure if the author is here, but the downloadable SQLite database significantly benefits from applying compression (~75% with gzip).
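For anyone who wants to check the ratio on their own copy, here is a minimal sketch using only the Python standard library. The helper name `gzip_file` and the file names in the usage note are my own placeholders, not anything from the project, and the ratio you get will depend on the data.

```python
import gzip
import shutil
from pathlib import Path


def gzip_file(src: Path, dst: Path, level: int = 9) -> float:
    """Gzip src into dst and return the fractional size reduction (0.0 to 1.0)."""
    # Stream the file through gzip so large databases are not loaded into memory.
    with open(src, "rb") as fin, gzip.open(dst, "wb", compresslevel=level) as fout:
        shutil.copyfileobj(fin, fout)
    original = src.stat().st_size
    compressed = dst.stat().st_size
    # Guard against division by zero for an empty input file.
    return 1 - compressed / original if original else 0.0
```

Something like `gzip_file(Path("prices.sqlite"), Path("prices.sqlite.gz"))` (hypothetical file names) returns the reduction as a fraction, so a value near 0.75 would match the figure above.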

Also, is there a write-up of how they collected the prices? I have wanted to do a similar analysis for years, but immediately gave up after realizing I would be spending 95% of my effort on scraping and entity matching. By and large, manufacturers seem to go out of their way to offer unique SKUs, intentionally to avoid comparisons.



I was going to mention that your browser almost certainly sends an Accept-Encoding: gzip header, but it appears the server doesn't care to send a Content-Encoding: gzip back!
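A rough way to reproduce that check without a browser, sketched with Python's standard library. The function names are my own, the URL you pass in is up to you, and note the assumption that a HEAD request gets the same headers as a GET, which not every server honors.

```python
import urllib.request


def response_is_gzipped(headers) -> bool:
    """Return True if the Content-Encoding header value lists gzip."""
    value = headers.get("Content-Encoding") or ""
    # The header may list several codings, e.g. "gzip, br".
    return "gzip" in [token.strip().lower() for token in value.split(",")]


def check_url(url: str) -> bool:
    """Send a HEAD request advertising gzip support and inspect the reply."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"Accept-Encoding": "gzip"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response_is_gzipped(response.headers)
```

If `check_url(...)` returns False for the download link, the server is ignoring the Accept-Encoding header and sending the file uncompressed.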


I am the author - I appreciate your comment & the parent comment. I'll make the SQLite file more manageable shortly (I wasn't expecting the project to get this much attention, so it's taking a while to catch up with everything!)


Thanks for creating this - I have cross-posted a link to your site/project on Reddit, in "/r/loblawsisoutofcontrol".


With 7z compression level 9, the .sqlite archive comes down to 61 MiB, which is about a 92% file size reduction.



