
Does anybody run the HIBP password DB locally (ideally with this latest treasure trove)? I saw that some people converted it to a Bloom filter, which makes a lot of sense: when it answers 'definitely not in set', that answer is certain because a Bloom filter has no false negatives, and when it answers 'potentially in set' you can still check manually against the online DB.

I'll search online but if a fellow HNer runs it offline, I'm all ears...

P.S: I've got Gbit/s FTTH as well as servers in datacenters so downloading tens of gigabytes ain't an issue
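The Bloom filter idea above can be sketched in a few lines of Python. This is a minimal illustration, not a tuned implementation and not any particular project's code: the class name `BloomFilter`, the helper `sha1_hex`, and the double-hashing scheme built on SHA-256 are all assumptions chosen for the example. The sizing uses the standard formulas for bit count and number of hash functions.

```python
import hashlib
import math


class BloomFilter:
    """Minimal Bloom filter over byte strings (illustrative, not tuned)."""

    def __init__(self, expected_items: int, false_positive_rate: float) -> None:
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.num_bits = max(
            8,
            math.ceil(-expected_items * math.log(false_positive_rate) / math.log(2) ** 2),
        )
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Double hashing: derive k bit positions from two 64-bit slices of one digest.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        # False means "definitely not in set"; True means "potentially in set".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def sha1_hex(password: str) -> str:
    """Uppercase hex SHA-1, the form in which HIBP publishes password hashes."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


# Toy data standing in for the real hash list.
leaked = ["password", "123456", "qwerty"]
bloom = BloomFilter(expected_items=len(leaked), false_positive_rate=0.001)
for pw in leaked:
    bloom.add(sha1_hex(pw).encode("ascii"))
```

A `might_contain` result of `False` can be trusted outright; a `True` result is only a hint and is the case where a follow-up query against the full database is still needed.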



It is only tens of gigabytes, so there is no need for fiber to handle this. The 37 GB of files can be downloaded in 1 hour on an 83 Mbps link.

(1 Gbps is 450 GB/hour, useful for estimating things)


My favorite rule of thumb is that 100Mbps is about 1TB/day.
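The rules of thumb in this subthread are easy to check. A small sketch, assuming decimal units (1 GB = 8000 megabits) and ignoring protocol overhead; the function names are made up for the example:

```python
def gigabytes_per_hour(megabits_per_second: float) -> float:
    """Throughput in GB/hour for a given link speed, using decimal units."""
    return megabits_per_second / 8 * 3600 / 1000


def required_mbps(gigabytes: float, hours: float) -> float:
    """Link speed in Mbps needed to move `gigabytes` in `hours`."""
    return gigabytes * 8000 / (hours * 3600)


print(gigabytes_per_hour(1000))             # 450.0 GB/hour at 1 Gbps
print(round(required_mbps(37, 1), 1))       # 82.2 Mbps for 37 GB in one hour
print(gigabytes_per_hour(100) * 24 / 1000)  # 1.08 TB/day at 100 Mbps
```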


I wrote a thing for this back when you could download the whole hash database as a single torrent, but I haven’t checked it since they moved over to the PwnedPasswordsDownloader system. This doesn’t use any probabilistic data structures though, it just packs the database into the smallest binary file I could come up with.

https://github.com/tylerchr/pwnedpass
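For readers wondering what a non-probabilistic packed file can look like, one common approach is to store the raw 20-byte SHA-1 digests sorted back to back and binary-search them. The sketch below illustrates that general idea only; it is an assumption for the sake of example and not necessarily the layout pwnedpass uses. `pack_hashes` and `contains` are hypothetical names.

```python
import hashlib

RECORD_SIZE = 20  # length of a raw SHA-1 digest in bytes


def pack_hashes(hex_hashes) -> bytes:
    """Concatenate raw digests in sorted order (assumed layout, no counts stored)."""
    return b"".join(sorted(bytes.fromhex(h) for h in hex_hashes))


def contains(blob: bytes, digest: bytes) -> bool:
    """Binary search over fixed-width records; exact answer, no false positives."""
    lo, hi = 0, len(blob) // RECORD_SIZE
    while lo < hi:
        mid = (lo + hi) // 2
        record = blob[mid * RECORD_SIZE:(mid + 1) * RECORD_SIZE]
        if record < digest:
            lo = mid + 1
        elif record > digest:
            hi = mid
        else:
            return True
    return False


# Toy data standing in for the real hash list.
sample = [hashlib.sha1(p.encode("utf-8")).hexdigest() for p in ("password", "123456", "qwerty")]
blob = pack_hashes(sample)
```

With a real file the same search would run against a memory-mapped buffer instead of an in-memory `bytes` object, at 20 bytes per hash versus 40 characters in the hex text form.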


I have a project which acts as a local cache for the HIBP database.

https://github.com/lorenz/hibp-cached

It downloads and continually updates from the upstream database while serving the identical API. On a fast link it can download the entire thing in a few hours.

It just uses a giant BoltDB file to store compressed chunks.


Curious about your use case; using their online service [1] is up-to-date and reveals almost no information.

[1] https://news.ycombinator.com/item?id=39044339
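The "reveals almost no information" point presumably refers to the k-anonymity range API of Pwned Passwords: the client sends only the first 5 hex characters of the SHA-1 hash to `https://api.pwnedpasswords.com/range/{prefix}` and compares the returned suffixes locally. A sketch of the client-side logic, with the network call left out so it runs offline; the function names and the sample response body are made up for the example:

```python
import hashlib


def split_sha1(password: str, prefix_len: int = 5) -> tuple[str, str]:
    """Return (prefix, suffix) of the uppercase hex SHA-1; only the prefix is sent."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:prefix_len], digest[prefix_len:]


def count_in_range_response(body: str, suffix: str) -> int:
    """Scan 'SUFFIX:COUNT' lines from a range response for our suffix."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0  # suffix absent: password not in the returned range


prefix, suffix = split_sha1("password")
# In real use, `body` would be the text returned by GET /range/{prefix}.
# This body is fabricated purely to exercise the parser.
body = "0" * 35 + ":3\r\n" + suffix + ":42\r\n"
print(count_in_range_response(body, suffix))  # 42
```

The server only ever sees the 5-character prefix, which is shared by many unrelated hashes, so it cannot tell which password was being checked.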



