
I didn't know about the algorithm until after I got hired there. It's actually really useful in a number of contexts, but my favorite was using it to find optimal split points for sharding lexicographically sorted string keys for mapping. Often you will have a sorted table, but the underlying distribution of keys isn't known, so uniform sharding will often cause imbalances where some mappers end up doing far more work than others. I don't know if there is a convenient open source class to do this.
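For anyone curious what that might look like, here is a minimal sketch, assuming the algorithm in question is reservoir sampling (as named further down the thread). It takes one pass over the sorted keys, keeps a fixed-size uniform sample, and uses the sample's quantiles as approximate shard boundaries. The names `reservoir_sample` and `split_points` and the default `sample_size` are my own illustrative choices, not any existing library's API.

```python
import random
from typing import Iterable, List, Optional


def reservoir_sample(keys: Iterable[str], k: int,
                     rng: Optional[random.Random] = None) -> List[str]:
    """Keep a uniform random sample of up to k keys from a single pass."""
    rng = rng or random.Random()
    reservoir: List[str] = []
    for i, key in enumerate(keys):
        if i < k:
            # Initial selection: the first k keys always enter the reservoir.
            reservoir.append(key)
        else:
            # Acceptance: key i (0-based) is kept with probability k / (i + 1),
            # replacing a uniformly chosen slot.
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = key
    return reservoir


def split_points(keys: Iterable[str], num_shards: int,
                 sample_size: int = 1000,
                 rng: Optional[random.Random] = None) -> List[str]:
    """Approximate shard boundaries: evenly spaced quantiles of the sample."""
    sample = sorted(reservoir_sample(keys, sample_size, rng))
    if not sample or num_shards < 2:
        return []
    return [sample[(s * len(sample)) // num_shards]
            for s in range(1, num_shards)]


if __name__ == "__main__":
    # Hypothetical skewed key space: most keys share the prefix "a".
    keys = [f"a{i:06d}" for i in range(9000)] + [f"z{i:06d}" for i in range(1000)]
    print(split_points(keys, num_shards=4, sample_size=500))
```

The boundaries are only as good as the sample, so a larger `sample_size` gives more even shards at the cost of memory. With a skewed key space like the one in the usage example, most boundaries should land inside the dense prefix instead of being spread uniformly over the alphabet.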



Interesting idea, hadn’t thought about that way to apply it.

I knew it from before my interview, from a Turbo Pascal program I had seen that sampled DAT tape backups of patient records from a hospital system. These samples were used for studies. That was a textbook example of its utility.


I guess the question in my mind is: would you expect a smart person who did not previously know this problem (or really much random sampling at all) to come up with the algorithm on the fly in an interview? And if the person had seen it before and memorized the answer, does that provide any signal of their ability to code?


My gut instinct is no. I certainly don't think I'd be able to derive this algorithm from first principles in a 60-minute whiteboarding interview, and I worked at Google for 4 years.


They wanted to see your analytical thinking skills at work. To pass you only needed to be sensible. You didn’t fail the interview if you couldn’t invent reservoir sampling!


uh, no, people would get a fail on the question if they didn't correctly identify both the initial selection and sample acceptance criteria.
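For reference, here is a short derivation of why those two criteria give a uniform sample, assuming the question was the standard form of reservoir sampling (the thread does not spell out the exact variant). The symbols are my own: $k$ is the reservoir size, $n$ the stream length, and items are numbered from 1. The initial selection puts the first $k$ items in the reservoir; item $i > k$ is accepted with probability $k/i$ and replaces a uniformly chosen slot.

```latex
% An item already in the reservoir survives the arrival of item j (j > k)
% unless j is accepted AND lands on that item's slot:
\[
  P(\text{survive step } j) = 1 - \frac{k}{j}\cdot\frac{1}{k} = \frac{j-1}{j}.
\]
% Item i > k must be accepted at step i and then survive steps i+1, ..., n:
\[
  P(i \in \text{sample})
    = \frac{k}{i}\prod_{j=i+1}^{n}\frac{j-1}{j}
    = \frac{k}{i}\cdot\frac{i}{n}
    = \frac{k}{n}.
\]
% Item i <= k enters with probability 1 and must survive steps k+1, ..., n:
\[
  P(i \in \text{sample})
    = \prod_{j=k+1}^{n}\frac{j-1}{j}
    = \frac{k}{n}.
\]
```

Both cases telescope to $k/n$, so every item is equally likely to end up in the sample without knowing $n$ in advance.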



