Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> Why are you merging up to one rare token at the beginning or at the end? Let’s consider that someone searched for C_0 R_1 C_2 C_3. If we don’t do this merge, we would end up searching for C_0, R_1, C_2 C_3, and this is bad. As established, intersecting common tokens is a problem, so it’s way better to search C_0 R_1, C_2 C_3. I learned this the hard way…

But since R_1 C_2 C_3 is in the index as well, instead of searching for C_0 R_1, C_2 C_3 with a distance of 2, you can instead search for C_0 R_1, R_1 C_2 C_3 with a distance of 1 (overlapping), which hopefully means that the lists to intersect are smaller.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: