Just to be clear, I am not the developer behind this. I found it and was looking for discussion on HN about it because it was quite impressive. I wasn't able to find any by searching, so I submitted it in hopes that it would take me to the past article (that is how the URL duplicate detection code used to work), but instead it created a new story.