
Here is the link to the SymSpell GitHub repository: https://github.com/wolfgarbe/SymSpell

And here is a benchmark comparing Norvig's spelling corrector, BK-tree and SymSpell: https://towardsdatascience.com/symspell-vs-bk-tree-100x-fast...



Ah, I should have shared the repo, and thanks for publishing it! We're now experimenting with adapting this idea, but storing the index-time variations in a directed acyclic FSA instead of a hashtable as in your version. The idea is that we might be able to search for all of the query-time variations in a single pass rather than one at a time: they're textually similar to one another, so the lookups should share some work.
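
For anyone following along, here is a rough Python sketch of the hashtable version being discussed: delete variations are generated at index time, then again for the query, and each query variation is looked up separately. The function names (`deletes`, `build_index`, `lookup`) are made up for illustration and are not taken from the SymSpell repository. It also skips the step of checking each candidate against a real edit distance, which a full implementation would need.

```python
def deletes(word, max_distance):
    """Return the word plus every string made by deleting up to max_distance characters."""
    results = {word}
    frontier = {word}
    for _ in range(max_distance):
        next_frontier = set()
        for w in frontier:
            for i in range(len(w)):
                next_frontier.add(w[:i] + w[i + 1:])
        results |= next_frontier
        frontier = next_frontier
    return results


def build_index(dictionary, max_distance=1):
    """Index time: map every delete variation to the dictionary terms that produce it."""
    index = {}
    for term in dictionary:
        for variant in deletes(term, max_distance):
            index.setdefault(variant, set()).add(term)
    return index


def lookup(query, index, max_distance=1):
    """Query time: one hashtable probe per delete variation of the query.

    The result is a set of unverified candidates; filtering them by an actual
    edit-distance computation is left out of this sketch.
    """
    candidates = set()
    for variant in deletes(query, max_distance):
        candidates |= index.get(variant, set())
    return candidates


index = build_index(["hello", "help", "world"], max_distance=1)
print(lookup("helo", index))  # candidate terms for the misspelling "helo"
```

The loop in `lookup` is the "one at a time" part: each variation is probed independently, even though the variations share most of their characters. That repeated work is what a shared-prefix structure such as an FSA might be able to avoid.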



