Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

How?


use a regex like (aardvark|apple|...|zebra)* and the standard DFA construction for regular expressions.


You are technically right. Though if we assume that the size of the dictionary is at least o(n), then the size of such DFA will be exponential in n, and indexing it will be o(n), resulting in a o(n^2) solution again, I think.


The number of states is exponential for a general regex. This regex has an NFA closely resembling the trie for the dictionary. There could be at most one live state per depth in the trie. So the number of possible states is bounded at least by (words in dictionary)^(length of longest word). Really much smaller because the states at different levels have to share prefixes.


They are completely correct. If the DFA fits in RAM, following state transitions will be O(1) and using the DFA for a concatenated string of length N will be O(N). You simply missed a good solution.


Without the *, you get a graph that is a compressed version of the trie. With the *, the size of the DFA is extremely likely to explode.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: