
This is the first time I've seen a regular expression implementation with intersection, apart from my own from many, many moons ago. A pleasant surprise. I had also added a negation operator, which adds another level of unintuitive behavior. E.g., you might think that !a would block aa from matching, but it doesn't: !a matches any string other than a itself, and aa is one of those.

The anagram operator is a new one for me. Nice.




I once implemented a regular expression engine based directly on https://en.wikipedia.org/wiki/Brzozowski_derivative. Both intersection and negation are straightforward to implement this way.
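
To illustrate why derivatives make intersection and negation easy, here is a minimal sketch of a derivative-based matcher. It is not the engine described above; the class names (`Empty`, `Eps`, `Chr`, `Cat`, `Alt`, `And`, `Not`, `Star`) and the functions `nullable`, `deriv`, and `matches` are made up for this example, and it does no simplification, so the expression grows with input length.

```python
from dataclasses import dataclass


class Re:
    """Base class for regex AST nodes (hypothetical names, for illustration)."""

@dataclass(frozen=True)
class Empty(Re):
    """Matches no string at all."""

@dataclass(frozen=True)
class Eps(Re):
    """Matches only the empty string."""

@dataclass(frozen=True)
class Chr(Re):
    c: str

@dataclass(frozen=True)
class Cat(Re):   # concatenation: a followed by b
    a: Re
    b: Re

@dataclass(frozen=True)
class Alt(Re):   # union: a | b
    a: Re
    b: Re

@dataclass(frozen=True)
class And(Re):   # intersection: a & b
    a: Re
    b: Re

@dataclass(frozen=True)
class Not(Re):   # complement: every string that a does not match
    a: Re

@dataclass(frozen=True)
class Star(Re):  # Kleene star: a*
    a: Re


def nullable(r: Re) -> bool:
    """Return True if r matches the empty string."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, (Empty, Chr)):
        return False
    if isinstance(r, (Cat, And)):
        return nullable(r.a) and nullable(r.b)
    if isinstance(r, Alt):
        return nullable(r.a) or nullable(r.b)
    if isinstance(r, Not):
        return not nullable(r.a)
    raise TypeError(r)


def deriv(r: Re, ch: str) -> Re:
    """Brzozowski derivative: a regex for what may follow after consuming ch."""
    if isinstance(r, (Empty, Eps)):
        return Empty()
    if isinstance(r, Chr):
        return Eps() if r.c == ch else Empty()
    if isinstance(r, Cat):
        left = Cat(deriv(r.a, ch), r.b)
        # If a can match "", ch may also be the first character of b.
        return Alt(left, deriv(r.b, ch)) if nullable(r.a) else left
    if isinstance(r, Alt):
        return Alt(deriv(r.a, ch), deriv(r.b, ch))
    if isinstance(r, And):
        # Intersection: the derivative distributes over both operands.
        return And(deriv(r.a, ch), deriv(r.b, ch))
    if isinstance(r, Not):
        # Negation: the derivative commutes with complement.
        return Not(deriv(r.a, ch))
    if isinstance(r, Star):
        return Cat(deriv(r.a, ch), r)
    raise TypeError(r)


def matches(r: Re, s: str) -> bool:
    """Whole-string match: take one derivative per character, then test for ""."""
    for ch in s:
        r = deriv(r, ch)
    return nullable(r)
```

With whole-string matching, `matches(Not(Chr("a")), "aa")` is true while `matches(Not(Chr("a")), "a")` is false, which is the unintuitive `!a` behavior mentioned upthread. `And(Star(Chr("a")), Not(Chr("a")))` accepts any run of `a` except the single string `a`.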


hfst (open source rewrite of the Xerox finite state toolkit) includes negation:

    $ echo 'a b
    c b
    c d*' | hfst-regexp2fst > ab.fst
    $ echo 'c ?+' | hfst-regexp2fst >cdotplus.fst
    $ hfst-intersect ab.fst cdotplus.fst | hfst-expand -c3
    cb
    cd
    cdd
    cddd
    cdddd
    $ hfst-expand -c3 ab.fst
    ab
    cb
    c
    cd
    cdd
    cddd
    
The regex syntax is a bit quirky due to backwards compatibility with lexicons written in XFST; see https://github.com/hfst/hfst/wiki/Regular-Expression-Operato...



