
TBH, I feel like regexes would be much easier to understand in a more literate form with standard syntax. Something like:

  user_part = re.repeat(re.alnum | re.chars(".-_+"))
  domain_segment = re.repeat(re.alnum)
  domain = re.list(domain_segment, separator=".", minimum=2)
  email_address = user_part + "@" + domain

Where, in a real program, `domain` would be defined in a "standard library of constructions" that you can just import and re-use in more complicated regexes.

Something like this can be implemented in any language with operator overloading, no DSL required. Without operator overloading, the syntax would be a bit more awkward, but still nicer than the current regexp madness.
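To make that concrete, here is a rough sketch of how it might look in Python, built on the standard `re` module. Everything here is hypothetical and only illustrates the idea: the `Pattern` class and the `alnum`, `chars`, `repeat` and `list_of` helpers are made-up names, not an existing library (`list_of` stands in for `re.list` to avoid shadowing the builtin). Plain strings are assumed to be literals and are escaped.

```python
import re


class Pattern:
    """Hypothetical wrapper around a regex source string."""

    def __init__(self, source: str):
        self.source = source

    @staticmethod
    def _coerce(other) -> "Pattern":
        # Assumption: plain strings are literals, so they get escaped.
        if isinstance(other, Pattern):
            return other
        return Pattern(re.escape(other))

    def _group(self) -> str:
        # Non-capturing group so that combining patterns keeps precedence.
        return f"(?:{self.source})"

    def __add__(self, other):
        # a + b  ->  a followed by b
        return Pattern(self._group() + Pattern._coerce(other)._group())

    def __radd__(self, other):
        # "literal" + a
        return Pattern(Pattern._coerce(other)._group() + self._group())

    def __or__(self, other):
        # a | b  ->  either a or b
        return Pattern(self._group() + "|" + Pattern._coerce(other)._group())

    def fullmatch(self, text: str) -> bool:
        return re.fullmatch(self.source, text) is not None


alnum = Pattern("[A-Za-z0-9]")


def chars(allowed: str) -> Pattern:
    # Any single character from `allowed` (must be non-empty).
    return Pattern("[" + re.escape(allowed) + "]")


def repeat(item: Pattern, minimum: int = 1) -> Pattern:
    # `item` repeated at least `minimum` times.
    return Pattern(f"{item._group()}{{{minimum},}}")


def list_of(item: Pattern, separator: str, minimum: int = 1) -> Pattern:
    # At least `minimum` items (minimum >= 1), joined by `separator`.
    sep = Pattern._coerce(separator)
    rest = f"(?:{sep._group()}{item._group()}){{{minimum - 1},}}"
    return Pattern(item._group() + rest)


user_part = repeat(alnum | chars(".-_+"))
domain_segment = repeat(alnum)
domain = list_of(domain_segment, separator=".", minimum=2)
email_address = user_part + "@" + domain

print(email_address.fullmatch("first.last+tag@example.com"))
print(email_address.fullmatch("user@localhost"))
```

The generated regex is ugly (lots of redundant non-capturing groups), but nobody has to read it. Note that this follows the toy definition above, so hyphens in domain names are rejected; it is not meant as real email validation.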



I don't quite understand where regex gets its reputation from. I think that once you remember the meaning of the operators, it's not too bad. (And the concise syntax is actually very helpful.)

I get that the meaning of the operators is not clear unless you're already familiar with regex, but neither is the meaning of !, ?, %, &, |, ^, ~, &&, ||, <<, >>, *, //, &, ++ (prefix), ++ (postfix), and so on. You learn these because you need them once, and then they're burned into your mind forever. Regex was similar for me.


I think regex gets some of its hate from people writing painfully complex matchers. 99% of my day-to-day regex use is simple string searches on the command line or in my editor. I’m really happy I took the time to learn the syntax (spent about a week on it around 15 years ago) because now it doesn’t get in my way.


Regex syntax helps you understand what a regex does, not necessarily why it does it.

You can't decompose it into parts, you can't give those parts human-friendly names, you can't re-use parts in other regexes, you can't (easily) write functions that return or manipulate regexes (like that "list with separator" function shown above).


The limitation with regex is that it's context free, whereas a regular grammar is simpler and more powerful IMO.


Also, regexps are just that: regular. You can mostly read them from left to right decoding each symbol at a time.


I recommend you check out Raku grammars: https://docs.raku.org/language/grammars



There’s really nothing about this that couldn’t be expressed just as clearly in a more general API though, even in C, IMO. Building it into the syntax is a little weird to me.



