
It’s definitely not doing what the prompt asked for.

https://regex101.com/r/ZNQa9X/1

The generated regex is the same as

    (\bdog\b)|(\bcat\b)
https://regex101.com/r/vTtEU4/1

I’m currently trying to figure out how to match a word starting with dog without using

    \bdog.*
because

    .*
would proceed to eat the rest of the line.

So I was thinking I could say

    \bdog[^\b]*
But that doesn’t work; it also ends up eating the rest of the line.
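For anyone wondering why both attempts run to the end of the line, here is a small sketch using Python's `re` module (the sample string is made up for illustration, and the thread itself was testing on regex101, so details may differ by flavor). Inside a character class, `\b` is not a word boundary; it is the backspace character, so `[^\b]*` means "anything except backspace".

```python
import re

# Hypothetical sample line, not taken from the regex101 links above.
line = "dogfood and catnip"

# `.` matches any character except a newline, so `.*` runs to the end of the line.
greedy_dot = re.findall(r"\bdog.*", line)

# Inside a character class, `\b` is the backspace character (0x08),
# so `[^\b]*` matches everything except backspace and also runs on.
negated_backspace = re.findall(r"\bdog[^\b]*", line)

print(greedy_dot)
print(negated_backspace)
```

Both calls return the whole line as a single match, which matches the behavior described above.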


Use \S, which is the complement of \s (any non-whitespace character), so the match stops at whitespace instead of eating the rest of the line.

    \b(dog\S*)|(\S*cat)\b
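You could also use \B instead of \S, though it means something different: \B is a zero-width assertion that matches where there is no word boundary, while \S consumes a non-whitespace character.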
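A quick way to try the `\S` version locally, sketched with Python's `re` module (the helper name `whole_matches` and the sample strings are just for illustration):

```python
import re

# The pattern from the comment above, unchanged.
pattern = re.compile(r"\b(dog\S*)|(\S*cat)\b")

def whole_matches(text):
    """Return the full text of every match, ignoring which group matched."""
    return [m.group(0) for m in pattern.finditer(text)]

print(whole_matches("hotdog dogfood tomcat"))  # ['dogfood', 'tomcat']
print(whole_matches("dog.cat"))                # ['dog.cat']
```

Note that `\S*` stops only at whitespace, so punctuation such as `.` or `-` is swallowed into the match.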


It almost does the trick

https://regex101.com/r/sbpy8s/1

But this matches for example

    dog.cat
as one single word.

But I would like it to match separately

    dog
and

    cat
in this case.

Likewise, I’d want for example

    dogapple-bananacat
to be matched as two separate words

    dogapple
and

    bananacat
After a bit more reading online I thought that maybe the following regex would do what I want:

    \b(dog\p{L}*)|(\p{L}*cat)\b
https://regex101.com/r/1NT5Ie/1

But that does not match

    dog42
as a word.

What I want is a way to include everything after dog that is not \b

And likewise everything preceding cat that is not \b

Edit: I think I’ve found it after reading https://stackoverflow.com/questions/4541573/what-are-non-wor...

    (\bdog\w*)|(\w*cat\b)
Seems to behave exactly like I want.

https://regex101.com/r/f3uJUE/1
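The same check can be reproduced outside regex101. This is a minimal sketch in Python's `re` module, run against the three examples mentioned earlier in this comment (the helper name `whole_matches` is made up):

```python
import re

# The pattern from the edit above: a word starting with "dog",
# or a word ending with "cat", where "word" means a run of \w characters.
pattern = re.compile(r"(\bdog\w*)|(\w*cat\b)")

def whole_matches(text):
    """Return the full text of every match, ignoring which group matched."""
    return [m.group(0) for m in pattern.finditer(text)]

for sample in ["dog.cat", "dogapple-bananacat", "dog42"]:
    print(sample, "->", whole_matches(sample))
# dog.cat -> ['dog', 'cat']
# dogapple-bananacat -> ['dogapple', 'bananacat']
# dog42 -> ['dog42']
```

Because `\w` covers letters, digits, and underscore, the match stops at `.` and `-` but still includes the digits in `dog42`.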


Out of curiosity: if humans have trouble coming up with anything non-trivial, like regexes, why should something that has been trained on the output of humans do much better?

To me it feels like if 90% of the $TASK content out there were bad and people struggled with it, then the AI-generated $TASK output would be similarly flawed, whether that's for a programming language or something else.

As a silly example, consider how much bad legacy PHP code is out there and what the answers to some PHP questions could become because of that.

But it's still possible to get answers to simplistic problems reasonably fast, or at least get workable examples to then test and iterate upon, which can easily save some time.


After all, who needs wget when you have \wcat!


Agree; the ChatGPT answer is not correct, as the assignment is to match a word that starts with `dog` and ends with `cat`. You can make .* non-greedy by adding ? at the end, but it's not needed in this case, as the engine should backtrack. Something like this should work: /\bdog[\w_-]*cat\b/ (assuming - should be allowed inside words; \w already covers _, so listing it is harmless but redundant). You can also specify word-separators ([^ ] instead of [\w_-]) if that's easier to read.
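A rough sketch of that suggestion in Python's `re` module, with a made-up sample string, in case anyone wants to see the backtracking work without the non-greedy modifier:

```python
import re

# Pattern from the comment above: one word that starts with "dog" AND ends
# with "cat". \w already includes "_", so the explicit "_" is redundant.
pattern = re.compile(r"\bdog[\w_-]*cat\b")

# Hypothetical sample text for illustration.
text = "dogcat dog-and-cat dogfood tomcat dog cat"

found = pattern.findall(text)
print(found)  # ['dogcat', 'dog-and-cat']
```

The greedy `[\w_-]*` first grabs the rest of the word, then gives characters back until `cat\b` can match, so `dogfood`, `tomcat`, and the separate words `dog cat` are all rejected.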


    \bdog\w*


Yup. See my response to the other sibling comment. In particular:

    (\bdog\w*)|(\w*cat\b)
Seems to behave exactly like I want.

https://regex101.com/r/f3uJUE/1



