It’s definitely not doing what the prompt asked for. https://regex101.com/r/ZNQa...

shagie · on Feb 1, 2023

Use \S, the complement of \s, which also avoids eating word boundaries.

    \b(dog\S*)|(\S*cat)\b

You could also use \B instead of \S, though the two mean different things: \B is a zero-width assertion, while \S consumes a character.

codetrotter · on Feb 1, 2023

It almost does the trick

https://regex101.com/r/sbpy8s/1

But this matches for example

    dog.cat

as one single word.

But I would like it to match

    dog

and

    cat

separately in this case.

Likewise, I’d want for example

    dogapple-bananacat

to be matched as two separate words

    dogapple

and

    bananacat
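For anyone who wants to reproduce this outside regex101, here is a minimal sketch using Python's `re` module. The pattern is the `\S` version suggested above; the helper name `whole_matches` is made up for illustration, and Python's engine is assumed to behave like the one used in the linked demos for these simple inputs.

```python
import re

# The \S-based pattern from the comment above, copied as written.
NON_SPACE = re.compile(r"\b(dog\S*)|(\S*cat)\b")


def whole_matches(pattern, text):
    """Return the full text of every match, scanning left to right."""
    return [m.group(0) for m in pattern.finditer(text)]


# \S also consumes "." and "-", so each input comes back as one match.
print(whole_matches(NON_SPACE, "dog.cat"))             # ['dog.cat']
print(whole_matches(NON_SPACE, "dogapple-bananacat"))  # ['dogapple-bananacat']
```

The point is that `\S*` happily runs across punctuation, which is why the two words are not split.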

After a bit more reading online I thought that maybe the following regex would do what I want:

    \b(dog\p{L}*)|(\p{L}*cat)\b

https://regex101.com/r/1NT5Ie/1

But that does not match

    dog42

as a word.
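A rough way to check this in Python, as a sketch under one stated assumption: the standard `re` module does not support `\p{L}`, so the class `[^\W\d_]` (word characters minus digits and underscore) stands in for it here. That is an approximation of "letters", not an exact equivalent. The helper `whole_matches` is a hypothetical name.

```python
import re

# ASSUMPTION: [^\W\d_] approximates \p{L}, which Python's re does not support.
LETTERS_ONLY = re.compile(r"\b(dog[^\W\d_]*)|([^\W\d_]*cat)\b")


def whole_matches(pattern, text):
    """Return the full text of every match, scanning left to right."""
    return [m.group(0) for m in pattern.finditer(text)]


# Punctuation now splits the words as desired...
print(whole_matches(LETTERS_ONLY, "dog.cat"))             # ['dog', 'cat']
print(whole_matches(LETTERS_ONLY, "dogapple-bananacat"))  # ['dogapple', 'bananacat']
# ...but digits are not letters, so the match stops before "42".
print(whole_matches(LETTERS_ONLY, "dog42"))               # ['dog']
```

So the pattern still finds `dog` inside `dog42`; it just does not take the digits along with it.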

What I want is a way to include everything after dog up to the next \b

And likewise everything preceding cat back to the previous \b

Edit: I think I’ve found it after reading https://stackoverflow.com/questions/4541573/what-are-non-wor...

    (\bdog\w*)|(\w*cat\b)

Seems to behave exactly like I want.

https://regex101.com/r/f3uJUE/1
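As a quick sanity check, the same pattern can be run against the three examples from this comment in Python. This is a sketch only; `whole_matches` is a made-up helper name, and the inputs are the ones quoted above.

```python
import re

# The \w-based pattern from the edit above, copied as written.
WORD_CHARS = re.compile(r"(\bdog\w*)|(\w*cat\b)")


def whole_matches(pattern, text):
    """Return the full text of every match, scanning left to right."""
    return [m.group(0) for m in pattern.finditer(text)]


# \w covers letters, digits and underscore, but not "." or "-".
print(whole_matches(WORD_CHARS, "dog.cat"))             # ['dog', 'cat']
print(whole_matches(WORD_CHARS, "dogapple-bananacat"))  # ['dogapple', 'bananacat']
print(whole_matches(WORD_CHARS, "dog42"))               # ['dog42']
```

Because `\w*` stops at the first non-word character, it ends exactly where `\b` would, which is the "everything up to the boundary" behavior asked for.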

KronisLV · on Feb 2, 2023

Out of curiosity: if humans have trouble coming up with anything non-trivial, like regexes, why should something that has been trained on the output of humans do much better?

To me it feels like if 90% of the $TASK content out there is bad and people struggle with it, then the AI-generated $TASK output will be similarly flawed, be it regarding a programming language or something else.

As a silly example, consider how much bad legacy PHP code is out there and what the answers to some PHP questions could become because of that.

But it's still possible to get answers to simplistic problems reasonably fast, or at least get workable examples to then test and iterate upon, which can easily save some time.

btown · on Feb 1, 2023

After all, who needs wget when you have \wcat!

paulclinger · on Feb 2, 2023

Agreed; the ChatGPT answer is not correct, as the assignment is to match a word that starts with `dog` and ends with `cat`. You can make .* non-greedy by adding ? after it, but that's not needed in this case, as the engine should backtrack. Something like this should work: /\bdog[\w-]*cat\b/ (assuming - should be allowed inside words; \w already covers _). You can also exclude the word separators instead ([^ ] in place of [\w-]) if that's easier to read.
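A small Python sketch of the "starts with dog and ends with cat" reading. It uses the class `[\w-]`, which matches the same characters as the one in the comment because `\w` already includes the underscore. The sample sentence and the name `SAMPLE` are invented for illustration.

```python
import re

# One word that begins with "dog" and ends with "cat"; "-" is allowed inside.
DOG_TO_CAT = re.compile(r"\bdog[\w-]*cat\b")

# Hypothetical test input, not taken from the thread.
SAMPLE = "dogcat dog_big-cat dogapple bananacat hotdogcat dogcatcher"

# Only words satisfying both conditions survive: "hotdogcat" has no word
# boundary before "dog", and "dogcatcher" has none after "cat".
print(DOG_TO_CAT.findall(SAMPLE))  # ['dogcat', 'dog_big-cat']
```

The greedy `*` first grabs the rest of the word and then backtracks until `cat\b` fits, which is the backtracking the comment refers to.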

mminer237 · on Feb 1, 2023

    \bdog\w*

codetrotter · on Feb 1, 2023

Yup. See my response to the other sibling comment. In particular:

    (\bdog\w*)|(\w*cat\b)

Seems to behave exactly like I want.

https://regex101.com/r/f3uJUE/1