It's impossible to make. You cannot prove any sentence was created by an LLM and...

nottorp · on Feb 21, 2024

SEO pages pushing some product are SEO pages pushing some product. You should ignore them no matter what the source is, so what does it matter if they're LLM generated or hand written?

The problem is that people keep consuming the samey low quality content instead of skipping it (think superhero movies and Netflix series that are all indistinguishable from each other). As long as they're satisfied with that, they'll fall for fake product reviews too.

solardev · on Feb 21, 2024

Maybe you can't determine that with certainty, but there may be statistical tools you can use to estimate the probably that some content came from one of the LLMs we know about based on their known writing styles?

Someone did something like that to identify HN authors (as in correlating similar writing styles between pseudonyms) a few years back, for example: https://news.ycombinator.com/item?id=33755016

Or a study applying similar analysis to LLMs: https://arxiv.org/pdf/2308.07305

Of course, LLM output can be tweaked to evade these, just like humans can alter their writing style or handwriting to better evade detection. But it's one approach.

yunwal · on Feb 21, 2024

Unless you design the LLM yourself and purposefully watermark the output. https://arxiv.org/pdf/2306.04634.pdf

vasco · on Feb 21, 2024

That's a digital signature, same as sending an email with GPG to prove you sent it. You wouldn't say that because some people use GPG you can somehow detect who wrote every email on earth, it's a push model vs pull. This is why I wrote "any sentence" vs "some sentences".

dns_snek · on Feb 21, 2024

Watermarking is not at all like a digital signature and a lot like steganography. I only have a surface level understanding of the process, but it works by biasing token selection to encode information into the resulting text in a way that's resistant to later modifications and rephrasing.

I have my doubts about the effectiveness of this method and realistically, it won't make any difference because the bad actors will just use an LLM that doesn't snitch on them, so you're technically correct.

vasco · on Feb 21, 2024

The only way to make that stenography robust is to have the encoded message be generated with some secret key that can be verified. Otherwise anyone could manually fake the stenography into human typed messages assisted by some encoder and you'd have no way of telling if it was really typed by an LLM. That line of thinking is what makes it have to be like a signature to work like you said for "any sentence". I also think these methods only work above certain character limits. Short messages are impossible to tell.

tempusalaria · on Feb 21, 2024

If you look here : GitHub.com/HNx1/IdentityLM you can see that it’s relatively easy to sign LLM output with a private key using an adaptation of the watermarking method.

vasco · on Feb 21, 2024

This application is exactly what I was describing. I'll look it over to see how it scales the encryption strength based on token length or how it deals with short messages, which is the only thing I'd think it'd be very hard to do. If you print 2 paragraphs it's easy to change some tokens with a secret key mask but if you print "Yes", it's not so easy. Thanks for the great share.

FergusArgyll · on Feb 21, 2024

you can always ask it to include a '!' after every word and then sed it away. Poof, there goes your watermark