
How is this supposed to work? By inserting special Unicode characters?

How can you watermark text?



I haven't read how Google is doing it, but one way it could be done is to nudge which tokens get sampled. For example, every other token could be required to have an odd-numbered ID (where each token is assigned an ID from 0 to 32000, or however many the vocabulary has). Then, to detect the watermark, you just tokenize the text and check whether the pattern is there. A problem with this approach is that it harms accuracy and coherence. For example, if you ask "What is 2+2" and the token "4" is token #102, but the model has to pick an odd-numbered token, it may respond with a wrong answer or yap on strangely because of its limited selection of tokens (like "The accurate answer to your mathematical query is the number Four").
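To make the parity idea concrete, here is a minimal sketch. It is a toy, not how Google does it: the "model" just picks random token IDs, and the vocabulary size and function names are assumptions made up for illustration.

```python
import random

VOCAB_SIZE = 32000  # assumed vocabulary size, borrowed from the comment above

def sample_with_parity(rng, length):
    """Toy 'model': random token IDs, with every other position forced to an odd ID."""
    ids = []
    for pos in range(length):
        tok = rng.randrange(VOCAB_SIZE)
        if pos % 2 == 1 and tok % 2 == 0:
            tok += 1  # nudge to the next odd ID (stays in range because VOCAB_SIZE is even)
        ids.append(tok)
    return ids

def parity_score(ids):
    """Fraction of constrained positions holding an odd ID.

    Roughly 0.5 for unmarked text, 1.0 for text produced by sample_with_parity.
    """
    checked = ids[1::2]
    return sum(t % 2 for t in checked) / len(checked)

rng = random.Random(0)
marked = sample_with_parity(rng, 2000)
plain = [rng.randrange(VOCAB_SIZE) for _ in range(2000)]
print(parity_score(marked), parity_score(plain))
```

A real model would have to pick the best odd-numbered token rather than bump the ID by one, which is exactly where the accuracy problem comes from.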


You can insert known spelling errors, choose certain phrasings, and more. It doesn't have to be new characters added to the text. Government security services have done stuff like this for decades to weed out moles.


Moles should know better than to utilize mountweazels! https://en.wikipedia.org/wiki/Fictitious_entry


We've been studying unintentional watermarks for years.

https://en.wikipedia.org/wiki/Stylometry


You do not even need extra characters (although they help). You can use spaces, missing punctuation, upper/lower case in particular spots, the presence or absence of conjunctions, word substitution, common misspellings, transposed letters, etc. How many extra spaces/tabs can you add to the end of a paragraph? At the beginning? Between sentences? Inside them? Then you have an AI agent design the scheme and train another one to detect it.
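As a rough illustration of the spacing trick, here is a sketch that hides bits in the gaps between sentences: one space means 0, two spaces mean 1. The function names and the encoding are invented for this example, and it assumes sentences end in ".", "!" or "?".

```python
import re

def embed_bits(text, bits):
    """Encode bits in inter-sentence gaps: one space = 0, two spaces = 1."""
    sentences = re.split(r"(?<=[.!?]) +", text)
    out = sentences[0]
    for i, sentence in enumerate(sentences[1:]):
        bit = bits[i] if i < len(bits) else 0  # pad with 0 once the bits run out
        out += " " * (1 + bit) + sentence
    return out

def extract_bits(text):
    """Recover the bits by measuring each gap that follows sentence-ending punctuation."""
    return [len(gap) - 1 for gap in re.findall(r"(?<=[.!?])( +)(?=\S)", text)]

original = "One. Two. Three. Four."
marked = embed_bits(original, [1, 0, 1])
print(repr(marked))
print(extract_bits(marked))
```

This is also trivially easy to destroy: anything that normalizes whitespace wipes the mark, which is one reason statistical schemes on word choice are more interesting.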


> SynthID-Text works by discreetly interfering in the generation process: It alters some of the words that a chatbot outputs to the user in a way that’s invisible to humans but clear to a SynthID detector. “Such modifications introduce a statistical signature into the generated text,” [...] “During the watermark detection phase, the signature can be measured to determine whether the text was indeed generated by the watermarked LLM.”


As stated in the article, it alters the probabilities that the network produces in a predictable way, so that a different (but still correct-sounding) word is picked. The wording shifts subtly from what the model would normally have output, in a way that you can detect, while still sounding correct to the user.
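For a feel of what "altering the probabilities in a predictable way" can look like, here is a sketch of a simpler, well-known style of scheme (a keyed "green list" logit boost). This is not SynthID-Text's actual algorithm; the key, bias value, vocabulary size, and the random stand-in for the model are all assumptions made up for the example.

```python
import hashlib
import math
import random

VOCAB_SIZE = 1000  # toy vocabulary size (assumption)
BIAS = 4.0         # logit boost given to "green" tokens (assumption)
KEY = b"secret"    # hypothetical watermark key shared with the detector

def is_green(prev_tok, tok):
    """Keyed pseudo-random split of the vocabulary that depends on the previous token."""
    digest = hashlib.sha256(KEY + f"{prev_tok}:{tok}".encode()).digest()
    return digest[0] % 2 == 0

def sample(logits, prev_tok, rng, watermark):
    """Softmax-sample one token, optionally boosting green tokens first."""
    if watermark:
        logits = [x + BIAS if is_green(prev_tok, t) else x for t, x in enumerate(logits)]
    top = max(logits)
    weights = [math.exp(x - top) for x in logits]
    return rng.choices(range(len(logits)), weights=weights)[0]

def generate(length, rng, watermark):
    """Generate token IDs; random logits stand in for a real model's output."""
    toks = [0]
    for _ in range(length):
        logits = [rng.gauss(0.0, 1.0) for _ in range(VOCAB_SIZE)]
        toks.append(sample(logits, toks[-1], rng, watermark))
    return toks

def green_fraction(toks):
    """Detector: share of tokens that are green given their predecessor."""
    pairs = list(zip(toks, toks[1:]))
    return sum(is_green(p, t) for p, t in pairs) / len(pairs)

rng = random.Random(0)
print(green_fraction(generate(200, rng, watermark=True)))
print(green_fraction(generate(200, rng, watermark=False)))
```

Unmarked text should land near 0.5 green while watermarked text lands well above it, so the detector only needs the key and the tokenizer, not the model. A smaller bias hurts quality less but needs more text before the signal is detectable.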


There's an article from IEEE that explains it:

https://spectrum.ieee.org/watermark#:~:text=How%20Google%E2%...



