but the font changes won't be expressed in the (plain text) output of the LLM.

yifanl · 2025-02-11T22:03:36 1739311416

Presumably the font would render each letter as a different letter's glyph, making the text useless to LLMs scraping the site but still readable to sighted visitors.

This would have detrimental effects on people who use screen readers or have their own stylesheets, of course.

WillAdams · 2025-02-11T22:19:59 1739312399

For that it would make more sense to run a routine which replaces letters with visually identical glyphs at different encoding points.
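
Roughly what I have in mind, as a minimal sketch: it assumes a small hand-picked table of Latin letters and their Cyrillic look-alikes, and the names `HOMOGLYPHS` and `obfuscate` are made up for illustration, not taken from any library.

```python
# Hypothetical look-alike table: Latin letter -> visually similar Cyrillic letter.
# A real routine would need a much larger table (and uppercase letters).
HOMOGLYPHS = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
    "x": "\u0445",  # CYRILLIC SMALL LETTER HA
}
_TABLE = str.maketrans(HOMOGLYPHS)


def obfuscate(text: str) -> str:
    """Swap every letter that has a look-alike; leave everything else alone."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    original = "peace"
    swapped = obfuscate(original)
    # Looks the same on screen in most fonts, but the code points differ.
    print(swapped, swapped == original)
```

The page would look unchanged to a sighted reader in most fonts, while a scraper gets a string that no longer matches the original words.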

palmotea · 2025-02-12T15:11:12 1739373072

> For that it would make more sense to run a routine which replaces letters with visually identical glyphs at different encoding points.

It seems like that would be pretty easy to defeat with a mapping similar to the one used to do the replacement.
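
As a rough sketch of what I mean, assuming the scraper has guessed (or rebuilt from a public confusables list) the same table the site used: the table and the name `deobfuscate` here are hypothetical.

```python
# Assumed to match whatever look-alike table the site used for the replacement.
HOMOGLYPHS = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
    "x": "\u0445",  # CYRILLIC SMALL LETTER HA
}

# Invert the table: look-alike -> original Latin letter.
_REVERSE = str.maketrans({fake: real for real, fake in HOMOGLYPHS.items()})


def deobfuscate(text: str) -> str:
    """Map known look-alikes back to Latin; other characters pass through."""
    return text.translate(_REVERSE)


if __name__ == "__main__":
    scraped = "\u0440\u0435\u0430\u0441\u0435"  # "peace" written in look-alikes
    print(deobfuscate(scraped))
```

The one catch is that this would also mangle text that is legitimately written in Cyrillic, so a scraper would want to apply it only to pages that are otherwise in a Latin-script language.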

WillAdams · 2025-02-12T17:28:18 1739381298

Yes, but it's one more hurdle for them.

chefandy · 2025-02-12T03:06:19 1739329579

That would only stymie the smallest-time players. Things like sideways text in margins or rotated table column headers are common enough that these have been solved problems for decades. Breaking the text down into specific elements and handling it differently or ignoring it altogether based on context and content is trivial.

rolph · 2025-02-12T04:11:53 1739333513

Yes, that's right: plain text would be distinguishable from a watermarked version and thus outed as an automated forgery, whereas an incorrect watermark would suggest a human attempt at forgery. This complicates generating the output, which has to preserve the cipher as well as make sense.