This is a biased article which argues in favor of DRM (as paywalls) because it is published in a paywalled magazine.
Further in, the article admits that the paywalled content is already served for free: it sits inside the page source and is only obfuscated by code that runs on the page.
US law has precedent (including a recent case about AI training) that training, reading, and transforming are not illegal if the materials themselves are legally obtained. Wholesale duplication of copyrighted material is illegal, but AI companies have already shown in court that they don't duplicate material but rather transform it at great effort and expense.
The article makes it clear that Common Crawl is ignoring copyright takedown requests, and only modified its search engine to fake having taken down content.