You don't think non-consensually revealing somebody's identity is a problem?
Resorting to DDoS is not pretty, but "why is my violent behavior met with violence" is a little oblivious and a reversal of victim and perpetrator roles.
If it's information that's medium-difficult to get, and the only people who would use it to cause harm could easily put in more effort than that, then I don't think posting that information is "violence".
Step 1: discontinue the public repository, step 2: sell access to your GPL codebase.
The GPL (and even the AGPL) doesn't require you to make your modified source code publicly available (Debian explicitly considers licenses with that requirement non-free). The GPL only requires you to provide your customers with the source code.
Sure, but it also allows your customers to modify the source code you provided, and to distribute or sell it. With MIT they can simply relicense it and sell binary-only versions. The openness stops at that point.
It also shows why this approach is questionable. Opus 4.6, without tool use or web access, can provide chardet's source code in full from memory/training data (ironically, including the licensing header): https://gist.github.com/yannleretaille/1ce99e1872e5f3b7b133e...
This comes with the uncomfortable implication that it's impossible to tell to what extent LLMs are pulling together snippets of GPL'd code, and to what extent that is legally acceptable.
Parallel creation is a very minimal defense to copyright infringement claims. It is practically impossible to prove in humans, much to the annoyance of musicians: "go prove in court that you have never heard this song, not even in the background somewhere".
LLMs, having been trained on all the software their makers could get their hands on, will fail this test. There is no parallel creation claim to be had. AI firms love to trot out "they learn just like humans", which is both false and irrelevant: it's copyright infringement when humans do it too. If you view a GPL'd repo and later reproduce the code unintentionally, that's still copyright infringement.
De facto, though, things are different. The technical details behind LLMs are irrelevant. AI companies lie and frustrate discovery, while begging politicians to pass laws legalizing their copyright infringement.
There won't be a copyright reckoning, not anymore. All the dumb politicians think AI is going to bail out their economies.
Indeed, and that's through the API. If you use Claude Chat/Code, then even if you turn off web search, it still has access to some of its tools (for doing calculations, running small code snippets, etc.), and that environment contains chardet's code 4 times.
It's not surprising that they were able to create a new, working version of chardet this quickly. It seems the author just told Claude Code to "do a clean room implementation" and to make sure the code looks different from the original chardet (named several times in the prompt), without considering the training set and the tendency of LLMs to "cheat".
Wow. The guy who’s been thanklessly maintaining the project for 10+ years, with very little help, went way out of his way to produce a zero-reuse, ground-up reimplementation so that it could be MIT licensed... and the very-online copyleft crowd is crucifying him for it and telling him to kick rocks.
Unbelievable. This is why we can’t have nice things.
Mark Pilgrim isn't even the original author; he just ported the C version to Python and has contributed nothing to it for the last 10 years.
If you take 5 minutes to look at the code, you'll see that v7 works in a completely different way: it mostly uses machine learning models instead of heuristics. Even if you compare the UTF-8 or UTF-16 detection code, you'll see that they have absolutely nothing in common.
It's just API-compatible, and the API is basically 3 functions.
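For context, here's roughly what that tiny API surface looks like in practice. This is a sketch of the usual chardet calling patterns, not anything specific to v7; the sample bytes are made up:

```python
import chardet
from chardet.universaldetector import UniversalDetector

# detect() takes bytes and returns a dict with 'encoding',
# 'confidence', and (in newer versions) 'language' keys.
result = chardet.detect(b"hello world")
print(result["encoding"], result["confidence"])

# detect_all() (added in chardet 4.0) returns every plausible
# candidate encoding, best guess first.
candidates = chardet.detect_all(b"hello world")
print(candidates[0]["encoding"])

# UniversalDetector supports incremental feeding, useful for
# large files or streams where you want to stop early.
detector = UniversalDetector()
detector.feed(b"some bytes from a stream")
detector.close()
print(detector.result)
```

Any drop-in replacement really only has to reproduce this shape: bytes in, a dict of encoding guesses out.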
If he had published this under a different name nobody would have challenged it.
Many of the anti-debugging techniques for desktop binaries do not work on WebAssembly: it can't jump to an arbitrary address, it can't read the instruction pointer, it can't read or access its own machine code, ...
Obfuscated JavaScript could still import a WebAssembly polyfill, if there really were any advantage in doing so: https://github.com/evanw/polywasm
Since WebAssembly instructions are much easier to reason about, you could probably auto-optimize away a lot of the obfuscation, like "this is a silly way to do X, so we can just do X directly".
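The "silly way to do X" rewriting described above is essentially a peephole pass. Here's a toy sketch of the idea over a made-up instruction list; the tuple encoding is illustrative and not a real WebAssembly toolchain:

```python
# Toy peephole optimizer: scan a linear instruction list for
# obfuscation patterns and replace them with the direct form.
# Instructions are (opcode, operand) tuples; this is a sketch,
# not real wasm tooling.

def peephole(instrs):
    out = []
    i = 0
    while i < len(instrs):
        # Pattern 1: pushing 0 then xor-ing is a no-op (x ^ 0 == x),
        # a classic junk-instruction obfuscation; drop both.
        if (i + 1 < len(instrs)
                and instrs[i] == ("i32.const", 0)
                and instrs[i + 1] == ("i32.xor", None)):
            i += 2
            continue
        # Pattern 2: two constant pushes followed by add can be
        # folded into a single constant at rewrite time.
        if (i + 2 < len(instrs)
                and instrs[i][0] == "i32.const"
                and instrs[i + 1][0] == "i32.const"
                and instrs[i + 2] == ("i32.add", None)):
            out.append(("i32.const", instrs[i][1] + instrs[i + 1][1]))
            i += 3
            continue
        out.append(instrs[i])
        i += 1
    return out

obfuscated = [
    ("local.get", 0),
    ("i32.const", 0),
    ("i32.xor", None),   # x ^ 0: pure obfuscation, removed
    ("i32.const", 2),
    ("i32.const", 3),
    ("i32.add", None),   # 2 + 3: folded to a single constant 5
]
print(peephole(obfuscated))
# -> [('local.get', 0), ('i32.const', 5)]
```

Because wasm's stack machine has no computed jumps or self-inspection, patterns like these stay locally visible, which is what makes this kind of mechanical cleanup plausible.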
It's mostly Rust compiled to wasm binaries. There's also TinyGo, and you could use C/C++ as well, but those are a lot less common as far as I can tell.
And Blazor, though I'm not a big fan of it; it feels like an escape hatch for WebForms developers, and I surely don't want to debug that kind of code again. MVC is a much better approach.