You don't think non-consensually revealing somebody's identity is a problem?
Resorting to DDoS is not pretty, but "why is my violent behavior met with violence" is a little oblivious and a reversal of victim and perpetrator roles.
If it's information that's medium-difficult to get, and the only people who would use it to cause harm could easily put in more effort than that, then I don't think posting that information is "violence".
Step 1: discontinue the public repository, step 2: sell access to your GPL codebase.
The GPL (and even the AGPL) doesn't require you to make your modified source code publicly available (Debian explicitly considers licenses with that requirement non-free). The GPL only requires you to provide your customers with the source code.
Sure, but it also allows your customers to modify the source code you provided, and to distribute or sell it. With MIT they can simply relicense it and sell binary-only versions. The openness stops at that point.
It also shows why this approach is questionable. Opus 4.6, without tool use or web access, can provide chardet's source code in full from memory/training data (ironically, including the licensing header): https://gist.github.com/yannleretaille/1ce99e1872e5f3b7b133e...
This comes with the uncomfortable implication that it's impossible to tell to what extent LLMs are pulling together snippets of GPL'd code, and to what extent that is legally acceptable.
Parallel creation is a very minimal defense to copyright infringement claims. It is practically impossible to prove in humans, much to the annoyance of musicians: "go prove in court that you have never heard this song, not even in the background somewhere".
LLMs, having been trained on all the software their makers could get their hands on, will fail this test. There is no parallel creation claim to be had. AI firms love to trot out "they learn just like humans", which is both false and irrelevant: it's copyright infringement when humans do it too. If you view a GPL'd repo and later reproduce the code unintentionally, that's still copyright infringement.
De facto, though, things are different. The technical details behind LLMs are irrelevant. AI companies lie and frustrate discovery, while begging politicians to pass laws legalizing their copyright infringement.
There won't be a copyright reckoning, not anymore. All the dumb politicians think AI is going to bail out their economies.
Indeed, and that's through the API. If you use Claude Chat/Code, then even if you turn off web search, it still has access to some of its tools (for doing calculations, running small code snippets, etc.), and that environment contains chardet's code 4 times.
It's not surprising that they were able to create a new, working version of chardet this quickly. It seems the author just told Claude Code to "do a clean room implementation" and to make sure the code looks different from the original chardet (named several times in the prompt), without considering the training set and the tendency of LLMs to "cheat".
Wow. The guy who’s been thanklessly maintaining the project for 10+ years, with very little help, went way out of his way to produce a zero-reuse, ground-up reimplementation so that it could be MIT licensed... and the very-online copyleft crowd is crucifying him for it and telling him to kick rocks.
Unbelievable. This is why we can’t have nice things.
Mark Pilgrim isn't even the original author; he just ported the C version to Python and has contributed nothing to it for the last 10 years.
If you take 5 minutes to look at the code, you'll see that v7 works in a completely different way: it mostly uses machine learning models instead of heuristics. Even if you compare the UTF-8 or UTF-16 detection code, you'll see that they have absolutely nothing in common.
It's just API-compatible, and the API is basically 3 functions.
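For context, here's roughly what that tiny API surface looks like in practice. This is a sketch of the usual chardet calling patterns, not anything specific to v7; the sample bytes are made up:

```python
import chardet
from chardet.universaldetector import UniversalDetector

# detect() takes bytes and returns a dict with 'encoding',
# 'confidence', and (in newer versions) 'language' keys.
result = chardet.detect(b"hello world")
print(result["encoding"], result["confidence"])

# detect_all() (added in chardet 4.0) returns every plausible
# candidate encoding, best guess first.
candidates = chardet.detect_all(b"hello world")
print(candidates[0]["encoding"])

# UniversalDetector supports incremental feeding, useful for
# large files or streams where you want to stop early.
detector = UniversalDetector()
detector.feed(b"some bytes from a stream")
detector.close()
print(detector.result)
```

Any drop-in replacement really only has to reproduce this shape: bytes in, a dict of encoding guesses out.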
If he had published this under a different name nobody would have challenged it.
Many of the anti-debugging techniques for desktop binaries do not work on WebAssembly: it can't jump to an arbitrary address, it can't read the instruction pointer, it can't read or access its own machine code, ...
Obfuscated JavaScript could still import a WebAssembly polyfill, if there really were any advantage in doing so: https://github.com/evanw/polywasm
Since WebAssembly instructions are much easier to reason about, you could probably auto-optimize away a lot of the obfuscation, like "this is a silly way to do X, so we can just do X directly".
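The "silly way to do X" rewriting described above is essentially a peephole pass. Here's a toy sketch of the idea over a made-up instruction list; the tuple encoding is illustrative and not a real WebAssembly toolchain:

```python
# Toy peephole optimizer: scan a linear instruction list for
# obfuscation patterns and replace them with the direct form.
# Instructions are (opcode, operand) tuples; this is a sketch,
# not real wasm tooling.

def peephole(instrs):
    out = []
    i = 0
    while i < len(instrs):
        # Pattern 1: pushing 0 then xor-ing is a no-op (x ^ 0 == x),
        # a classic junk-instruction obfuscation; drop both.
        if (i + 1 < len(instrs)
                and instrs[i] == ("i32.const", 0)
                and instrs[i + 1] == ("i32.xor", None)):
            i += 2
            continue
        # Pattern 2: two constant pushes followed by add can be
        # folded into a single constant at rewrite time.
        if (i + 2 < len(instrs)
                and instrs[i][0] == "i32.const"
                and instrs[i + 1][0] == "i32.const"
                and instrs[i + 2] == ("i32.add", None)):
            out.append(("i32.const", instrs[i][1] + instrs[i + 1][1]))
            i += 3
            continue
        out.append(instrs[i])
        i += 1
    return out

obfuscated = [
    ("local.get", 0),
    ("i32.const", 0),
    ("i32.xor", None),   # x ^ 0: pure obfuscation, removed
    ("i32.const", 2),
    ("i32.const", 3),
    ("i32.add", None),   # 2 + 3: folded to a single constant 5
]
print(peephole(obfuscated))
# -> [('local.get', 0), ('i32.const', 5)]
```

Because wasm's stack machine has no computed jumps or self-inspection, patterns like these stay locally visible, which is what makes this kind of mechanical cleanup plausible.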
It's mostly Rust compiled to wasm binaries. There's also TinyGo, and you could use C/C++ as well, but those are a lot less common as far as I can tell.
And Blazor, though I'm not a big fan of it; it feels like an escape hatch for WebForms developers, and I surely don't want to debug that kind of code again. MVC is a much better approach.