It's a pity that there is no clear way to send takedown requests. We didn't ask for deceptive garbage to be generated as documentation for LibreOffice, but here it is and newbies are discovering it: https://deepwiki.com/LibreOffice/core/2-build-system (spoiler: LibreOffice has never used Buck as a build system)
Out of curiosity, how come LibreOffice has .buckversion, BUCK, .buckconfig, etc? This commit[1] does seem to indicate using Buck to build at one point, though it is 10 years old.
It seems like it uses Make to build the main thing, and Buck is used to publish the Java API to Maven Central rather than for building. That's kind of a side thing. I guess code-organization-wise it would be better if Buck were not included at the root level, but maybe Buck requires it? Not a huge deal, but it at least explains how the LLM thought it was equivalent in importance.
Indeed and the BUCK file was updated 2 months ago. It definitely does use BUCK. Maybe it's not the main build system, but it's not hard to see why someone or something unfamiliar with the project would think that.
Really there should be a comment in the BUCK file explaining what it is.
I sent them a politely worded threat and they responded right away opting me out:
> Hello, I am writing you as an author of Open Source software seeking to protect my security and that of my users.
> What I would like to know is: how may I prevent deepwiki from indexing my projects, specifically those in the ----- GitHub organization? If you consider yourselves to have implicit legal permission to train on my projects and write about them, know that I hereby explicitly and permanently revoke that permission.
> Since you likely believe that I lack the authority to get you to stop, I will add this:
> To the extent allowed by law I will consider any incorrect information you publish about my projects to be libelous and, given this notice, made with your intention. LLMs have no will to act, so publishing misinformation about my project, at such time as that happens, could only be the result of human will.
Just because you "consider" incorrect information to be libelous does not mean that it is. While it's true that LLMs have no will to act, the use of an LLM to publish information that ends up being incorrect does not imply that the user of the LLM intended to post incorrect information.
IANAL, but I don't believe that lack of intent to cause harm means that relief can't be granted. You could still take legal action if they refused to remove the information. To my knowledge, a question that still hasn't been thoroughly tested from a legal perspective is to what degree users of LLMs should be reasonably expected to be aware of the potential for false information and to what degree continued use despite that knowledge constitutes willful negligence.
Humans can make mistakes too when compiling information and when mistakes are done unintentionally without the intent of causing harm, I believe liability is typically limited. I would expect the same would be true of LLM use. As long as the user of the LLM has taken reasonable precautions to ensure accuracy, I think liability probably should be limited in most cases. In the case of DeepWiki getting something wrong, I think the case for significant reputational damage is pretty weak.
It's on reasonable precautions that they seem to me likely to be what I would consider partially to wholly negligent. The lack of an ability to opt out, for example.
Sometimes you have to get angry to set boundaries.
Anyone human who puts their name to their words is free to write about my work.
If their wiki was just a wiki with human and AI contributorship that would have been a much better product and I probably would have been fine with it.
Joseph R. McConnell et al. (January 6, 2025). Pan-European atmospheric lead pollution, enhanced blood lead levels, and cognitive decline from Roman-era mining and smelting. https://www.pnas.org/doi/full/10.1073/pnas.2419630121
"eIRC is a modern, scalable enterprise messaging architecture built on the IRC protocol. Designed for organizations that require ephemeral, real-time communication without the heavy operational overhead of pub/sub systems like Apache Kafka, eIRC delivers high-throughput, low-latency chat experiences while minimizing memory and CPU usage per user."
It does support history as well: "IRC History Bridge: Implement Redis-backed buffer for message replay".
That's too far in the other direction, IMO. The IRC protocol was a poorly designed mess. Tying yourself to it means inheriting all of its bizarre quirks and limitations, and there's very little that existing IRC servers do that would be difficult to replicate.
(For a taste of just how weird and terrible IRC can be, try to answer the question "what is the maximum length of an IRC message". If your answer is a specific number, it is incorrect.)
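To make that concrete, here is a rough sketch of why no single number works. It assumes only the classic 512-byte line limit from RFC 1459 / RFC 2812 (which includes the trailing CR LF) and the fact that the server prepends the sender's `:nick!user@host` prefix when relaying. The function name `max_privmsg_payload` and the sample nick, user, and host values are made up for illustration. Real servers may truncate instead of rejecting, may cloak the host so the client cannot know its own prefix, and IRCv3 message tags get a separate allowance, so treat this as an estimate and not a guarantee.

```python
# Sketch: text bytes that fit in one relayed PRIVMSG under the classic limit.
# Assumption: only the RFC 1459 / RFC 2812 rule applies (no IRCv3 tags,
# no server-specific limits).

MAX_LINE = 512  # bytes per message, including the trailing CR LF


def max_privmsg_payload(nick: str, user: str, host: str, target: str) -> int:
    """Return how many bytes of text survive once the server relays the line.

    The relayed line has the shape:
        :nick!user@host PRIVMSG target :text\r\n
    The sender never transmits the prefix, but recipients receive it,
    so it still counts against the 512-byte limit.
    """
    prefix = f":{nick}!{user}@{host} "
    command = f"PRIVMSG {target} :"
    overhead = len(prefix.encode("utf-8")) + len(command.encode("utf-8")) + 2  # CR LF
    return max(0, MAX_LINE - overhead)


if __name__ == "__main__":
    # Same sender, same channel, different visible hostnames.
    print(max_privmsg_payload("alice", "al", "example.org", "#chan"))
    print(max_privmsg_payload("alice", "al", "x" * 63, "#chan"))
```

The two calls differ only in the hostname the server shows for the sender, yet the usable message length changes, which is the sense in which any fixed number is wrong.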
In my understanding, IRC was not designed at all. It was more like built up in an ad-hoc manner. IRCv3 is an attempt to create a clear specification.
> But that doesn't mean that I have to lay any more credence to the guys who have, since time immemorial, said "this new thing makes something easier; it will make us worse because we do not slog as much".
I don't use LLMs myself, but the anecdotes about skill atrophy seem credible. From the OP: "Even for me it shows. I tried to write some test code recently and I absolutely forgot how to write table tests because I generate all that. And it’s frightening."
"I was one of the many who reported this problem in one form or another. The problem is Windows-specific. I have found out that the problem actually comes from the Windows print system. There is no way to check that a printer is actually active or working or even present without incurring a long time-out in case the printer is not present or powered. Trying to check the default printer will incur the same time-out.
Calc apparently wants to check the printer to know how to construct its layout, and has to wait for the time-out to continue.
Some of the comments that claim that Calc hangs and never returns have probably not waited long enough for the timeout.
On my new Windows 11 computer, this printer system behavior has been changed and I no longer experience a delay while opening Calc."
"On a summer day, she was discussing literary philosophy with her brother Per Olov Jansson next to the outhouse at their summer cottage in the archipelago. Tove quoted Immanuel Kant, who Per Olov immediately downplayed. To get back at her brother, Tove drew the ugliest creature she could imagine on the outhouse wall. That drawing, out of chance, is the first glimpse of a Moomin-like figure, although Tove called it a Snork."