
I cringe every time I see Claude trying to co-author a commit. The git history is expected to track accountability and ownership, not your Bill of Tools. Should I also co-author my PRs with my linter, intellisense and IDE?

If those tools are writing the code then in general I do expect that to be included in the PR! Through my whole career I've seen PRs where people noted that code was generated (people have been generating code since long before LLMs). It's useful context unless you've gone over the generated code, understand it, and it is the same quality as if you wrote it yourself (which in my experience is the case when it's obvious boilerplate or the generated section is small).

Needing to flag nontrivial code as generated was standard practice for my whole career.


> It's useful context unless you've gone over the generated code and understand it and it is the same quality as if you wrote it yourself

If this is not the case you should not be sending it to public repos for review at all. It is rude and insulting to expect the people maintaining these repos to review code that nobody bothered to read.


Sometimes code generation is a useful tool, and maybe people have read and reviewed the generator.

The difference here is that the generator is a non-deterministic LLM and you can't reason about its output the same way.


As a rule, I commit the input to the code generation tool, i.e., what the GPL refers to as "the preferred form of the work for making modifications to it", generate as part of the build process, and, where possible, try to avoid code generation tools designed around the assumption that their output will be maintained rather than regenerated from modified input.

As for LLM code assistants, I don't really view them as traditional code generation tools in the first place, as in practice they more resemble something in between autocomplete and delegating to a junior programmer.

As for attribution, I view it more or less the same way as "dictated but not read" in written correspondence, i.e., a disclaimer for errors in the code, which may be considered rude in some contexts, and a perfectly acceptable and useful annotation in others.


"Here's what AI came up with and it mostly worked the one time I tested it. Might need improving".

No. I don't want to test and pick through your shitty LLM generated code. If I wanted the entire code base to be junk, it'd say so in the readme.


Usually, pre-LLM generated code is flagged because people aren't expected to modify it by hand. If you find a bug and track it to the generated code, you are expected to fix the sources and re-generate.

This is not at all the case with LLM-generated code - mostly because you can't regenerate it even if you wanted to, as it's not deterministic.

That said, I do agree that LLM code is different enough from human code (even just with regard to potential copyright worries) that it should be mentioned that LLMs were used to create it.


> If those tools are writing the code then in general I do expect that to be included in the PR!

How about a compiler?


Compilers don't usually write the code that ends up in a PR. But compilers do (and should) generally leave behind some metadata in the end result saying what tools were used, see for example the .comment section in ELF binaries.

Are you checking in compiled artifacts? Then yeah, we should have a chain of where that binary blob came from.

Compiler versions are usually included in the package manifest. Generally you embed the commit info, compiler version, compilation date, and platform in the binaries that compilers produce.

Absolutely. Let's say I have a problem with gRPC and traced it to code generated using the gRPC compiler. I can reproduce it, highlight it and I'm pretty sure the gRPC team would address the issue.

Replace the gRPC compiler with an LLM. Can you reproduce it? (Probably not 100%.) Can anybody fix it short of throwing more English phrases at it like "DO NOT", "NEVER", "Under No Circumstances"?

Probably not.


>It's useful context unless you've gone over the generated code and understand it and it is the same quality as if you wrote it yourself

I thought the argument was that AI-users were reviewing and understanding all of the code?


> people have been generating code since long before LLMs

How? LSTM?



For example `rails generate ...` built into the Rails CLI.

See, for example, this blog post from 2014: https://go.dev/blog/generate

The following comment in the blog post

    //go:generate stringer -type=Pill
generates a .._string.go file which contains a '.String()' method.

I would find it very reasonable to commit that with 'Co-Authored-By: stringer v0.1.0' or such.

Or 'sed s/a/b/g' and 'Co-Authored-By: sed'
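As a sketch of how such a trailer ends up in a commit (the tool name and email here are made up for illustration):

```shell
# In a throwaway repo, commit generated output with a co-author
# trailer; the second -m becomes a separate body paragraph, which
# is where trailers must live to be machine-readable.
git init -q demo && cd demo
touch pill_string.go && git add pill_string.go
git commit -m "Regenerate stringer output" \
           -m "Co-authored-by: stringer <stringer@localhost>"

# Trailers can then be extracted mechanically:
git log -1 --format='%(trailers:key=Co-authored-by)'
```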


There are many techniques. You're most likely to come across things like declarative DSLs and macros, then there are things like JAXB and similar tooling that generates code from data schemas, and some people script around data sources to glue boilerplate together and so on.

Arguably snippet collections belong to this genre.


Holy shit I’m old.

You assemble all your machine code using a magnetized needle?

I am not against the general use of AI code. Quite simply, my view is that all relevant context for a review should be disclosed in the PR.

AI and humans are not interchangeable as authors of PRs. As an obvious example: one of the important functions of the PR process is to teach the writer how to code in this project, but LLMs fundamentally don't learn the way humans do, so there's a meaningful difference in context between humans and AIs.

If a human takes the care to really understand and assume authorship of the PR then it's not really an issue (and if they do, they could easily modify the Claude messages to remove "generated by Claude" notes manually) but instead it seems that Claude is just hiding relevant context from the reviewer. PRs without relevant context are always frustrating.


What's really tricky with the legal protections area is this: 90% of the value of the S&P 500 is intangible. Meaning if you suck out the book value (10%), the rest is brand, IP, rights, sources & methods, etc. So if a company can't protect that, it's not particularly valuable anymore. Maybe we will see a shift back to tangible assets and book value (25,000 $8MM Vera Rubin machines) and away from intangibles...

I think this is just the beginning so people are apprehensive, rightfully so, at this stage. I agree with you that AI use should be disclosed, but using the commit message as a billboard for Anthropic, hell no. Go put an ad on the free tier.

You don't generally commit compiled code to your VCS. If you do need to commit a binary for whatever reason, yeah it makes sense to explain how the binary was generated.

You do usually pin your compiler version though, or at the very least set a minimum version

Don't be silly.

I use good ol' C-x M-c M-butterfly.

https://xkcd.com/378/


Sometimes using AI to code feels closer to the butterfly than to emacs, right?

A whole lot of people find LLM code to be strictly objectionable, for a variety of reasons. We can debate the validity of those reasons, but I think that even if those reasons were all invalid, it would still be unethical to deceive people by a deliberate lie of omission. I don't turn it off, and I don't think other people should either.

For the purpose of disclosure, it should say “Warning: AI generated code” in the commit message, not an advertisement for a specific product. You would never accept any of your other tools injecting themselves into a commit message like that.

My last commit is literally authored by dependabot.

well, you 100% know what dependabot does

Leaves you open to vulnerabilities in overnight builds of NPM packages that increasingly happen due to LLM slop?

You can set a minimum age for packages (https://docs.github.com/en/code-security/reference/supply-ch...), though that's not perfect (and becomes less effective if everyone uses it).
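As a sketch of the configuration (key names per GitHub's Dependabot docs at the time of writing; treat them as an assumption to verify against the linked reference):

```yaml
# .github/dependabot.yml: delay picking up fresh releases so the
# ecosystem has time to catch compromised versions.
version: 2
updates:
  - package-ecosystem: "npm"
    directory: "/"
    schedule:
      interval: "weekly"
    cooldown:
      default-days: 7
      semver-major-days: 14
```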

> becomes less effective if everyone uses it

I don’t think that’s necessarily the case. Exposure and discovery aren’t that tightly correlated. Maybe there’s a small effect, but I think it is outweighed by the fact that blast radius and spread is reduced while buying time for discovery.


But how much AI-generated code? If it's just a smallish function or two while most of the code was written by hand?

My tools just don't add such comments. I don't know why I would care to add that information. I want my commits to be what and why, not what editor someone used. It seems like cruft to me. Why would I add noise to my data to cater to someone's neuroticism?

At least at my workplace though, it's just assumed now that you are using the tools.


well if I know a specific LLM has certain tendencies (eg. some model is likely to introduce off-by-one errors), I would know what to look for in code-review

I mean, of course I would read most of the code during review, but as a human, I often skip things by mistake


Tbh as long as the PR looks good, it's good to go for internal testing.

What editor you are using has no effect on things like copyright, while software that synthesises code might.

In commercial settings you are often required to label your products and disclose things like 'Made in China' or possible adverse effects of consumption.


If a whole lot of people thought that running code through a linter or formatter was objectionable, I'd probably just dismiss their beliefs as invalid rather than adding the linter or formatter as a co-author to every commit.

A linter or a formatter does not open you up to compliance and copyright issues.

Linters and formatters are different tools than LLMs. There is a general understanding that linters and formatters don't alter the behavior of your program. And even so, most projects require a particular linter and formatter to pass before a PR is accepted, and will flag a PR as part of the CI pipeline if they fail on the code you wrote. That linter and formatter are very likely mentioned somewhere in the configuration, or at least in the README of the project.

Like frying a veggie burger in bacon grease. Just because somebody's beliefs are dumb doesn't mean we should be deliberately tricking them. If they want to opt out of your code, let them.

> frying a veggie burger in bacon grease

hmm gotta try that


I love black bean burgers (Bongo Burger near Berkeley is my classic), sounds like an interesting twist

Never fried one in bacon grease, but they are good with bacon and cheese. I have had more than one restaurant point out that their bacon wasn't vegetarian when ordering, though.

In your view, those who prefer veggie burgers are dumb. Am I misinterpreting?

I've heard similar things before. Frying a veggie burger in bacon grease to sneakily feed someone meat/meat-byproducts who does not want to eat it, like a vegan or a person following certain religious observances. As in, it's not ok to do this even if you think their beliefs are stupid.

In my view, vegans are dumb but it's still unethical to trick them into eating something they ordinarily wouldn't. Does that make sense to you? I am not asking you to agree with me on the merits of veganism, I am explaining why the merits of veganism shouldn't even matter when it comes to the question of deliberately trying to trick them.

Can you see a world where everyone has an AI Persona based on their prior work that acts like a RAG to inform how things should be coded? Meaning this is patent qualified code because, despite being AI configured, it is based on my history of coding?

Likewise. I don’t mind that people use LLMs to generate text and code. But I want any LLM generated stuff to be clearly marked as such. It seems dishonest and cheap to get Claude to write something and then pretend you did all the work yourself.

The reason I want it to be marked as such is because I review AI code differently than human code - it just makes different kinds of mistakes.

I think the issue is less attribution and more review mode. If I assume a change was written and checked line-by-line by the author, I review it one way. If an LLM had a big hand in it, I review it another way.

You can disclose that you used an LLM in the process of writing code in other ways, though. You can just tell people, you can mention it in the PR, you can mention it in a ticket, etc.

+1. If we’re at an early stage in the agentic curve where we think reading commit messages is going to matter, I don’t want those cluttered with meaningless boilerplate (“co-authored by my tools!”).

But at this point I am more curious whether git will continue to be the best tool.


I'm only beginning to use "agentic" LLM tools atm because we finally gained access to them at work, and the rest of my team seems really excited about using them.

But for me at least, a tool like Git seems pretty essential for inspecting changes and deciding which to keep, which to reroll, and which to rewrite. (I'm not particularly attached to Git but an interface like Magit and a nice CLI for inspecting and manipulating history seem important to me.)

What are you imagining VCS software doing differently that might play nicer with LLM agents?


Of course git is great!

Check out Mitchell Hashimoto’s podcast episode on the pragmatic engineer. He starts talking about AI at 1:16:41. At some point after that he discusses git specifically, and how in some cases it becomes impossible to push because the local branch is always out of date.


So if I use Claude to write the first pass at the code, make a few changes myself, ask it to make an additional change, change another thing myself, then commit it — what exactly do you expect to see then?

A Co-Authored-By tag on the commit. It's a standard practice and the meaning is self-explanatory. This is what Claude adds by default too.
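For reference, the footer Claude Code appends by default looks roughly like this (exact wording and URL vary by version, so treat this as an approximation):

```
Add retry logic to the sync worker

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
```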

I make the commits myself, I don't let Claude commit anything.

I guess if enough people use it, doesn’t the tag become kind of redundant?

Almost like writing “Code was created with the help of IntelliSense”.


I don't think so. The tag doesn't just say "this was written by an LLM". It says which LLM - which model - authored it. As LLMs get more mature, I expect this information will have all sorts of uses.

It'll also become more important to know what code was actually written by humans.


I'm not really sure that's any of their business.

If you accept the code generated by them nearly verbatim, absolutely.

I don't understand why people consider Claude-generated code to be their own. You authored the prompts, not the code. Somehow this was never a problem with pre-LLM codegen tools, like macro expanders, IPC glue, or type bundle generators. I don't recall anybody desperately removing the "auto-generated do not edit" comments those tools would nearly always slap at the top of each file or taking offense when someone called that code auto-generated. Back in the day we even used to publish the "real" human-written source for those, along with build scripts!


It's weird, because they should not consider it their own, but they should take accountability for it.

Ideally, if I contribute to any codebase, what needs to be judged is the resulting code. Is it up to the project's standards? Does the maintainer have design objections?

What tool you use shouldn't matter, be it your IDE or your LLM.

But that also means you should be accountable for it; you shouldn't hide behind "But Claude did this poorly, not me!" I don't care (in a friendly way), just fix the code if you want to contribute.

The big caveat to this is not wanting AI-Generated code for ideological reasons, and well, if you want that you can make your contributors swear they wrote it by themselves in the PR text or whatever.

I'm not really sure how to feel about this, but I stand by my "the code is what matters" line.


Sounds a bit like the label "organic (food)" could be applied to hand-written code?

Some differences with the human source for those kinds of tools: (1) the resultant generated code was deterministic (2) it was usually possible to get access to the exact version of the tool that generated it

Since AI tools are constantly obsoleted, generate different output each run, and it is often impossible to run them locally, the input prompts are somewhat useless for everyone but the initial user.


Well is it actually being used as a tool where the author has full knowledge and mental grasp of what is being checked in, or has the person invoked the AI and ceded thought and judgment to the AI? I.e., I think in many cases the AI really is the author, or at least co-author. I want to know that for attribution and understanding what went into the commit. (I agree with you if it's just a tool.)

I have worked with quite a few people committing code they didn't fully understand.

I don't mean this as a drive-by bazinga either; the practice of copying code, or thinking you understand it when you don't, is nothing new


Pre-LLM, it was much easier for reviewers to discern that. Now, the AI-generated code can look like it was well thought out by somebody competent, when it wasn't.

Have you ever reviewed an AI-generated commit from someone with insufficient competence that was more compelling than their work would be if it was done unassisted? In my experience it’s exactly the opposite. AI-generation aggravates existing blindspots. This is because, excluding malicious incompetence, devs will generally try to understand what they’re doing if they’re doing it without AI

I think the issue is not that the patches are more compelling but that they're significantly larger and more frequent

I have. It's always more compelling in a web diff. These guys are the first coworkers for which it became absolutely necessary for me to review their work by pulling down all their code and inspecting every line myself in the context of the full codebase.

I try to understand what the llm is doing when it generates code. I understand that I'm still responsible for the code I commit even if it's llm generated so I may as well own it.

Yes and if they copy and paste code they don’t understand then they should disclose that in the commit message too!

Yes, it sets the reviewer's expectations around how much effort was spent reviewing the code before it was sent.

I regularly have tool-generated commits. I send them out with a reference to the tool, what the process is, how much it's been reviewed and what the expectation is of the reviewer.

Otherwise, they all assume "human authored" and "human sponsored". Reviewers will then send comments (instead of proposing the fix themselves). When you're wrangling several hundred changes, that becomes unworkable.


Sent from my iPhone

> Should I also co-author my PRs with my linter, intellisense and IDE?

Absolutely. That would be hilarious.


Tools do author commits in my code bases, for example during a release pipeline. If I had commits being made by Claude I would expect that to be recorded too. It isn't for recording a bill of tools, just to help understand a project's evolution.

I suspect vibe coders might actually want you to consider turning to Claude for accountability and ownership rather than the human orchestrator.

If your linter is able to action requests, then it probably makes sense to add too.


Eh, there are some very good reasons[0] that you would do better to track your usage of LLM derived code (primarily for legal reasons)

[0]: https://www.jvt.me/posts/2026/02/25/llm-attribute/


legally speaking.. if you're not sure of the risk- you don't document it.

>legally speaking.. if you're not sure of the risk- you don't document it.

Ah, so you kinda maybe sorta absolve yourself of culpability (but not really: "I didn't know this was copyrighted material" has never been a defense), and simultaneously make fixing the potentially compromised codebase (someone else's job, hopefully) 100x harder because the history of which bits might've been copied was never kept.

Solid advice! As ethical as it is practical.

By the same measure, junkyards should avoid keeping receipts on the off chance that the catalytic converters some randos bring in after midnight are stolen property.

Better not document it.

One little trick the legal folks don't want you to know!


Seems ethical

Yea in my Claude workflow, I still make all the commits myself.

This is also useful for keeping your prompts commit-sized, which in my experience gives much better results than just letting it spin or attempting to one-shot large features.


No, because those things don't change the logical underpinnings of the code itself. LLM-written code does act in ways different enough from a human contributor that it's worth flagging for the reviewer.

> The git history is expected to track accountability and ownership, not your Bill of Tools.

The point isn't to hijack accountability. It's free publicity, like how Apple adds "Sent from my iPhone."


Sent from my iPad

I've heard of employers requiring people to do it for all code written with even a whiff of AI

Could be cool if your PRs link back to a blog where you write about your tools.

> Should I also co-author my PRs with my linter, intellisense and IDE?

Kinda, yeah. If I automatically apply lint suggestions, I would title my commit "apply lint suggestions".


Huh? Unless the sole purpose of the commit was to lint code, it would be unnecessary fluff to append the names of the lint tools that ran in a pre-commit hook to every commit.

well maybe?

co-authoring doesn't hide your authorship

if I see someone committing a blatantly wrong code, I would wonder what tool they actually used


You have copyright to a commit authored by you. You (almost certainly) don't have copyright (nobody has) to a commit authored by Claude.

Where is there any legal precedent for that?

In some jurisdictions (e.g. the UK) the law is already clear that you own the copyright. In the US it is almost certain that you will be the author. The reports of cases saying otherwise have been misreported; the courts found the AI could not own the copyright.


>Where is there any legal precedent for that?

Thaler v. Perlmutter: The D.C. Circuit Court affirmed in March 2025 that the Copyright Act requires works to be authored "in the first instance by a human being," a ruling the Supreme Court left intact by declining to hear the case in 2026.

And in the US constitution,

https://constitution.congress.gov/browse/article-1/section-8...

Authors and inventors, courts have ruled, means people. Only people. A monkey taking a selfie with your camera doesn't mean you own a copyright. An AI generating code with your computer is likewise, devoid of any copyright protection.


The Thaler ruling addresses a different point.

The ruling says that the LLM cannot be the author. It does not say that the human being using the LLM cannot be the author. The ruling was very clear that it did not address whether a human being was the copyright holder because Thaler waived that argument.

The position with a monkey using your camera is similar, and you may or may not hold the copyright depending on what you did: was it pure accident, or did you set things up? Opinions on the well-known case are mixed: https://en.wikipedia.org/wiki/Monkey_selfie_copyright_disput...

Where wildlife photographers deliberately set up a shot to be triggered automatically (e.g. by a bird flying through the focus) they do hold the copyright.


Guidance on AI is unambiguous.

https://www.copyright.gov/ai/

AI generated code has no copyright. And if it DID somehow have copyright, it wouldn't be yours. It would belong to the code it was "trained" on. The code it algorithmically copied. You're trying to have your cake, and eat it too. You could maybe claim your prompts are copyrighted, but that's not what leaked. The AI generated code leaked.


The linked document labeled "Part 2: Copyrightability", section V. "Conclusions" states the following:

> the Copyright Office concludes that existing legal doctrines are adequate and appropriate to resolve questions of copyrightability. Copyright law has long adapted to new technology and can enable case-by-case determinations as to whether AI-generated outputs reflect sufficient human contribution to warrant copyright protection. As described above, in many circumstances these outputs will be copyrightable in whole or in part—where AI is used as a tool, and where a human has been able to determine the expressive elements they contain. Prompts alone, however, at this stage are unlikely to satisfy those requirements.

So the TL;DR basically implies pure slop, within the current guidelines outlined in the conclusions, is NOT copyrightable. However, for collaboration with an AI, copyrightability is determined on a case-by-case basis. I will preface this all with the standard IANAL, I could be wrong, etc., but with the concluding language calling prompt-only output merely "unlikely" to satisfy the requirements, it sounds less cut and dried than you imply.


That's typical of this site. I hand you a huge volume of evidence explaining why AI generated work cannot be copyrighted. You search for one scrap of text that seems to support your position even when it does not.

You have no idea how bad this leak is for Anthropic because with the copyright office, you have a DUTY TO DISCLOSE any AI generated work, and it is fully RETROACTIVE. And what is part of this leak? undercover.ts. https://archive.is/S1bKY Where Claude is specifically instructed to HIDE DISCLOSURE of AI generated work.

That's grounds for the copyright office and courts to reject ANY copyright they MIGHT have had a right to. It is one of the WORST things they could have done with regard to copyright.

https://www.finnegan.com/en/insights/articles/when-registeri...


I merely read the PDF articles you linked, then posted, verbatim, the primary relevant section I could find therein. Nowhere does it say that works involving humans in collaboration with AI can't be copyrighted. The conclusions linked merely state that copyright claims involving AI will be decided on a case by case basis. They MAY reject your claim, they may not. This is all new territory so it will get ironed out in time, however I don't think we've reached full legal consensus on the topic, even when limiting our scope to just US copyright law.

I'm interpreting your most recent reply to me as an implication that I'm taking the conclusions you yourself linked out of context. I'm trying to give the benefit of the doubt here, but the 3 linked PDF documents aren't "a mountain of evidence" supporting your argument. Maybe I missed something in one of those documents (very possible), but the conclusions are not how you imply.

Whether or not a specific git commit message correctly cites Claude usage may further muddy the waters more than IP lawyers are comfortable with at this time (and therefore add inherent risk to current and future copyright claims of said works), but those waters were far from crystal clear in the first place.

Again, IANAL, but from my limited layman perspective it does not appear the copyright office plans to, at this moment in time, concisely reject AI collaborated works from copyright.

Your most recent link (Finnegan) is from an IP lawyer consortium that says it's better to include attribution and disclosure of AI to avoid current and future claim rejections. Sounds like basic cover-your-ass lawyer speak, but I could be wrong.

Full disclosure: I primarily use AI (or rather agentic teams) as N sets of new eyeballs on the current problem at hand, to help debug or bounce ideas off of, so I don't really have much skin in this particular game involving direct code contributions spit out by LLMs. Those that have any risk aversion, should probably proceed with caution. I just find the upending of copyright (and many other) norms by GenAI morbidly fascinating.


> because with the copyright office, you have a DUTY TO DISCLOSE any AI generated work,

I was not aware of that. Who has that duty and when do they have it?


Currently, the US copyright application process has an AI disclosure requirement for the determination of applicability of submitted works for protections under US copyright law.

The copyright office still holds that human authorship is a core tenet of copyrightability, however, whether or not a submission meets the "de minimis" amount of AI-generated material to uphold a copyright claim is still being decided and refined by the courts and at the moment the distinction appears to fall on whether the AI was used "as a tool" or as "an author itself", with the former covered in certain cases and the latter not.

The registration process makes it clear that failure to disclose that a submission was in large part authored by a contractor or an AI can result in rejection of the copyright claim, now or retroactively upon discovery.


You do not apply for copyright. In the US you can, optionally, register a copyright. You do not have to, but it can increase how much you get if you go to court.

I do not know whether any other country even has copyright registration.

Your main point that this is something the courts (or new legislation) will decide is, of course, correct. I am inclined to think this is only a problem for people who are vibe coding. The moment a human contributes to the code that bit is definitely covered by copyright, and unless you can clearly separate out human and AI contributed bits saying the AI written bits are not covered is not going to make a practical difference.


My (limited) understanding was that without formal registration you cannot file infringement suits over works protected by said copyright. Then what's the point of the copyright other than getting to use that fancy 'c' symbol?

You do, as the developer. Let's circle back to the original comment that started this discussion:

https://news.ycombinator.com/item?id=47594044

That comment is spot on. Claude adding a co-author to a commit is documentation to put a clear line between code you wrote and code claude generated which does not qualify for copyright protection.

The damning thing about this leak is the inclusion of undercover.ts. That means Anthropic has now been caught red handed distributing a tool designed to circumvent copyright law.


Can you tell me where exactly in the documents you linked it says that?

It's beyond obvious that a LLM cannot have copyright, any more than a cat or a rock can. The question is whether anyone has or if whatever content generated by a LLM simply does not constitute a work and is thus outside the entire copyright law. As far as I can see, it depends on the extent of the user's creative effort in controlling the LLM's output.

It may be obvious to you, but it has led to at least one protracted court case in the US: Thaler v. Perlmutter.

> The question is whether anyone has or if whatever content generated by a LLM simply does not constitute a work and is thus outside the entire copyright law.

It's going to vary with copyright law. In the UK the question of computer-generated works is addressed by copyright law and the answer is "the author shall be taken to be the person by whom the arrangements necessary for the creation of the work are undertaken"

It's also not a simple case of LLM-generated vs human-authored. How much work did the human do? What creative input was there? How detailed were the prompts?

In jurisdictions where there are doubts about the question, I think code is a tricky one. If the argument is that prompts are just instructions to generate code, and therefore the code is not covered by copyright, then you could also argue that code is instructions to a compiler to generate code, and the resulting binary is not covered by copyright.


The binary should be considered a "derivative work". Only the original copyright owner has the exclusive right to create or authorize derivative works. That means you are not allowed to compile code unless you have a license to do so. Right?

Yes, so is LLM generated code a derivative work of the prompts? Does it matter how detailed the prompts are? How much the code conforms to what is already written (e.g. writing tests)?

It looks like it will be decided on a case by case basis.

It will also differ between countries, so if you are distributing software internationally, that will be a constraint on treating the code as not copyrightable.


According to the law, if I use Claude to generate something, I hold the copyright, provided Claude didn't verbatim copy another project.

Why wouldn't Anthropic own it? They generated it.

It is not "beyond obvious" that a cat cannot hold copyright, given the lawsuit about a monkey holding copyright [1], and the way PETA tried to use that case as precedent to establish that any animal can hold copyright.

[1] https://en.wikipedia.org/wiki/Monkey_selfie_copyright_disput...


Anthropic could at least make a compelling case for the copyright.

It becomes legally challenging with regard to ownership if I ever use work equipment for a personal project. If it later takes off, they could very well try to claim ownership in its entirety simply because I ran a test once (yes, there's a whole Silicon Valley season about it).

I don't know if they'd win, but Anthropic absolutely would be able to claim the creation of that code was done on their hardware. Obviously we aren't employees of theirs, though we are customers that very likely never read what we agreed to in a signup flow.


Using work equipment for a personal project only matters because you signed a contract giving all of your IP to your employer for anything you did with (or sometimes without) your employer's equipment.

Anthropic's user agreement does not have a similar agreement.


My point was that they could make a compelling case though, not that they would win.

I don't know of any precedent where the code was literally generated on someone else's system. It's an open question whether that implies any legal right to the work, and I could pretty easily see a court accepting the case.


Who owns the copyright for something not written by anybody, you ask? Is it the man who pays to have it written, or the owner of the machine that does the writing? But it is neither. Nobody owns the copyright because nobody has written it.

I think all you need to do is claim that your girlfriend is your laptop. /s

That's very nice and I ended up with the same font I tend to use (Source Code Pro) vs the font I used before (Noto Sans Mono). Some features I'd love to see:

- An Elo-based version with many more variables, so that I can open the site from time to time and find more nice fonts

- Some global stats

- Not losing the leaderboard after reloading

- Spline Sans Mono
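For reference, the Elo scheme the first bullet suggests boils down to a simple pairwise update after each font "duel". A minimal sketch (font names, starting rating, and K-factor are all illustrative, not anything the site actually implements):

```python
# Minimal Elo rating update for pairwise font comparisons.
def elo_update(winner, loser, ratings, k=32):
    """Update `ratings` in place after `winner` beats `loser`."""
    rw, rl = ratings[winner], ratings[loser]
    # Expected score of the winner under the Elo model
    expected_w = 1 / (1 + 10 ** ((rl - rw) / 400))
    # Winner gains what the loser forfeits, scaled by surprise
    ratings[winner] = rw + k * (1 - expected_w)
    ratings[loser] = rl - k * (1 - expected_w)

ratings = {"Source Code Pro": 1500, "Noto Sans Mono": 1500}
elo_update("Source Code Pro", "Noto Sans Mono", ratings)
# From equal ratings, the winner gains k/2 = 16 points
```

With enough comparisons this converges to a stable leaderboard even when each visitor only judges a handful of pairs.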


Maybe this is a bit US-centric; direct negative feedback is very common in many cultures, e.g. Dutch culture.


Definitely sounds like the US.

When I worked at Radboud University in the Netherlands for a summer, they were definitely more direct than I was used to, and kept work more work-focused. But they also combined that with a culture of quitting on time and going out to socialize a bit before dinner, which I think was vital to sustaining interpersonal connections.

I liked that style a lot, but Americans are very bad about quitting on time, which necessitates more socialization at work itself.


Probably. I'm from the US, and I know a few Dutch people, and I find their approach to direct negative feedback off-putting to the point of feeling rude, even when knowing what to expect from them. (I'm sure they find my communication style long-winded, frustrating, and a waste of their time.)

It's a cultural thing, to be sure, and what you grew up with and are used to tends to dominate how you feel about things.


IMHO the Dutch are more direct for the same reason they are less sensitive to authority and approach their superiors as equals.

The Netherlands effectively being a river delta, there was always the threat of water, a force greater than anyone. IOW, if a flood comes, both the king and the peasant start digging.

This is completely different from the neighboring countries, the UK and Germany, which both traditionally had a strong sense of hierarchy and of not contradicting the master.


> IOW if a flood comes, both the king and the peasant start digging.

By the same reasoning, India, Bangladesh and China — all ancient civilizations threatened by great rivers — should have developed similar egalitarian cultures but the reality is the polar opposite.

Maybe something as complex as a human civilization can't be the result of just one geographical feature.


India and China are huge countries in which river deltas make up a very small percentage of the land, so they're not comparable by any means. Bangladesh is a very young country that inherited its culture from India. I'm not an anthropologist, but sorry, I don't agree there is any likeness with these countries.

If you have another theory about Dutch culture and why it is so different from its neighbors', I'm happy to hear it.


> Maybe this is a bit US-centric,

You are violating the principle itself by saying this. :-)

(Yes, I am aware it does not apply here.)

It is EXTREMELY US-centric and frankly as a Brit who lived and worked in Central Europe and was previously engaged to a Norwegian, I find Crocker's rules laughable.

How it looks to me is:

"Use European manners with me. Don't waste your words or my time. Shut up and get on with it."


...at least until we get real Test-Time Training (TTT) that encodes the state into model weights. If vast amounts of human knowledge can be compressed into ~400 GB for frontier models, it's easy to imagine the same for our entire context.


There is a fantastic book on this topic called "By Design: Why There Are No Locks on the Bathroom Doors in the Hotel Louis XIV" that covers the inception of modern brands. It's a very similar story to what happened with watches. The industrial revolution made quality clothing accessible to everyone, so a garment's appearance was no longer a clear indicator of status, and brands started placing their emblems on the outside of the fabric instead of the inside. At the time it was seen as gross, obvious, and in bad taste. Guess we've had a century to normalize it.


> At the time it was seen as gross, obvious and bad taste.

It's still gross, obvious, and in bad taste.


Looks great. I just ordered it. Thanks for the recommendation.


I also got the recommendation from here, happy to see I'm giving it back!



Another big factor is that many areas of the US feature horizontal architecture instead of vertical (i.e. Los Angeles vs NY). A bus stop in those areas covers orders of magnitude fewer people within the same radius.


Plus a long queue of yet-undiscovered architectural improvements


I'm surprised there isn't more "hope" in this area. Even things like the GPT Pro models; surely that sort of reasoning/synthesis will eventually make its way into local models. And that's something that's already been discovered.

Just the other day I was reading a paper about ANNs whose connections aren't strictly feedforward; rather, circular connections proliferate. It increases expressiveness at the (huge) cost of ruling out the current gradient descent algorithms. As compute gets cheaper and cheaper, these things will become feasible (greater expressiveness, after all, equates to greater intelligence).


It seems like a lot of the benefits of SOTA models are from data though, not architecture? Won't the moat of the big 3/4 players in getting data only grow as they are integrated deeper into businesses workflows?


That's a good point. I'm not familiar enough with the various moats to comment.

I was just talking at a high level. If transformers are HDD technology, maybe there's SSD right around the corner that's a paradigm shift for the whole industry (but for the average user just looks like better/smarter models). It's a very new field, and it's not unrealistic that major discoveries shake things up in the next decade or less.


With such high throughput thanks to sparsity, I'm particularly interested in distilling it into other architectures. I'd like to try a recurrent transformer when I have the time.


Using an ebook and a Bluetooth page turner solved it for me.


What page turner do you use?


A cheap one from AliExpress with only forward/backward keys.

