
Copyright was never conceived to apply to technology like this, and the onslaught of copyright suits (like the NYT one) underscores its fundamental rent-seeking nature. No doubt these latest changes to GPT-4 are in response to the suits they're presently fighting. However these cases are ultimately resolved, the end user will be the biggest loser.


People generate all of the data going into the system and then the middlemen (OpenAI, Microsoft, Google, Big Tech middleman of the week) reap a disproportionate centralized benefit. That is a bigger problem than the so-called rent-seeking behavior of copyright holders in this case, as it has the net effect of leveraging human creativity to devalue that very creativity and continue the erosion of the middle class.

Bad things happen when you let middlemen get the upper hand, like the American health care system, or big finance disconnected from the real economy. I'll vote against the middleman every time in favor of the original value creator, because society goes down the toilet when middlemen win.


What is the alternative, though? I agree with the sentiments of the anti-AI people who want it to pay for copyright, but I never hear any consideration of what comes next.

This is going to end up being the music industry all over again. It's going to be impossible for any individual or small company to get the rights needed, and instead we're going to get massive content labels selling the rights, or only giant corporations able to hop through all these new hoops.

We don't want a repeat of that as a society, creating yet another leeching middleman and horrible industry favoring only the incumbents.


I don’t see it ending like that. LLMs will just be taught not to emit copyrighted content verbatim. Whatever the courts end up deciding, they’ll be trained to stay just this side of legal. I’m certain it’s already being worked on.
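
One plausible mechanism (purely my speculation, not a confirmed technique from any vendor): hash every long n-gram of the protected corpus offline, then scan candidate output for overlaps and regenerate on a hit. A minimal sketch in Python, assuming a made-up corpus and an 8-word threshold:

    import hashlib

    NGRAM = 8  # assumed threshold: an 8-word overlap flags likely verbatim copying

    def ngram_hashes(text, n=NGRAM):
        # Hash every sliding window of n words, lowercased.
        words = text.lower().split()
        return {hashlib.md5(" ".join(words[i:i + n]).encode()).hexdigest()
                for i in range(len(words) - n + 1)}

    # Built once, offline, from the (hypothetical) protected corpus.
    protected = ngram_hashes("full text of the protected corpus would go here")

    def looks_verbatim(candidate):
        # Any shared n-gram suggests the model is echoing training data.
        return bool(ngram_hashes(candidate) & protected)

A sampling loop would regenerate or truncate whenever looks_verbatim returns True. A real system would need paraphrase detection on top, since trivial rewording defeats an exact-match check.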


Yes, when I read "rent-seeking" I assumed OP meant OpenAI.

Google search at least was just a link to content we wrote. OpenAI just steals it.


OP was obviously referring to the copyright holders whose data he feels so entitled to.


Open models are a thing. Rather than attacking the technology (which is great) with litigation to hurt a few bad actors, we should attack the capitalist rules that enable rent-seeking middleman parasites to flourish.


Yes. If you think about it, the individual is being subjected to a man-in-the-middle attack, cleaving a creator from their creation via the use of consent agreements for providing a platform. Rent seeking.


The artist or author might end up being the loser, and the multi billion corporation harvesting their work might make an unearned profit off it.

To me personally it's crazy how many people think that we would be better off without any kind of copyright protection. Copyright solves many real world problems and protects people against having a company profit off their work... but as soon as AI is involved so many people start to advocate for throwing it away.


If companies are required to purchase licenses for everything they train on, it will guarantee that only huge corporations with deep pockets can produce powerful models. Microsoft will be slightly inconvenienced, Stability AI will be destroyed. Some artists might get a payday, but most of the money will go to companies with large copyright libraries like Getty. The general quality of all models will decrease. I don't see any other possible outcome.


Almost a year ago, I made¹ the following prediction:

It looks like to me that many companies want to use the new generative tools, and many others want it not to impact their stake in the copyright system. I’m pretty sure they will both come to a compromise which will leave most users without any benefits, either from reduced copyrights or from availability of generative tools. It’s what would make both powerful parties satisfied (if not happy), and will impact the status quo the least.

Say, for instance, that they instituted a mostly mandatory licensing scheme, so that an individual artist had no choice but to allow use of their art as input when creating generative tools. People using art in this way have to pay a rather high licensing fee, but it is paid not to the artist but to some sort of central copyright office. Huge copyright holders can also pay an exorbitantly high fee (to the same recipient) to opt out of licensing. Win-win-win: existing copyright holders keep their existing copyrights, only large-ish actors can create new generative tools, and new political positions and institutions are created with lots of money flowing in. Of course, artists then get screwed by being co-opted into generative tools which they can never afford to create themselves, and the general public gets robbed both of the opportunity of using and creating new generative tools, and of any less restrictive copyright law.

1. <https://news.ycombinator.com/item?id=35191112>


For music there are already similar mechanisms in place in many countries: in Poland it's ZAiKS, in the US it's ASCAP. They collect fees from organisations playing copyrighted music publicly.

(I agree that it would be terrible if they began collecting for other kinds of copyrighted content and for training purposes, because it would lead to centralisation.)


Sacem in France.

They're the worst; e.g. they will notoriously come after you even when you play public domain music.


I hope you’re wrong, but I think you’re right.


In agreement with your "slightly inconvenienced": the world's dozen or so largest publishers have market caps averaging below $10bn each.

"Even" just OpenAI alone could pocket a few of them if they need easy sources of acquiring content.

This includes the largest educational publishers. And while these publishers do not own all their content, the reality is most authors earn so little that an "allow AI training on my work for $x extra" clause would give them vast amounts of content.

As for Getty, Getty has a market cap of "only" $2bn. The big players will easily afford to build or buy libraries like that.

But of course it will be the end of decent open models.


> it will guarantee that only huge corporations with deep pockets can produce powerful models

It will also guarantee that the financial means to continue making that data, which is clearly so important, are preserved. Someone has to pay for the crafting of the data.


For many artists this is not about "getting a payday" and is instead about "not being replaced by AI". So the outcome you describe would probably sound great to those artists.


How did dock workers feel when containerized shipping started gaining popularity? Should we have let them all continue putting things on ships piece by piece and stacking and unstacking each shipment by hand?

How did portrait artists feel when photography was gaining popularity? Should we have let them control the industry so that if we want to record a memory of a person we must have them stand or sit for hours while someone draws them?

etc.


Man, there's always someone in these discussions who will smugly tell us that this is all inevitable and our empathy for the creatives in our economy is misplaced. To you I give a hearty fuck you.


No, I am describing what happens when technology makes the market for certain jobs and talents change. The stevedores may have had a bad time for a while but our modern society only exists because we can ship things quickly and efficiently.

I feel bad for copy editors and people who write corporate blog posts or design logos or come up with ad jingles, but their niche is gone now and they need to adapt.

Thanks for being respectful and cordial though.


I often see these processes described as passive economic mechanisms we are subjected to, rather than as decisions we all make collectively and actively accept, with excuses rooted in the neoliberal understanding of our time as to why those people deserve to have their jobs made redundant and their livings wrenched from them.

To me, it's a kind of cowardice that people like you shrug your shoulders at and sigh and say "that's just the way things are". You can say that's just how the markets work. I don't have to respect you for it.


I am not saying that artists are going to stop being a thing. We will keep buying books written by people and watching movies directed by people, and people will still make music and what have you, but it will be different. The music industry was completely different in 1900, when there were no mass recordings available; different again in the 1950s with popular radio and records; and the 2000s brought the internet and MP3s.

Things change -- people's jobs will be different. It isn't going to mean artists will stop making art or machines will make everything bland; it is just a new tool that will change industries and make things easier for people to do well, and thus make more art. Some people won't be able to live well doing the same thing they do now, but what they do now isn't what they would have been doing in their grandparents' time.


I'd say you are creating a bit of a straw man there. The commenter you are responding to didn't say that's just the way things are. It feels like you are making their argument for them.

They gave some examples from the past and showed that society adapted.

We could try to improve our society and systems to have a safety net, education that allows us to adapt to rapidly changing technologies, etc., sure, but that's a whole discussion in itself.


If you give people freedom (good thing, right?) and tools exist to perform a task in a variety of different ways (some faster/more efficient than others) people will naturally gravitate towards using the most efficient tools to gain a competitive advantage, and other people will prefer work produced with those tools because it's better/cheaper. As long as better tools exist and people are free, this is just the way things are gonna play out.

If you're angry that independent artists are being fucked over by bigcorp, AI tools aren't the battle you should pick, because it's a guaranteed loss for a lot of very logical reasons, and it's just another example of a pattern of oppression enabled by our social and political systems. Even if by magic you managed to change something there'd just be another inequality coming down the pipe shortly after.


The good artists are already using AI, just like they photobashed, traced templates and used camera obscuras to produce better art faster down through the ages. A true artist transcends medium to focus on message.


AI is a tool. Different artists use different tools. Some good artists use AI. Many good artists will not be interested in that particular tool.


I don't think most people believe we are better off without copyright. I think people believe that copyright protects specific concrete expressions and that fair use exists to allow others to build on ideas in transformative ways. It's not clear where building a learning model from this work sits in this context, hence the court cases.

Also, it's a subtle difference, but copyright is not intended to solve the problem of companies profiting off of artist's works, it is intended to promote the progress of science and useful arts. It attempts to do this by giving creators limited exclusive rights.


How does locking away most of the knowledge, research and learning materials in the private vaults of a few publishing houses for their personal profit promote the progress of science I wonder?

Even scientists are tired of the predatory and rent seeking behaviour of the publishers they have fallen prey to and are looking for any way out.

This is not promoting progress; it is the opposite of it.


I think it grossly mischaracterizes what copyright protects to describe it as "most of the knowledge, research and learning materials". Still, I agree that the extensions of copyright length and the behavior/incentives of publishers work against the original intent of copyright. Having said that, publishers only have control of copyright because authors give it to them. Copyright rests with the creator — the system where people are compelled to sign this over to publishers is a different (but of course related) problem. Scientists who are tired of the predatory behavior of publishers have other choices today. It's not clear what alternative you are proposing.


> vaults of a few publishing houses for their personal profit

Because they made it, it wouldn't exist without them, and others value it. If this data wasn't objectively valuable, we wouldn't be having this discussion.


> but as soon as AI is involved so many people start to advocate for throwing it away

No, I've been hearing it for years.

Don't try to portray people holding this opinion as if they were AI zealots.

It's brought up in discussions about torrents, Disney, streaming platforms, music, etc.


Yes. I've been aware of the intellectual property debate at least back to the great crackdown on sampling around when Paul's Boutique was released. And following it in depth from around the time Lawrence Lessig made arguments to the Supreme Court.

A large chunk of the tech community was following that case, and most on HN seemed highly sceptical of the status quo.


How does it protect a small artist against a large corporation profiting off their work?

I don’t even have the means to start litigation, let alone see it through.

It only protects those who are already moneyed and/or famous enough to negatively impact a large corporation’s reputation - and even in those cases it’s mostly for the benefit of the lawyers and bureaucrats who make a living off it.


If you register your work, which requires some effort but is not prohibitively expensive or difficult, you can sue for statutory damages, which are substantial enough (up to $150k for willful infringement) that lawyers will work on contingency. There are many individual artists who have been successful here. The law actually has some real teeth that individuals can use to protect their work.


It would be nice if there were a preventative concept, where the role of the creator as predator, seeking and suing, was mostly reversed, so that others would instead ask for permission, and maybe get the rights to copies through a fair exchange of money, like a license. We could call this "copy rights".


> The artist or author might end up being the loser, and the multi billion corporation harvesting their work might make an unearned profit off it.

Exactly like before AI you mean then? Except instead of OpenAI it was Disney, Universal and other large corporations on that same seat.

>to me personally it's crazy how many people think that we would be better off without any kind of copyright protection.

Why exactly should I care that the old billionaire copyright corps are dying? What would I gain from defending them, when what they did, as far as I remember, was privatize culture for their own benefit, and even exert a large negative influence on tech?

The copyright system, being so unequal and skewed towards multi-billion-dollar companies, dug its own grave.


The copyright issue seems unchanged. Anyone taking wholesale quotes from another entity is likely in violation of copyright law. If someone uses AI, and posts the output from it as their own work, and that work contains copyrighted material, the person who posted it is in violation of copyright. AI is just a tool they chose to use and they remain responsible for remaining in compliance with copyright law.

What we need is a reasonable way for people using AI to determine which parts of the text or images they have are subject to copyright.


Just a tool that required billions of dollars worth of copyrighted material to be created.


How can you possibly argue that taking a bunch of text and creating an application that creates text isn't transformative?

The tool itself unambiguously is fair use.


Whether something is transformative is only one of the four factors courts weigh in a fair use analysis.


> Anyone taking wholesale quotes from another entity is likely in violation of copyright law

What do you mean anyone?

Is Sony liable when you play an entire movie on their TV? Is Nuance liable when you use their Dragon screen reader to verbalize an entire NYT article? Is Google liable when you display an entire webpage in Google Chrome? How about if you switch to Dark Mode? Is that a transformative use?

Why would AI be any different? It’s just a tool at the end of the day!


The problem is people at large companies creating these AI models, wanting the freedom to copy artists’ works when using it, but these large companies also want to keep copyright protection intact, for their regular business activities. They want to eat the cake and have it too. And they are arguing for essentially eliminating copyright for their specific purpose and convenience, when copyright has virtually never been loosened for the public’s convenience, even when the exceptions the public asks for are often minor and laudable. If these companies were to argue that copyright should be eliminated because of this new technology, I might not object. But now that they come and ask… no, they pretend to already have, a copyright exception for their specific use, I will happily turn around and use their own copyright maximalist arguments against them.

(Copied from a comment of mine written over a year ago: <https://news.ycombinator.com/item?id=33582047>)


> these large companies also want to keep copyright protection intact, for their regular business activities

Care to share an example? I haven't heard of OpenAI or anyone else arguing copyright or trying to sue anyone for infringing it. If anything, their business decisions rest on an assumption that copyright will not help them protect their work.


Prime example for you right here:

https://nypost.com/2023/12/18/business/openai-suspends-byted...

100% pure unadulterated hypocrisy from "OpenAI".


T&C yes, but not copyright. This is fully consistent with them opposing copyright and not opposing paywalls/api limitations.


Don't they have an explicit T&C that says you are not allowed to use their output for training other models?


T&C yes, but not copyright.


I was mostly thinking of large companies also creating their own AI, like Google, Microsoft, etc.


If their model was leaked, you can be sure they’d claim copyright protection on it.


I wanted to say that they are too smart to expect the DMCA to protect them.

But then, I think that surely they would use copyright to block competition from using their model directly.


Because ChatGPT users are the only people that are worth considering.


OpenAI is a force trying to cut a slice from the copyright pie that the big copyright hoarders hold. The hoarders will not strike back and try to kill OpenAI's business, because they will not be able to kill the technology itself in any case. So, obviously, it's better for them to have OpenAI as a partner and share some profit with them to control the AI field, than to kill this one and wait for another AI menace to rise.

OpenAI is not the one who would kill copyright. They just want their cut.


Why should 'big tech' corporations be allowed to use AI to remix/mash-up human-generated content all of a sudden when creative individuals have generally been prohibited from doing it for so long?


Wow, I didn't know creative individuals have been banned from remixing copyrighted material in their own private works.

We must tell the millions of kids who doodle characters in their notebooks that this is prohibited.


The data is not the technology.


[flagged]


Physical property is a consequence of the laws of physics. If I have a gold coin in my hand, then you don't have that gold coin in your hand. If you want the coin then you can either trade for it or fight me, but either way only one of us can have the coin. Even a bird with a worm knows a concept of physical property.

Copyright is something that was invented relatively recently, a few hundred years ago, because of a new technology back then: the printing press. Before the printing press there was no need for copyright.

Now today we again have a new technology in neural networks, and it's entirely possible that the realities of this technology push us back in the other direction, undoing what the printing press did.


The world you describe is the one that turned former communist countries into the underdeveloped entities they are today. By not respecting people's right to own property, you disincentivise them from adding value. The printing press analogy is relevant: similar to how we made rules about how that tool could be used, we now need rules on how to protect people's creations from being taken away by force using AI. Physical property rules are not the result of "physics". They are the result of evolving beyond the savagery of taking by force that which doesn't belong to you. If it weren't protected by law, it would be protected by the sword. I get that some people would prefer that type of living, but by and large civilised humans don't.


> Physical property is a consequence of the laws of physics. If I have a gold coin in my hand, then you don't have that gold coin in your hand.

So do you believe people should only be able to own real estate that they are currently physically occupying? By your reasoning, no one could ever own a piece of land larger than what they were currently standing or lying on. So no one could own land. So abolition of all property rights, essentially.

I'm hella down for this, but then we should also be able to walk into the OpenAI offices and inspect their source code, cause that shit won't fit in any one hand afaik.

This is an incredibly naive misunderstanding of how property rights work: they are 100% a social and conceptual construct. IANAL, but I believe you are confusing property with possession.

But, yeah. I'm down: no one can own anything that isn't currently in their hand. Let's go liberate a lot of fake property that is "owned" in violation of the laws of physics!


and more than that: even if we would agree that works should not be protected, we're currently in a highly-asymmetrical position where big players like Microsoft can take people's hard work but give nothing back. The only way to survive under a copyright regime is viral licensing.


You probably didn't mean to, but implying that's how the whole of humanity works is a bit out there.

It's the local rules (geographically and temporally), sure. But rules can be changed.


Sure, for reasonable copyright terms. Currently, if you create something when you're young and live long, a 150-year copyright term is entirely possible (life + 70 years).

Much as I appreciate someone's rights to their work, things should enter the public domain in something more like 10 to 20 years. Even then, copyright protections are too strong while in force. You published something so people would use it; your ability to limit how they use it should be restricted to protecting you from folks selling it as their own. I am also in favor of forced standard licensing terms.

Like, say, after five years there should be a standard streaming licensing fee for films and shows, such that anyone can broadcast/stream/sell copies for a flat rate.


We also have to consider the cost of enforcement. We can't be soaking up millions and billions of taxpayer dollars to protect copyrights or to field complaints that aim to protect mutated copies of said works. Just like you don't send a SWAT team to enforce parking tickets, we have to consider what is actually at stake for the New York Times or other copyright holders before clogging up the courts.

There's a reason lawyers are so quick to file a suit, and it's that it costs nothing to sic the dogs of the American justice system on others.


Kim Dotcom's adventure calmed things down for the last wave of digital IP theft. Once that happens to one or two AI copyright disbelievers, the rest will calm down.


But is it really a problem if the AI is transmitting the information in its own words? And even if that is considered illegal, doesn’t it significantly diminish said crime?


AI doesn't transmit information in its own words. It has no "own", no "self". It does what it was programmed to do, just like any other type of software. It turns out that some people using AI have made it ingest content without permission so they can resell it for profit. That should not be permitted. My property is not yours to take unless you agree to my terms. I did not give you permission to download my data, art, code or text, to ingest it into a token database and then resell it in any shape or form, derived or not. No ifs, no buts. If you want it, you have to pay for it or respect my terms. The bulk of AI companies respect that. A handful of sociopaths don't. They are the issue.


I wouldn't call ANYONE disrespecting terms and conditions they may have agreed to a sociopath. Not everything in a contract is enforceable just because it's written there, whether or not both parties signed it. And unless it's spewing out copyrighted materials "verbatim" there is an argument to be made that the LLM learned to talk from an open source and inserted knowledge from a copyrighted one.

However this turns out for private AI, I hope at the very least it can be considered fair use. Monetized LLMs can be forced to pay up or follow terms but individuals should be able to pool together and create open source models. I'm not saying I have the exact legal arguments for why this would work but LLMs in their current forms need to exist.


I absolutely agree that LLMs should exist. Torrents still exist and have their purpose. Criminals have always argued that their crime is not really a crime and found all kinds of arguments in its favour. Similarly, people developing AI that doesn't respect others' property use all sorts of wild arguments in their favour: AI learns like a human, it benefits society, other countries will use it against us, and so on. That doesn't mean we should give in to their demands to destroy society and people's lives so they can have a competitive advantage over honest people. The fact that they want to steal, destroy the entire industries they take from, and demolish norms so they can make their software appear intelligent makes them sociopaths.


A significant portion of the training set for most image generation tools is stuff made in the last 10-20 years harvested from the internet, if not the last 5 years. We're not talking about 150 years of copyright protection here, we're talking about the time frames you suggest. Artists want to protect their own work and their livelihood, and AI is being trained on the work they're actively putting out right now. You would have to shorten copyright duration to something like 5 years to come remotely close to making modern image generation models possible without violating artist copyrights.

Text is different and much less difficult since its history as a medium is much longer - if you excluded the last 10-20 years of prose from your LLM it would probably still be very good at writing. But excluding the last 20 years of digital illustration and photography would be limiting yourself to a much lower-fidelity training set.


Your work is not free from derivation, which is what GPT-4 produces in the overwhelming majority of cases. If there are small outliers where it regurgitates something word for word, we can handle them like most other instances of copyright infringement: file a takedown notice, and that particular phrase can be explicitly filtered out after output generation. Easy.
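
As a rough sketch of what that post-generation filtering could look like (my own illustration; the phrase list, names, and redaction policy are all made up):

    import re

    # Hypothetical blocklist, populated as takedown notices come in.
    TAKEDOWN_PHRASES = [
        "an exact sentence a rightsholder has flagged",
    ]

    def filter_output(text):
        # Redact any flagged phrase, case-insensitively, before the
        # response reaches the user.
        for phrase in TAKEDOWN_PHRASES:
            text = re.sub(re.escape(phrase), "[removed]",
                          text, flags=re.IGNORECASE)
        return text

Whether maintaining such a list scales to every article ever published is exactly the objection the reply below raises.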


I agree about the derivation bit, but “File a takedown notice for every NYT article ever published after proving GPT can reproduce each one” is not what I would call a clean solution. That’s basically a regulatory DDOS attack.

Current copyright law is simply not equipped to handle LLMs, I think.


It's what they do anyway. They file suit after takedown after DMCA and never hesitate to drag court cases out over months and years, wasting everybody's time to make sure grandma pays up because someone in her house was using Napster.


I, for one, will enjoy watching lawyers and AI fight to the death.


I already love AI too much to enjoy it.


No, no one gets to get away with breaking the law over and over and over again with a simple "whoopsies" each time they get caught. There needs to be penalties.


So YouTube should be shut down the first time a copyrighted work is uploaded to it?


Yeah, except I paid for the work I derived mine from. I paid taxes to learn in school, I paid for textbooks, I paid to see a painting, I paid to watch a movie, and I paid even to learn how to speak and do math. Stop stealing and pay what you owe. Easy.


Are you really suggesting that learning from watching others, going to the library, taking from the public domain, etc. is a form of theft?


>Corporate communism want to take that away

contentless, thrashing drivel


I'm sure you'd feel the same way if it was your life's work these systems were hoovering up and regurgitating.



