
A publisher publishes. Once something is published, once it is public, the control a publisher has over it is limited. For example, a publisher cannot choose who reads a book after it is sold, or who reads an article after it is printed. It is not a given at all that a publisher should have any say about being indexed. The search engine relies on a public fact: X wrote Y. That's legal (limitations apply).

Brave's position of not accepting being blacklisted unless all search engines are blacklisted is a pragmatic one. It works against Google's search monopoly, but still gives Brave some legal cover since the robots.txt is not completely ignored, in case that indeed matters somewhere. I think it's an elegant approach well suited for the current state of the web, one that serves the greater good. And Brave, but that's completely fine.
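To make that policy concrete, here is a rough sketch of how I read it, not Brave's actual implementation: a page counts as off-limits only if the robots.txt also blocks a set of reference crawlers. The function name `may_crawl`, the agent list and the `ExampleBot` name are my own assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed reference crawlers; the policy described above names no specific agents.
REFERENCE_AGENTS = ("Googlebot", "Bingbot")


def may_crawl(robots_txt: str, url: str, reference_agents=REFERENCE_AGENTS) -> bool:
    """Return True if at least one reference crawler may fetch `url`.

    Hypothetical helper: the crawler ignores rules aimed only at itself and
    backs off only when the reference search engines are blocked as well.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return any(parser.can_fetch(agent, url) for agent in reference_agents)


# A site that singles out one (made-up) crawler but allows everyone else.
selective = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Allow: /
"""

# A site that blocks every crawler.
block_all = """\
User-agent: *
Disallow: /
"""

selective_result = may_crawl(selective, "https://example.com/article")
block_all_result = may_crawl(block_all, "https://example.com/article")
```

Under this reading the first site still gets crawled and the second does not, so the robots.txt is respected whenever it applies to everyone.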



A search engine index is an economic exchange between the search engine and the publisher.

To massively (over)simplify the argument to its essence (and ignore other important points): the publisher goes through the trouble and expense of creating the content. The publisher then allows its content to be copied by a search engine only because being shown in search results gets it traffic back. The traffic it gets in return has value, and the publisher is happy for this arrangement to continue as long as the value of the traffic is more than the cost of producing and serving the content.

Brave offering a "license", for its own financial benefit, to "allow" others to use the content for LLM training gives zero benefit to the original publisher. This is why I use words like "sleazy" to describe Brave's position.

This argument applies to Google and Microsoft. Right now both are failing at citing sources in their generative AI search results. That is terrible and I hope it's fixed soon, as otherwise they're being sleazy scrapers as much as Brave is.

Finally, I wholeheartedly disagree that what Brave is doing is for the "greater good". The fact that they charge extra for the "license" to use the content for LLM training shows that.


> A search engine index is an economic exchange between the search engine and the publisher.

A search engine index is a search engine index. It can have an economic impact, but it cannot be an economic exchange, since it is a technical artifact.

Though I think I understand what you are trying to say: this is also a commercial relationship where both sides can profit. You are free to interpret the relationship between publishers and search engines through such a capitalist lens, but that does not mean those mechanisms govern the actual rules. Whether a publisher is happy with what happens here is of no real concern. If any rules apply, we are talking copyright, maybe media law, where happy is not a relevant category (ok, that of course can matter, but it wouldn't here if the search engine is exercising a right).

I did not touch the LLM training data in my comment, as I did not read up on what Brave is really doing there. If Brave were really to sell complete texts from others, that would not be legal under copyright laws I'm familiar with, so I kinda doubt they do that.


It doesn't look like there's a restriction. The content is publicly available to anyone.


You are completely wrong. A publisher controls the publication of a work, text or other, including its duplication and licensing. Other companies cannot xerox a book and sell it. Clear?


Indexing has clear analogues from before the Internet. It is obviously not copyright infringement. Quoting a small snippet to give context for a search result also obviously isn't copyright infringement.


So much so that publishers created special law to make unpaid snippets illegal in the context of global search engines, specifically to target Google, outside of copyright. Happened in the EU, Canada (I think) and Australia.


Clear?!

I specifically gave the example of selling it not being covered by what I'm talking about. But other parties can definitely "xerox" a book and read it. Use its content. Talk about what is written in it. Quote it. There are limits to the controlling rights of a publisher.


Xeroxing a book and selling it isn't legal, you realize that, right?


> I specifically gave the example of selling it not being covered by what I'm talking about.


Useless. This discussion is about Brave selling the indexed content via their API.


No it's not.



