Power Ventures' actions violated the CFAA because they bypassed security measures intended to make the content not-exactly-public. The judge dismissed the copyright infringement claims even though Power Ventures hosted "cached" versions of the scraped profiles using Facebook's trade dress. The damages + discovery sanction came from their bypassing security restrictions while scraping, creating profiles and using bots with those profiles to reach more information than was public, ignoring Facebook's explicit cease-and-desist the first time, and not complying with discovery. Read the case.
>I could go on, but I am not sure what exactly you’re trying to argue here.
That public is public. If you leave the door open in the real world, someone CAN enter your home. If you host your image on a public webpage, they can scrape it. robots.txt is not a security measure, nor is it a contract that magically gives the right to scrape where it wasn't given previously. It is a gentleman's agreement that a scraper who knows about the robots.txt can still ignore if they want to be a dick about it. Ethically, that's wrong, but it is how it is.
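To make the "gentleman's agreement" point concrete, here is a rough sketch of what honoring robots.txt looks like from the crawler's side, using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleBot` user agent, the `example.com` URLs, and the `is_allowed` helper are all made up for illustration. The point is that the check is a function call the crawler chooses to make; nothing on the server enforces the answer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only; not taken from any real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /gallery/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url.

    This is purely advisory: a crawler that never calls this function
    (or ignores its result) can still request the URL.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/gallery/cat.png",
                "https://example.com/about.html"):
        print(url, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

A polite crawler skips any URL where this returns False. An impolite one just doesn't ask, and the HTTP request goes through either way.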
Not to mention, as I came to understand while reading during this discussion, LAION wasn't even crawling: they were using a public Common Crawl dump to gather their images. Common Crawl had crawled the author's site previously. LAION just took that data and got image links out of it.
1. they weren't selling a dataset
2. the artist didn't "disable" scraping in any meaningful way, legally
3. linking to the image is not illegal, and in Germany they're justified in responding with an invoice to recover legal fees for this dumb copyright complaint
4. downloading images and training neural nets on them may fall under fair use, or it may not. it always depends on the context and the specific case.
No trespassing signs have legal weight under certain conditions, and it's up to the judge. I've been in a court case where simply taking pictures from outside the property and showing that the gap between each no trespassing sign was more than 100ft wide was enough for a judge to throw out the charges. I was dirt biking on the power company's property, you see. It hinges on the defendant not noticing such restrictions. In both cases, the robots.txt meant jack. Even their security measures usually meant jack. Instead, it was a legal cease and desist from a lawyer that constituted "no authorization."
There’s a lot of precedent here showing scraping isn’t guaranteed to be acceptable.
https://en.wikipedia.org/wiki/Facebook,_Inc._v._Power_Ventur....
$79,640.50 in compensatory damages + $39,796.73 discovery sanction
I could go on, but I am not sure what exactly you’re trying to argue here.