beeboobaa · on March 20, 2024

Depending on the license, how is this legal?

altairprime · on March 20, 2024

In the US: fair use disregards licenses. Fair use can be found to apply, or found not to apply, by a court of law. Archival of information is generally felt to be fair use, as is search indexing.

ThunderSizzle · on March 20, 2024

The real argument is when is AI re-use of copyrighted material a violation of copyright. That is a large grey area that will probably be determined in favor of large corporations and not in favor of individuals. (As in, Disney can use AI writers to copy you, but you won't be allowed to copy Disney)

ryan-c · on March 20, 2024

Software Heritage aggressively insists on French law... which does not have fair use.

simonw · on March 20, 2024

Which bit? The archiving on SoftwareHeritage, the gathering of that data into the Stack or the subsequent training of models?

s4mw1se · on March 20, 2024

That’s a good question… There seem to be two problems.

The definition of open source depends on a license existing in a repo. Without a license it’s not legal to copy and distribute.

Public vs. private repo is the platform's issue, not the code maintainer's.

If a public repo does not have a license, it does not mean it is free to copy and distribute.

If a private repo has an open source license like MIT, then the crawler has a right to copy and distribute that repo. Regardless if it has authorization to access the repo or not.

marcinzm · on March 20, 2024

> Without a license it’s not legal to copy and distribute.

Yes it is. Due to both the terms you agree to when you use GitHub and the general Implied License that covers everything public on the internet.

https://en.wikipedia.org/wiki/Field_v._Google,_Inc.

vharuck · on March 20, 2024

Looking at that ruling, it seems the case you linked to hinged on a fact not applicable with the Stack:

>Field had actual knowledge of the Googlebot. He also was aware of the ways to prevent Google from either listing his site at all or listing it but not providing a link to the cached version. Instead of opting out, however, he chose to allow Google to both index and provide a link to the cached version.

For the AI dataset, (A) did the person know their work was being collected by this group and for this purpose, and (B) did they know of a way to prevent that collection?

AshamedCaptain · on March 20, 2024

It is not clear to me if they are _only_ using GitHub as source. The Stack explicitly mentions they are using Software Heritage as source and Software Heritage definitely sources from repositories that are NOT stored in GitHub (and never have been).

iamacyborg · on March 20, 2024

I don’t think that “implied license” you’re referring to holds up in the courts.

to11mtm · on March 20, 2024

Hopefully the crawler is smart enough to properly handle edge cases...

E.g., if the repo has some sort of /used-licenses/ folder where the licenses for packages and the like are included, it could make a bad decision.

orf · on March 20, 2024

> Without a license it’s not legal to copy and distribute.

Is this true? When you post anything publicly, from sticking a poster on the street to making artwork like banksy, isn’t the default set to “it’s legal to copy, unless explicitly stated otherwise”?

amarshall · on March 20, 2024

The default in the majority of the world is that most creative works (including software code) are by-default copyrighted by the author, and the author must explicitly license away those rights. Some jurisdictions (e.g. France) put limits on what rights the author is allowed to give up. I.e., the default is it is illegal to copy (subject to exemptions like “fair use”).

ryan-c · on March 20, 2024

Note that this archive project is French.

fweimer · on March 20, 2024

Banksy apparently runs a licensing program. Their artwork is most definitely under copyright, and they rely on trademark protection as well.

There is also the practical issue that a lot of content is posted publicly without the consent of the copyright owner. It's simply not true that just because someone else committed a copyright violation first, you can commit further violations with impunity based on that first violation.

zzo38computer · on March 20, 2024

> If a public repo does not have a license, it does not mean it is free to copy and distribute.

Whether or not it is free to copy and distribute, it should be free to copy and distribute. (My opinion is that copyright is no good; if the file is public then you should be allowed to copy and distribute it.)

> If a private repo has an open source license like MIT, then the crawler has a right to copy and distribute that repo. Regardless if it has authorization to access the repo or not.

I should not think so. The license would only apply if you have a copy of it anyways. If you are not authorized to access it because it is private, then you would have to get a copy from somewhere else, and if nobody else is providing a copy, that shouldn't give you the right to unauthorized access. However, if it has been done, then it is done, so now there is a copy, and the license (if it is a license that allows copying it in this way) would authorize you to continue to use and distribute the copy that you have.

s4mw1se · on March 21, 2024

I’m not saying I agree or care about any of it. A sane company would never allow the use of source code from a third party without a license.

If a repo is forked and the license is deleted, the source code would need to be hashed to verify it's the exact version of an open source repo. Mainly they don’t want a copyleft or “malicious” license infecting their IP.

If the hashes don’t match then it’s not technically the same code, so a company can’t safely use it without a license.

beeboobaa · on March 19, 2024

If something works on OS release version 1 then it should still work on OS release version 2.

Or in Apple vernacular, it should just work.

beeboobaa · on March 19, 2024

First day on the internet?

beeboobaa · on March 17, 2024

How could it be done better?

westurner · on March 18, 2024

Could they be made from steel that is under spec for other applications?

There's not yet an awesome-coral restoration markdown README.md; or any mentions of both "MARRS" and "Reef Cubes".

Do coral prefer steel to other materials like concrete, sargassumcrete, hempcrete, sugarcrete, formed CO2, or IDK cellulose; and can you just add iron to the mix or what do coral prefer?

Can RUVs and robots deploy coral scaffolding safely at scale underwater?

What other shapes would solve?

westurner · on March 20, 2024

"Researchers create green steel from toxic red mud in 10 minutes" (2024) https://newatlas.com/materials/toxic-baulxite-residue-alumin... :

> Researchers have turned the red mud waste from aluminum production into green steel

Perhaps green steel for steel coral reef scaffolding

beeboobaa · on March 17, 2024

PMTiles is great for a static background. You can easily do a layer on top with GeoJSON like you say, or even just by storing your own data in Postgres and using PostGIS ST_AsMVT to turn it into a vector tile layer. Stick a cachebuster in the URL and an HTTP cache in front and you can call it a day.
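
A minimal sketch of the cachebuster idea, in TypeScript. The `z/x/y.mvt` path layout, the `v` query parameter, and the `tileUrl` helper are assumptions for illustration; they are not part of PMTiles or PostGIS. The server behind the URL would be whatever wraps `ST_AsMVT`.

```typescript
// Hypothetical helper: builds the URL for one vector tile served from your
// own endpoint (for example, one backed by PostGIS ST_AsMVT).
// `dataVersion` is the cachebuster: change it whenever the underlying data
// changes, and the HTTP cache in front sees a brand-new URL.
function tileUrl(
  baseUrl: string,
  z: number,
  x: number,
  y: number,
  dataVersion: string,
): string {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}/${z}/${x}/${y}.mvt?v=${encodeURIComponent(dataVersion)}`;
}
```

Because the version is part of the URL, the cache can hold each tile for a long time; bumping `dataVersion` (a timestamp or a row-version counter would both work) is what invalidates it.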

beeboobaa · on March 13, 2024

AI has always meant Artificial Intelligence. Intelligent and capable of learning, like a person.

LLMs are not AI.

outworlder · on March 13, 2024

> LLMs are not AI.

Neither are neural networks, by that definition. Or 'machine learning' in general. They all have been called "AI" at different points in time. Even expert systems – that are glorified IF statements – they were supposed to replace doctors.

Jensson · on March 13, 2024

People thought those techniques would ultimately become something intelligent, hence AI, but they fizzled out. That isn't the doubters moving the goalposts; it is the optimists moving the goalposts, always thinking what we have now is the golden ticket to truly intelligent systems.

beeboobaa · on March 14, 2024

Correct, we don't have AI yet.

readthenotes1 · on March 13, 2024

Some people are incapable of learning. Therefore, LLMs are AI?

As far as I recall, the Turing test was developed long ago to give a practical answer to what was and was not artificial intelligence, because the debate over the definition is much older than we are.

beeboobaa · on March 14, 2024

Everyone is capable of learning, or else they'd have died as a toddler, or any time since when they tried to cross the road.

kristov · on March 13, 2024

I think the Turing test is subjective, because the result depends on who was giving the test and for how long.

beeboobaa · on March 13, 2024

> Ideally developers could let the user know their caps lock key is activated.

That would be up to the User Agent (the browser), not the website.

ninkendo · on March 13, 2024

I dream of a parallel universe where browsers took the lead in crafting innovative UIs for standard web forms, with things like password prompts behaving intelligently, dropdowns supporting advanced autocomplete, excellent date pickers, caps lock reminders on password dialogs, etc.

Websites could have been simple to make with basic markup, leaving UX niceties to the browser vendors.

The world we live in is about as far from that as you get, with the stock UI for <input> elements being about par for 1992 UI toolkits, if even that.

makeitdouble · on March 13, 2024

The mobile platforms were a chance to reboot that part, and have the browser do a lot more UI-wise with custom handling of the different data types (dates, passwords, phone numbers, ranges, etc.).

It just didn't pan out to tablets and desktop computers. But it might not be too late?

stephenr · on March 13, 2024

> It just didn't pan out to tablets and desktop computers. But it might not be too late?

Safari has a caps lock indicator on password fields on every platform, and has had it for several years - at least since Safari 12 on macOS 10.12 (circa 2018), possibly longer, but I don't have an older VM to test on.

Gecko has an open feature request for this, from 2 years ago: https://bugzilla.mozilla.org/show_bug.cgi?id=1757348

Chromium has an open feature request for this, from 3.5 years ago: https://issues.chromium.org/issues/40722752

pooper · on March 13, 2024

In particular, the input select multiple is atrocious with default styling.

Retr0id · on March 13, 2024

Well, they're not, which means web developers are picking up the slack.
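
For sites that do pick up the slack, a minimal TypeScript sketch of a caps lock check. It relies on `getModifierState("CapsLock")`, which keyboard and mouse events expose in browsers; the `ModifierStateSource` interface, the `capsLockWarning` helper, and the warning text are assumptions made up for this illustration.

```typescript
// Structural type covering the one event method we need, so the logic can
// be exercised without a browser. Real KeyboardEvent objects satisfy it.
interface ModifierStateSource {
  getModifierState(key: string): boolean;
}

// Returns a warning string when Caps Lock is on, or null when it is off.
function capsLockWarning(event: ModifierStateSource): string | null {
  return event.getModifierState("CapsLock") ? "Caps Lock is on" : null;
}

// In a page, this would typically be called from keydown/keyup listeners on
// the password field (the variable names here are hypothetical):
//   input.addEventListener("keyup", (e) => {
//     hint.textContent = capsLockWarning(e) ?? "";
//   });
```

The state is only observable when an event fires, so a page cannot know the caps lock state before the first key press or click; a browser-level indicator does not have that limitation.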

beeboobaa · on March 13, 2024

Stay in your lane, and if you really feel the itch go contribute to Firefox or something.

Retr0id · on March 13, 2024

If you think I'm a front-end developer, you are mistaken. I just empathize with them.

beeboobaa · on March 13, 2024

https://en.wikipedia.org/wiki/Generic_you

Retr0id · on March 13, 2024

Ah, so you were merely replying at me, rather than to me.

beeboobaa · on March 15, 2024

This is a public forum, we're not DMing

pimlottc · on March 13, 2024

How would the browser know when/how to display the caps lock status? It doesn't know what any given web site is doing with respect to keyboard input. Firefox adds a caps lock indicator on the text cursor but not all pages use standard input fields. They might use custom UI elements, or no visual elements at all. Some sites may not even care about caps lock (e.g. an arcade game).

beeboobaa · on March 13, 2024

`<input type='password'>`, just like TFA was talking about?

stephenr · on March 13, 2024

> That would be up to the User Agent (the browser), not the website.

Relevant: Safari has done this for ages.

beeboobaa · on March 12, 2024

Maybe I don't want to have to worry about if a PWA is good enough, and will remain good enough?

beeboobaa · on March 9, 2024

It shouldn't be a problem if you only train on legally acquired data. You will know the author's name and can contact them if you so wish.

astrange · on March 10, 2024

There aren't any laws that require "acquiring" something in a way that "knows the author's name".

theferalrobot · on March 9, 2024

I don't think any of the major players could do that for all their data and they are acquiring it legally.

pests · on March 10, 2024

What? How do you know the data you're buying isn't AI generated by the sellers?

If they are scamming and you contact them, of course they will lie.

So how does this work?

beeboobaa · on March 8, 2024

> What's actually missing that's stopping this from working?

Proper support on all platforms. No point working on PWAs that have janky tooling (reason: see previous sentence) when they're only going to work decently on Android devices anyway.

evilduck · on March 9, 2024

So why would you build a native Android app if PWAs work better? There are way more web developers than Android developers, and you would avoid the Play Store fees. Sounds cheaper to me. What part of iOS is invalidating the value proposition for Android here?

You also didn’t answer what is missing. What is missing? What’s this insurmountable problem that’s solved everywhere else? Why is janky tooling attributable to Apple?

beeboobaa · on March 9, 2024

Try reading my post again, maybe? The tooling is pretty janky because no one does this yet. No point torturing yourself with janky tooling when you only get to target Android anyway...

evilduck · on March 10, 2024

Again, not answering a thing but making up a claim you aren't willing or able to support. How is this supposed web development tooling jankiness attributable to Apple today? Feature detection is a solved problem. What tooling are you even referring to? You aren't even trying to support this with a concrete claim. This is nonsense and you know it.

PWAs are the perfect scapegoat of infinite nebulous whining. The definition of a progressive web app might as well be "whatever Chrome has but Safari doesn't, no matter what year it is or how those features change, and no matter how terrible of an idea they might be even on Chrome".

beeboobaa · on March 12, 2024

> Proper support on all platforms. No point working on PWAs that have janky tooling (reason: see previous sentence) when they're only going to work decently on Android devices anyway.

If you need it spelled out for you:

* WebUSB

* WebBLE

* WebSerial

* WebGL

* Many more standards Apple refuses to implement because it would let developers break free of their walled garden

Without being able to target Apple devices, why would I, or anyone, bother using these technologies and invest in their tooling? Just make a native Android app with quality tooling that's been around for a decade and be done with it.

evilduck · on March 13, 2024

Right, I wanted you to spell it out because I was expecting you to write that exact sort of nonsense. Your first three examples aren't even web standards; they're experimental features in Chrome that not even Firefox supports. The fourth actually is supported by Safari/iOS. What missing standards are stopping you from writing Progressive WEB Apps? Be exact please.

And... even if you wanted to build a serial-port enabled "Works only in Chrome" PWA today (lol, we both know you're not), there's no tooling jankiness stopping you from doing so. Checking for `if ("serial" in navigator) { ... }` requires no tooling at all; it's just plain JavaScript. You'd just choose to show an error message for browsers like Safari and Firefox that don't support it.
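
A minimal TypeScript sketch of that feature check, generalized slightly so it can run outside a browser. The `hasApi` and `serialStatus` helpers and the message strings are assumptions for illustration; only the `"serial" in navigator` test itself comes from the comment above.

```typescript
// Generic form of the `"serial" in navigator` check: true when the host
// object exposes the named API, false otherwise.
function hasApi(host: object, name: string): boolean {
  return name in host;
}

// Decides what to tell the user. The message text is illustrative only.
function serialStatus(host: object): string {
  return hasApi(host, "serial")
    ? "Web Serial available"
    : "This browser does not support Web Serial";
}
```

In a page this would be called as `serialStatus(navigator)`, with the fallback message shown in place of the serial-port UI.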

I'm not convinced you're even arguing in good faith here. Well, I never was because PWA whiners never are, but you've proven you're not.