
feels like it could be nice to abide by the license terms https://bria.ai/bria-huggingface-model-license-agreement/

> 1.1 License. BRIA grants Customer a time-limited, non-exclusive, non-sublicensable, personal and non-transferable right and license to install, deploy and use the Foundation Model for the sole purpose of evaluating and examining the Foundation Model. The functionality of the Foundation Model is limited. Accordingly, Customer are not permitted to utilize the Foundation Model for purposes other than the testing and evaluation thereof.

> 1.2. Restrictions. Customer may not: [...] 1.2.2. sell, rent, lease, sublicense, distribute or lend the Foundation Model to others, in whole or in part, or host the Foundation Model for access or use by others.

> The Foundation Model made available through Hugging Face is intended for internal evaluation purposes and/or demonstration to potential customers only.



Many of these AI licenses are far more restrictive than old-school open source licenses were.

My company runs a bunch of similar web-based services and plans to add a background remover at some stage, but as far as I know there are no current models with a sufficiently permissive license that can also feasibly be downloaded and run in the browser.


Meta's second Segment Anything Model (SAM2) has an Apache license. It only does segmenting, and needs additional elbow grease to distill it for browsers, so it's not turnkey, but it's freely licensed.


Yeah, that one seems to be the closest so far. Not sure if it would be easier to create a background removal model from scratch (since that's a simpler operation than segmentation) or to distill it.
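For anyone weighing the distillation route: the core idea is just training a small student network to mimic the teacher's per-pixel mask outputs. A toy numpy sketch of the loss, assuming the teacher's mask logits are available as arrays (the function and shapes are illustrative, not SAM2's actual API):

```python
import numpy as np

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-target loss: the student mimics the teacher's per-pixel
    foreground probabilities. Temperature softens both distributions."""
    # Convert logits to probabilities (elementwise sigmoid with temperature).
    s = 1.0 / (1.0 + np.exp(-student_logits / temperature))
    t = 1.0 / (1.0 + np.exp(-teacher_logits / temperature))
    # Mean squared error between the two probability maps.
    return float(np.mean((s - t) ** 2))

# A student that matches the teacher exactly gives zero loss.
logits = np.zeros((1, 256, 256))
loss = distillation_loss(logits, logits)
```

In practice you'd run this over batches of images, with the teacher frozen and only the small student being updated.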


I got pretty far down that path during Covid for a feature of my saas, but limited to specific product categories on solid-ish backgrounds. Like with a lot of things, it’s easy to get good, and takes forever to get great.


Keep in mind that whether or not a model can be copyrighted at all is still an open question.

Everyone publishing an AI model acts as if they own the copyright to it, and shares it under a license on that basis, but there's no clear legal basis for that claim at this point. It's all about pretending, and hoping the law will later change to make the claim valid.


Train on copyrighted material

Claim fair use

Release model

Claim copyright

Infinite copyright!


It is a 2024 model. For comparison, https://github.com/danielgatis/rembg/ uses U2-Net, which is open source from 2022. There is also https://github.com/ZhengPeng7/BiRefNet (another 2024 model, also open source); it's not too late to switch.


It's kind of silly to complain about not abiding by the model license when these models are trained on content not explicitly licensed for AI training.

You might say that the models were legally trained since no law mandates consent for AI training. But no law says that models are copyrightable either.


AI model weights are probably not even copyrightable.


Surely they would at least be protected by Database Rights in the EU (not the US):

>The TRIPS Agreement requires that copyright protection extends to databases and other compilations if they constitute intellectual creation by virtue of the selection or arrangement of their contents, even if some or all of the contents do not themselves constitute materials protected by copyright

https://en.wikipedia.org/wiki/Database_right


Those require the "database" in question to be readable, and every single element of it to be readable too. Model weights don't satisfy that requirement.


At some point the world's going to need a Richard Stallman of AI: someone who builds up a foundation that is usable, reasonably licensed, and not under the total control of major corporations. OpenAI was supposed to fit that mold.


The repo doesn’t include the model.


Does the site not distribute it?


It doesn't, except that it runs it. There's no download link or code playground for running arbitrary code on it, so while it technically transfers the model to the computer where it's running (I think), that's not usually considered the same as distributing it.


Pretty sure downloading it to your browser counts as distributing it, legally speaking.


I think it's a bit more subtle than that. The code of this tool runs in your browser and makes it download the model from Hugging Face. So the tool does not host the model or provide it to you; it just performs the download on your behalf, directly from where the owner of the model put it. The author of this tool is not providing the model to you, just automating the download for you. Not saying it's not a copyright violation, and IANAL, but it's not an obvious one.
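To make the distinction concrete: all such a client-side tool ships is a pointer to where the owner published the weights, not the weights themselves. A minimal sketch of building that pointer (the repo id is the real one under discussion; the ONNX file path is an assumption about the export layout):

```python
def hf_model_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the Hugging Face 'resolve' URL that the browser fetches
    directly. The page never proxies the bytes; the client downloads
    straight from Hugging Face's servers."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

# briaai/RMBG-1.4 is the model in question; "onnx/model.onnx" is a guess.
url = hf_model_url("briaai/RMBG-1.4", "onnx/model.onnx")
# -> "https://huggingface.co/briaai/RMBG-1.4/resolve/main/onnx/model.onnx"
```

Whether handing the client that URL counts as "distributing" the model is exactly the legal gray area being debated here.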


AYAL?


Sure!


Yeah, that doesn't sound right to me.


What's the point of running it in WebGPU then?

I think it's either running the full model in the browser or at least a small part of it there. Maybe it's downloading parts of the model on the fly. But I kinda doubt it's all running on the server, with just some simple RPC calls out to the browser's WebGPU.


> What's the point of running it in WebGPU then?

Use client resources instead of server resources.


Anyone can easily do a online/offline binary check for web apps like these:

1. Load the page

2. Disconnect from the internet

3. Try to use the app without reconnecting


Well, my question is about where it lies within the gray area between fully online and fully offline, so that wouldn't work.

Edit: Good call! It's fully offline - I disabled the network in Chrome and it worked. Says it's 176MB. I think it must be downloading part of the model, all at once, but that's just a guess.

The 176MB is in storage, which makes me think my browser will hold onto it for a while. That's quite a lot. My browser really should provide a disk-clearing tool that's more like OmniDiskSweeper than Clear History. If, for instance, it showed just the entries over 20MB, and my profile was using 1GB, there would be at most 50 of them: a manageable number to go through, clearing the ones I don't need.
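The wished-for cleanup tool is essentially a filter-and-sort over per-origin storage usage. A toy sketch of that idea (the origins and sizes are invented):

```python
def large_origins(usage_bytes: dict, threshold_mb: float = 20.0) -> list:
    """Return (origin, size_in_MB) pairs over the threshold, largest
    first: an OmniDiskSweeper-style view of browser storage."""
    mb = {origin: size / 2**20 for origin, size in usage_bytes.items()}
    big = [(o, round(m, 1)) for o, m in mb.items() if m > threshold_mb]
    return sorted(big, key=lambda pair: pair[1], reverse=True)

# Hypothetical profile: only the 176 MB model cache exceeds the cutoff.
sites = {"background-removal.example": 176 * 2**20,
         "news.example": 3 * 2**20}
big = large_origins(sites)
# -> [("background-removal.example", 176.0)]
```

A real browser UI would pull these numbers from its site-storage accounting rather than a dict, but the 20MB cutoff logic is the same.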


Yeah, this is why I think browsers need to start bundling some foundational models for websites to use. It doesn't scale if many websites each try to store their own significantly sized model.

Google has started addressing this. I hope it becomes part of web standards soon.

https://developer.chrome.com/docs/ai/built-in

"Since these models aren't shared across websites, each site has to download them on page load. This is an impractical solution for developers and users"

The browser bundles might become quite large, but at least websites won't be.


As long as there’s a way to disable it. I don’t want my disk space wasted by a browser with AI stuff I won’t use.



