Malicious AI models on Hugging Face backdoor users' machines (bleepingcomputer.com)
70 points by coloneltcb on Feb 29, 2024 | 35 comments


So where is the list of malicious models?

What's the advantage of paying jfrog over using safetensors?

Why is it legal for them to make these claims without providing a concrete list of known malware-infected models?

This seems really shady, to me, bordering on an extortion racket. "It'd be a shame if something happened to your computer, but don't worry, you can pay us for protection..."

In addition, where is the HuggingFace report on models known to have been infected? HF has a responsibility to inform people of potential issues if they downloaded something before it was flagged. Maintaining a list of known bad models would be a good idea, I think.


For a long time I thought models were just weights. The first time I heard that Hugging Face models contain executable code was at FOSDEM in Stephen Chin's talk: "Security Starts Within the SBOM".

I think there was also an example where Python code in a try block, when it failed because of a missing dependency, ran a `sudo apt install` in the except block.

Not sure how much SBOMs help, but we sure need to get this supply chain mess in order.


This is only true for models that are python pickles. gguf doesn't suffer from this.

The ml community needs to move away from pickle files as soon as possible. Worst idea ever.


WHAT? Python warns you in the most apocalyptic language possible that pickles are dangerous as hell. Are the pickle files buried in the HF models in a non-obvious way? I can’t imagine people would just run random pickles from the internet (but then again, people do run `curl [URL] | bash` all the time…)


I'm not responsible for whatever led the ml community to the widespread use of pickles for weights. It happened regardless of the warnings that pickles are dangerous.

Yes, ml researchers have been running random pickles from the internet for years now.

No, they are not buried in a non-obvious way. They are at the very top of many models. safetensors and gguf have started to replace this horrible practice, but there are still tons of models that use pickles.
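
For anyone wondering how a "model file" runs code at all: pickle lets any object define __reduce__, and whatever callable it returns gets invoked during deserialization. A minimal sketch of the idea (the payload is a harmless echo, purely for illustration):

    import pickle

    class Payload:
        # pickle calls __reduce__ while serializing; on load, the returned
        # callable is invoked with the given args, i.e. arbitrary code runs
        # the moment someone unpickles this object.
        def __reduce__(self):
            import os
            return (os.system, ("echo pwned",))

    blob = pickle.dumps(Payload())

    # Anyone who does this (including torch.load() on a .pt file that embeds
    # such an object) runs the command above:
    pickle.loads(blob)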


I kind of wish I hadn't just read this. Thanks.


Full disclosure: I am head of product at Protect AI. To make this easier for everyone, we have an open source tool (friendly licensing) called ModelScan: https://github.com/protectai/modelscan/tree/main I wouldn't be shocked if they are using this under the hood, but all the best if they are! For a bit more info on this type of attack: https://protectai.com/blog/announcing-modelscan


Feels like we're back in 1999, when we were downloading random executables from the web to use as screensavers. The AI space is going to be a pretty meaty target for malicious actors, both as a vehicle for malware and as a propaganda machine. The next decade is going to be interesting.


It's been the case with npm and the like for the past decade already, and indeed the past decade has been interesting with respect to so-called “supply-chain attacks”…


I'm surprised that things like this don't happen more often. It seems like uploading malicious binaries to npm or other registries would be fairly easy.

Then again, convincing people to play around with a model might be easier than getting them to use your library in an application.


I've seen an alarming number of Colab notebooks that casually ask for access to your entire Google Drive and then download a few thousand Python packages (and who knows what else) and execute code.


Eh, have you tried vscode lately?

Some of the official extensions install code from random places like that.


Maybe it is more like: one million transactions in good faith, and then an actor or group does a bad-faith action... Is the tolerance for 'perfect' in the way? How can ordinary use be enabled while correctly blocking out malicious use, without crazy and ugly intrusive security engineering?

Put another way, maybe the cost of a few bad-faith actors is much higher than anticipated among people doing skilled work? How is that handled in a human way?

I can guess that de-platforming based on whatever is much easier for the server side. This dynamic is how we got to the personal computer "revolution" thirty-plus years ago.


Model weights are data, they shouldn't contain general purpose code with access to the host environment.

Sure, any library/tool that harnesses models might contain malicious code. The models themselves should not be able to.

This is more like distributing music as .exe files instead of .flac


You'd hope, but a lot of models are still distributed as .pt files, which are Pickle — which means they can contain and execute arbitrary code on load.

Some popular applications (Automatic1111 etc.) take measures against that, but the real fix is to ignore anything that isn't safetensors.
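
Roughly what the safe path looks like, assuming the safetensors package (and, for legacy .pt files you can't avoid, the weights_only restriction in recent PyTorch); file names are just placeholders:

    import torch
    from safetensors.torch import load_file

    # Preferred: safetensors is a plain tensor container; loading it never
    # executes code from the file.
    state_dict = load_file("model.safetensors")

    # If you are stuck with a .pt file, recent PyTorch can restrict the
    # unpickler to tensor/primitive types, which blocks the usual
    # arbitrary-code payloads.
    legacy_state_dict = torch.load("model.pt", map_location="cpu", weights_only=True)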


Which is why large platforms like Hugging Face bear the responsibility to do what they can to prohibit Python pickles.


I think some of the models require remote code execution to support things like custom tokenizers.
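
If I remember right, that's the trust_remote_code flag in transformers, which is off by default for exactly this reason. Roughly (the model name is just a placeholder):

    from transformers import AutoModel, AutoTokenizer

    # Default (trust_remote_code=False): only code that ships with the
    # transformers library runs; a repo that needs its own modeling or
    # tokenizer code will raise an error instead of executing it.
    model = AutoModel.from_pretrained("some-org/some-model")

    # Opting in downloads and executes Python files from the model repo,
    # effectively the same trust decision as running any downloaded script.
    model = AutoModel.from_pretrained("some-org/some-model", trust_remote_code=True)
    tokenizer = AutoTokenizer.from_pretrained("some-org/some-model", trust_remote_code=True)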


There are action models based on LLMs like Gorilla LLM or autoGPT that take a natural language input and convert that to API calls that could do all sorts of things from sending emails to performing stock trades.


To my knowledge it's still the harnesses that perform those actions; it's not baked into the weights (other than perhaps special tokens denoting that an action should be taken).


I don't see it mentioned in the post: can any of these problems exist for models in the safetensors format? Can't say I know enough about model serialization to understand exactly how much safer it is.


No, and that's the main reason safetensors was created.

Pickle was created to store Python objects. Safetensors was designed to store (only) weights.
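
Concretely, a .safetensors file is just an 8-byte length, a JSON header describing each tensor's dtype/shape/offsets, and then raw tensor bytes; nothing in the format gets executed on load. A rough sketch of reading the header (based on the published format description, placeholder path):

    import json
    import struct

    def read_safetensors_header(path):
        with open(path, "rb") as f:
            # First 8 bytes: little-endian u64 giving the size of the JSON header.
            (header_len,) = struct.unpack("<Q", f.read(8))
            # The header maps tensor names to dtype, shape and byte offsets;
            # everything after it is raw tensor data.
            return json.loads(f.read(header_len))

    header = read_safetensors_header("model.safetensors")
    for name, info in header.items():
        if name != "__metadata__":
            print(name, info["dtype"], info["shape"])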


Worthy of note is that HF has a bot that can convert a model to safetensors.


Less of a huggingface, more of a facehugger.


Doesn't help that these AI models are entirely black-boxes and you can't even inspect or interpret them.

This is no different to posting binaries online and telling folks to run them on their machines.


There's a big difference, and there's absolutely a safe way to run models locally. PyTorch files use pickle serialization[0] which is insecure and can allow you to embed arbitrary code. Regular users should not be using that, they should instead be using safetensors or something like gguf which is (more or less) just a set of weights.

A gguf model file can be thought of (at a very high level) as a jpg. We have safe and secure ways of decoding and using a jpg.

The "black box" you refer to is more about not understanding why a model does what it does. We don't know why a particular node in the network has a value that ends up influencing the output. We understand how a deep neural net works.

[0] https://docs.python.org/3/library/pickle.html
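
To make the jpg analogy a bit more concrete: a gguf file starts with a fixed magic string and a small binary header, so a loader can sanity-check it the same way an image decoder checks a jpg's magic bytes. A rough sketch (field layout is from the GGUF spec as I remember it, so double-check before relying on it; placeholder path):

    import struct

    def check_gguf(path):
        with open(path, "rb") as f:
            magic = f.read(4)
            if magic != b"GGUF":
                raise ValueError("not a GGUF file")
            # Per the GGUF spec (v2+): version (u32), tensor count (u64),
            # metadata key/value count (u64), all little-endian.
            version, n_tensors, n_kv = struct.unpack("<IQQ", f.read(20))
        return version, n_tensors, n_kv

    print(check_gguf("model.gguf"))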


> Regular users should not be using that, they should instead be using safetensors or something like gguf which is (more or less) just a set of weights.

Regular users do not know any of this and do not care. All they know is to download the .exe and run it, and they never check whether what they are downloading is malicious or not.

> The "black box" you refer to is more about not understanding why a model does what it does.

That is my additional point, which makes this situation even worse.

> We understand how a deep neural net works.

No one does, and certainly not even the AI scientists understand the unpredictable behaviours of these models after training.


I'm sorry, I think I didn't explain my point clearly. I'm trying to point out there's a difference between not understanding why a model outputs a specific token, and not understanding what the computer is doing under the hood. We understand very well how feed-forward networks work computationally, and there's nothing really insecure about that. The security problem here is in the serialization format that some of these models are distributed in.

As for my "regular users" comment, I don't think it's hard to imagine a world where users have a trusted program that runs the models. This is how basically all file formats work. Excel was insecure at one point for similar reasons (you could embed malicious code in macros), but today Excel spreadsheets can run computations on files downloaded from the web and it's just as secure as opening a .txt file.


> I'm trying to point out there's a difference between not understanding why a model outputs a specific token, and not understanding what the computer is doing under the hood.

Whenever there is trust involved, there is no difference to your point. Running the model requires trusting that the model, the parser, etc. aren't compromised, and especially trusting that the model behaves correctly after training - producing transparent explanations rather than hallucinations - when it's easy to trick and compromise these models into doing something else.

Given that we already don't trust the outputs of these AI models, the above security issue makes this even worse and more untrustworthy. The plain old average Joe users do not care about the neural network format, etc., and will run and open anything without checking, even if it is a program disguised as a text file.

Thus, it is no different at all and we are back to square -1 (people will try these models out and trust their outputs despite their unexplainable properties).


I agree that we can define a safe serialization format for models with given assumptions about architecture. I.e. when the model is just the matrices and cannot supply custom inference code needed to process the matrices.

But, I expect we're going to have additional rounds of insecure practice just like we've had in every other popularized tech movement. People are going to develop frameworks with code-injection flaws, where they assume model outputs (tokens) can be trusted and can contain executable content. Then, the inscrutable and untrustworthy models are going to be a problem as well.

Think everything from MS Office macro abuse to human drivers blindly following their GPS guidance into a lake. This can and most probably will be repeated with AI models, due to the prevalence of naive and over-trusting practitioners and consumers.


100%. Software takes time to make secure, since after all it's written by us flawed humans. The runtime that consumes the model files might have bugs, but those will be fixed over time.

How people use model outputs (or inputs, i.e. prompt injections) is a whole other area ripe for exploits, especially while these technologies are being adopted by people who don't really understand them. But I view this as fundamentally different from the above. A model format can be secure in that it won't just randomly delete files on your computer. This is a computer science problem and thus provably securable. I guess that was the point I was trying to make when differentiating why they work vs how they work.


Would running models in a VM mitigate the risk?


Depends, there are VM escapes.


Yeah, but defense in depth. Now you have to own the VM and then escape the VM, and the latter usually requires an attack against the exact VM you are running, which may be hard to determine.


You'd have to do GPU passthrough if you want performance.


I think we need models to safeguard against models, and then maybe hard-coded rules to safeguard against those models missing things. It would be exceedingly hard to verify everything that could be maliciously encoded in a billion-parameter model, waiting to be unleashed with the right sequence of tokens.




