MicroVMs, as espoused by things like Firecracker, offer full machines but come with tradeoffs such as no GPU support (which helps them boot faster).
Hyperlight shaves way more off (e.g., no access to the various devices you'd find via QEMU or Firecracker). It does make use of virtualization, but it doesn't try to provide a full-blown machine, so it's better suited to things like embedding simple functions. I actually think it's an interesting concept, but it is very different from what Firecracker is doing.
Also, one needs to be careful because many of the workloads they advertise on their site do not actually run under their kernel. They run under Linux, which breaks a completely different type of trust barrier.
As for trust/full disclosure - I'm with nanovms.com
No, it wasn't. You can still easily replicate it. I just did.
My point is that you shouldn't go around talking about how "secure" you are when you have large gaping holes like this. This, btw, is not the only major security issue they have.
What makes it insecure? It's a pretty simple protocol. Is there anything in there that makes it insecure beyond naive mistakes that could be avoided with a well-designed library?
I agree that there's probably not much of an argument to switch to it from the well established alternative mechanisms we are using already.
The one thing in its favor is that it makes it easier to have a polyglot web app, with different languages used for different paths. You can get the same thing using a proxy server though.
The language in use often has input validation libraries. The failure of the programmer to use them is not the fault of CGI. Further, proper administration of the machine can mitigate file injection, database injection, etc. Again, that people fail to do this isn’t the fault of CGI.
Yes, Shellshock is kind of a marginal case, but it probably does qualify as a security hole due in part to CGI itself, even though it doesn't affect Python programs (unless they spawn a shell). I don't know of any other examples of security problems caused by CGI, even partly. It's a very thin layer over HTTP.
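To make the "very thin layer" point concrete, here is a minimal sketch of a CGI program in Python. It relies only on what the CGI spec provides (request metadata in environment variables such as `REQUEST_METHOD` and `QUERY_STRING`, response written to stdout as headers, a blank line, then the body). The `handle_cgi` function and the `name` parameter are made up for illustration, and the length cap and escaping are just one possible approach to the input validation mentioned above.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def handle_cgi(environ, out):
    """Write one complete CGI response to `out` (hypothetical helper)."""
    # The server passes request metadata through environment variables.
    method = environ.get("REQUEST_METHOD", "GET")
    query = parse_qs(environ.get("QUERY_STRING", ""))

    # Treat everything from the client as untrusted: cap the length
    # and escape it before echoing it back into HTML.
    name = query.get("name", ["world"])[0][:64]
    body = f"<p>Hello, {html.escape(name)} ({html.escape(method)})</p>\n"

    # A CGI response is header lines, a blank line, then the body.
    out.write("Content-Type: text/html; charset=utf-8\r\n")
    out.write("\r\n")
    out.write(body)


if __name__ == "__main__":
    # When run by a web server, the real environment and stdout are used.
    handle_cgi(os.environ, sys.stdout)
```

Nothing in the protocol itself validates or escapes input; that part is entirely up to the program, which is the point being argued in the comments above.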
IIRC the main issue was finding ways to convince a CGI script to write something to disk, at which point you could sometimes make it be treated as another CGI script. More of an issue on Windows than UNIX.
It definitely can’t. Either you had to put your script in cgi-bin, use an extension like .cgi in a directory with that feature explicitly enabled, or a magic sticky bit on the file if that was enabled.
You could configure the server to be insecure by, e.g., allowing CGI execution from a directory where uploaded files are stored.
No, on all the servers I have any experience with, it can only execute executables the server administrator configures as CGI programs, not executables supplied by an attacker, and they never ran as root. Apache in particular is universally run as a non-root user, since the very first release, and its suEXEC mechanism (used for running CGI programs as their owners for shared web hosting services) refuses to run any CGI program as root. I've never seen a web server on a Unix system running as root: not CERN httpd, not NCSA httpd, not Apache, not nginx, not python -m http.server, not any of the various web servers I've written myself.
I hesitate to suggest that you might be misremembering things that happened 30 years ago, but possibly you were using a very nonstandard setup?
Oh, that's a good point. In those cases the web server actually needs root, in the sense that it has to be able to upgrade the firmware and reconfigure the network interface.
I was making a joke about AWS Lambda. It doesn't necessarily start up a new VM for each request, though; it can launch a new VM but it will reuse an existing VM if the same CGI-bin (oops I mean Lambda function) has been executed recently.
You're joking, but working in finance I witnessed the misuse of AWS Batch + AWS ECS to do something similar. Not gonna dox the company, but it was a German fintech unicorn.
It wasn't exactly for serving the response of the request per se, but a single customer click would launch an AWS ECS container with the whole Ruby and Rails VM just to send a single email message, rather than using a standard job queue.
It is extremely slow and super expensive. Amusingly, the UI had to be hardened so that double clicks don't cause two VMs to launch.
The rationale was that they already had batch jobs running in ECS, so "why not use it for all async operations".
Not everything needs 10k RPS, and in some sense there are benefits to a new process – how many security incidents have been caused by accidental cross-request state sharing?
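A toy sketch of the cross-request state sharing problem, assuming a hypothetical handler with a deliberately planted bug (none of these names come from a real framework). In a long-lived server the same context object survives between requests, so the second caller inherits the first caller's identity. With a process per request, each request starts from empty state and the same bug is harmless.

```python
def handle(headers, context):
    """Hypothetical handler with a bug: it never clears stale identity."""
    if "X-User" in headers:
        context["user"] = headers["X-User"]
    # If the header is missing, whatever was left in `context` is used.
    return context.get("user", "anonymous")


def long_lived_server(requests):
    """One process serves every request, so `context` is shared."""
    context = {}  # created once, survives across requests
    return [handle(headers, context) for headers in requests]


def process_per_request(requests):
    """Each request gets fresh state, as a newly started process would."""
    return [handle(headers, {}) for headers in requests]


requests = [{"X-User": "alice"}, {}]  # second request is unauthenticated
print(long_lived_server(requests))    # ['alice', 'alice']
print(process_per_request(requests))  # ['alice', 'anonymous']
```

The fresh dict only simulates process isolation here; a real CGI setup gets the same effect for free because the process exits after every request.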
And in a similar vein, Postgres (which is generally well liked!) uses a new backend process per connection. (Of course this has limitations, and sometimes necessitates pgbouncer, but not always.)
A few years ago I felt the same and created trusted-cgi.
However, through the years I learned:
- yes, forks and in general processes are fast
- yes, it saves memory and CPU on low load sites
- yes, it’s a simple protocol and can be used even from shell
However,
- splitting functions (to mimic serverless) into different binaries/scripts creates a mess of cross-script communication
- deployment is not that simple
- security-wise, you need to run the manager as root and use a unique user for each script, or use cgroups (or at least chroot). At that point the main question is why not use containers as is
Also, compute-wise, even a huge Go app with hundreds of endpoints can fit in just a few megabytes of RAM, so there is not much sense in trying to save so little memory.
At worst, just create a single binary and run it on demand for different endpoints.
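The "single binary, many endpoints" idea can be sketched as one CGI program that dispatches on `PATH_INFO`. This is Python rather than Go to keep it short, and the routes and handler names are hypothetical. `PATH_INFO` and the `Status:` response header are standard CGI.

```python
import os
import sys


def list_users(environ):
    # Placeholder endpoint; a real one would query some data store.
    return "200 OK", "users: (none)\n"


def health(environ):
    return "200 OK", "ok\n"


# Hypothetical routing table: one program, several endpoints.
ROUTES = {"/users": list_users, "/health": health}


def dispatch(environ):
    """Build a full CGI response for the endpoint named by PATH_INFO."""
    path = environ.get("PATH_INFO", "") or "/"
    handler = ROUTES.get(path)
    if handler is None:
        status, body = "404 Not Found", "not found\n"
    else:
        status, body = handler(environ)
    return f"Status: {status}\r\nContent-Type: text/plain\r\n\r\n{body}"


if __name__ == "__main__":
    sys.stdout.write(dispatch(os.environ))
```

This keeps the on-demand process model while avoiding the cross-script communication mess, since all endpoints share one codebase and one deployment artifact.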
Was that the only reason? In our last testing (2021), on the same hardware and for our specific case (a database with billions of records, many tables, and specific workloads), mysql consistently left postgres in the dust performance-wise. Internal and external devs pointed out that postgres (or rather, our table structures & indexes) could probably be tweaked to be faster with quite a lot of work, but mysql performed well (for our purpose) even with some very naive options. I guess it depends on the case, but I cannot justify spending 1 cent (let alone far more, as we have 100k+ tables) on something while something else is fast enough (by quite a margin) to begin with...
I think this is a core reason why containers have such a horrible security track record.
They weren't made by design.
One of the large problems is that there is no "create_container(2)". There are 8? different namespaces in conjunction with cgroups that make up "containers" and they are infinitely configurable. This is problematic and a core reason why we see container escapes almost every other month. Just look at user namespaces - some people use them and some people don't, but it was just a few months ago when multiple bypasses were published for them.
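For a sense of how many separate pieces are involved, here is a small sketch that lists the namespaces a process currently belongs to by reading `/proc/<pid>/ns` on Linux. The `list_namespaces` helper is made up for this example. On other platforms, or where `/proc` is unavailable, it simply returns an empty dict.

```python
import os


def list_namespaces(pid="self"):
    """Map namespace name -> identifier for a process (Linux only)."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        entries = sorted(os.listdir(ns_dir))
    except OSError:
        return result  # not Linux, no such pid, or /proc not readable
    for name in entries:
        try:
            # Each entry is a symlink such as 'pid:[4026531836]'.
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            result[name] = None  # entry exists but could not be read
    return result


if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(f"{name:20} {ident}")
```

On a recent kernel this typically shows cgroup, ipc, mnt, net, pid, time, user, and uts (plus the `*_for_children` entries). Two processes are in the same namespace of a given type when the identifiers match, and a "container" is just some chosen combination of these plus cgroups.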
Firecracker is a complementary technology and not competitive.
Our main software - https://nanos.org is a linux kernel guest replacement whereas firecracker is a VMM (host) replacement for something like qemu.
So the vast majority of our users deploy to AWS, GCP, etc. and those deploys are created as ec2 instances (but without linux). Some of our users/customers use firecracker for ephemeral workloads.
It is the most widely used sandbox layer for pretty much everything. What escapes are you talking about? Are we supposed to take your word for it? Come on
Wait. What? What escapes? Is it that bubblewrap does not faithfully implement the policy you give it, or that there are surprising gaps in the kernel's namespace isolation?
"a determined and amoral adversary" - I'd kinda disagree with this (the amoral adversary part being necessary). If you crawl through the vast data breach notification lists that many states are starting to keep - MA, ME, etc. there are so many of them (like literally daily banks, hospitals, etc. are having to report "data breaches" that never ever make the news) - not all of them are happening cause of ransomware. Sometimes it's just someone accidentally not locking a bucket down or not putting proper authorization on a path that should have it. It gets found/fixed but they still have to notify the state. However, if someone doesn't know what they are looking at, or it's a program so it really has no clue what it's looking at and just sees a bunch of data - there's no malicious intent but that doesn't mean that bad things can't happen because that data has now leaked out.
Guess what a lot of these LLMs are training on?
So while Andrey's software is finding all sorts of interesting stuff there's a bunch of crap being generated inadvertently that is just bad.
I think that's worse and is exactly the problem with these - it's one thing for Linus to be yelling at someone on a mailing list (is he right? is he just mad? who knows? I imagine they'll sort it out!) and something completely different to plaster someone's name next to 'SUSPECTED MALICIOUS' across a popular nerd messageboard.
alert, inform, notify, announce ..... that's what HN does.
How about refocusing your attention on the actual matter at hand?
That is, Linus calling out Kees for being actively malicious.
Personally, I would have contacted Konstantin privately to lock (at least temporarily) Kees's account. Then I would have sought an explanation from Kees (on the mailing list), and waited for his response. And I would have done this before calling him out as being guilty of active maliciousness.
Of course, Linus loves drama (especially drama that he creates).