eyberg's comments

The rabbit hole goes even deeper. Georgios was thinking about this with Jikes even earlier: https://gousios.org/pub/gousios-mscthesis.pdf



microvms as espoused by things like firecracker offer full machines but have tradeoffs like no gpu (which makes it boot faster)

hyperlight shaves way more off - (eg: no access to various devices that you'd find via qemu or firecracker) it does make use of virtualization but it doesn't try to have a full blown machine so it's better for things like embedding simple functions - I actually think it's an interesting concept but it is very different than what firecracker is doing


These people definitely do not understand security at all:

https://github.com/unikraft/unikraft/issues/414

Also - one needs to be careful because many of the workloads they advertise on their site do not actually run under their kernel - they run under Linux, which breaks a completely different type of trust barrier.

As for trust/full disclosure - I'm with nanovms.com


they acknowledged the issue and the fix was merged in 2022, what exactly is the criticism here?


No it wasn't - you can still easily replicate. I just did.

My point is that you shouldn't go around talking about how "secure" you are when you have large gaping things like this. This btw is not the only major security issue they have.


Big fan of nanovms! I should have linked that instead, sorry


Outside of nostalgia there's no engineering reason to do this - definitely not for performance.

That same go program can easily go over 10k reqs/sec without having to spawn a process for each incoming request.

CGI is insanely slow and insanely insecure.


What makes it insecure? It's a pretty simple protocol - is there anything in there that makes it insecure beyond naive mistakes that could be avoided with a well designed library?

EDIT: Looks like the way CGI works made it vulnerable to Shellshock in 2014: https://en.m.wikipedia.org/wiki/Shellshock_(software_bug)

I agree that there's probably not much of an argument to switch to it from the well established alternative mechanisms we are using already.

The one thing in its favor is that it makes it easier to have a polyglot web app, with different languages used for different paths. You can get the same thing using a proxy server though.
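
To make "pretty simple protocol" concrete, here is a minimal sketch of a CGI handler in Python. The function name `handle_cgi` and the `name` query parameter are hypothetical, chosen only for illustration; the environment variable names and the header/blank-line/body layout come from RFC 3875. Parsing is delegated to the standard library's `parse_qs` rather than done by hand, which is the "well designed library" point above.

```python
import os
import sys
from urllib.parse import parse_qs


def handle_cgi(environ, out):
    """Write one CGI response to `out` using request metadata from `environ`."""
    # The server passes request metadata in environment variables (RFC 3875).
    method = environ.get("REQUEST_METHOD", "GET")
    # parse_qs handles percent-decoding, so the script never splits the raw
    # query string itself.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    # `name` is a hypothetical parameter used only for this example.
    name = params.get("name", ["world"])[0]

    # A CGI response is header lines, a blank line, then the body.
    out.write("Content-Type: text/plain; charset=utf-8\r\n")
    out.write("\r\n")
    out.write(f"{method} hello, {name}\n")


if __name__ == "__main__":
    # Under a real web server the process is started once per request.
    handle_cgi(os.environ, sys.stdout)
```

Passing the environment and output stream as arguments is just a design choice to make the handler testable without a web server.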


CGI has a very long history of security issues stemming primarily from input validation or the lack thereof.


Right, but anything relating to input validation can be avoided by using a well designed library rather than implementing the protocol directly.


> CGI has a very long history of security issues stemming primarily from input validation or the lack thereof.

And a Go program reading from a network connection is immune from the same concerns how?


It's not, you have to use Rust :)


> It's not, you have to use Rust :)

If only I could borrow such confidence in network data... :-D


The language in use often has input validation libraries. The failure of the programmer to use them is not the fault of CGI. Further, proper administration of the machine can mitigate file injection, database injection, etc. Again, that people fail to do this isn’t the fault of CGI.


That's like saying forks and knives are vulnerable because you could stab someone with them.


>EDIT: Looks like the way CGI works made it vulnerable to Shellshock in 2014:

From your linked article: If the handler is a Bash script, or if it executes Bash...

But we are talking about Python not Bash.


Yes, Shellshock is kind of a marginal case, but it probably does qualify as a security hole due in part to CGI itself, even though it doesn't affect Python programs (unless they spawn a shell). I don't know of any other examples of security problems caused by CGI, even partly. It's a very thin layer over HTTP.
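
For anyone wondering why CGI was involved at all: the protocol copies request headers into environment variables, so attacker-controlled strings end up in the environment of whatever the handler executes. A rough sketch of that mapping, following the RFC 3875 naming rule (the function name `cgi_environ_from_headers` is made up for this example, and real servers set many more variables):

```python
def cgi_environ_from_headers(headers):
    """Map HTTP request headers to CGI meta-variable names (RFC 3875 naming)."""
    environ = {}
    for name, value in headers.items():
        # Upper-case the header name and turn hyphens into underscores.
        key = name.upper().replace("-", "_")
        # Content-Type and Content-Length get their own unprefixed variables;
        # every other header is prefixed with HTTP_.
        if key not in ("CONTENT_TYPE", "CONTENT_LENGTH"):
            key = "HTTP_" + key
        # The value is passed through as-is: the client controls this string.
        environ[key] = value
    return environ
```

With Shellshock, a vulnerable Bash parsed function definitions out of environment variable values at startup, so a crafted `User-Agent` arriving as `HTTP_USER_AGENT` was enough if Bash ran anywhere in the handler. A Python handler that never spawns a shell just sees an ordinary string.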


IIRC the main issue was finding ways to convince a CGI script to write something to disk, at which point you could sometimes make it be treated as another CGI script. More of an issue on Windows than UNIX.


Because... it can execute an arbitrary executable? In the old days, it also ran as root.


It definitely can’t. Either you had to put your script in cgi-bin, use an extension like .cgi in a directory with that feature explicitly enabled, or a magic sticky bit on the file if that was enabled.

You could configure the server to be insecure by, eg, allowing cgi execution from a directory where uploaded files are stored.


No, on all the servers I have any experience with, it can only execute executables the server administrator configures as CGI programs, not executables supplied by an attacker, and they never ran as root. Apache in particular is universally run as a non-root user, since the very first release, and its suEXEC mechanism (used for running CGI programs as their owners for shared web hosting services) refuses to run any CGI program as root. I've never seen a web server on a Unix system running as root: not CERN httpd, not NCSA httpd, not Apache, not nginx, not python -m http.server, not any of the various web servers I've written myself.

I hesitate to suggest that you might be misremembering things that happened 30 years ago, but possibly you were using a very nonstandard setup?


BusyBox runs as root by default, and it's used by hundreds of millions of devices.

For embedded devices (routers, security cameras, etc), it's very common to run CGI scripts as root.

So it is not even 30 years ago, it's still today, because of bad practices of the past.


Oh, that's a good point. In those cases the web server actually needs root, in the sense that it has to be able to upgrade the firmware and reconfigure the network interface.


Hey, it could be worse. Some people launch entire VMs to service individual requests.


That’s cloud-native!


You mean "Web Scale".


How is that done?


I was making a joke about AWS Lambda. It doesn't necessarily start up a new VM for each request, though; it can launch a new VM but it will reuse an existing VM if the same CGI-bin (oops I mean Lambda function) has been executed recently.


You're joking, but working in finance I witnessed the misuse of AWS Batch + AWS ECS to do something similar. Not gonna dox the company, but it was a German fintech unicorn.

It wasn't exactly for serving the response of the request per se, but a single customer click would launch an AWS ECS container with the whole Ruby and Rails VM just to send a single email message, rather than using a standard job queue.

It is extremely slow and super expensive. Amusingly, the UI had to be hardened so that double clicks don't cause two VMs to launch.

The rationale was that they already had batch jobs running in ECS, so "why not use it for all async operations".


I mean, a process kind of is an entire VM. But yeah, serverless is a marketing term for CGI scripts.


Not everything needs 10k RPS, and in some sense there are benefits to a new process – how many security incidents have been caused by accidental cross-request state sharing?

And in a similar vein, Postgres (which is generally well liked!) uses a new backend process per connection. (Of course this has limitations, and sometimes necessitates pgbouncer, but not always.)
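
A toy illustration of the cross-request state sharing point, assuming a deliberately buggy handler (every name here is hypothetical). In a long-lived process the module-level variable survives between requests; resetting it before each call stands in for the fresh address space a new process would get:

```python
# Hypothetical buggy handler: `_current_user` is module-level state that
# outlives a single request when the process is long-lived.
_current_user = None


def handle(request):
    global _current_user
    if "user" in request:
        _current_user = request["user"]
    # Bug: an anonymous request falls through and sees the previous caller.
    return f"hello {_current_user}"


def serve_long_lived(requests):
    """One process handles every request, so state leaks across them."""
    return [handle(r) for r in requests]


def serve_process_per_request(requests):
    """Simulate process-per-request: each request starts from fresh state."""
    global _current_user
    responses = []
    for r in requests:
        _current_user = None  # stands in for a brand new process
        responses.append(handle(r))
    return responses
```

The bug is still a bug in the second version, but its blast radius is one request instead of every later user.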


A few years ago I felt the same and created trusted-cgi.

However, through the years I learned:

- yes, forks and processes in general are fast
- yes, it saves memory and CPU on low-load sites
- yes, it's a simple protocol and can be used even from shell

However,

- splitting functions (mimicking serverless) into different binaries/scripts creates a mess of cross-script communication
- deployment is not that simple
- security-wise, you need to run the manager as root and use a unique user for each script, or use cgroups (or at least chroot). At that point the main question is why not use containers as is

Also, compute-wise, even a huge Go app with hundreds of endpoints can fit in just a few megabytes of RAM - there is not much sense in saving so little memory.

At worst - just create a single binary and run it on demand for different endpoints


Even without pgbouncer, postgres uses long-lived connections (and long-lived processes). So it's a bad example.

Uber famously switched from pg to mysql because their SWEs couldn't properly manage connections


Was that the only reason? In our last testing (2021), on the same hardware and for our specific case (a database with billions of records, many tables and specific workloads), mysql consistently left postgres in the dust performance wise. Internal and external devs pointed out that postgres (or rather, our table structures & indexes) could probably be tweaked with quite a lot of work to be faster, but mysql performed well (for our purpose) even with some very naive options. I guess it depends on the case, but I cannot justify spending 1 cent (let alone far more as we have 100k+ tables) on something while something else is fast enough (by quite a margin) to begin with...


> should be foolproof by design.

I think this is a core reason why containers have such a horrible security track record.

They weren't made by design.

One of the large problems is that there is no "create_container(2)". There are 8? different namespaces in conjunction with cgroups that make up "containers" and they are infinitely configurable. This is problematic and a core reason why we see container escapes almost every other month. Just look at user namespaces - some people use them and some people don't, but it was just a few months ago when multiple bypasses were published for them.
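
To see the "no single container object" point on a running system, the kernel exposes each namespace a process belongs to as a separate entry under `/proc/<pid>/ns`. A small sketch that reads them (the helper name `list_namespaces` is made up; it assumes Linux with `/proc` mounted and returns an empty dict otherwise):

```python
import os


def list_namespaces(pid="self"):
    """Return {namespace_name: identifier} for a process, or {} if unavailable."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        for name in sorted(os.listdir(ns_dir)):
            # Each entry is a symlink whose target looks like "net:[4026531840]".
            result[name] = os.readlink(os.path.join(ns_dir, name))
    except OSError:
        # Not Linux, /proc not mounted, no such pid, or no permission.
        return {}
    return result


if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(f"{name:20} {ident}")
```

Two processes are "in the same container" only in the sense that these identifiers happen to match across whichever namespaces the runtime chose to set up, plus whatever cgroup and seccomp configuration sits on top.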


(I'm w/NanoVMs).

Firecracker is a complementary technology and not competitive.

Our main software - https://nanos.org is a linux kernel guest replacement whereas firecracker is a VMM (host) replacement for something like qemu.

So the vast majority of our users deploy to AWS, GCP, etc. and those deploys are created as ec2 instances (but without linux). Some of our users/customers use firecracker for ephemeral workloads.


I see. Thanks for explaining!


bubblewrap is actually worse - there are known escapes in there that haven't been fixed for years


It is the most widely used sandbox layer for pretty much everything. What escapes are you talking about? Are we supposed to take your word for it? Come on


Wait. What? What escapes? Is it that bubblewrap doesn't faithfully implement the policy you give it or that there are surprising gaps in the kernel's namespace isolation?


I kinda see the different side of the coin.

"a determined and amoral adversary" - I'd kinda disagree with this (the amoral adversary part being necessary). If you crawl through the vast data breach notification lists that many states are starting to keep - MA, ME, etc. there are so many of them (like literally daily banks, hospitals, etc. are having to report "data breaches" that never ever make the news) - not all of them are happening cause of ransomware. Sometimes it's just someone accidentally not locking a bucket down or not putting proper authorization on a path that should have it. It gets found/fixed but they still have to notify the state. However, if someone doesn't know what they are looking at, or it's a program so it really has no clue what it's looking at and just sees a bunch of data - there's no malicious intent but that doesn't mean that bad things can't happen because that data has now leaked out.

Guess what a lot of these LLMs are training on?

So while Andrey's software is finding all sorts of interesting stuff there's a bunch of crap being generated inadvertently that is just bad.


A better title might be "Kees Cook's account disabled for suspected malicious activity". He isn't exactly a random contributor.


I think that's worse and is exactly the problem with these - it's one thing for Linus to be yelling at someone on a mailing list (is he right? is he just mad? who knows? I imagine they'll sort it out!) and something completely different to plaster someone's name next to 'SUSPECTED MALICIOUS' across a popular nerd messageboard.


I think the HN title was completely appropriate.

Linus was not just yelling, e.g.: "You seem to have actively maliciously modified your tree completely." - Linus Torvalds.

Linus didn't even use the word 'potentially' - yet the HN title did.

So get off the back of the person who created the HN alert!


The purpose of HN is not really 'alerts'.


alert, inform, notify, announce ..... that's what HN does.

How about refocusing your attention on the actual matter at hand.

That is, Linus calling out Kees for being actively malicious.

Personally, I would have contacted Konstantin privately to lock (at least temporarily) Kees account. Then I would have sought an explanation from Kees (on the mailing list), and waited for his response. And I would have done this before calling him out as being guilty of active maliciousness.

Of course Linus loves drama (especially drama that he creates).


> what HN does.

> the actual matter at hand. That is, Linus calling out Kees

That kind of 'matter' is not really what 'HN does' so it's not really 'at hand'.

https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...

