Sure, but what percentage of EC2 instances do you think have a firewall rule like that? Defaults are important. Not to mention that the rule you listed breaks down if you're running Docker containers.
There are no good defaults here. From AWS's perspective the whole VM is in the same security context: it's all untrusted customer code from the kernel on up. That's the granularity at which policy can be applied; either the whole VM can do something or it can't. How you define your own security boundaries inside it is your business. You can go finer-grained by managing your own access keys, but most people would rather just make the VM their security boundary too and not have to think about it. Treat a compromised process and a compromised host the same.
That's the "give up on container and process security" approach. It's a fine approach from any given user's standpoint. From an infra/tool provider standpoint, it ignores the intense demand that exists for establishing least privilege and defense-in-depth for containerized workloads. Fargate might be a good solution too.
This is backwards: you don't have to give up on container/process security just because AWS's own trust boundary is the VM. But to do that you can't assign VM-level privs. Something something eating cake.
Put yourself in AWS's position. Your customer is running a VM over which you have only hypervisor-level control. Could be Linux, could be AIX, could be an appliance. How could you possibly implement user- or process-level security from the outside? How could you know which processes inside the opaque black box are the privileged ones?
Sorry, I don't think what you're saying makes sense. A key strength of AWS is that they don't use a "black box" approach when thinking about customer needs, and provide documentation and good defaults to help customers achieve their goals.
> From AWS’s perspective the whole VM is in the same security context.
That's AWS's perspective. But my perspective, as an AWS customer, is important as well. And I might want to run something on a VM that I myself don't necessarily trust - that is the ultimate benefit of an ephemeral, isolated VM.
Right, then you run that code in a VM that has no ambient privs. If you still need to do AWS access stuff, you stick your access keys in files somewhere your untrusted process can't touch, just as you would if you weren't on AWS.
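A minimal sketch of that (the account names and paths here are made up): keep the key file readable only by a trusted user, and run the untrusted code as a different one.

    # dedicated unprivileged account for the untrusted workload
    sudo useradd --system --shell /usr/sbin/nologin untrusted

    # credentials stay readable only by the trusted "deploy" user
    chmod 600 /home/deploy/.aws/credentials

    # the untrusted process runs as "untrusted" and can't read the key file
    sudo -u untrusted /opt/worker/start.sh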
ECS and EKS both have configuration options to blackhole IMDS. You can also configure IMDSv2 with a hop limit of 1, preventing anything behind a bridged network (such as containers) from reaching IMDS.
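For the hop limit part, something like this (the instance ID is a placeholder):

    # require IMDSv2 and cap the PUT response TTL at one hop, so it
    # dies before crossing a bridge into a container network
    aws ec2 modify-instance-metadata-options \
        --instance-id i-0123456789abcdef0 \
        --http-tokens required \
        --http-put-response-hop-limit 1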
No, recent Docker can use user namespaces (userns-remap), where "root" inside a container translates to a high real UID as seen from the host OS (100000 or some such). iptables' "owner" extension would use that real UID to match.
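Roughly like this, assuming the default subordinate UID range (check /etc/subuid on your host; 100000-165535 is a common default):

    # enable the remap daemon-wide in /etc/docker/daemon.json:
    #   { "userns-remap": "default" }
    # then drop metadata traffic coming from the remapped UID range
    iptables -I OUTPUT -d 169.254.169.254 \
        -m owner --uid-owner 100000-165535 -j DROP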
Which could be a problem if the intention was to allow it, as your sibling comment points out.
Yes - this is not really a Docker thing but a Linux kernel thing, although client-side support is of course needed from Docker and any other system that uses cgroups/namespaces. One other thing to know: from the kernel's point of view, "root" is not a single thing. It has been unbundled into a set of capabilities (https://man7.org/linux/man-pages/man7/capabilities.7.html). When you launch a container, you specify which capabilities you want to drop (so even root in the container can't have them) and which UID mapping you want to use.
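A minimal sketch of the capability side ("myservice" is a placeholder image): drop everything, then add back only what the workload actually needs.

    # even a UID-0 process inside the container only holds the
    # capabilities left in this set
    docker run --rm \
        --cap-drop ALL \
        --cap-add NET_BIND_SERVICE \
        myservice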
Yeah, if you're running containers with root access, or processes running as root inside them, there are a number of other things that will need to change about your security posture, not least of which is how you provide or restrict access to IAM roles within the tenant boundary.