
Are you talking about the cloud-host-to-cloud-host networking or the pod networking inside a single host?

The dizzying number of NAT layers has to be killing performance. I've never had the chance to sit down and unravel a system running a real load. The lack of TCP tuning combined with the required connection tracking is interesting to think about.
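
To make the two halves of that concrete, here's a minimal Python sketch of what I'd look at first on a node: conntrack table pressure and the default TCP buffer sysctls. (It assumes the nf_conntrack module is loaded and that you run it on the host rather than inside a pod.)

    #!/usr/bin/env python3
    # Peek at conntrack usage and TCP buffer sysctls on a node.
    # Assumes nf_conntrack is loaded; run on the host, not in a pod.

    def read_sysctl(path):
        with open(path) as f:
            return f.read().strip()

    count = int(read_sysctl("/proc/sys/net/netfilter/nf_conntrack_count"))
    limit = int(read_sysctl("/proc/sys/net/netfilter/nf_conntrack_max"))
    print(f"conntrack entries: {count}/{limit} ({100 * count / limit:.1f}% full)")

    # Default TCP buffer tuning (min / default / max, in bytes)
    for name in ("tcp_rmem", "tcp_wmem"):
        print(name, read_sysctl(f"/proc/sys/net/ipv4/{name}"))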



I still don't understand why nearly all CNIs are so hell-bent on implementing a dozen layers of NAT to tunnel their overlay networks, instead of implementing a proper control plane that automates it all away with plain routes.
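
The NAT-free alternative really is just routing: the control plane learns which node owns which pod CIDR and installs a route for it, with no encapsulation or translation when the nodes share a routable fabric. A hedged sketch of that data-plane step (node IPs and pod CIDRs are made up; real CNIs do this via netlink or BGP rather than shelling out):

    #!/usr/bin/env python3
    # Minimal sketch of a route-based pod network: one route per remote node's
    # pod CIDR, pointing at that node's address. No overlay, no NAT.
    import subprocess

    # Hypothetical view the control plane would hand us: node IP -> pod CIDR
    REMOTE_NODES = {
        "192.168.1.11": "10.244.1.0/24",
        "192.168.1.12": "10.244.2.0/24",
    }

    for node_ip, pod_cidr in REMOTE_NODES.items():
        # Equivalent in spirit to what Calico installs in BGP mode.
        subprocess.run(
            ["ip", "route", "replace", pod_cidr, "via", node_ip],
            check=True,
        )
        print(f"route {pod_cidr} via {node_ip}")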

Calico seems to be doing it semi-okayish, and even there the control plane feels kind of unfinished?

The only software-based solution that seems to have this properly figured out is VMware NSX-T. (I'm not counting all the traditional overlay networks in use by ISPs based on MPLS/BGP.)


I believe Azure CNI is pretty much point-to-point.

Azure Load Balancers and their software-defined network use packet header rewriting at the host level to bypass the need for traffic to physically traverse a load balancer appliance or a NAT appliance. Headers are generally rewritten when packets arrive at the host hypervisor. This is done in hardware via an FPGA inline with the NICs. (This requires "Accelerated Networking" to be enabled, but that's the default on v4 VMs and required on v5 VMs.)
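
To make "packet header rewriting" concrete: conceptually it's just swapping the destination address in the IPv4 header and fixing the checksum, which the FPGA does at line rate. A toy, software-only Python version of that rewrite (the VIP/DIP addresses are invented, and a real rewrite also has to fix the TCP/UDP pseudo-header checksum):

    #!/usr/bin/env python3
    # Toy illustration of VIP -> DIP destination rewriting on an IPv4 header.
    import socket
    import struct

    def ipv4_checksum(header: bytes) -> int:
        # Ones'-complement sum over 16-bit words, with the checksum field zeroed.
        total = 0
        for i in range(0, len(header), 2):
            total += (header[i] << 8) + header[i + 1]
        while total >> 16:
            total = (total & 0xFFFF) + (total >> 16)
        return ~total & 0xFFFF

    def rewrite_dst(packet: bytes, new_dst: str) -> bytes:
        ihl = (packet[0] & 0x0F) * 4                 # header length in bytes
        header = bytearray(packet[:ihl])
        header[16:20] = socket.inet_aton(new_dst)    # destination address
        header[10:12] = b"\x00\x00"                  # zero the checksum field
        struct.pack_into("!H", header, 10, ipv4_checksum(bytes(header)))
        return bytes(header) + packet[ihl:]

    # Example: a minimal 20-byte header addressed to a made-up load-balancer VIP.
    hdr = bytearray(20)
    hdr[0] = 0x45                                    # version 4, IHL 5
    hdr[12:16] = socket.inet_aton("10.0.0.5")        # src
    hdr[16:20] = socket.inet_aton("52.0.0.10")       # dst: the VIP
    struct.pack_into("!H", hdr, 10, ipv4_checksum(bytes(hdr)))

    rewritten = rewrite_dst(bytes(hdr), "10.0.1.7")  # dst becomes the backend DIP
    print(rewritten.hex())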

I'm not certain, but I believe AWS does something similar for their VMs. (Their marketing material mentions that they use a custom ASIC instead of an FPGA like Azure.)

With Azure Kubernetes Service (AKS), you can use the Azure CNI, which gives each Pod a unique IP address on the Azure Virtual Network. I can't confirm, but I'm reasonably certain this means Pod-to-Pod traffic is direct, with no NAT appliance or software in the way. The host NICs do the address translation inline, at line rate and with essentially zero latency.
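
One way to sanity-check the "no NAT in the path" claim yourself: have a pod report the source address it sees for a peer and compare it with the peer's own idea of its address. If the CNI really routes pod IPs natively, they match. A small sketch (port and addresses are arbitrary):

    #!/usr/bin/env python3
    # Run "server" in one pod and "client <server-pod-ip>" in another.
    # If the peer-reported address matches the client's own pod IP,
    # there is no SNAT between the pods.
    import socket
    import sys

    PORT = 9999  # arbitrary

    def server():
        with socket.create_server(("", PORT)) as srv:
            conn, peer = srv.accept()
            with conn:
                # Tell the client what source address we actually saw.
                conn.sendall(peer[0].encode())

    def client(server_ip):
        with socket.create_connection((server_ip, PORT)) as s:
            local_ip = s.getsockname()[0]
            seen_by_server = s.recv(64).decode()
            print(f"my pod IP:          {local_ip}")
            print(f"address server saw: {seen_by_server}")
            print("NAT in path" if local_ip != seen_by_server else "no NAT in path")

    if __name__ == "__main__":
        server() if sys.argv[1] == "server" else client(sys.argv[2])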

However, PaaS platforms like Azure App Service or Azure SQL Database are very bad in comparison. They proxy and tunnel and NAT, all in software. I've seen latencies north of 7 milliseconds within a region!
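
If anyone wants to reproduce that kind of number, raw TCP connect latency from a VM in the same region is enough to show the difference. Quick sketch (the endpoint below is a placeholder):

    #!/usr/bin/env python3
    # Measure TCP connect latency to an endpoint, e.g. a PaaS database or app.
    import socket
    import statistics
    import time

    HOST, PORT = "example.database.windows.net", 1433  # placeholder endpoint

    samples = []
    for _ in range(20):
        start = time.perf_counter()
        with socket.create_connection((HOST, PORT), timeout=5):
            pass
        samples.append((time.perf_counter() - start) * 1000)
        time.sleep(0.1)

    print(f"median connect latency: {statistics.median(samples):.2f} ms")
    print(f"max connect latency:    {max(samples):.2f} ms")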


Before you even get to the CNI, I think AWS VM to internet is at least 3 NAT layers.

So we have 3 layers from container to pod. The virtual host kernel is tracking those layers. One connection to one container is 3 tracked connections. Then you have whatever else you put on top to go in and out of the internet.
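
You can actually see that multiplication on the node by counting the conntrack entries that reference a single pod's address (the pod IP below is a placeholder; needs the nf_conntrack module loaded and root on the host):

    #!/usr/bin/env python3
    # Count conntrack entries on the host that mention a given pod IP.
    # One client connection showing up as several entries is the NAT
    # layering being discussed above.
    import sys

    pod_ip = sys.argv[1] if len(sys.argv) > 1 else "10.244.1.5"  # placeholder

    entries = 0
    with open("/proc/net/nf_conntrack") as f:
        for line in f:
            if f"src={pod_ip}" in line or f"dst={pod_ip}" in line:
                entries += 1

    print(f"conntrack entries referencing {pod_ip}: {entries}")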

The funny thing to me is that HAProxy recommended getting rid of connection tracking for performance, while everyone else is doubling down on it and calling it performant.
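
For reference, that advice boils down to exempting the proxy's traffic from conntrack in the raw table. A sketch of the rules, wrapped in Python only for consistency with the snippets above (port 443 is a placeholder; run as root, and only if nothing on the box, e.g. kube-proxy NAT rules, actually depends on conntrack for that traffic):

    #!/usr/bin/env python3
    # Exempt a proxy's traffic from connection tracking via the raw table.
    import subprocess

    RULES = [
        ["iptables", "-t", "raw", "-I", "PREROUTING", "-p", "tcp",
         "--dport", "443", "-j", "NOTRACK"],
        ["iptables", "-t", "raw", "-I", "OUTPUT", "-p", "tcp",
         "--sport", "443", "-j", "NOTRACK"],
    ]

    for rule in RULES:
        subprocess.run(rule, check=True)
        print("applied:", " ".join(rule))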



