Why does everyone use reverse proxies? It seems complex and inefficient. Why not serve XHRs and other dynamic content from the app server(s) and static content from a static web server?
You could do all of those, except hiding the app servers, with the client-based technique I outlined in the other nearby comment. It would just be a tweak to the rule the frontend uses to choose the app server.
That's a simplistic scenario and doesn't apply at all here. Kubernetes is a container orchestration platform that can run thousands of containers across thousands of compute nodes, and directing traffic to them requires some sort of routing/proxy system.
We already have routing systems for large numbers of nodes in the internet technology stack; it's not obvious to me why we need another one at the HTTP layer.
I'm not sure I follow you. Do you mean that the routing systems at the lower networking layers can be thought of as proxies in the sense that they copy data in and copy data out? That's technically correct, but they're not conventionally called proxies.
Proxies, by definition, are intermediaries. Everything on the internet is connected by a giant network of proxies at some layer - NATs, gateways, firewalls, etc.
Kubernetes runs a cluster of machines that act like a mini internet, with many containers running many apps. These apps communicate with each other across containers and machines through a series of proxies, so each app only has to worry about a single address or service name. Kubernetes does allow headless services, which publish all of the pod IPs under a DNS name if you want that, but it's not the common scenario.
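To make the headless case concrete, here's a minimal sketch of resolving one from inside the cluster (the service and namespace names are made up for illustration):

    // Sketch: looking up a Kubernetes headless Service from inside the cluster.
    // "my-app" and "default" are hypothetical names, not from this thread.
    import { promises as dns } from "node:dns";

    async function podAddresses(): Promise<string[]> {
      // A headless Service publishes one A record per ready pod,
      // so this returns every pod IP rather than a single cluster IP.
      return dns.resolve4("my-app.default.svc.cluster.local");
    }

    podAddresses().then((ips) => console.log("pod IPs:", ips));

The catch is that now the app itself has to decide which of those IPs to use and what to do when one stops responding.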
Beyond just knowing endpoints, apps may need to worry about health checking, failover, load balancing, rate limiting, security, observability, routing decisions, and more. It's far simpler to consolidate all this functionality in one place than to leave every single app to implement it all over again.
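As a rough sketch of what "consolidating" means in practice, here's the skeleton of a proxy that owns health checking, load balancing, and failover for a pool of upstreams (the addresses and the /healthz path are assumptions for illustration):

    // Sketch: one proxy process centralizing health checks,
    // round-robin balancing, and failover for its upstreams.
    import http from "node:http";

    const upstreams = ["127.0.0.1:3001", "127.0.0.1:3002"]; // hypothetical
    const healthy = new Set(upstreams);
    let next = 0;

    // Health checking: probe each upstream's /healthz every few seconds.
    setInterval(() => {
      for (const target of upstreams) {
        const [host, port] = target.split(":");
        const probe = http.get({ host, port, path: "/healthz", timeout: 1000 }, (res) => {
          res.resume(); // drain the body
          if (res.statusCode === 200) healthy.add(target);
          else healthy.delete(target);
        });
        probe.on("error", () => healthy.delete(target));
        probe.on("timeout", () => { healthy.delete(target); probe.destroy(); });
      }
    }, 5000);

    // Load balancing + failover: round-robin over healthy upstreams only.
    function pick(): string | undefined {
      const pool = [...healthy];
      return pool.length ? pool[next++ % pool.length] : undefined;
    }

    http.createServer((req, res) => {
      const target = pick();
      if (!target) {
        res.writeHead(503);
        res.end("no healthy upstream");
        return;
      }
      const [host, port] = target.split(":");
      const proxied = http.request(
        { host, port, path: req.url ?? "/", method: req.method, headers: req.headers },
        (upstream) => {
          res.writeHead(upstream.statusCode ?? 502, upstream.headers);
          upstream.pipe(res);
        }
      );
      proxied.on("error", () => { res.writeHead(502); res.end(); });
      req.pipe(proxied);
    }).listen(8080);

Every app behind it gets all of that for free; push the same logic into each app and you rewrite it N times in N languages.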
An ingress controller is in charge of running a specific proxy that deals with traffic into and out of the cluster rather than within it. There are several implementations besides nginx, and they all require some level of tuning to fit the needs of the cluster, but it's a sensible place for this logic, since you may not want, or even have, control over traffic on the other side.
There are scenarios where your app servers might be varied as well -- I've leveraged reverse proxies in front of a PHP application that had parts in .NET and parts in Go, for instance.
Technologies and competencies change as projects evolve, and being able to effortlessly reorganize and reroute is profoundly powerful.
What about just exposing the multiple app server instances directly, and having the frontend code select one for load balancing or failover purposes? There could be a load balancing config read by the client, or you could have static rules in the frontend JS, like choosing a shard number based on a hash of the client IP address.
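A minimal sketch of that static-rule variant (the shard hostnames are made up, and I'm hand-waving how the client learns its IP, since a browser can't see its own public address directly; you'd fetch it, or some other sticky key, from any one server first):

    // Sketch: client-side shard selection by hashing a per-client key.
    // Hostnames and the /api/data path are hypothetical.
    const shards = ["app1.example.com", "app2.example.com", "app3.example.com"];

    // djb2 string hash, kept unsigned so the same key maps to the same shard.
    function hash(s: string): number {
      let h = 5381;
      for (const c of s) h = ((h * 33) ^ c.charCodeAt(0)) >>> 0;
      return h;
    }

    function pickShard(clientKey: string): string {
      return shards[hash(clientKey) % shards.length];
    }

    // Every request from this client deterministically hits one shard.
    const base = `https://${pickShard("203.0.113.7")}`;
    fetch(`${base}/api/data`).then((r) => r.json()).then(console.log);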
Round-robin DNS might also work or complement this.
So your answer to not using a reverse proxy is to fake your own via client-side logic? It's far better to have a tested, reliable, dynamic, and scalable solution sitting right next to the actual app servers instead.
Almost everything on the internet is behind layers of proxies; it's not a bad thing and isn't much cause for concern.
I think your viewpoint might be somewhat inflexible if routing logic in the client & server looks like "faking a reverse proxy" to you. That's where the rest of the logic is, after all, and when designing systems we generally prefer to have the logic in fewer places.
It's a proven design rule (the end-to-end principle) to prefer putting the smarts at the edges of your system, and the problems stemming from the reverse proxy described in the article count, in my book, as further evidence for this idea.