Why does everyone use reverse proxies? It seems complex and inefficient. Why not serve XHRs and other dynamic content from the app server(s) and static content from a static web server?
You could do all of those, except hiding the app servers, with the client-based technique I outlined in the other nearby comment. It would just be a tweak to the rule the frontend uses to choose the app server.
That's a simplistic scenario and doesn't apply at all here. Kubernetes is a container orchestration platform that can run thousands of containers across thousands of compute nodes, and directing traffic to them requires some sort of routing/proxy system.
We already have routing systems for large numbers of nodes in the internet technology stack; it's not obvious to me why we need another one at the HTTP layer.
I'm not sure I follow you. Do you mean that the routing systems at the lower networking layers can be thought of as proxies in the sense that they copy data in and copy data out? That's technically correct, but they're not conventionally called proxies.
Proxies, by definition, are intermediaries. Everything on the internet is connected by a giant network of proxies at some layer - NATs, gateways, firewalls, etc.
Kubernetes runs a cluster of machines that act like a mini internet, with many containers running many apps. These apps communicate with each other across containers and machines through a series of proxies, so each app only has to worry about a single address or service name. Kubernetes does allow headless services, which publish all of the pod IPs under a DNS name if you want that, but it's not the common scenario.
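To make the headless case concrete, here's a minimal sketch of resolving one from inside the cluster (the service and namespace names are made up for illustration):

    // Sketch: looking up a Kubernetes headless Service from inside the cluster.
    // "my-app" and "default" are hypothetical names, not from this thread.
    import { promises as dns } from "node:dns";

    async function podAddresses(): Promise<string[]> {
      // A headless Service publishes one A record per ready pod,
      // so this returns every pod IP rather than a single cluster IP.
      return dns.resolve4("my-app.default.svc.cluster.local");
    }

    podAddresses().then((ips) => console.log("pod IPs:", ips));

The catch is that now the app itself has to decide which of those IPs to use and what to do when one stops responding.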
Beyond just knowing endpoints, apps may need to worry about health checking, failover, load balancing, rate limiting, security, observability, routing decisions, and more. It's far simpler to consolidate all this functionality in one place than to leave every single app to implement it all over again.
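As a rough sketch of what "consolidating" means in practice, here's the skeleton of a proxy that owns health checking, load balancing, and failover for a pool of upstreams (the addresses and the /healthz path are assumptions for illustration):

    // Sketch: one proxy process centralizing health checks,
    // round-robin balancing, and failover for its upstreams.
    import http from "node:http";

    const upstreams = ["127.0.0.1:3001", "127.0.0.1:3002"]; // hypothetical
    const healthy = new Set(upstreams);
    let next = 0;

    // Health checking: probe each upstream's /healthz every few seconds.
    setInterval(() => {
      for (const target of upstreams) {
        const [host, port] = target.split(":");
        const probe = http.get({ host, port, path: "/healthz", timeout: 1000 }, (res) => {
          res.resume(); // drain the body
          if (res.statusCode === 200) healthy.add(target);
          else healthy.delete(target);
        });
        probe.on("error", () => healthy.delete(target));
        probe.on("timeout", () => { healthy.delete(target); probe.destroy(); });
      }
    }, 5000);

    // Load balancing + failover: round-robin over healthy upstreams only.
    function pick(): string | undefined {
      const pool = [...healthy];
      return pool.length ? pool[next++ % pool.length] : undefined;
    }

    http.createServer((req, res) => {
      const target = pick();
      if (!target) {
        res.writeHead(503);
        res.end("no healthy upstream");
        return;
      }
      const [host, port] = target.split(":");
      const proxied = http.request(
        { host, port, path: req.url ?? "/", method: req.method, headers: req.headers },
        (upstream) => {
          res.writeHead(upstream.statusCode ?? 502, upstream.headers);
          upstream.pipe(res);
        }
      );
      proxied.on("error", () => { res.writeHead(502); res.end(); });
      req.pipe(proxied);
    }).listen(8080);

Every app behind it gets all of that for free; push the same logic into each app and you rewrite it N times in N languages.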
An ingress controller is in charge of running a specific proxy that deals with traffic into and out of the cluster rather than within it. There are several implementations besides nginx, and they all require some level of tuning to fit the needs of the cluster, but it's a sensible place for this logic, since you may not want, or even have, control over traffic on the other side.
There are scenarios where your app servers might be varied as well -- I've leveraged reverse proxies in front of a PHP application that had parts in .NET and parts in Go, for instance.
Technologies and competencies change as projects evolve, and being able to effortlessly reorganize and reroute is profoundly powerful.
What about just exposing the multiple app server instances directly, and having the frontend code select one for load balancing or failover purposes? There could be a load balancing config read by the client, or you could have static rules in the frontend JS, like choosing a shard number based on a hash of the client IP address.
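A minimal sketch of that static-rule variant (the shard hostnames are made up, and I'm hand-waving how the client learns its IP, since a browser can't see its own public address directly; you'd fetch it, or some other sticky key, from any one server first):

    // Sketch: client-side shard selection by hashing a per-client key.
    // Hostnames and the /api/data path are hypothetical.
    const shards = ["app1.example.com", "app2.example.com", "app3.example.com"];

    // djb2 string hash, kept unsigned so the same key maps to the same shard.
    function hash(s: string): number {
      let h = 5381;
      for (const c of s) h = ((h * 33) ^ c.charCodeAt(0)) >>> 0;
      return h;
    }

    function pickShard(clientKey: string): string {
      return shards[hash(clientKey) % shards.length];
    }

    // Every request from this client deterministically hits one shard.
    const base = `https://${pickShard("203.0.113.7")}`;
    fetch(`${base}/api/data`).then((r) => r.json()).then(console.log);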
Round-robin DNS might also work or complement this.
So your answer to not using a reverse proxy is to fake your own via client-side logic? It's far better to have a tested, reliable, dynamic, and scalable solution sitting right next to the actual app servers instead.
Almost everything on the internet is behind layers of proxies; it's not a bad thing and isn't much cause for concern.
I think your viewpoint might be somewhat inflexible if routing logic in the client & server looks like "faking a reverse proxy" to you. That's where the rest of the logic is, after all, and when designing systems we generally prefer to have the logic in fewer places.
It's a proven design rule (the end-to-end principle) to prefer putting the smarts at the edges of your system, and the problems stemming from the reverse proxy described in the article count, in my book, as further evidence for this idea.