I've used Varnish in the past, but I have trouble understanding where it fits in the stack right now.
It doesn't do TLS. Something else needs to be put in front of it to terminate HTTPS connections. HTTP/2 support is planned but (again) with no TLS, making it a dud as browsers don't implement non-TLS HTTP/2.
The mass storage engine isn't part of the open-source release, and my experience with larger-than-memory scenarios in standard Varnish has been less than stellar. Again, you need something else to handle that part, or to control what's fed into it.
My conclusion is that if I always have to put an nginx in there somewhere, I might as well not add varnish into the mix.
Doing HTTP separately from TLS in 2016 isn't doing "one thing" well, it's doing "half a thing."
If you're serving HTTPS to clients and your backends are also HTTPS, you quickly end up with a mess of extra components opening unnecessary sockets just to do TLS.
PHK has argued that the SSL/TLS libraries we have are a nightmare when it comes to security design. He instead favors "jailing" TLS in a separate process, so that if all hell breaks loose, as with Heartbleed, your Varnish process should still be "safe".
I agree. But that doesn't require showing that separation to users. Varnish could internally separate TLS from the rest while still behaving as a single service.
I see a lot of people put Varnish into their stack, but mostly to serve static files, which feels like a waste - Nginx can handle that fine for most people's stacks, I'd imagine.
For really high traffic sites where static assets are off-loaded to other servers, I wonder if Varnish even makes sense (versus an enterprise solution with edge servers across the world).
Using Varnish to cache HTTP responses returned from your code (e.g. the HTML generated by an application) is where I'd imagine you'd see the highest gains. However, that gets pretty complex, especially when dealing with cookies for logged-in users.
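As a minimal VCL sketch of the cookie problem (the "session" cookie name is an assumption for illustration, and a full config would also need a backend definition): pass logged-in traffic straight to the backend, and strip cookies from anonymous requests so the HTML becomes cacheable.

    vcl 4.0;

    sub vcl_recv {
        # Logged-in users: go to the backend for personalized HTML.
        if (req.http.Cookie ~ "session=") {
            return (pass);
        }
        # Anonymous users: strip cookies so the response can be cached.
        unset req.http.Cookie;
    }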
What I would find really useful are guides on getting utility out of Varnish, rather than just "cache HTTP responses" - for example:
* The ability to do CloudFront/S3-style signed requests to create temporary or one-time downloads for authenticated users
* Caching OTHER PEOPLE'S APIs that your site calls out to, especially if the other API doesn't change often. In this scenario, your app would call a Varnish server, which in turn would be caching responses from an external API (a reversal of how you'd typically think of Varnish being used). One possible use case is deployment: grabbing .zip files from GitHub. (See the sketch after this list.)
* Take a look at this blog post [0]. That can cover a lot of different security scenarios. The next step would be linking in an external system (a database, API, etc.) to better track more complex security policies.
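For the external-API idea, a rough VCL fragment might look like this. The /ext-api/ prefix and local port are assumptions, and note that since Varnish doesn't speak TLS to origins either, the backend here has to be a local proxy (stunnel, nginx, etc.) that unwraps HTTPS to the external service:

    vcl 4.0;

    # Varnish can't originate TLS, so the "backend" is a local proxy
    # that unwraps HTTPS to the external API.
    backend external_api {
        .host = "127.0.0.1";
        .port = "8443";
    }

    sub vcl_recv {
        # Route a dedicated URL prefix to the external API.
        if (req.url ~ "^/ext-api/") {
            set req.backend_hint = external_api;
        }
    }

    sub vcl_backend_response {
        # Override whatever the upstream says and hold responses an hour.
        if (bereq.url ~ "^/ext-api/") {
            set beresp.ttl = 1h;
        }
    }

Signed, expiring download URLs are doable too, but they typically need a vmod for HMAC verification, so they're beyond a short sketch like this.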
This is a bit off-topic, but since it seems they're using Varnish: does anyone have a clue why the City of DC's website always returns invalid content on the first access attempt? Reloading returns the content. Browser Spy reports they're running Drupal 7 on Apache with Varnish 1.1.
No security layer, though, right? Meaning, if we want to restrict access to certain content based on a user token or something, how do we do that with Varnish? How do we prevent John from reading Mary's tweets?
They seem (at first glance) to have no concept of security in the sense of restricting people's access to certain content, as in the case of all social networks.
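There are two common answers, sketched below in VCL (a fragment; the "session" cookie name is an assumption): either pass credentialed requests through to the backend, which enforces access control itself, or make the session part of the cache key so each user can only ever hit their own cached copies.

    vcl 4.0;

    sub vcl_recv {
        # Option 1: never serve credentialed requests from the shared cache.
        if (req.http.Authorization) {
            return (pass);
        }
    }

    sub vcl_hash {
        # Option 2: key cached objects on the session cookie, so John
        # can never be handed a page that was cached for Mary.
        if (req.http.Cookie ~ "session=") {
            hash_data(req.http.Cookie);
        }
    }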
In most setups, Varnish will perform a lot better; something like 10x better performance should be expected on any high-traffic site.
Varnish is also a lot easier to configure if you have some complexity, for example if you want to cache the same page separately for each language the browser asks for.
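As a minimal sketch of that language case (the two-language normalization and the X-Language header name are assumptions for illustration): normalize Accept-Language in vcl_recv, then fold the result into the cache key in vcl_hash.

    vcl 4.0;

    sub vcl_recv {
        # Collapse Accept-Language to a small set of values so the
        # cache isn't fragmented by every possible header variant.
        if (req.http.Accept-Language ~ "^fr") {
            set req.http.X-Language = "fr";
        } else {
            set req.http.X-Language = "en";
        }
    }

    sub vcl_hash {
        # Cache one copy of the page per normalized language.
        hash_data(req.http.X-Language);
    }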
There is also another alternative, Apache Traffic Server.
In the end, Nginx or Apache Traffic Server may be the right tool for your problem. None of them solves every caching problem.
There is a concept, especially in enterprise-level architecture, that is similar to separation of concerns: basically, you use tools that solve a particular need. Sure, one could get by using the built-in caching of nginx or Apache httpd (version 2.4.x, which is just as performant as nginx), but when the cache becomes a bottleneck, and it WILL at some point when you're using a "does everything" solution, you need something that simply does caching and does it WELL. Varnish will simply beat the pants off of nginx and httpd in head-to-head caching tests. Plus, the level of configurability in Varnish is again much, much better than what exists in nginx or httpd.
It seems kind of nasty to say this, but nginx isn't the be-all and end-all. Now, yeah, you could say I'm biased, but that's because instead of being swayed by marketing, PR and FUD, I am swayed by real-world scenarios. nginx is very, very good, but it is no longer head-and-shoulders above httpd or anything else, really. It is weird, and wrong, to take every other tool that does a part of what nginx does and immediately want to criticize it or say "nginx does that too! What do we need Foo for?!" When doing architecture, pick the right tools for the right job. And most of the time that means picking a caching layer, a TLS-termination layer, a dynamic-content layer, an authn/authz layer, and so on.
Most of the problems stem from the thread pool - and most of my personal observations come from using both in production.
Up to a certain high load, Varnish works well, then all hell breaks loose. Nginx in comparison was much more stable and used fewer resources, had a higher max throughput, and when it hit that max (assuming its open-file limit was high enough) it would still work, just slowly. Varnish just died.
There is this sad idea that "threads are bad" and that event-driven I/O is the cat's pajamas. But even nginx admits that there are situations where threads make sense, hence the semi-recent addition of a thread pool.
I encourage people to read http://capriccio.cs.berkeley.edu/pubs/threads-hotos-2003.pdf. Yeah, it's old, but so is the C10K paper, which is still touted as the reason people should use nginx instead of anything else. Threads will only get better.
If you really need a large malloc cache, I highly recommend running Varnish on FreeBSD instead of Linux, as the Linux kernel's VM system is horrible when memory is overcommitted.
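For reference, the malloc store is sized at startup with varnishd's -s flag (the listen address, VCL path, and 8G figure here are just example values):

    varnishd -a :80 -f /etc/varnish/default.vcl -s malloc,8G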
Interesting - curious what the actual failures were, as I've never seen failures like that with fairly large Varnish caches. nginx wasn't an option for us until they made it handle the HTTP Vary header correctly, so I don't have recent comparative experience with it as a cache.