Ask HN: What do you use for site caching?
6 points by jjoe on Feb 3, 2017 | 11 comments
I'm curious what people here use for page caching in particular. Not just for static assets but also dynamic pages.

I realize that most put their site behind a CDN but that doesn't always get you proper page caching.

Do you think about page caching at all?

What kind of setup do you have?

Do you correlate fast pages (low TTFB) with SEO / SERP?

Thanks!



Caching is a massive topic that covers a multitude of different points along the chain from page creation through to the browser. I'm going to give a very brief overview of a few different solutions:

* Cache frequent database requests with an in-memory key/value store. There are two big players in this field, memcached and Redis. Personally I prefer the latter.

* Cache static content. This would be a CDN, which can run either on a separate domain or sit between the browser and your servers.

* Cache partial page assets (e.g. fragments of otherwise dynamic pages). There's a multitude of ways you can do this, including using a key/value store. This reduces the amount of content that needs to be dynamically generated, even if some of the page still has to be.

* Cache the entire page. This can be done via solutions like Varnish or via a CDN. CDNs that support this level of caching can be configured either via their management portal or by setting caching headers in your HTTP response. Usually the CDN will honour different caching headers to the ones your browser will take notice of, but this is often configurable.
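The standard way to give the shared cache (CDN/Varnish) and the browser different lifetimes is `Cache-Control`: `s-maxage` applies only to shared caches, `max-age` to everyone. A small sketch (the values are just illustrative, and the parser is a simplified one-off, not a full header parser):

```python
response_headers = {
    # Browser revalidates after 60s; a CDN may serve it for an hour.
    "Cache-Control": "public, max-age=60, s-maxage=3600",
}

def shared_cache_ttl(cache_control):
    # Pick out s-maxage, falling back to max-age, as a shared cache would.
    directives = dict(
        part.strip().split("=", 1) if "=" in part else (part.strip(), None)
        for part in cache_control.split(",")
    )
    value = directives.get("s-maxage") or directives.get("max-age")
    return int(value) if value else None
```

So the same response can be near-fresh for browsers while still taking most of the load off your origin.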

* Browser caching. This is one that I've found developers et al often overlook. It's also the level of caching that can cause the most headaches. But it's the cheapest to implement and cheapest to run so it's definitely worth getting browser caching right first.
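One of the cheapest browser-caching wins is conditional requests: serve an `ETag`, and when the browser sends it back in `If-None-Match`, answer `304 Not Modified` so it reuses its cached copy instead of re-downloading the body. A minimal sketch (the `respond` helper is invented for illustration):

```python
import hashlib

def make_etag(body: bytes) -> str:
    # A strong validator derived from the body contents.
    return '"%s"' % hashlib.sha256(body).hexdigest()[:16]

def respond(body: bytes, if_none_match=None):
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, {"ETag": etag}, b""      # browser's copy is current
    return 200, {"ETag": etag}, body         # full response

# First request: full body. Second request: 304, empty body.
status, headers, _ = respond(b"<html>...</html>")
status2, _, body2 = respond(b"<html>...</html>", if_none_match=headers["ETag"])
```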

[edit]

I forgot to add:

* Full page caching with a JavaScript API model. This is something a lot of popular dynamic sites do. Basically the entire HTML page served is a template, and the JavaScript draws the dynamic content on top of it. This way all of your assets can be cached, and a lot of needless dynamic text rendering is distributed amongst the clients instead of handled on the server. The downside is that it's more complicated to build, with more places to break (users running weird browsers, disabling JS, attackers, etc). But the gains can also be significant.
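The caching half of that model can be sketched as two responses with opposite policies: the HTML shell is identical for everyone, so it's aggressively cacheable, while the per-user JSON the client fetches must not be cached. Paths, markup and values here are illustrative:

```python
SHELL = (
    "<html><body><div id='app'></div>"
    "<script src='/app.js'></script></body></html>"
)

def handle(path, user=None):
    if path == "/":
        # Static shell: safe for any cache to hold for a day.
        return SHELL, {"Cache-Control": "public, max-age=86400"}
    if path == "/api/me":
        # Per-user data: forbid caching entirely.
        return '{"user": "%s"}' % user, {"Cache-Control": "no-store"}
    return "", {}
```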


> Browser caching. This is one that I've found developers et al often overlook. It's also the level of caching that can cause the most headaches. But it's the cheapest to implement and cheapest to run so it's definitely worth getting browser caching right first.

Yes, this one's tricky, especially when you update page layout and assets often.


You can get around frequent changes by having a cache-busting version number, e.g.

    <link rel="stylesheet" href="http://example.com/css/all.css?v=1">

then your next update will be

    <link rel="stylesheet" href="http://example.com/css/all.css?v=2">

The obvious issue is you then cannot cache the base HTML file. But that's something you'd never want to cache for long periods anyway.
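Rather than bumping `?v=` by hand, a common refinement is to derive the version from the file's contents, so the URL changes exactly when the asset does. A sketch (file path is illustrative):

```python
import hashlib

def busted_url(path: str, content: bytes) -> str:
    # Short content hash as the cache-busting version token.
    digest = hashlib.sha256(content).hexdigest()[:8]
    return f"{path}?v={digest}"

# Same path, different contents -> different URLs, so stale copies
# are never served; identical contents -> identical, cacheable URL.
url_v1 = busted_url("/css/all.css", b"body { color: red }")
url_v2 = busted_url("/css/all.css", b"body { color: blue }")
```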


Will there ever be an easier way to do all these kinds of caching? As in, do nothing at all?


That depends on the popularity of your site and how the content is generated. If it's a relatively static site (eg the pages are generated into flat HTML files before uploading) and/or you only get a few visitors a day then you can probably get away without any server side caching.

However, if you're running a relatively heavy site serving content that doesn't change regularly (e.g. many CMS solutions like WordPress) then you'd want to do caching to lessen the burden on your web servers. And if you run a popular site then caching will reduce your bandwidth and thus allow you to serve more traffic on existing (virtual) hardware. Both of those reasons will reduce your hosting costs in the long run, plus make the site feel snappier (thus improving its user experience).

However browser caching is something I'd recommend even on quiet and static sites because that will have a direct impact on the speed the site loads and can also potentially save your visitors bandwidth (this really matters if you're targeting people from less privileged backgrounds or where internet costs are extortionate).

Personally I view caching as being as important a part of the web development and infrastructure design process as writing the code and building the web servers.


> you'd want to do caching

That's not what I meant. I understand caching helps.

What I meant is will there ever be a way for caching to happen automatically behind the scenes without me ever having to do anything at all to make all this happen.

For example, the OS caches files, but I don't have to configure the OS file cache by hand. The OS takes care of it.


Your example is pretty bad for a number of reasons:

1) If you're building a high availability file server then you would expect the sysadmin to configure the way the OS caches files rather than run with the defaults. Likewise, if you're building a busy site then you'd need to configure caching to fit your specific application.

2) The OS file cache can run with pretty basic defaults because file system files are static (yes, they can change, but you have to go through the kernel ABIs anyway, so it's easy to track changes). Website content cannot be guessed at, because even static content can be, and often is, dynamically generated. No assumptions can be accurately made. This is also why there are so many different levels of caching on busy web sites.

3) File caching on the OS only needs to happen at one place (as touched on in previous point). However websites are built from a plethora of different frameworks which are literally far too numerous to name.

That all said, there are some specific web frameworks which ship with caching defaults (more typically in the case of browser caching) and some web applications ship with recommended plugins for enhanced caching (e.g. Redis / memcached). However, pragmatically I think if a web developer is smart enough to write code then they should be smart enough to implement caching. In these days of frequent website attacks, the unpredictability of which sites go viral and when, and the ease with which anyone can build and host a website, I really do think some web developers need to up their game rather than blaming the complexity of the tools or the lack of sane defaults. Yeah, the current web model is a mess of edge cases and hidden traps, but if you're a developer then there's no excuse not to properly learn the tools you've been given regardless (or maybe especially because) of how poor those tools are at protecting you from sawing your own hand off.


With a fast enough computer and network, eventually you just don't need one.

Although the speed of light can be a pesky problem.


Caching isn't just about performance. It can be about reducing costs too: e.g. reducing the number of web servers you need, scaling down the size of your persistent storage database (traditionally an RDBMS, but these days NoSQL DBs are common too) and shrinking your bandwidth costs. Hosting can be a costly affair at scale.


Varnish.


Hi Dave

Do you deploy Varnish on your own server(s)?



