Web accelerators (caching) can make a Magento installation deliver a more responsive user experience with less hardware. A famous quote, however, is “There are only two hard things in Computer Science: cache invalidation and naming things” (Phil Karlton). Yes, caching was mentioned first! Magento 2 includes a lot of changes from Magento 1 to improve its built-in support for caching, making it easier to deploy caching and taking some of the complexity out of caching for developers. This post provides a high-level summary of the new caching strategy.
What is Caching?
It was not long before people realized that, as well as caching in a web browser, caching is also useful on the server side, because the same request might come from different users. For most pages containing dynamic content (e.g. content from a database), serving a copy from a cache is faster than hitting your main web server. This means you can serve more requests per second by using your cache. Every so often the cache should fetch a new copy of the page from the real web server to keep the served content fresh, and it can discard old content that has not been accessed recently to reclaim cache space for other pages.
Using a web accelerator such as Varnish is one approach to caching. You can put a copy of Varnish in your data center next to your web server, and it will reduce the traffic that hits your web server, allowing the site to handle more traffic.
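To make the idea concrete, here is a minimal sketch of a server-side page cache, written in Python purely for illustration. It is not how Varnish or Magento implement caching. The class name `PageCache`, the `fetch_from_origin` callable, and the default lifetime and size limits are all assumptions of mine.

```python
import time
from collections import OrderedDict


class PageCache:
    """Tiny in-memory page cache with a time to live and least-recently-used eviction."""

    def __init__(self, fetch_from_origin, ttl_seconds=60.0, max_pages=1000,
                 clock=time.monotonic):
        self._fetch = fetch_from_origin   # hypothetical callable: url -> page body
        self._ttl = ttl_seconds           # how long a cached copy counts as fresh
        self._max_pages = max_pages       # cache size limit, in pages
        self._clock = clock               # injectable so the cache can be tested
        self._pages = OrderedDict()       # url -> (expires_at, body)

    def get(self, url):
        now = self._clock()
        entry = self._pages.get(url)
        if entry is not None and entry[0] > now:
            # Fresh copy in the cache: serve it without touching the web server.
            self._pages.move_to_end(url)
            return entry[1]
        # Missing or stale: fetch a new copy from the real web server.
        body = self._fetch(url)
        self._pages[url] = (now + self._ttl, body)
        self._pages.move_to_end(url)
        # Reclaim space by discarding the pages that were accessed longest ago.
        while len(self._pages) > self._max_pages:
            self._pages.popitem(last=False)
        return body
```

Calling `get()` twice for the same URL within the lifetime reaches the origin only once. The second call is answered from memory.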
Content Delivery Networks (CDNs) support another form of caching, “edge caching”. Akamai is one of the well-known CDN providers. The idea is to reduce network latency by getting the cache closer to the user’s web browser. With Varnish you would have only one cached copy of the page (in your data center). With a CDN your content can be cached in multiple locations around the world to get your content that bit closer to the end user. (Note: Varnish itself can be used at the edge in a CDN solution. In this post I only talk about Varnish being used in the data center to take load off the web server.)
It is common for a site to combine both forms of caching. The importance of this may become more apparent when we get to cache invalidation strategies below.
How Long to Cache For?
So how long should content in a cache last for? There is no single answer to this question. One approach is to cache pages for a very long time (days, weeks, years) so returning users get better performance. Another approach is to cache for a very short time (minutes or seconds). Say what? Cache for minutes or seconds? Well, for high volume sites with hundreds of requests per second, a cache even for seconds can still reduce the traffic by an order of magnitude or more. The short cache lifetime also means you don’t need to worry about how to invalidate the cache – it will expire by itself in a few seconds anyway.
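The arithmetic behind the short-lifetime argument can be sketched as follows. This is an idealized model under my own assumptions: a single page, evenly spaced requests, and one origin fetch per expiry. A real cache may send a few extra requests through while a fetch is in flight. The function name `origin_fetches` is hypothetical.

```python
def origin_fetches(requests_per_second, ttl_seconds, duration_seconds):
    """Return (total requests, requests that reach the origin) for one cached page.

    All arguments are whole numbers; time is tracked in integer milliseconds
    to avoid floating point rounding at the expiry boundary.
    """
    total = requests_per_second * duration_seconds
    fetches = 0
    expires_at_ms = 0  # nothing cached yet, so the first request is a miss
    for i in range(total):
        now_ms = i * 1000 // requests_per_second
        if now_ms >= expires_at_ms:
            # Cached copy is missing or expired: go to the origin and re-cache.
            fetches += 1
            expires_at_ms = now_ms + ttl_seconds * 1000
    return total, fetches
```

With 100 requests per second and a 5 second lifetime, `origin_fetches(100, 5, 60)` returns `(6000, 12)`. Only 12 of the 6,000 requests in that minute reach the web server, a 500-fold reduction for that page.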
So how long should you cache for on your site? If you are a merchant, this is where I would point you toward finding a good partner to work with! Remember, “there are two hard things…”
What Can’t You Cache?
Hey, caching sounds great: you get more done with reduced costs and better performance. So what can’t you cache? Examples on a commerce site include the customer’s name (if they have logged in) and the number of items in their shopping cart. These are different for each user (they are private to that user), so they cannot be stored in a shared (public) cache.
Edge Side Includes (ESI) are one technology to address this. An edge side include is where a web server returns an HTML page that can be cached, but where parts of the page are replaced with an “include” reference (URL) that will return just the content for that part of the page (such as the customer’s name). Using ESI there is reduced load on the web server – most of the heavy work is cached, with HTTP requests for small parts of a page where the content needs to be different per user.
If a page has 10 edge side includes, though, it may end up slower. There is an overhead per HTTP request even if the request is quick for a web server to respond to. It requires careful thought and tuning to get the right number of includes on a page to make sure performance is optimized.
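As a rough illustration of what an ESI processor does, the sketch below replaces each `<esi:include src="..."/>` tag in a cached page with the fragment fetched for its URL, and counts the extra requests this takes. It is a deliberate simplification: real processors such as Varnish handle more of the ESI syntax, and the `render_esi` function and `fetch_fragment` callable are names I made up for the example.

```python
import re

# Matches only the simplest form of the tag: a self-closing include whose
# single attribute is src. Real ESI markup allows more than this.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def render_esi(page, fetch_fragment):
    """Replace each ESI include tag with the fragment returned for its URL.

    Returns the assembled HTML and the number of fragment requests made.
    """
    requests_made = 0

    def substitute(match):
        nonlocal requests_made
        requests_made += 1                     # one extra HTTP request per include
        return fetch_fragment(match.group(1))  # hypothetical callable: url -> HTML

    html = ESI_INCLUDE.sub(substitute, page)
    return html, requests_made
```

The request count is the number to watch. A page with 10 includes costs 10 fragment requests on top of the cached page itself, which is where the per-request overhead described above comes from.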
Magento 2 and Private Content
Magento 2 includes improved caching support built in. However, it has chosen not to use ESI for private content (content specific to one user, such as their name). Magento 2 does use ESI in some circumstances, but for a different use case. It is really important to let this sink in, because it is a common source of confusion: when you see ESI in Magento 2, it is not for private (uncacheable) content. More on this later. Instead, the web browser fetches private content with an AJAX call, which has two benefits:
- A single AJAX call can fetch all the private user content instead of one request per part of a page to be replaced (as is done with ESI). This can reduce the number of HTTP requests.
- The private content is also cacheable by the web browser. A customer’s name is not likely to change for example, so why not keep it in the web browser cache and avoid future AJAX calls?
There is the question of how to refresh the private content cached in the web browser (for example, if a customer adds an item to their cart then the ‘number of items in cart’ will change). This is addressed by flushing the private content cached in the web browser every time an HTTP POST request is made. HTTP POST is how you send a form or perform some action on the site. So if a user browses around a site reading pages, they will just be doing GET requests, which can be completely cached. If they do a POST (e.g. click a button to add an item to their cart) then the cache in the web browser will be flushed and an AJAX call will be made to fetch an updated copy of the private content (so details such as the number of items in the cart will be updated).
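The flush-on-POST rule can be modelled in a few lines. The sketch below is a simulation in Python, not Magento's actual browser-side JavaScript. The `BrowserSession` class, its method names, and the shape of the private content are all assumptions for illustration.

```python
class BrowserSession:
    """Models a browser that caches private content and flushes it on POST."""

    def __init__(self, fetch_private_content):
        self._fetch = fetch_private_content  # hypothetical AJAX call: () -> dict
        self._private = None                 # private content cached in the browser
        self.ajax_calls = 0

    def _private_content(self):
        if self._private is None:
            # Nothing cached locally: one AJAX call fetches all private content.
            self.ajax_calls += 1
            self._private = self._fetch()
        return self._private

    def get(self, url):
        # GET: the page itself can come from a public cache, and the private
        # parts are filled in from the browser's local copy.
        return self._private_content()

    def post(self, url, action):
        # POST: the action may change server state (e.g. add an item to the
        # cart), so the local copy is flushed and fetched again.
        action()
        self._private = None
        return self._private_content()
```

Browsing any number of pages with `get()` costs a single AJAX call in this model. Each `post()` costs one more, because the local copy is discarded and fetched again.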
(Please note: I have left all sorts of details out here for simplicity. For example, different pages may have different private content, so there is complexity about what private content needs fetching and how to cache it. I leave that level of detail for the official documentation.)
Magento 2 and Public Content
So what about public content? This is where I again refer back to the “two hard things” statement at the top of the post. It would seem simple to cache public content wouldn’t it? The problem is no content really lives forever. So let’s consider some different types of content.
- Images tend to be long lived. It is generally safe to cache them a long time. If you want to change a product image, give it a different URL.
- Details on a specific product can often be cached fairly well, as product details (e.g. the description) do not change very often. But when the details do change, how do you flush the caches to get the new details to the site?
- What if the available quantity of an item is added to a product page? Each time the product is sold, the product page would need to be refreshed.
- What if a page has two blocks of somewhat expensive content to generate (maybe a category page with a merchandising block on the side that uses some expensive algorithm)? If the category page hides products that are out of stock, maybe you want to update that part of the page without re-computing the merchandising data.
The basic building blocks supported by Magento 2 caching allow different approaches to cater for the above use cases.
- Public cacheable content can be returned with tags for use by the cache (such as Varnish). Tags hold identity information, such as the product number of the product(s) shown on the page. If an administrator updates a product, Magento can then send Varnish a PURGE request based on the tag to tell it to “flush all pages containing this identity (pages with this tag) from your cache”. This allows selective cache invalidation, instead of wiping the whole cache (which would trigger a large spike in load on the server).
- Different content can be returned with a different Time To Live (TTL) value. This can be used in combination with ESI requests so that different parts of a page can be cached with different lifetimes.
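Both building blocks can be sketched together. The following is an in-memory illustration in Python, not the way Varnish or Magento actually store entries or format tags. The `TaggedCache` class and the tag names used with it (such as `product_42`) are hypothetical.

```python
import time


class TaggedCache:
    """In-memory cache where each entry carries identity tags and its own TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # url -> (expires_at, tags, body)

    def store(self, url, body, tags, ttl_seconds):
        # Each entry gets its own lifetime and the set of identities it shows.
        self._entries[url] = (self._clock() + ttl_seconds, frozenset(tags), body)

    def lookup(self, url):
        entry = self._entries.get(url)
        if entry is None or entry[0] <= self._clock():
            self._entries.pop(url, None)  # drop the expired entry, if any
            return None
        return entry[2]

    def purge_tag(self, tag):
        """Remove every entry carrying the tag; return how many were removed."""
        tagged = [url for url, (_expires, tags, _body) in self._entries.items()
                  if tag in tags]
        for url in tagged:
            del self._entries[url]
        return len(tagged)
```

If a product page and a category page are both stored with the tag `product_42`, then `purge_tag("product_42")` removes both and leaves every other page in the cache untouched. That is the selective invalidation described above.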
Remember first that in Magento ESI is only used to cache shared content, not private content for a specific customer. So the use of ESI is not to embed private (uncacheable) content on a cacheable page, but instead to allow different parts of a page to be cached for different lengths of time. This is really only of benefit if there are different parts of a page worth caching separately. Otherwise it is simpler to just cache the whole page and regenerate it when required.
Magento 2 Cache Implementation
Right now there are two forms of caching supported in Magento 2. This may be refactored a bit to improve the modularity (e.g. to make it easier for other caches to be supported via extensions), but the current two approaches are a built-in cache and support for an external Varnish instance. The Varnish cache will give superior performance and is recommended for production usage. The built-in cache is mainly included for developers to use in their personal development environment. (This is not to say a small site could not take advantage of it, but the benefit of caching on a low volume site is lower.) You could enable both caches if you wanted to, but there is little reason to believe the built-in cache will deliver much improvement beyond what the external Varnish cache would catch.
Form Keys and Caching
In Magento 1 “form keys” were introduced for added security against Cross Site Request Forgery (XSRF) attacks. This involved putting a random (secret) hidden string into a form so the web server can verify that POST requests come from an HTML page it returned. This however meant caching of pages with form keys was unsafe. (The form key must be different per retrieved HTML page.)
Magento 2 also supports the concept of form keys but using a new approach such that the returned HTML page does not need to include a random string. This makes the form cacheable again. I am not going to go into the technical details here – the only point you need to know is caching of HTML pages with forms is possible with Magento 2 out of the box without missing out on the additional security provided by form keys.
Hopefully this post gives a useful overview of how caching in Magento 2 has improved over Magento 1. It has been reworked based on lessons from Magento 1 (such as from form keys), and it improves the cacheability of content (such as private content) compared to techniques such as ESI.
For those with previous ESI experience, it is however important to realize that ESI in Magento 2 is not used for hole punching of cached pages (it is not used to embed user-specific private content into a returned page, which is what ESI is normally used for). It is there to allow finer-grained control over when to recompute parts of an expensive page. It may be common for sites not to use ESI at all, as the base caching support may be sufficient.