HTTP caching basics
It’s important to understand HTTP caching because at some point an HTTP cache will protect your web platform from going down. Whether this HTTP cache is a reverse caching proxy like Varnish or a full-blown CDN, you need to understand the rules of the game, and you need to understand the basics of HTTP caching.
Luckily there are conventions for this. There are even standardized headers that are part of HTTP’s specification that will allow you to control the behavior of a web cache.
What is HTTP caching?
HTTP has caching capabilities built into the protocol to ensure that clients or proxies can store the HTTP response in the cache for a certain amount of time. By caching the response, clients don’t have to connect to the web server every time they want to access that content.
HTTP caching reduces network traffic and server load, which results in lower response times.
Browser cache versus caching proxies
Historically HTTP responses were cached by the web browser to reduce network traffic. In the early days of the web, bandwidth was limited. Being able to cache HTTP responses in the browser avoided expensive HTTP roundtrips.
Unfortunately browser cache is not reliable: users can flush the cache at any time, and they can even disable the cache. Another disadvantage is the fact that the cache is hosted locally, which means there is a cache per user.
By installing caching proxy servers closer to the user, either in the office or at the internet provider’s data center, clients can retrieve centrally cached copies of the requested content from a proxy that acts on behalf of the origin web server.
As broadband internet became more common, local caching proxy servers were no longer crucial. Instead the increase of bandwidth shifted the pressure from the client to the server: traffic spikes started jeopardizing the stability of servers. As a consequence, caching proxies also shifted.
Nowadays reverse caching proxies are put in front of the origin web platform to protect it against traffic spikes and prevent the platform from caving in under pressure.
HTTP’s caching policies allow HTTP responses to be cached by both clients and proxies using the same syntax. However, there are also specific instructions that only apply to proxies.
HTTP caching concepts
Cacheability
Not all HTTP responses can or should be cached: if the content is private, it should not be stored in the cache. If the type of request (for example an HTTP POST request) implies a change of the resource, it shouldn’t be cached either. If the returned response uses a Set-Cookie header to change state, the response shouldn’t be cached.
On the one hand you can decide whether or not to store a response in the cache. On the other hand, you can decide whether or not to serve a cached response from the cache.
These rules can be specifically enforced in the implementation or configuration of the cache. However, the HTTP protocol allows you to control the cacheability under the form of specific header syntax.
The Cache-Control: no-cache, no-store header is the perfect example of enforcing cacheability. The no-cache part instructs the cache not to serve a cached response for this resource without first revalidating it with the origin web server. The no-store part prevents the HTTP response from being stored in the cache.
Public versus private content
The scope of cacheable content is either public or private.
- Publicly cacheable content can be cached by both the requesting client as well as reverse caching proxies.
- Privately cacheable content can only be cached by the requesting client and not by reverse caching proxies.
The Cache-Control: public and Cache-Control: private response headers are used to set the caching scope.
Cache lifetime
Cached objects are only valid for a limited amount of time. The time to live of a cached object can be defined in the implementation or configuration of the cache. But as expected, the HTTP protocol has ways to enforce the time to live through specific cache header syntax.
The Cache-Control: max-age=3600 header is an example of setting the lifetime of a cached object to an hour. The HTTP protocol has more directives to set the time to live, and these are covered in the HTTP caching headers section of the tutorial.
Revalidation
As long as the time to live of a cached object has not expired, the content is considered fresh. This means it can be served from the cache to requesting clients.
The remaining lifetime of an object is a value that changes every second. Once it hits zero, the content is no longer fresh. Instead it is considered stale and in need of revalidation.
Cache revalidation is the process of connecting back to the origin web server and fetching potentially updated content. As soon as the revalidation is finished, the object is considered fresh again for as long as the time to live allows.
Conditional revalidation
Revalidation can also be done conditionally. This means that the origin web server will only send the payload if the requested resource has changed. If the resource hasn’t changed, a 304 Not Modified
status code will be returned without a response body.
This reduces the amount of data sent over the wire, and it can also result in a lower server resource consumption at the origin level.
When the cache receives a 304 Not Modified response, the time to live that is defined by the Cache-Control or Expires header will be used to set the lifetime of the object after revalidation.
HTTP response headers like Etag
and Last-Modified
allow web servers to identify when a resource has last changed. These values can be presented by the client or a reverse caching proxy under the form of If-None-Match
and If-Modified-Since
request headers to compare versions. If these versions differ, a regular 200 OK
response will be sent, otherwise the 304 Not Modified
status is returned.
Etag, Last-Modified, If-None-Match and If-Modified-Since are covered in the HTTP caching headers section of this tutorial.
Identifying cached objects
Objects in the cache are generally identified by their URI and Host
header values. These values are part of the HTTP request, as illustrated below:
GET /about HTTP/1.1
Host: example.com
This example is a request to http://example.com/about
and /about
and example.com
are used to create a hash that identifies the object in the cache.
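Here’s a minimal sketch of how such an identifier could be computed. The cache_key function name and the choice of SHA-256 are assumptions made for illustration; real caches use their own hashing schemes.

```python
import hashlib


def cache_key(host: str, uri: str) -> str:
    """Return a hex digest identifying a cached object by Host and URI."""
    # Host names are case-insensitive, so normalize before hashing.
    material = f"{host.lower()}\n{uri}"
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


# The request for http://example.com/about from the example above.
print(cache_key("example.com", "/about"))
```

Two requests with the same Host header and URI produce the same key and therefore map to the same cached object.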
Cache variations
Sometimes, an HTTP resource can have multiple versions that depend on values coming from request headers. One example of this is a multilingual website that uses the Accept-Language request header to present the resource in multiple languages.
If a resource has multiple versions, knowing that it is identified by its URI, the cached output can be inconsistent. Only using the URI and Host
header will not suffice and that’s where cache variations come into play.
A cache variation will extend the hash that is used to identify an object in the cache by adding the value of a request header. Per version of the resource, a variation is added to the cache.
The origin web server can issue a Vary
header to tell the cache what request header it should use to base its variations on. In the case of the multilingual website, Vary: Accept-Language
is the logical choice.
The goal is to store enough cache variations to cover the available versions of a resource. But if a resource has too many variations, caching all variations will have a detrimental effect on the hit rate and will fill up the cache.
Make sure you have your variations under control, otherwise you’re better off not caching the response at all.
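The idea of extending the hash can be sketched as follows. The variation_key function and its parameters are hypothetical names, and the hashing scheme is an assumption rather than the behavior of any specific cache.

```python
import hashlib


def variation_key(host, uri, request_headers, vary):
    """Extend the Host + URI key with the request header values named in Vary."""
    # Header names are case-insensitive, so compare them in lowercase.
    headers = {name.lower(): value for name, value in request_headers.items()}
    parts = [host.lower(), uri]
    for name in (item.strip().lower() for item in vary.split(",")):
        if name:
            parts.append(f"{name}={headers.get(name, '')}")
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


english = variation_key("example.com", "/about", {"Accept-Language": "en"}, "Accept-Language")
dutch = variation_key("example.com", "/about", {"Accept-Language": "nl"}, "Accept-Language")
print(english != dutch)  # each language gets its own variation
```

Because every distinct header value yields a new key, a header with many possible values quickly multiplies the number of stored objects, which is why variations should be kept under control.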
HTTP caching headers
Here’s an overview of HTTP caching headers that you can leverage to control the cache from your origin web platform.
Cache-Control
The Cache-Control
header is probably the most common HTTP caching header. Its syntax is quite extensive and has directives to control the following aspects of HTTP caching:
- Time To Live
- Cacheability
- Scope (public or private)
- Revalidation
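To make the directive syntax more tangible, here’s a small sketch that splits a Cache-Control value into its directives. The parse_cache_control function is a hypothetical helper, and it deliberately ignores edge cases such as quoted arguments that contain commas.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, argument = item.partition("=")
        # Directives without an argument (public, no-store, ...) map to None.
        directives[name.strip().lower()] = argument.strip().strip('"') or None
    return directives


print(parse_cache_control("public, max-age=3600, s-maxage=86400"))
```

Each directive discussed in the following sections would show up as one key in the resulting dictionary.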
Public and private
The public
keyword is used to announce that the resource can be cached by both web clients and caching proxies. If private
is used instead, a caching proxy will not store the object in cache whereas a web client will.
Here’s an example where a public resource is announced through the Cache-Control
header:
Cache-Control: public
Here’s the equivalent for private content:
Cache-Control: private
Max-age and s-maxage
The max-age
directive is used to set the lifetime of an object in the cache. Here’s an example:
Cache-Control: public, max-age=3600
This header instructs the cache to store the object for 3600 seconds, which corresponds to an hour.
The s-maxage
directive does the same as max-age
, but it is intended for caching proxies rather than for web clients.
Here’s an example where a caching proxy is instructed to cache the object for a day:
Cache-Control: public, s-maxage=86400
It is also possible to combine these directives:
Cache-Control: public, max-age=3600, s-maxage=86400
This will result in the web client caching the resource for an hour and the caching proxy applying a time to live of a day.
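The difference between both directives can be sketched in a few lines. The time_to_live function and its shared_cache flag are assumptions for illustration: a caching proxy would pass True, a web client False.

```python
def time_to_live(cache_control: str, shared_cache: bool) -> int:
    """Return the TTL in seconds a cache would apply, or 0 if none is set."""
    directives = {}
    for item in cache_control.split(","):
        name, _, argument = item.strip().partition("=")
        directives[name.lower()] = argument
    # A shared cache (caching proxy) prefers s-maxage over max-age.
    if shared_cache and directives.get("s-maxage", "").isdigit():
        return int(directives["s-maxage"])
    if directives.get("max-age", "").isdigit():
        return int(directives["max-age"])
    return 0


header = "public, max-age=3600, s-maxage=86400"
print(time_to_live(header, shared_cache=False))  # web client
print(time_to_live(header, shared_cache=True))   # caching proxy
```

When s-maxage is absent, the sketch falls back to max-age for the caching proxy as well.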
Stale-while-revalidate
The stale-while-revalidate
directive sets the allowed staleness of a cached object, allowing expired content that has passed its expiration time to be served from the cache.
The stale-while-revalidate
value sets the number of seconds past the expiration time that stale content can be served while the cache is revalidating the content.
Here’s an example:
Cache-Control: public, max-age=900, stale-while-revalidate=100
In this example the cached object is considered fresh for 900 seconds. After that revalidation needs to take place. But because of the stale-while-revalidate=100
directive, the object can be served from the cache for another 100 seconds while the cache is asynchronously revalidating with the origin web server.
Stale-if-error
When staleness is allowed, the end user will not be impacted by potentially slow backends during the revalidation process. Thanks to stale-while-revalidate
, the cache can be instructed to serve stale data while a new version of the resource is being fetched.
But what happens when the origin web server is down?
As long as the stale-while-revalidate
value is high enough, stale content will be served and the failed revalidation will go unnoticed as far as the user is concerned.
However, a very high stale-while-revalidate value may violate business rules while the origin web server is healthy, because content can then be served long after it has expired.
The stale-if-error
directive sets the allowed staleness when the origin is down. Here’s an example:
Cache-Control: public, max-age=900, stale-while-revalidate=100,
stale-if-error=86400
In this example an object is stored in the cache for 900 seconds and if the origin is healthy, this object may be served up to 100 seconds past the expiration of the object.
If the backend is down, the staleness can be drastically increased. In this case, a stale object may be served a full day past its expiration time.
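The decision a cache makes with these directives can be approximated as follows. This is a simplified sketch: may_serve_from_cache and its parameters are hypothetical names, and an actual cache implementation takes more factors into account.

```python
def may_serve_from_cache(age, max_age, stale_while_revalidate=0,
                         stale_if_error=0, origin_healthy=True):
    """Decide whether a cached object of the given age (seconds) may be served."""
    if age < max_age:
        return True  # still fresh
    staleness = age - max_age
    if origin_healthy:
        # Stale content is only acceptable while revalidation is in progress.
        return staleness <= stale_while_revalidate
    # The origin is failing: fall back to the (usually larger) error window.
    return staleness <= max(stale_while_revalidate, stale_if_error)


# Values taken from the header above: max-age=900,
# stale-while-revalidate=100, stale-if-error=86400.
print(may_serve_from_cache(1100, 900, 100, 86400, origin_healthy=True))
print(may_serve_from_cache(1100, 900, 100, 86400, origin_healthy=False))
```

An object that is 1100 seconds old is 200 seconds past its expiration: too stale to serve while the origin is healthy, but still acceptable when the origin is down.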
Must-revalidate and proxy-revalidate
When staleness is not allowed, the must-revalidate
keyword is used to enforce this. Here’s an example:
Cache-Control: public, max-age=3600, must-revalidate
In this case the cached object is fresh for an hour, but as soon as it expires synchronous revalidation is mandatory.
If the web client is allowed to serve stale objects from the cache, but intermediary caches aren’t, you can use the proxy-revalidate
keyword to enforce this.
Cache-Control: public, max-age=3600, stale-while-revalidate=100,
proxy-revalidate
In this case the object is cached for an hour. After that an extra 100 seconds of staleness is allowed while revalidation takes place. Because of proxy-revalidate
, staleness is only allowed by the web client. Caching proxies are not allowed to serve stale content.
No-cache and no-store
If the web server returns an uncacheable response, the Cache-Control: no-cache, no-store
syntax can be used to instruct the cache not to cache this response.
The no-cache
directive forces the cache not to serve the cached resource to the requesting client and instead revalidate the content with the origin web server.
The no-store
directive instructs the cache not to store this resource in the cache.
Both directives can be used separately, but they can also be combined.
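The cacheability rules from this tutorial can be combined into a single check. The sketch below is an assumption about how a cache might apply them, not the logic of a particular product, and is_storable is a hypothetical function name.

```python
def is_storable(method: str, response_headers: dict, shared_cache: bool = True) -> bool:
    """Return True if a cache following the rules above may store the response."""
    headers = {name.lower(): value for name, value in response_headers.items()}
    directives = {
        item.strip().partition("=")[0].lower()
        for item in headers.get("cache-control", "").split(",")
    }
    if method.upper() not in ("GET", "HEAD"):
        return False  # requests such as POST imply a change of the resource
    if "set-cookie" in headers:
        return False  # the response changes state for one specific client
    if "no-store" in directives:
        return False
    if shared_cache and "private" in directives:
        return False  # only the requesting client may cache private content
    return True


print(is_storable("GET", {"Cache-Control": "public, max-age=3600"}))
print(is_storable("GET", {"Cache-Control": "no-cache, no-store"}))
```

Note that no-cache on its own does not fail this check: it governs whether a stored response may be served without revalidation, not whether it may be stored.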
Expires
The Expires
header can also be used to set the time to live of an object. The Expires
header doesn’t use relative numbers like the Cache-Control
header does. Instead it sets the date and time of expiration.
Here’s an example:
Expires: Sat, 4 May 2024 08:00:00 GMT
This cacheable resource is considered fresh until Saturday May 4th 2024 at 8 o’clock GMT.
By setting a date and time in the past, the Expires header marks the response as already expired, which in practice keeps the cache from serving it as fresh content.
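Converting an Expires value into a remaining lifetime could look like this. The seconds_until_expiry function is a hypothetical helper, and treating an unparseable date as already expired is an assumption of this sketch.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def seconds_until_expiry(expires: str, now: datetime) -> int:
    """Return the remaining lifetime in seconds; 0 means already expired."""
    try:
        expiry = parsedate_to_datetime(expires)
    except (TypeError, ValueError):
        return 0  # an unparseable date is treated as already expired
    if expiry.tzinfo is None:
        expiry = expiry.replace(tzinfo=timezone.utc)
    return max(0, int((expiry - now).total_seconds()))


# One hour before the expiration time used in the example above.
now = datetime(2024, 5, 4, 7, 0, 0, tzinfo=timezone.utc)
print(seconds_until_expiry("Sat, 4 May 2024 08:00:00 GMT", now))
```

A date in the past results in a remaining lifetime of zero.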
Vary
The Vary
header is used to create cache variations. As explained earlier, cache variations are used to create multiple variations of a cached object, based on a request header.
Here’s an example:
Vary: Accept-Language
This example will create a cache variation for this resource based on the value of the Accept-Language
request header. This will allow multilingual websites that use the same URL structure for multiple languages to be properly cached.
Here’s another example:
Vary: X-Forwarded-Proto
This example will create a cache variation based on the value of the X-Forwarded-Proto
request header. This header is not sent by the client, but by a TLS proxy that terminates the TLS connection. Possible values are https
and http
.
This cache variation ensures that there’s an HTTP and an HTTPS version of each page to avoid mixed content.
Etag and If-None-Match
The Entity tag that is returned through the Etag
response header is used to identify a specific version of the resource.
The Etag
value could be any value, but it must be unique to the version it represents. Consider it a fingerprint of the content.
When a web server returns an Etag
header, the value can be presented by the client upon subsequent requests under the form of an If-None-Match
request header.
This If-None-Match
header represents the version of the resource it currently has. This value can be compared to the Etag
that is returned by the web server. If the values are identical, the content hasn’t changed and a 304 Not Modified
status code can be returned without attaching a body to the HTTP response.
If the values of If-None-Match
and Etag
differ, the content has changed and a regular 200 OK
response is returned that includes a body.
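Here’s a minimal sketch of that comparison on the server side. Deriving the Etag from a SHA-256 hash of the body is only one possible strategy, and make_etag and respond are hypothetical names. The sketch also ignores weak validators and lists of multiple values in If-None-Match.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag value from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match=None):
    """Return (status, headers, body) for a GET request on this resource."""
    etag = make_etag(body)
    if if_none_match is not None and if_none_match == etag:
        return 304, {"ETag": etag}, b""  # unchanged: no body is attached
    return 200, {"ETag": etag}, body


status, headers, _ = respond(b"<h1>About</h1>")
revalidated_status, _, revalidated_body = respond(b"<h1>About</h1>", headers["ETag"])
print(status, revalidated_status, len(revalidated_body))
```

The first request receives the full body, while the second one presents the fingerprint it received earlier and gets an empty 304 response.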
Last-Modified and If-Modified-Since
The Last-Modified
response header also identifies a specific version of a resource. Unlike the Etag
header, it uses a last modified date to identify that resource.
Here’s an example:
Last-Modified: Mon, 8 Nov 2021 18:28:00 GMT
This value represents the last time the resource was modified. The value of that response header can be presented to the web server upon subsequent requests under the form of an If-Modified-Since
header.
Here’s an example:
If-Modified-Since: Sun, 7 Nov 2021 13:18:21 GMT
The value of the If-Modified-Since
header is older than the one presented by the Last-Modified
header. This means the content has changed and a 200 OK
response should be returned.
If the values of If-Modified-Since and Last-Modified were identical, the client would have the most recent version of the resource and a bodyless 304 Not Modified response could be returned.
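The date comparison can be sketched with Python’s standard library. The status_for function is a hypothetical name, and the sketch assumes both header values are valid HTTP dates.

```python
from email.utils import parsedate_to_datetime


def status_for(last_modified: str, if_modified_since: str) -> int:
    """Return 304 if the resource is unchanged since the given date, else 200."""
    modified_at = parsedate_to_datetime(last_modified)
    client_copy_at = parsedate_to_datetime(if_modified_since)
    return 200 if modified_at > client_copy_at else 304


# The two dates from the examples above.
print(status_for("Mon, 8 Nov 2021 18:28:00 GMT", "Sun, 7 Nov 2021 13:18:21 GMT"))
```

With the dates from the examples, the resource was modified after the client’s copy, so the full response has to be sent.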
Age
An Age
header is used to inform the client how long the object has been stored in cache.
Here’s an example:
Age: 100
This means the object has been stored in the cache for 100 seconds.
Imagine the following example:
Cache-Control: max-age=300
Age: 100
We know that the max-age=300
directive sets the Time To Live of the cached object to 300 seconds. The fact that the Age
header is set to 100 seconds means that the cache object has a remaining lifetime of 200 seconds.
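The same arithmetic as a small sketch, with remaining_lifetime as a hypothetical helper name:

```python
def remaining_lifetime(max_age: int, age: int) -> int:
    """Return the seconds a cached object stays fresh, never below zero."""
    return max(0, max_age - age)


print(remaining_lifetime(300, 100))  # the max-age=300 and Age: 100 example
```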
HTTP caching flow
When a reverse caching proxy server like Varnish is used to accelerate your origin server, there is a specific flow depending on the scenario.
We’d like to present four scenarios:
- The cache miss flow
- The cache hit flow
- The cache revalidation flow
- The conditional revalidation flow
Cache miss flow
When a client requests content from an empty cache, a cache miss occurs and the cache has to fetch the content from the origin. The following diagram illustrates this process:
Although we try to keep origin fetches to a minimum, a cache miss is not necessarily a bad thing. A cache miss is simply a hit that hasn’t happened yet.
When the origin web server responds, the caching proxy will store the response in the cache with a lifetime that was specified by the Cache-Control
or Expires
header and will serve the cached object to clients requesting it.
Cache hit flow
Once the object is stored in cache, subsequent requests will result in a cache hit, as illustrated in the diagram below:
As you can see, no connection to the web server is needed. This is by design and takes away the pressure from that origin web server while the caching proxy is serving the cached version of those origin responses.
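The miss and hit flows can be imitated with a tiny in-memory cache. This is a deliberately simplified sketch: the TinyCache class, the origin function and the fixed time to live are assumptions for illustration and say nothing about how Varnish is implemented.

```python
class TinyCache:
    """In-memory cache that records whether each lookup was a hit or a miss."""

    def __init__(self, fetch_from_origin, ttl):
        self.fetch_from_origin = fetch_from_origin  # callable: uri -> body
        self.ttl = ttl                              # lifetime in seconds
        self.objects = {}                           # uri -> (body, stored_at)

    def get(self, uri, now):
        entry = self.objects.get(uri)
        if entry is not None and now - entry[1] < self.ttl:
            return "HIT", entry[0]          # served without contacting the origin
        body = self.fetch_from_origin(uri)  # miss or expired: go to the origin
        self.objects[uri] = (body, now)
        return "MISS", body


origin_calls = []


def origin(uri):
    origin_calls.append(uri)
    return f"content of {uri}"


cache = TinyCache(origin, ttl=3600)
print(cache.get("/about", now=0))   # first request: cache miss
print(cache.get("/about", now=10))  # second request: cache hit
print(len(origin_calls))            # number of origin fetches so far
```

Only the first request reaches the origin. Every request that arrives within the time to live is answered from the cache.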
Cache revalidation flow
At some point the cached object will expire and the content will need to be revalidated with the origin web server.
This involves an origin fetch, just like a cache miss. But unlike the cache miss scenario, the cache can choose to serve the stale content while asynchronously revalidating with the origin.
The diagram below clarifies the revalidation flow:
If you pay close attention, you’ll see that the order of execution is different: the client response can be returned before the origin revalidation response is received.
The following Cache-Control
will enable asynchronous revalidation thanks to its stale-while-revalidate
directive:
Cache-Control: public, s-maxage=3600, stale-while-revalidate=200
If we want to serve stale content when the origin web server is down, we could use the following Cache-Control
header and leverage the stale-if-error
directive:
Cache-Control: public, s-maxage=3600, stale-if-error=86400
Conditional revalidation flow
Revalidation can also be done conditionally. This means that the caching proxy will identify the version of the object through specific request headers, such as If-None-Match
or If-Modified-Since
.
The values of these headers come from the Etag or Last-Modified response headers that are part of the cached object.
If the latest version matches the version that is advertised by the proxy, the origin will acknowledge this and not send the full payload. A 304 Not Modified
response is returned, the stale content is then considered fresh again and revalidation is paused until the content expires again.
A version matches if the Etag
and If-None-Match
values are identical or if the Last-Modified
and If-Modified-Since
values are identical:
If the versions differ, the full response is sent by the origin and the content is considered fresh again:
When a client sends a conditional request to the caching proxy, the proxy can in turn respond with a 304 Not Modified response without the body, instead of returning the payload of the cached object.
Conditional revalidation allows backends to consume less bandwidth by only adding payload to the HTTP response if the content has changed. If the origin is optimized for conditional revalidation, CPU, memory and disk I/O consumption can also be reduced.