Cache invalidation


Introduction

A good caching strategy defines not only how content should be cached but, more importantly, how it should be invalidated and evicted from cache. An object inserted in cache can be served to other clients until it expires, is evicted to make room for other objects, or is invalidated.

The TTL (Time to Live) of an object defines how long the object can be cached. An object’s TTL is set when the content is generated (by the backend) or when it’s inserted (in Varnish). The TTL can be set via HTTP caching headers (such as Expires or Cache-Control) or via VCL. Either way, Varnish will respect the defined TTL and evict the object once its Time to Live has expired, making room for fresher content to be inserted in cache.
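As a rough illustration of setting a TTL via VCL, the following sketch overrides the TTL for image responses in vcl_backend_response. The URL pattern and the one-day value are assumptions chosen for the example, not recommendations from this tutorial:

```vcl
sub vcl_backend_response {
	# Hypothetical policy: cache common image types for one day,
	# whatever TTL the backend's caching headers would have produced.
	if (bereq.url ~ "\.(png|jpg|gif)$") {
		set beresp.ttl = 1d;
	}
}
```

Responses that don't match the pattern keep the TTL derived from the backend's caching headers.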

By default, Varnish will handle content insertion and invalidation of the cache, but you can still define a more specific eviction strategy. This tutorial will show you how to invalidate objects using VCL, CLI, Varnish Broadcaster, and Varnish Controller.

                     Purge                               Ban             Ykey
Target               Specific object with its variants*  Regex patterns
VCL                  Yes                                 Yes             Yes
CLI                  No                                  Yes             No
Varnish Broadcaster  Yes                                 Yes             Yes
Varnish Controller   Yes                                 Yes             Yes

* Variants defined by the Vary header.

What is a purge?

A purge immediately discards an object, with all its variants, from cache, freeing up space. It’s invoked through HTTP with the PURGE method, which is a request method just like GET.

If you use PURGE instead of GET, the object that would otherwise be hit and served to the client is purged from the cache, together with all its variants.

For this to work, the PURGE request must produce the same hash as the request that cached the object, as computed in the vcl_hash subroutine. The default vcl_hash takes the Host header and the URL into account, including any query parameters present in the request. There also needs to be explicit VCL code to handle the PURGE method, as explained below.
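For reference, the built-in hashing logic looks roughly like the sketch below. It is reproduced here as an approximation; the exact text may differ between Varnish versions, so check the built-in VCL shipped with your installation:

```vcl
sub vcl_hash {
	# The full URL, including the query string, is part of the hash.
	hash_data(req.url);
	if (req.http.host) {
		# Normally the Host header identifies the site.
		hash_data(req.http.host);
	} else {
		# Without a Host header, fall back to the server IP.
		hash_data(server.ip);
	}
	return (lookup);
}
```

With this logic, a PURGE for /foo?a=1 does not touch the object cached for /foo?a=2, and a PURGE sent with Host www.example.com does not touch objects cached under example.com.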

Using VCL

Apply the following snippet to your VCL file:

# Access Control List to define which IPs
# can purge content
acl purge {
	"localhost";
	"192.168.55.0"/24;
}

sub vcl_recv {
	if (req.method == "PURGE") {
		# check if the client is allowed to purge content
		if (!client.ip ~ purge) {
			return(synth(405,"Not allowed."));
		}
		return (purge);
	}
}

Configuration reload

To apply the changes, reload the VCL by running the following command:

systemctl reload varnish

When using Varnish Controller, follow this guide to deploy VCL files.

Purge request

Now you can start purging content by issuing a PURGE HTTP request. Use your preferred tool to send the request. The following two examples use HTTPie and curl:

Examples

# HTTPie
http PURGE "www.example.com/foo"

# curl
curl -X PURGE "www.example.com/foo"

Both commands will purge the /foo resource for the host www.example.com.

What are bans?

Bans are another way to invalidate content in cache. When an object is banned, it will no longer be used to fulfill incoming requests. Ban expressions can match on request or object properties, optionally using regular expressions, so a single ban can invalidate many objects at once. A ban only applies to objects that are already stored in cache when it is issued; it doesn’t prevent new content from entering the cache or being served.

You can ban content based on either req.* or obj.* properties. Like purges, bans immediately stop the invalidated content from being served. Unlike purges, they don’t immediately evict the banned objects, so memory isn’t freed right away. Instead, objects are tested against the ban list when they are looked up, and by a background thread, the ban lurker (only for bans that reference nothing but obj.* properties). The ban lurker walks the cache and evicts matching objects in the background, so bans complete (and are discarded) much sooner than when objects are only tested at lookup time. This in turn limits the number of simultaneously active bans, which can otherwise become a performance concern.

How aggressive the ban lurker is can be controlled by the parameter ban_lurker_sleep. The ban lurker can be disabled by setting ban_lurker_sleep to 0.
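To make the distinction between request-based and object-based bans concrete, here is a minimal sketch of a request-based ban. It assumes the client sends the URL pattern in a ban-url request header (a naming choice made for this example) and it omits access control:

```vcl
sub vcl_recv {
	if (req.method == "BAN") {
		# Assumption: the URL regex arrives in a ban-url header.
		if (!req.http.ban-url) {
			return (synth(400, "ban-url header required"));
		}
		# This expression references req.*, so the ban lurker cannot
		# process it. Objects are only tested against it at lookup time.
		ban("req.http.host == " + req.http.host + " && req.url ~ " + req.http.ban-url);
		return (synth(200, "Ban added"));
	}
}
```

Because no request exists when the ban lurker runs in the background, a ban like this stays on the ban list until every older object has been looked up or has expired. Expressions that use only obj.* properties avoid that cost.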

There are several ways to issue a ban in Varnish. You can use a ban statement in VCL, use the ban command in the Varnish Command Line Interface (CLI), or issue the ban through the Varnish Controller.

Using VCL

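For good performance, it’s important to use a “ban-lurker friendly” expression, meaning one that references only obj.* properties. This VCL snippet therefore copies the request URL and Host header onto the object as response headers, so a ban can refer to them without using req.url or req.http.host.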

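sub vcl_recv {
	if (req.method == "BAN") {
		# The URL pattern is taken from the ban-url request header.
		if (!req.http.ban-url) {
			return (synth(400, "ban-url header required"));
		}
		# Match the stored host exactly and the stored URL as a regex.
		ban("obj.http.x-host == " + req.http.host + " && obj.http.x-url ~ " + req.http.ban-url);
		return (synth(200, "Ban added"));
	}
}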

sub vcl_backend_response {
	# Store URL and HOST in the cached response.
	set beresp.http.x-url = bereq.url;
	set beresp.http.x-host = bereq.http.host;
}

sub vcl_deliver {
	# Prevent the client from seeing these additional headers.
	unset resp.http.x-url;
	unset resp.http.x-host;
}

To keep things simple, this snippet only handles one invalidation scheme and doesn’t do access control. A more in-depth look at direct obj bans (including tests) is available here.
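Access control for BAN requests could be added along the same lines as the purge ACL shown earlier. The following is only a sketch: the ACL name ban_allowed is hypothetical, and the addresses are copied from the purge example.

```vcl
# Hypothetical ACL listing the clients allowed to issue bans.
acl ban_allowed {
	"localhost";
	"192.168.55.0"/24;
}

sub vcl_recv {
	if (req.method == "BAN") {
		# Reject clients outside the ACL before any ban is issued.
		if (client.ip !~ ban_allowed) {
			return (synth(403, "Forbidden"));
		}
	}
}
```

Varnish concatenates multiple definitions of the same subroutine in the order they appear, so this block needs to be placed before the vcl_recv that issues the ban. Allowed clients simply fall through to it.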

Configuration reload

Now you must reload the VCL configuration:

systemctl reload varnish

When using Varnish Controller, follow the guide to deploy VCL files here.

Ban request

Finally, you can issue HTTP BAN requests to ban content:

curl -X BAN "http://example.com/" -H "ban-url: ^/path/"

The above command bans every cached object for the host example.com whose URL matches the regular expression ^/path/, that is, everything under /path/.

Using CLI

Support for bans is built into Varnish and available in the CLI (Command Line Interface) via varnishadm. To ban every PNG object belonging to example.com, issue the following command from the shell:

varnishadm ban req.http.host == example.com '&&' req.url '~' '\\.png$'

Be sure to use the right regular expression syntax, as defined by PCRE rules. Note that in the example above, the quotes are required when running the command from the shell, and the escaped backslash in the regular expression is required by the Varnish CLI.

What is Ykey?

vmod_ykey is a Varnish module that adds tags, or secondary keys in Varnish jargon, to objects, allowing fast purging of all objects that match a given key. Like other VMODs, it is used from VCL.

The purge operation may be hard or soft. A hard purge immediately removes the matched objects from the cache completely. A soft purge expires the objects but keeps them around for their configured grace and keep periods: grace allows stale objects to be delivered to clients while the next fetch is in progress, and keep allows conditional fetches. In addition, the VMOD interfaces with the MSE stevedore, which persists the Ykey data structure on disk for persisted caches.

Using VCL

To use Ykey, import the ykey VMOD in your VCL configuration. The keys to associate with an object are specified by calling one or more of the VMOD’s key-adding functions (for example ykey.add_key or ykey.add_header) in vcl_backend_response.

This example adds all keys listed in the backend response header named Ykey and a custom one for all URLs starting with /content/image/:

import ykey;

sub vcl_backend_response {
	ykey.add_header(beresp.http.Ykey);
	if (bereq.url ~ "^/content/image/") {
		ykey.add_key("IMAGE");
	}
}

The following example creates a simple purge interface. If a header called Ykey-Purge is present, it will purge using ykey and the keys listed in the header. If not, fall back to regular purge:

import ykey;

# Access Control List to define which IPs
# can purge content
acl purge {
	"localhost";
	"192.168.55.0"/24;
}

sub vcl_recv {
	if (req.method == "PURGE") {
		# check the client against the "purge" ACL defined above
		if (client.ip !~ purge) {
			return (synth(403, "Forbidden"));
		}
		if (req.http.Ykey-Purge) {
			set req.http.n-gone = ykey.purge_header(req.http.Ykey-Purge);

			# or for soft purge:
			# set req.http.n-gone = ykey.purge_header(req.http.Ykey-Purge, soft=true);
			return (synth(200, "Invalidated "+req.http.n-gone+" objects"));
		} else {
			return (purge);
		}
	}
}

Configuration reload

Since you’ve made VCL changes, the Varnish service has to be reloaded, for example with systemctl reload varnish as in the previous sections.

Ykey request

To purge or soft purge one or more objects in cache, issue an HTTP request that includes the Ykey-Purge header. For example:

curl -X PURGE -H "Ykey-Purge: purging_key" "http://example.com/path/.*"

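The curl request will purge every object in cache that is tagged with the key purging_key, which is passed in the Ykey-Purge header. With the VCL above, the URL path of the request plays no part in selecting the objects.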

Using Varnish Broadcaster

The Varnish Broadcaster replicates requests to multiple Varnish caches from a single entry point. The goal is to facilitate purging/banning across multiple Varnish Cache instances. The broadcaster consists of a web server with a REST API, which will receive HTTP requests and distribute them to all configured caches.

Installation

The first step is to install Varnish Broadcaster. The full guide can be found here.

Configuration

To use the Broadcaster you have to define a configuration file under the path /etc/varnish/nodes.conf, which defines the nodes to broadcast against. The format of this file is similar to the INI format.

This example is from a valid configuration file and has two clusters (Europe/US), each with its own nodes:

# this is a comment
[Europe]
First = 1.2.3.4:9090
Second = 9.9.9.9:6081
Third = example.com

[US]
Alpha = http://[1::2]
Beta = 8.8.8.8

Service reload

Once you have a configuration file, start or reload the broadcaster service:

systemctl start broadcaster

Invalidation

Now that the Broadcaster is properly configured and running, you can issue PURGE and BAN HTTP requests to it the same way you would to a Varnish server. There’s one exception, though: for PURGE requests, you need to force the Host header in the curl request (-H "host: my.domain.com") to ensure the requests hash to the same objects you want to purge.

Using Varnish Controller

Bans, purges, and Ykey invalidations can be issued via Varnish Controller as well. It has a CLI, a REST API, and a graphical user interface that can be used to perform invalidation.

Varnish Controller uses a different approach to perform invalidation. The invalidation request is sent via one of the Controller’s interfaces (CLI/API/GUI). The actual invalidations then occur on each Varnish Cache server concurrently, for all the paths that should be invalidated. The Controller also monitors Varnish servers that are currently down and can perform a previously issued invalidation request once such a server is up and running again.

Invalidation examples and more information about using the Varnish Controller to invalidate can be found here.

Additional reading and notes

Check out the purge tutorial and the ban tutorial too.