Example VCL template
Varnish’s built-in VCL is very conservative and focuses on not caching stateful and personalized content. By following the HTTP standard for caching rules, Varnish is safe by default. Unfortunately, in many real-life situations where backend servers do not send good caching headers, this will result in a low hit rate.
In this tutorial we’ll present a collection of customizable VCL examples for Varnish. We’ll focus on the individual VCL template features and in the end we’ll bring it all together into a single VCL file.
1. Backend definition
The first step is to define the backend. This is the origin server where Varnish fetches the content from.
backend server1 {
.host = "127.0.0.1";
.port = "8080";
.max_connections = 100;
.probe = {
.request =
"HEAD / HTTP/1.1"
"Host: localhost"
"Connection: close"
"User-Agent: Varnish Health Probe";
.interval = 10s;
.timeout = 5s;
.window = 5;
.threshold = 3;
}
.connect_timeout = 5s;
.first_byte_timeout = 90s;
.between_bytes_timeout = 2s;
}
Backend connection information
Every backend definition has a name. In the example above this is server1. The .host and .port attributes contain the address and port number of your backend.
In the example above, the origin, which is probably an Apache or Nginx server, is hosted on the same machine as Varnish. Hence the IP address 127.0.0.1, which corresponds to localhost. Since HTTP clients connect to port 80, the conventional web server is set up to listen on a different port. In the example above, port 8080 is used, and the backend needs to be configured to listen on the same port.
The .host attribute supports both IP address and hostname values.
The .max_connections attribute will limit the number of simultaneous backend connections that Varnish establishes. If this limit is exceeded, backend requests will start failing.
Health probe
By defining a health probe, Varnish is made aware of the current health of the backend. The backend is probed at regular intervals and is considered healthy if the backend responds correctly often enough.
Backend health is used first and foremost when load balancing through the vmod_directors module. In these cases, a director load balances between several backends, each representing one server. Backends that are considered sick are not included in the load-balancing rotation.
In VCL, the function std.healthy(backend) can be used to check if a given backend is healthy.
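As a rough sketch of how a director and std.healthy() fit together, the fragment below load balances between two backends in round-robin fashion. The second backend, its port, the director name pool and the log message are assumptions made for illustration; they are not part of the template in this tutorial.

```vcl
vcl 4.1;

import directors;
import std;

# Placeholder backends; each one gets a minimal inline probe
# that relies on Varnish's default probe settings.
backend server1 {
    .host = "127.0.0.1";
    .port = "8080";
    .probe = { .url = "/"; }
}

backend server2 {
    .host = "127.0.0.1";
    .port = "8081";
    .probe = { .url = "/"; }
}

sub vcl_init {
    # Sick backends are skipped by the director automatically.
    new pool = directors.round_robin();
    pool.add_backend(server1);
    pool.add_backend(server2);
}

sub vcl_recv {
    set req.backend_hint = pool.backend();

    # For a director, std.healthy() reports whether any backend is usable.
    if (!std.healthy(req.backend_hint)) {
        std.log("No healthy backend left in pool");
    }
}
```

Logging instead of returning an error keeps the cache lookup intact, so objects that are still in cache can be served even when every backend is sick.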
In our sample VCL we send an HTTP HEAD request to the backend. This is done using the .request attribute, which allows us to send a raw HTTP request. This can include any valid HTTP request header.
A Host: localhost request header is sent by the health probe. If your origin web server is not configured to handle requests for the localhost hostname, please update the value of the Host header in the .request attribute.
The health probe is configured to send a request to the backend every ten seconds. This is done through the .interval attribute.
If the backend doesn’t respond in five seconds, the poll is considered unsuccessful. This is configured through the .timeout attribute.
If three out of five polling attempts fail, the backend is considered sick. If the backend comes online again, it is considered healthy if three of the last five polling attempts succeed.
The polling window is configured through the .window attribute, and the .threshold attribute defines how many polling attempts within that window must succeed for the backend to be considered healthy.
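Probes can also be declared once and referenced by name. The sketch below assumes a named probe called healthcheck (a hypothetical name) and uses the .url shorthand instead of a raw .request. The .expected_response and .initial attributes are standard probe attributes, but the values chosen here are illustrative only.

```vcl
# Named probe that any number of backends can reference.
probe healthcheck {
    .url = "/";                 # Varnish builds the probe request for this URL
    .expected_response = 200;   # any other status counts as a failed poll
    .interval = 10s;
    .timeout = 5s;
    .window = 5;
    .threshold = 3;
    .initial = 3;               # count 3 polls as good at startup, so the
                                # backend starts out healthy
}

backend server1 {
    .host = "127.0.0.1";
    .port = "8080";
    .probe = healthcheck;
}
```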
Timeouts
Varnish will wait for the backend to respond, but does impose timeouts to avoid excessive waiting times.
In the example above the .connect_timeout attribute is set to five seconds, which means Varnish will wait for a maximum of five seconds while attempting to connect to the backend.
The .first_byte_timeout attribute, which is set to 90 seconds in the example, refers to the amount of time Varnish waits after successfully opening a connection before the first byte is received.
And finally the .between_bytes_timeout specifies the maximum time to wait between bytes when reading the response. In this example it is set to two seconds.
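Timeouts can also be tuned for individual backend requests. As a minimal sketch, assuming a hypothetical /reports/ URL prefix that is known to respond slowly, bereq.first_byte_timeout can be raised in vcl_backend_fetch:

```vcl
sub vcl_backend_fetch {
    # "/reports/" is a made-up prefix used purely for illustration.
    if (bereq.url ~ "^/reports/") {
        # Allow this fetch to wait longer for the first byte
        # than the backend's .first_byte_timeout permits.
        set bereq.first_byte_timeout = 300s;
    }
}
```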
The default values for these timeouts are taken from the connect_timeout, first_byte_timeout and between_bytes_timeout runtime parameters. By specifying them as backend attributes, you have the ability to override them on a per-backend basis.
2. Purging ACL
One of the features of the sample VCL template is to allow HTTP-based content purging through the PURGE request method. It’s important to avoid unauthorized access to the purge logic.
Therefore, an ACL (Access Control List) should be defined that lists the IP addresses, hostnames and subnets that are allowed to purge content.
Here’s the VCL code that defines an ACL called purge and only allows access from the local server:
acl purge {
"localhost";
"127.0.0.1";
"::1";
}
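If purge requests also come from other machines, the ACL can list subnets and exclusions. The following variation of the ACL above is only a sketch: the 192.0.2.0/24 range is a documentation address block used as a placeholder, not a value from this tutorial.

```vcl
acl purge {
    "localhost";
    "127.0.0.1";
    "::1";
    "192.0.2.0"/24;     # placeholder subnet: every address in this range may purge
    ! "192.0.2.23";     # ...except this single address, which is denied
}
```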
3. Host header normalization
If your Varnish server listens for incoming connections on multiple ports or port numbers other than 80, the port number may end up in the Host header.
For requests to http://example.com:80, the port number will probably be stripped off by your web client, but for other port numbers that won’t be the case.
Because Varnish uses the Host header as part of the object hash, it makes sense to strip off the port number in VCL. Otherwise multiple hashes will be created for what ultimately is the same content. This will reduce the cache capacity and lower the hit rate.
The following VCL code removes the port number from the Host header to avoid this unwanted duplication:
sub vcl_recv {
set req.http.Host = regsub(req.http.Host, ":[0-9]+", "");
}
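One caveat with the pattern above is that it is not anchored, so for an IPv6 literal such as [::1]:8080 it would match the :1 inside the brackets before it reaches the port. A slightly more defensive variation, offered here as a sketch rather than as part of the template, anchors the match to the end of the header and skips requests without a Host header:

```vcl
sub vcl_recv {
    if (req.http.Host) {
        # Only strip a port number at the very end of the header value,
        # which leaves IPv6 literals such as [::1] intact.
        set req.http.Host = regsub(req.http.Host, ":[0-9]+$", "");
    }
}
```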
4. Httpoxy mitigation
Httpoxy is a set of vulnerabilities that affect certain applications through the Proxy request header.
The general advice is to remove this header to mitigate the impact of httpoxy.
Here’s the VCL code to do this:
sub vcl_recv {
unset req.http.proxy;
}
5. Sorting query string parameters
Varnish uses both the URL and the Host header to compose the hash that identifies objects in cache. As you’ve seen earlier, Host header normalization is required to avoid cache duplication. However, this also applies to the URL.
Query string parameters, as in /?id=1&gid=5, are also part of the URL, and although their order doesn’t impact the response, it does impact the hash.
That’s why we encourage you to alphabetically sort the query string parameters using the following VCL code:
import std;
sub vcl_recv {
set req.url = std.querysort(req.url);
}
Make sure to add import std; to your VCL file. Otherwise the std namespace is not available.
6. Stripping off a trailing question mark
Here’s another attempt at avoiding cache duplication: removing a trailing question mark.
If your URL is /?, the question mark indicates that query string parameters will follow. But if there are none, the trailing question mark serves no purpose.
The following VCL code will strip off the trailing question mark:
sub vcl_recv {
set req.url = regsub(req.url, "\?$", "");
}
7. Removing Google Analytics URL parameters
A lot of websites use Google Analytics to analyze their traffic. Specific query string parameters are added to the URL to track the user journey.
Although they are beneficial to those who analyze incoming traffic, they are responsible for cache duplication. Removing these parameters has no detrimental effect on the tracking: Google Analytics does not rely on server-side code and only uses client-side JavaScript.
Here’s how to remove the Google Analytics query string parameters:
sub vcl_recv {
if (req.url ~ "(\?|&)(utm_source|utm_medium|utm_campaign|utm_content|gclid|cx|ie|cof|siteurl)=") {
set req.url = regsuball(req.url, "&(utm_source|utm_medium|utm_campaign|utm_content|gclid|cx|ie|cof|siteurl)=([A-z0-9_\-\.%25]+)", "");
set req.url = regsuball(req.url, "\?(utm_source|utm_medium|utm_campaign|utm_content|gclid|cx|ie|cof|siteurl)=([A-z0-9_\-\.%25]+)", "?");
set req.url = regsub(req.url, "\?&", "?");
set req.url = regsub(req.url, "\?$", "");
}
}
8. Allow purging
Earlier on, we defined an ACL that contains the IP addresses, hostnames and subnets of the clients that are allowed to purge.
Now it’s time to match the client IP address to this ACL, prevent unauthorized clients from purging, and execute return(purge) for authorized clients.
Here’s the VCL code:
sub vcl_recv {
if (req.method == "PURGE") {
if (!client.ip ~ purge) {
return (synth(405, client.ip + " is not allowed to send PURGE requests."));
}
return (purge);
}
}
The !client.ip ~ purge statement will evaluate to true if the client IP address doesn’t match an entry from the purge ACL that we defined earlier. If that is the case, a synthetic response is returned with a 405 Method Not Allowed status.
If the request method equals PURGE and the client IP address matches the purge ACL, return(purge) is used to purge the requested object from the cache.
9. Dealing with websockets
Although websockets use the same TCP connection as the incoming HTTP request, websockets do not use the HTTP protocol to communicate. Since Varnish only supports HTTP, sending the request directly to the backend using the normal return(pass) procedure will not suffice.
Instead a return(pipe) is required: this opens up the TCP connection to the backend and sends the upgrade request. The websocket data that is returned will not be interpreted by Varnish, and no attempt is made to treat the response as HTTP.
This is the VCL code that is required to make this happen:
sub vcl_recv {
if (req.http.Upgrade ~ "(?i)websocket") {
return (pipe);
}
}
sub vcl_pipe {
if (req.http.upgrade) {
set bereq.http.upgrade = req.http.upgrade;
}
return (pipe);
}
10. Piping other non-HTTP content
Although the previous VCL snippet was tailored around websockets, there are other situations that warrant the use of return(pipe).
When the request method is not one of the following, we can conclude that we’re not dealing with a valid HTTP request:
GET
HEAD
PUT
POST
TRACE
OPTIONS
PATCH
DELETE
Here’s the VCL code to pipe content to the backend when the request method doesn’t match our expectations:
sub vcl_recv {
if (req.method != "GET" &&
req.method != "HEAD" &&
req.method != "PUT" &&
req.method != "POST" &&
req.method != "TRACE" &&
req.method != "OPTIONS" &&
req.method != "PATCH" &&
req.method != "DELETE") {
return (pipe);
}
}
If you know that your site only supports a subset of the above, you should consider synthesizing a response in Varnish indicating this:
sub vcl_recv {
if (req.method != "GET" &&
req.method != "HEAD" &&
req.method != "POST" &&
req.method != "OPTIONS") {
return (synth(405, "Method Not Allowed"));
}
}
sub vcl_synth {
if (resp.status == 405) {
set resp.http.Allow = "GET, HEAD, POST, OPTIONS";
set resp.body = "Method not allowed";
return (deliver);
}
}
11. Only cache GET and HEAD requests
Now that we’re certain that we’re dealing with actual HTTP requests, we can decide to bypass the cache for non-cacheable request methods.
Under normal circumstances, only GET and HEAD requests are cached. Other request methods, such as POST or DELETE, are designed to explicitly change the requested resource.
Why would you cache a resource that will be changed once it is called? That’s why the following VCL example will bypass the cache when a request method other than GET or HEAD is used:
sub vcl_recv {
if (req.method != "GET" && req.method != "HEAD") {
return (pass);
}
}
12. Remove tracking cookies
Varnish’s built-in VCL behavior bypasses the cache when a Cookie request header is detected. As mentioned at the start of this tutorial: Varnish is very conservative when it comes to its standard behavior.
Because cookies imply a level of personalization, caching private data is tricky without knowing the purpose of these cookies.
However, tracking cookies aren’t really an issue: they are handled by the browser and can safely be removed by Varnish to ensure a better hit rate.
Google Analytics
The following VCL code will remove tracking cookies set by Google Analytics:
sub vcl_recv {
set req.http.Cookie = regsuball(req.http.Cookie, "(__utm|_ga|_opt)[a-z_]*=[^;]+(; )?", "");
}
HubSpot
The following VCL code will remove tracking cookies set by HubSpot:
sub vcl_recv {
set req.http.Cookie = regsuball(req.http.Cookie, "(__)?hs[a-z_\-]+=[^;]+(; )?", "");
set req.http.Cookie = regsuball(req.http.Cookie, "hubspotutk=[^;]+(; )?", "");
}
Hotjar
The following VCL code will remove tracking cookies set by Hotjar:
sub vcl_recv {
set req.http.Cookie = regsuball(req.http.Cookie, "_hj[a-zA-Z]+=[^;]+(; )?", "");
}
Google advertising products
The following VCL code will remove tracking cookies set by Google advertising products:
sub vcl_recv {
set req.http.Cookie = regsuball(req.http.Cookie, "(NID|DSID|__gads|GED_PLAYLIST_ACTIVITY|ACLK_DATA|ANID|AID|IDE|TAID|_gcl_[a-z]*|FLC|RUL|PAIDCONTENT|1P_JAR|Conversion|VISITOR_INFO1[a-z_]*)=[^;]+(; )?", "");
}
Other tracking cookies
The list of tracking cookies that was featured in this tutorial only covers some of the big analytics and advertising tools. You might use other tools, which use other cookies.
If your website features other tracking cookies, please strip them off using the code below:
sub vcl_recv {
set req.http.Cookie = regsuball(req.http.Cookie, "cookiename=[^;]+(; )?", "");
}
Replace cookiename with the name of the tracking cookie you want to remove, or a regular expression pattern that matches multiple names.
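To illustrate the regular expression variant, the sketch below removes two cookies in a single pass. The names tracker_a and tracker_b are placeholders invented for this example; substitute the cookie names your own tools set.

```vcl
sub vcl_recv {
    # tracker_a and tracker_b are hypothetical cookie names.
    # The alternation removes either cookie, including its value
    # and the "; " separator that may follow it.
    set req.http.Cookie = regsuball(req.http.Cookie, "(tracker_a|tracker_b)=[^;]+(; )?", "");
}
```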
Remove semicolon prefix
After you have stripped off all tracking cookies, you might be left with a ; prefix as the remaining value of the Cookie header. If that is the case, it needs to be removed as well:
sub vcl_recv {
set req.http.Cookie = regsuball(req.http.Cookie, "^;\s*", "");
}
Remove empty cookies
When stripping off tracking cookies results in an empty string, or a collection of whitespace characters, you can safely remove the entire Cookie header as illustrated below:
sub vcl_recv {
if (req.http.cookie ~ "^\s*$") {
unset req.http.cookie;
}
}
Being able to remove the entire Cookie header means that there are no functional cookies left and that the entire page is cacheable.
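An alternative to removing known tracking cookies one by one is to keep only the cookies your application actually needs. The following is a sketch under two assumptions: that your Varnish version ships vmod_cookie, and that PHPSESSID is the only functional cookie. Neither comes from this tutorial, so adjust the cookie name to your application.

```vcl
import cookie;

sub vcl_recv {
    if (req.http.Cookie) {
        # Parse the header, drop every cookie that is not on the keep list
        # (PHPSESSID is an assumed name), and write the result back.
        cookie.parse(req.http.Cookie);
        cookie.keep("PHPSESSID");
        set req.http.Cookie = cookie.get_string();

        # Nothing left means the request carries no functional cookies.
        if (req.http.Cookie == "") {
            unset req.http.Cookie;
        }
    }
}
```

The trade-off is that an allowlist has to be updated whenever the application introduces a new functional cookie, whereas the removal approach has to be updated whenever a new tracking tool is added.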
13. Setting the X-Forwarded-Proto header
The X-Forwarded-Proto header is used to transmit the request protocol that was used by the client. This is useful information to Varnish because the open-source version of Varnish doesn’t support native TLS.
The lack of native TLS support is circumvented by the use of a TLS proxy in front of Varnish. The TLS proxy that handles the HTTPS connection should add an X-Forwarded-Proto header to indicate whether or not HTTPS was used as the request protocol.
If that header wasn’t set, the VCL code below will set it to either https or http depending on the protocol that was used for the request:
import std;
sub vcl_recv {
if (!req.http.X-Forwarded-Proto) {
if(std.port(server.ip) == 443 || std.port(server.ip) == 8443) {
set req.http.X-Forwarded-Proto = "https";
} else {
set req.http.X-Forwarded-Proto = "http";
}
}
}
sub vcl_hash {
hash_data(req.http.X-Forwarded-Proto);
}
Without the vcl_hash addition, Varnish has no awareness of the requested protocol and will serve the same object for HTTP and HTTPS requests. This might lead to mixed content or even a redirect loop when an HTTP to HTTPS redirection ends up getting cached.
That’s why it makes sense to add the value of the X-Forwarded-Proto header to the hash when requesting the object from cache. This will ensure that each page has an HTTP version and an HTTPS version stored in cache.
13. Caching static content
By now you should be aware that Varnish doesn’t cache when a Cookie request header is presented unless you write the appropriate VCL.
But for static content, such as images, CSS files, JavaScript files and other similar resources, we can force them to be cached. This also means stripping off any potential cookies.
The following VCL snippet will force caching for requests matching the following file extensions:
7z
avi
bmp
bz2
css
csv
doc
docx
eot
flac
flv
gif
gz
ico
jpeg
jpg
js
less
mka
mkv
mov
mp3
mp4
mpeg
mpg
odt
ogg
ogm
opus
otf
pdf
png
ppt
pptx
rar
rtf
svg
svgz
swf
tar
tbz
tgz
ttf
txt
txz
wav
webm
webp
woff
woff2
xls
xlsx
xml
xz
zip
sub vcl_recv {
if (req.url ~ "^[^?]*\.(7z|avi|bmp|bz2|css|csv|doc|docx|eot|flac|flv|gif|gz|ico|jpeg|jpg|js|less|mka|mkv|mov|mp3|mp4|mpeg|mpg|odt|ogg|ogm|opus|otf|pdf|png|ppt|pptx|rar|rtf|svg|svgz|swf|tar|tbz|tgz|ttf|txt|txz|wav|webm|webp|woff|woff2|xls|xlsx|xml|xz|zip)(\?.*)?$") {
unset req.http.Cookie;
unset req.http.Authorization;
# Only keep the following if VCL handling is complete
return(hash);
}
}
sub vcl_backend_response {
if (bereq.url ~ "^[^?]*\.(7z|avi|bmp|bz2|css|csv|doc|docx|eot|flac|flv|gif|gz|ico|jpeg|jpg|js|less|mka|mkv|mov|mp3|mp4|mpeg|mpg|odt|ogg|ogm|opus|otf|pdf|png|ppt|pptx|rar|rtf|svg|svgz|swf|tar|tbz|tgz|ttf|txt|txz|wav|webm|webp|woff|woff2|xls|xlsx|xml|xz|zip)(\?.*)?$") {
unset beresp.http.Set-Cookie;
set beresp.ttl = 1d;
}
}
This means that the Cookie and Authorization request headers are stripped off and the request is immediately looked up in cache. In case of a cache miss, a potential Set-Cookie response header is stripped off as well and the Time-To-Live of the object is set to a day.
14. ESI support
When your application supports Edge Side Includes (ESI), we need to write some VCL code to enable ESI parsing.
The idea is that we announce ESI support through the Surrogate-Capability: key=ESI/1.0 request header. When the application notices this, it should send a corresponding Surrogate-Control header that also contains ESI/1.0. When Varnish sees this Surrogate-Control response header, it will enable ESI parsing as illustrated below:
sub vcl_recv {
set req.http.Surrogate-Capability = "key=ESI/1.0";
}
sub vcl_backend_response {
if (beresp.http.Surrogate-Control ~ "ESI/1.0") {
unset beresp.http.Surrogate-Control;
set beresp.do_esi = true;
}
}
15. Setting grace mode
When content has expired from the cache, Varnish needs to revalidate that content with the origin server. This usually involves a backend fetch, and the client that requested the content has to wait until the origin responds. This could potentially be detrimental to the user experience.
Thanks to grace mode, Varnish can asynchronously revalidate content while serving the stale version to the client.
The standard grace value is set to ten seconds, which means content will be asynchronously revalidated until an object is ten seconds past the expiration of its Time-To-Live.
If you want to change this value in your VCL file, you can use the following code:
import std;
sub vcl_recv {
if (std.healthy(req.backend_hint)) {
set req.grace = 10s;
}
}
sub vcl_backend_response {
set beresp.grace = 6h;
}
This example will set the grace value to six hours, but will only use ten seconds of grace at request time when the backend is healthy.
If the origin server is down, Varnish will still serve an outdated version of the object for another six hours, which can potentially be a lifesaver. When the origin recovers, the grace value of ten seconds will be used again.
16. Putting it all together
Here’s the all-in-one VCL file that has all the previous snippets:
vcl 4.1;
import std;
backend server1 {
.host = "127.0.0.1";
.port = "8080";
.max_connections = 100;
.probe = {
.request =
"HEAD / HTTP/1.1"
"Host: localhost"
"Connection: close"
"User-Agent: Varnish Health Probe";
.interval = 10s;
.timeout = 5s;
.window = 5;
.threshold = 3;
}
.connect_timeout = 5s;
.first_byte_timeout = 90s;
.between_bytes_timeout = 2s;
}
acl purge {
"localhost";
"127.0.0.1";
"::1";
}
sub vcl_recv {
set req.http.Host = regsub(req.http.Host, ":[0-9]+", "");
unset req.http.proxy;
set req.url = std.querysort(req.url);
set req.url = regsub(req.url, "\?$", "");
set req.http.Surrogate-Capability = "key=ESI/1.0";
if (std.healthy(req.backend_hint)) {
set req.grace = 10s;
}
if (!req.http.X-Forwarded-Proto) {
if(std.port(server.ip) == 443 || std.port(server.ip) == 8443) {
set req.http.X-Forwarded-Proto = "https";
} else {
set req.http.X-Forwarded-Proto = "http";
}
}
if (req.http.Upgrade ~ "(?i)websocket") {
return (pipe);
}
if (req.url ~ "(\?|&)(utm_source|utm_medium|utm_campaign|utm_content|gclid|cx|ie|cof|siteurl)=") {
set req.url = regsuball(req.url, "&(utm_source|utm_medium|utm_campaign|utm_content|gclid|cx|ie|cof|siteurl)=([A-z0-9_\-\.%25]+)", "");
set req.url = regsuball(req.url, "\?(utm_source|utm_medium|utm_campaign|utm_content|gclid|cx|ie|cof|siteurl)=([A-z0-9_\-\.%25]+)", "?");
set req.url = regsub(req.url, "\?&", "?");
set req.url = regsub(req.url, "\?$", "");
}
if (req.method == "PURGE") {
if (!client.ip ~ purge) {
return (synth(405, client.ip + " is not allowed to send PURGE requests."));
}
return (purge);
}
if (req.method != "GET" &&
req.method != "HEAD" &&
req.method != "PUT" &&
req.method != "POST" &&
req.method != "TRACE" &&
req.method != "OPTIONS" &&
req.method != "PATCH" &&
req.method != "DELETE") {
return (pipe);
}
if (req.method != "GET" && req.method != "HEAD") {
return (pass);
}
if (req.url ~ "^[^?]*\.(7z|avi|bmp|bz2|css|csv|doc|docx|eot|flac|flv|gif|gz|ico|jpeg|jpg|js|less|mka|mkv|mov|mp3|mp4|mpeg|mpg|odt|ogg|ogm|opus|otf|pdf|png|ppt|pptx|rar|rtf|svg|svgz|swf|tar|tbz|tgz|ttf|txt|txz|wav|webm|webp|woff|woff2|xls|xlsx|xml|xz|zip)(\?.*)?$") {
unset req.http.Cookie;
unset req.http.Authorization;
return(hash);
}
set req.http.Cookie = regsuball(req.http.Cookie, "(__utm|_ga|_opt)[a-z_]*=[^;]+(; )?", "");
set req.http.Cookie = regsuball(req.http.Cookie, "(__)?hs[a-z_\-]+=[^;]+(; )?", "");
set req.http.Cookie = regsuball(req.http.Cookie, "hubspotutk=[^;]+(; )?", "");
set req.http.Cookie = regsuball(req.http.Cookie, "_hj[a-zA-Z]+=[^;]+(; )?", "");
set req.http.Cookie = regsuball(req.http.Cookie, "(NID|DSID|__gads|GED_PLAYLIST_ACTIVITY|ACLK_DATA|ANID|AID|IDE|TAID|_gcl_[a-z]*|FLC|RUL|PAIDCONTENT|1P_JAR|Conversion|VISITOR_INFO1[a-z_]*)=[^;]+(; )?", "");
set req.http.Cookie = regsuball(req.http.Cookie, "^;\s*", "");
if (req.http.cookie ~ "^\s*$") {
unset req.http.cookie;
}
}
sub vcl_hash {
hash_data(req.http.X-Forwarded-Proto);
}
sub vcl_backend_response {
if (bereq.url ~ "^[^?]*\.(7z|avi|bmp|bz2|css|csv|doc|docx|eot|flac|flv|gif|gz|ico|jpeg|jpg|js|less|mka|mkv|mov|mp3|mp4|mpeg|mpg|odt|ogg|ogm|opus|otf|pdf|png|ppt|pptx|rar|rtf|svg|svgz|swf|tar|tbz|tgz|ttf|txt|txz|wav|webm|webp|woff|woff2|xls|xlsx|xml|xz|zip)(\?.*)?$") {
unset beresp.http.Set-Cookie;
set beresp.ttl = 1d;
}
if (beresp.http.Surrogate-Control ~ "ESI/1.0") {
unset beresp.http.Surrogate-Control;
set beresp.do_esi = true;
}
set beresp.grace = 6h;
}