Configuring Varnish for Drupal
Drupal is an open-source content management framework that is popular for mid-market and enterprise solutions that require more complex content types and workflows.
This tutorial is a step-by-step guide on how to configure Varnish for Drupal.
1. Install and configure Varnish
If you are already running a Drupal CMS and you want to use Varnish to accelerate it, you’ll have to decide where to install Varnish:
- You can install Varnish on a dedicated machine and point your DNS records to that server.
- You can install Varnish on the same server as your Drupal site.
For a detailed step-by-step Varnish installation guide, we’d like to refer you to one of the following dedicated tutorials:
- Installing Varnish on Ubuntu
- Installing Varnish on Debian
- Installing Varnish on CentOS
- Installing Varnish on Red Hat Enterprise Linux
2. Reconfigure the web server port
The web server that is hosting your Drupal CMS is most likely set up to handle incoming HTTP requests on port 80. For Varnish caching to work properly, Varnish needs to listen on port 80, which means your web server has to move to another listening port. We’ll use port 8080 as the new web server listening port.
Depending on the type of web server you’re using, different configuration files need to be modified. Here’s a quick how-to for Apache and Nginx.
Apache
If you’re using Apache as your web server, you need to replace Listen 80 with Listen 8080 in Apache’s main configuration file.
The individual virtual hosts will also contain port information. You will need to replace <VirtualHost *:80> with <VirtualHost *:8080> in all virtual host files.
Here’s how to change Apache’s listening port for various Linux distributions:
- Change Apache’s listening port on Ubuntu
- Change Apache’s listening port on Debian
- Change Apache’s listening port on CentOS
- Change Apache’s listening port on Red Hat Enterprise Linux
These changes will only take effect once Apache is restarted.
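The Apache replacements described above can be scripted. The following is a minimal sketch, not a definitive procedure: the function name switch_apache_port is hypothetical, the configuration file path depends on your distribution, and the patterns only cover the plain Listen 80 and <VirtualHost ...:80> forms (a line such as Listen 0.0.0.0:80 would need its own rule). A .bak copy of the file is kept.

```shell
# Hypothetical helper: move an Apache config file from port 80 to port 8080.
# Usage: switch_apache_port /path/to/config-file
switch_apache_port() {
  conf_file="$1"
  # -i.bak edits in place and keeps a backup next to the original file.
  sed -i.bak -E \
    -e 's/^([[:space:]]*Listen[[:space:]]+)80[[:space:]]*$/\18080/' \
    -e 's/^([[:space:]]*<VirtualHost[[:space:]]+[^:>]*:)80>/\18080>/' \
    "$conf_file"
}
```

Run it once per file that contains a Listen directive or a virtual host definition, and review the result before restarting Apache.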
Nginx
If you’re using Nginx, you’ll only have to replace listen 80; with listen 8080; in all virtual host files.
Here’s how to change Nginx’s listening port for various Linux distributions:
- Change the Nginx listening port on Ubuntu
- Change the Nginx listening port on Debian
- Change the Nginx listening port on CentOS
- Change the Nginx listening port on Red Hat Enterprise Linux
These changes will only take effect once Nginx is restarted.
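The Nginx change can be scripted in the same spirit. This is a sketch under assumptions: the function name switch_nginx_port is hypothetical, the virtual host file locations vary per distribution, and the pattern only rewrites listen directives whose port is exactly 80, optionally prefixed by [::]: for IPv6.

```shell
# Hypothetical helper: move an Nginx virtual host file from port 80 to port 8080.
# Usage: switch_nginx_port /path/to/vhost-file
switch_nginx_port() {
  conf_file="$1"
  # Matches "listen 80;", "listen 80 default_server;" and "listen [::]:80;".
  # Other ports such as 443 or 8080 are left untouched.
  sed -i.bak -E \
    -e 's/^([[:space:]]*listen[[:space:]]+(\[::\]:)?)80([[:space:];])/\18080\3/' \
    "$conf_file"
}
```

Running nginx -t afterwards is a cheap way to confirm the edited configuration still parses.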
3. Install Drupal purging modules
Drupal has a collection of modules that can be used to invalidate the cache. For Drupal to support Varnish, the following modules need to be installed:
- Purge
- Generic HTTP Purger
The Purge module has the following set of submodules that should also be enabled, depending on your preferences:
- Purge Drush
- Purge Tokens
- Purge UI
- Purge Cron processor
- Purge Late runtime processor
- Purge Core tags queuer
The Generic HTTP Purger module also has a Generic HTTP Tags Header submodule that needs to be enabled.
You can download these modules yourself, or install them from the /admin/modules panel. However, the quickest way to install these modules is by using the following commands:
composer require drupal/purge drupal/purge_purger_http
This command will install the required dependencies via Composer, the package manager for PHP.
drush en purge_drush \
purge_processor_lateruntime \
purge_queuer_coretags \
purge_processor_cron \
purge_tokens \
purge_ui \
purge \
purge_purger_http \
purge_purger_http_tagsheader
This command will enable the required modules in Drupal.
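To confirm that every module from the drush en command is actually enabled, a small check like the one below may help. It is a sketch: the function name missing_purge_modules is hypothetical, and it expects a list of enabled module machine names, one per line, on standard input.

```shell
# Machine names taken from the drush en command above.
REQUIRED_MODULES="purge purge_ui purge_drush purge_tokens
purge_processor_cron purge_processor_lateruntime purge_queuer_coretags
purge_purger_http purge_purger_http_tagsheader"

# Hypothetical helper: print each required module that is not in the
# list of enabled modules read from stdin (one machine name per line).
missing_purge_modules() {
  enabled="$(cat)"
  for module in $REQUIRED_MODULES; do
    # -x matches whole lines only, so "purge" does not match "purge_ui".
    printf '%s\n' "$enabled" | grep -qx "$module" || printf '%s\n' "$module"
  done
}
```

Assuming your Drush version supports the list output format for pm:list, the input could come from drush pm:list --type=module --status=enabled --format=list | missing_purge_modules. Empty output means nothing is missing.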
4. Configure caching and purging in Drupal
The Performance section of the Drupal Administration Configuration allows you to tune caching and cache invalidation settings.
Set cache Time To Live
Before we can configure how Drupal invalidates objects from the cache, we must first ensure the objects are properly stored in cache.
Please follow these steps to configure the caching Time To Live:
- Go to the Drupal admin panel
- Select Configuration > Performance
- Choose a Browser and proxy cache maximum age value
- Click the Save configuration button at the bottom of the window
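Once a maximum age is saved, Drupal advertises it to Varnish and browsers through the Cache-Control response header, for example Cache-Control: max-age=900, public for a 15 minute setting. The sketch below extracts that value from a set of response headers so you can check that the setting took effect. The function name cache_max_age is hypothetical, and the URL in the usage note is an assumption about where your site answers.

```shell
# Hypothetical helper: read HTTP response headers on stdin and print the
# max-age value of the first Cache-Control header that carries one.
cache_max_age() {
  awk '
    tolower($0) ~ /^cache-control:/ && match(tolower($0), /max-age=[0-9]+/) {
      # Skip the 8 characters of "max-age=" and print only the number.
      print substr($0, RSTART + 8, RLENGTH - 8)
      exit
    }
  '
}
```

A possible invocation is curl -sI http://localhost:8080/ | cache_max_age, run as an anonymous visitor. No output suggests the page is not being marked as cacheable.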
Configuring tag-based cache invalidation
Drupal uses a Purge-Cache-Tags response header to register tags for every page. These tags are stored with the cached object and can be matched by a ban expression passed to Varnish’s ban() VCL function. A single ban can therefore invalidate multiple pages at once.
For this to work, the Purge module needs to be configured. This can be done by following these steps:
- Go to the Drupal admin panel
- Select Configuration > Performance
- Click the Purge tab
- Click the Add purger button to add the HTTP Purger
- Select the HTTP purger option
- Click Add
- Select the Configure dropdown option next to the newly created HTTP Purger
- Assign a name to the new purger (e.g. Varnish - Tag)
- Keep Tag as the selected value of the Type field
- Ensure the Request tab is selected
- Set the hostname of your Varnish server in the Hostname field (defaults to localhost)
- Set the port number of your Varnish server in the Port field (defaults to 80)
- Keep / as the value of the Path field
- Keep BAN as the selected value of the Request Method field
- Keep http as the selected value of the Scheme field
- Select the Headers tab
- Add a new header by setting Purge-Cache-Tags as the HEADER field value
- Set [invalidation:expression] as the value of the VALUE field
- Click the Save configuration button at the bottom of the window
If Varnish runs on a different machine, change the localhost value in the Hostname field to the hostname of your Varnish server and change the port number in the Port field accordingly.
When you run the following command, all pages that require an update will be purged from the Varnish cache:
drush p-queue-work
The invalidation will also take place automatically through the Late runtime processor that was enabled earlier. There is also a Cron processor that processes the purge queue at set intervals.
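To see what the tag purger sends, you can issue the same kind of request by hand. The sketch below builds a BAN request with a Purge-Cache-Tags header, mirroring the purger settings listed above. The function name varnish_ban_tag, the DRY_RUN switch, and the node:1 tag in the usage note are illustrative assumptions, and the request only succeeds from an address that the Varnish configuration allows to ban.

```shell
# Hypothetical helper: send a BAN request for one cache tag to Varnish.
# With DRY_RUN=1 the curl command is printed instead of executed.
# Usage: varnish_ban_tag TAG [HOST] [PORT]
varnish_ban_tag() {
  tag="$1"
  host="${2:-localhost}"   # same default as the purger Hostname field
  port="${3:-80}"          # same default as the purger Port field
  set -- curl -i -X BAN -H "Purge-Cache-Tags: ${tag}" "http://${host}:${port}/"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

For example, varnish_ban_tag node:1 asks Varnish to invalidate every cached object whose Purge-Cache-Tags header matches node:1. Keep in mind that a ban expression using the ~ operator treats the value as a regular expression, so a short tag can match more objects than intended.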
Allow purging all items from the cache
When content on pages changes, the Late runtime processor, the Cron processor, or drush p-queue-work will ensure that these changes result in the right cache invalidation calls: only the affected content will be purged from the cache.
However, there are situations where you want the entire cache to be flushed. The Purge module allows you to do this through the following command:
drush p:invalidate everything
For this to work, the everything invalidation type needs to be configured. The following steps will help you configure this:
- Go to the Drupal admin panel
- Select Configuration > Performance
- Click the Purge tab
- Click the Add purger button to add the HTTP Purger
- Select the HTTP purger option
- Click Add
- Select the Configure dropdown option next to the newly created HTTP Purger
- Assign a name to the new purger (e.g. Varnish - Everything)
- Select Everything as the value of the Type field
- Ensure the Request tab is selected
- Set the hostname of your Varnish server in the Hostname field (defaults to localhost)
- Set the port number of your Varnish server in the Port field (defaults to 80)
- Keep / as the value of the Path field
- Keep BAN as the selected value of the Request Method field
- Keep http as the selected value of the Scheme field
- Select the Headers tab
- Add a new header by setting Purge-Cache-Tags as the HEADER field value
- Set .* as the value of the VALUE field
- Click the Save configuration button at the bottom of the window
If Varnish runs on a different machine, change the localhost value in the Hostname field to the hostname of your Varnish server and change the port number in the Port field accordingly.
The drush p:invalidate everything command will remove all Drupal pages from the cache as soon as the right purger configuration is created.
5. Deploy the custom Drupal VCL
A custom VCL file containing the necessary caching rules is needed to guarantee decent performance. This file is located at /etc/varnish/default.vcl and also contains the backend definition that Varnish uses to connect to the web server.
The Drupal VCL file
Here’s the complete VCL file you can use:
vcl 4.1;

import std;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

# Add hostnames, IP addresses and subnets that are allowed to purge content
acl purge {
    "localhost";
    "127.0.0.1";
    "::1";
}

sub vcl_recv {
    # Announce support for Edge Side Includes by setting the Surrogate-Capability header
    set req.http.Surrogate-Capability = "Varnish=ESI/1.0";

    # Remove empty query string parameters
    # e.g.: www.example.com/index.html?
    if (req.url ~ "\?$") {
        set req.url = regsub(req.url, "\?$", "");
    }

    # Remove port number from host header
    set req.http.Host = regsub(req.http.Host, ":[0-9]+", "");

    # Sorts query string parameters alphabetically for cache normalization purposes.
    set req.url = std.querysort(req.url);

    # Remove the proxy header to mitigate the httpoxy vulnerability
    # See https://httpoxy.org/
    unset req.http.proxy;

    # Add X-Forwarded-Proto header when using https
    if (!req.http.X-Forwarded-Proto) {
        if(std.port(server.ip) == 443 || std.port(server.ip) == 8443) {
            set req.http.X-Forwarded-Proto = "https";
        } else {
            set req.http.X-Forwarded-Proto = "http";
        }
    }

    # Ban logic to remove multiple objects from the cache at once. Tailored to Drupal's cache invalidation mechanism
    if(req.method == "BAN") {
        if(!client.ip ~ purge) {
            return(synth(405, "BAN not allowed for this IP address"));
        }
        if (req.http.Purge-Cache-Tags) {
            ban("obj.http.Purge-Cache-Tags ~ " + req.http.Purge-Cache-Tags);
        }
        else {
            ban("obj.http.x-url ~ " + req.url + " && obj.http.x-host == " + req.http.host);
        }
        return (synth(200, "Ban added."));
    }

    # Purge logic to remove objects from the cache
    if(req.method == "PURGE") {
        if(!client.ip ~ purge) {
            return(synth(405,"PURGE not allowed for this IP address"));
        }
        return (purge);
    }

    # Only handle relevant HTTP request methods
    if (
        req.method != "GET" &&
        req.method != "HEAD" &&
        req.method != "PUT" &&
        req.method != "POST" &&
        req.method != "PATCH" &&
        req.method != "TRACE" &&
        req.method != "OPTIONS" &&
        req.method != "DELETE"
    ) {
        return (pipe);
    }

    # Remove tracking query string parameters used by analytics tools
    if (req.url ~ "(\?|&)(utm_source|utm_medium|utm_campaign|utm_content|gclid|cx|ie|cof|siteurl)=") {
        set req.url = regsuball(req.url, "&(utm_source|utm_medium|utm_campaign|utm_content|gclid|cx|ie|cof|siteurl)=([A-z0-9_\-\.%25]+)", "");
        set req.url = regsuball(req.url, "\?(utm_source|utm_medium|utm_campaign|utm_content|gclid|cx|ie|cof|siteurl)=([A-z0-9_\-\.%25]+)", "?");
        set req.url = regsub(req.url, "\?&", "?");
        set req.url = regsub(req.url, "\?$", "");
    }

    # Only cache GET and HEAD requests
    if ((req.method != "GET" && req.method != "HEAD") || req.http.Authorization) {
        return(pass);
    }

    # Mark static files with the X-Static-File header, and remove any cookies
    # X-Static-File is also used in vcl_backend_response to identify static files
    if (req.url ~ "^[^?]*\.(7z|avi|bmp|bz2|css|csv|doc|docx|eot|flac|flv|gif|gz|ico|jpeg|jpg|js|less|mka|mkv|mov|mp3|mp4|mpeg|mpg|odt|ogg|ogm|opus|otf|pdf|png|ppt|pptx|rar|rtf|svg|svgz|swf|tar|tbz|tgz|ttf|txt|txz|wav|webm|webp|woff|woff2|xls|xlsx|xml|xz|zip)(\?.*)?$") {
        set req.http.X-Static-File = "true";
        unset req.http.Cookie;
        return(hash);
    }

    # Don't cache the following pages
    if (req.url ~ "^/status.php$" ||
        req.url ~ "^/update.php$" ||
        req.url ~ "^/cron.php$" ||
        req.url ~ "^/admin$" ||
        req.url ~ "^/admin/.*$" ||
        req.url ~ "^/flag/.*$" ||
        req.url ~ "^.*/ajax/.*$" ||
        req.url ~ "^.*/ahah/.*$") {
        return (pass);
    }

    # Remove all cookies except the session & NO_CACHE cookies
    if (req.http.Cookie) {
        set req.http.Cookie = ";" + req.http.Cookie;
        set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
        set req.http.Cookie = regsuball(req.http.Cookie, ";(S?SESS[a-z0-9]+|NO_CACHE)=", "; \1=");
        set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
        set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");

        if (req.http.cookie ~ "^\s*$") {
            unset req.http.cookie;
        } else {
            return(pass);
        }
    }

    return(hash);
}

sub vcl_hash {
    # Create cache variations depending on the request protocol
    hash_data(req.http.X-Forwarded-Proto);
}

sub vcl_backend_response {
    # Inject URL & Host header into the object for asynchronous banning purposes
    set beresp.http.x-url = bereq.url;
    set beresp.http.x-host = bereq.http.host;

    # Serve stale content for 2 minutes after object expiration
    # Perform asynchronous revalidation while stale content is served
    set beresp.grace = 120s;

    # If the file is marked as static we cache it for 1 day
    if (bereq.http.X-Static-File == "true") {
        unset beresp.http.Set-Cookie;
        set beresp.ttl = 1d;
    }

    # If we don't get a Cache-Control header from the backend
    # we default to 1h cache for all objects
    if (!beresp.http.Cache-Control) {
        set beresp.ttl = 1h;
    }

    # Parse Edge Side Include tags when the Surrogate-Control header contains ESI/1.0
    if (beresp.http.Surrogate-Control ~ "ESI/1.0") {
        unset beresp.http.Surrogate-Control;
        set beresp.do_esi = true;
    }
}

sub vcl_deliver {
    # Cleanup of headers
    unset resp.http.x-url;
    unset resp.http.x-host;
    unset req.http.X-Static-File;
}
Customizations
The VCL file for Drupal contains two sections that might require some customization:
- The backend definition
- The access control list (ACL)
Here’s the backend definition:
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}
This backend definition makes Varnish connect to 127.0.0.1 on port 8080 when content needs to be fetched. Assuming that Varnish is hosted on the same server as your Drupal CMS, this value can be left unchanged.
If you’re hosting Drupal on another server or on another port, you’ll need to modify the .host and .port values.
The current values inside the purge access control list (ACL) all refer to the local machine, as you can see in the snippet below:
acl purge {
    "localhost";
    "127.0.0.1";
    "::1";
}
Again, assuming that Drupal and Varnish are hosted on the same machine, these values can be left untouched. If your Drupal site is hosted on another machine, the right IP address needs to be added to the list.
If invalidations happen from external locations, the IP addresses, the IP ranges, or the hostnames of these locations have to be added to the ACL.
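After editing the backend definition or the ACL, it is worth checking that the VCL still compiles before restarting anything. The sketch below relies on varnishd -C, which compiles the VCL file given with -f and exits without starting a cache. The function name check_vcl is hypothetical, and depending on your setup the command may need elevated privileges.

```shell
# Hypothetical helper: check that a VCL file compiles.
# Usage: check_vcl [/path/to/file.vcl]
check_vcl() {
  vcl_file="${1:-/etc/varnish/default.vcl}"
  if ! command -v varnishd >/dev/null 2>&1; then
    echo "varnishd not found in PATH" >&2
    return 127
  fi
  # -C compiles the VCL and exits; the generated output is discarded,
  # compiler error messages stay visible on stderr.
  if varnishd -C -f "$vcl_file" >/dev/null; then
    echo "VCL OK: $vcl_file"
  else
    echo "VCL failed to compile: $vcl_file" >&2
    return 1
  fi
}
```

Calling check_vcl without arguments checks /etc/varnish/default.vcl. A non-zero exit status means the file should be fixed before Varnish is restarted.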
6. Restart the services
If you’re using Apache as a web server, you’ll run the following command to restart it (on CentOS and Red Hat Enterprise Linux the service is named httpd instead of apache2):
sudo systemctl restart apache2
If you’re using Nginx instead, please run the following command to restart your web server:
sudo systemctl restart nginx
And finally, you’ll have to run the following command to restart Varnish:
sudo systemctl restart varnish
After the restart, your web server will accept traffic on port 8080 and Varnish will handle HTTP traffic on port 80. The restart will also ensure the right VCL file is loaded, so that requests for your Drupal CMS can be properly cached.
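A quick way to confirm that Varnish is answering on port 80 is to look at the X-Varnish response header. Varnish puts the ID of the current request in that header, and on a cache hit it adds the ID of the request that originally stored the object, so two numbers indicate a hit. The sketch below classifies a set of response headers on that basis. The function name varnish_cache_status is hypothetical, and the check assumes the X-Varnish header is not stripped by your configuration.

```shell
# Hypothetical helper: read HTTP response headers on stdin and print
#   hit         X-Varnish carries two IDs (served from cache)
#   miss        X-Varnish carries one ID (fetched from the backend)
#   no-varnish  no X-Varnish header was found
varnish_cache_status() {
  awk '
    BEGIN { status = "no-varnish" }
    tolower($1) == "x-varnish:" {
      sub(/\r$/, "")                       # drop the trailing carriage return
      status = (NF >= 3) ? "hit" : "miss"  # header name plus one or two IDs
    }
    END { print status }
  '
}
```

For example, running curl -sI http://localhost/ | varnish_cache_status twice as an anonymous visitor would typically print miss and then hit for a cacheable page.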
7. Flushing the cache
In the final step, we will flush all Drupal caches and the Varnish cache to ensure a consistent state:
drush cr
drush p:invalidate everything