Removing cookies in Varnish
Caching and cookies don’t always go hand in hand. Varnish is very conservative when it comes to handling cookies because the very nature of a cookie is to keep track of state and personalize the response.
The built-in VCL indicates that Varnish will not serve an object from the cache if the request contains a Cookie header.
This standard behavior doesn’t work in the real world where cookies are omnipresent.
This tutorial presents a VCL-based solution that removes cookies in situations where they are not required.
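The default behavior described above can be pictured with the following sketch. It is a simplified paraphrase of the cookie-related part of the built-in VCL rather than the complete built-in logic, and the backend address is a placeholder assumption.

```vcl
vcl 4.1;

# Placeholder backend (assumption): replace with your own origin
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Simplified paraphrase of the built-in behavior:
    # a request that carries cookies (or credentials) bypasses the cache
    if (req.http.Authorization || req.http.Cookie) {
        return (pass);
    }
}
```

The techniques in this tutorial work by making sure the Cookie header is gone before this check runs for requests that don’t need their cookies.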
What does a cookie look like in Varnish?
Cookies are a collection of key-value pairs, separated by a semicolon. They are transported via the Cookie request header.
Here’s an example of a Cookie header:
Cookie: lang=en; _ga=GA1.3.292651669.1502954402; sessionID=0aef28c82761e4507d5f8ae49a259284
There are three distinct cookies in this header (lang, _ga and sessionID). Varnish treats the Cookie request header like any other header. This means that a cookie is nothing more than a string in Varnish.
Accessing individual cookies and cookie values requires the use of pattern matching functions like regsub() and regsuball().
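As a minimal sketch of what such pattern matching can look like, the snippet below copies the value of the lang cookie from the example header into a separate request header. The X-Lang header name and the backend address are assumptions made for illustration only.

```vcl
vcl 4.1;

# Placeholder backend (assumption): replace with your own origin
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Only act when a cookie named exactly "lang" is present
    if (req.http.Cookie ~ "(^|;\s*)lang=") {
        # Replace the whole header string with the captured cookie value (\2)
        # and store it in a hypothetical X-Lang header
        set req.http.X-Lang = regsub(req.http.Cookie, "(^|.*;\s*)lang=([^;]*).*", "\2");
    }
}
```

For the example header shown earlier, X-Lang would contain en. The Cookie header itself is left untouched by this sketch.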
It’s safe to remove tracking cookies
By removing cookies in Varnish, the origin application won’t be able to use these cookies, which may result in inconsistent behavior or attempts to reset the cookie.
However, for cookies that are processed on the client it doesn’t really matter. Tracking cookies are the perfect example: they are processed by third-party libraries in JavaScript and are not processed by the server.
Removing all cookies
You can remove all cookies by unsetting the Cookie header in VCL. This ensures that websites that use cookies become cacheable.
Here’s the code to do that:
vcl 4.1;
sub vcl_recv {
    unset req.http.Cookie;
}
This is a pretty drastic action that is only justified if you know for sure that the origin application doesn’t need any of the cookies that were set. If you only have tracking cookies, this is a good solution.
Conditionally removing all cookies
However, if some parts of your application rely on a cookie, such as a session cookie for example, you can conditionally remove all cookies.
Here’s an example where all cookies are removed, except for /admin requests:
vcl 4.1;
sub vcl_recv {
    if (req.url !~ "^/admin(/.*)?$") {
        unset req.http.Cookie;
    }
}
Removing individual cookies
Instead of removing all cookies at once by unsetting the Cookie request header, you can also remove individual cookies. As mentioned before, to Varnish cookies are just a string. A find and replace action needs to happen to remove an individual cookie.
Here’s how to do that using the regsuball() function:
vcl 4.1;
sub vcl_recv {
    set req.http.Cookie = regsuball(req.http.Cookie, "lang=[^;]+(; )?", "");
    if (req.http.Cookie ~ "^\s*$") {
        unset req.http.Cookie;
    }
}
The VCL snippet will remove the lang cookie from the Cookie header. Because regsuball() replaces the matched cookie with an empty string, the Cookie header needs to be removed if nothing but whitespace remains.
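One thing to be aware of is that a pattern starting with lang= also matches a cookie whose name merely ends in lang, such as a hypothetical xlang cookie. The following sketch is one possible way to anchor the match to the start of the header or to a preceding separator. The backend address is a placeholder assumption.

```vcl
vcl 4.1;

# Placeholder backend (assumption): replace with your own origin
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    if (req.http.Cookie) {
        # Match "lang" only at the start of the header or right after a ";"
        set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)lang=[^;]*", "");
        # If "lang" was the first cookie, a leading separator is left behind
        set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
        # Drop the header entirely when nothing is left
        if (req.http.Cookie ~ "^\s*$") {
            unset req.http.Cookie;
        }
    }
}
```

Because the separator in front of the cookie is removed together with the cookie, this variant also avoids leaving a trailing semicolon when lang is the last cookie in the header.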
Removing tracking cookies
Removing individual cookies becomes useful when you want to get rid of tracking cookies.
Here’s an example where tracking cookies are explicitly removed:
vcl 4.1;
sub vcl_recv {
    # Some generic cookie manipulation, useful for all templates that follow
    # Remove the "has_js" cookie
    set req.http.Cookie = regsuball(req.http.Cookie, "has_js=[^;]+(; )?", "");
    # Remove any Google Analytics based cookies
    set req.http.Cookie = regsuball(req.http.Cookie, "__utm[^=]+=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "_ga[^=]*=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "_gcl_[^=]+=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "_gid=[^;]+(; )?", "");
    # Remove DoubleClick offensive cookies
    set req.http.Cookie = regsuball(req.http.Cookie, "__gads=[^;]+(; )?", "");
    # Remove the Quantcast cookies (added by some plugin, all __qca)
    set req.http.Cookie = regsuball(req.http.Cookie, "__qc.=[^;]+(; )?", "");
    # Remove the AddThis cookies
    set req.http.Cookie = regsuball(req.http.Cookie, "__atuv.=[^;]+(; )?", "");
    # Remove a ";" prefix in the cookie if present
    set req.http.Cookie = regsuball(req.http.Cookie, "^;\s*", "");
    # Are there cookies left with only spaces or that are empty?
    if (req.http.Cookie ~ "^\s*$") {
        unset req.http.Cookie;
    }
}
Several of these regsuball() calls match multiple cookies at once because of the regular expression that is used. The __utm pattern, for example, covers __utma, __utmz and similar cookies.
Only keep required cookies
The example above where tracking cookies are removed is quite explicit: you know exactly which cookies are removed. But this can get tedious if you have a lot of tracking cookies. You also need to keep this list up to date if new tracking cookies are introduced.
That’s why you can also remove all cookies, except the ones that you need for server-side processing. Here’s the VCL code to do this:
vcl 4.1;
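sub vcl_recv {
    if (req.http.Cookie) {
        # Prefix the header with ";" so every cookie starts with a separator
        set req.http.Cookie = ";" + req.http.Cookie;
        # Remove the spaces that follow each separator
        set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
        # Mark the cookies to keep by putting a space back after their separator
        set req.http.Cookie = regsuball(req.http.Cookie, ";(sessionID|cart)=", "; \1=");
        # Remove every cookie that was not marked with a space
        set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
        # Trim leftover separators and spaces at both ends
        set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");
        if (req.http.Cookie ~ "^\s*$") {
            unset req.http.Cookie;
        }
    }
}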
This VCL example will remove every single cookie, except the sessionID cookie and the cart cookie. It also removes the Cookie header entirely if it’s empty. All this logic is only executed if the request actually contains a Cookie header.
In the end, if a sessionID or cart cookie is set, Varnish will not cache, assuming the origin application needs the values of the cookies to compose the output.
What about the Set-Cookie header?
Cookies are stored client-side and are sent to the server through a Cookie request header. But when you visit a website for the first time, you will not have any cookies in your cookie jar for that site.
When you perform an action on the website that should set a cookie, the server will attach a Set-Cookie header to the response. The client will process this header and will store the cookie value in its cookie jar.
The built-in VCL indicates that Varnish will not store the response in the cache if it contains a Set-Cookie header.
When a response contains a Set-Cookie header, the object will end up on the Hit-For-Miss list for the next two minutes. All subsequent requests for that resource will bypass the cache for that duration or until the next response doesn’t contain a Set-Cookie header.
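The Hit-For-Miss behavior described above can be sketched as follows. This is a simplified paraphrase that only covers the Set-Cookie condition; the real built-in VCL checks several other response properties as well. The backend address is a placeholder assumption.

```vcl
vcl 4.1;

# Placeholder backend (assumption): replace with your own origin
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_backend_response {
    # Simplified paraphrase of the built-in behavior for Set-Cookie:
    # remember for two minutes that this resource should not be cached
    if (beresp.http.Set-Cookie) {
        set beresp.ttl = 120s;
        set beresp.uncacheable = true;
    }
}
```

Removing the Set-Cookie header in your own vcl_backend_response logic, before the built-in VCL runs, prevents this condition from triggering.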
The end result of setting a cookie is a cache miss. The Set-Cookie header is usually left untouched because the value of the cookie it sets matters.
There are exceptions though.
Removing the Set-Cookie header for static files
If your application starts setting cookies when static files are requested, you probably want to remove them. As the name implies, the files are static and their output doesn’t depend on a cookie.
Here’s an example where a potential Set-Cookie header is removed for all images, CSS files, JavaScript files and web font files:
vcl 4.1;
sub vcl_backend_response {
    if (bereq.url ~ "^[^?]*\.(css|gif|ico|jpeg|jpg|js|png|svg|webp|woff|woff2)(\?.*)?$") {
        unset beresp.http.Set-Cookie;
    }
}
You can also use the Content-Type header of the response to determine whether the content is static:
vcl 4.1;
sub vcl_backend_response {
    # The content type is a property of the backend response, not the backend request
    if (beresp.http.Content-Type ~ "^(((image|video|font)/.+)|application/javascript|text/css).*$") {
        unset beresp.http.Set-Cookie;
    }
}
This VCL snippet will strip off the Set-Cookie header for the following content types:
- All image content types that start with image/, like for example image/jpeg or image/png
- All video content types that start with video/, like for example video/mpeg
- All font content types that start with font/, like for example font/woff2
- JavaScript content with the application/javascript content type
- CSS content with the text/css content type
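Many servers label JavaScript as text/javascript rather than application/javascript. If that applies to your origin, a variant along the following lines may be useful. It is only a sketch: the extended pattern and the backend address are assumptions, and it reads the Content-Type header from the backend response.

```vcl
vcl 4.1;

# Placeholder backend (assumption): replace with your own origin
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_backend_response {
    # Same idea as the list above, extended with text/javascript
    if (beresp.http.Content-Type ~ "^((image|video|font)/|(application|text)/javascript|text/css)") {
        unset beresp.http.Set-Cookie;
    }
}
```

The pattern is anchored at the start only, so parameters such as a charset suffix after the content type do not prevent a match.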
Removing the Set-Cookie header for misbehaving applications
Setting a cookie is usually done when you log in to a website or when you add your first product to a shopping cart. These actions usually set a session identification cookie.
Unfortunately, some websites already set a session ID cookie on the homepage, just in case it is needed later. This results in a cache miss on your homepage, which is arguably your most visited page.
Here’s a VCL example where the Set-Cookie header is removed unless the /admin pages are visited:
vcl 4.1;
sub vcl_backend_response {
    if (bereq.url !~ "^/admin(/.*)?$") {
        unset beresp.http.Set-Cookie;
    }
}
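If stripping Set-Cookie everywhere outside /admin is too broad for your application, the same idea can be narrowed to the page that suffers most. The sketch below only touches the homepage. The URL pattern and the backend address are assumptions that you would adapt to your own site.

```vcl
vcl 4.1;

# Placeholder backend (assumption): replace with your own origin
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_backend_response {
    # Match "/" with an optional query string, and nothing else
    if (bereq.url ~ "^/(\?.*)?$") {
        unset beresp.http.Set-Cookie;
    }
}
```

With this variant, pages such as a login form or a shopping cart can still set their session cookie, while the homepage remains cacheable.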
Removing cookies with vmod_cookie
As of Varnish Cache version 6.4, vmod_cookie is an in-tree VMOD. This means the module is installed with Varnish and can be imported in your VCL.
For older versions, vmod_cookie is available as part of the Varnish modules collection. Once you download the source, you need to compile the module collection from source.
vmod_cookie provides an API that makes interacting with cookies a lot easier.
Removing individual cookies with vmod_cookie
The cookie.delete() function is an easy way to remove individual cookies. Instead of coming up with complicated regular expressions that are parsed by regsuball(), you can simply call this one function.
The following example deletes the lang cookie:
vcl 4.1;
import cookie;
sub vcl_recv {
    if (req.http.Cookie) {
        cookie.parse(req.http.Cookie);
        cookie.delete("lang");
        set req.http.Cookie = cookie.get_string();
    }
}
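When the deleted cookie was the only one in the request, cookie.get_string() has nothing left to return, which may leave an empty Cookie header behind. A minimal sketch that reuses the empty-header check from the regsuball() examples is shown below. The backend address is a placeholder assumption.

```vcl
vcl 4.1;

import cookie;

# Placeholder backend (assumption): replace with your own origin
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    if (req.http.Cookie) {
        cookie.parse(req.http.Cookie);
        cookie.delete("lang");
        set req.http.Cookie = cookie.get_string();
        # Drop the header when no cookies remain, so the request can be cached
        if (req.http.Cookie ~ "^\s*$") {
            unset req.http.Cookie;
        }
    }
}
```

The same check can be appended to the other vmod_cookie examples in this section.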
You can also remove multiple cookies:
vcl 4.1;
import cookie;
sub vcl_recv {
    if (req.http.Cookie) {
        cookie.parse(req.http.Cookie);
        cookie.filter("lang,cart");
        set req.http.Cookie = cookie.get_string();
    }
}
This example removes the lang cookie and the cart cookie.
You can even filter out cookies using a regular expression pattern:
vcl 4.1;
import cookie;
sub vcl_recv {
    if (req.http.Cookie) {
        cookie.parse(req.http.Cookie);
        cookie.filter_re("__utm.");
        set req.http.Cookie = cookie.get_string();
    }
}
This example removes the various __utm cookies set by Google Analytics, such as __utma and __utmz.
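The same function could replace the longer list of regsuball() calls from the tracking cookie example. The sketch below is one possible translation, assuming that matching on cookie name prefixes is acceptable for your site. The combined pattern and the backend address are assumptions, not part of the original example.

```vcl
vcl 4.1;

import cookie;

# Placeholder backend (assumption): replace with your own origin
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    if (req.http.Cookie) {
        cookie.parse(req.http.Cookie);
        # Remove every cookie whose name starts with one of these prefixes
        cookie.filter_re("^(has_js|__utm|_ga|_gcl_|_gid|__gads|__qc|__atuv)");
        set req.http.Cookie = cookie.get_string();
    }
}
```

Anchoring the pattern with ^ keeps it from matching cookies that only contain one of these strings somewhere in the middle of their name.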
Keeping select cookies with vmod_cookie
One of the previous examples featured a set of regsuball() calls to remove all cookies except a list of cookies that were needed by the server. The cookie.keep() function can replace these five lines of code as illustrated below:
vcl 4.1;
import cookie;
sub vcl_recv {
    if (req.http.Cookie) {
        cookie.parse(req.http.Cookie);
        cookie.keep("sessionID,lang");
        set req.http.Cookie = cookie.get_string();
    }
}
This VCL example will remove all cookies, except the sessionID and lang cookies.
The cookie.keep_re() function does the same thing, but using a regular expression pattern.
Here’s an example where we will only keep a select set of session ID cookies:
vcl 4.1;
import cookie;
sub vcl_recv {
    if (req.http.Cookie) {
        cookie.parse(req.http.Cookie);
        # Anchored so that only these exact cookie names are kept
        cookie.keep_re("^(session(ID)?|PHPSESSID)$");
        set req.http.Cookie = cookie.get_string();
    }
}
This example will remove all cookies, except the session, sessionID and PHPSESSID cookies.