Logging in Varnish Cache is one of the unique features of Varnish that in my mind sets it apart from the rest of the software world. It combines both logging in great detail with both performance, manageability, sensible privacy defaults and access to great debugging help. The shared memory log, logs everything that happens without the need to adjust log levels or without significantly affecting performance, eliminating the need for turning on any debug switch.
Many of you have surely been in a situation where you have an application that is misbehaving. Then you increase the log level to debug and the problem magically disappears - obviously caused by the application slowing down just enough that the race condition you encountered earlier is now gone.
The weakness of logging everything is that so much information is available that the administrator can sometimes be overwhelmed by all the information presented. It’s a figurative firehose of information and drinking from it can be painfull.
Martin has reimplemented a new logging framework in Varnish Cache 4.0. Out of all the new stuff in Varnish Cache 4.0 this might be the most significant one. It’s also the most complex feature, requiring quite a bit of time to fully understand how it works.
Work items and VXIDs
In Varnish Cache 4.0 we are introducing a new term. The VXID - Varnish Transaction ID. It’s applied to lots of different kinds of work items. A unique VXID is assigned to each:
- client request
- backend request
- ESI sub-request
- session
You can use this when you view the log through varnishlog. The default is to group the log on VXID, which basically makes varnishlog act more or less the way it does in Varnish 3.0. When viewing a log for a simple cache miss, you’ll see the backend request, the client request and then the session. They are displayed in the order they end. Some people find it a bit counter intuitive that the backend request is logged before the client request, but if you think about it is perfectly natural.
Client request grouping
My default grouping mode will probably be client request. In client request grouping mode the various work items are logged together with their originating client request. So, a client request might trigger a backend request, which in turn triggers two more ESI subrequest, which in turn might trigger yet another ESI subrequest. Then comes the response to the client. All these are arranged in the order they are initiated, which is somewhat easier on the brain than the VXID grouping.
When a subrequest occurs in the log it tell us about the relationship to its parent request through the “Link” statement. Link will deliver the VXID of the parent request. varnishlog will indent based on the level of the request, making it easier to see what level the current request is on.
Example output
$ varnishlog -g request -q "ReqUrl eq '/'" -i Timestamp,Begin,ReqMethod,ReqUrl,ReqHeader -d * << Request >> 2 - Begin req 1 rxreq - Timestamp Start: 1397057490.760344 0.000000 0.000000 - Timestamp Req: 1397057490.760344 0.000000 0.000000 - ReqMethod GET - ReqURL / - ReqHeader User-Agent: curl/7.30.0 - ReqHeader Host: localhost:8000 - ReqHeader Accept: */* - ReqHeader X-Forwarded-For: ::1 - Timestamp Fetch: 1397057490.766380 0.006036 0.006036 - Timestamp Process: 1397057490.766391 0.006047 0.000011 - Timestamp Resp: 1397057490.766548 0.006204 0.000157 ** << BeReq >> 3 -- Begin bereq 2 fetch -- Timestamp Start: 1397057490.760419 0.000000 0.000000 -- Timestamp Bereq: 1397057490.760635 0.000216 0.000216 -- Timestamp Beresp: 1397057490.766170 0.005751 0.005535 -- Timestamp BerespBody: 1397057490.766408 0.005989 0.000238
The example above showcases a couple of new items in addition to the grouping. As you can see, we're using the -i to only show the log records that are listed. One of these are the timestamps. Varnish now timestamps the log at certain points, giving you detailed timing information as the request flows through Varnish. For those of you who are doing weird stuff in VCL you might appreciate that there now is a VCL function called timestamp in the std module, which will insert a custom timestamp with a user provided label.
Query log language
One more thing you might have noticed from the output above was the -q option. It allows us to add a query to varnishlog. Think of it as a sort of select filter for varnishlog. It works together with the grouping so that if the query matches some part of any of the work items in the transaction group the whole group matches and gets displayed.
So, in our previous example we mentioned that we might have a reqeust that might trigger a backend request and then two levels of ESI subrequests. Let us say the innermost ESI subrequst fails. Let us also say we have a varnishlog running which looks like this
$ varnishlog -g request -q 'RespStatus>= 500'
This will match the innermost (failing) ESI subrequest. Since we're grouping on the client request everything that leads up to this error will be output. Pretty awesome, right? Having one of these statements running on your Varnish server will probably be very valuable when you're hunting down backend errors.
Zero copy in varnishlog
The new logging code is substantially faster than the old. All dynamic memory allocations have been removed and the resulting code is much faster than the previous versions. Basically it only deals with pointers into the shared memory log. The fact that Martin actually pulled it all off without creating a soup of segfaults impresses me quite a bit.
Everything I said about Varnishlog also goes for the other logging tools
The grouping and the query log processing all happens in the Varnish logging API. This means that the other programs using the API automatically get grouping and query language. So, if you want varnishhist to only cover transactions delivering text/html that should be easy.
$ varnishhist -q "RespHeader eq 'Content-Type: text/html'"
Or you want varnishncsa to only log stuff that is related to a certain domain:
$ varnishncsa -q "ReqHeader ~ '^Host: .*\.example.com'"
This will dump everything pointing to example.com or any of its subdomains into this output.
Pretty awesome, huh?