If you have a really big web farm you'll have lots of backends and some sort of complex mapping between virtual hosts and web servers.
Currently you can either have a script build the backend definitions and include the result in your VCL, or use the DNS director. Neither is ideal. Changing VCL is a hassle. The DNS director, while built for exactly this purpose, requires you to shove everything into one network and then maintain a zone. Besides, the DNS director doesn't work very well with IPv6.
Both lack the power and flexibility that are the hallmark of Varnish. They are not very Varnishy.
So, for quite some time we've been trying to figure out something better. These have been the requirements, contributed by various people we've met over the last couple of years.
- Backends should be defined lazily. A backend shouldn't be created until it is actually needed.
- It should be flexible and powerful. Backend definitions should be provided by a VMOD, which in turn can get data from almost anything.
- There must be some built-in caching of definitions. Calling out to external programs or network services is dead slow.
- Caching in turn means there must be some way to revalidate backends. DNS is a good example here: Amazon S3 hosts change all the time.
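To make the lazy, cached and revalidated part concrete, here is a minimal sketch in Python. It is not Varnish code, only a model of the behaviour the requirements describe. The class name LazyBackendCache and the resolver callback (which returns an address and a TTL in seconds) are made up for illustration.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical resolver type: maps a backend name to (ip, ttl_seconds).
Resolver = Callable[[str], Tuple[str, float]]


class LazyBackendCache:
    """Resolves backend addresses on first use and caches them until the TTL runs out."""

    def __init__(self, resolver: Resolver,
                 clock: Callable[[], float] = time.monotonic):
        self._resolver = resolver
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}  # name -> (ip, expires_at)

    def get(self, name: str) -> str:
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and entry[1] > now:
            return entry[0]  # still fresh: no slow external call
        # First use, or the entry has expired: ask the source again.
        ip, ttl = self._resolver(name)
        self._entries[name] = (ip, now + ttl)
        return ip
```

Nothing is resolved until get() is called for a name, and the slow resolver is only consulted again once the cached entry has expired.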
Proposed solution
Let's have a director type "dynamic".
director dyn1 dynamic {
# some defaults could be overridden here
}
Now, that doesn't give us much. Let me explain what is going on behind the curtain. Each dynamic director contains a data structure (an alist, a hash or something similar) that binds each backend to a name. The name is an arbitrary string, unique for each backend in that director, at least initially. At some point we might want Varnish to balance load between two backends, and giving them the same name might be a good way to signal that intent.
We also need some functions to operate on these backends from VCL. To keep the function calls that deal with dynamic backends together, we prefix them with dyn. So we'll need the following functions in VCL:
- dyn.create(director, name, IP, TTL)
- dyn.delete(director, name)
Then there are a couple of functions that might be useful:
- dyn.exists(director, name)
- dyn.set_ttl(director, name, TTL)
- dyn.get_ttl(director, name)
- (..)
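As a rough model of what these functions would do to the table inside a director, here is a small Python sketch. None of this exists in Varnish. The method names mirror the proposed dyn.* calls. Treating expired entries as absent and letting create replace an existing entry are assumptions of mine, not part of the proposal.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Backend:
    ip: str
    expires_at: float  # absolute time after which the entry is stale


class DynamicDirector:
    """Toy model of a "dynamic" director: a name -> backend table."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._backends: Dict[str, Backend] = {}

    def create(self, name: str, ip: str, ttl: float) -> None:
        # Models dyn.create(director, name, IP, TTL); replaces any old entry.
        self._backends[name] = Backend(ip, self._clock() + ttl)

    def delete(self, name: str) -> None:
        # Models dyn.delete(director, name); a missing name is a no-op.
        self._backends.pop(name, None)

    def exists(self, name: str) -> bool:
        # Models dyn.exists(director, name); expired entries count as absent.
        backend = self._backends.get(name)
        return backend is not None and backend.expires_at > self._clock()

    def set_ttl(self, name: str, ttl: float) -> None:
        # Models dyn.set_ttl(director, name, TTL), counted from now.
        self._backends[name].expires_at = self._clock() + ttl

    def get_ttl(self, name: str) -> float:
        # Models dyn.get_ttl(director, name): seconds of freshness left.
        return max(0.0, self._backends[name].expires_at - self._clock())
```

In this model the director argument of each dyn.* call becomes the object the method is called on, so dyn.create(dyn1, "foo", ip, 3600) corresponds to dyn1.create("foo", ip, 3600).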
The missing piece is how to tie each request to the correct backend. I suggest we introduce a new variable on each request, req.routing or similar. The director will look at req.routing and compare it to the strings identifying each backend. If one matches, great. If there is no match, we use the default (if any) or throw an error.
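The selection rule is simple enough to write down. The following Python sketch assumes an exact string match on the name. pick_backend and NoBackendError are hypothetical names, and the backends are reduced to plain address strings.

```python
from typing import Dict, Optional


class NoBackendError(LookupError):
    """Raised when nothing matches and no default is configured."""


def pick_backend(backends: Dict[str, str], routing: str,
                 default: Optional[str] = None) -> str:
    # Exact string match between req.routing and the backend name.
    if routing in backends:
        return backends[routing]
    # No match: fall back to the default, if the director has one.
    if default is not None:
        return default
    raise NoBackendError(f"no backend named {routing!r} and no default")
```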
The really cool thing
The great thing about this is that we can get the definition from any source. If we build a DNS VMOD, which would be quite simple, defining a backend might look like this:
dyn.create(dyn1, "foo", dns.lookup("foo"), dns.lookup_ttl("foo"));
Or we could get the data from /etc/hosts, Berkeley DB, CDB files, memcache, redis or some awful XML file if that is what you like. I talked to someone who keeps the host mapping in a proprietary database accessed through multicast. It would probably be a day or two of work to get Varnish to understand that.
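To give a feel for how little a data source has to do, here is a sketch of the /etc/hosts case in Python. It only turns a name into an address from hosts-format text. lookup_hosts is a made-up helper, and a real VMOD would be written in C against the Varnish VMOD interface.

```python
from typing import Optional


def lookup_hosts(hosts_text: str, name: str) -> Optional[str]:
    """Return the first address listed for name in hosts-file text, or None."""
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        fields = line.split()
        # Format: address, canonical name, then optional aliases.
        if len(fields) >= 2 and name in fields[1:]:
            return fields[0]
    return None
```

A hosts file carries no TTL, so a source like this would have to pick one itself when creating the backend.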
A more complete VCL example
sub vcl_recv {
    # www.foo.com ==> foo
    set req.routing = regsub(req.http.host, "(?i)^www\.(.*)\.com$", "\1");
    set req.http.x-routing = req.routing;
    # dyn1 is the dynamic director declared earlier.
    if (!dyn.exists(dyn1, req.routing)) {
        # Create the "foo" backend here with a TTL of 3600 seconds.
        dyn.create(dyn1, req.routing,
            dns.lookup(req.routing),
            3600);
        if (!dyn.exists(dyn1, req.routing)) {
            # Creation failed, so give up on this request.
            error 503 "No backend available";
        }
    }
}
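The host-to-name mapping in the first lines of the example is easy to try outside Varnish. This Python sketch mirrors the intent of the regsub call (strip a leading "www." and a trailing ".com", ignoring case). routing_key is a hypothetical name. As with regsub, a host that does not match is returned unchanged.

```python
import re

# Anchored, case-insensitive: "www.<name>.com" captures <name>.
_HOST_RE = re.compile(r"^www\.(.*)\.com$", re.IGNORECASE)


def routing_key(host: str) -> str:
    """Map a Host header to the name used to look up the backend."""
    match = _HOST_RE.match(host)
    return match.group(1) if match else host


print(routing_key("www.foo.com"))   # foo
print(routing_key("example.org"))   # example.org
```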
Current status
This is not implemented yet. We're looking for companies to work with in order to make this a reality. If your company needs this, please send me an email at perbu @ varnish-software.com or leave a comment. Feedback is more than welcome.
Image is (c) 2014 Stephen Donaghy used under Creative Commons license.