DNS-based GSLB (Global Server Load Balancer)
Introduction
The Global Server Load Balancer (GSLB) in a CDN is responsible for distributing HTTP clients globally. Depending on the implementation and architecture, it directs clients to specific locations (POPs) or to specific cache nodes.
Varnish is agnostic to the GSLB function and can be integrated with different types such as DNS-based, network routing-based (anycast) and HTTP302-based solutions.
In this tutorial we will set up a simple self-hosted GSLB that distributes clients to the closest healthy cache node using DNS. The following diagram shows what our architecture looks like.
Benefits and challenges with a DNS-based GSLB
Benefits:
- DNS is network agnostic. It does not require any special networking equipment or routing protocol knowledge and can be used across cloud providers and on-prem deployments in different networks.
- DNS is transparent to the HTTP layer, which means that it is possible to use a DNS-based GSLB with any HTTP use case.
- The setup is simple and requires few components/dependencies.
Challenges:
- DNS propagation time is potentially long and can become a disruptive factor during unforeseen downtime. This will also need to be considered when doing maintenance on the CDN.
- There are myriad DNS implementations on the client side, and their behavior is not always aligned.
Accuracy:
- IP geolocation will be based on the client’s IP subnet if the client’s DNS server supports relaying this information (EDNS Client Subnet, RFC 7871). The fallback is to use the IP address of the client’s DNS server, which in most cases is reasonably close to the client. For global distribution this is usually sufficient, but for distribution between multiple locations within a single country it may not be.
Call flow
Components involved
- The DNS functionality will be handled using the PowerDNS Authoritative Server.
- IP geolocation will be done using the GeoLite2 database from MaxMind. MaxMind also provides databases with higher accuracy if needed.
Prerequisites
In order to replicate this tutorial, you will need:
- Several nodes for Varnish in different locations with public IP addresses. These will act as the caching nodes in the CDN. Use one of the supported platforms for Varnish.
- A minimum of two nodes for PowerDNS in different locations with public IP addresses. These will act as DNS servers in the CDN. More nodes can be added for more redundancy and performance. Use one of the supported platforms for PowerDNS.
- A subdomain to be used for the CDN.
In this tutorial, we have the following resources available:
- Five data centers (US west, US east, EU west, AS west and AS east).
- Six cache nodes with Varnish running CentOS 7, spread over these five data centers:
  - cache01.us-west.example.com at 192.168.1.10
  - cache02.us-east.example.com at 192.168.2.10
  - cache01.eu-west.example.com at 192.168.3.10
  - cache02.eu-west.example.com at 192.168.3.11
  - cache01.as-west.example.com at 192.168.4.10
  - cache01.as-east.example.com at 192.168.5.10
- Two DNS nodes with PowerDNS running CentOS 7, spread over the US east and EU west data centers:
  - ns01.us-east.example.com at 192.168.2.5
  - ns01.eu-west.example.com at 192.168.3.5
- One origin in EU west:
  - origin.example.com at 192.168.3.2
We own the domain example.com and will use the subdomain cdn.example.com for the CDN. The IP addresses listed above are used as examples. In a real environment they would be in publicly available and routable IP networks.
Setup
Step 1 - Prepare the caching nodes with Varnish
- Follow the quick start tutorial to install Varnish on each of the caching nodes.
- Deploy a VCL configuration to the caching nodes with the origin as the backend. Also specify a URL that can be used for health probes from the DNS servers.
Example:
vcl 4.1;

import std;

# origin.example.com
backend origin {
    .host = "192.168.3.2";

    # Use port 443 on Varnish Enterprise
    # Switch the port to 80 if you're using Varnish Cache
    .port = "443";

    # Set .ssl = true on Varnish Enterprise
    # Remove the .ssl option if you're using Varnish Cache
    .ssl = true;
}

sub vcl_recv {
    # URL to be used for health probes
    if (req.url == "/varnish-status") {
        if (std.file_exists("/etc/varnish/maintenance")) {
            # If the file exists, the cache node is in maintenance mode
            # and will be excluded automatically in PowerDNS.
            return(synth(503, "Maintenance"));
        } else {
            return(synth(200, "OK"));
        }
    }

    set req.backend_hint = origin;
}
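To spot-check the health probe endpoint on a cache node, it can be queried directly with curl. The IP address and Host header below come from the example environment, so adjust them to your own setup (and use http:// on port 80 if you are running Varnish Cache without TLS):
# Expect "200" while the node is active and "503" while /etc/varnish/maintenance exists
curl -sk -o /dev/null -w '%{http_code}\n' -H 'Host: cdn.example.com' https://192.168.1.10/varnish-status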
Step 2 - Install PowerDNS
This step of the tutorial covers PowerDNS Authoritative Server version 4.4 on CentOS 7. For other versions of PowerDNS or other platforms than CentOS 7, please refer to the PowerDNS install documentation.
On the DNS nodes, create the file /etc/yum.repos.d/powerdns.repo with the following contents (please refer to the PowerDNS repositories for the most up-to-date information):
[powerdns-auth-44]
name=PowerDNS repository for PowerDNS Authoritative Server - version 4.4.X
baseurl=http://repo.powerdns.com/centos/$basearch/$releasever/auth-44
gpgkey=https://repo.powerdns.com/FD380FBB-pub.asc
gpgcheck=1
enabled=1
priority=90
includepkgs=pdns*
Enable the Extra Packages for Enterprise Linux (EPEL) repository by installing the epel-release package:
sudo yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
Install PowerDNS and its backend to handle IP geolocation lookups:
sudo yum install pdns pdns-backend-geoip
Step 3 - Fetch the IP geolocation database
Register at MaxMind and download the GeoLite2 database with city granularity. Copy this MMDB file to /etc/pdns/GeoLite2-City.mmdb on the DNS nodes and make it world readable:
sudo chmod 644 /etc/pdns/GeoLite2-City.mmdb
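If the mmdblookup tool from libmaxminddb is available (on CentOS 7 it can be installed from EPEL), a quick sanity check of the database can be done by looking up a public IP address, for example:
mmdblookup --file /etc/pdns/GeoLite2-City.mmdb --ip 8.8.8.8 location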
MaxMind provides updated databases at regular intervals. It is recommended to automate the process of updating the database, but it is outside the scope of this tutorial.
Step 4 - Configure PowerDNS
Put the following in /etc/pdns/pdns.conf:
# Zone definitions are read from a BIND-style configuration file
bind-config=/etc/pdns/named.conf
# Disable the packet cache so answers always reflect the current health state
cache-ttl=0
consistent-backends=yes
# Run in the foreground; systemd manages the process
daemon=no
# Use EDNS Client Subnet information from resolvers when available
edns-subnet-processing=yes
# Required for the LUA records (ifurlup/pickclosest) used in the zone
enable-lua-records=yes
geoip-database-files=/etc/pdns/GeoLite2-City.mmdb
# Load the bind backend (zone data) and the geoip backend (geolocation)
launch=bind,geoip
# Disable the query cache for the same reason as the packet cache
query-cache-ttl=0
query-logging=yes
# Drop privileges to the pdns user and group
setgid=pdns
setuid=pdns
Create /etc/pdns/named.conf and add the zone that will be managed by the DNS server:
zone "cdn.example.com" in {
type native;
file "/etc/pdns/cdn.example.com.zone";
};
Create /etc/pdns/cdn.example.com.zone and add the SOA, A and AAAA records:
$ORIGIN cdn.example.com.
@ IN SOA ns01.eu-west.example.com. hostmaster.example.com. 2 7200 3600 86400 60
@ 30 IN LUA A ("ifurlup('https://cdn.example.com/varnish-status', {'192.168.1.10', '192.168.2.10', '192.168.3.10', '192.168.3.11', '192.168.4.10', '192.168.5.10'}, {selector='pickclosest'})")
; A similar AAAA record can be made for IPv6 support
;@ 30 IN LUA AAAA ("ifurlup('https://cdn.example.com/varnish-status', {'2001:0db8:85a3::8a2e:0370:7334', '2001:0db8:85a3::8a2e:0370:7335', ...}, {selector='pickclosest'})")
The ifurlup function in the LUA record enables health probing from PowerDNS to Varnish. The probes send the following HTTP request to each of the specified IP addresses every 5 seconds:
GET https://cdn.example.com/varnish-status HTTP/1.1
User-Agent: PowerDNS Authoritative Server
Host: cdn.example.com
Accept: */*
The cache nodes will be considered healthy if they respond with 200 OK within the default timeout, which is two seconds.
From the list of healthy IP addresses, the selector='pickclosest' option will pick the IP address closest to the client’s IP subnet (if the client’s DNS server supports RFC 7871) or, failing that, closest to the IP address of the client’s DNS server.
Configure PowerDNS to start automatically at boot:
sudo systemctl enable pdns
Restart PowerDNS manually (or reboot the DNS nodes):
sudo systemctl restart pdns
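Before the delegation in the next step is in place, it is possible to verify locally that PowerDNS answers for the zone. On one of the DNS nodes, run:
dig @127.0.0.1 cdn.example.com A +short
The answer should contain the IP address of one of the healthy cache nodes.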
Step 5 - Delegation of cdn.example.com
The subdomain cdn.example.com is delegated to the PowerDNS nodes using NS records. Add the following DNS records to the example.com zone configuration:
cdn.example.com. 3600 IN NS ns01.us-east.example.com.
cdn.example.com. 3600 IN NS ns01.eu-west.example.com.
Step 6 - Testing
The environment can now be tested. First, verify that DNS lookups work as expected. The returned IP address should correspond to a cache node that is both healthy and closest to the client.
$ dig cdn.example.com
[...]
;; QUESTION SECTION:
;cdn.example.com. IN A
;; ANSWER SECTION:
cdn.example.com. 30 IN A 192.168.3.11
[...]
Enable tracing to get information about the delegation path all the way from the root name servers down to the CDN nodes: dig +trace cdn.example.com.
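To simulate clients from different networks, dig can send an EDNS Client Subnet hint with the +subnet option. The prefix below is only a placeholder and should be replaced with a subnet from the region you want to test:
dig +subnet=203.0.113.0/24 cdn.example.com @ns01.eu-west.example.com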
Verify that a client is able to get DNS responses if one of the DNS servers is unavailable. The loss of a single DNS server may increase the latency of DNS responses, but it should not affect the availability of the DNS service.
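Each DNS node can also be queried individually to confirm that it answers on its own:
dig cdn.example.com @ns01.us-east.example.com
dig cdn.example.com @ns01.eu-west.example.com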
Operations and monitoring
Given the VCL example above, individual cache nodes can be drained in a non-disruptive way by creating the file /etc/varnish/maintenance on a cache node. If this file exists on a cache node, the DNS servers will stop sending clients to it. Existing clients will move to other nodes as soon as the DNS entry expires and is refreshed. Simply remove the file to put the cache node back into active duty.
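For example, to drain a node and later put it back into rotation (assuming the VCL from Step 1 is deployed on the node):
# Put the cache node into maintenance mode; health probes now return 503
sudo touch /etc/varnish/maintenance
# Return the cache node to active duty
sudo rm /etc/varnish/maintenance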
Health monitoring of the DNS nodes can be done using the check_dns plugin for Nagios and the built-in Prometheus interface from PowerDNS. For more information about managing PowerDNS, please refer to the PowerDNS documentation.
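As a minimal sketch, the Prometheus endpoint can be exposed by enabling the built-in PowerDNS webserver in /etc/pdns/pdns.conf; the listen address and allow-list below are assumptions that should be adapted to your monitoring network:
webserver=yes
webserver-address=0.0.0.0
webserver-port=8081
webserver-allow-from=192.168.0.0/16
The metrics can then be scraped from port 8081 at the /metrics path on each DNS node.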
Conclusion
This tutorial shows how a simple and self-hosted DNS-based GSLB can be set up to balance clients between multiple data centers. It takes the location of the client and the health of the cache nodes into account.
For more advanced use cases, this setup can be extended in several ways. Examples of next steps are:
- Expose the DNS servers using anycast.
- Put the cache nodes behind layer 4 load balancers (with DR/DSR) to provide a single point of entry in each data center.
- Take the utilization of each cache node into account.