
DNS caching server load testing.

My testing is designed to determine how caching name servers perform under a variety of network conditions. Some tests run against a DNS wall and should directly reflect the cache's ability to look up and answer queries. Other tests run against primed caches, reflecting a DNS server's ability to retrieve cached data. Still other tests run against "real world" name servers, forcing the caches to deal with latency, failures, and timeouts.

Test 1: Raw lookup performance & raw cache performance.

This test requires the DNS server to query a DNS wall running on a separate host. The DNS wall is a simple program named "walldns" that generates matching forward and reverse DNS answers for blocks of IP space. The test setup looks like this:

Role        CPU           RAM    Program        Used RAM  Disk             Limits     Log
dns wall    PIII 550      1GB    walldns        900k      RAID 5 LVD SCSI  -d250000   none
dns cache   PIII 600      512MB  dnscache 1.05  290MB     RAID 5 LVD SCSI  -o 5000 *  none
                                 BIND 8.2.3     8MB                        unlimit    none
                                 BIND 9.1.1     12MB                       unlimit    none
dns client  2 x PIII 850  1GB    dnsfilter      <5MB      RAID 5 LVD SCSI  -c 10      full
                                                                           -c 100

* MAXUDP compiled with a value of 2000.

Configuring walldns is simplicity itself: install the software and start it up. It listens on port 53 just like a standard DNS server, but it answers queries by making up a hostname or IP. For example, when I send it a reverse query for 216.122.0.1, it makes up the hostname "1.0.122.216.in-addr.arpa". Do a forward lookup on that hostname and of course it resolves right back to that IP. It's very convenient and, as the testing will show, it's very fast too. :-)
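If you've never set one up, the whole thing is a couple of commands under daemontools. A sketch, with placeholder account names and 216.122.x.x standing in for the wall's real listening address:

    # Create and activate the walldns service ("walldns" and "dnslog"
    # are placeholder accounts; use whatever unprivileged accounts you have):
    walldns-conf walldns dnslog /etc/walldns 216.122.x.x
    ln -s /etc/walldns /service    # svscan notices the link and starts it

    # Spot-check it directly (dnsq queries the named server, bypassing
    # /etc/resolv.conf):
    dnsq ptr 1.0.122.216.in-addr.arpa 216.122.x.x
    dnsq a   1.0.122.216.in-addr.arpa 216.122.x.x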

Configuring the caches was pretty basic. For BIND 8 and BIND 9, run unlimit and add a couple of lines to the options section (forward only; forwarders { 216.122.x.x; };) to convince them to only query the DNS wall. Dnscache was also easy to configure: (echo "216.122.x.x" > /service/dnscache/root/servers/122.216.in-addr.arpa). That tells dnscache to forward all requests for the 216.122 network off to walldns.
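To make that concrete, here's roughly what the two configurations look like side by side (216.122.x.x is a placeholder for the wall's address, as above):

    ## BIND 8/9 -- named.conf options section:
    options {
        forward only;
        forwarders { 216.122.x.x; };
    };

    ## dnscache -- drop the wall's IP into root/servers, then restart
    ## so dnscache re-reads the servers directory:
    echo "216.122.x.x" > /service/dnscache/root/servers/122.216.in-addr.arpa
    svc -t /service/dnscache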

The client configuration turned out to be quite tricky. I've looked at quite a few different DNS test programs. Netperf3 is supposed to be a good one, but I've had no luck getting it working on FreeBSD and I'm not patient enough to keep fiddling with it. I've also played a bit with the Net::DNS perl modules and the author-supplied mresolv and mresolv2 scripts, but none of the perl "dns testers" could generate a meaningful amount of load. I was left back where I started: with dnsfilter.

Dnsfilter is a C program supplied with djbdns that takes a list of IPs and does lookups on them. It writes its output to STDOUT, and I piped all the output to files to verify the accuracy of the results. After much testing of dnsfilter and its limitations, I deduced that setting its number of parallel lookups higher than 100 effectively chokes it after around 12,000 very quick queries. Keeping the number low prevents that. I ran most tests at the default value of 10 parallel lookups unless otherwise noted. The only reason to use a -c value higher than 100 is when querying real-world data, where you need many more parallel lookups because you'll have a high number of timeouts and other failures.
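For anyone who hasn't used it, dnsfilter's invocation is about as simple as it gets. A sketch; note that it resolves through whatever nameserver /etc/resolv.conf points at, which is how it exercises the cache under test:

    # dnsfilter reads IP addresses (one per line) from stdin and writes each
    # line back out with the resolved hostname appended:
    echo 216.122.0.1 | dnsfilter
    # through a cache forwarding to walldns, expect something like:
    #   216.122.0.1=1.0.122.216.in-addr.arpa

    # -c raises the number of parallel lookups (default 10):
    dnsfilter -c 100 < iplist.wall > out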

What follows is the output of my first batch of tests. I ran the following command three times for each DNS cache: "time dnsfilter < iplist.wall > out[1-3]". The first run reflects the cache's need to fetch the results from the DNS wall and return them to the client. The two subsequent runs reflect the cache's ability to serve results from its cache. The file iplist.wall simply contains 65,536 IP addresses representing the class B address space of 216.122.0.0.
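Generating the list is trivial. A sketch using jot (the BSD counterpart to seq; "jot 256 0" prints the integers 0 through 255):

    # Generate all 65,536 addresses in 216.122.0.0/16, one per line,
    # in the format dnsfilter expects on stdin:
    for a in `jot 256 0`; do
        for b in `jot 256 0`; do
            echo "216.122.$a.$b"
        done
    done > iplist.wall

    # each timed run then looks like:
    time dnsfilter < iplist.wall > out1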

Name Server            Run  Time(s)  qps
dnscache - 290MB RAM   1    26       2520
                       2    26       2520
                       3    26       2520
BIND 8 - 8MB RAM       1    18       3640
                       2    39       1680
                       3    39       1680
BIND 9 - 12MB RAM      1    79       830
                       2    29       2260
                       3    29       2260

Memory usage isn't meaningful for dnscache, as it's a startup parameter: you tell it how big a cache to maintain, and once it's full it throws out the oldest entries. I consider that better than allowing your cache to grow until it exhausts all your physical RAM and swap (which I do later :-)). Between the BINDs, version 8 starts out with 2MB and version 9 with 4MB. After the 65,536 queries, v9 has grown by 8MB while v8 has only grown by 6MB. Apparently v8 is more memory-efficient in how it stores cached data.
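For the curious, that 290MB figure is set through the service's environment directory in a stock dnscache-conf layout. A sketch; DATALIMIT (the softlimit -d value applied in the run script) has to be somewhat larger than CACHESIZE to leave room for the rest of the process:

    # Give dnscache a ~290MB cache, then restart to pick up the change:
    echo 290000000 > /service/dnscache/env/CACHESIZE
    echo 300000000 > /service/dnscache/env/DATALIMIT
    svc -t /service/dnscache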

I went back and re-tested these runs a couple of times because the results just didn't seem right. In every case, all three DNS caches resolved all 65,536 IP addresses correctly. What I found oddest was that BIND 8 served the results faster when it didn't cache them. :-| That little revelation was quite a surprise. What it ends up showing is that BIND 9 and dnscache both have a faster cache storage/lookup algorithm: v8 was the fastest at resolving uncached queries, and v9 was the slowest.

The next thing I did was spread out the client load. I split the file "iplist.wall" into three equal-sized chunks and copied them to three servers with dnsfilter installed (hardware specs the same for all DNS client servers), so each DNS client was responsible for looking up roughly 21,800 IP addresses. I then executed the following command on all three servers at the same time: "time dnsfilter < /usr/iplist.wall.seg > /usr/out[1-3]". Client time is the combined time spent by all three clients looking up data; time is the elapsed time taken to run the test.
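The split itself was mechanical. A sketch, with hypothetical client hostnames:

    # Split the 65,536-line list into three roughly equal chunks
    # (seg.aa, seg.ab, seg.ac), one per client machine:
    split -l 21846 iplist.wall seg.

    # copy one chunk to each dns client (hostnames are placeholders):
    scp seg.aa client1:/usr/iplist.wall.seg
    scp seg.ab client2:/usr/iplist.wall.seg
    scp seg.ac client3:/usr/iplist.wall.seg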

Here are the results:

Name Server            Run  Client time(s)  Time(s)  qps
dnscache - 290MB RAM   1    200             67       981
                       2    93              31       2112
                       3    86              29       2286
BIND 8 - 8MB RAM       1    51              17       3855
                       2    114             38       1725
                       3    114             38       1725
BIND 9 - 12MB RAM      1    239             80       822
                       2    82              27       2397
                       3    81              27       2427

I've run the tests a couple more times and got similar results. I'm fairly confident that I've reached the maximum abilities of each DNS server on the current hardware, and I'm also quite confident that the testing is yielding accurate results.

Next, I'm going to run another battery of tests against our production servers, resolving the entire class B. I believe the results of that testing will also be quite valuable, as it will show how each DNS server deals with timeouts and lookup failures.


Last modified on 4/26/05.