DekGenius.com
I l@ve RuBoard Previous Section Next Section

14.2 Troubleshooting Tools and Techniques

We went over nslookup, dig, and the name server's debugging output in the last two chapters. Before we go on, let's introduce some new tools that can be useful in troubleshooting: named-xfer, name server database dumps, and query logging.

14.2.1 How to Use named-xfer

named-xfer is the program that BIND 4 and 8 name servers start to perform zone transfers. (BIND 9 name servers, you'll remember, are multithreaded, so they don't need a separate program to do inbound zone transfers: they just start a new thread.) named-xfer checks whether the slave's copy of the zone data is up to date and transfers a new zone if necessary. (In Versions 4.9 and 8, named checks if a zone is up to date first, to avoid starting up a child process when no transfer is necessary.)

In Chapter 13, we showed you the debugging output a BIND 8 slave name server logged as it checked its zone. When the slave server transferred the zone, it started a child process (named-xfer) to pull the data to the local filesystem. We didn't tell you, however, that you can also start named-xfer manually instead of waiting for named to start it, and that you can tell it to produce debugging output independently of named.

This can be useful if you're tracking down a problem with zone transfers but don't want to wait for named to schedule one. To test a zone transfer manually, you need to specify a number of command-line options:

% /usr/sbin/named-xfer
Usage error: no domain
Usage: named-xfer
        -z zone_to_transfer
        -f db_file
        [-i ixfr_file]
        [-s serial_no]
        [-d debug_level]
        [-l debug_log_file]
        [-t trace_file]
        [-p port]
        [-S] [-Z]
        [-C class]
        [-x axfr-src]
        [-T tsig_info_file]
        servers [-ixfr|-axfr]...

This is the output from a BIND 8.2.3 version of named-xfer. Earlier versions of named-xfer won't have all of these options.

When named starts named-xfer, it specifies the -z option (the zone named wants to check), the -f option (the name of the zone data file that corresponds to the zone, from named.boot or named.conf ), the -s option (the zone's serial number on the slave from the current SOA record), and the addresses of the servers the slave was instructed to load from (the IP addresses from the masters substatement in the zone statement in named.conf, or from the secondary directive in named.boot). If named is running in debug mode, it also specifies the debug level for named-xfer with the -d option. The other options aren't usually necessary to troubleshoot problems; they have to do with incremental zone transfers, TSIG signing zone transfers, and such.

When you run named-xfer manually, you can also specify the debug level on the command line with -d. (Don't forget, though, that debug levels above 3 will produce tons of debugging output if the transfer succeeds!) You can also specify an alternate filename for the debug file with the -l option. The default log file is /var/tmp/xfer.ddt.XXXXXX, where XXXXXX is a suffix appended to preserve uniqueness or a file by the same name in /usr/tmp. And you can specify the name of the host to load from instead of its IP address.

For example, with the following command line, you can see whether zone transfers from terminator.movie.edu are working:

% /usr/sbin/named-xfer -z movie.edu -f /tmp/db.movie -s 0 terminator
% echo $?
4

In this command, we specified a serial number of zero because we wanted to force named-xfer to attempt a zone transfer even if it wasn't needed. Zero is a special serial number—named-xfer will transfer the zone regardless of the actual zone serial number. Also, we told named-xfer to put the new zone data file in /tmp rather than overwriting the zone's working zone data file.

We can tell if the transfer succeeded by looking at named-xfer 's return value. If you're running BIND Version 8.1.2 or older, your named-xfer has four possible return values:

0

The zone data is up to date and no transfer was needed.

1

Indicates a successful transfer.

2

The host(s) named-xfer queried can't be reached, or an error occurred and named-xfer may have logged an error message to syslog.

3

An error occurred and named-xfer logged an error message to syslog.

As of BIND 8.2, three new return values have been added to accommodate incremental zone transfers:

4

Indicates a successful AXFR (full) zone transfer.

5

Indicates a successful IXFR (incremental) zone transfer.

6

Indicates that the master name server returned an AXFR to named-xfer 's IXFR request.

It's perfectly legal for a name server—even one that supports IXFR—to return a full zone transfer to a request for an incremental zone transfer. For example, the master name server may be missing part of the record of the changes made to the zone.

Note that BIND 8.2 and later named-xfer s don't use return value 1 anymore. Return value 1 has been replaced by return values 4 through 6.

14.2.2 What If I Don't Have named-xfer?

If you've upgraded to BIND 9 and don't have a named-xfer binary, you can still use nslookup or dig to do a zone transfer. Either query tool will give you some of the information that named-xfer would have given you.

For example, to use dig to do the same zone transfer we showed you earlier, you could run:

% dig @terminator.movie.edu movie.edu axfr

With nslookup, you could change your name server and use the ls -d command from interactive mode.

Unfortunately, both dig and nslookup are more subtle than named-xfer is in reporting errors. If nslookup can't transfer a zone, it usually reports an "unspecified error":

> ls movie.edu
[terminator.movie.edu]
*** Can't list domain movie.edu: Unspecified error

This could be caused by an allow-transfer access list, the fact that terminator.movie.edu isn't actually authoritative for movie.edu, or a number of other problems. To tell which, you may just have to send other, related queries or check the syslog output on the master name server.

14.2.3 How to Read a Database Dump

Poring over a dump of the name server's internal database—including cached information—can also help you track down problems. The ndc dumpdb or rndc dumpdb command causes named to dump its authoritative data, cached data, and hints data to named_dump.db in BIND's working directory (or in /usr/tmp/named_dump.db or /var/tmp/named_dump.db, for BIND 4).[1] An example of a named_dump.db file follows. The authoritative data and cached entries, mixed together, appear first in the file. At the end of the file is the hints data:

[1] BIND 9.1.0 is the first version of BIND 9 to support dumping the database.

; Dumped at Tue Jan  6 10:49:08 1998
;; ++zone table++
; 0.0.127.in-addr.arpa (type 1, class 1, source db.127.0.0)
;   time=0, lastupdate=0, serial=1,
;   refresh=0, retry=3600, expire=608400, minimum=86400
;   ftime=884015430, xaddr=[0.0.0.0], state=0041, pid=0
;; --zone table--
; Note: Cr=(auth,answer,addtnl,cache) tag only shown for non-auth RR's
; Note: NT=milliseconds for any A RR which we've used as a nameserver
; --- Cache & Data ---
$ORIGIN .
.   518375  IN      NS  G.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  J.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  K.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  L.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  M.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  A.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  H.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  B.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  C.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  D.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  E.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  I.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
    518375  IN      NS  F.ROOT-SERVERS.NET.   ;Cr=auth [128.8.10.90]
EDU  86393  IN      SOA A.ROOT-SERVERS.NET.  hostmaster.INTERNIC.NET. (
         1998010500 1800 900 604800 86400 )   ;Cr=addtnl [128.63.2.53]
$ORIGIN  0.127.in-addr.arpa.
0        IN    SOA cujo.movie.edu. root.cujo.movie.edu. (
         1998010600 10800 3600 608400 86400 )        ;Cl=5
         IN    NS  cujo.movie.edu.   ;Cl=5
$ORIGIN  0.0.127.in-addr.arpa.
1        IN    PTR localhost.    ;Cl=5
$ORIGIN EDU.
PURDUE   172787  IN  NS  NS.PURDUE.EDU.           ;Cr=addtnl [192.36.148.17]
         172787  IN  NS  MOE.RICE.EDU.            ;Cr=addtnl [192.36.148.17]
         172787  IN  NS  PENDRAGON.CS.PURDUE.EDU.  ;Cr=addtnl [192.36.148.17]
         172787  IN  NS  HARBOR.ECN.PURDUE.EDU.    ;Cr=addtnl [192.36.148.17]
$ORIGIN  movie.EDU.
;cujo    593     IN  SOA  A.ROOT-SERVERS.NET. hostmaster.INTERNIC. NET. (
;        1998010500 1800 900 604800 86400 );EDU.; NXDOMAIN  ;-$
   ;Cr=auth [128.63.2.53]
$ORIGIN   RICE.EDU.
MOE      172787  IN  A   128.42.5.4        ;NT=84 Cr=addtnl [192.36.148.17]
$ORIGIN   PURDUE.EDU.
CS       86387   IN  NS  pendragon.cs.PURDUE.edu.    ;Cr=addtnl [128.42.5.4]
         86387   IN  NS  ns.PURDUE.edu.              ;Cr=addtnl [128.42.5.4]
         86387   IN  NS  harbor.ecn.PURDUE.edu.      ;Cr=addtnl [128.42.5.4]
         86387   IN  NS  moe.rice.edu.               ;Cr=addtnl [128.42.5.4]
NS       172787   IN  A  128.210.11.5        ;NT=4 Cr=addtnl [192.36.148.17]
$ORIGIN   ECN.PURDUE.EDU.
HARBOR   172787  IN   A  128.46.199.76       ;NT=6 Cr=addtnl [192.36.148.17]
$ORIGIN   CS.PURDUE.EDU.
galt     86387   IN   A  128.10.2.39                 ;Cr=auth [128.42.5.4]
PENDRAGON  172787  IN  A  128.10.2.5         ;NT=20 Cr=addtnl [192.36.148.17]
$ORIGIN   ROOT-SERVERS.NET.
K        604775    IN  A  193.0.14.129       ;NT=10 Cr=answer [128.8.10.90]
A        604775    IN  A  198.41.0.4         ;NT=20 Cr=answer [128.8.10.90]
L        604775    IN  A  198.32.64.12       ;NT=8 Cr=answer [128.8.10.90]
B        604775    IN  A  128.9.0.107        ;NT=9 Cr=answer [128.8.10.90]
M        604775    IN  A  202.12.27.33       ;NT=20 Cr=answer [128.8.10.90]
C        604775    IN  A  192.33.4.12        ;NT=17 Cr=answer [128.8.10.90]
D        604775    IN  A  128.8.10.90        ;NT=11 Cr=answer [128.8.10.90]
E        604775    IN  A  192.203.230.10     ;NT=9 Cr=answer [128.8.10.90]
F        604775    IN  A  192.5.5.241        ;NT=73 Cr=answer [128.8.10.90]
G        604775    IN  A  192.112.36.4       ;NT=14 Cr=answer [128.8.10.90]
H        604775    IN  A  128.63.2.53        ;NT=160 Cr=answer [128.8.10.90]
I        604775    IN  A  192.36.148.17      ;NT=102 Cr=answer [128.8.10.90]
J        604775    IN  A  198.41.0.10        ;NT=21 Cr=answer [128.8.10.90]
; --- Hints ---
$ORIGIN .
.   3600           IN  NS  A.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  B.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  C.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  D.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  E.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  F.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  G.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  H.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  I.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  J.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  K.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  L.ROOT-SERVERS.NET.     ;Cl=0
    3600           IN  NS  M.ROOT-SERVERS.NET.     ;Cl=0
$ORIGIN   ROOT-SERVERS.NET.
K     3600         IN   A  193.0.14.129      ;NT=11 Cl=0
L     3600         IN   A  198.32.64.12      ;NT=9 Cl=0
A     3600         IN   A  198.41.0.4        ;NT=10 Cl=0
M     3600         IN   A  202.12.27.33      ;NT=11 Cl=0
B     3600         IN   A  128.9.0.107       ;NT=1288 Cl=0
C     3600         IN   A  192.33.4.12       ;NT=21 Cl=0
D     3600         IN   A  128.8.10.90       ;NT=1288 Cl=0
E     3600         IN   A  192.203.230.10    ;NT=19 Cl=0
F     3600         IN   A  192.5.5.241       ;NT=23 Cl=0
G     3600         IN   A  192.112.36.4      ;NT=18 Cl=0
H     3600         IN   A  128.63.2.53       ;NT=11 Cl=0
I     3600         IN   A  192.36.148.17     ;NT=21 Cl=0
J     3600         IN   A  198.41.0.10       ;NT=13 Cl=0

The name server that created this named_dump.db file was authoritative only for 0.0.127.in-addr.arpa. Only two names have been looked up by this server: galt.cs.purdue.edu and cujo.movie.edu. In the process of looking up galt.cs.purdue.edu, this server cached not only the address of galt, but also the list of name servers for purdue.edu and the addresses for those servers. The name cujo.movie.edu, however, doesn't really exist (nor does the zone movie.edu, except in our examples), so the server cached the negative response. In the dump file, the negative response is commented out (the line starts with a semicolon), and the reason is listed (NXDOMAIN) instead of real data. You'll notice the TTL is quite low (593). On BIND 8.2 and later name servers, negative responses are cached according to the last field in the SOA record, which is usually much smaller than the default TTL for the zone.

The hints section at the bottom of the file contains the data from the db.cache file. The TTL of the hints data is decremented, and it may go to zero, but the hints are never discarded.

Note that some of the resource records are followed by a semicolon and NT=. You will only see these on the address records of name servers. The number is the roundtrip time calculation that the name server keeps so that it knows which name servers have responded most quickly in the past; the name server with the lowest roundtrip time will be tried first the next time.

The cached data is easy to pick out—those entries have a credibility tag (Cr =) and (sometimes) the IP address of the server the data came from.[2] The zone data and hint data are tagged with Cl=, which is just a count of the level in the domain tree (the root is level 0, foo would be level 1, foo.foo would be level 2, etc.). Let's digress a moment to explain the concept of credibility.

[2] The name server prints the IP address of the remote name server if it's available. On BIND 8.2 and later name servers, the IP address is available only if you've turned on host-statistics, which we introduced in Chapter 8. On earlier BIND 4.9 and BIND 8 name servers, it's on by default. host-statistics keeps impressive statistics on every name server and resolver you've ever communicated with, which is very useful for some purposes (like figuring out which name server your server got a record from), but consumes a fair amount of memory.

One of the advances between Versions 4.8.3 and 4.9 was the addition of a credibility measure. This allows a name server to make more intelligent decisions about what to do with new data from a remote server.

A 4.8.3 name server had only two credibility levels—locally authoritative data and everything else. The locally authoritative data was data from your zone data files—your name server knew better than to update its internal copy of what came from your zone file. But all data from remote name servers was considered equal.

Here is a situation that could happen and the way a 4.8.3 server would deal with it. Suppose that your server looked up an address for terminator.movie.edu and received an authoritative answer from the movie.edu name server. (Remember, an authoritative answer is the best you can get.) Sometime later, while looking up foo.oreilly.com, your server receives another address record for terminator.movie.edu, but this time as part of the delegation info for oreilly.com (which terminator.movie.edu is a slave for). The 4.8.3 name server would update the cached address record for terminator.movie.edu, even though the data came from the com name server instead of the authoritative movie.edu name server. Of course, the com and movie.edu name servers will have exactly the same data for terminator.movie.edu, so this won't be a problem, right? Yeah, and it never rains in southern California, either.

A 4.9 or newer name server is more intelligent. Like a 4.8.3 name server, it still considers your zone data unassailable—beyond any doubt. But a 4.9 or newer name server distinguishes among the different data from remote name servers. Here is the hierarchy of remote data credibility from most credible to least:

auth

These records are data from authoritative answers—the answer section of a response message with the authoritative answer bit set.

answer

These records are data from nonauthoritative, or cached, answers—the answer section of a response message without the authoritative answer bit set.

addtnl

These records are data from the rest of the response message—the authority and additional sections. The authority section of the response contains NS records that delegate a zone to an authoritative name server. The additional section contains address records that may complete information in other sections (e.g., address records that go with NS records in the authority section).

There is one exception to this rule: when the name server is priming its root name server cache, the records that would be at credibility addtnl are bumped up to answer to make them harder to change accidentally. Notice in the dump that the address records for root name servers are at credibility answer, but the address records for the purdue.edu name servers are at credibility addtnl.

In the situation just described, a 4.9 or newer name server would not replace the authoritative data (credibility = auth) for terminator.movie.edu with the delegation data (credibility = addtnl ) because the authoritative answer has higher credibility.

14.2.4 Logging Queries

BIND Version 4.9 added a feature called query logging that can be used to help diagnose certain problems. When query logging is turned on, a running name server will log every query to syslog. This feature could help you find resolver configuration errors because you can verify that the name you think is being looked up really is the name being looked up.

First you must make sure that LOG_INFO messages are being logged by syslog for the facility daemon. Next, you need to turn on query logging. This can be done in several ways: for BIND 4.9, setoptions query-log in your name server boot file; for BIND 4.9 or BIND 8, start the name server with -q on the command line or send an ndc querylog command to a running name server. For BIND 9.1.0 or later (earlier versions don't support query logging), use rndc querylog. You'll start seeing syslog messages like this:

Feb 20 21:43:25 terminator named[3830]:
                     XX+ /192.253.253.2/carrie.movie.edu/A
Feb 20 21:43:32 terminator named[3830]:
                     XX+ /192.253.253.2/4.253.253.192.in-addr.arpa/PTR

Or, if you're running BIND 9, like this:

Jan 13 18:32:25 terminator named[13976]: info: client 192.253.253.2#1702: query: 
carrie.movie.edu IN A
Jan 13 18:32:42 terminator named[13976]: info: client 192.253.253.2#1702: query: 
4.253.253.192.in-addr.arpa IN PTR

These messages include the IP address of the host that made the query, as well as the query itself. Since the first example comes from a BIND 8.2.3 name server and these queries are recursive, they begin with XX+. Iterative queries begin with just XX. (Name servers older than BIND 8.2.1 don't distinguish recursive from nonrecursive queries.) Inverse queries have a dash before the query type (e.g., an inverse query for an address record is logged as "-A" instead of just "A"). After enough queries have been logged, you can turn off query logging by sending another ndc querylog or rndc querylog command to your name server.

If you're stuck running an older BIND 9 name server, you can still see the queries received in named 's debugging output at level 1.

    I l@ve RuBoard Previous Section Next Section