Recipe 3.10 Logging Hostnames Instead of IP Addresses

Problem

You want to see hostnames in your activity log instead of IP addresses.

Solution

You can let the web server resolve the hostname when it processes the request by enabling runtime lookups with the Apache directive:

HostnameLookups On

Or, you can let Apache use the IP address during normal processing and let a piped logging process resolve it as part of recording the entry:

HostnameLookups Off
CustomLog "| /path/to/logresolve -c >> /path/to/logs/access_log.resolved" combined

Or, you can let Apache use and log the IP addresses, and resolve them later when analyzing the logfile. Add this to httpd.conf:

CustomLog /path/to/logs/access_log.raw combined

And analyze the log with:

% /path/to/logresolve -c < access_log.raw > access_log.resolved

Discussion

The Apache activity logging mechanism can record either the client's IP address or its hostname (or both). The server already has the IP address, but logging the hostname requires a DNS lookup to turn that address into a name. This can seriously affect the server's performance, because a server child or thread that is waiting on the name service isn't handling client requests. One alternative is to have the server record only the client's IP address and resolve the address to a name during logfile postprocessing and analysis. At the very least, you can defer the resolution to a separate process so that the overhead doesn't directly tie up the web server.
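To get a feel for how long a single reverse lookup keeps a caller waiting, you can time one outside the server. The following Python sketch is not part of Apache; timed_lookup and its lookup parameter are names made up for illustration, and the only real call it relies on is the standard library's socket.gethostbyaddr.

```python
import socket
import time


def timed_lookup(ip, lookup=socket.gethostbyaddr):
    """Return (hostname_or_None, elapsed_seconds) for one reverse lookup.

    `lookup` is injectable (an assumption of this sketch) so the timing
    logic can be exercised without touching the network.
    """
    start = time.perf_counter()
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
        name = lookup(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError
        name = None
    elapsed = time.perf_counter() - start
    return name, elapsed
```

Calling timed_lookup on a handful of addresses taken from your own log gives a rough idea of the delay each request would incur with HostnameLookups On. The result depends heavily on your resolver and its cache, so treat it as an indication only.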

In theory this is an excellent choice; in practice, however, there are some pitfalls. For one thing, the logresolve application included with Apache (usually installed in the bin/ subdirectory under the ServerRoot) resolves only IP addresses that appear at the very beginning of the log entry, so it isn't very flexible if you want to use a nonstandard format for your logfile. For another, if too much time passes between the collection and the resolution of the IP addresses, the DNS may have changed enough that the results are misleading or incorrect. This is especially a problem with dynamically allocated IP addresses such as those issued by ISPs.
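If your log format doesn't put the address first, one option is a small resolver of your own. The following Python sketch is an illustration, not a replacement for logresolve: the names IPV4_RE, reverse_lookup, and make_resolver are hypothetical, it handles IPv4 addresses only, and its simple pattern will also match anything else that looks like a dotted quad (a four-part version number in a User-Agent string, for instance). Lookups are cached so that each distinct address is resolved at most once while it stays in the cache.

```python
import re
import socket
from functools import lru_cache

# Dotted-quad IPv4 pattern, matched anywhere in the line.
# Assumption of this sketch: IPv6 addresses are left untouched.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def reverse_lookup(ip):
    """Return the hostname for ip, or ip itself if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return ip


def make_resolver(lookup=reverse_lookup, cache_size=4096):
    """Build a function that replaces every IPv4 address in a line.

    `lookup` and `cache_size` are illustrative parameters; the cache
    keeps repeat visitors from triggering repeated DNS queries.
    """
    cached = lru_cache(maxsize=cache_size)(lookup)

    def resolve_line(line):
        return IPV4_RE.sub(lambda match: cached(match.group(0)), line)

    return resolve_line
```

A resolver built with make_resolver() can be applied to each line of access_log.raw in turn. Addresses that fail to resolve are written back unchanged, so no entries are dropped.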

An additional shortcoming becomes apparent if you feed your log records directly to logresolve through a pipe: as of Apache 1.3.24 at least, logresolve doesn't flush its output buffers immediately, so there's the possibility of lost data if the logging process or the system should crash.
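One way to narrow that window in a piped logger of your own is to flush after every record. The following Python sketch shows the idea; copy_lines_flushed and its parameters are names made up for illustration, and the transform argument stands in for whatever per-line resolution you choose to do.

```python
def copy_lines_flushed(source, sink, transform=lambda line: line):
    """Write each transformed line to sink, flushing after every record.

    `source` is any iterable of lines and `sink` any object with write()
    and flush(), such as sys.stdin and sys.stdout in a piped logger.
    Returns the number of lines written.
    """
    count = 0
    for line in source:
        sink.write(transform(line))
        # Hand the record to the operating system right away instead of
        # letting it sit in this process's output buffer.
        sink.flush()
        count += 1
    return count
```

Flushing protects the records against a crash of the logging process itself. It doesn't force them onto disk, so a crash of the whole system can still lose the most recent entries, and the extra writes cost some throughput.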

See Also

  • The logresolve manpage:

    % man -M /path/to/ServerRoot/man logresolve.8