Apache Server
Piped Error Logs

As every system administrator knows, hackers are forever trying to break into your system. Just look at your error logs — they are filled with references to "File does not exist:" followed by some filename and/or directory which you've never heard of and likely isn't even relevant to your operating system. In the meantime, the valid error messages about which you need to do something are lost in the flood of messages due to Nimda and other attackers. It's time to take back your error logs.

Filtered Pipes

There is a way to filter out all those hacker-generated error messages — it's called piped error logs. The ErrorLog directive in the Apache documentation mentions that you can pipe the error messages to a script, but doesn't describe the process in any more detail, and alas, the devil is in the details.

A common way to invoke such a script is to set the ErrorLog directive as follows:

ErrorLog "|/path/to/script/script >>/path/to/logfile/logfile"

For example, I use

ErrorLog "|/www/cgi-src/errorlog.pl >>/www/logs/error.log"

Note the leading "|" symbol which says to send (in technical terms pipe, as a verb) each error message to the named script, which in my case is a Perl script called errorlog.pl. The script receives the error message via Standard Input (STDIN) and sends its result to Standard Output (STDOUT). Because we follow the call to the script in the ErrorLog directive with ">>/www/logs/error.log", the script's output is appended to the file error.log.
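
The ">>" redirection in the directive above relies on a shell to interpret it. The Apache 2.4 documentation on piped logs says that the command is spawned without a shell unless the pipe is written as "|$", so on such a server the script may need to open the log file itself. The following is a minimal sketch of that alternative and is not part of the original script. The sub name copy_to_log is hypothetical, and the path in the closing comment is simply the one used above.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use IO::Handle;                 # provides the autoflush method on handles

# Hypothetical helper: append every line read from $in to the file $path.
sub copy_to_log {
    my ($in, $path) = @_;
    open my $log, '>>', $path or die "Cannot append to $path: $!";
    $log->autoflush(1);         # write each message to the file immediately
    my $count = 0;
    while (my $line = <$in>) {
        print {$log} $line;
        $count++;
    }
    close $log or die "Cannot close $path: $!";
    return $count;              # number of lines copied
}

# In a real piped-log script the call would be something like:
#   copy_to_log(\*STDIN, '/www/logs/error.log');
```

With a script of this kind, the ErrorLog directive would name only the script, with no redirection after it.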

However, there are two gotchas.

Running The Script

The first gotcha concerns how the Apache Server interacts with the script. The server invokes the script whenever the server starts. If your script were to wait for one error message on STDIN, act on it, and then terminate, it wouldn't be around to process the next error message. The script must therefore loop continuously, reading from STDIN and terminating only when it sees end-of-file (EOF), which happens when the server closes its end of the pipe. Thus, the server controls the script entirely through STDIN: the script waits for a line of input, processes it, and then waits for the next line.

Using the scripting language Perl, this translates to surrounding the non-static portion of the script with something like

while (<STDIN>) { ... }

When the server has an error message to log, it writes the message to the script's STDIN; the script wakes up and does its thing. The script then returns to the while statement and waits for more input, which arrives only when the server has another error message. The loop repeats in this way until the server terminates, at which point the script has an opportunity to clean up.
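
The life cycle just described can be sketched as follows. This is a minimal illustration, not the author's script. So that it runs without a server, the loop reads from an in-memory filehandle standing in for STDIN. The sub name run_filter, the message counter, and the closing comment line it prints are all hypothetical.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical helper: process lines from $in until end of file,
# then run cleanup code. In a real piped-log script $in is \*STDIN.
sub run_filter {
    my ($in, $out) = @_;
    my $seen = 0;
    while (my $line = <$in>) {  # waits here until a line or EOF arrives
        $seen++;
        print {$out} $line;     # "does its thing": here, pass it through
    }
    # The loop ends only at EOF, i.e. when the writer closes the pipe.
    print {$out} "# filter exiting after $seen messages\n";
    return $seen;
}

# Stand-in for the server: two messages, then end of input.
my $messages = "[error] first message\n[error] second message\n";
open my $in, '<', \$messages or die "Cannot open in-memory input: $!";
run_filter($in, \*STDOUT);
```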

In the simplest case, you might think that the entire program can be as short as

#!/usr/local/bin/perl
print while <STDIN>;

Where's My Output?

However, were you to try this program you would find that it appears not to work, which brings us to the second gotcha: buffered output. In short, Perl (as with many other languages) doesn't display its output immediately — instead, the output is cached (in a buffer) and is spit out only when the buffer fills (or is flushed). Thus, the program appears not to be working only because there is no output. In fact, it is working, but like a squirrel with a nut in its cheek, it just hasn't displayed anything as yet. The solution is to tell Perl to use unbuffered output, adding one more statement which, finally, produces a working program:

#!/usr/local/bin/perl
$|=1;                   # Use unbuffered output
print while <STDIN>;
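
The buffering effect can be observed directly. The sketch below is an illustration only and assumes Perl's default buffered I/O layer. It writes to a temporary file rather than to STDOUT, and the helper name size_of is hypothetical. It uses the autoflush method from the standard IO::Handle module, which turns on for one handle the same behavior that $|=1 turns on for the currently selected handle.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Handle;

# Hypothetical helper: current size in bytes of the file at $path.
sub size_of { my ($path) = @_; return (stat $path)[7]; }

my ($fh, $path) = tempfile(UNLINK => 1);

print {$fh} "first message\n";      # stays in Perl's buffer for now
my $before = size_of($path);        # so nothing has reached the file yet

$fh->autoflush(1);                  # per-handle equivalent of $|=1
print {$fh} "second message\n";     # written through immediately
my $after = size_of($path);

print "bytes in file before: $before, after: $after\n";
```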

Working Filters

Now that we know how the script works, let's add some code to filter out unwanted error messages. At this point, the problem is down to how to identify unwanted messages, for example,

[Sat Jan 1 06:10:30 2005] [error] [client 203.200.203.29] user not found: /_vti_bin/_vti_aut/author.exe

One way is to examine the file mentioned at the end of the message. For example, you might be running a Unix-like system and the files typical to a Windows system have no meaning on your system. Thus you might want to filter out messages which end with .exe or .dll or .asp. More specifically, you might want to filter out files with explicit names, such as cmd.exe, root.exe, wpad.dat, or formmail.pl, to name a few.

Another way is to filter out error messages which name a specific directory, such as ads, banners, free, passport, etc.

The following code implements these ideas. However, you should review your error logs regularly to see what new tricks the black hat hackers have up their sleeves. The following is my current collection:

#!/usr/local/bin/perl

# Skip those error messages which end with any of the following.
# (Illustrative list built from the examples named in the text.)
$end = '\.exe \.dll \.asp wpad\.dat formmail\.pl';
$end = join '|', split ' ', $end;       # Replace spaces with '|'

# Skip those error messages which reference any of the following directories:
$dir = 'ads tour slv html\.ng js\.ng passport m download adi adj us\.yimg\.com free '
     . 'banners dynamic img caes ice config espana\.starmedia\.com galeria imag '
     . 'producto w3c officescan';

$dir = join '/|/', split ' ', "/$dir/"; # Replace spaces with interior '/|/', and
                                        # put in leading and trailing directory markers

$|=1;                                   # Use unbuffered output

while (<STDIN>)                         # Loop through STDIN
{
    $Msg = $_;                          # Capture the line of input
    if ($Msg !~ m!($end)$!io            # If it doesn't match the forbidden endings,
     && $Msg !~ m!($dir)!io)            #   and it doesn't match the forbidden directories, ...
    {
        print $Msg;                     # Print it
    }
}
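
To see what the join/split idiom produces, here is a small sketch that applies it to shortened lists and tries the result on the sample message shown earlier. It is an illustration under stated assumptions, not the author's code: the sub name is_wanted, the shortened lists, and the second sample line are all made up for the example.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Shortened, illustrative lists; entries are regular-expression fragments.
my $end = join '|', split ' ', '\.exe \.dll \.asp wpad\.dat formmail\.pl';
my $dir = join '/|/', split ' ', '/ads banners passport/';
# $dir is now the string  /ads/|/banners/|/passport/

my $end_re = qr/(?:$end)$/i;    # forbidden endings, anchored at end of line
my $dir_re = qr/(?:$dir)/i;     # forbidden directories, anywhere in the line

# Hypothetical helper: true if the message should be kept in the log.
sub is_wanted {
    my ($msg) = @_;
    return $msg !~ $end_re && $msg !~ $dir_re;
}

my @samples = (
    "[error] [client 203.200.203.29] user not found: /_vti_bin/_vti_aut/author.exe\n",
    "[error] [client 192.0.2.1] File does not exist: /www/htdocs/missing.html\n",
);
print grep { is_wanted($_) } @samples;   # prints only the second line
```

Precompiling the patterns with qr// serves the same purpose as the /o modifier: each pattern is compiled once rather than on every pass through the loop.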

Other Scripting Languages

Although the above discussion uses Perl as the scripting language, any other programming language may be substituted. The two key elements are an infinite loop around a call to Standard Input and unbuffered output. Thereafter, regular expressions can easily filter out unwanted error messages.

For example, using PHP, the short version of a working program looks like

#!/usr/local/bin/php
<?php
$stdin = fopen ('php://stdin', 'r');
ob_implicit_flush (true);       // Use unbuffered output
while ($line = fgets ($stdin))
{
    print $line;
}
?>

When Not To Use This Technique

Keep in mind that each error message is a wake-up call about an attack. Before you choose to discard an error message, you need to be completely sure you have guarded against such an attack.

For example, you might know that a call to author.exe from a particular directory is bogus because the file doesn't exist in that directory, but does it exist in a different directory from which it might be a threat? Might you at some later time install, say, FrontPage, which will create the /_vti_bin/_vti_aut/author.exe file? And why are hackers targeting that file anyway? Is there something about it you don't know that makes it a security risk?

Be sure you answer these and other questions before adopting this technique.
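
One way to hedge against discarding something important is to divert the filtered messages to a second file instead of dropping them, so that they can still be reviewed later. The following is a sketch of that idea and is not part of the original script. The sub name route_messages, the noise pattern, and the choice of handles are all hypothetical.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical pattern for messages considered noise.
my $noise_re = qr/(?:\.exe|\.dll|\.asp)$/i;

# Hypothetical helper: copy each line from $in to $keep, or to $divert
# if it matches the noise pattern. Returns the (kept, diverted) counts.
sub route_messages {
    my ($in, $keep, $divert) = @_;
    my ($kept, $diverted) = (0, 0);
    while (my $line = <$in>) {
        if ($line =~ $noise_re) { print {$divert} $line; $diverted++; }
        else                    { print {$keep}   $line; $kept++;     }
    }
    return ($kept, $diverted);
}

# In a piped-log script, $in would be \*STDIN, $keep would be \*STDOUT, and
# $divert a separately opened file, each with unbuffered output.
```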

Miscellaneous

Obviously, the same idea can be extended to other log files such as AgentLog, RefererLog, and TransferLog using a script tailored to the particular log type and its contents.
Because the Apache Server invokes the script at startup and runs it continuously, when you make a change to the script you must remember to restart the server (presumably gracefully so as not to drop existing connections).
If you have access to the httpd.conf file and have a spare virtual domain name, it is a good idea to test changes to your script there, rather than on a live system.
Assuming you don't name the language interpreter explicitly in the ErrorLog directive ("|/www/..." vs. "|perl /www/..."), don't forget to set execute permissions on the script (chmod 0755 errorlog.pl).

Author

This page was created by Bob Smith. Please send any questions or comments about it to me.

Acknowledgments

Thanks to Rex Swain for his helpful comments.

NARS2000 © 2006-2020