Index of /~rousskov/research/cache/sanitar
Name Last modified Size Description
MD5.pm 14-Oct-1997 11:12 10k
MD5test.pl 14-Oct-1997 09:47 1k
sanitar.pl 29-Oct-1997 22:43 2k
"sanitar.pl" can be used to MD5-encode whitespace-separated ('\s') fields in
an ASCII log file (e.g. access.log or store.log). The script is based on a
pure-Perl MD5 implementation.
Grab these three files:
MD5.pm 14-Oct-97 11:12 10K // MD5 implementation
MD5test.pl 14-Oct-97 09:47 1K // Run this one first!
sanitar.pl 14-Oct-97 10:57 2K // the script you need
After testing MD5 on your machine with "MD5test.pl" (in case I did something
non-portable), you can use "sanitar.pl" to anonymize any field(s) in your log
file. Specify field numbers as parameters, counting from 0. To MD5 only the
last two digits of a client IP address, add 'ip' as the first parameter.
For example,
$ gunzip -c access.log.gz | sanitar.pl ip 2 | gzip -c > access-sanitized.log.gz
will MD5 just the last two digits of the client IP in a Squid "access.log"
file (the client IP is the third field, so field# == 2).
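For readers who want to see the idea without opening the Perl source, here is
a rough sketch in Python of what such field anonymization might look like. It
is not a port of "sanitar.pl": the function names are made up for
illustration, "the last two digits" is assumed to mean the last two
dot-separated parts of the address, and 'ip' mode is applied to every listed
field for simplicity.

```python
import hashlib


def md5_hex(text):
    """Return the hexadecimal MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def anonymize_ip(ip, keep=2):
    """Hash only the trailing dot-separated parts of an IP address.

    Assumption: the last `keep` parts are hashed together and the
    leading parts stay readable. The real script may differ.
    """
    parts = ip.split(".")
    head, tail = parts[:-keep], parts[-keep:]
    return ".".join(head + [md5_hex(".".join(tail))])


def sanitize_line(line, fields, ip_mode=False):
    """MD5 the chosen 0-based, whitespace-separated fields of one log line.

    Fields are re-joined with single spaces, so the original spacing
    is not preserved. Field numbers past the end of the line are ignored.
    """
    columns = line.split()
    for index in fields:
        if index < len(columns):
            if ip_mode:
                columns[index] = anonymize_ip(columns[index])
            else:
                columns[index] = md5_hex(columns[index])
    return " ".join(columns)


def sanitize_stream(lines, fields, ip_mode=False):
    """Yield sanitized lines from any iterable of log lines."""
    for line in lines:
        yield sanitize_line(line, fields, ip_mode)
```

Calling sanitize_stream(sys.stdin, [2], ip_mode=True) would roughly mirror
the "ip 2" invocation above. Because the same input always hashes to the same
digest, requests from one client can still be grouped after sanitizing.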
It runs a bit slowly, but, hey, this is a pure-Perl MD5...
Hope this helps,
Alex.
http://www.cs.ndsu.nodak.edu/~rousskov/
P.S. If you want to add extra security to MD5, please read the "security" note
at the top of "MD5.pm". You can also change the length of an MD5 digest by
modifying the constants at the top of "sanitar.pl".
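To give a feel for what changing the digest length amounts to, here is a small
Python sketch that truncates the hex digest. The constant name DIGEST_LENGTH
and the function short_md5 are hypothetical; the actual constants in
"sanitar.pl" are not shown here and may work differently.

```python
import hashlib

# Hypothetical setting: number of hex characters to keep (1..32).
DIGEST_LENGTH = 8


def short_md5(text, length=DIGEST_LENGTH):
    """Return the first `length` hex characters of the MD5 digest of text."""
    digest = hashlib.md5(text.encode("utf-8")).hexdigest()
    return digest[:length]
```

Shorter digests keep the sanitized log smaller, but they also make it more
likely that two different values end up with the same digest.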