Index of /~rousskov/research/cache/sanitar

      Name                    Last modified       Size  Description

[DIR] Parent Directory        26-Aug-1998 23:56      -
[TXT] MD5.pm                  14-Oct-1997 11:12    10k
[TXT] MD5test.pl              14-Oct-1997 09:47     1k
[TXT] sanitar.pl              29-Oct-1997 22:43     2k

"sanitar.pl" can be used to MD5-encode whitespace-separated ('\s') fields in
an ASCII log file (e.g. access.log or store.log). The script is based on a
pure-Perl MD5 implementation.

Grab these three files:

    MD5.pm                 14-Oct-97 11:12    10K  // MD5 implementation
    MD5test.pl             14-Oct-97 09:47     1K  // Run this one first!
    sanitar.pl             14-Oct-97 10:57     2K  // the script you need
  
After testing MD5 on your machine with "MD5test.pl" (in case I did something
non-portable), you can use "sanitar.pl" to anonymize any field(s) in your log
file. Specify field numbers as parameters; numbering starts at 0. When
anonymizing a client IP, add 'ip' as the first parameter to MD5 only the last
two digits of the IP address. For example,

$ gunzip -c access.log.gz | sanitar.pl ip 2 | gzip -c > access-sanitized.log.gz
  
will MD5 just the last two digits of the client IP in a Squid "access.log" file
(third field => field# == 2).
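
For readers who want to see the idea behind field anonymization, here is a
minimal sketch in Python. It is not the code of "sanitar.pl": the function
name anonymize_fields and the sample log line are made up for illustration,
it uses Python's built-in hashlib instead of "MD5.pm", it hashes each selected
field in full (the 'ip' mode is not reproduced), and it assumes that rejoining
fields with single spaces is acceptable.

```python
import hashlib


def anonymize_fields(line, field_numbers):
    """Replace the selected whitespace-separated fields with MD5 hex digests.

    field_numbers are 0-based, as in the description above. Hypothetical
    helper for illustration only; runs of whitespace collapse to one space.
    """
    fields = line.split()
    for n in field_numbers:
        if 0 <= n < len(fields):  # silently skip field numbers past the end
            digest = hashlib.md5(fields[n].encode("ascii", "replace"))
            fields[n] = digest.hexdigest()
    return " ".join(fields)


# Made-up log line: the third field (field# == 2) holds the client address.
sample = "877000000.123 120 10.0.0.1 TCP_MISS/200 1024 GET http://example.com/"
print(anonymize_fields(sample, [2]))
```

The same input always yields the same digest, so requests from one client can
still be grouped together after anonymization.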

It runs a bit slowly, but, hey, this is a pure-Perl MD5...


Hope this helps,

Alex.
http://www.cs.ndsu.nodak.edu/~rousskov/

P.S. If you want to add extra security to MD5, please read the "security" note
at the top of "MD5.pm". You can also change the length of an MD5 digest by
modifying the constants at the top of "sanitar.pl".