$Id: using.html,v 1.5 2003/02/18 20:45:43 jrennie Exp $
General information page: http://people.csail.mit.edu/jrennie/ifile.
Configure:
me% CC=gcc CFLAGS=-O ./configure
creating cache ./config.cache
checking for gcc... gcc
checking whether the C compiler (gcc -O ) works... yes
checking whether the C compiler (gcc -O ) is a cross-compiler... no
checking whether we are using GNU C... yes
checking whether gcc accepts -g... yes
checking for a BSD compatible install... /usr/bin/install -c
checking for ranlib... ranlib
checking for strchr... yes
checking how to run the C preprocessor... gcc -E
checking for alloca.h... no
checking for perl... /usr/bin/perl
updating cache ./config.cache
creating ./config.status
creating Makefile
configuring in argp
running /bin/sh ./configure --cache-file=.././config.cache --srcdir=.
loading cache .././config.cache
checking for gcc... (cached) gcc
checking whether the C compiler (gcc -O ) works... yes
checking whether the C compiler (gcc -O ) is a cross-compiler... no
checking whether we are using GNU C... (cached) yes
checking whether gcc accepts -g... (cached) yes
checking how to run the C preprocessor... (cached) gcc -E
checking for a BSD compatible install... (cached) /usr/bin/install -c
checking for ranlib... (cached) ranlib
checking for getopt.h... no
checking for getopt_long... no
checking for strerror... yes
checking for strndup... no
checking for ANSI C header files... yes
checking for ssize_t... yes
checking for memmove... yes
checking for vsnprintf... yes
checking for strerror... (cached) yes
checking for strings.h... yes
checking if vsprintf returns int... yes
checking program_invocation_name... no
checking for ANSI C header files... (cached) yes
checking for string.h... yes
checking for memory.h... yes
updating cache .././config.cache
creating ./config.status
creating Makefile
Abbreviations in make output:
$flags    = -I. -I./include -I./argp -DHAVE_STRCHR=1 -O
$argflags = -I. -DHAVE_STRERROR=1 -DSTDC_HEADERS=1 -DHAVE_MEMMOVE=1 -DHAVE_VSNPRINTF=1 -DHAVE_STRERROR=1 -DHAVE_STRINGS_H=1 -DVSPRINTF_RETURNS_INT=1 -DSTDC_HEADERS=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -O
Make:
me% make
gcc -c $flags -o database.o database.c
gcc -c $flags -o error.o error.c
gcc -c $flags -o hash_table.o hash_table.c
gcc -c $flags -o int4str.o int4str.c
int4str.c: In function `ifile_int4str_free_contents':
int4str.c:382: warning: passing arg 1 of `free' discards qualifiers from pointer target type
gcc -c $flags -o istext.o istext.c
gcc -c $flags -o lex-define.o lex-define.c
gcc -c $flags -o lex-email.o lex-email.c
gcc -c $flags -o lex-indirect.o lex-indirect.c
gcc -c $flags -o lex-simple.o lex-simple.c
gcc -c $flags -o opts.o opts.c
gcc -c $flags -o primes.o primes.c
gcc -c $flags -o scan.o scan.c
gcc -c $flags -o stem.o stem.c
gcc -c $flags -o stoplist.o stoplist.c
gcc -c $flags -o stopwords.o stopwords.c
gcc -c $flags -o util.o util.c
ar rc libifile.a database.o error.o hash_table.o int4str.o istext.o lex-define.o lex-email.o lex-indirect.o lex-simple.o opts.o primes.o scan.o stem.o stoplist.o stopwords.o util.o
ranlib libifile.a
cd argp ; make libargp.a
make[1]: Entering directory `/pub/src/mail/spam/ifile-1.1.5/argp'
gcc -c $argflags -o argp-ba.o argp-ba.c
gcc -c $argflags -o argp-fmtstream.o argp-fmtstream.c
gcc -c $argflags -o argp-fs-xinl.o argp-fs-xinl.c
gcc -c $argflags -o argp-help.o argp-help.c
gcc -c $argflags -o argp-parse.o argp-parse.c
gcc -c $argflags -o argp-pv.o argp-pv.c
gcc -c $argflags -o argp-pvh.o argp-pvh.c
gcc -c $argflags -o argp-xinl.o argp-xinl.c
gcc -c $argflags -o argp.o argp.c
gcc -c $argflags -o pin.o pin.c
gcc -c $argflags -o strndup.o strndup.c
gcc -c $argflags -o getopt.o getopt.c
gcc -c $argflags -o getopt1.o getopt1.c
ar rc libargp.a argp-ba.o argp-fmtstream.o argp-fs-xinl.o argp-help.o argp-parse.o argp-pv.o argp-pvh.o argp-xinl.o argp.o pin.o strndup.o getopt.o getopt1.o
ranlib libargp.a
make[1]: Leaving directory `/pub/src/mail/spam/ifile-1.1.5/argp'
gcc -c -I. -I./include -I./argp -DHAVE_STRCHR=1 -O -o ifile.o ifile.c
gcc -O ifile.o -o ifile -L. -lifile -L./argp -largp -lm
rm -f ifilter.mh
cat ifilter.mh.pl | sed -e 's,/usr/bin/perl,/usr/bin/perl,' > ifilter.mh
chmod a+x ifilter.mh
rm -f irefile.mh
cat irefile.mh.pl | sed -e 's,/usr/bin/perl,/usr/bin/perl,' > irefile.mh
chmod a+x irefile.mh
rm -f knowledge_base.mh
cat knowledge_base.mh.pl | sed -e 's,/usr/bin/perl,/usr/bin/perl,' > knowledge_base.mh
chmod a+x knowledge_base.mh
rm -f news2mail
cat news2mail.pl | sed -e 's,/usr/bin/perl,/usr/bin/perl,' > news2mail
chmod a+x news2mail
"make install" gives the following installed files, if configured with the usual prefix /usr/local:
/usr/local/bin
+-----ifile
+-----ifilter.mh
+-----irefile.mh
+-----knowledge_base.mh
+-----news2mail
|
/usr/local/man
+-----man1
| +-----ifile.1
To clean and remove configuration stuff:
me% make clean
me% rm -f config.cache config.log config.status Makefile
When I create a spam database for ifile, I use these sources:
- Spam I've saved (~4,100 messages).
- Grant Taylor's spam archive (~2,400 messages): http://www2.picante.com:81/~gtaylor/download/spam.tar.gz
- Spam from the UK junk email corpus (~670 messages): http://clg.wlv.ac.uk/projects/junk-email/corpus-no-duplications.tar.gz Good collection, but originally in HTML instead of mbox format.
- The news.admin.net-abuse.sightings newsgroup (~27,000 messages).
- Some non-spam messages culled from my own mailboxes.
I tried using the collections at http://www.iit.demokritos.gr/~ionandr/pu1_encoded.tar.gz and http://www.iit.demokritos.gr/~ionandr/lingspam_public.tar.gz, but both were useless; they were modified to the point where I couldn't recreate the original mail messages.
I use a program called spmail to split an mbox-formatted file into one or more directories containing one message per file. Directories are numbered starting at 1, and the messages within each directory are numbered from 000 (or 001) through 999. Splitting the mail this way speeds up database creation, because ifile can accept multiple messages on one command line.
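spmail itself isn't shown here. As a rough stand-in, the splitting step could be sketched with awk. The function name split_mbox, the directory arithmetic (message n goes to directory n/1000, file n mod 1000, which happens to match the data/inbox/0 listing later on), and the rule that every line starting with "From " begins a new message are all my assumptions, not spmail's documented behavior.

```shell
# split_mbox MBOX OUTDIR -- hypothetical stand-in for spmail.
# Writes one message per file as OUTDIR/<n/1000>/<n%1000, zero-padded>.
split_mbox() {
    mkdir -p "$2" || return 1
    awk -v out="$2" '
        /^From / {                      # assumed message separator
            if (file != "") close(file)
            n++
            dir = sprintf("%s/%d", out, int(n / 1000))
            if (!(dir in made)) {
                system("mkdir -p \"" dir "\"")
                made[dir] = 1
            }
            file = sprintf("%s/%03d", dir, n % 1000)
        }
        file != "" { print > file }     # skip anything before the first message
    ' "$1"
}
```

Something like "split_mbox inbox data/inbox" would then leave data/inbox/0/001, 002 and so on. Body lines that start with "From " are normally escaped as ">From " in mbox files; if yours are not, this sketch will split in the wrong places.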
I delete any Chinese/Korean spam, because I don't need ifile to handle it; I can either use procmail or a much smaller program called hibits to see if a given message has lots of 8-bit characters.
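I haven't included hibits, but the idea is simple enough to sketch with standard tools. The function name hibits_pct and the choice to report a percentage are assumptions for illustration; the real program may measure things differently.

```shell
# hibits_pct FILE -- print the integer percentage of bytes in FILE
# that have the high bit set (octal 200-377). Hypothetical helper.
hibits_pct() {
    total=$(($(wc -c < "$1")))
    high=$(($(LC_ALL=C tr -cd '\200-\377' < "$1" | wc -c)))
    if [ "$total" -eq 0 ]; then
        echo 0
        return 0
    fi
    echo $((high * 100 / total))
}
```

A caller could then treat a message as 8-bit spam when the percentage passes some cutoff; what cutoff works is something you would have to tune on your own mail.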
Here's my directory setup:
+-----00-PRIVATE
| +-----0
| | +-----001
| | +-----002
| | +-----003
| | +-----004
| | +-----005
...
| | +-----995
| | +-----996
| | +-----997
| | +-----998
| | +-----999
| +-----1
| | +-----000
| | +-----001
...
| | +-----998
| | +-----999
| +-----2 (1000 messages, same as 1)
| +-----3 ...
...
| +-----13 ...
| +-----14
| | +-----000
| | +-----001
...
| | +-----841
| | +-----842
|
+-----00-SPAM-MESSAGES
| +-----ifile
| | +-----idata               Spam plus my private messages
| | +-----idata.spamonly      Spam only
| | +-----mkifiledb           Script to create idata files
|
| +-----misc
| | +-----0                   Graham's DB
| | | +-----001
| | | +-----002
...
| | | +-----998
| | | +-----999
| | +-----1
| | +-----2
| | | +-----3                 My personal spam
| | | +-----001
| | | +-----002
...
| | | +-----998
| | | +-----999
| | +-----4
| | +-----5
|
| +-----net-abuse             news.admin.net-abuse.sightings
| | +-----0
| | | +-----001
| | | +-----002
...
| | | +-----998
| | | +-----999
| | +-----1
| | | +-----000
| | | +-----001
..
| | | +-----998
| | | +-----999
| | +-----2 (1000 messages)
| | +-----3 ...
...
| | +-----26
| | +-----27
| | +-----28
| | +-----29 ...
| | +-----30 ...
| | | +-----000
| | | +-----001
..
| | | +-----684
| | | +-----685
|
| +-----uk-corpus             UK junk email corpus
| | +-----1
| | | +-----001
| | | +-----002
...
| | | +-----672
| | | +-----673
I use the following ifile options when making the "spamonly" database:
-h, --strip-header     Skip all of the header lines except Subject:, From: and To:
-i, --insert=FOLDER    Add the statistics for each of FILES to the category FOLDER
Here's the script, called "mkifiledb".
I've found that adding spam messages repeatedly can sometimes improve ifile's accuracy, so the script allows you to specify how often you want to add messages from a given spam group. Non-spam messages are only added once.
#!/bin/ksh
#
# Id: mkifiledb,v 1.6 2002/11/04 22:06:51 vogelke Exp
# Source: /src/mail/spam/00-SPAM-MESSAGES/ifile/RCS/mkifiledb,v
#
# create ifile database from known spam and non-spam messages.

PATH=/usr/local/bin:/bin:/usr/bin; export PATH
umask 022

cwd=`pwd`
top=/pub/src/mail/spam
dbdir=$top/00-SPAM-MESSAGES/ifile
dbfile=idata

# Remove any old database, wherever the script was started from.
test -f $dbdir/$dbfile && rm $dbdir/$dbfile

# Accept number of passes, a classification, and a list of numeric
# subdirectories. Add each file within each subdirectory to our
# ifile db with the given classification.
#
# Using "ifile -h -k -i" gives *huge* .idata file, no real gain.

readmail () {
    passes=$1
    class=$2
    shift
    shift

    for f in $*
    do
        if test -d $f; then
            k=1
            while test $k -le $passes; do
                echo pass $k: $class $PWD/$f
                ifile -b $dbdir/$dbfile -h -i $class $f/*
                k=`expr $k + 1`
            done
            (cd $dbdir && ls -l $dbfile)
        fi
    done
}

cd $top

# local spam.
(
    cd 00-SPAM-MESSAGES/local
    list=`/bin/ls -d [0-9]* | sort -n`
    readmail 1 spamlocal $list
)

# Nigerian/African fraud.
(
    cd 00-SPAM-MESSAGES/fraud
    list=`/bin/ls -d [0-9]* | sort -n`
    readmail 1 spamfraud $list
)

# Credit repair.
(
    cd 00-SPAM-MESSAGES/credit
    list=`/bin/ls -d [0-9]* | sort -n`
    readmail 1 spamcredit $list
)

# Diplomas.
(
    cd 00-SPAM-MESSAGES/diploma
    list=`/bin/ls -d [0-9]* | sort -n`
    readmail 1 spamdiploma $list
)

# Drivers license.
(
    cd 00-SPAM-MESSAGES/license
    list=`/bin/ls -d [0-9]* | sort -n`
    readmail 1 spamlicense $list
)

# gtaylor collection.
(
    cd 00-SPAM-MESSAGES/gtaylor
    list=`/bin/ls -d [0-9]* | sort -n`
    readmail 1 spamgt $list
)

# UK spam.
(
    cd 00-SPAM-MESSAGES/uk-corpus
    list=`/bin/ls -d [0-9]* | sort -n`
    readmail 1 spamuk $list
)

## net-abuse spam.
#(
#    cd 00-SPAM-MESSAGES/net-abuse
#    list=`/bin/ls -d [0-9]* | sort -n`
#    readmail 1 spamnet $list
#)

# keep a copy of the junk-only database.
# make 1 pass through valid messages.

cp $dbdir/$dbfile $dbdir/$dbfile.spamonly

(
    cd 00-PRIVATE
    list=`/bin/ls -d [0-9]* | sort -n`
    readmail 1 good $list
)

exit 0
mkifiledb takes about an hour to run on a Pentium-133. Output files are roughly this size:
+-----00-SPAM-MESSAGES
| +-----ifile
| | +----- 587799 Feb 3 16:41 idata
| | +----- 391994 Feb 3 16:38 idata.spamonly
The "idata" file is periodically copied to $HOME/.idata.
The file containing only spam results is available here: idata.spamonly
Something about the Nigerian bank-account scams isn't close enough to regular spam to trip the filter, so I set up some other spam categories:
- fraud: bank-account scams
- credit: credit-card offers
- diploma: college diploma offers
- license: driver's license offers
I can get a nice sample of a given spam category (in this case, fraud) by creating a small idata file from known fraud messages plus a few valid messages, and then doing this:
me% cat findfraud
#!/bin/sh
# findfraud -- look through net-abuse messages, find anything
# that looks like a bank-scam.

find ../net-abuse/? ../net-abuse/?? -type f -print |
    xargs ifile -b idata -q -c |
    grep fraud
exit 0

me% ./findfraud
../net-abuse/0/204 spamfraud
../net-abuse/0/275 spamfraud
../net-abuse/0/431 spamfraud
../net-abuse/0/450 spamfraud
../net-abuse/0/470 spamfraud

me% ./findfraud | awk '{print $1}' > list
This gives me a list of files that are highly likely to be fraud-related. I can narrow it down further by using a script that lets me quickly classify these messages by hand:
me% cat quicklook
#!/bin/sh
# quicklook -- display each message, and ask if it fits the
# description. If so, echo "msgno" to the logfile.
# Otherwise, echo nothing to the logfile.

PATH=/bin:/usr/bin:/usr/local/bin
export PATH

logfile="fraud"
touch $logfile

case "$#" in
    1) list="$1" ;;
    *) echo "usage: $0 list"; exit 1 ;;
esac

test -f $list || exit 2

for msg in `cat $list`
do
    head -25 $msg
    ans=`grabchars -q'(f)raud, (n)ext, (q)uit: '`

    case "$ans" in
        f) echo "fraud" 1>&2; echo $msg >> $logfile ;;
        n) echo "next" 1>&2 ;;
        q) echo "done" 1>&2; exit 0 ;;
        *) echo "${ans}? Please answer f, n or q." 1>&2 ;;
    esac
done

exit 0

me% ./quicklook list
The "quicklook" script accepts a list of messages identified by "findfraud" and shows me the first 25 lines of each one. I can use one keystroke to classify the message as fraud and move on, move to the next message, or quit.
I used this script to split up my current spam collection into smaller batches:
- fraud: 1,200 messages
- credit: 2,600 messages
- diploma: 130 messages
- license: 75 messages
I usually process incoming mail in batches of a few hundred messages at a time, depending on how often I collect it. Processing takes place in four stages:
- Primary whitelist: messages coming from known senders, so they're appended directly to my inbox without additional filtering by procmail. I check for duplicate message-ids, and save a copy of the incoming message header.
- Secondary whitelist: messages coming from known senders which need some non-spam filtering by procmail. (At this point, about half of my incoming mail has been handled)
- Check for spam using ifile: any messages marked as spam are appended to $HOME/mail/spam-folder and then deleted. (At this point, about 75% of my incoming mail has been handled)
- Other filtering: any remaining messages are pushed through procmail, which is set up to catch Chinese/Korean spam.
The stages are arranged so that the fastest processing takes place first, and the more computationally expensive stuff like ifile or procmail is only run when most of the mail has already been dealt with.
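The cheapest stage is little more than a fixed-string search over the split-up message files. Here is a minimal sketch of that lookup on its own; the function name whitelisted is made up for this example, and it assumes the three-character message names that spmail produces plus a whitelist file with one address per line.

```shell
# whitelisted LISTFILE -- print the names of message files in the
# current directory (three-character names) that mention any address
# in LISTFILE. Matching is case-insensitive and uses fixed strings,
# so nothing in the whitelist is treated as a regular expression.
whitelisted() {
    ls ??? > /dev/null 2>&1 || return 0    # no message files at all
    grep -Fil -f "$1" ??? || true          # no match is not an error
}
```

Because one grep handles the whole batch, this stage costs one process no matter how many messages are waiting, which is why it runs before ifile or procmail ever see the mail.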
Here's a sample run for 139 messages. I don't like keeping my mailbox locked for any length of time, and I don't like endlessly locking/unlocking it, so I generally make a copy if lots of mail has built up:
me% ls inbox
ls: inbox: No such file or directory
me% lockfile -ml
me% cp $MAIL inbox
me% cp /dev/null $MAIL
me% lockfile -mu
Break up the inbox using spmail:
me% ls data
ls: data: No such file or directory
me% spmail inbox
me% rm inbox
me% ls -R data
inbox/

data/inbox:
0/

data/inbox/0:
001 017 033 049 065 081 097 113 129
002 018 034 050 066 082 098 114 130
003 019 035 051 067 083 099 115 131
004 020 036 052 068 084 100 116 132
005 021 037 053 069 085 101 117 133
006 022 038 054 070 086 102 118 134
007 023 039 055 071 087 103 119 135
008 024 040 056 072 088 104 120 136
009 025 041 057 073 089 105 121 137
010 026 042 058 074 090 106 122 138
011 027 043 059 075 091 107 123 139
012 028 044 060 076 092 108 124
013 029 045 061 077 093 109 125
014 030 046 062 078 094 110 126
015 031 047 063 079 095 111 127
016 032 048 064 080 096 112 128
Filter the inbox using a script called "runmail". This script saves its output to a tmp file and runs "tail -f" on that file, so I exit with control-c:
me% cd data/inbox/0/
me% runmail
saving 3 whitelisted messages
042: not seen
048: not seen
116: not seen
filtering 4 whitelisted messages
012
056
094
125
removing 18 spam messages
001
002
003
005
006
...
136
137
DONE
^C
me% cd
me% rm -rf data
me% /usr/ucb/from | wc
80 560 5062
I only have 80 messages left from 139. Here's the "runmail" script:
  1  #!/bin/ksh
  2  #
  3  # Id: runmail,v 1.5 2003/02/16 21:29:30 vogelke Exp
  4  # Source: /space/home/vogelke/bin/RCS/runmail,v
  5  #
  6  # filter all mail messages in current directory.
  7
  8  PATH=/bin:/usr/bin:/usr/local/bin; export PATH
  9  tag=`basename $0`
 10  tmp=$tag.$RANDOM.tmp
 11  good=$tag.$RANDOM.good
 12
 13  die () {
 14      echo "$*" >& 2
 15      exit 1
 16  }
 17
 18  week="`/bin/date +%Yw%W`"
 19
 20  # Run whitelist check before anything else.
 21  ls ??? > /dev/null 2>&1 || die "no files"
 22
 23  fgrep -il -f $HOME/.whitelist ??? > $tmp
 24  set X `wc -l $tmp`
 25
 26  case "$2" in
 27      0)  echo no whitelisted messages ;;
 28
 29      *)  echo saving $2 whitelisted messages
 30          cp /dev/null $good || die "can't write $good"
 31
 32          for file in `cat $tmp`
 33          do
 34              if formail -D 655360 $HOME/mail/msgid.cache < $file
 35              then
 36                  echo $file: already seen
 37              else
 38                  echo $file: not seen
 39                  formail -A 'X-Spam: whitelist' < $file >> $good
 40                  formail -X "" < $file >> $HOME/mail/HEADERS.$week
 41              fi
 42          done
 43
 44          xargs rm < $tmp
 45          rm -f $tmp
 46
 47          if lockfile -0 -r0 -ml
 48          then
 49              cat $good >> $MAIL || die "can't append $good to $MAIL"
 50              lockfile -mu
 51              rm $good
 52          else
 53              echo "could not lock $MAIL, see $good"
 54          fi
 55          ;;
 56  esac
 57
 58  # These messages are not spam but need procmail handling.
 59  ls ??? > /dev/null 2>&1 || die "no more files"
 60
 61  fgrep -il -f $HOME/.whitelist2 ??? > $tmp
 62  set X `wc -l $tmp`
 63
 64  case "$2" in
 65      0)  echo no whitelisted messages for procmail ;;
 66
 67      *)  echo filtering $2 whitelisted messages
 68          for file in `cat $tmp`
 69          do
 70              echo $file
 71              procmail < $file
 72              rm $file
 73          done
 74          ;;
 75  esac
 76
 77  # Check for spam using ifile.
 78  ls ??? > /dev/null 2>&1 || die "no more files"
 79
 80  echo running ifile
 81  ifile -c -q ??? | grep 'spam' | cut -f1 -d' '> $tmp
 82  set X `wc -l $tmp`
 83
 84  case "$2" in
 85      0)  echo ifile found no spam messages ;;
 86
 87      *)  echo removing $2 spam messages
 88          sf="$HOME/mail/spam-folder"
 89          xargs cat < $tmp >> $sf || die "$sf: can't append"
 90          xargs rm < $tmp
 91          cp /dev/null $tmp
 92          ;;
 93  esac
 94
 95  # Send remaining messages through procmail.
 96  ls ??? > /dev/null 2>&1 || die "no more files"
 97
 98  echo running procmail
 99  (
100      for file in ???
101      do
102          echo $file
103          procmail < $file
104      done
105      echo DONE
106  ) >> $tmp &
107
108  exec tail -f $tmp
109  exit 0
Here's a quick description of the script:
1-11: Standard Korn-shell header. I like ksh because I can use $RANDOM when creating tmp files.
13-16: Short suicide function which allows one-line tests, like line 21.
18: Get the current week, for storing message headers.
23-24: Look for messages that have addresses in my whitelist. My ~/.whitelist file has records like this, one per line:
announce@freebsd.org
apache-ssl@lists.aldigital.co.uk
archive@securityinsight.com
26-56: Checks for any whitelisted messages and acts accordingly.
Lines 32-42 walk through each whitelisted message in turn, discard it if it's been seen before (line 34), or save a copy of the headers and add it to a temp file if it hasn't been seen before (lines 39-40).
Lines 44-45 remove the individual whitelisted messages.
Lines 47-54 safely append the temp file to my inbox.
61-75: Does roughly the same thing for whitelisted messages that need some procmail work.
80-93: Run ifile in concise, quiet mode to do a spam check on the remaining files. Any hits are stored in $HOME/mail/spam-folder.
98-106: Run procmail on anything else that's left.
Believe me, it takes much longer to describe than it does to run.
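The duplicate check on line 34 is the least obvious step, so here is a rough approximation of what "formail -D" does there, written with plain sed and grep. This is only a sketch under my own assumptions: the function name seen_before is invented, the cache is a plain file with one Message-ID per line, and unlike formail it never trims the cache to a maximum size.

```shell
# seen_before MSGFILE CACHEFILE -- hypothetical stand-in for "formail -D".
# Returns 0 if the message's Message-ID is already in CACHEFILE;
# otherwise records the ID and returns 1.
seen_before() {
    # Look only at the header: stop at the first blank line.
    id=$(sed -n -e '/^$/q' \
             -e 's/^[Mm]essage-[Ii][Dd]:[[:space:]]*//p' "$1" | head -n 1)
    [ -n "$id" ] || return 1        # no Message-ID: treat as not seen
    touch "$2"
    if grep -Fxq -- "$id" "$2"; then
        return 0
    fi
    printf '%s\n' "$id" >> "$2"
    return 1
}
```

Used as "if seen_before $file $HOME/mail/msgid.cache", it follows the same convention as line 34: success means the message is a duplicate.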
For completeness, here's my $HOME/.procmailrc file.
  1  # Id: .procmailrc,v 1.52 2002/08/16 16:44:44 vogelke Exp
  2  # Source: /space/home/vogelke/RCS/.procmailrc,v
  3  #
  4  # NAME:
  5  #    $HOME/.procmailrc
  6  #
  7  # DESCRIPTION:
  8  #    "procmail" handles local mail delivery, and you can use this
  9  #    file to tell it to
 10  #    - store your mail in a given folder,
 11  #    - forward or discard mail depending on the contents, or
 12  #    - run your mail through a program automatically.
 13  #
 14  # TESTING CHANGES:
 15  #    If you want to mess with your setup, the safest way is:
 16  #
 17  #    1. copy an existing mail message to /tmp/msg,
 18  #    2. copy .procmailrc to .procmailrc.new,
 19  #    3. only make your changes to .procmailrc.new, and
 20  #    4. run "procmail -m .procmailrc.new < /tmp/msg" to test.
 21  #
 22  # AUTHOR:
 23  #    Karl Vogel <vogelke@dnaco.net>
 24  #    Sumaria Systems, Inc.
 25  # Search path.
 26  PATH=/usr/local/bin:/bin:/usr/bin:$HOME/bin
 27  # Default mail folder.
 28  DEFAULT=/var/mail/vogelke
 29  # Current directory while procmail is executing.
 30  # All pathnames are relative to this directory.
 31  MAILDIR=$HOME/mail
 32  # File containing error messages or diagnostics. If this
 33  # file does not exist, then said messages will be bounced
 34  # back to the message sender.
 35  LOGFILE=$MAILDIR/MAILLOG
 36  # If yes, keep an abstract of the From and Subject lines of
 37  # each delivered message, the folder it was delivered to,
 38  # and the size of the message. If no, skip this abstract.
 39  LOGABSTRACT=yes
 40  # If on, describe actions of procmail in detail.
 41  #VERBOSE=on
 42  # Number of seconds before procmail zaps a lockfile by force.
 43  LOCKTIMEOUT=1
 44  # Default shell and umask value.
 45  SHELL=/bin/sh
 46  UMASK=022
 47  # Frequently-used variables.
 48  WEEK="`/bin/date +%Yw%W`"
 49  # Rules section.
 50  #--------------------------------------------------------
 51  # RULE: Save incoming headers in a file called
 52  #       $HOME/mail/HEADERS.YYYYwNN
 53  #       where YYYY = year
 54  #       NN = the week number starting on Monday.
 55  :0 chw: $HOME/hdr.lck
 56  | /bin/cat - >> $HOME/mail/HEADERS.$WEEK;
 57  #--------------------------------------------------------
 58  # RULE: Check if the Message-ID: header has been seen.
 59  #       Discard the message if so, otherwise continue.
 60  :0 Wh: msgid.lock
 61  | formail -D 655360 msgid.cache
 62  #========================================================
 63  # SPAM: dump message if the message contains a few
 64  #       8-bit characters. Simple check for encoded
 65  #       characters (=[0-9A-F][0-9A-F]) will fail for
 66  #       messages containing "dd xxx=yyy" commands.
 67  #
 68  #       If you want to check the header only, use ":0 HD"
 69  #       instead.
 70  :0 HBD
 71  * -10^1 Subject:
 72  * 1^1 =[0-9A-F][0-9A-F]=[0-9A-F][0-9A-F]
 73  * 1^1 [ ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿]
 74  * 1^1 [ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞß]
 75  * 1^1 [àáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ]
 76  * 1^1 =[A-F][0-9A-F]=[A-F][0-9A-F]
 77  spam-8bit
 78  :0 H
 79  * ^Subject: =\?.*\?=
 80  spam-8bit
 81  #--------------------------------------------------------
 82  # SPAM: All-numeric "email addresses" like "you@67890.com"
 83  #       Messages to "friend@host" or "you@host".
 84  :0 H
 85  * ^(From|To|Reply-To): .*@[0-9]+\.
 86  spam-friend
 87  :0 H
 88  * ^(From|To|Cc): friend[0-9a-zA-Z]*@
 89  spam-friend
 90  :0 H
 91  * ^(From|To|Cc): you@
 92  spam-friend
 93  #--------------------------------------------------------
 94  # SPAM: pass anything in the whitelist.
 95  #       http://www.mindrape.org/caffeine/squashing_spam.html
 96  :0:
 97  * ? formail -x"From:" -x"From" -x"To:" -x"Reply-To:" -x"Cc:" \
 98      | fgrep -is -f $HOME/.whitelist
 99  $DEFAULT
100  #--------------------------------------------------------
101  # SPAM: kill anything in the blacklist.
102  :0:
103  * ? formail -x"From:" -x"From" -x"To:" -x"Reply-To:" -x"Cc:" \
104      | fgrep -is -f $HOME/.blacklist
105  spam-folder
106  #--------------------------------------------------------
107  # SPAM: Same from and to: happens legitimately only when
108  #       sending mail to oneself. Put this *after* whitelist
109  #       filtering; some people on whitelist send to themselves.
110  :0 H
111  * ^From: \/.*
112  * $^To: $MATCH
113  spam-tofrom
114  :0 :
115  $DEFAULT
62-77: This checks for 8-bit spam, or quoted-printable crap. It works, but it's slow; we can do better by adding another stage to the runmail script, and deleting this portion of .procmailrc.
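I haven't written that extra stage yet, but a first cut might look something like the function below. Treat it as a sketch: the name is_encoded_spam and the default limit of 10 are placeholders, and it is a much cruder test than procmail's weighted scoring (it counts matching lines rather than individual matches, and it ignores raw 8-bit bytes entirely).

```shell
# is_encoded_spam MSGFILE [LIMIT] -- hypothetical runmail stage.
# Succeeds if the Subject: header is MIME-encoded (=?charset?...?=), or if
# more than LIMIT lines contain two consecutive quoted-printable escapes
# for high-bit bytes, e.g. "=C4=E3".
is_encoded_spam() {
    limit=${2:-10}                  # placeholder threshold, tune to taste

    # Header only: stop reading at the first blank line.
    if sed '/^$/q' "$1" | LC_ALL=C grep -q '^Subject: =?.*?='; then
        return 0
    fi

    # grep -c exits non-zero when the count is 0, hence "|| true".
    hits=$(LC_ALL=C grep -c '=[A-F][0-9A-F]=[A-F][0-9A-F]' "$1" || true)
    [ "$hits" -gt "$limit" ]
}
```

In runmail this would sit just before the ifile stage, looping over the remaining ??? files and appending any hits to a spam-8bit folder. A message that merely quotes a command such as "dd if=AB=CD" once or twice stays under the limit.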
Created by log2html.pl v1.13 | Sun, 16 Feb 2003 17:56:43 -0500 |