[Greylist-users] Re: I need graylisting stats
Scott Nelson
scott at spamwolf.com
Sun Jul 11 21:17:32 PDT 2004
At 01:28 PM 7/11/04 -0600, Steve Murphy wrote:
>
>> Unfortunately, I don't have any real statistics. How do I get
>> statistics?
>
>
>Been thinking about this myself.
>
>Since you will never be able to prove (unless you capture the full contents
>of each message before rejecting it) exactly how many different messages
>you WOULD HAVE gotten, had you not had graylisting in place, you CAN
>chart ATTEMPTS to deliver messages.
>
[...]
>
>The best stats are the ones that show the average number of spams
>received before vs. after graylisting is enabled.
>
Even that is problematic, since spam runs have such a large variance.
Over a month, a single spam trap can change from an average of 5 spams
a day to 10, or vice versa.
What I did was split some of my spam traps into four groups,
then run each combination of greylisting on/off over two months on a
different group.
Group 1 did none, group 2 did none for a month then did greylisting for a month,
group 3 did then didn't, and group 4 did greylisting for both months.
This can take a long time, even if you don't make mistakes that force
you to start over (which I did, twice).
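
One way to read the numbers from a design like this, as a rough sketch
rather than the analysis actually used here: compare a group's
month-to-month change against the change in the group that never
greylisted, so that a swing in overall spam volume is not mistaken for an
effect of greylisting. The function name and the counts below are made up
for illustration.

```python
def relative_change(before, after, control_before, control_after):
    """Scale a group's month-to-month change by the control group's change.

    A result of 1.0 means the group moved exactly like the control;
    a result below 1.0 means it received less spam than the control's
    trend would predict.
    """
    group_ratio = after / before
    control_ratio = control_after / control_before
    return group_ratio / control_ratio


# Hypothetical counts: group 2 (off, then on) vs. group 1 (off both months).
# Overall spam doubled in month two, yet group 2 dropped from 300 to 60.
effect = relative_change(before=300, after=60,
                         control_before=300, control_after=600)
print(round(effect, 3))  # about 0.1, i.e. roughly a 90% drop vs. the control
```
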
My choice of split was bad. I sorted names alphabetically then alternated
groups, when in hindsight I should have sorted my traps by spams received,
so each group would have received an approximately equal number of spams.
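
A minimal sketch of a volume-balanced split, assuming a per-trap spam
count is already available. It uses a greedy variation on
"sort, then alternate": traps are ranked by spams received and each one
goes to whichever group has the smallest total so far. The function name
and the sample counts are hypothetical.

```python
def split_balanced(trap_counts, n_groups=4):
    """Assign traps to groups so the groups' total spam counts are similar.

    trap_counts maps a trap address to the number of spams it received.
    Returns (groups, totals): a list of trap lists and each group's total.
    """
    # Rank traps from busiest to quietest.
    ranked = sorted(trap_counts.items(), key=lambda kv: kv[1], reverse=True)
    groups = [[] for _ in range(n_groups)]
    totals = [0] * n_groups
    for trap, count in ranked:
        # Give the next trap to the group with the smallest total so far.
        i = totals.index(min(totals))
        groups[i].append(trap)
        totals[i] += count
    return groups, totals


# Made-up counts for eight traps.
counts = {"a": 50, "b": 40, "c": 30, "d": 20, "e": 10, "f": 10, "g": 5, "h": 5}
groups, totals = split_balanced(counts)
print(totals)
```

With very uneven traps a single busy address can still dominate its group,
so it is worth checking the totals before starting a run.
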
Looking at the logs is a lot simpler, and though it gives a slightly
inflated number (due to retries) it's not /that/ far off.
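
To get a feel for how much retries inflate a log-based count, one option
is to count raw delivery attempts alongside distinct (client IP, sender,
recipient) triplets. This is only a sketch: it assumes the log has already
been parsed into such triplets, and the addresses below are invented.

```python
def count_attempts(records):
    """Return (raw_attempts, distinct_triplets) for parsed log records.

    Each record is a (client_ip, envelope_sender, envelope_recipient) tuple.
    A retry of the same triplet adds to the raw count but not to the
    distinct count.
    """
    raw = 0
    seen = set()
    for ip, sender, rcpt in records:
        raw += 1
        # Lowercase the addresses so case differences don't split a triplet.
        seen.add((ip, sender.lower(), rcpt.lower()))
    return raw, len(seen)


# Invented records: one sender retries once, another tries from a second IP.
log = [
    ("192.0.2.1", "a@example.com", "trap@example.org"),
    ("192.0.2.1", "a@example.com", "trap@example.org"),   # plain retry
    ("198.51.100.7", "b@example.net", "trap@example.org"),
]
raw, distinct = count_attempts(log)
print(raw, distinct)
```

The distinct count still overstates the number of messages when the same
spam is retried from a different IP, so it narrows the inflation rather
than removing it.
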
RBLs have the same inflation problem, plus they frequently allow spam in,
which is, I suppose, why some spammers will try from three or four
different IPs before giving up.
Even content based systems exaggerate, since they will frequently use
an old spam corpus to judge their effectiveness.
(SpamAssassin is almost 100% effective ... on spam over a year old.)
Scott Nelson <scott at spamwolf.com>