Erik Clark eclark4 at gl.umbc.edu
Sun Oct 26 21:27:31 PST 2003

   I know firsthand of a site that is looking into this issue and is
currently testing it. They moved away from the MySQL dependency and
replaced it with flat-file DBs, to eliminate potential conflicts from a
dependency that may not be needed. The site in question handles well over
a million emails a day in total, and often considerably more than that.
The most important thing I can stress is the need for good tools.
Definitely whip up an efficient, effective tool for inserting/deleting
blacklist and whitelist entries, a better reporting method, and some
reasonable way of handling mailing-list emails with unique IDs (someone
posted a really good chunk of code for the mailing-list issue a few weeks
back). Moving to Berkeley DB or SQLite would remove one dependency,
leaving you with just perlmilter+sendmail, which is not such a bad thing.
Since it's currently in a test state, there is no info to provide about
load issues yet. Sorry.

On Sun, 26 Oct 2003, Ken Raeburn wrote:

> Evan's paper says greylisting is being done at some sites handling
> millions of messages per day, but most of the actual statistics I've
> been able to dig up in the mail archives (granted, it's entirely
> possible I missed some while skimming the subject lines) seem to be
> from much smaller sites and/or short test runs, mostly by at least a
> couple orders of magnitude.
> I'm hoping to put forward a proposal for a site I get a lot of email
> at that's closer to the high end of the spectrum.  One quarter last
> year they averaged over a quarter million messages per day; I don't
> know where they're at now.  Any available information from these
> bigger sites would be useful, including implementation info (is
> milter+perl+mysql good enough under heavy load?), real statistics
> (users, messages, before-and-after comparisons, etc), and especially
> not neglecting the negative aspects (user complaints, lost legitimate
> email, random net sites needing whitelisting, database size (records
> and megabytes)).
> I also need to investigate resource (disk, CPU, memory) usage for
> "learning" mode, when no mail is blocked.  I expect the initial
> deployment would start with that, then enable blocking for a few test
> addresses as a trial run; it would probably take some time before
> they'd consider doing it for a lot of recipients, if I can get them
> started on it at all.
> I expect Evan's web page would be even more convincing with more
> large-site stats, as well.
> Does anyone have such stats that they'd be willing to share?
> Ken
> "We are Grey.  We stand between the spam and the light."

