[Greylist-users] Periodic clean-up of the garbage triplets in the db?

Evan Harris eharris at puremagic.com
Tue Jun 24 02:44:03 PDT 2003


On Mon, 23 Jun 2003, Philip Kizer wrote:

> I've only got 3000 triplets after running since Saturday, but a friend of
> mine that I was assisting already has 20,000 triplets in his DB after 4
> hours.  Particularly given the "garbage" spam sender addresses that are
> used, we were thinking of performing some kind of clean-up like the
> following every $time_period (hour, day, etc):

Wow, that's a lot of mail.  I'd be interested in seeing some statistics on
his blocking success rate after he gets some more runtime.

I'd also be interested in finding out if he's seeing any significant query
delays as the db size grows to the tens of millions of records (assuming he
lets it get that far before cleaning).

Assuming you are using my milter, one thing you may need to watch out for is
the limit on the number of connections your db can handle.  Since the milter
will fork as many callback instances as there are concurrent email
deliveries, you may run out of db connections.

During my testing, there were a few particularly bad spam "waves" that
caused my db connections to max out.  In order to limit the db connections,
I changed the sendmail.mc to have this:

define(`confMAX_DAEMON_CHILDREN', `50')dnl

This limits the number of sendmail processes, which also effectively limits
the number of milter instances and their cached db connections.

Keep in mind that if several systems are using the same db, they will each
have to be limited to an appropriate number of db connections.
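
As a rough illustration of that budgeting (not part of the milter), the
per-host limit can be worked out by dividing the db's connection limit among
the hosts that share it.  The function name, the "reserved" headroom, and the
example numbers below are all assumptions made for the sketch; it also
assumes each sendmail child holds at most one cached db connection.

```python
# Hypothetical helper: split one database's connection limit across
# several mail hosts that share it.  Not part of the milter.

def children_per_host(db_max_connections, hosts, reserved=5):
    """Return a per-host value to use for confMAX_DAEMON_CHILDREN.

    Assumes each sendmail child holds at most one cached db connection
    (through its milter instance), and that `reserved` connections are
    kept free for admin sessions, clean-up jobs, and so on.
    """
    if hosts < 1:
        raise ValueError("hosts must be at least 1")
    usable = db_max_connections - reserved
    if usable < hosts:
        raise ValueError("not enough connections to give every host one")
    # Round down so the hosts together can never exceed the usable total.
    return usable // hosts


if __name__ == "__main__":
    # Example numbers only: a db limit of 100 shared by two hosts.
    print(children_per_host(100, 2))
```

The result would go into each host's sendmail.mc in place of the 50 used
above.  Rounding down wastes a connection or two but keeps the total under
the db limit.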

>   delete from relaytofrom where record_expires < NOW();
>
> or, for safety, perhaps something like:
>
>   delete from relaytofrom where record_expires < NOW()
>                             and origin_type != 'MANUAL';

If you'd like to keep recent statistics but age off older ones (which I
would encourage), I'd use something more like:

  delete from relaytofrom where record_expires < NOW() - interval 30 day
                            and origin_type = 'AUTO';

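I inverted the origin_type test (matching 'AUTO' instead of excluding
'MANUAL') since I could see reasons for adding more origin_types than there
are currently, and records of any new type should not be swept up by the
clean-up.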
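
To check which rows a delete like this touches, here is a small sketch that
runs the same condition against an in-memory SQLite table.  It is a stand-in
only: the real table has more columns, SQLite's datetime('now', '-30 days')
replaces MySQL's "NOW() - interval 30 day", and 'IMPORTED' is a made-up
origin_type used to show what happens to types added later.

```python
import sqlite3

# Stand-in for the real table: only the two columns the delete looks at.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table relaytofrom (id integer primary key,"
    " record_expires text, origin_type text)"
)

rows = [
    # (expiry relative to now, origin_type)
    ("-45 days", "AUTO"),      # expired over 30 days ago -> deleted
    ("-5 days", "AUTO"),       # expired recently -> kept for statistics
    ("+10 days", "AUTO"),      # not yet expired -> kept
    ("-45 days", "MANUAL"),    # old, but entered by hand -> kept
    ("-45 days", "IMPORTED"),  # hypothetical future origin_type -> kept
]
for offset, origin in rows:
    conn.execute(
        "insert into relaytofrom (record_expires, origin_type)"
        " values (datetime('now', ?), ?)",
        (offset, origin),
    )

# Same condition as the MySQL statement, in SQLite's date syntax.
cur = conn.execute(
    "delete from relaytofrom"
    " where record_expires < datetime('now', '-30 days')"
    " and origin_type = 'AUTO'"
)
deleted = cur.rowcount
remaining = [r[0] for r in conn.execute(
    "select origin_type from relaytofrom order by id")]
print(deleted, remaining)
```

Only the AUTO row that expired more than 30 days ago is removed.  With the
earlier "!= 'MANUAL'" form, the IMPORTED row would have been deleted too.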

If you can, please post some statistics from your tests.  I'm sure we're all
interested in seeing just how bogus my efficiency claims are.  :)

Evan




