[Greylist-users] some comments on spamd

Wed Apr 27 07:22:57 PDT 2005

Now that I have spamd working and understand a bit better how it works,
I have a couple of observations.

I initially had completely misunderstood the model of how this
code works.  I expected it to be a "man in the middle" MTA which
would either store-and-forward, or maybe proxy filter on the fly
in real time as it connects to the target MTA.

As everyone but me knew, that's not how it works.  It's more like a tarpit
or honeypot mailer, i.e. a data sink, which has the ability to redirect
subsequent calls to the real MTA once it has taken the decision to
whitelist the sender's IP.

There are a couple of problems with this approach:  the first is that
it *requires* the sender to attempt THREE deliveries, not two.  A
man-in-the-middle MTA would be able to pass the delivery straight
through on the second attempt.  This code can't.  There's a good
chance that senders will use increasing back-off periods and that the
final delivery will be quite late.
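
To make the attempt count concrete, here is a toy model of the two
designs as described above.  It is only a sketch, not spamd's code:
the function name, the PASSTIME value and the retry schedules are all
made up for illustration.

```python
# Toy model only: names and the PASSTIME value are illustrative
# assumptions, not spamd's real code or configuration.
PASSTIME = 25 * 60  # assumed seconds a triple must age before a retry counts


def attempts_needed(retry_times, inline_proxy):
    """Return the 1-based attempt number at which the mail is delivered.

    retry_times:  increasing times (in seconds) at which the sender tries.
    inline_proxy: True models a man-in-the-middle MTA that can pass the
                  qualifying retry straight through; False models the
                  tarpit+redirect design, where the qualifying retry only
                  whitelists the IP and is itself tempfailed.
    Returns None if the schedule ends before the mail gets through.
    """
    first_seen = None
    whitelisted = False
    for n, t in enumerate(retry_times, start=1):
        if whitelisted:
            return n            # now redirected to the real MTA
        if first_seen is None:
            first_seen = t      # first attempt: record triple, tempfail
            continue
        if t - first_seen >= PASSTIME:
            if inline_proxy:
                return n        # proxy accepts on this very attempt
            whitelisted = True  # tarpit: whitelist, but still tempfail
    return None


# A sender retrying after 30 minutes, then after a further hour.
schedule = [0, 1800, 5400]
print(attempts_needed(schedule, inline_proxy=False))
print(attempts_needed(schedule, inline_proxy=True))
```

With a back-off schedule like this one, the extra attempt costs the
whole of the sender's next retry interval, which is where the late
final delivery comes from.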

This is compounded by the following problem: spamd disconnects
immediately after sending the tempfail.  It does not wait to see
if there are any more items to be delivered.  If the sender has
a queue of messages to several recipients at your site, and does
not always retry the queue from the top each time (which I know
from experience happens a lot) then the first triple from each
retry of the queue to your site will be different and the
sending system will not be whitelisted.
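
The effect of the queue order can be sketched with another toy model.
It assumes, as described above, that only the first triple of each
connection is ever seen, and that stored triples never expire.  The
function name, addresses and timings are hypothetical.

```python
def connections_until_whitelisted(queue, rotate, passtime=1500,
                                  interval=1800, max_tries=20):
    """Count connections until a previously seen, aged triple turns up.

    queue:  the sender's queued triples for this site, in queue order.
    rotate: if True, the sender starts each retry one message further
            down the queue instead of from the top.
    Only the first triple of each connection is examined, because the
    model disconnects right after the tempfail.
    Returns None if max_tries connections are not enough.
    """
    seen = {}                       # triple -> time first seen
    q = list(queue)
    for n in range(1, max_tries + 1):
        now = (n - 1) * interval    # one connection per retry interval
        triple = q[0]
        if triple in seen and now - seen[triple] >= passtime:
            return n                # this connection gets the IP whitelisted
        seen.setdefault(triple, now)
        if rotate:
            q = q[1:] + q[:1]       # next retry starts at the next message
    return None


# Three queued messages from one host to three recipients (made-up data).
queue = [("192.0.2.1", "sender@example.org", rcpt + "@example.com")
         for rcpt in ("alice", "bob", "carol")]
print(connections_until_whitelisted(queue, rotate=False))
print(connections_until_whitelisted(queue, rotate=True))
```

In this model a rotating queue of k messages needs k+1 connections
before the host is whitelisted, and the mail itself only gets through
on the connection after that.  If stored triples expire in the
meantime, or the order is effectively random, it may never happen.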

Finally, the model that spamd uses is that one triple for a sending
IP causes the whole sending site to be whitelisted.  I believe the
original intention of Evan Harris's paper was to make the whitelisting
decision /per triple/.  At first this looks like an optimisation
and maybe not such a bad thing; *however*, I have an (as yet unconfirmed)
suspicion that whitelisting the whole site rather than each triple
will make it far easier for spammers to bypass the greylist mechanism;
for example, they send one mail, wait an hour, then send a huge number.
With greylisting on each triple, they'd have to waste a lot of time
trying every recipient, and also that time would make them
much more visible and likely to be taken down in the
intervening time; but by only sending one test message
they stay under the radar until an hour later at which point
they can deluge us with mail to 10,000 recipients.  They would
at no point be sitting around idle because the "enabling" initial
message would be scheduled in while spamming other sites that
were enabled an hour earlier.
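
A rough sketch of that scenario, comparing the two whitelisting
policies.  This is a simplified model to illustrate the suspicion, not
a measurement: the Greylist class, the one-hour waiting period and the
addresses are assumptions, and the extra third attempt is ignored.

```python
class Greylist:
    """Toy greylist; per_ip=True whitelists the whole sending IP once
    any one of its triples has aged, per_ip=False judges each triple."""

    def __init__(self, per_ip, passtime=3600):
        self.per_ip = per_ip
        self.passtime = passtime
        self.first_seen = {}        # (ip, sender, rcpt) -> time first seen
        self.white_ips = set()

    def accept(self, ip, sender, rcpt, now):
        if self.per_ip and ip in self.white_ips:
            return True             # the whole site is already enabled
        triple = (ip, sender, rcpt)
        if triple not in self.first_seen:
            self.first_seen[triple] = now
            return False            # new triple: tempfail
        if now - self.first_seen[triple] >= self.passtime:
            if self.per_ip:
                self.white_ips.add(ip)
            return True             # aged triple: let it through
        return False                # retried too early: tempfail


def flood_delivered(per_ip, recipients=10000):
    """One probe, one retry an hour later, then a flood; returns how
    many flood messages are accepted on their first attempt."""
    g = Greylist(per_ip)
    ip, sender = "192.0.2.1", "spammer@example.org"
    g.accept(ip, sender, "probe@example.com", now=0)      # the test message
    g.accept(ip, sender, "probe@example.com", now=3600)   # its retry
    return sum(g.accept(ip, sender, "user%d@example.com" % i, now=3601)
               for i in range(recipients))


print(flood_delivered(per_ip=True))
print(flood_delivered(per_ip=False))
```

Under these assumptions the whole flood gets through at once with
per-IP whitelisting, while with per-triple greylisting every one of
those recipients still has to be retried after the waiting period.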

There's one other potential issue that the tarpit+redirect solution
doesn't handle, and that's SMTP callbacks as mentioned in his paper.
With the real MTA replaced by the dummy one in spamd, there's no
way that callbacks will work - unless you are pre-whitelisting
outgoing calls and the pre-whitelisting takes effect immediately.
However pre-whitelisting is an add-on feature to spamd and is not
going to be possible in all configurations (i.e. you need to be
using spamd in a transparent bridge for this to work; if you can't
see the traffic going out from your site's mailers, pre-whitelisting
is not going to work for you).

The same problem arises if you have peripatetic users who call home
to send mail from their portables using SMTP AUTH.  Asking your users
to stay in the same place for an hour and try three times is probably
not going to fly the first time someone from your board of directors
wants to mail home from the airport between flights :-)

I'm not knocking the code, I actually think it's a pretty neat
hack, but it *is* a hack.  The correct way to do this should be
as an active man-in-the-middle (which *can* be done in an MTA-
independent way, even on-the-fly without store-and-forward, as
long as the target system adheres to the RFCs).

Graham