[Greylist-users] greylist lib in C? + several Q's

Wed Sep 1 04:46:18 PDT 2004

> "William Blunn" <bill--greylist at blunn.org> replies, to Graham Toal:
>>> Do you decide
>>> to either pass the mail or not, for all users, or do you
>>> reject all recipients except one
>>
>> Again, if you do it at RCPT time, you can return results for each
>> recipient.
>>
>> If you do it somewhere else, you need some way of combining the
>> greylisting results for all the recipients.  This is an implementation
>> decision.
>>
>> The algorithm I used was to take the "most rejectional" result out of
>> all the results for all of the recipients.  The possible greylisting
>> results for each recipient are:
>>
>>   0 ACCEPT
>>   1 TEMPREJECT
>>   2 REJECT
>>
>> Then take the maximum value reached for this message, and that is the
>> result for the message.
> 
> So if the email is to be (permanently) rejected for one of the
> recipients, and accepted for others, you'd issue a reject after the
> DATA phase?  I hope you don't allow random users to blacklist senders,
> then.

I think we have a misunderstanding here.

The only time REJECT is returned is when one of the triples appears with
a BLACK entry in the database.  These can only ever be added manually.

In the normal scheme of things, you could never even get a REJECT
response.

IF a sender sent a message to multiple recipients, where ONE of those
recipients matches a *manual* BLACKlist entry, then yes, the message
would be rejected for *all* recipients.  But you would only ever put in
a BLACKlist entry when you knew that mail for that recipient was
definitely spam.

I have seen cases like these where the spammer sends mail to multiple
recipients at my domain, including one to a spambait address.  By
BLACKlisting the spambait address, you would get to reject the messages
for all the other recipients as well - great!
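
As a rough sketch of the "most rejectional" rule quoted above (the
names Result and combine are made up for illustration, and treating a
message with no recipients as ACCEPT is an assumption, not something
any particular implementation is known to do):

```python
from enum import IntEnum


class Result(IntEnum):
    """Per-recipient greylisting results, ordered by severity."""
    ACCEPT = 0
    TEMPREJECT = 1
    REJECT = 2


def combine(results):
    """Return the most rejectional result across all recipients.

    An empty list yields ACCEPT; that default is an assumption.
    """
    return max(results, default=Result.ACCEPT)
```

So combine([Result.ACCEPT, Result.REJECT]) gives Result.REJECT, which
is the spambait case described above: one BLACK match rejects the
message for every recipient.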

However *all* of the above is academic because sending TEMPREJECT after
DATA for non-null senders is well-known as a broken technique.  (Some
MTAs ignore TEMPREJECT after DATA.  You should TEMPREJECT at RCPT time.)

>>> and cause the sender to
>>> send individual copies to each person so you can make the
>>> decision on a per user basis?
>>
>> I am unable to relate your question to my understanding of e-mail
>> delivery.  Re-tries are done by the sender's MTA.  The sender's MTA will
>> track the delivery status for each recipient, so if you accept some and
>> tempreject others, the sending MTA will re-try only the ones which were
>> temprejected.
> 
> ... and won't do so individually.

Not so.

If you TEMPREJECT at RCPT time (the only sensible way of doing it), then
the sending MTA will track the delivery status of each recipient and
future delivery attempts will only include recipients where delivery has
not already been completed.
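
To illustrate the sender-side bookkeeping, here is a toy model, not
real MTA code.  The function names are hypothetical, the receiver
ignores the greylisting interval for brevity, and a 2xx reply at RCPT
time is assumed to mean the recipient was delivered in that attempt:

```python
def attempt_delivery(pending, rcpt_check):
    """Run one delivery attempt and return the recipients to retry.

    rcpt_check(recipient) returns the SMTP reply code given to RCPT TO
    for that recipient: 2xx accepted, 4xx temporary, 5xx permanent.
    """
    retry_later = []
    for recipient in pending:
        code = rcpt_check(recipient)
        if 400 <= code < 500:
            retry_later.append(recipient)  # only these are retried
        # 2xx: assumed delivered now; 5xx: bounced, never retried
    return retry_later


def make_greylisting_receiver(known):
    """Hypothetical receiver: 250 for known recipients, 451 on first sight."""
    seen = set(known)

    def rcpt_check(recipient):
        if recipient in seen:
            return 250
        seen.add(recipient)
        return 451

    return rcpt_check
```

With "alice@example.org" already known, a first attempt for alice and
bob leaves only bob pending, and the second attempt is made for bob
alone.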

>>> 2) Is there any danger in *always* doing the temporary
>>> reject after the DATA command is complete?  I know that the
>>> whitepaper suggests doing this only for MAIL FROM:<>
>>> (with some hacks for broken mailers) but for my purposes
>>> I'd rather like to do it that way all the time.  One
>>> reason being I want to store the mail, for QA purposes,
>>> so we can be sure that good mail has not been rejected;
>>
>> You're not rejecting it, you're temporarily rejecting it.
>>
>> Your MTA is *always* entitled to tempreject.  The load average could be
>> too high.  That is what temporary rejection is for.  If the sending MTA
>> doesn't re-try, then that is *their* problem, *you* are golden, *they*
>> don't have a leg to stand on.
> 
> Try explaining that to the people at Yahoo Groups.  Try explaining to
> your users that Yahoo Groups misbehaves, and therefore isn't allowed
> to send them email.
> 
> No, small sites you might be able to get fixed or ignore, but unless
> you've got a lot of leverage, there will be the occasional broken
> large site that you can't fix and can't afford to lose lots of mail
> from.  For the near future, you'll need a whitelist, and it will need
> to be maintained manually, and ideally you'd want to be able to
> monitor traffic for possible updates to the whitelist.  If you don't
> mind losing an occasional message, maybe you can do it with logged IP
> addresses and hostnames; if you're more paranoid than that, you want
> to collect message bodies for at least automated analysis.
> Unfortunately that's not very compatible with tempfailing for some
> recipients and not others.
> 
> I'm somewhat on the paranoid side, but even I'm happy with Evan's
> relaydelay and the risk of the occasional lost message from
> misbehaving sending sites.

We should put in exceptions for large recalcitrant senders.

However, that should not be the only thing we do:

1. We need to report bugs to the problem senders.  We need to actually
   do this, not just say that we will.

2. Users also need to be educated that certain senders have issues
   outside the control of the local administration which may cause
   problems in any case (i.e. nothing to do with greylisting).  If your
   local mail server is busy and your MTA decides that the load average
   is too high to accept new mail and returns TEMPREJECT to Yahoo Groups
   (which it is perfectly entitled to do), then Yahoo will give up the
   delivery!

   This is a Yahoo Groups problem, but responsible professional
   administrators won't just blame a third party if it isn't going to be
   credible to the average user.

   It would be appropriate to be pro-active and educate users that there
   are problems with Yahoo Groups, that it is outside of our control,
   that Yahoo Groups have been advised, and that if it is desired to
   pursue Yahoo Groups directly, provide contact details for doing that.

>>> another is that I'm considering greylisting *only* if it
>>> fails a spam test - otherwise it is accepted.
>>
>> I'm not sure that this is a good idea.  Greylisting tends to be the
>> better initial triage function as it can weed out a vast proportion of
>> incoming delivery attempts at very low cost.
> 
> Yep.  Most reports are that it cuts down CPU and network resource
> consumption by causing you not to ever look at a large fraction of the
> (supposed) spam traffic.
> 
> Another approach I've deployed recently that also seems to help is
> delaying the greeting banner.  The spec says the client has to wait
> for the banner (for some reasonably long time like 5 minutes) before
> it sends anything.  Spamware and viruses often do not.  So, wait 45
> seconds or so before sending the banner, and if any traffic comes in,
> it's not a properly behaving MTA, so reject everything it sends, or
> drop the connection.  If they disconnect on you, they're also not
> following the specs.
> 
> With the 45s delay, my mail system at home has been discarding about
> half as many attempts as greylisting later tempfails (more, in the
> last day or so).  Blithely assuming that all of those would result in
> database entries, that could be somewhere around 1/3 of the potential
> spammer entries in the greylist database no longer getting added.
> 
> I chose 45 seconds based on a web page I found (and can't find again
> at the moment) where someone had done some monitoring on a system set
> to delay 90 seconds, and found that something like 90% of hosts that
> misbehaved did so in the first 45 seconds.  I suppose I could go for
> 90 or 120 seconds and get a few more....
> 
> (Sendmail 8.13 has this capability built in; it's about 60 lines of C
> code that can easily be dropped into earlier versions, and a little
> addition in sendmail.cf.)

Exim 4 goes one better than this.

Exim 4 checks that commands coming in from the remote MTA arrive in
the right order relative to our outgoing responses.

So, *before* sending the initial greeting, we check the receive buffer.

If the remote mailer has already said "HELO", we report a
synchronisation error and drop the connection.

Similarly, we check to see if the remote MTA jumps the gun with MAIL,
RCPT, DATA.

Any SMTP non-conformance is dealt with in the same way.
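
The idea of looking at the receive buffer before greeting can be
sketched as below.  This is not Exim's or sendmail's code; the
function names, the banner, and the 554 reply text are all
illustrative assumptions.  It simply waits up to a configurable time
and treats any readable data (or a disconnect) as a client that spoke
out of turn:

```python
import select
import socket


def client_spoke_early(conn, wait_seconds):
    """True if the client sent data or closed before our banner."""
    readable, _, _ = select.select([conn], [], [], wait_seconds)
    return bool(readable)


def greet(conn, wait_seconds=45.0,
          banner=b"220 mail.example.org ESMTP\r\n"):
    """Send the banner only to clients that waited for it."""
    if client_spoke_early(conn, wait_seconds):
        try:
            # Reply text is illustrative, not a quote from any MTA.
            conn.sendall(b"554 SMTP synchronisation error\r\n")
        except OSError:
            pass  # the client may already have gone away
        conn.close()
        return False
    conn.sendall(banner)
    return True
```

With wait_seconds=0 this only checks whatever has already arrived;
with a delay such as 45 seconds it also behaves like the delayed
banner described in the quoted text.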

>> The Bagley system also stores a status value against each triple, so
>> that if, for example, the greylisting delay is extended, you do not get
>> triples which were previously happy suddenly start temprejecting again.
> 
> So you've kind of got a block-expired flag instead of a block-expires
> time...

No, it's a status value.  It can have several values:

  BLACK      Blacklisted, always reject attempts matching this triple.
             (always entered manually)

  WHITE      Whitelisted, always accept attempts matching this triple.
             (always entered manually)

  DARKGREY   We have seen this triple once, and we TEMPREJECTED it.
             In future, accept mail if the greylisting interval has
             elapsed since the last-seen time for this triple.

  LIGHTGREY  We have seen this triple at least twice, so accept messages
             matching this triple.

  MIDGREY    We have seen this triple once, but the sending network and
             the sending e-mail address matched a LIGHTGREY record; that
             looks like a legitimate sender, so we will ACCEPT this
             triple straight away.

  REVERSE    We have seen an outgoing message where the reply would
             match this triple.  So accept any messages matching this
             triple.
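
A minimal sketch of how these status values could map to a decision.
The function decide and its parameters are hypothetical.  It leaves
out the MIDGREY lookup for new triples and any status updates, and it
assumes a triple with no record is temprejected on first sight:

```python
from enum import Enum


class Status(Enum):
    BLACK = "BLACK"
    WHITE = "WHITE"
    DARKGREY = "DARKGREY"
    LIGHTGREY = "LIGHTGREY"
    MIDGREY = "MIDGREY"
    REVERSE = "REVERSE"


def decide(status, last_seen, now, greylist_interval):
    """Return "ACCEPT", "TEMPREJECT" or "REJECT" for one triple.

    status is None when the triple has no record yet; times are in
    seconds on any common clock.
    """
    if status is None:
        return "TEMPREJECT"  # first sighting (assumed: no MIDGREY match)
    if status is Status.BLACK:
        return "REJECT"
    if status is Status.DARKGREY:
        waited = now - last_seen
        return "ACCEPT" if waited >= greylist_interval else "TEMPREJECT"
    # WHITE, LIGHTGREY, MIDGREY and REVERSE all accept
    return "ACCEPT"
```

Because LIGHTGREY is accepted regardless of the interval, lengthening
the greylisting delay does not make previously happy triples start
temprejecting again.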

>>> The expiry time is calculated by a simple addition of a
>>> constant, but if you change your policy, wouldn't you want it
>>> to apply retroactively to all entries in your database rather than
>>> just new ones being added?
> 
> Perhaps.  I think it would be an infrequent event, and the problem
> automatically handled in time as older records expire or get updated.
> If you really want to make it retroactive, I think running a simple
> SQL command to add the delta in the appropriate cases wouldn't be too
> much to ask.

I don't agree.

There are a couple of relevant time points after the time the triple was
last seen:

1. The greylisting time

2. The record expiry time (which could differ depending on whether you
   have seen a re-try or not, or whether the record is for reverse
   processing)

We could calculate both of these and store both of them, but it is uselessly
redundant and is not consistent with best database practice.

The proper way to do it is to store the "last seen" time, and then do
measurements from there using offsets according to what you are trying
to achieve at the time.
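
Something along these lines, as a sketch only.  The constant names
and every numeric value here are assumptions chosen for illustration,
not recommended settings:

```python
# All values are in seconds and are assumed, not recommended.
GREYLIST_DELAY = 5 * 60
EXPIRY_AFTER = {
    "DARKGREY": 4 * 3600,     # no re-try seen yet
    "LIGHTGREY": 36 * 86400,  # re-try seen
    "REVERSE": 36 * 86400,    # created from outgoing mail
}


def greylisting_elapsed(last_seen, now, delay=GREYLIST_DELAY):
    """True once the greylisting delay has passed since last_seen."""
    return now - last_seen >= delay


def record_expired(status, last_seen, now, expiry_after=EXPIRY_AFTER):
    """True once the record is old enough to be purged."""
    return now - last_seen >= expiry_after[status]
```

Only last_seen is stored.  Changing GREYLIST_DELAY or EXPIRY_AFTER
takes effect for every existing record on the next lookup, with no
database update needed.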

Bill