[Greylist-users] Greylisting with MySql I/O performance

William Blunn bill--greylist at blunn.org
Tue Aug 24 03:04:05 PDT 2004


> Anyone seen I/O performance issues with MySql greylisting database?  We 
> process roughly 3M pieces of email each week.  We are running our database 
> on a 3Ghz system, and it is running almost 98% idle cpu.
> 
> However, the database is on a single IDE disk, and it appears to be running 
> 98-100% busy.  We know we would get some benefit from using a SCSI disk 
> instead, but are wondering if we should move to hardware Raid 0 (striping) 
> instead?
> 
> We are seeing delays on our mail servers that are inquiring against the 
> greylist database at this point.  Comments?
> 
> We have tuned the MySql database as best we can with table-cache set to 
> 1200, max_connections to 300, and key_buffer to 20M.  Any other suggestions?
> 
> Also we added an index for record_expires to relaytofrom for the maintenance 
> script that gets run nightly, but it too is taking way too long to run - 
> again I suspect I/O is the problem.

1. How much memory (RAM) is in the machine?

2. Can you confirm that DMA is enabled on the hard disk?
   ( hdparm -d /dev/hda ) (This may sound stupid, but you may be
   surprised at how many machines operated by otherwise competent people
   do not have this enabled.)

3. Does your maintenance run involve doing a large DELETE operation?

   Are you getting problems with the table being locked out for the
   duration of the DELETE operation?
   
   I found a way of addressing that which is to put a LIMIT clause on
   the DELETE operation to limit it to, say, 100 rows.  Then I put the
   DELETE into a loop which runs until the number of rows deleted is
   zero.  That way the table is only locked out for brief periods.

4. I don't know about the database design of the greylisting system you
   are using, but you may be running up against some inefficiences in
   there.

   Smallness of key size is one way of getting good performance.

   (a) The default method of indexing sender and recipient e-mail
       addresses taken by the average programmer is to index against a
       fixed-size prefix of the address.  This can result in a
       contribution to the the key size of 20 to 25 bytes per e-mail
       address.

   (b) The default method of storing IP addresses taken by the average
       programmer is to store a string representation of the IP address.
       For IPv4 addresses, this will result in a 15 character string.
       This is usually indexed straight-up, resulting in a contribution
       of 15 bytes to the key size.

    The Bagley system indexes sender, recipient, and calling network by
    storing a CRC32 hash of the value and then indexing on that.  So the
    constribution to the key size is just 4 bytes for each of these
    three values.
    
    (The Bagley system is really only useful for Exim systems, but there
    is no real reason why the ideas shouldn't be equally applicable to
    other greylisting systems.)
    
    Extract from the Bagley system table specification:
   
      net            bigint unsigned,                     # sending "network"
                                                          # address AND'd with 0xFFFFFF00 for IPv4
                                                          # most significant 64 bits of address for IPv6
    
      net_crc        int unsigned     default 0 NOT NULL, # CRC32 of sending "network"
    
      sender         varchar(255),                        # sender e-mail address, lowercased
      sender_crc     int unsigned     default 0 NOT NULL, # CRC32 of sender e-mail address
                                                          # Although this field is calculable, MySQL does
                                                          # not currently support hash indexes, so we
                                                          # will emulate that with a CRC field
    
      recipient      varchar(255),                        # recipient e-mail address, lowercased
      recipient_crc  int unsigned     default 0 NOT NULL, # CRC32 of recipient e-mail address
                                                          # Same comment as per sender_crc.

      ...

      key triple (net_crc, sender_crc, recipient_crc),


Bill


More information about the Greylist-users mailing list