[Greylist-users] Greylisting with MySql I/O performance
William Blunn
bill--greylist at blunn.org
Tue Aug 24 03:04:05 PDT 2004
> Anyone seen I/O performance issues with MySql greylisting database? We
> process roughly 3M pieces of email each week. We are running our database
> on a 3Ghz system, and it is running almost 98% idle cpu.
>
> However, the database is on a single IDE disk, and it appears to be running
> 98-100% busy. We know we would get some benefit from using a SCSI disk
> instead, but are wondering if we should move to hardware Raid 0 (striping)
> instead?
>
> We are seeing delays on our mail servers that are inquiring against the
> greylist database at this point. Comments?
>
> We have tuned the MySql database as best we can with table-cache set to
> 1200, max_connections to 300, and key_buffer to 20M. Any other suggestions?
>
> Also we added an index for record_expires to relaytofrom for the maintenance
> script that gets run nightly, but it too is taking way too long to run -
> again I suspect I/O is the problem.
1. How much memory (RAM) is in the machine?
2. Can you confirm that DMA is enabled on the hard disk?
( hdparm -d /dev/hda ) (This may sound stupid, but you may be
surprised at how many machines operated by otherwise competent people
do not have this enabled.)
3. Does your maintenance run involve doing a large DELETE operation?
Are you getting problems with the table being locked out for the
duration of the DELETE operation?
I found a way of addressing that which is to put a LIMIT clause on
the DELETE operation to limit it to, say, 100 rows. Then I put the
DELETE into a loop which runs until the number of rows deleted is
zero. That way the table is only locked out for brief periods.
4. I don't know about the database design of the greylisting system you
are using, but you may be running up against some inefficiences in
there.
Smallness of key size is one way of getting good performance.
(a) The default method of indexing sender and recipient e-mail
addresses taken by the average programmer is to index against a
fixed-size prefix of the address. This can result in a
contribution to the the key size of 20 to 25 bytes per e-mail
address.
(b) The default method of storing IP addresses taken by the average
programmer is to store a string representation of the IP address.
For IPv4 addresses, this will result in a 15 character string.
This is usually indexed straight-up, resulting in a contribution
of 15 bytes to the key size.
The Bagley system indexes sender, recipient, and calling network by
storing a CRC32 hash of the value and then indexing on that. So the
constribution to the key size is just 4 bytes for each of these
three values.
(The Bagley system is really only useful for Exim systems, but there
is no real reason why the ideas shouldn't be equally applicable to
other greylisting systems.)
Extract from the Bagley system table specification:
net bigint unsigned, # sending "network"
# address AND'd with 0xFFFFFF00 for IPv4
# most significant 64 bits of address for IPv6
net_crc int unsigned default 0 NOT NULL, # CRC32 of sending "network"
sender varchar(255), # sender e-mail address, lowercased
sender_crc int unsigned default 0 NOT NULL, # CRC32 of sender e-mail address
# Although this field is calculable, MySQL does
# not currently support hash indexes, so we
# will emulate that with a CRC field
recipient varchar(255), # recipient e-mail address, lowercased
recipient_crc int unsigned default 0 NOT NULL, # CRC32 of recipient e-mail address
# Same comment as per sender_crc.
...
key triple (net_crc, sender_crc, recipient_crc),
Bill
More information about the Greylist-users
mailing list