[Greylist-users] greylist lib in C?

Wed Aug 25 03:16:28 PDT 2004

> Question: is a full database like SQL really necessary for storing the
> timing information, or are there any robust but cheap hacks that work
> just as well that you can recommend?
> - it's what spamprobe uses for tokens and seems fairly robust.  I guess
> the main reason for using a proper database here is record locking?

You want it to handle people's incoming e-mail?

Then you want it to be robust.

You want persistent state, and I mean persistent.  With greylisting, you
really cannot get by with in-memory state.  If you do that, then every
time you restart your daemon you will mess up your users' incoming
e-mail.  You really do want it persistent on disk.

You want to be able to run optimised queries on the database?  SQL
databases are designed to do that.  Whilst the UI is a text-based query
system, do not let this deceive you into thinking that it is kludgy; the
underlying data is stored in the most appropriate way, and the queries
are heavily optimised using algorithms and implementations that have
been developed over many years by people whose job it is to do exactly
that.
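
To make that concrete, here is a rough sketch (not taken from any
existing greylisting implementation) of an on-disk table keyed on the
sender_ip/env_from/env_to triplet mentioned in the quoted question, with
the lookup going through the primary-key index.  It uses Python's
built-in sqlite3 module purely so that it runs without a server; the
table name, column names, the 300-second delay and the helper names
open_db and check_triplet are all assumptions made for illustration.
The same schema and queries should carry over to MySQL with little
change.

```python
import sqlite3
import time

DELAY_SECONDS = 300  # assumed greylisting delay; not from the original post


def open_db(path):
    """Open (or create) the on-disk greylist database at `path`."""
    conn = sqlite3.connect(path)
    # The composite primary key gives us an index on the full triplet,
    # so the lookup below does not have to scan the table.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS greylist (
               sender_ip  TEXT    NOT NULL,
               env_from   TEXT    NOT NULL,
               env_to     TEXT    NOT NULL,
               first_seen INTEGER NOT NULL,
               PRIMARY KEY (sender_ip, env_from, env_to)
           )"""
    )
    conn.commit()
    return conn


def check_triplet(conn, sender_ip, env_from, env_to, now=None):
    """Return True to accept the message, False to temp-fail it."""
    now = int(time.time()) if now is None else now
    row = conn.execute(
        "SELECT first_seen FROM greylist"
        " WHERE sender_ip = ? AND env_from = ? AND env_to = ?",
        (sender_ip, env_from, env_to),
    ).fetchone()
    if row is None:
        # First sighting: record it and defer.  OR IGNORE keeps a
        # concurrent insert of the same triplet from raising an error.
        conn.execute(
            "INSERT OR IGNORE INTO greylist VALUES (?, ?, ?, ?)",
            (sender_ip, env_from, env_to, now),
        )
        conn.commit()
        return False
    # Seen before: accept only once the delay has elapsed.
    return now - row[0] >= DELAY_SECONDS
```

Because the rows live in a file, a retry that arrives after the daemon
has been restarted still finds the first_seen timestamp recorded before
the restart.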

You want to be able to make ad-hoc diagnostic queries on the database?
SQL databases are designed to do that.  You just open up a client and
enter your query.
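
For instance, "which sending hosts currently have the most triplets
still waiting out the delay?" is a single query.  The sketch below is
illustrative only: the table layout, the 300-second delay, the helper
name pending_by_ip and the sample rows (documentation-range addresses)
are all made up, and the database is held in memory just to keep the
demonstration self-contained.  Against a real server you would type the
same SELECT into the database client.

```python
import sqlite3

# In-memory database used only so the demo runs anywhere; a real
# greylist would live on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE greylist ("
    " sender_ip TEXT, env_from TEXT, env_to TEXT, first_seen INTEGER)"
)
# Hypothetical sample rows, invented for illustration.
conn.executemany(
    "INSERT INTO greylist VALUES (?, ?, ?, ?)",
    [
        ("192.0.2.1", "a@example.com", "u1@example.org", 1000),
        ("192.0.2.1", "b@example.com", "u2@example.org", 1100),
        ("198.51.100.7", "c@example.net", "u1@example.org", 1150),
        ("203.0.113.9", "d@example.net", "u3@example.org", 500),
    ],
)
conn.commit()


def pending_by_ip(conn, now, delay=300):
    """Count triplets per sender IP that are still inside the delay."""
    return conn.execute(
        "SELECT sender_ip, COUNT(*) AS n FROM greylist"
        " WHERE ? - first_seen < ?"
        " GROUP BY sender_ip"
        " ORDER BY n DESC, sender_ip",
        (now, delay),
    ).fetchall()


for ip, n in pending_by_ip(conn, now=1200):
    print(ip, n)
```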

> I'm giving serious thought to just keeping the data in memory - just
> data structures, no database, and handling all requests via a single
> daemon and sockets...  is that too complicated?  It would lose all
> records on a reboot, which is probably not a bad thing.  (I'm assuming
> unix-like uptimes here of many months, not Windoze uptimes of days ;-) )

Even uptimes of many months would not be good enough for a greylisting
database.

> Would it be too much of a hack simply to use the filing system directories
> as an implicit ISAM index? something like
>             .../sender_ip/env_from/env_to/timestamp.txt ?
> I've seen mail systems which do that for the sending queue, maybe it's
> not entirely silly?

Yes, it would be too much of a hack.

MySQL is pretty easy to program against using the C library once you
have seen some example code.

> (to some extent I'm being driven here by the need for a quick proof-
> of-concept hack, that I can always rewrite later using a better structure,
> once I've had time to learn another package, while preserving the
> library interface...)

I have seen one other implementation that uses SQLite.

However, one might argue:  how many different *types* of database do you
want to have?

It is not as though there is a *problem* using, for example, MySQL,
other than a little unfamiliarity.

The more types of database you have, the more complexity you have in
your world.  Increased complexity implies increased management, which
implies increased cost.

Also if you use some weird and wonderful database, then if you publish
your system out to the world, you get some problems:

* you have strange dependencies, so people are less likely to "go for"
  your system,

* you are likely to be using a less popular database system, which has
  less "collective experience".

If you depend on MySQL, then you can count on people already having that
for various other systems they might be using, and also there is a lot
of collective experience (not to mention some pretty good documentation
at http://dev.mysql.com/doc/mysql/en/index.html ).

Bill