spamdyke

A drop-in connection-time spam filter for qmail

Current Release

5/1/2014: Version 5.0.1

Many, many bug fixes, including fixes for a large number of bugs that can cause crashes. spamdyke-qrv has also received a great deal of attention and now handles every strange configuration qmail can support.

Download spamdyke version 5.0.1

Previous Releases

1/28/2014: Version 5.0.0

Adds full recipient validation and some new sender filters. Also changes the whitelisting feature to not automatically allow relaying for whitelisted connections and fixes a number of bugs.

Download spamdyke version 5.0.0
View the upgrade notes

1/20/2012: Version 4.3.1

Corrects a bug in the new header blacklist filter that could cause spurious errors and incorrect message rejections.

Download spamdyke 4.3.1

1/15/2012: Version 4.3.0

Adds the ability to filter messages based on the content of their headers. Also fixes some small bugs, a compile error on Debian 7 and a major series of bugs that could result in buffer overflows (possibly remotely exploitable, depending on the configuration options). Please upgrade immediately!

Download spamdyke 4.3.0

Frequently Asked Questions

General Questions

What is spamdyke and what does it do?
I use email but I don't administer my own mail server. Can I use spamdyke?
How do I install spamdyke?
How do I upgrade spamdyke? What's the significance of the version numbers?
How do I get support for spamdyke? Is there a mailing list?
I don't use qmail. Can I still use spamdyke?
I use Plesk. What is qmail? Can I still use spamdyke?
Do I have to install the programs from the "utils" directory? Does spamdyke use them? Do they use spamdyke or each other?
When I reconfigure spamdyke, do I have to restart qmail?
What is the difference between a right-hand-side blacklist (RHSBL) and a realtime blackhole list (RBL)?
Which RBLs and/or RHSBLs do you recommend?
What operating systems and versions do you test spamdyke with?
I love spamdyke! How can I help? Can I send you money?

Feature Questions

Does spamdyke run its filters in any particular order?
If spamdyke checks IP blacklists before it checks sender whitelists, will whitelisted senders from blacklisted IPs be blocked?
My users authenticate with SMTP AUTH. Can I still use spamdyke?
My users authenticate with POP3-before-SMTP. Can I still use spamdyke?
I want to block all emails unless the sender authenticates. Can spamdyke do that?
Does spamdyke support TLS?
I want to whitelist a large number of IP addresses; can I use wildcards?
I want to disable some filters for a few domains (or users) and enable them for everyone else. Is that possible?
My users' PCs are infected with spambots and are sending spam through my server. Can I force them to authenticate to block the spam?
I want to stop backscatter spam by blocking all messages to invalid or nonexistent recipient addresses. Can spamdyke do that?
Why does spamdyke-qrv have to be marked setuid root? Setuid binaries are evil incarnate!

Feature Suggestions

On the mailing list, you often promise changes in an upcoming version. How can I find out what you're working on?
Why doesn't spamdyke use CDB files or a database? A database would be faster and better than text files and strange directory structures!
Why doesn't spamdyke filter using Sender Policy Framework (SPF), Sender ID, Certified Server Validation (CSV), DomainKeys or DomainKeys Identified Mail (DKIM)?
Why can't spamdyke filter based on message headers like "From:" or "Received:"?
Why can't spamdyke block large messages or strip attachments?
My graylist directories are getting huge -- many, many entries. Is this a problem? What can I do?
Why can't spamdyke automatically delete old graylist entries?
Instead of trying to prevent relaying itself, why doesn't spamdyke just set the RELAYCLIENT environment variable based on authentication/sender address/recipient address?
Why can't spamdyke automatically blacklist (or delay) servers that are rejected too many times?
Why doesn't spamdyke's graylist filter use the IP address of the remote server?

Troubleshooting

Graylisting isn't working! What am I doing wrong?
I want to use spamdyke but some of my users roam and connect from strange places. How can I allow them to send email but still filter spam?
I installed spamdyke and now I'm seeing a lot of timeouts in my logs. Why?
I installed spamdyke and now my server is very slow! Incoming connections have to wait 20 seconds or longer before they see the greeting banner. What can I do to speed it up?
I use spamdyke to prevent relaying (because my qmail isn't patched to provide SMTP AUTH) but SpamAssassin has stopped scanning incoming messages. What gives?
My logs contain the message ERROR: unable to write X bytes to file descriptor 1: Broken pipe. What does it mean?
When I use the config-test feature on my Plesk server, the SMTP AUTH tests always fail. But I know authentication works, so what's wrong?
Why isn't spamdyke blocking messages from some blacklisted servers/senders/recipients?
I can't figure out why spamdyke isn't working correctly -- some features are malfunctioning in strange ways OR I'm seeing strange/impossible error messages in my logs. What's wrong?
Why are messages still being rejected even after I've added the sender's domain name to my rDNS whitelist? OR Why aren't messages being rejected after I've added the sender's domain name to my rDNS blacklist?
I enabled the IP-in-rDNS filter, so why isn't spamdyke blocking connections from servers with rDNS names that contain IP addresses?
I'm trying to run spamdyke's config-test feature but it only says "Missing qmail-smtpd command". What's wrong?

What is spamdyke and what does it do?

In a sentence, spamdyke is a drop-in qmail filter for stopping spam at connection-time.

"drop-in" means it can be installed without patching or recompiling qmail, without installing or updating libraries, without drastically reconfiguring anything and without having to become a qmail expert.

"connection-time" means spamdyke evaluates and rejects spam while the remote server is still delivering it. Other filters and anti-spam solutions focus on classifying spam after qmail has accepted it. The spam still has to go somewhere. Even if it's filed in a folder, it still occupies disk space, consumes server resources and must be deleted by someone. When spamdyke rejects the incoming spam completely, no one has to deal with it. It's never on the server at all.

For a complete description of spamdyke and all its features, see the README page.

I use email but I don't administer my own mail server. Can I use spamdyke?

Unfortunately, no. spamdyke must be installed by a mail administrator. It cannot be installed or used by end users, sorry.

How do I install spamdyke?

For installation instructions, see the INSTALL.txt file.

How do I upgrade spamdyke? What's the significance of the version numbers?

Typically, upgrading spamdyke is as simple as compiling the new version and copying the new spamdyke binary over the old one. However, sometimes the new version is not backwards-compatible with the old version and simply replacing the binary will cause problems. The UPGRADING.txt file has details on the backwards-compatibility of each version.

The version numbers are used to show the type of changes in each version:

When major, non-backwards-compatible changes are made, the "major" version number is incremented. For example, 3.0.0 is not completely backwards compatible with 2.6.3, so the major version number changed from 2 to 3.
When new, backwards-compatible features are added, only the "minor" version number is incremented. For example, 3.1.0 includes new features but is backwards compatible with 3.0.0, so the minor version number changed from 0 to 1.
When no features are added but bugs are fixed, only the "revision" version number is incremented. For example, 3.0.1 only contained bug fixes, so the revision version number changed from 0 to 1.
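As a rough illustration of the numbering scheme above, the following sketch reports which field changed between two version strings. The function name upgrade_kind is hypothetical; it is not part of spamdyke and assumes both versions are written as MAJOR.MINOR.REVISION.

```shell
# Hypothetical helper (not shipped with spamdyke): report which field
# changed between two MAJOR.MINOR.REVISION version strings.
upgrade_kind() {
    old=$1
    new=$2
    old_major=${old%%.*}          # text before the first dot
    new_major=${new%%.*}
    old_rest=${old#*.}            # text after the first dot
    new_rest=${new#*.}
    old_minor=${old_rest%%.*}
    new_minor=${new_rest%%.*}

    if [ "$old_major" != "$new_major" ]; then
        echo "major"       # not fully backwards compatible: read UPGRADING.txt
    elif [ "$old_minor" != "$new_minor" ]; then
        echo "minor"       # new features, backwards compatible
    elif [ "$old" != "$new" ]; then
        echo "revision"    # bug fixes only
    else
        echo "same"
    fi
}

upgrade_kind 2.6.3 3.0.0    # prints "major"
upgrade_kind 3.0.0 3.1.0    # prints "minor"
upgrade_kind 3.0.0 3.0.1    # prints "revision"
```

Whatever the result, the UPGRADING.txt file remains the authoritative source for compatibility details.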

How do I get support for spamdyke? Is there a mailing list?

Yes! Visit the mailing list page to sign up: www.spamdyke.org/mailman/listinfo/spamdyke-users. The mailing list archives are searchable at: www.mail-archive.com/spamdyke-users@spamdyke.org.

All of the documentation and releases can be found on the spamdyke website at spamdyke.org.

If you can't find answers there, send an email to me at: samc (at) silence (dot) org

I don't use qmail. Can I still use spamdyke?

Not at this time.

Some background: qmail starts a new process (qmail-smtpd) from a listening daemon (tcpserver or xinetd) every time a new connection is established. spamdyke slips in between those two, so the daemon starts a new copy of spamdyke for each connection and spamdyke starts the qmail process.
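The wrapper pattern described above can be sketched with a toy shell function. This is only an illustration of the idea, not how spamdyke is implemented: toy_wrapper and the address 192.0.2.66 are made up, and the real spamdyke stays in the middle of the SMTP conversation instead of deciding once and stepping aside. The one detail taken from real life is that tcpserver passes the client's address in the TCPREMOTEIP environment variable.

```shell
# toy_wrapper: hypothetical stand-in for a connection-time filter. The
# listening daemon starts it once per connection and passes the "real"
# SMTP command (e.g. qmail-smtpd and its arguments) as "$@".
toy_wrapper() {
    # tcpserver exports the remote client's address in TCPREMOTEIP.
    if [ "${TCPREMOTEIP:-}" = "192.0.2.66" ]; then
        echo "554 Refused."    # reject without ever starting qmail
        return 1
    fi
    "$@"                       # otherwise start the real SMTP program
}

# Example (echo stands in for the real SMTP program):
#   TCPREMOTEIP=192.0.2.1; toy_wrapper echo "qmail-smtpd would start here"
```

Because the daemon and the SMTP program on either side are untouched, nothing has to be patched or recompiled; the wrapper is simply inserted into the command that the daemon runs.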

Most other mail servers don't work this way; they use a single long-running daemon for handling incoming requests. There's no way to insert spamdyke without rewriting the mail server daemon (no thanks!).

Having said that, spamdyke could be modified to listen for incoming connections itself (replacing qmail's tcpserver) and establish a new connection to the "real" server (presumably listening on a different port, perhaps running on a different interface or machine). This requires some refactoring in the spamdyke code but it's not an insurmountable task.

Look for this feature in a future version.

I use Plesk. What is qmail? Can I still use spamdyke?

Yes. Plesk is a (large, complicated) control panel for configuring your server through a web interface. qmail is mail server software that handles email delivery. Plesk uses qmail under the hood and handles most of its configuration.

spamdyke works very well with Plesk, but there's no automated "push-button" way to install it. You'll need to connect to your server's command line (probably using SSH) to install it. If you're comfortable doing that, see the INSTALL.txt file for installation instructions.

Do I have to install the programs from the "utils" directory? Does spamdyke use them? Do they use spamdyke or each other?

No. domainsplit, domain2path and timefilter are just small utilities for use in writing scripts. spamdyke doesn't use them or depend on them. Conversely, they don't use or depend on spamdyke.

dnsa, dnsmx, dnsns, dnsptr, dnssoa and dnstxt are just small, self-contained examples of how to make DNS queries using libc. They're not really useful on their own. I made them available because I was looking for examples like them when I was writing the DNS code in spamdyke and I couldn't find any. Hopefully someone else will be able to learn from them. Aren't I just a swell guy?

When I reconfigure spamdyke, do I have to restart qmail?

Usually not, but it depends on what you've done.

You do not need to restart qmail when you:

Replace the spamdyke binary
Edit spamdyke's configuration file
Edit a file in a configuration directory
Edit a file referenced from spamdyke's configuration (e.g. adding IP addresses to a whitelist file)

You must restart qmail when you:

Edit spamdyke's command line

What is the difference between a right-hand-side blacklist (RHSBL) and a realtime blackhole list (RBL)?

Right-hand-side blacklists (RHSBLs) are lists of domain names. The name refers to the "right-hand side" of an email address -- the portion after the "@" symbol. Realtime blackhole lists (RBLs) are lists of IP addresses.

Why the difference?

Broadly speaking, most mail servers have one IP address and host many domains. Most small email providers fall into this category -- none of their hosted domains have enough email traffic to justify having their own server, so a single server is configured to handle as many domains as it can. When the server's IP address is blocked, all of the domains it hosts are blocked. RBLs do this.

Conversely, some domains have enough email traffic to need multiple servers. Those servers are dedicated to that one domain and no others. Since the other servers have different IPs, blocking just one IP address is not enough; the domain name must be blocked instead. RHSBLs do this.

Numerous public RBLs exist for blocking IP addresses. Since spammers don't usually send from multiple servers with the same domain name (and they almost always use fake sender addresses), RHSBLs are rather rare.
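The difference also shows up in the DNS names that are looked up. By the usual DNS blacklist convention, an IP-based list is queried with the IP address's octets reversed and the list's zone appended, while a domain-based list is queried with the domain name followed by the zone. The sketch below only builds those names; the function names and the zones rbl.example.org and rhsbl.example.org are placeholders, not real lists.

```shell
# Build the DNS name to query for an IP-based list (RBL).
# Arguments: dotted quad IP address, list zone.
rbl_query_name() {
    zone=$2
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo "$4.$3.$2.$1.$zone"
}

# Build the DNS name to query for a domain-based list (RHSBL).
# Arguments: sender address (or bare domain), list zone.
rhsbl_query_name() {
    echo "${1#*@}.$2"    # keep only the part after the "@"
}

rbl_query_name 192.0.2.25 rbl.example.org
# prints "25.2.0.192.rbl.example.org"
rhsbl_query_name someone@example.com rhsbl.example.org
# prints "example.com.rhsbl.example.org"
```

Looking up the resulting name (with a tool such as host or dig) typically returns an address in 127.0.0.0/8 when the entry is listed and no answer at all when it is not.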

Which RBLs and/or RHSBLs do you recommend?

None. Blacklists are a sensitive issue. Their legal status is uncertain, depending on what country and area you (and they) live in.

Blacklist operators each have different policies about how entries are placed on the list and how they are removed. Some operators are very forgiving, others are not. For some, a simple delisting request is enough while others require more elaborate "proof" a server/domain has "been reformed". Some require money. Many operators have been known to list huge blocks of IP addresses or domains in order to pressure an ISP to cancel a spammer's service. This effectively recruits all of the list's subscribers into a large group to boycott the ISP's services.

For all of those reasons, I am not willing to endorse any specific list (although I use several myself). You must decide for yourself if you are comfortable with a list operator's policies. Be sure to look for complaints about the list and responses from the operators. Sometimes those can be the best indicator of a list operator's true attitude.

What operating systems and versions do you test spamdyke with?

On the following systems, spamdyke's configuration script executes correctly and spamdyke compiles without errors or warnings and displays its usage message:

Amazon AWS (latest version at time of release)
CentOS 5.2 64-bit
CentOS 5.5
CentOS 6.4 64-bit
CentOS 7 64-bit
Debian 7 RC2 (wheezy)
Fedora Core 4
Fedora 11
FreeBSD 2.2.2
FreeBSD 4.7
FreeBSD 6.0
NetBSD 3.1

spamdyke's test scripts execute correctly on the following systems:

CentOS 7 64-bit, netqmail 1.05 (patched with TLS and SMTP AUTH) and vpopmail 5.4.10
CentOS 7 64-bit, netqmail 1.05 (no additional patches) and vpopmail 5.4.10

If you are unable to configure or compile spamdyke correctly, please send an email to samc (at) silence (dot) org so the errors can be fixed and your system can be added to the list of test environments.

I love spamdyke! How can I help? Can I send you money?

I appreciate your generosity, but I decline donations. I started writing spamdyke to meet my own needs. I continue writing it as a hobby, not to make money. I get much more pleasure and motivation from reading thank-you emails than I would from the occasional donation.

If you want to help, there are several ways:

Spread the word about spamdyke. Let others know, create links to this site, post an article on your blog, etc. Everyone hates spam and statistics are very persuasive.
Join the spamdyke-users mailing list and help answer questions. The spamdyke community is growing and it needs support.
Proofread the documentation and suggest improvements. Write tutorials for installing spamdyke. Translate existing documentation into other languages.
If you are a C programmer, send an email to samc (at) silence (dot) org and volunteer to implement a new feature.

Does spamdyke run its filters in any particular order?

Yes. spamdyke evaluates its filters in the following order (of course a filter is skipped if it's disabled):

Check if mail is being accepted or filtered at all

Check for an rDNS name

Check for an IP address in a country code rDNS name

Check for an rDNS whitelist entry

Check for an rDNS blacklist entry

Check for an IP whitelist entry

Check for an IP blacklist entry

Check for an IP address and keyword in the rDNS name

Check if the rDNS name resolves

Check DNS whitelists

Check right-hand-side whitelists

Check DNS RBLs

Check right-hand-side blacklists

Check for earlytalkers

The intent is to order the filters from least-to-most expensive, so connections will be rejected as quickly as possible. In a typical setup, DNS queries are more expensive than file searches, pattern matching is more expensive than simply checking for a file's existence, etc.

The following filters are all checked during the SMTP conversation.

Check sender whitelists

Check right-hand-side whitelists for the sender's domain name

Check sender blacklists

Check right-hand-side blacklists for the sender's domain name

Check for sender's domain MX record

Check if the sender's address matches the authentication username

Check recipient whitelists

Block relaying from unauthorized remote hosts

Limit the number of recipients

Check for identical sender and recipient addresses

Check recipient blacklists

Block unqualified recipient addresses

Block invalid/unavailable recipient addresses

Graylisting

The following filters are checked while the message body is being sent.

Check header blacklists

If spamdyke is passing TLS traffic to qmail without processing (i.e. spamdyke wasn't compiled with TLS support or doesn't have access to the server certificate), spamdyke can't see the SMTP conversation. In that situation, none of the filters in the second or third group are run.

When spamdyke's config-dir option is in use, no filters are run before the remote server gives the recipient's address. This is because spamdyke must load any additional configuration files before it will know which filters to run. This doesn't change spamdyke's effectiveness at all; it only (slightly) delays when a rejection message is first sent to the remote server.

If spamdyke checks IP blacklists before it checks sender whitelists, will whitelisted senders from blacklisted IPs be blocked?

No. Whitelists (all whitelists) always override all blacklists and all other filters. The order in which spamdyke checks the whitelists is not relevant; spamdyke will check all whitelists before it rejects a connection.
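The rule can be modeled in a few lines. This is a sketch of the outcome only, not of spamdyke's internals; final_verdict and the three outcome words are invented for the illustration.

```shell
# Each argument is the outcome of one filter, in the order the filters ran:
#   "whitelist"  a whitelist matched
#   "reject"     a blacklist or other filter wants to reject
#   "pass"       the filter had no objection
final_verdict() {
    verdict=accept
    for outcome in "$@"; do
        case $outcome in
            whitelist) echo accept; return 0 ;;   # overrides everything
            reject)    verdict=reject ;;          # remembered, not final yet
        esac
    done
    echo "$verdict"
}

final_verdict pass reject whitelist    # prints "accept"
final_verdict pass reject pass         # prints "reject"
```

In the first call the IP blacklist "rejects" before the sender whitelist is consulted, yet the connection is still accepted because no rejection is final until every whitelist has been checked.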

My users authenticate with SMTP AUTH. Can I still use spamdyke?

Yes! As of version 2.5.0, spamdyke understands SMTP AUTH and disables all of its filtering for authenticated users.

See the README page for complete details.

My users authenticate with POP3-before-SMTP. Can I still use spamdyke?

Probably not. If your POP3 server writes authenticated IP addresses to a file, you can use that file as an IP whitelist with spamdyke. If it keeps track of authenticated IP addresses in some other way, you're out of luck.

For example, Plesk supports POP3-before-SMTP but it writes the authenticated IP addresses to its database, which spamdyke cannot read.

POP3-before-SMTP is really a kludge anyway; consider using SMTP AUTH instead.

I want to block all emails unless the sender authenticates. Can spamdyke do that?

Yes! As of version 4.0.0, spamdyke accepts the filter-level option. When it is set to require-auth, all connections are rejected unless the sender has authenticated.

Prior to version 4.0.0, the easiest way to accomplish this was to first enable SMTP AUTH. Then, create an IP blacklist file that will block all IP addresses:

0.0.0.0/0.0.0.0

spamdyke will disable its filters for authenticated users and block everyone else.

Does spamdyke support TLS?

As of version 2.6.0, spamdyke supports TLS (which is just another name for SSL). spamdyke will detect TLS and pass it through seamlessly. Obviously, none of its post-connect filters will work (e.g. graylisting) because all of the traffic will be encrypted.

However, if spamdyke has access to a server certificate, it will handle the TLS itself and all of its filters will work as before. Bonus: spamdyke will provide TLS even if your qmail has not been patched to provide TLS!

As of version 4.0.0, spamdyke supports SMTP-over-SSL (SMTPS) when the tls-level option is set to smtps.

See the README page for complete details.

I want to whitelist a large number of IP addresses; can I use wildcards?

Yes, as of spamdyke version 2.2.0. The whitelist and blacklist IP files will work with partial IP addresses to represent ranges.

As of spamdyke version 2.4.0, whitelist and blacklist IP files can also contain a dotted quad IP address followed by a netmask given as a number of bits. A netmask given as a dotted quad is also supported.
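For anyone unsure which addresses a given netmask covers, the arithmetic can be checked with a short sketch. The function names are hypothetical, the code is not taken from spamdyke and it assumes a shell with 64-bit arithmetic (true of most current systems). It treats a bit count and a dotted quad netmask the same way: both become a 32-bit mask applied to each address.

```shell
# Convert a dotted quad address to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if an address falls inside a range.
# Arguments: address to test, base address, netmask (bit count or dotted quad).
ip_in_range() {
    ip=$(ip_to_int "$1")
    base=$(ip_to_int "$2")
    case $3 in
        *.*) mask=$(ip_to_int "$3") ;;                           # dotted quad
        *)   mask=$(( (4294967295 << (32 - $3)) & 4294967295 )) ;;  # bit count
    esac
    [ $(( ip & mask )) -eq $(( base & mask )) ]
}

# Examples:
#   ip_in_range 11.22.33.200 11.22.33.44 24             succeeds
#   ip_in_range 11.22.33.200 11.22.33.44 255.255.255.0  succeeds (same range)
#   ip_in_range 11.22.34.1   11.22.33.44 24             fails
#   ip_in_range 8.8.8.8      0.0.0.0     0.0.0.0        succeeds (matches all)
```

The last example shows why a netmask of 0.0.0.0 matches every address: with no mask bits set, nothing about the address is compared.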

As of spamdyke version 3.0.0, whitelisted IP addresses can also be found by using a DNS realtime whitelist. This is like a DNS RBL that lists whitelisted IP addresses instead of blacklisted ones.

For complete details, see the README page.

I want to disable some filters for a few domains (or users) and enable them for everyone else. Is that possible?

Yes, as of spamdyke version 4.0.0. Configuration directories allow per-domain configuration of most of spamdyke's features.

For example, imagine the following scenario: The system administrator wants to check all connections for rDNS names (reject-missing-rdns), for rDNS name resolution (reject-unresolvable-rdns) and for their IP address with a keyword in their rDNS name (ip-in-rdns-keyword-blacklist-file). However, the recipients in one domain, example.com, don't like the unresolvable rDNS name filter or the keyword filter; they want those two disabled.

To accomplish this, spamdyke's main configuration file might contain the following options:

reject-missing-rdns=1

reject-unresolvable-rdns=1

ip-in-rdns-keyword-blacklist-file=/etc/spamdyke/rdns_keywords.txt

config-dir=/etc/spamdyke/config.d

To disable the two filters for all recipients at example.com, the following options would appear in a file named /etc/spamdyke/config.d/_recipient_/com/example:

reject-unresolvable-rdns=0

ip-in-rdns-keyword-blacklist-file=!!!

To accomplish this for a specific user instead of an entire domain, a longer filename is used. For example, to do the above scenario for joe@example.com instead of all users in example.com, the file would be named /etc/spamdyke/config.d/_recipient_/com/example/_at_/joe.

Configuration directories can be used to accomplish many complex tasks. For example, if the file for a specific domain contained options to read blacklist and whitelist files from the domain owner's home directory, the domain owner could update their own blacklists and whitelists without help from the system administrator!
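The file naming used in the two examples above can be reproduced with a small helper, which may be handy when scripting the creation of per-domain files. recipient_config_path is a hypothetical name, and extending the pattern to domains with more than two labels (reversing every label) is an assumption based on those two examples; the Configuration Directory documentation is the authority.

```shell
# Print the configuration directory file for a recipient domain or address.
# Arguments: configuration directory, domain name or full email address.
recipient_config_path() {
    dir=$1
    case $2 in
        *@*) user=${2%@*}; domain=${2#*@} ;;   # full address
        *)   user=;        domain=$2 ;;        # bare domain
    esac

    old_ifs=$IFS
    IFS=.
    set -- $domain        # split the domain into its labels
    IFS=$old_ifs

    path=
    for label in "$@"; do
        path="$label${path:+/$path}"   # prepend, so labels end up reversed
    done

    echo "$dir/_recipient_/$path${user:+/_at_/$user}"
}

recipient_config_path /etc/spamdyke/config.d example.com
# prints "/etc/spamdyke/config.d/_recipient_/com/example"
recipient_config_path /etc/spamdyke/config.d joe@example.com
# prints "/etc/spamdyke/config.d/_recipient_/com/example/_at_/joe"
```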

For complete details, see the Configuration Directory documentation.

My users' PCs are infected with spambots and are sending spam through my server. Can I force them to authenticate to block the spam?

Yes. See the above question (I want to disable some filters for a few domains (or users) and enable them for everyone else. Is that possible?). In the configuration directory for each of your local domains, add the option:

filter-level=require-auth

I want to stop backscatter spam by blocking all messages to invalid or nonexistent recipient addresses. Can spamdyke do that?

As of version 5.0.0, it can. Using the reject-recipient and recipient-validation-command options, spamdyke can use the same control files as qmail to determine where an incoming message will be delivered, then reject the recipient address if it is going to bounce. It does this by using qmail's actual configuration files and parsing them the same way qmail does. In other words, it's not necessary to maintain a separate list of valid addresses on your server. As soon as an address is created or removed, spamdyke will accept or reject it.

NOTE: This feature is not needed on Plesk servers -- Plesk already includes a filter to block invalid recipients by default.

qmail's recipient validation procedure is incredibly complicated -- there are over 167 thousand different ways an address might be valid or invalid. Before you ask, yes, spamdyke has been tested against every single one of them. It's worth looking at the flowchart in the documentation folder just to get an idea of how big the task really is.

Why does spamdyke-qrv have to be marked setuid root? Setuid binaries are evil incarnate!

Unfortunately, running spamdyke-qrv as root is the only way to access all of the files needed to validate a recipient on a qmail server. Recipient validation requires reading most of the files in /var/qmail/control and /var/qmail/users, reading files from within users' (or domains') mail directories and sometimes reading files from home directories of system accounts. On most qmail servers, there is no single non-root user who has access to all of those files.

spamdyke-qrv was created as a separate program with as little code as necessary to validate a single address and exit. This way, the entire spamdyke binary will not need to run as root and it should be easier to verify its security (please do).

If you don't want to allow spamdyke-qrv to run as root, that is entirely your choice. You just won't be able to use the recipient validation feature without it.

On the mailing list, you often promise changes in an upcoming version. How can I find out what you're working on?

The easiest way is to simply ask. Send an email.

Other than that, check out the TODO.txt file. The file contains notes to indicate the intended major changes in the next few versions. Those notes are not a completely accurate predictor of what will happen but they'll give you an idea of what's happening.

Why doesn't spamdyke use CDB files or a database? A database would be faster and better than text files and strange directory structures!

First, you should know CDB files are not the super-efficient silver bullet you may have read about. Processing a CDB file is computationally expensive, which doesn't pay off unless there are thousands of entries. With fewer entries, a flat text file is faster. As the number of entries grows beyond a few thousand, CDB files become less and less efficient. Also, CDB files cannot be modified once they are written; the only way to add/edit/delete records is to recreate the entire file from its original source(s).

Reasons why files and directory structures are preferable to databases:

Speed. Because spamdyke doesn't run as a daemon, a database engine must be loaded and initialized for every incoming connection. That takes time, so the database engine must be very fast to keep the overall time lower than using plain text files.
Memory. Because one copy of spamdyke is started for every incoming connection, memory usage is a big concern on busy servers. Most qmail installations use DJB's softlimit program to limit memory usage for exactly this reason. Database libraries use memory, often quite a bit, to load/cache/parse data. spamdyke must be able to fit within a reasonable limit.
Concurrency. On a busy mail server, hundreds (possibly thousands) of spamdyke processes could be running at the same time. A database engine would have to handle that kind of simultaneous access without failing.
Portability. At the moment, spamdyke runs on every Unix-like platform I've tested. The only external library it uses is OpenSSL (for TLS support) and even that is optional. I believe spamdyke's simplicity is largely responsible for its popularity -- nothing extra must be installed, no existing programs must be recompiled. Requiring a mail server administrator to install a database engine could scare away potential users.
Accessibility. By using only plain text files and directories, spamdyke is easy for anyone to administer and reconfigure with standard command line tools. Any administrator can understand a file of whitelisted IP addresses without help. Very few administrators know SQL. This is very important to me -- I hate proprietary file formats that can only be accessed with special tools. I can't remember how many times I've stared at a malfunctioning Sendmail server and been unable to determine if a given option was even available, much less enabled. This point and the next point are very closely related.
Safety. Plain text files and directories are easy to understand, easy to back up and easy to restore. They can be printed out, emailed, imported into other programs, etc. Most importantly, it's very difficult to corrupt a plain text file by "improperly" stopping spamdyke or losing power. In an emergency, when an administrator is trying to restore a mail server while users are screaming at him, I don't want him wondering if spamdyke's files are intact. If he has any doubts, he should be able to visually verify them with any available text editor. (Microsoft Exchange Server, I'm looking at you here.)
Availability. Mail servers should depend on as few external systems as possible. spamdyke already depends heavily on DNS but that's unfortunately unavoidable. I've done everything I can to make sure spamdyke fails gracefully if DNS is down. Fortunately, DNS servers are (usually) very stable and very reliable. Database engines are not in the same league -- while many databases are fairly reliable they still have much higher downtime than DNS servers. As stated above, I don't want to force a stressed administrator to restore his database server just to get mail flowing again. If I were forced to do that, I would choose instead to uninstall spamdyke (and I would never use it again).

Database servers like MySQL conflict with all 7 points. Embedded database engines like SQLite conflict with points 1, 2, 5 and 6. CDB files conflict with points 5 and 6.

This doesn't mean spamdyke will never use a database. It just means there's a lot to think about. A database would have to solve a really tough problem before it would be considered. Whether it saves some coding time is really not a factor -- the time required to install and administer spamdyke is more important.

Why doesn't spamdyke filter using Sender Policy Framework (SPF), Sender ID, Certified Server Validation (CSV), DomainKeys or DomainKeys Identified Mail (DKIM)?

Each of those systems is very complex, so adding support to spamdyke will not be a small task. They're also all different, so supporting all of them would be a major undertaking. At this point, there are other ways to improve spamdyke that will take less effort and provide much more benefit.

Additionally, I'm not convinced any of those systems will make any difference. They were each designed to prevent spam being sent from forged addresses so, for example, you won't get spam from president@whitehouse.gov. However, most spammers own their domains (often many thousands of them) and control their own DNS, so they can (and do) add SPF/CSV/etc records so their spam follows the SPF/CSV/etc rules. (Hint: Spammers understand and administer their DNS records better than most ISPs do.) The major ISPs (AOL, Yahoo!, Gmail) use these systems and still have spam problems; this is pretty good evidence they aren't living up to the hype.

There is a (small) chance these systems could stop spam coming from botnets. If that happens however, the spammers will just start relaying their spam through the compromised machine's ISP's mail servers. (Hint: That's why having every ISP block all outbound port 25 traffic won't stop botnet spam either.)

If you really, really want to filter your email using one or more of these frameworks, try using SpamAssassin. It already supports most of them and you'll be able to judge their effectiveness for yourself.

Why can't spamdyke filter based on message headers like "From:" or "Received:"?

As of version 4.3.0, it can. See the README page for complete details.

Why can't spamdyke block large messages or strip attachments?

Since header blacklisting was added in version 4.3.0, this may now be possible. However, it will require buffering the incoming message into a file on disk; with large attachments being so common, there is no way spamdyke could possibly (or responsibly) buffer entire messages in memory.

Look for this feature in a future version.

My graylist directories are getting huge -- many, many entries. Is this a problem? What can I do?

For most servers, it isn't a problem, since most of the files contain no data. However, it might be a problem if the server's filesystem runs out of inodes. (An inode is basically a file index within the filesystem. The number of available inodes is large but not infinite; when a file system runs out of inodes, no new files can be created even though there is available disk space.)

On most Unix-like systems, the df command will show the amount of free space on all of the mounted filesystems. To show the number of available inodes, use the command df -i.

To remove old graylist files on systems with GNU tools installed, the following command should work:

find GRAYLIST_FOLDER -type f -mmin +$((SPAMDYKE_MAX_GRAYLIST_SECS/60)) -print0 | xargs -0 rm -f

Be sure to replace GRAYLIST_FOLDER with the path to your graylist directory, as given in the spamdyke command line or configuration file. Also replace SPAMDYKE_MAX_GRAYLIST_SECS with the maximum graylist age from the spamdyke command line (-M) or configuration file.

Be careful about deleting graylist entries -- it is possible to be too aggressive and delete them before the remote server(s) have finished delivering their message(s). If the entries disappear while a remote server is still trying to pass the filter, it will be graylisted again (and again and again). Eventually, the remote server will stop retrying and the message will bounce back to the sender. To avoid this, you shouldn't delete any graylist entries that are less than two weeks old.
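The two-week warning can be built into a small wrapper around the find command. This is only a minimal sketch, assuming find supports -mmin (as GNU find does); the function name prune_graylist and its arguments are made up for illustration and are not part of spamdyke.

```shell
#!/bin/sh
# Hypothetical helper: remove graylist files older than a given age.
# Usage: prune_graylist GRAYLIST_FOLDER MAX_AGE_SECS [dry-run]
prune_graylist() {
    dir=$1
    max_secs=$2
    mode=${3:-delete}
    min_secs=$((14 * 24 * 60 * 60))   # two weeks, per the advice above

    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    if [ "$max_secs" -lt "$min_secs" ]; then
        echo "refusing: $max_secs seconds is less than two weeks" >&2
        return 1
    fi

    minutes=$((max_secs / 60))
    if [ "$mode" = "dry-run" ]; then
        # Only list the files that would be removed.
        find "$dir" -type f -mmin +"$minutes" -print
    else
        find "$dir" -type f -mmin +"$minutes" -exec rm -f {} +
    fi
}
```

Running it with dry-run as the third argument first is a cheap way to see what would be deleted before anything is actually removed.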

Why can't spamdyke automatically delete old graylist entries?

spamdyke doesn't automatically delete old graylist entries because it doesn't run as a daemon. In other words, spamdyke doesn't run constantly on the server. It only runs while a message is being received. If spamdyke suddenly decided to clean up the graylist directory, it would have to do that work while receiving an incoming email message.

If your graylist directory is large, the cleanup could cause a large enough delay to bounce the message. Also, how is spamdyke to decide when to clean up? Because a new spamdyke process is started for each incoming connection and those processes don't communicate with each other, there's a good chance multiple spamdyke instances would attempt to clean up the directory at the same time. That would delay multiple incoming messages and place an unnecessary load on the server.

Instead of trying to prevent relaying itself, why doesn't spamdyke just set the `RELAYCLIENT` environment variable based on authentication/sender address/recipient address?

Unfortunately, that isn't possible because of the way environment variables work.

Some background: environment variables are set when a program is started. Most of us think of them in terms of the login shell (e.g. bash), which keeps a list of environment variables to set for every process it starts (i.e. when commands are typed at the prompt). However, each process that runs on a system has an environment, set by the process that started it (the parent process). Once the child process has been started, the parent process cannot alter its environment any more.
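This behavior is easy to see from a shell. The sketch below is only a demonstration, using a made-up variable named DEMO_VAR: the parent starts a child, then changes the variable, and the child still sees the old value.

```shell
#!/bin/sh
# The child receives a copy of the environment at the moment it starts.
env_demo() {
    DEMO_VAR=before
    export DEMO_VAR

    # Start a child that waits a moment before reading its environment.
    sh -c 'sleep 1; echo "child sees: $DEMO_VAR"' &

    # Changing the variable in the parent has no effect on the running child.
    DEMO_VAR=after
    wait
    echo "parent sees: $DEMO_VAR"
}

env_demo
# prints:
#   child sees: before
#   parent sees: after
```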

More background: when the RELAYCLIENT environment variable is set before qmail-smtpd is started, qmail will allow the connection to relay. Typically, RELAYCLIENT is set in the access file (e.g. /etc/tcp.smtp) based on the IP address of the remote server.

spamdyke must determine what environment variables to set before it starts the qmail-smtpd process (because after qmail-smtpd has been started, spamdyke can't change its environment). For that reason, spamdyke always sets the RELAYCLIENT environment variable if it has enough information to run its relaying filter. That way, qmail-smtpd will not prevent relaying if spamdyke determines it is allowed (e.g. because the connection is authenticated).

spamdyke sets RELAYCLIENT and controls relaying if any of these conditions are true:

The relay-level option is allow-all.
The remote IP address is given with the ip-relay-entry option.
The remote IP address is listed in a file given with the ip-relay-file option.
The rDNS name is given with the rdns-relay-entry option.
The rDNS name is listed in a file given with the rdns-relay-file option.
The smtp-auth-level option is ondemand, ondemand-encrypted, always, or always-encrypted.

Why can't spamdyke automatically blacklist (or delay) servers that are rejected too many times?

Technically, this is not possible because of the way spamdyke works. Without a long-running daemon to keep track of the statistics and "penalties", each instance of spamdyke would have to perform the accounting work on its own. This would take too long and is very likely to be error-prone.

However, even if it were possible, this kind of feature is very problematic, for several reasons.

The only way spamdyke could possibly decide to block a server would be to examine its rejection statistics (i.e. number of rejections per time period exceeds a threshold). However, if spamdyke is already rejecting messages, what is the purpose of blacklisting the server? In effect, that would enable two filters for blocking a server when one is already enough.
Innocent mistakes by users of large email systems could unfairly penalize every user of that system. For example, if an AOL user sends several messages that are rejected, an automatic blacklist might block all messages from all AOL users. This is not acceptable.
Malicious users could use an automatic blacklist to conduct a denial-of-service attack. By sending multiple invalid messages from a victim's server, an attacker could cause the automatic blacklist to block all messages from the victim server.

In summary, an automatic blacklist is likely to cause more problems than it solves. For that reason, spamdyke is unlikely to ever contain such a feature.

However, just because spamdyke doesn't offer an automatic blacklist doesn't mean one can't be created. A script that analyzes spamdyke's logs and modifies a blacklist file would be just as effective.
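As a rough illustration of such a script, here is a minimal sketch. It assumes each rejection appears in the log as a line containing the text DENIED_ along with a field of the form "origin_ip: ADDRESS"; check your own logs before relying on that. The function names and the threshold are invented for the example, and the objections listed above still apply to anything built this way.

```shell
#!/bin/sh
# Hypothetical helper: read log lines on stdin and print every IP address
# that was rejected at least THRESHOLD times.
# Assumes rejection lines contain "DENIED_" and "origin_ip: ADDRESS".
frequent_offenders() {
    threshold=$1
    awk -v min="$threshold" '
        /DENIED_/ {
            for (i = 1; i < NF; i++)
                if ($i == "origin_ip:") count[$(i + 1)]++
        }
        END { for (ip in count) if (count[ip] >= min) print ip }
    ' | sort
}

# Append new offenders to a blacklist file, skipping ones already listed.
# Usage: update_blacklist BLACKLIST_FILE THRESHOLD < LOGFILE
update_blacklist() {
    blacklist=$1
    threshold=$2
    touch "$blacklist"
    frequent_offenders "$threshold" | while read -r ip; do
        grep -qxF "$ip" "$blacklist" || echo "$ip" >> "$blacklist"
    done
}
```

The blacklist file would be whichever file your spamdyke configuration already reads as its IP blacklist. A high threshold and a regular manual review of the file are probably wise.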

Why doesn't spamdyke's graylist filter use the IP address of the remote server?

At first glance, it seems more effective to use three criteria for graylisting: the sender's email address, the recipient's email address and the remote server's IP address. Many graylist filters do this. Whether it actually stops more spam is questionable.

However, using the remote server's IP address often causes more problems than it solves. To understand why, consider a large mail service like GMail, Yahoo! or AOL. Such providers handle so many messages that they must use dozens of outbound servers to keep up with the load. Imagine this scenario:

A user on a large provider sends a message.
Server 1 attempts to deliver it. The graylist filter creates a graylist entry and rejects the connection.
Server 1 puts the message back in the queue to retry later.
Some time later, server 2 grabs the message and attempts to deliver it. Server 2's IP address is different from server 1's, so the graylist filter creates a new entry and rejects the connection.
Server 2 puts the message back in the queue to retry later.
Some time later, server 3 grabs the message and attempts to deliver it. Server 3's IP address is different from server 1's and server 2's, so the graylist filter creates a new entry and rejects the connection.
Server 3 puts the message back in the queue to retry later.
...repeat...
The message is never delivered and bounces back to the sender.

By disregarding the remote server's IP address, spamdyke avoids this problem.
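The scenario above can be simulated in a few lines. This is a toy model, not spamdyke's actual storage format: the key layout, function names and addresses are invented, and the minimum/maximum graylist ages are ignored.

```shell
#!/bin/sh
# Toy graylist: the first attempt for a key is recorded and rejected;
# any later attempt with the same key is accepted.
# Usage: graylist_attempt STATE_DIR KEY   (prints "reject" or "accept")
graylist_attempt() {
    entry="$1/$(printf '%s' "$2" | tr '/' '_')"
    if [ -e "$entry" ]; then
        echo accept
    else
        : > "$entry"
        echo reject
    fi
}

# One message, retried by three different outbound servers.
# Usage: simulate yes|no   (whether the key includes the server's IP address)
simulate() {
    use_ip=$1
    state=$(mktemp -d)
    for ip in 192.0.2.1 192.0.2.2 192.0.2.3; do
        key="sender@example.net-recipient@example.com"
        [ "$use_ip" = yes ] && key="$key-$ip"
        graylist_attempt "$state" "$key"
    done
    rm -rf "$state"
}

echo "keyed with IP:    $(simulate yes | tr '\n' ' ')"
echo "keyed without IP: $(simulate no | tr '\n' ' ')"
# prints:
#   keyed with IP:    reject reject reject
#   keyed without IP: reject accept accept
```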

Graylisting isn't working! What am I doing wrong?

Most often, graylisting doesn't work because there are no domain directories. spamdyke is designed to allow some flexibility when configuring graylisting, so you can enable it for some domains and disable it for others.

As of version 4.0.0, spamdyke can automatically create domain directories as they are needed. To enable this, look at spamdyke's graylist-level option. If it is always, change it to always-create-dir. If it is only, change it to only-create-dir.

To enable graylisting for a domain without enabling automatic directory creation, you must create a directory within your graylist directory named for the domain you want to graylist. For example, suppose you created a graylist directory:

/home/vpopmail/graylist.d

Then you ran spamdyke with a command line like this:

/usr/local/bin/spamdyke -g /home/vpopmail/graylist.d ...

No graylisting will take place. To enable graylisting for the domain example.com, you must also create another directory:

/home/vpopmail/graylist.d/example.com

Once you've done that, graylisting will begin for all recipients within the example.com domain. Create a directory for each domain you want to graylist.
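If there are many domains, a small loop saves some typing. A minimal sketch, with a made-up function name; it only creates the directories and does nothing about ownership or permissions, which you will need to check yourself.

```shell
#!/bin/sh
# Hypothetical helper: create one graylist subdirectory per domain.
# Usage: enable_graylist_domains GRAYLIST_DIR DOMAIN...
enable_graylist_domains() {
    graylist_dir=$1
    shift
    for domain in "$@"; do
        mkdir -p "$graylist_dir/$domain" || return 1
    done
}

# Example, using the paths from the text above:
# enable_graylist_domains /home/vpopmail/graylist.d example.com example.org
```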

spamdyke can automatically catch configuration problems like this. See the README page for details.

I want to use spamdyke but some of my users roam and connect from strange places. How can I allow them to send email but still filter spam?

As of version 2.5.0, this is no problem. spamdyke understands SMTP AUTH, so it can authenticate your users and bypass all of its filtering just for them. Bonus: spamdyke will provide SMTP AUTH even if your qmail has not been patched to provide SMTP AUTH!

See the README page for complete details.

I installed spamdyke and now I'm seeing a lot of timeouts in my logs. Why?

Badly written software on the remote hosts. It seems a lot of spam software doesn't handle error codes at all. It just attempts delivery and expects success. When an error code is sent, the software just sits and waits for the success code it wants. Eventually, the connection times out. Sometimes, a remote server will take a long time to begin delivering a large (legitimate) message, which can cause timeouts.

qmail enforces a 20 minute idle timeout but it does so silently (no logging). It's possible you were already getting timeouts and just didn't know it until spamdyke began logging them.

If you suspect legitimate connections are timing out, there are two things you can do. First, you can increase or disable spamdyke's timeouts. Of course, qmail's 20 minute idle timeout will still apply but at least you'll be back where you were before.

Second, you can use spamdyke's full logging feature to log all incoming connections to files. The log files contain timestamps so you can see how quickly the remote server is sending data and where it's stopping. Hopefully that will yield some clues you can use to fix the problem.

I installed spamdyke and now my server is very slow! Incoming connections have to wait 20 seconds or longer before they see the greeting banner. What can I do to speed it up?

Most often, delays like these are due to DNS traffic. Depending on the enabled filters, spamdyke can perform the following DNS queries for each incoming connection:

Find the reverse DNS name for the remote server
Find the IP address for the remote server's reverse DNS name
Check the realtime whitelists for TXT records matching the remote server's IP address
Check the realtime whitelists for A records matching the remote server's IP address
Check the realtime blacklists for TXT records matching the remote server's IP address
Check the realtime blacklists for A records matching the remote server's IP address
Check the righthand-side whitelists for TXT records matching the remote server's rDNS name
Check the righthand-side whitelists for A records matching the remote server's rDNS name
Check the righthand-side blacklists for TXT records matching the remote server's rDNS name
Check the righthand-side blacklists for A records matching the remote server's rDNS name
Check the righthand-side whitelists for TXT records matching the sender's domain name
Check the righthand-side whitelists for A records matching the sender's domain name
Check the righthand-side blacklists for TXT records matching the sender's domain name
Check the righthand-side blacklists for A records matching the sender's domain name

That's a lot of queries. If your DNS servers aren't fast enough to keep up, you're going to see delays in spamdyke. To alleviate this, try installing a caching nameserver on your mail server (most operating systems have a pre-built package for this). If you still see delays, begin disabling DNS-based filters until spamdyke begins running faster.
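To find out which list is slow, it helps to run the same kind of query by hand. DNS blacklists keyed by IP address are conventionally queried by reversing the address's octets and appending the list's zone name. The sketch below builds that name; the function name is invented and rbl.example.org is a placeholder, not a real list.

```shell
#!/bin/sh
# Build the DNS name queried for an IPv4 address on a DNS blacklist:
# the octets are reversed and the list's zone is appended.
# Usage: rbl_query_name IP ZONE
rbl_query_name() {
    echo "$1" | awk -F. -v zone="$2" \
        'NF == 4 { print $4 "." $3 "." $2 "." $1 "." zone }'
}

rbl_query_name 11.22.33.44 rbl.example.org
# prints 44.33.22.11.rbl.example.org
```

With a real list's zone substituted, something like `time dig +short A "$(rbl_query_name 11.22.33.44 rbl.example.org)"` shows how long a single lookup takes from your mail server.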

If you have disabled all of the filters and spamdyke is still running slowly, you may have found a bug. Please report it!

I use spamdyke to prevent relaying (because my qmail isn't patched to provide SMTP AUTH) but SpamAssassin has stopped scanning incoming messages. What gives?

You must be using qmail-scanner from qmail-scanner.sourceforge.net. That package has an interesting flaw: it assumes any time the environment variable RELAYCLIENT is set, no scanning should be performed. When spamdyke prevents relaying, it always sets that variable to keep qmail from interfering with its relaying decision.

To reenable scanning, modify your /etc/tcp.smtp file and add QS_SPAMASSASSIN to all of the connections you want scanned. For example:

127.:allow,RELAYCLIENT="",QMAILQUEUE="/var/qmail/bin/qmail-scanner-queue"

:allow,QMAILQUEUE="/var/qmail/bin/qmail-scanner-queue",QS_SPAMASSASSIN=""

Rebuild the CDB file with qmailctl cdb and you should be fine.

My logs contain the message `ERROR: unable to write X bytes to file descriptor 1: Broken pipe`. What does it mean?

The message means spamdyke was trying to send some data to the remote server, but the connection was closed before all of the data was sent. This can indicate a problem (which is why spamdyke logs it) but most often it indicates a spammer's server disconnected as soon as it received an error message.

This error is usually safe to ignore. You can prevent it from being logged by lowering spamdyke's log-level setting.

When I use the `config-test` feature on my Plesk server, the SMTP AUTH tests always fail. But I know authentication works, so what's wrong?

Plesk uses a program called relaylock to provide some of the features spamdyke includes. One of relaylock's jobs is to offer (and probably process) SMTP AUTH.

However, relaylock won't offer SMTP AUTH to connections that come from the local host. It uses the TCPREMOTEIP environment variable to determine the IP address of the connecting server. Try setting it to something other than the localhost IP. Within the bash shell, you can use this command:

export TCPREMOTEIP=11.22.33.44

Within the csh or tcsh shell, this command will work:

setenv TCPREMOTEIP 11.22.33.44

If the TCPREMOTEIP environment variable isn't set at all, spamdyke should set it to 0.0.0.0 before it starts its tests.

Why isn't spamdyke blocking messages from some blacklisted servers/senders/recipients?

Most likely, the problem is TLS. If the remote server is using TLS to encrypt its transmission and spamdyke doesn't have access to the certificate (using the tls-certificate-file option), spamdyke can't decrypt the traffic to monitor the entire connection. In that situation, spamdyke will not block the connection if there is any chance it could be allowed.

For example, if the remote server should be blocked because its IP address is blacklisted, spamdyke normally won't reject the connection until it has exhausted every opportunity to be allowed. If SMTP AUTH is possible, spamdyke must wait to see if the server authenticates. If sender or recipient whitelists are in use, spamdyke must wait to see if any of those whitelists are matched. However, when TLS is in use and spamdyke can't decrypt the traffic, the SMTP AUTH, the sender and the recipient will all be hidden. spamdyke must allow the connection to continue just in case.

Most of the time, when qmail provides TLS for a connection, it will include the text (DHE-RSA-AES256-SHA encrypted) in the Received header entry it adds to the top of the message. If you see that text in the headers of messages that spamdyke should be blocking, TLS is the issue. Change spamdyke's configuration so it can decrypt the TLS traffic and it will block the connections.
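Checking a delivered message for that text can be scripted. This is a minimal sketch with an invented function name. It assumes the cipher name may differ from server to server, so it searches only for the trailing "encrypted)" and only within the header block.

```shell
#!/bin/sh
# Report whether a message's headers contain the TLS marker described above.
# Only the header block (everything up to the first blank line) is searched.
# Usage: message_used_tls MESSAGE_FILE   (exit status 0 = marker found)
message_used_tls() {
    sed '/^$/q' "$1" | grep -q 'encrypted)'
}

# Example:
# message_used_tls /path/to/message && echo "delivered over TLS"
```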

I can't figure out why spamdyke isn't working correctly -- some features are malfunctioning in strange ways OR I'm seeing strange/impossible error messages in my logs. What's wrong?

Of course, the most likely explanation is a configuration problem. The config-test feature can help find common mistakes. You may also have found a bug in spamdyke.

However, occasionally very strange malfunctions/errors are reported that no one is able to explain. In those cases, the culprit often turns out to be low memory. The server may have plenty of RAM but many qmail installations use DJB's softlimit program to limit the amount of memory that can be used by a single connection. The limit must allow enough room for spamdyke, qmail and any other filters that have been added. Determining the correct limit is simply impossible -- spamdyke uses too many library functions to be able to predict how much memory or stack space is appropriate.

If you are using the "softlimit" program, stop. Remove it, now. It causes far, far more problems than it solves (assuming it solves any at all). It is "a solution looking for a problem" and a bad tool for modern installations.

OK, if you don't believe the above advice, here's a little technical background. Unix processes can be limited in a number of ways, including limiting the maximum number of open files, maximum open network connections, maximum size of files they can create, etc. Many of these limits make sense -- limiting the number of child processes is easy to understand.

However, limiting the amount of memory or stack space is completely different. Unix allows processes to allocate much more memory than they actually use -- this is common and very efficient. But when a limit is imposed, the most common place to hit the limit is within a system library like glibc or OpenSSL. If spamdyke crashes with "segmentation fault" inside a glibc function, what does that mean? Is there really a bug in glibc? Unlikely. It's far more likely spamdyke allocated or used almost too much memory, then a function within glibc crossed the line and the entire process crashed.

This isn't necessarily a bug anywhere -- it's just the way programs work in a Unix environment. Neither spamdyke nor glibc was actually going to use all of the memory it allocated, so the server would never have run out of RAM. How does anyone debug this problem? It's impossible! The crash information is useless and the runtime environment can't be reproduced. Instead, everyone spends days/weeks trying to track down a problem that doesn't actually exist.

Overall, the correct solution is: do not use "softlimit".

Why are messages still being rejected even after I've added the sender's domain name to my rDNS whitelist? OR Why aren't messages being rejected after I've added the sender's domain name to my rDNS blacklist?

The rDNS white/blacklist works by matching the remote server's reverse DNS name (the name that is found by querying the remote server's IP address through the DNS system). The sender white/blacklist works by matching the sender's email address.

In other words, they're not the same thing at all. It's very common for reverse DNS names to be different from the email domain name, even in the case of large companies. To allow or block connections based on reverse DNS names (i.e. the "origin_rdns" value in spamdyke's log messages), add the name to the rDNS whitelist or blacklist. To allow or block connections based on email domain names (i.e. the "from" value in spamdyke's log messages), add the name to the sender whitelist or blacklist.

I enabled the IP-in-rDNS filter, so why isn't spamdyke blocking connections from servers with rDNS names that contain IP addresses?

The IP-in-rDNS filter requires two things to block a connection: the rDNS name must contain the IP address AND the rDNS name must contain a keyword. For example, consider the following rDNS name:

11.22.33.44.dynamic.example.com

Obviously it contains an IP address (11.22.33.44), but that's not enough to trigger the filter. If the keyword dynamic were supplied, spamdyke would block the connection.

However, note that the keyword example would not be matched. This is because example is part of the last two segments of the domain name. spamdyke will not search for keywords there because that would lead to lots of false positives. It is possible to use a domain name as a keyword, if it is prefixed with a dot. For example:

.example.com

All connections would be blocked where the rDNS name contained the IP address and the name ended in .example.com.
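The rules above can be written out as a small test function. This is a simplified sketch of the logic as described here, not spamdyke's own code: the function name is invented, and it recognizes the IP address only in plain dotted form, which is an assumption (the real filter may recognize other encodings).

```shell
#!/bin/sh
# Simplified sketch of the two-part test described above.
# Usage: ip_in_rdns_match IP RDNS_NAME KEYWORD   (exit status 0 = would block)
ip_in_rdns_match() {
    ip=$1
    name=$2
    keyword=$3

    # Part 1: the name must contain the IP address (plain dotted form only).
    case "$name" in
        *"$ip"*) ;;
        *) return 1 ;;
    esac

    # Part 2: the keyword test.
    case "$keyword" in
        .*)
            # Dot-prefixed keyword: the name must end with it.
            case "$name" in
                *"$keyword") return 0 ;;
                *) return 1 ;;
            esac
            ;;
        *)
            # Plain keyword: ignore the last two segments of the name.
            prefix=$(echo "$name" | awk -F. '{
                for (i = 1; i <= NF - 2; i++)
                    printf "%s%s", $i, (i < NF - 2 ? "." : "")
            }')
            case "$prefix" in
                *"$keyword"*) return 0 ;;
                *) return 1 ;;
            esac
            ;;
    esac
}

# Examples, using the name from the text above:
# ip_in_rdns_match 11.22.33.44 11.22.33.44.dynamic.example.com dynamic       -> 0 (block)
# ip_in_rdns_match 11.22.33.44 11.22.33.44.dynamic.example.com example       -> 1 (no match)
# ip_in_rdns_match 11.22.33.44 11.22.33.44.dynamic.example.com .example.com  -> 0 (block)
```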

I'm trying to run spamdyke's `config-test` feature but it only says "Missing qmail-smtpd command". What's wrong?

qmail is a strange beast. Since it's essentially unmaintained, lots of different distributions and patch sets have been produced to add various capabilities. In fact, spamdyke's primary purpose is to provide missing capabilities without requiring qmail to be patched or recompiled.

However, not all of spamdyke's filters are needed (or even appropriate) for everyone, depending on exactly what each particular qmail installation can do. For that reason, spamdyke's config-test feature interrogates qmail to see what it can do, then makes recommendations based on what it sees.

In order to check qmail, spamdyke needs to know the command that is used to start qmail (with all arguments). The error "Missing qmail-smtpd command" means the qmail command isn't being supplied on the command line, so spamdyke can't start qmail and check it.

The correct way to use config-test is to find the full spamdyke/qmail command line in your "run" file (or xinetd config file) and run it after adding --config-test near the beginning.
