An incoming message
Tracing the path of an incoming mail message is one of the fundamental ways to diagnose a qmail setup. This requires understanding exactly what qmail does with each message. This information is all contained within the documentation, but a lot of newbies seem to have trouble grasping it all.
If you're wondering how qmail works at a very high level, you should start by looking at "the big picture" at http://www.nrg4u.com/ -- this will help you get a bird's eye view of what's going on.
If you don't understand any of the concepts in this document (like envelopes), please look at http://wooledge.org/~greg/mail.html first.
Ultimately, qmail is a set of running processes and configuration files. The first process in which we're interested when tracking incoming mail is qmail-smtpd. (There are alternatives, but we'll assume a standard installation.)
qmail-smtpd is the qmail SMTP daemon, which handles incoming mail on port 25. Strictly speaking, it's tcpserver (or possibly inetd or xinetd) that listens on port 25 and spawns a new instance of qmail-smtpd every time a new connection is established. That's an important distinction, because qmail-smtpd reads its configuration files every time it's started. Since a new qmail-smtpd is started on every connection, you never need to restart anything just because you changed qmail-smtpd's configuration files.
If you haven't patched qmail, then there's really only one configuration file for qmail-smtpd that you need to worry about right now: rcpthosts. This file contains your local (and virtual) domain names, one per line. qmail-smtpd reads this and compares the incoming mail's envelope recipients (one by one) to the contents of rcpthosts. Any recipient whose domain is not listed in rcpthosts is rejected. Any recipient whose domain is listed is accepted.
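To make the rcpthosts check concrete, here is a minimal Python sketch of the decision described above. It is a model for reasoning, not qmail's actual code. The function name rcpthosts_accepts is made up for illustration. Two details go beyond what this page says and come from my reading of the qmail-smtpd man page, so treat them as assumptions: an entry with a leading dot covers subdomains, and a recipient with no @ sign is let through. The sketch ignores RELAYCLIENT and morercpthosts entirely.

```python
def rcpthosts_accepts(recipient, rcpthosts_lines):
    """Return True if this envelope recipient would pass the rcpthosts check."""
    local, at, domain = recipient.rpartition("@")
    if not at:
        # Assumption (from the man page): addresses without an @ sign
        # are always allowed through.
        return True
    domain = domain.lower()
    entries = {line.strip().lower() for line in rcpthosts_lines if line.strip()}
    if domain in entries:
        return True
    # Assumption (from the man page): an entry such as ".vdom.com"
    # matches any domain ending in ".vdom.com".
    for i, ch in enumerate(domain):
        if ch == "." and domain[i:] in entries:
            return True
    return False
```

You could feed it the lines of your real file, for example rcpthosts_accepts("bob@vdom.com", open("/var/qmail/control/rcpthosts")), to see whether a given recipient would be accepted or rejected at SMTP time.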
If you've applied patches to qmail-smtpd for things like SMTP AUTH, then they may be relevant here; but that's beyond the scope of this page.
When qmail-smtpd has accepted the message, it invokes qmail-queue to put the message into the queue. It's a new message, so it hasn't been sorted into its category yet. qmail-send will do that within a few seconds, hopefully.
qmail-send looks at the message recipients, and for each recipient, it categorizes whether this is a local destination or a remote destination. It uses two config files to decide this: locals and virtualdomains. locals is just a list of domain names, one per line. locals is consulted before virtualdomains, and it has priority! If the recipient's domain is listed in locals, then it's a local destination.
If the recipient's domain is not listed in locals, but is listed in virtualdomains (which has a slightly more complex syntax), then it's still a local delivery, but first, the address is rewritten. We'll get to that in a moment.
If the recipient's domain isn't in either of these files, it's a remote destination, and the message gets chucked into the outgoing bin. (qmail-remote will handle that one. We won't worry about outgoing mail for now.)
Once we know a message is local, it gets chucked into the local bin. When that bin is processed, the only thing we need to worry about is the username. Why is that? Because all of the virtual domain magic has already been taken care of during the rewriting process!
Example: let's suppose we have an incoming message for <bob@vdom.com>. tcpserver answers the phone, fires up qmail-smtpd, which reads rcpthosts. vdom.com is listed in rcpthosts, so qmail-smtpd accepts the message. It calls qmail-queue, which queues it up. qmail-send looks in the "new message" bin, and sees this one. It's for someone at vdom.com, which is not in locals. So it looks in virtualdomains, and sees this line:
vdom.com:tammy
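That line tells qmail-send to prepend tammy- to the local part of the address, so the recipient is rewritten as tammy-bob@vdom.com and the message goes into the local bin. When qmail-lspawn sees the message in the local delivery bin, it looks for the username, and sees "tammy-bob". Now, the hyphen ("-") is special here -- it separates the actual username (tammy) from the extension (bob). So we really look for a user named tammy.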
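The sorting and rewriting that qmail-send does in this example can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not qmail's implementation: the function name classify is hypothetical, only plain domain:prepend lines in virtualdomains are handled (no wildcard or user@domain entries), and the return value is just a category plus the name used for delivery.

```python
def classify(recipient, locals_lines, virtualdomains_lines):
    """Return ("local", username) or ("remote", recipient) for one recipient."""
    local_part, _, domain = recipient.rpartition("@")
    domain = domain.lower()

    # locals is consulted first and has priority.
    local_domains = {l.strip().lower() for l in locals_lines if l.strip()}
    if domain in local_domains:
        return ("local", local_part)

    # virtualdomains: only simple "domain:prepend" lines are modelled here.
    for line in virtualdomains_lines:
        entry, sep, prepend = line.strip().partition(":")
        if sep and prepend and entry.lower() == domain:
            # Rewrite: the prepend value and a hyphen go in front of the local part.
            return ("local", prepend + "-" + local_part)

    # In neither file: it goes to the outgoing bin.
    return ("remote", recipient)
```

With the example above, classify("bob@vdom.com", ["localhost"], ["vdom.com:tammy"]) gives ("local", "tammy-bob"). If vdom.com were also listed in locals, the result would be ("local", "bob") instead, which is exactly the "locals has priority" trap.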
qmail looks for users according to a very well-defined procedure. First, it looks in /var/qmail/users/cdb, which is generated from /var/qmail/users/assign (a human-readable text file). The details of the format of this file are a bit tricky -- see the man pages (man qmail-users and man qmail-getpw).
If we don't see tammy in users/cdb then we'll ask the operating system whether tammy is a user.
If tammy isn't a user in either of those places, the message is given to the user alias.
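The lookup order above (users/cdb, then the operating system, then alias) can be sketched like this. It is a deliberately simplified model: the two sets stand in for "names defined in users/assign" and "names the OS knows about", and the real users/assign format has wildcard entries that this sketch does not attempt to reproduce. The function names are hypothetical. The "longest name first" splitting at hyphens follows my understanding of qmail-getpw.

```python
def split_candidates(local):
    """Yield (user, extension) pairs, trying the longest user name first."""
    yield local, ""
    pos = local.rfind("-")
    while pos > 0:
        yield local[:pos], local[pos + 1:]
        pos = local.rfind("-", 0, pos)


def resolve_user(local, assigned_users, system_users):
    """Decide which user gets the delivery, and with what extension."""
    sources = (("users/cdb", assigned_users), ("system", system_users))
    for source, users in sources:
        for user, ext in split_candidates(local):
            if user in users:
                return {"user": user, "extension": ext, "source": source}
    # Nobody claimed it: alias gets it, with the whole name as the extension.
    return {"user": "alias", "extension": local, "source": "alias"}
```

For tammy-bob with tammy known to the system, this yields user tammy with extension bob. With no tammy anywhere, it yields user alias with extension tammy-bob, which is why the alias user needs a .qmail-tammy-bob (or similar) file to catch the message.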
Once it knows who the message is for, qmail-lspawn has to look up how to deliver it. It does this by looking for dot-qmail files (see man dot-qmail) in the user's home directory.
But wait (you say)... how can a "user" have a "home directory" if it's just a virtual user defined in users/cdb and not a real user at all? Look at the format of users/assign more closely -- there's a home directory in there. That's treated just like a normal Unix user's home directory, and it can contain .qmail files.
So qmail-local (which has been spawned by qmail-lspawn by now) changes directory to the user's home directory (even if it's not a real user), and looks for dot-qmail files. Specifically, it starts by looking at the extension (if there is one). Then it looks for a -default file. Then, as a last resort, it will use the default delivery instruction that it inherited from qmail-lspawn (which in turn inherited it from qmail-start, which in turn probably inherited it from your run script or your /var/qmail/rc script).
Example: qmail-local has a message for tammy-bob. It has been invoked as user tammy (either a real user or a virtual one; it doesn't matter here), and it has been given a default instruction for what to do if it can't find instructions in tammy's home directory.
Since the address has an extension on it, qmail-local first looks for a file named .qmail-bob. If it finds that file, great! That file will tell it how to deliver the message. If not, it will look for a file named .qmail-default. If it finds that file, great! That file will tell it how to deliver the message. If those files don't exist, then it will use the default delivery instruction.
If the username doesn't have an extension, then it's a tiny bit different. If the username for our local delivery were simply tammy and not tammy-bob, then we would only look for a .qmail file in tammy's home directory. If that doesn't exist, then qmail-local would use the default delivery instruction.
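The dot-qmail search order described above can be written down as a short Python sketch. The function names are made up for illustration. Two details are taken from the dot-qmail man page as I remember it rather than from this page, so treat them as assumptions: the extension is lowercased and its dots are replaced with colons before the lookup, and an extension with several hyphens gets intermediate -default files tried as well (foo-bar tries .qmail-foo-default before .qmail-default).

```python
import os


def dot_qmail_candidates(extension):
    """List the dot-qmail file names to try, in order, for an extension."""
    if not extension:
        # No extension: only the plain .qmail file is considered.
        return [".qmail"]
    # Assumption from the man page: lowercase, and dots become colons.
    ext = extension.lower().replace(".", ":")
    candidates = [".qmail-" + ext]
    parts = ext.split("-")
    for i in range(len(parts) - 1, -1, -1):
        prefix = "-".join(parts[:i])
        candidates.append(".qmail-" + (prefix + "-" if prefix else "") + "default")
    return candidates


def pick_dot_qmail(home, extension):
    """Return the first candidate that exists in home, or None."""
    for name in dot_qmail_candidates(extension):
        if os.path.exists(os.path.join(home, name)):
            return name
    # None means: fall back to the default delivery instruction.
    return None
```

For the tammy-bob example, dot_qmail_candidates("bob") returns [".qmail-bob", ".qmail-default"], and for plain tammy, dot_qmail_candidates("") returns [".qmail"].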
Troubleshooting
So you've got some mail problems, and you want to see where things are breaking? Now that you know how everything works, you can just trace the path that the mail follows, until you see where things went wrong.
- Check your address. Did you spell it correctly?
- Check DNS. What's the MX record for your domain? Does it point to the correct hostname? Does that hostname resolve to the right IP?
- Check your firewall. Can you telnet to port 25 of your qmail server from outside your local area network? Perhaps your ISP blocks that port. Perhaps your firewall is misconfigured.
- Check rcpthosts and make sure the domain is listed there.
- Check qmail-smtpd's log files and make sure you see the connection logged.
- Check the qmail-send log files. There should be an entry when the new message is queued.
- Check the qmail-send log files again. There should be an entry when the delivery is attempted. It will say to local or to remote.
- Check locals and virtualdomains.
- Check for a valid user definition in /var/qmail/users/assign.
- Check that the timestamp on /var/qmail/users/cdb is newer than the one on users/assign.
- Check the .qmail* files in the user's home directory.
- Check the permissions on the .qmail file.
- Check the permissions on the home directory.
- Check the permissions on every parent directory of the home directory. I'm really serious about that last one: ls -ld / /home /home/tammy. If any of those directories or files are group- or world-writable, you lose.
- If you're delivering to alias because tammy doesn't exist, make sure ~alias/.qmail-tammy exists (and/or .qmail-tammy-bob, .qmail-tammy-default, etc.).
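Two of the checks in the list above are tedious to do by eye, so here is a small Python sketch that automates them: the permissions walk from the home directory all the way up to /, and the timestamp comparison between users/assign and users/cdb. The function names are hypothetical, and the script only reports what it finds; it does not know which of those directories your particular qmail build actually objects to.

```python
import os
import stat


def writable_by_others(path):
    """Return path and every ancestor that is group- or world-writable."""
    path = os.path.abspath(path)
    offenders = []
    while True:
        mode = os.stat(path).st_mode
        if mode & (stat.S_IWGRP | stat.S_IWOTH):
            offenders.append(path)
        parent = os.path.dirname(path)
        if parent == path:
            # Reached the filesystem root.
            break
        path = parent
    return offenders


def cdb_is_stale(assign_path, cdb_path):
    """True if users/cdb is older than users/assign and needs rebuilding."""
    return os.path.getmtime(cdb_path) < os.path.getmtime(assign_path)
```

For example, writable_by_others("/home/tammy/.qmail") checks the file, /home/tammy, /home and / in one go, and cdb_is_stale("/var/qmail/users/assign", "/var/qmail/users/cdb") tells you whether someone edited assign and forgot to regenerate cdb.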
Other shit people do
- Did you move qmail binaries from another machine to this one? They contain UIDs and GIDs which were defined on the build system at compile time. If your definitions for users like qmaill aren't the same as the build system's definitions, you lose.
- Did you check the permissions on /home as well as /home/tammy? Did you check the permissions on / itself?
- Did you read the log files?
- Do you know where your log files are? If you used daemontools to set up the daemons, then you want to read the output of cat /service/qmail-send/log/run to see where the log files live. Probably. Unless you did it differently. You do know what you did, right?
- When you read the log files, you know that message numbers get reused very frequently, right? It's not at all uncommon for the same message number to be used twice in a row on two different messages. Don't mix them up.
- Did you check free disk space? Quotas?
- If your users are defined externally (NIS, LDAP, etc.), does that work properly? Does your OS use an nsswitch.conf file? Is it correct? Is portmap running (for NIS)? Does ypmatch tammy passwd work (for NIS)?
- If your users' home directories are on NFS, is the server reachable? If they're automounted by autofs or amd, is that working properly? Did you check the logs on the NFS server? Did you check the logs on the NFS client?
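Because message numbers get reused, as warned in the list above, it helps to split a qmail-send log into one record per message lifetime instead of grouping by number. The following Python sketch does that. It assumes the usual qmail-send line shapes as I know them ("new msg N", "info msg N: bytes ... from <sender> ...", "starting delivery D: msg N to local|remote ADDRESS", "delivery D: success|failure|deferral: ...", "end msg N"); if your logs look different, the patterns need adjusting. The function name is hypothetical, and anything before those phrases on a line (such as a timestamp) is ignored.

```python
import re

NEW = re.compile(r"\bnew msg (\d+)")
INFO = re.compile(r"\binfo msg (\d+): bytes \d+ from <([^>]*)>")
START = re.compile(r"\bstarting delivery (\d+): msg (\d+) to (local|remote) (\S+)")
DONE = re.compile(r"\bdelivery (\d+): (success|failure|deferral):")
END = re.compile(r"\bend msg (\d+)")


def split_lifecycles(lines):
    """Return one record per finished message, even when numbers repeat."""
    open_msgs = {}    # message number -> record for its current lifetime
    deliveries = {}   # delivery number -> delivery entry still in progress
    finished = []
    for line in lines:
        m = NEW.search(line)
        if m:
            open_msgs[m.group(1)] = {"msg": m.group(1), "sender": None, "deliveries": []}
            continue
        m = INFO.search(line)
        if m:
            if m.group(1) in open_msgs:
                open_msgs[m.group(1)]["sender"] = m.group(2)
            continue
        m = START.search(line)
        if m:
            dnum, msg, kind, addr = m.groups()
            if msg in open_msgs:
                entry = {"to": addr, "kind": kind, "status": None}
                open_msgs[msg]["deliveries"].append(entry)
                deliveries[dnum] = entry
            continue
        m = DONE.search(line)
        if m:
            entry = deliveries.pop(m.group(1), None)
            if entry is not None:
                entry["status"] = m.group(2)
            continue
        m = END.search(line)
        if m:
            record = open_msgs.pop(m.group(1), None)
            if record is not None:
                finished.append(record)
    return finished
```

Each returned record covers exactly one "new msg" to "end msg" span, so two different messages that both happened to be msg 5 come out as two separate records with their own sender and deliveries. Messages still in the queue when the log ends are not returned.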