Revision 4 as of 2006-09-08 17:42:55 (editor: GreyCat)
This is still a work in progress. Expect some rough edges.

{{{<JoeNewbie> How can I check to see if my game server is still running?
<JoeNewbie> I'll put a script in crontab, and if it's not running, I'll restart it...}}}

We get that question (in various forms) ''way'' too often. A user has some daemon with a bug, and rather than fix the bug (which admittedly lies well outside a normal system administrator's purview), they simply want to restart the daemon whenever it dies. And yes, one could probably write a bash script that would try to parse the output of {{{ps}}} (or preferably {{{pgrep}}} if your system has it), try to ''guess'' which process ID belongs to the daemon we want, and try to ''guess'' whether it's not there any more. But that's haphazard and dangerous. There are much better ways.

Most Unix systems already ''have'' a feature that allows you to respawn dead processes: {{{init}}} and {{{inittab}}}. If you want to make a new daemon instance pop up whenever the old one dies, typically all you need to do is put an appropriate line into {{{/etc/inittab}}} with the "respawn" action in column 3, and your process's invocation in column 4.
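As a rough illustration of the four-column layout described above, the sketch below prints one such line. It assumes the System V style {{{id:runlevels:action:process}}} format; the id {{{gs}}} and the runlevels {{{2345}}} are made-up values, and the exact fields your system accepts should be checked against its own {{{inittab}}} manual page.

```shell
#!/bin/sh
# inittab_entry ID RUNLEVELS COMMAND
# Print one System V style inittab line with the "respawn" action.
# Field order: id : runlevels : action : process
inittab_entry() {
    printf '%s:%s:respawn:%s\n' "$1" "$2" "$3"
}

# Illustrative values only: "gs" and "2345" are assumptions, not
# something your system requires.
inittab_entry gs 2345 '/my/game/server -foo -bar -baz'
```

The printed line is what would be added to {{{/etc/inittab}}}; on System V style systems, {{{telinit q}}} then asks {{{init}}} to re-read the file. The command must stay in the foreground for this to work.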

Some Unix systems don't have {{{inittab}}}, and some system administrators might want finer control over the daemons and their logging. Those people may want to look into [http://cr.yp.to/daemontools.html daemontools], or [http://smarden.org/runit/ runit].

This leads into the issue of self-daemonizing programs. There was a trend during the 1980s for Unix daemons such as {{{inetd}}} to put themselves into the background automatically. It seems to be particularly common on BSD systems, although it's widespread across all flavors of Unix.

The problem with this is that any sane method of managing a daemon requires that you ''keep track of it after starting it''. If {{{init}}} is told to respawn a command, it simply launches that command as a child, then uses the {{{wait()}}} system call; and when the child exits, the parent can spawn another one. Daemontools works the same way: a user-supplied {{{run}}} script establishes the environment, and then {{{exec}}}s the process, thereby giving the daemontools supervisor direct parental authority over the process, including standard input and output, etc.
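The reason a {{{run}}} script ends with {{{exec}}} can be seen in a small experiment. This is only a sketch (the function name is invented for illustration): it prints the PID of a shell, then {{{exec}}}s a second program from that shell and prints the PID again.

```shell
#!/bin/sh
# show_exec_keeps_pid: print a shell's PID, then exec another program
# from it and print that program's PID. exec replaces the running
# process instead of forking a child, so both lines show the same PID.
show_exec_keeps_pid() {
    sh -c 'echo "$$"; exec sh -c "echo \$\$"'
}

show_exec_keeps_pid
```

Both numbers are the same, which is why the supervisor that launched the script ends up as the direct parent of the daemon itself. A {{{run}}} script would typically set up its environment and then finish with something like {{{exec /my/game/server -foo -bar -baz}}}.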

If a process double-forks itself into the background, it breaks the connection to its parent -- intentionally. This makes it unmanageable; the parent can no longer receive the child's output, and can no longer {{{wait()}}} for the child in order to be informed of its death. And the parent won't even know the new daemon's process ID, so it can't even keep track of it with a simple {{{kill -0}}}.
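For contrast, here is roughly what the {{{kill -0}}} check looks like when the parent ''does'' know the child's PID, which is only the case when the child stays in the foreground. The helper name {{{is_alive}}} is made up for this sketch.

```shell
#!/bin/sh
# is_alive PID
# Succeed if a process with that PID exists and we are allowed to
# signal it. Signal 0 sends nothing; kill only performs the check.
is_alive() {
    kill -0 "$1" 2>/dev/null
}

# Typical use from a parent that started the child itself:
#   /my/game/server -foo -bar -baz &
#   child=$!
#   is_alive "$child" && echo "still running"
#   wait "$child"     # returns when the child dies
```

Note that this is only trustworthy for a process you started and have not yet waited for; for an arbitrary PID read from somewhere else, the number may already belong to a different process.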

So, the Unix/BSD people came up with workarounds... they created "PID files", in which a long-running daemon would write its process ID, since the parent had no other way to determine it. But PID files are not reliable. A daemon could have died, and then some other process could have taken over its PID, rendering the PID file useless. Or the PID file could simply get deleted, or corrupted. They came up with {{{pgrep}}} and {{{pkill}}} to attempt to track down processes by name instead of by number... but what if the process doesn't have a unique name? What if there's more than one of it at a time, for example, with {{{nfsd}}} or Apache?

These workarounds and tricks are only in place because of the ''original'' hack of self-backgrounding. Get rid of ''that'', and everything else becomes easy! Init or daemontools or runit can just control the child process directly. And even the most raw beginner could write their own wrapper script:

{{{
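   #!/bin/sh
   # Restart the server every time it exits; the loop blocks while it runs.
   while true; do
      /my/game/server -foo -bar -baz >> /var/log/mygameserver 2>&1
      sleep 1   # short pause so a server that dies at startup cannot spin the CPU
   done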
}}}

Then simply arrange for that to be executed at boot time, with a simple {{{&}}} to put it in the background, and ''voila''! An instant one-shot respawn.

Most modern software packages no longer require self-backgrounding; even for those where it's the default behavior (for compatibility with older versions), there's often a switch or a set of switches which allows one to control the process. For instance, Samba's {{{smbd}}} now has a {{{-F}}} switch specifically for use with daemontools and other such programs.

{{{<JoeNewbie> How do I make sure only one copy of my script can run at a time?}}}

First, ask yourself ''why'' you think that restriction is necessary. Are you using a temporary file with a fixed name, rather than [wiki:Self:BashFaq#faq62 generating a new temporary file in a secure manner] each time? If so, correct that bug in your script. Are you using some system resource without locking it to prevent corruption if multiple processes use it simultaneously? In that case, you should probably use file locking, by rewriting your application in a language that supports it.

The naive answer to this question, which is given all too frequently by well-meaning but inexperienced scripters, would be to run some variant of {{{ps -ef | grep -v grep | grep "$(basename "$0")" | wc -l}}} to count how many copies of the script are in existence at the moment. I won't attempt to catalogue everything that is wrong with that approach, but two problems stand out: it counts any process whose command line merely contains the script's name, and nothing stops two copies from running the check at the same moment. If you can't see the rest for yourself, you'll simply have to take my word for it.

Unfortunately, bash has no facility for locking a file. You can [wiki:Self:BashFaq#45 use a ''directory'' as a lock], but you cannot lock a file directly.
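A minimal sketch of the directory-as-lock idea follows. It relies on {{{mkdir}}} being atomic: either it creates the directory or it fails because the directory already exists. The function name {{{run_locked}}}, the exit status 75 and any lock path you pass in are assumptions made for this example, not part of any standard tool.

```shell
#!/bin/sh
# run_locked LOCKDIR COMMAND [ARGS...]
# Run COMMAND only if LOCKDIR can be created. mkdir either creates the
# directory or fails, so two callers can never both succeed.
run_locked() {
    lockdir=$1
    shift
    if mkdir "$lockdir" 2>/dev/null; then
        status=0
        "$@" || status=$?
        rmdir "$lockdir"          # release the lock
        return "$status"
    else
        echo "lock $lockdir is held; not running" >&2
        return 75                 # arbitrary "try again later" status
    fi
}
```

It might be used as {{{run_locked /tmp/myscript.lock ./do_the_work}}}, where both names are placeholders. This sketch does not clean up if the script is killed while holding the lock; a real script would probably add a {{{trap}}} for that, and a stale directory left behind by a crash has to be removed by hand.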

If environmental restrictions ''require'' the use of a shell script, then you may be stuck using that. Otherwise, you should ''seriously'' consider rewriting the functionality you require in a more powerful language.

ProcessManagement (last edited 2023-08-09 06:29:52 by ormaaj)