Differences between revisions 3 and 5 (spanning 2 versions)
Revision 3 as of 2008-05-07 16:42:25
Size: 2471
Editor: Lhunath
Comment: extend upon and explain that PIDs are unreliable.
Revision 5 as of 2008-05-07 17:29:31
Size: 2473
Editor: triband-del-59
Comment:

Anchor(faq42)

How can I find out if a process is still running?

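The kill command is used to send signals to a running process. As a convenience, the signal "0", which does not exist, can be used to find out whether a process is still running; no signal is delivered, but kill still reports through its exit status whether the process could have been signalled: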

     myprog &          # Start program in the background
     daemonpid=$!      # ...and save its process id

     while sleep 60
     do
         if kill -0 "$daemonpid"     # Is the process still alive?
         then
             echo >&2 "OK - process is still running"
         else
             echo >&2 "ERROR - process $daemonpid is no longer running!"
             break
         fi
     done
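
As a quick illustration of how kill -0 behaves, the sketch below starts a short-lived sleep as a stand-in for myprog, probes it while it runs, and probes it again after it has been stopped and reaped. The helper name is_running is made up for this example, and the result is subject to the PID caveat described in the NOTE on this page.

```shell
# is_running is a hypothetical helper name, not a standard command.
is_running() {
    # kill -0 delivers no signal; its exit status only says whether
    # signalling that PID would have succeeded.
    kill -0 "$1" 2>/dev/null
}

sleep 30 &                          # stand-in for myprog
pid=$!

if is_running "$pid"; then before=alive; else before=dead; fi

kill "$pid"                         # stop the stand-in
wait "$pid" 2>/dev/null || true     # reap it; ignore its non-zero status

if is_running "$pid"; then after=alive; else after=dead; fi
echo "before=$before after=$after"
```

Note that a failing kill -0 can also mean that the PID exists but belongs to a process you are not allowed to signal, so the exit status alone does not distinguish the two cases.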

This is one of those questions that usually masks a much deeper issue. It's rare that someone wants to know whether a process is still running simply to display a red or green light to an operator.

NOTE: Anything you do that relies on PIDs to identify a process is inherently flawed! While process A is alive and has PID B, PID B always refers to process A. However, once process A dies, the mapping of PID B is UNDEFINED! You may think that PID B will simply no longer exist, but it is entirely possible that ANOTHER process, launched at almost the same moment process A died, was given the SAME PID. That would make the above program think that process A is still alive (its PID exists!) even though it is dead and gone! It is for this reason that no process other than its parent should try to manage a process. Read the link at the end of this FAQ.
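
Since the parent is the one process that can reliably track its child, a minimal sketch of the parent-side approach is to wait for the child instead of polling its PID. Here a subshell that exits with status 3 stands in for myprog; the variable names are only illustrative.

```shell
( sleep 1; exit 3 ) &     # stand-in for myprog; exits with status 3
child=$!

# wait blocks until this shell's own child has exited and returns the
# child's exit status, which the shell records for its own children.
status=0
wait "$child" || status=$?

echo "child exited with status $status"
```

Because the shell is asking about its own child rather than guessing from a PID that happens to exist, the status it reports belongs to the process it actually started.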

More often, there's some ulterior motive, such as the desire to ensure that some daemon which is known to crash frequently is still running. If this is the case, you should fix the program or its configuration rather than half-wittedly restarting it each time it dies. If you still think that simply restarting the program as soon as it crashes is necessary, use this:

until myprog
do
    echo "ERROR: myprog terminated with exit code: $?.  Restarting .."
    sleep 1
done

This piece of code restarts myprog whenever it terminates with an exit code other than 0 (indicating that something went wrong). If the exit code is 0 (a successful shutdown), the loop ends. Generally speaking, that only happens when you instruct the program to shut down, in which case you don't want it to restart automatically.
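
To see how the loop behaves without a real daemon, the following sketch replaces myprog with a hypothetical shell function that fails on its first two runs and exits cleanly on the third. The counter variable attempts is an assumption of the example, and the sleep 1 is left out only to keep the demonstration fast.

```shell
attempts=0

# Hypothetical stand-in for myprog: fails twice, then succeeds.
myprog() {
    attempts=$((attempts + 1))
    [ "$attempts" -ge 3 ]
}

until myprog
do
    # $? still holds the exit code of the failed myprog run here.
    echo >&2 "ERROR: myprog terminated with exit code: $?.  Restarting .."
done

echo "myprog exited cleanly after $attempts runs"
```

With a real program, keep the sleep 1 in the loop body so that a program which fails immediately does not restart in a tight loop.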

For a much better discussion of these issues, see ProcessManagement or [:BashFAQ#faq33:FAQ #33].

BashFAQ/042 (last edited 2012-10-27 10:38:13 by a88-114-128-29)