BASH Frequently Asked Questions

These are answers to frequently asked questions on channel #bash on the [http://www.freenode.net/ freenode] IRC network. These answers are contributed by the regular members of the channel (originally heiner, and then others including greycat and r00t), and by users like you. If you find something inaccurate or simply misspelled, please feel free to correct it!

All the information here is presented without any warranty or guarantee of accuracy. Use it at your own risk. When in doubt, please consult the man pages or the GNU info pages as the authoritative references.

["BASH"] is a BourneShell compatible shell, which adds many new features to its ancestor. Most of them are available in the KornShell, too. If a question is not strictly shell specific, but rather related to Unix, it may be in the UnixFaq.

If you want to help, you can add new questions with answers here, or try to answer one of the BashOpenQuestions.

Anchor(faq1)

1. How can I read a file line-by-line?

    while read line
    do
        echo "$line"
    done < "$file"

The read command still modifies each line read, e.g. it removes all leading and trailing whitespace characters (blanks, tab characters). If that is not desired, the IFS (internal field separator) variable has to be cleared:

    OIFS=$IFS; IFS=
    while read line
    do
        echo "$line"
    done < "$file"
    IFS=$OIFS

As a feature, the read command concatenates lines that end with a backslash '\' character to one single line. To disable this feature, KornShell and ["BASH"] have read -r:

    OIFS=$IFS; IFS=
    while read -r line
    do
        echo "$line"
    done < "$file"
    IFS=$OIFS
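
Saving and restoring IFS can be avoided by placing the assignment directly in front of read, where it affects only that one command. The following is a minimal sketch of that idiom for Bash or KornShell; the sample file and its contents are made up for illustration:

```shell
# Hypothetical sample file: one line with leading blanks, one with a backslash.
file=$(mktemp)
printf '  indented\nback\\slash\n' > "$file"

result=
# "IFS=" applies to this read only; -r keeps backslashes literal.
while IFS= read -r line
do
    result="${result}[${line}]"
done < "$file"

echo "$result"
rm -f "$file"
```

The brackets in the output make it visible that the leading blanks and the backslash survive.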

Note that reading a file line by line this way is very slow for large files. Consider using e.g. ["AWK"] instead if you get performance problems.

One may also read from a command instead of a regular file:

    some command | while read line; do
       other commands
    done

That may cause problems later on if the commands inside the body of the loop attempt to set variables which need to be used outside the loop; in that case, see [#faq24 FAQ 24].

Sometimes it's useful to read a file into an array, one array element per line. You can do that with the following example:

    O=$IFS IFS=$'\n' arr=($(< myfile)) IFS=$O

This temporarily changes the Internal Field Separator to a newline, so that word splitting of the command substitution treats each line as one field. Then it populates the array arr with the fields. Then it sets the IFS back to what it was before.
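
Because the command substitution in that one-liner is unquoted, each line is also subject to filename expansion, so a line consisting of * would be replaced by file names. A loop avoids this. The following is a sketch for Bash only, with a made-up sample file:

```shell
# Hypothetical sample file; the second line is a glob character on purpose.
file=$(mktemp)
printf 'first line\n*\nthird\n' > "$file"

arr=()
while IFS= read -r line; do
    arr[${#arr[@]}]=$line      # append at the next free index
done < "$file"

echo "${#arr[@]} lines; second is: ${arr[1]}"
rm -f "$file"
```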

Anchor(faq2)

2. How can I remove the last character of a line?

Using bash and ksh extended parameter substitution:

    var=${var%?}

Remember that ${var%foo} removes foo from the end, and ${var#foo} removes foo from the beginning, of var. As a mnemonic, # appears to the left of % on the keyboard (US keyboards, at least).

More portable, but slower:

    var=`expr "$var" : '\(.*\).'`

or (using sed):

    var=`echo "$var" | sed 's/.$//'`
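
To see the two operators from the mnemonic side by side, here is a small sketch; the variable names and the sample value are made up:

```shell
var="report.txt"
short=${var%?}        # strip one character from the end
nohead=${var#?}       # strip one character from the beginning
echo "$short $nohead"
```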

Anchor(faq3)

3. How can I insert a blank character after each character?

    sed 's/./& /g'

Example:

    $ echo "testing" | sed 's/./& /g'
    t e s t i n g

Anchor(faq4)

4. How can I check whether a directory is empty or not?

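The following idea tests the directory's hard link count rather than listing its entries. On traditional Unix file systems a directory has two links (its own "." and its name in the parent) plus one for each subdirectory, so a count of 2 means "no subdirectories"; regular files do not affect the count, and some file systems do not maintain it this way: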

    find "$dir" -maxdepth 0 -links 2 \
     -exec echo "empty directory: {}" \;

Conversely, to find a non-empty directory:

    find "$dir" -maxdepth 0 -links +2 \
     -exec echo "directory is non-empty" \;

Most modern systems have an "ls -A" which explicitly omits "." and ".." from the directory listing:

    if [ -n "$(ls -A somedir)" ]
    then
        echo directory is non-empty
    fi

This can be shortened to:

    if [ "$(ls -A somedir)" ]
    then
        echo directory is non-empty
    fi

Another way, using Bash features, involves setting the special shell option which changes the behavior of globbing. Some people prefer to avoid this approach, because it's so drastically different and could severely alter the behavior of scripts.

Nevertheless, if you're willing to use this approach, it does greatly simplify this particular task:

    shopt -s nullglob
    if [[ -z $(echo *) ]]; then
        echo directory is empty
    fi

It also simplifies various other operations:

    shopt -s nullglob
    for i in *.zip; do
        blah blah "$i"  # No need to check $i is a file.
    done

Without the shopt, that would have to be:

    for i in *.zip; do
        [[ -f $i ]] || continue  # If no .zip files, i becomes *.zip
        blah blah "$i"
    done

(You may want to use the latter anyway, if there's a possibility that the glob may match directories in addition to files.)
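
The echo * test above misses hidden files (names starting with a dot), since * does not match them by default. One way around both that and the worry about changing globbing for the whole script is to set the options inside a subshell. This is a sketch for Bash only; is_empty is a hypothetical helper name, not a standard command:

```shell
# is_empty DIR: succeed if DIR contains no entries at all (hidden ones included).
# The function body is a subshell, so the shopt settings do not leak out.
is_empty() (
    shopt -s nullglob dotglob
    cd "$1" || exit 2
    entries=(*)
    (( ${#entries[@]} == 0 ))
)

dir=$(mktemp -d)                 # temporary directory, for illustration
is_empty "$dir" && echo "empty"
touch "$dir/.hidden"
is_empty "$dir" || echo "now non-empty"
rm -rf "$dir"
```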

Anchor(faq5)

5. How can I convert all upper-case file names to lower case?

    # tolower - convert file names to lower case

    for file in *
    do
        [ -f "$file" ] || continue                  # ignore non-existing names
        newname=$(echo "$file" | tr '[A-Z]' '[a-z]') # lower-case version of file name
        [ "$file" = "$newname" ] && continue        # nothing to do
        [ -f "$newname" ] && continue               # do not overwrite existing files
        mv "$file" "$newname"
    done

Purists will insist on using

    tr '[:upper:]' '[:lower:]'

in the above code, in case of non-ASCII (e.g. accented) letters in locales which have them.

This technique can also be used to replace all unwanted characters in a file name e.g. with '_' (underscore). The script is the same as above, only the "newname=..." line has changed.

    # renamefiles - rename files whose names contain unusual characters
    for file in *
    do
        [ -f "$file" ] || continue                  # ignore non-existing names
        newname=$(echo "$file" | sed 's/[^a-zA-Z0-9_.]/_/g')
        [ "$file" = "$newname" ] && continue        # nothing to do
        [ -f "$newname" ] && continue               # do not overwrite existing files
        mv "$file" "$newname"
    done

The character class in [] contains all allowed characters; modify it as needed.

Anchor(faq6)

6. How can I use a logical AND in a shell pattern (glob)?

That can be achieved through the !() extglob operator. You'll need extglob set. It can be checked with:

    $ shopt extglob

and set with:

    $ shopt -s extglob

To warm up, we'll move all files starting with foo AND not ending with .d to directory foo_thursday.d:

    $ mv foo!(*.d) foo_thursday.d

For the general case:

Delete all files containing Pink_Floyd AND not containing The_Final_Cut:

    $ rm !(!(*Pink_Floyd*)|*The_Final_Cut*)

By the way: these kinds of patterns can be used with KornShell and KornShell93, too. They don't have to be enabled there, but are the default patterns.

Anchor(faq7)

7. Is there a function to return the length of a string?

The fastest way, not requiring external programs (but usable only with ["BASH"] and KornShell):

    ${#varname}

A portable alternative uses expr:

    expr "$varname" : '.*'

(expr prints the number of characters matching the pattern .*, which is the length of the string)

Another form is

    expr length "$varname"

(for a BSD/GNU version of expr. Do not use this, because it is not ["POSIX"]).

Anchor(faq8)

8. How can I recursively search all files for a string?

On most recent systems (GNU/Linux/BSD), you would use grep -r pattern . to search all files from the current directory (.) downward.

You can use find if your grep lacks -r:

    find . -type f -exec grep -l "$search" '{}' \;

The {} characters will be replaced with the current file name.

This command is slower than it needs to be, because find will call grep with only one file name, resulting in many grep invocations (one per file). Since grep accepts multiple file names on the command line, find can be instructed to call it with several file names at once:

    find . -type f -exec grep -l "$search" '{}' \+

The trailing '+' character instructs find to call grep with as many file names as possible, saving processes and resulting in faster execution. This example works for POSIX find, e.g. with Solaris.
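
The following sketch runs the + form against a small throwaway tree. The directory layout, file names and search string are made up for illustration:

```shell
# Hypothetical test tree; one name deliberately contains blanks.
dir=$(mktemp -d)
mkdir -p "$dir/sub dir"
printf 'needle\n' > "$dir/sub dir/a file.txt"
printf 'hay\n' > "$dir/b.txt"

search=needle
# "+" makes find pass many file names to a single grep invocation.
found=$(find "$dir" -type f -exec grep -l "$search" '{}' +)
echo "$found"
rm -rf "$dir"
```

Only the file containing the search string is listed, and the blanks in its name cause no trouble because find hands the names to grep as separate arguments.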

GNU find uses a helper program called xargs for the same purpose:

    find . -type f -print0 | xargs -0 grep -l "$search"

The -print0 / -0 options ensure that any file name can be processed, even ones containing blanks, TAB characters, or new-lines.

90% of the time, all you need is:

Have grep recurse and print the lines (GNU grep):

    grep -r "$search" .

Have grep recurse and print only the names (GNU grep):

    grep -r -l "$search" .

The find command can be used to run arbitrary commands on every file in a directory (including sub-directories). Replace grep with the command of your choice. The curly braces {} will be replaced with the current file name in the case above.

(Note that they must be escaped in some shells, but not in ["BASH"].)

Anchor(faq9)

9. My command line produces no output: tail -f logfile | grep 'ssh'

Most standard Unix commands buffer their output if used non-interactively. This means that they don't write each character (or even each line) as soon as it is ready, but collect a larger amount (e.g. 4 kilobytes) before printing it. In the case above, the tail command buffers its output, and therefore grep only gets its input in e.g. 4K blocks.

Unfortunately there's no easy solution to this, because the behaviour of the standard programs would need to be changed. *See bottom of section before taking 'no easy solution' to heart*

Some programs provide special command line options for this purpose, e.g.

    grep (e.g. GNU version 2.5.1)    --line-buffered
    sed (e.g. GNU version 4.0.6)     -u, --unbuffered
    awk (some GNU versions)          -W interactive, or use the fflush() function
    tcpdump, tethereal               -l

The expect package (http://expect.nist.gov/) has an unbuffer example program, which can help here. It disables buffering for the output of a program.

Example usage:

    unbuffer tail -f logfile | grep 'ssh'

There is another option when you have more control over the creation of the log file. If you would like to grep the real-time log of a text interface program which does buffered session logging by default (or you were using script to make a session log), then try this instead:

   $ program | tee -a program.log

   In another window:
   $ tail -f program.log | grep whatever

Apparently this works because tee produces unbuffered output. This has only been tested on GNU tee, YMMV.

Another solution is to use the 'less' command in follow mode. This is simple to do!

   $ less program.log

Then enter your search pattern (/ is search in less, like vi):

   /ssh

Next, put less into follow mode by pressing shift+f.

That's all there is to it!

Anchor(faq10)

10. How can I recreate a directory structure, without the files?

With the cpio program:

    cd "$srcdir"
    find . -type d -print | cpio -pdumv "$dstdir"

or with GNU-tar, and less obscure syntax:

    cd "$srcdir"
    find . -type d -print | tar c --files-from - --no-recursion | tar x --directory "$dstdir"

This creates a list of directory names with find, non-recursively adds just the directories to an archive, and pipes it to a second tar instance to extract it at the target location.
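
Where neither cpio nor GNU tar is at hand, the same find listing can drive mkdir -p instead. This is an alternative sketch, not one of the two methods above, and unlike them it does not carry over permissions or ownership. The directories are temporary ones created just for the illustration:

```shell
srcdir=$(mktemp -d)
dstdir=$(mktemp -d)
mkdir -p "$srcdir/a/b" "$srcdir/c"
touch "$srcdir/a/file.txt"

# For every directory below $srcdir, create the same relative path under $dstdir.
# Inside sh -c, $1 is the destination and $2 the directory that find reports.
(cd "$srcdir" && find . -type d -exec sh -c 'mkdir -p "$1/$2"' sh "$dstdir" {} \;)

dirs=$(cd "$dstdir" && find . -type d | wc -l)
files=$(cd "$dstdir" && find . -type f | wc -l)
echo "directories: $dirs, files: $files"
rm -rf "$srcdir" "$dstdir"
```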

Anchor(faq11)

11. How can I print the n'th line of a file?

The dirty (but not quick) way would be sed -n ${n}p "$file" but this reads the whole input file, even if you only wanted the third line.

The following sed command line reads a file printing nothing (-n). At line $n the command "p" is run, printing it, with a "q" afterwards: quit the program.

    sed -n "$n{p;q;}" "$file"
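
A quick way to try this out, using a made-up four-line file:

```shell
file=$(mktemp)
printf 'one\ntwo\nthree\nfour\n' > "$file"

n=3
# Print line $n, then quit so the rest of the file is never read.
line=$(sed -n "${n}{p;q;}" "$file")
echo "$line"
rm -f "$file"
```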

Anchor(faq12)

12. A program (e.g. a file manager) lets me define an external command that an argument will be appended to - but I need that argument somewhere in the middle...

    sh -c 'echo "$1"' -- hello
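
This works because sh -c assigns the first word after the command string to $0 and the following words to $1, $2 and so on, so the appended argument can be placed anywhere inside the quoted command. The sketch below uses the word sh as the placeholder for $0; the surrounding text "before" and "after" is made up:

```shell
# "hello" stands for the argument that the program appends.
out=$(sh -c 'echo "before $1 after"' sh hello)
echo "$out"
```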

Anchor(faq13)

13. How can I concatenate two variables?

There is no concatenation operator for strings (either literal or variable dereferences) in the shell. The strings are just written one after the other:

    var=$var1$var2

If the right-hand side contains whitespace characters, it needs to be quoted:

    var="$var1 - $var2"

Braces can be used to disambiguate the right-hand side:

    var=${var1}xyzzy
    # without braces, var1xyzzy would be interpreted as a variable name
    # Another equivalent way would be:
    var="$var1"xyzzy

CommandSubstitution can be used as well. The following line creates a log file name logname containing the current date, resulting in names like e.g. log.2004-07-26:

    logname="log.$(date +%Y-%m-%d)"

Appending data to the end of a string doesn't require any black magic, either.

    string="$string more data here"

Bash 3.1 has a new += operator that you may see from time to time:

    string+=" more data here"     # EXTREMELY non-portable!

It's generally best to use the portable syntax.

Anchor(faq14)

14. How can I redirect the output of several commands at once?

Redirecting the standard output of a single command is as easy as

    date > file

To redirect standard error:

    date 2> file

To redirect both:

    date > file 2>&1

In a loop or other larger code structure:

    for i in $list; do
        echo "Now processing $i"
        # more stuff here...
    done > file 2>&1

However, this can become tedious if the output of many programs should be redirected. If all output of a script should go into a file (e.g. a log file), the exec command can be used:

    # redirect both standard output and standard error to "log.txt"
    exec > log.txt 2>&1
    # all output including stderr now goes into "log.txt"

Otherwise command grouping helps:

    {
        date
        # some other command
        echo done
    } > messages.log 2>&1

In this example, the output of all commands within the curly braces is redirected to the file messages.log.
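
A small sketch to confirm that both streams of every command in the group end up in the same file; the temporary log file is made up for the illustration:

```shell
log=$(mktemp)   # temporary log file, for illustration only

{
    echo "to stdout"
    echo "to stderr" >&2
} > "$log" 2>&1

content=$(cat "$log")
rm -f "$log"
echo "$content"
```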

Anchor(faq15)

15. How can I run a command on all files with the extension .gz?

Often a command already accepts several files as arguments, e.g.

    zcat *.gz

(On some systems, you would use gzcat instead of zcat. If neither is available, or if you don't care to play guessing games, just use gzip -dc instead.) If an explicit loop is desired, or if your command does not accept multiple filename arguments in one invocation, the for loop can be used:

    for file in *.gz
    do
        echo "$file"
        # do something with "$file"
    done

To do it recursively, you should use a loop, plus the find command:

    while read file; do
        echo "$file"
        # do something with "$file"
    done < <(find . -name '*.gz' -print)

For more hints in this direction, see [#faq20 FAQ #20], below. To see why the find command comes after the loop instead of before it, see [#faq24 FAQ #24].

Anchor(faq16)

16. How can I remove a file name extension from a string, e.g. file.tar to file?

The easiest (and fastest) way is to use the following:

    $ name="file.tar"
    $ echo "${name%.tar}"
    file

The ${var%pattern} syntax removes the pattern from the end of the variable. ${var#pattern} would remove pattern from the start of the string. This could be used to rename all files from "*.doc" to "*.txt":

    for file in *.doc
    do
        mv "$file" "${file%.doc}".txt
    done

There's more to ParameterSubstitution, e.g. ${var%%pattern}, ${var##pattern}, ${var//old/new}.
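
As a rough illustration of those additional forms (Bash syntax assumed; the path and variable names are made up):

```shell
path="/home/user/archive.tar.gz"

base=${path##*/}      # longest match of "*/" removed from the start
stem=${path%%.*}      # longest match of ".*" removed from the end
short=${path%.*}      # shortest match of ".*" removed from the end
under=${base//./_}    # every "." replaced by "_"

echo "$base $stem $short $under"
```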

Note that this extended form of ParameterSubstitution works with ["BASH"], KornShell, KornShell93, but not with the older BourneShell. If the code needs to be portable to that shell as well, sed could be used to remove the filename extension part:

    for file in *.doc
    do
        base=`echo "$file" | sed 's/\.[^.]*$//'`    # remove everything starting with last '.'
        mv "$file" "$base".txt
    done

Finally, some GNU/Linux/BSD systems offer a rename command. There are multiple different rename commands out there with contradictory syntaxes. Consult your man pages to see which one you have (if any).

Anchor(faq17)

17. How can I group expressions, e.g. (A AND B) OR C?

The TestCommand [ uses parentheses () for expression grouping. Given that "AND" is "-a", and "OR" is "-o", the following expression

    (0<n AND n<=10) OR n=-1

can be written as follows:

    if [ \( $n -gt 0 -a $n -le 10 \) -o $n -eq -1 ]
    then
        echo "0 < $n <= 10, or $n=-1"
    else
        echo "invalid number: $n"
    fi

Note that the parentheses have to be quoted: \(, '(' or "(".

["BASH"] and KornShell have different, more powerful comparison commands with slightly different (easier) quoting:

ArithmeticExpression for arithmetic expressions, and
NewTestCommand for string (and file) expressions.

Examples:

    if (( (n>0 && n<10) || n == -1 ))
    then echo "0 < $n < 10, or n==-1"
    fi

    if [[ ( -f $localconfig && -f $globalconfig ) || -n $noconfig ]]
    then echo "configuration ok (or not used)"
    fi

Note that the distinction between numeric and string comparisons is strict. Consider the following example:

    n=3
    if [[ $n > 0 && $n < 10 ]]
    then echo "$n is between 0 and 10"
    else echo "ERROR: invalid number: $n"
    fi

The output will be "ERROR: ....", because in a string comparison "3" is bigger than "10": "3" already comes after "1", and the next character "0" is not considered. Changing the square brackets to double parentheses (( makes the example work as expected.
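
The two behaviours can be put side by side in a short Bash sketch; the variable names str and num are made up:

```shell
n=3

# String comparison: "3" sorts after "10", so the second test fails.
if [[ $n > 0 && $n < 10 ]]; then str=yes; else str=no; fi

# Arithmetic comparison: 3 is numerically between 0 and 10.
if (( n > 0 && n < 10 )); then num=yes; else num=no; fi

echo "string comparison: $str, arithmetic comparison: $num"
```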

Anchor(faq18)

18. How can I use numbers with leading zeros in a loop, e.g. 01, 02?

As always, there are different ways to solve the problem, each with its own advantages and disadvantages.

If there are not many numbers, BraceExpansion can be used:

    for i in 0{1,2,3,4,5,6,7,8,9} 10
    do
        echo $i
    done

Output:

    01
    02
    03
    [...]
    10

This gets tedious for large sequences, but there are other ways, too. If the command seq is available, you can use it as follows:

    seq -w 1 10

or, for arbitrary numbers of leading zeros (here: 3):

    seq -f "%03g" 1 10

If you have the printf command (which is a Bash builtin, and is also POSIX standard), it can be used to format a number, too:

    for ((i=1; i<=10; i++))
    do
        printf "%02d " "$i"
    done

The KornShell and KornShell93 have the typeset command to specify the number of leading zeros:

    $ typeset -Z3 i=4
    $ echo $i
    004

Finally, the following example works with any BourneShell derived shell to zero-pad each line to three bytes:

    i=0
    while test $i -le 10
    do
        echo "00$i"
        i=`expr $i + 1`
    done |
        sed 's/.*\(...\)$/\1/g'

In this example, the number of '.' inside the parentheses in the sed statement determines how many total bytes from the echo command (at the end of each line) will be kept and printed.

One more addendum: in Bash 3, you can use:

    printf "%03d \n" {1..300}

which is slightly easier in some cases.

You can also use the printf command with xargs and wget to fetch files:

    printf "%03d\n" $(seq "$START" "$END") | xargs -I% wget "$LOCATION/%"

Sometimes a good solution.
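
When the bounds are held in variables, keep in mind that Bash performs brace expansion before variable expansion, so a sequence expression cannot take its limits from variables. An arithmetic for loop is one way around that. A sketch with made-up bounds:

```shell
START=8
END=11

list=$(
    for ((i = START; i <= END; i++)); do
        printf '%03d\n' "$i"
    done
)
echo "$list"
```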

Anchor(faq19)

19. How can I split a file into line ranges, e.g. lines 1-10, 11-20, 21-30?

Some Unix systems provide the split utility for this purpose:

    split --lines 10 --numeric-suffixes input.txt output-

For more flexibility you can use sed. The sed command can print e.g. the line number range 1-10:

    sed -n '1,10p'

This stops sed from printing each line (-n). Instead it only processes the lines in the range 1-10 ("1,10"), and prints them ("p"). sed still reads the input until the end, although we are only interested in lines 1 through 10. We can speed this up by making sed terminate immediately after printing line 10:

    sed -n -e '1,10p' -e '10q'

Now the command will quit after reading line 10 ("10q"). The -e arguments indicate a script (instead of a file name). The same can be written a little shorter:

    sed -n '1,10p;10q'

We can now use this to print an arbitrary range of a file (specified by line number):

    file=/etc/passwd
    range=10
    firstline=1
    maxlines=$(wc -l < "$file") # count number of lines
    while (($firstline <= $maxlines))
    do
        ((lastline=$firstline+$range-1))    # last line of this block of $range lines
        sed -n -e "$firstline,${lastline}p" -e "${lastline}q" "$file"
        ((firstline=$lastline+1))           # next block starts right after this one
    done

This example uses ["BASH"] and KornShell ArithmeticExpressions, which older [wiki:BourneShell Bourne shells] do not have. In that case the following example should be used instead:

    file=/etc/passwd
    range=10
    firstline=1
    maxlines=`wc -l < "$file"` # count number of lines
    while [ $firstline -le $maxlines ]
    do
        lastline=`expr $firstline + $range - 1`    # last line of this block
        sed -n -e "$firstline,${lastline}p" -e "${lastline}q" "$file"
        firstline=`expr $lastline + 1`
    done

Anchor(faq20)

20. How can I find and deal with file names containing newlines, spaces or both?

The preferred method is still to use

    find ... -exec command {} \;

or, if you need to handle filenames en masse:

    find ... -print0 | xargs -0 command

for GNU find/xargs, or (POSIX find):

    find ... -exec command {} +

Use that unless you really can't.

Another way to deal with files with spaces in their names is to use the shell's filename expansion (["globbing"]). This has the disadvantage of not working recursively (except with zsh's extensions), but if you just need to process all the files in a single directory, it works fantastically well.

This example changes all the *.mp3 files in the current directory to use underscores in their names instead of spaces. (But it will not work in the original BourneShell.)

    for file in *.mp3; do
        mv "$file" "${file// /_}"
    done

You could do the same thing for all files (regardless of extension) by using

    for file in *\ *; do

instead of *.mp3.

Another way to handle filenames recursively involves using the -print0 option of find (a GNU/BSD extension), together with bash's -d option for read:

    unset a i
    while read -d $'\0' file; do
      a[i++]="$file"        # or however you want to process each file
    done < <(find /tmp -type f -print0)

The preceding example reads all the files under /tmp (recursively) into an array, even if they have newlines or other whitespace in their names, by forcing read to use the NUL byte (\0) as its word delimiter. Since NUL is not a valid byte in Unix filenames, this is the safest approach besides using find -exec.
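
A slightly more defensive variant of that loop adds IFS= and -r to read, so that leading or trailing blanks and backslashes in names are preserved as well. This is a sketch for Bash only; the directory and the file names are made up, and one of them contains a newline on purpose:

```shell
dir=$(mktemp -d)
touch "$dir/plain" "$dir/with space" "$dir/with"$'\n'"newline"

files=()
# -d '' makes read stop at each NUL byte instead of at each newline.
while IFS= read -r -d '' f; do
    files[${#files[@]}]=$f
done < <(find "$dir" -type f -print0)

echo "found ${#files[@]} files"
rm -rf "$dir"
```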

Anchor(faq21)

21. How can I replace a string with another string in all files?

sed is a good command to replace strings, e.g.

    sed 's/olddomain\.com/newdomain\.com/g' input > output

To replace a string in all files of the current directory:

    for i in *; do
        sed 's/old/new/g' "$i" > atempfile && mv atempfile "$i"
    done

GNU sed 4.x (but no other version of sed) has a special -i flag which makes the temp file unnecessary:

   for i in *; do
      sed -i 's/old/new/g' "$i"
   done

Those of you who have perl 5 can accomplish the same thing using this code:

    perl -pi -e 's/old/new/g' *

Recursively:

    find . -type f -print0 | xargs -0 perl -pi -e 's/old/new/g'

To replace for example all "unsigned" with "unsigned long", if it is not "unsigned int" or "unsigned long" ...:

    perl -i.bak -pne 's/\bunsigned\b(?!\s+(int|short|long|char))/unsigned long/g' $(find . -type f)

Finally, here's a script that some people may find useful:

    :
    # chtext - change text in several files

    # neither string may contain '|' unquoted
    old='olddomain\.com'
    new='newdomain\.com'

    # if no files were specified on the command line, use all files:
    [ $# -lt 1 ] && set -- *

    for file
    do
        [ -f "$file" ] || continue # do not process e.g. directories
        [ -r "$file" ] || continue # cannot read file - ignore it
        # Replace string, write output to temporary file. Terminate script in case of errors
        sed "s|$old|$new|g" "$file" > "$file"-new || exit
        # If the file has changed, overwrite original file. Otherwise remove copy
        if cmp "$file" "$file"-new >/dev/null 2>&1
        then rm "$file"-new              # file has not changed
        else mv "$file"-new "$file"      # file has changed: overwrite original file
        fi
    done

If the code above is put into a script file (e.g. chtext), the resulting script can be used to change a text e.g. in all HTML files of the current and all subdirectories:

    find . -type f -name '*.html' -exec chtext {} \;

Many optimizations are possible:

use another sed separator character than '|', e.g. ^A (ASCII 1)
some implementations of sed (e.g. GNU sed) have an "-i" option that can change a file in-place; no temporary file is necessary in that case
the find command above could use either xargs or the built-in xargs of POSIX find

Note: set -- * in the code above is safe with respect to files whose names contain spaces. The expansion of * by set is the same as the expansion done by for, and filenames will be preserved properly as individual parameters, and not broken into words on whitespace.

A more sophisticated example of chtext is here: http://www.shelldorado.com/scripts/cmds/chtext

Anchor(faq22)

22. How can I calculate with floating point numbers instead of just integers?

["BASH"] does not have built-in floating point arithmetic:

    $ echo $((10/3))
    3

For better precision, an external program must be used, e.g. bc, awk or dc:

    $ echo "scale=3; 10/3" | bc
    3.333

The "scale=3" command notifies bc that three digits of precision after the decimal point are required.

awk can be used for calculations, too:

    $ awk 'BEGIN {printf "%.3f\n", 10 / 3}' /dev/null
    3.333

There is a subtle but important difference between the bc and the awk solution here: bc reads commands and expressions from standard input. awk on the other hand evaluates the expression as part of the program. Expressions on standard input are not evaluated, i.e. echo 10/3 | awk '{print $0}' will print 10/3 instead of the evaluated result of the expression.

This explains why the example uses /dev/null as an input file for awk: the program runs the BEGIN action, evaluating the expression and printing the result. At that point the work is already done: awk reads its input file /dev/null, gets an end-of-file indication, and terminates. If no file had been specified, some older awk implementations would wait for data on standard input.

Newer versions of KornShell93 have built-in floating point arithmetic, together with mathematical functions like sin() or cos() .
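
Since awk evaluates expressions only as part of its program, shell values have to be handed over explicitly, for instance with the -v option. A minimal sketch; the variable names are made up:

```shell
a=10
b=3

# -v copies each shell value into an awk variable before BEGIN runs.
quotient=$(awk -v a="$a" -v b="$b" 'BEGIN { printf "%.3f\n", a / b }' /dev/null)
echo "$quotient"
```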

Anchor(faq23)

23. How do I append a string to the contents of a variable?

The shell doesn't have a string concatenation operator like Java ("+") or Perl ("."). The following example shows how to append the string ".2004-08-15" to the contents of the shell variable filename:

    filename="$filename.2004-08-15"

If the variable name and the string to append could be confused, the variable name can be enclosed in braces, e.g.

    filename="${filename}old"

instead of filename=$filenameold

Anchor(faq24)

24. I set variables in a loop. Why do they suddenly disappear after the loop terminates?

The following command always prints "total number of lines: 0", although the variable linecnt has a larger value in the while loop:

    linecnt=0
    cat /etc/passwd | while read line
    do
        linecnt=`expr $linecnt + 1`
    done
    echo "total number of lines: $linecnt"

The reason for this surprising behaviour is that a while/for/until loop runs in a subshell when its input or output is redirected from a pipeline. For the while loop above, a new subshell with its own copy of the variable linecnt is created (initial value, taken from the parent shell: "0"). This copy then is used for counting. When the while loop is finished, the subshell copy is discarded, and the original variable linecnt of the parent (whose value has not changed) is used in the echo command.

It's hard to tell when a shell will create a new process for a loop:

BourneShell creates it when the input or output is redirected, either by using a pipeline or by a redirection operator ('<', '>').
["BASH"] creates a new process only if the loop is part of a pipeline
KornShell creates it only if the loop is part of a pipeline, but not if the loop is the last part of it.

To solve this, either use a method that works without a subshell (shown below), or make sure you do all processing inside that subshell (a bit of a kludge, but easier to work with):

    linecnt=0
    cat /etc/passwd |
    (
        while read line ; do
                linecnt="$((linecnt+1))"
        done
        echo "total number of lines: $linecnt"
    )

To avoid the subshell completely (not easily possible if the other part of the pipe is a command!), use redirection, which does not have this problem at least for ["BASH"] and KornShell (but still for BourneShell):

    linecnt=0
    while read line ; do
        linecnt="$((linecnt+1))"
    done < /etc/passwd
    echo "total number of lines: $linecnt"
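
When the input comes from a command rather than a file, Bash's process substitution lets the loop keep the redirection form, so it still runs in the current shell. This is a Bash-only sketch; printf merely stands in for whatever command produces the lines:

```shell
linecnt=0
while read -r line; do
    linecnt=$((linecnt + 1))
done < <(printf 'a\nb\nc\n')    # any command can stand in for printf

echo "total number of lines: $linecnt"
```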

A portable and common work-around is to redirect the input of the read command using exec:

    linecnt=0
    exec < /etc/passwd    # redirect standard input from the file /etc/passwd
    while read line       # "read" gets its input from the file /etc/passwd
    do
        linecnt=`expr $linecnt + 1`
    done
    echo "total number of lines: $linecnt"

This works as expected, and prints a line count for the file /etc/passwd. But the input is redirected from that file permanently. What if we need to read the original standard input sometime later again? In that case we have to save a copy of the original standard input file descriptor, which we later can restore:

    exec 3<&0    # save original standard input file descriptor "0" as FD "3"
    exec 0</etc/passwd    # redirect standard input from the file /etc/passwd

    linecnt=0
    while read line       # "read" gets its input from the file /etc/passwd
    do
        linecnt=`expr $linecnt + 1`
    done

    exec 0<&3   # restore saved standard input (fd 0) from file descriptor "3"
    exec 3<&-   # close the no longer needed file descriptor "3"

    echo "total number of lines: $linecnt"

Subsequent exec commands can be combined into one line, which is interpreted left-to-right:

    exec 3<&0
    exec 0</etc/passwd
    _...read redirected standard input..._
    exec 0<&3
    exec 3<&-

is equivalent to

    exec 3<&0 0</etc/passwd
    _...read redirected standard input..._
    exec 0<&3 3<&-

Anchor(faq25)

25. How can I access positional parameters after $9?

Use ${10} instead of $10. This works for ["BASH"] and KornShell, but not for older BourneShell implementations. Another way to access arbitrary positional parameters after $9 is to use for, e.g. to get the last parameter:

    for last
    do
        : # nothing
    done

    echo "last argument is: $last"

To get an argument by number, we can use a counter:

    n=12        # This is the number of the argument we are interested in
    i=1
    for arg
    do
        if [ $i -eq $n ]
        then
            argn=$arg
            break
        fi
        i=`expr $i + 1`
    done
    echo "argument number $n is: $argn"

This has the advantage of not "consuming" the arguments. If this is no problem, the shift command discards the first positional arguments:

    shift 11
    echo "the 12th argument is: $1"
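
In Bash specifically, indirect expansion offers another way to fetch an argument by number without a loop and without consuming anything. The following is a sketch only; nth_arg is a hypothetical helper name, and ${!n} is a Bash extension that should not be expected in other shells:

```shell
# nth_arg N ARGS...: print the N-th of the remaining arguments.
nth_arg() {
    local n=$1
    shift
    # ${!n} expands to the positional parameter whose number is stored in n.
    printf '%s\n' "${!n}"
}

nth_arg 12 a b c d e f g h i j k l m
```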

Although direct access to any positional argument is possible this way, it's hardly needed. The common way is to use getopts to process command line options (e.g. "-l", or "-o filename"), and then use either for or while to process all arguments in turn. An explanation of how to process command line arguments is available here: http://www.shelldorado.com/goodcoding/cmdargs.html

Anchor(faq26)

26. How can I randomize (shuffle) the order of lines in a file?

    randomize(){
        while read l ; do echo "0$RANDOM $l" ; done |
        sort -n |
        cut -d" " -f2-
    }

Note: the leading 0 makes sure the pipeline doesn't break if the shell doesn't support $RANDOM, which is supported by ["BASH"], KornShell and KornShell93, but not by BourneShell; the ["POSIX"] shell standard does not require it either.

The same idea (printing random numbers in front of a line, and sorting the lines on that column) using other programs:

    awk '
        BEGIN { srand() }
        { print rand() "\t" $0 }
    ' |
    sort -n |    # Sort numerically on first (random number) column
    cut -f2-     # Remove sorting column

This is faster than the previous solution, but will not work for very old AWK implementations (try "nawk", or "gawk", if available).

A related question we frequently see is, "How can I print a random line from a file?" The problem here is that you need to know in advance how many lines the file contains. Lacking that knowledge, you have to read the entire file through once just to count them -- or, you have to suck the entire file into memory. Let's explore both of these approaches.

   n=$(wc -l < "$file")        # Count number of lines.
   r=$((RANDOM % n + 1))       # Random number from 1..n.
   sed -n "$r{p;q;}" "$file"   # Print the r'th line.

(These examples use the answer from [#faq11 FAQ 11] to print the n'th line.) The first one's pretty straightforward -- we use wc to count the lines, choose a random number, and then use sed to print the line. If we already happened to know how many lines were in the file, we could skip the wc command, and this would be a very efficient approach.
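
The wc/sed approach can be wrapped in a small function. This is a sketch under a few assumptions: the name random_line is made up here, $RANDOM must be supported by the shell, and because $RANDOM is at most 32767, lines beyond that number would never be chosen.

```shell
# random_line FILE: print one randomly chosen line of FILE (hypothetical helper)
random_line() {
    local file=$1 n r
    n=$(wc -l < "$file") || return 1
    n=$((n))                      # strip padding some wc implementations add
    [ "$n" -gt 0 ] || return 1    # empty file: nothing to choose from
    r=$((RANDOM % n + 1))         # random number from 1..n
    sed -n "${r}{p;q;}" "$file"   # print the r'th line, then stop reading
}

demo=$(mktemp) || exit 1
printf '%s\n' alpha beta gamma > "$demo"
random_line "$demo"
rm -f "$demo"
```

The guard for an empty file matters because RANDOM % 0 would be a division by zero.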

The next example sucks the entire file into memory. This approach saves time reopening the file, but obviously uses more memory.

   oIFS=$IFS IFS=$'\n' lines=($(<"$file")) IFS=$oIFS
   n=${#lines[@]}
   r=$((RANDOM % n))
   echo "${lines[r]}"

Note that we don't add 1 to the random number in this example, because the array of lines is indexed counting from 0.

Also, some people want to choose a random file from a directory (for a signature on an e-mail, or to choose a random song to play, or a random image to display, etc.). A similar technique can be used:

    files=(*.ogg)               # Or *.gif, or *
    n=${#files[@]}              # For aesthetics
    xmms "${files[RANDOM % n]}" # Choose a random element

Anchor(faq27)

27. How can two processes communicate using named pipes (fifos)?

NamedPipes, also known as FIFOs ("First In First Out"), are well suited for inter-process communication. The advantage over using files as a means of communication is that processes are synchronized by pipes: a process writing to a pipe blocks if there is no reader, and a process reading from a pipe blocks if there is no writer.

Here is a small example of a server process communicating with a client process. The server sends commands to the client, and the client acknowledges each command:

Server

    # server - communication example

    # Create a FIFO. Some systems don't have a "mkfifo" command, but use
    # "mknod pipe p" instead

    mkfifo pipe

    while sleep 1
    do
        echo "server: sending GO to client"

        # The following command will cause this process to block (wait)
        # until another process reads from the pipe
        echo GO > pipe

        # A client read the string! Now wait for its answer. The "read"
        # command again will block until the client wrote something
        read answer < pipe

        # The client answered!
        echo "server: got answer: $answer"
    done

Client

    # client

    # We cannot start working until the server has created the pipe...
    until [ -p pipe ]
    do
        sleep 1;    # wait for server to create pipe
    done

    # Now communicate...

    while sleep 1
    do
        echo "client: waiting for data"

        # Wait until the server sends us one line of data:
        read data < pipe

        # Received one line!
        echo "client: read <$data>, answering"

        # Now acknowledge that we got the data. This command
        # again will block until the server read it.
        echo ACK > pipe
    done

Write both examples to files server and client respectively, and start them concurrently to see it working:

    $ chmod +x server client
    $ ./server & ./client &
    server: sending GO to client
    client: waiting for data
    client: read <GO>, answering
    server: got answer: ACK
    server: sending GO to client
    client: waiting for data
    client: read <GO>, answering
    server: got answer: ACK
    server: sending GO to client
    client: waiting for data
    [...]
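
For a quick experiment that cleans up after itself, the same blocking behaviour can be seen inside a single script. This is a minimal sketch, assuming mktemp -d and mkfifo are available; the function name fifo_demo is invented for the illustration.

```shell
# fifo_demo: pass one line through a FIFO from a background writer to a reader
fifo_demo() {
    local dir fifo reply
    dir=$(mktemp -d) || return 1
    fifo=$dir/pipe
    mkfifo "$fifo" || { rm -rf "$dir"; return 1; }

    # Writer in the background: blocks until a reader opens the pipe
    echo GO > "$fifo" &

    # Reader: blocks until the writer has opened the pipe, then reads one line
    read -r reply < "$fifo"

    wait                # collect the background writer
    rm -rf "$dir"       # remove the FIFO and its directory
    echo "received: $reply"
}

fifo_demo
```

Creating the FIFO in a private temporary directory avoids clashing with a stale "pipe" left behind by an earlier run.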

Anchor(faq28)

28. How do I determine the location of my script? I want to read some config files from the same place.

This is a complex question because there's no single right answer to it. Even worse: it's not possible to find the location reliably in 100% of all cases. All ways of finding a script's location depend on the name of the script, as seen in the predefined variable $0. But providing the script name in $0 is only a (very common) convention, not a requirement.

A tempting answer is "in some shells, $0 is always an absolute path, even if you invoke the script using a relative path, or no path at all". But this isn't reliable across shells; some of them (including ["BASH"]) return the actual command typed in by the user instead of the fully qualified path. In those cases, if all you want is the fully qualified version of $0, you can use something like this (["BASH"] and KornShell, non-Bourne):

  [[ $0 = /* ]] && echo "$0" || echo "$PWD/$0"

Or the BourneShell version:

  case $0 in /*) echo "$0";; *) echo "`pwd`/$0";; esac

However, this approach has some major drawbacks. The most important is that the script name (as seen in $0) may not be relative to the current working directory, but relative to a directory from the program search path $PATH (this is often seen with KornShell).

Another drawback is that there is really no guarantee that your script is still in the same place it was when it first started executing. Suppose your script is loaded from a temporary file which is then unlinked immediately... your script might not even exist on disk any more! The script could also have been moved to a different location while it was executing. Or (and this is most likely by far...) there might be multiple links to the script from multiple locations, one of them being a simple symlink from a common PATH directory like /usr/local/bin, which is how it's being invoked. Your script might be in /opt/foobar/bin/script but the naive approach of reading $0 won't tell you that.

(For a more general discussion of the Unix file system and how symbolic links affect your ability to know where you are at any given moment, see [http://www.cs.bell-labs.com/sys/doc/lexnames.html this Plan 9 paper].)

So if the name in $0 is a relative one, i.e. does not start with '/', we can still try to search the script like the shell would have done: in all directories from $PATH.

The following script shows how this could be done:

    myname=$0
    if [ -s "$myname" ] && [ -x "$myname" ]
    then                   # $myname is already a valid file name
        mypath=$myname
    else
        case "$myname" in
        /*) exit 1;;             # absolute path - do not search PATH
        *)
            # Search all directories from the PATH variable. Take
            # care to interpret leading and trailing ":" as meaning
            # the current directory; the same is true for "::" within
            # the PATH.

            for dir in `echo "$PATH" | sed 's/^:/.:/g;s/::/:.:/g;s/:$/:./;s/:/ /g'`
            do
                [ -f "$dir/$myname" ] || continue # no file
                [ -x "$dir/$myname" ] || continue # not executable
                mypath=$dir/$myname
                break           # only return first matching file
            done
            ;;
        esac
    fi

    if [ -f "$mypath" ]
    then
        : # echo >&2 "DEBUG: mypath=<$mypath>"
    else
        echo >&2 "cannot find full path name: $myname"
        exit 1
    fi

    echo >&2 "path of this script: $mypath"

Note that $mypath is not necessarily an absolute path name. It still can contain relative parts like ../bin/myscript.

Generally storing data files in the same directory as their scripts is a bad practice. The Unix file system layout assumes that files in one place (e.g. /bin) are executable programs, while files in another place (e.g. /etc) are data files. (Let's ignore legacy Unix systems with programs in /etc for the moment, shall we....)

It really makes the most sense to keep your script's configuration in a single, static location such as $SCRIPTROOT/etc/foobar.conf. If you need to define multiple configuration files, then you can have a directory (say, /var/lib/foobar or /usr/local/lib/foobar), and read that directory's location from a variable in /etc/foobar.conf. If you don't even want that much to be hard-coded, you could pass the location of foobar.conf as a parameter to the script. If you need the script to assume certain defaults in the absence of /etc/foobar.conf, you can put defaults in the script itself, and/or fall back to something like $HOME/.foobar.conf if /etc/foobar.conf is missing. (This depends on what your script does. In some cases, it may make more sense to abort gracefully.)
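
The fallback chain described above might be sketched like this. The function name find_config and the FOOBAR_CONF environment variable are hypothetical, chosen only for the illustration; the file names are the ones used in the text.

```shell
# find_config CANDIDATE...: print the first candidate that is a readable file
find_config() {
    local candidate
    for candidate in "$@"; do
        if [ -n "$candidate" ] && [ -f "$candidate" ] && [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Explicit location first, then the system-wide file, then the per-user file
conf=$(find_config "${FOOBAR_CONF:-}" /etc/foobar.conf "${HOME:-}/.foobar.conf") || conf=

if [ -n "$conf" ]; then
    echo "using configuration file: $conf"
else
    echo "no configuration file found, using built-in defaults"
fi
```

Whether a missing configuration file should mean "use defaults" or "abort" is a decision for the individual script, as noted above.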

Anchor(faq29)

29. How can I display the value of a symbolic link on standard output?

The external command readlink can be used to display the value of a symbolic link.

    $ readlink /bin/sh
    bash

You can also use GNU find's %l directive, which is especially useful if you need to resolve links in batches:

    $ find /bin/ -type l -printf '%p points to %l\n'
    /bin/sh points to bash
    /bin/bunzip2 points to bzip2
    ...

If your system lacks readlink, you can use a function like this one:

    readlink() {
        local path=$1 ll

        if [ -L "$path" ]; then
            ll="$(LC_ALL=C ls -l "$path" 2> /dev/null)" &&
            echo "${ll/* -> }"
        else
            return 1
        fi
    }

Anchor(faq30)

30. How can I rename all my .foo files to .bar?

Some GNU/Linux distributions have a rename command, which you can use for this purpose; however, the syntax differs from one distribution to the next, so it's not a portable answer.

You can do it in POSIX shells like this:

    for f in *.foo; do mv "$f" "${f%.foo}.bar"; done

This invokes the external command mv once for each file, so it may not be as efficient as some of the rename implementations.

If you want to do it recursively, then it becomes much more challenging. This example works (in ["BASH"]) as long as no files have newlines in their names:

    find . -name '*.foo' -print | while IFS=$'\n' read -r f; do
      mv "$f" "${f%.foo}.bar"
    done
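
If newlines in file names are a concern, one alternative is to let find hand the names to a child shell as arguments, so they never pass through a pipe. This is a sketch, not the only way to do it; the function name rename_foo_to_bar is invented here, and it assumes a find that supports the "-exec ... {} +" form.

```shell
# rename_foo_to_bar [DIR]: recursively rename *.foo to *.bar below DIR (default ".")
rename_foo_to_bar() {
    find "${1:-.}" -type f -name '*.foo' -exec sh -c '
        # Each file name arrives as a separate argument, whatever it contains
        for f do
            mv -- "$f" "${f%.foo}.bar"
        done
    ' sh {} +
}

# Usage: rename_foo_to_bar /some/directory
```

The extra "sh" after the script text becomes $0 of the child shell, so the file names start at $1.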

Another common form of this question is "How do I rename all my MP3 files so that they have underscores instead of spaces?" You can use this:

    for f in *\ *.mp3; do mv "$f" "${f// /_}"; done

Anchor(faq31)

31. What is the difference between the old and new test commands ([ and [[)?

[ ("test" command) and [[ ("new test" command) are both used to evaluate expressions. Some examples:

    if [ -z "$variable" ]
    then
        echo "variable is empty!"
    fi

    if [ ! -f "$filename" ]
    then
        echo "not a valid, existing file name: $filename"
    fi

and

    if [[ ! -e $file ]]
    then
        echo "directory entry does not exist: $file"
    fi

    if [[ $file0 -nt $file1 ]]
    then
        echo "file $file0 is newer than $file1"
    fi

To cut a long story short: [ implements the old, portable syntax of the command. Although all modern shells have built-in implementations, there usually still is an external executable of that name, e.g. /bin/[. [[ is a new improved version of it, which is a keyword, not a program. This has beneficial effects on the ease of use, see below. [[ is understood by KornShell, ["BASH"] (e.g. 2.03) and KornShell93, but not by the older BourneShell, and it is not part of the ["POSIX"] shell specification.

Although [ and [[ have much in common, and share many expression operators like "-f", "-s", "-n", "-z", there are some notable differences. Here is a comparison list:

| Feature | new test `[[` | old test `[` | Example |
|---|---|---|---|
| string comparison | > | (not available) | - |
| | < | (not available) | - |
| | == (or =) | = | - |
| | != | != | - |
| expression grouping | && | -a | `[[ -n $var && -f $var ]] && echo "$var is a file"` |
| | `\|\|` | -o | - |
| Pattern matching | = | (not available) | `[[ $name = a* ]] \|\| echo "name does not start with an 'a': $name"` |
| In-process regular expression matching | =~ | (not available) | `re='^Fri ... 13 '; [[ $(date) =~ $re ]] && echo "It's Friday the 13th!"` |

Special primitives that [[ is defined to have, but [ may be lacking (depending on the implementation):

| Description | Primitive | Example |
|---|---|---|
| entry (file or directory) exists | -e | `[[ -e $config ]] && echo "config file exists: $config"` |
| file is newer/older than other file | -nt / -ot | `[[ $file0 -nt $file1 ]] && echo "$file0 is newer than $file1"` |
| two files are the same | -ef | `[[ $input -ef $output ]] && { echo "will not overwrite input file: $input"; exit 1; }` |
| negation | ! | - |

But there are more subtle differences.

No field splitting will be done for [[ (and therefore many arguments need not be quoted):
```
 file="file name"
 [[ -f $file ]] && echo "$file is a file"
```
will work even though $file is not quoted and contains whitespace. With [ the variable needs to be quoted:
```
 file="file name"
 [ -f "$file" ] && echo "$file is a file"
```
This makes [[ easier to use and less error prone.
No file name generation will be done for [[. Therefore the following line tries to match the contents of the variable $path with the pattern /*
```
 [[ $path = /* ]] && echo "\$path starts with a forward slash /: $path"
```
The next command most likely will result in an error, because /* is subject to file name generation:
```
 [ $path = /* ] && echo "this does not work"
```
[[ is strictly used for strings and files. If you want to compare numbers, use ArithmethicExpression ((expression)), e.g.
```
 i=0
 while ((i<10))
 do
    echo $i
    ((i=$i+1))
 done
```

When should the new test command [[ be used, and when the old one [? If portability to the BourneShell is a concern, the old syntax should be used. If on the other hand the script requires ["BASH"] or KornShell, the new syntax could be preferable.
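
The differences listed above can be tried out with a few lines of ["BASH"]. This is only an illustrative sketch; the function name is_absolute and the sample values are invented for it.

```shell
# is_absolute PATH: succeed if PATH starts with a slash.
# Inside [[ the pattern /* is matched against the string, not expanded to file names.
is_absolute() {
    [[ $1 = /* ]]
}

dir="/tmp/my dir"

if is_absolute "$dir"; then
    echo "absolute path: $dir"
fi

# No field splitting inside [[, so the unquoted variable is still one argument
if [[ -n $dir ]]; then
    echo "[[ works without quotes"
fi

# The old test command needs the quotes
if [ -n "$dir" ]; then
    echo "[ works with quotes"
fi

# Numbers are compared in an arithmetic expression
i=0
while ((i < 3)); do
    echo "i=$i"
    i=$((i + 1))
done
```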

Anchor(faq32)

32. How can I redirect the output of 'time' to a variable or file?

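'time' needs special care because in ["BASH"] it is a keyword rather than an ordinary command: it times the whole pipeline and writes its report to the shell's own standard error, so a redirection written after the timed command applies only to that command. To capture the report, wrap the timed command in a new shell, a subshell or a command group, and redirect that: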

File Redirection

     bash -c "time ls" > /path/to/foo 2>&1
     ( time ls ) > /path/to/foo 2>&1
     { time ls; } > /path/to/foo 2>&1

Variable Redirection

     foo=$( bash -c "time ls" 2>&1 )
     foo=$( ( time ls ) 2>&1 )
     foo=$( { time ls; } 2>&1 )

Note: 'bash -c' starts a new shell process and ( ) creates a subshell; { } does neither. Do with that as you wish.
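
If only the elapsed time is wanted, and not the output of the timed command, the { } form can be combined with the TIMEFORMAT variable of ["BASH"]. This is a sketch under that assumption; the function name elapsed_seconds is made up for the example.

```shell
# elapsed_seconds COMMAND [ARGS...]: print the wall-clock seconds COMMAND took
elapsed_seconds() {
    local TIMEFORMAT=%R          # %R = elapsed real time in seconds
    # The command's own output is discarded; the report from "time" goes to
    # the stderr of the { } group, which is redirected to stdout here.
    { time "$@" >/dev/null 2>&1; } 2>&1
}

t=$(elapsed_seconds sleep 1)
echo "sleep 1 took $t seconds"
```

The decimal separator in the result depends on the locale, so treat it as text unless you control LC_NUMERIC.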

Anchor(faq33)

33. How can I find a process id for a process given its name?

Usually a process is referred to using its process id (PID), and the ps command can display the information for any process given its process id, e.g.

    $ echo $$         # my process id
    21796
    $ ps -p 21796
    PID TTY          TIME CMD
    21796 pts/5    00:00:00 ksh

But frequently the process id for a process is not known, but only its name. Some operating systems, e.g. Solaris, BSD, and some versions of Linux have a dedicated command to search a process given its name, called "pgrep":

    $ pgrep init
    1

Often there is an even more specialized program available to not just find the process id of a process given its name, but also to send a signal to it:

    $ pkill myprocess

Some systems also provide pidof. It differs from pgrep in that multiple output process IDs are only space separated, not newline separated.

    $ pidof cron
    5392

If these programs are not available, a user can search the output of the ps(1) command using grep.

The major problem when grepping the ps output is that grep may match its own ps entry (try: ps aux | grep init). To make matters worse, this does not happen every time; the technical name for this is a "race condition". There are several ways to avoid this:

Using grep -v at the end

     ps aux | grep name | grep -v grep

will throw away all lines containing "grep" from the output. Disadvantage: You always get the exit status of the grep -v, so you can't e.g. check if a specific process exists.

Using grep -v in the middle

     ps aux | grep -v grep | grep name

This does exactly the same, except that the exit status of "grep name" is accessible and tells you whether "name" is a process in ps or not. It still has the disadvantage of starting an extra process (grep -v).

Using [] in grep

     ps aux | grep '[n]ame'

This spawns only the needed grep process. The trick is to use the [] character class (regular expressions). Putting only one character in a character class normally makes no sense at all, because [c] will always match just "c". In this case, it's the same: grep '[n]ame' searches for "name". But as grep's own process list entry is what you executed ("grep [n]ame") and not "grep name", it will not match itself. The quotes keep the shell from treating [n]ame as a file name pattern, which would matter if a file called "name" existed in the current directory.
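
These alternatives can be combined into one helper that prefers pgrep and falls back to ps. It is a rough sketch: the function name pids_by_name is invented, and the fallback assumes a ps that understands "-e -o pid= -o comm=" and prints the bare command name in the comm column, which not every system does.

```shell
# pids_by_name NAME: print the process IDs of processes called exactly NAME
pids_by_name() {
    if command -v pgrep >/dev/null 2>&1; then
        pgrep -x "$1"
    else
        # Comparing the command column in awk avoids the self-match problem,
        # because no grep process carrying the name is involved.
        ps -e -o pid= -o comm= |
        awk -v name="$1" '$2 == name { print $1; found = 1 } END { exit !found }'
    fi
}

# Usage: pids_by_name cron
```

Either branch exits with a non-zero status when no process matches, so the function can be used directly in an if statement.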

===BEGIN greycat rant===

Most of the time when someone asks a question like this, it's because they want to manage a long-running daemon using primitive shell scripting techniques. Common variants are "How can I get the PID of my foobard process.... so I can start one if it's not already running" or "How can I get the PID of my foobard process... because I want to prevent the foobard script from running if foobard is already active." Both of these questions will lead to seriously flawed production systems.

If what you really want is to restart your daemon whenever it dies, just do this:

    while true; do
       mydaemon --in-the-foreground
    done

where --in-the-foreground is whatever switch, if any, you must give to the daemon to PREVENT IT from automatically backgrounding itself. (Often, -d does this and has the additional benefit of running the daemon with increased verbosity.) Self-daemonizing programs may or may not be the target of a future greycat rant....

If that's too simplistic, look into [http://cr.yp.to/daemontools.html daemontools] or [http://smarden.org/runit/ runit], which are programs for managing services.

If what you really want is to prevent multiple instances of your program from running, then the only sure way to do that is by using a lock. For details on doing this, see [#faq45 FAQ 45].

===END greycat rant===

Anchor(faq34)

34. Can I do a spinner in Bash?

Sure.

    i=1
    sp='/-\|'
    echo -n ' '
    while true
    do
        echo -en "\b${sp:i++%${#sp}:1}"
    done

You can also use \r instead of \b. You can use pretty much any character sequence you want as well. If you want it to slow down, put a sleep command inside the loop.
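
A spinner is usually shown while something else is running. Here is one possible sketch that spins until a background job ends; the function name spin_while is invented, the spinner is written to standard error, and "sleep 0.1" assumes a sleep that accepts fractions of a second.

```shell
# spin_while PID: show a spinner on stderr until process PID has terminated
spin_while() {
    local pid=$1 sp='/-\|' i=0
    printf ' ' >&2
    while kill -0 "$pid" 2>/dev/null; do
        printf '\b%s' "${sp:i++%${#sp}:1}" >&2
        sleep 0.1
    done
    printf '\b \b' >&2      # erase the spinner character
}

sleep 2 &                   # stands in for a long-running job
spin_while $!
echo "done"
```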

Anchor(faq35)

35. How can I handle command-line arguments to my script easily?

Well, that depends a great deal on what you want to do with them. Here's a general template that might help for the simple cases:

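    while [[ $1 == -* ]]; do
        case "$1" in
          -h|--help) show_help; exit 0;;
          -v) verbose=1; shift;;
          -f) output_file=$2; shift 2;;
          --) shift; break;;          # explicit end of options
          -*) echo >&2 "invalid option: $1"; exit 1;;   # avoid looping forever
        esac
    done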
    # Now all of the remaining arguments are the filenames which followed
    # the optional switches.  You can process those with "for i" or "$@".

For more complex/generalized cases, or if you want things like "-xvf" to be handled as three separate flags, you can use getopts or getopt. (Heiner, that's your cue....)
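
As a starting point for the getopts route, here is a small sketch that accepts the same -h, -v and -f switches, including bundled forms such as "-vf file". The function name parse_args and the variable names are only examples; the remaining arguments are stored in a ["BASH"] array.

```shell
# parse_args ARGS...: sets verbose, output_file and the array remaining
parse_args() {
    verbose=0
    output_file=
    local opt OPTIND=1
    # Leading ":" = silent mode, so errors can be reported in our own words
    while getopts ':hvf:' opt; do
        case $opt in
            h) echo "usage: myscript [-h] [-v] [-f file] [file...]"; return 0;;
            v) verbose=1;;
            f) output_file=$OPTARG;;
            :) echo >&2 "option -$OPTARG requires an argument"; return 2;;
            \?) echo >&2 "invalid option: -$OPTARG"; return 2;;
        esac
    done
    shift $((OPTIND - 1))       # drop the options that were processed
    remaining=("$@")
}

parse_args -vf out.txt one "two words"
echo "verbose=$verbose output_file=$output_file files=${#remaining[@]}"
```

Note that getopts only knows single-letter options; long options such as --help still need a hand-written loop like the template above.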

Anchor(faq36)

36. How can I get all lines that are: in both of two files (set intersection) or in only one of two files (set subtraction).

Use the comm(1) command.

  # intersection of file1 and file2
  comm -12 <(sort file1) <(sort file2)
  # subtraction of file1 from file2
  comm -13 <(sort file1) <(sort file2)

Read the comm(1) manpage for details.

If for some reason you lack the core comm(1) program, you can use these other methods:

An amazingly simple and fast implementation, which took just 20 seconds to match a 30k line file against a 400k line file for me.

Note that it probably only works with GNU grep, and that the file specified with -f will be loaded into RAM, so it doesn't scale for very large files.

It has grep read one of the sets as a pattern list from a file (-f), and interpret the patterns as plain strings, not regexps (-F), matching only whole lines (-x).

  # intersection of file1 and file2
  grep -xF -f file1 file2
  # subtraction of file1 from file2
  grep -vxF -f file1 file2

An implementation using sort and uniq:

  # intersection of file1 and file2
  # (assuming neither file1 nor file2 contains repeated lines)
  sort file1 file2 | uniq -d
  # file1-file2 (Subtraction)
  sort file1 file2 file2 | uniq -u
  # same way for file2 - file1, change last file2 to file1
  sort file1 file2 file1 | uniq -u

Another implementation of subtraction:

  cat file1 file1 file2 | sort | uniq -c |
  awk '{ if ($1 == 2) { $1 = ""; print; } }'

This may introduce an extra space at the start of the line; if that's a problem, just strip it away.

Also, this approach assumes that neither file1 nor file2 has any duplicates in it.

Finally, it sorts the output for you. If that's a problem, then you'll have to abandon this approach altogether. Perhaps you could use awk's associative arrays (or perl's hashes or tcl's arrays) instead.
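
The awk route could look like the following sketch, which keeps the lines in the order in which they appear in the second file. The function names intersect and subtract are made up here. One known limitation of the NR == FNR idiom: if the first file is empty, awk cannot tell the two files apart and subtract prints nothing.

```shell
# intersect FILE1 FILE2: lines of FILE2 that also occur in FILE1
intersect() {
    awk 'NR == FNR { seen[$0] = 1; next }   # first file: remember every line
         $0 in seen                         # second file: print known lines
        ' "$1" "$2"
}

# subtract FILE1 FILE2: lines of FILE2 that do not occur in FILE1
subtract() {
    awk 'NR == FNR { seen[$0] = 1; next }
         !($0 in seen)
        ' "$1" "$2"
}

# Usage: intersect file1 file2
#        subtract file1 file2
```

Like the grep -f variant, this holds the whole first file in memory.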

Anchor(faq37)

37. How can I print text in various colors?

Do not hard-code ANSI color escape sequences in your program! The tput command lets you interact with the terminal database in a sane way.

  tput setaf 1; echo this is red
  tput setaf 2; echo this is green
  tput setaf 0; echo now we are back in black

tput reads the terminfo database which contains all the escape codes necessary for interacting with your terminal, as defined by the $TERM variable. For more details, see the terminfo(5) man page.

If you don't know in advance what your user's terminal's default text color is, you can use tput sgr0 to reset the colors to their default settings. This also removes boldface (tput bold), etc.
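
Scripts are often run with their output going to a file or a pipe, where color codes are just noise. One way to handle that, sketched here with invented names (init_colors, red, reset), is to fill the color variables only when standard output is a terminal and tput succeeds:

```shell
# init_colors: set $red and $reset, or leave them empty when color is not usable
init_colors() {
    if [ -t 1 ] &&
       red=$(tput setaf 1 2>/dev/null) &&
       reset=$(tput sgr0 2>/dev/null)
    then
        :       # both escape sequences were obtained from terminfo
    else
        red=
        reset=
    fi
}

init_colors
printf '%s\n' "${red}this is red on a capable terminal${reset}"
```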

Anchor(faq38)

38. How do Unix file permissions work?

See ["Permissions"].

Anchor(faq39)

39. What are all the dot-files that bash reads?

See DotFiles.

Anchor(faq40)

40. How do I use dialog to get input from the user?

  foo=$(dialog --inputbox "text goes here" 8 40 2>&1 >/dev/tty)
  echo "The user typed '$foo'"

The redirection here is a bit tricky.

The foo=$(command) is set up first, so the standard output of the command is being captured by bash.
Inside the command, the 2>&1 causes standard error to be sent to where standard out is going -- in other words, stderr will now be captured.
>/dev/tty sends standard output to the terminal, so the dialog box will be seen by the user. Standard error will still be captured, however.

Another common dialog(1)-related question is how to dynamically generate a dialog command that has items which must be quoted (either because they're empty strings, or because they contain internal white space). One can use eval for that purpose, but the cleanest way to achieve this goal is to use an array.

  unset m; i=0
  words=(apple banana cherry "dog droppings")
  for w in "${words[@]}"; do
    m[i++]=$w; m[i++]=""
  done
  dialog --menu "Which one?" 12 70 9 "${m[@]}"

In the previous example, the for loop that populates the m array could have been reading from a pipeline, a file, etc.

Recall that the construction "${m[@]}" expands to the entire contents of an array, but with each element implicitly quoted. It's analogous to the "$@" construct for handling positional parameters. For more details, see [#faq50 FAQ50] below.

Here's another example, using filenames:

    files=(*.mp3)       # These may contain spaces, apostrophes, etc.
    cmd=(dialog --menu "Select one:" 22 76 16); n=6
    i=0
    for f in "${files[@]}"; do
        cmd[n++]=$((i++)); cmd[n++]="$f"
    done
    choice=$("${cmd[@]}" 2>&1 >/dev/tty)

The user's choice will be stored in the choice variable, as an integer, which can in turn be used as an index into the files array.

A separate but useful function of dialog is to track progress of a process that produces output. Below is an example that uses dialog to track processes writing to a log file. In the dialog window, there is a tailbox where output is stored, and a msgbox with a clickable Quit. Clicking quit will cause trap to execute, removing the tempfile, and destroying the tail process.

  # You cannot tail a nonexistent file, so always ensure it exists first!
  rm -f dialog-tail.log; echo Initialize log >> dialog-tail.log
  date >> dialog-tail.log
  tempfile=`tempfile 2>/dev/null` || tempfile=/tmp/test$$
  trap "rm -f $tempfile" 0 1 2 5 15
  dialog --title "TAIL BOXES" \
        --begin 10 10 --tailboxbg dialog-tail.log 8 58 \
        --and-widget \
        --begin 3 10 --msgbox "Press OK " 5 30 \
        2>$tempfile &
  mypid=$!;
  for i in 1 2 3;  do echo $i >> dialog-tail.log; sleep 1; done
  echo Done. >> dialog-tail.log
  wait $mypid;

Anchor(faq41)

41. How do I determine whether a variable contains a substring?

  if [[ $foo = *bar* ]]

The above works in virtually all versions of Bash. Bash version 3 also allows regular expressions:

  if [[ $foo =~ ab*c ]]   # bash 3, matches abbbbcde, or ac, etc.

If you are programming in the BourneShell instead of Bash, there is a more portable (but less pretty) syntax:

  case "$foo" in
    *bar*) .... ;;
  esac

This should allow you to match variables against globbing-style patterns. If you need a portable way to match variables against regular expressions, use grep or egrep.

  if echo "$foo" | egrep some-regex >/dev/null; then ...
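
The case form is easy to wrap in a function for reuse. In this sketch (the name contains is invented) the substring is quoted inside the pattern, so characters like * in it are taken literally:

```shell
# contains STRING SUBSTRING: succeed if SUBSTRING occurs somewhere in STRING
contains() {
    case $1 in
        *"$2"*) return 0;;
        *)      return 1;;
    esac
}

foo="some foobar text"
if contains "$foo" "bar"; then
    echo "'$foo' contains 'bar'"
fi
```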

Anchor(faq42)

42. How can I find out if a process is still running?

The kill command is used to send signals to a running process. As a convenience function, the signal "0", which does not exist, can be used to find out if a process is still running:

 myprog &          # Start program in the background
 daemonpid=$!      # ...and save its process id

 while sleep 60
 do
     if kill -0 $daemonpid       # Is the process still alive?
     then
         echo >&2 "OK - process is still running"
     else
         echo >&2 "ERROR - process $daemonpid is no longer running!"
         break
     fi
 done
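
For use in several places, the check can be put into a small function. A sketch (the name is_running is made up); keep in mind that kill -0 also fails when the process exists but belongs to another user, so a failure here means "not running, or not ours".

```shell
# is_running PID: succeed if a signal could be sent to process PID
is_running() {
    kill -0 "$1" 2>/dev/null
}

sleep 2 &                   # stands in for the real program
pid=$!

if is_running "$pid"; then
    echo "process $pid is running"
fi

wait "$pid"                 # wait for it to end (and be reaped by the shell)

if ! is_running "$pid"; then
    echo "process $pid is no longer running"
fi
```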

Anchor(faq43)

43. How can I use array variables?

BASH and KornShell already have one-dimensional arrays indexed by a numerical expression, e.g.

 host[0]="micky"
 host[1]="minnie"
 host[2]="goofy"
 i=0
 while (($i < ${#host[@]} ))
 do
     echo "host number $i is ${host[i++]}"
 done

The awkward expression ${#host[@]} returns the number of elements for the array host.

It's possible to assign multiple values to an array at once, but the syntax differs from BASH to KornShell:

 # BASH
 array=(one two three four)
 # KornShell
 set -A array -- one two three four
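
A few more everyday array operations in ["BASH"], as a brief sketch with made-up sample values: elements containing spaces, appending, and looping over values or indices.

```shell
hosts=("micky" "minnie" "goofy the dog")    # an element may contain spaces

# Append at the next free index (the array has no gaps here)
hosts[${#hosts[@]}]="pluto"

# Loop over the values; the quotes keep "goofy the dog" in one piece
for h in "${hosts[@]}"; do
    echo "host: $h"
done

# Loop over the indices (BASH 3.0 or later)
for i in "${!hosts[@]}"; do
    echo "host number $i is ${hosts[i]}"
done

echo "number of hosts: ${#hosts[@]}"
```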

Anchor(faq44)

44. How can I use associative arrays or variable variables?

Sometimes it's convenient to have associative arrays, arrays indexed by a string. Perl calls them "hashes". KornShell93 already supports this kind of array:

 # KornShell93 script - does not work with BASH
 typeset -A homedir             # Declare KornShell93 associative array
 homedir[jim]=/home/jim
 homedir[silvia]=/home/silvia
 homedir[alex]=/home/alex
 
 for user in ${!homedir[@]}     # Enumerate all indices (user names)
 do
     echo "Home directory of user $user is ${homedir[$user]}"
 done

BASH up to and including version 3.x does not support them (associative arrays were added in BASH 4.0, using declare -A). However, we could simulate this kind of array by dynamically creating variables like in the following example:

 for user in jim silvia alex
 do
     eval homedir_$user=/home/$user
 done

This creates the variables

 homedir_jim=/home/jim
 homedir_silvia=/home/silvia
 homedir_alex=/home/alex

with the corresponding content. Note the use of the eval command, which interprets a command line not just one time like the shell usually does, but twice. In the first step, the shell uses the input homedir_$user=/home/$user to create a new line homedir_jim=/home/jim. In the second step, caused by eval, this variable assignment is executed, actually creating the variable.

Print the variables using

 for user in jim silvia alex
 do
     varname=homedir_$user              # e.g. "homedir_jim"
     eval varcontent='$'$varname        # e.g. "/home/jim"
     echo "home directory of $user is $varcontent"
 done

The eval line needs some explanation. In a first step the shell expands the variable $varname:

```
 eval varcontent='$'$varname
```

becomes

```
 eval varcontent=$homedir_jim
```

In a second step the eval re-evaluates the line, and converts this to

```
 varcontent=/home/jim
```

Before starting to use dynamically created variables, think again of a simpler approach. If it still seems to be the best thing to do, have a look at the following disadvantages:

it's hard to read and to maintain
the variable names must match the regular expression ^[a-zA-Z_][a-zA-Z_0-9]* , i.e. a variable name cannot contain arbitrary characters but only letters, digits, and underscores. In the example above we e.g. could not have processed the home directory of a user named hong-hu, because a dash '-' can be no valid part of a variable name.
Quoting is hard to get right. If a content (not variable name) string can contain whitespace characters, it's hard to quote it right to preserve it.

Here is the summary. "var" is a constant prefix, "$index" contains index string, "$content" is the string to store. Note that quoting is absolutely essential here. A missing backslash \ or a wrong type of quote (e.g. apostrophes '...' instead of quotation marks "...") can (and probably will) cause the examples to fail:

Set variables

  eval "var$index=\$content"    # index must only contain characters from [a-zA-Z0-9_]

Print variable content

  eval "echo \"var$index=\$var$index\""

Check if a variable is empty

  if eval "[ -z \"\$var$index\" ]"
  then echo "variable is empty: var$index"
  fi
  fi

You've seen the examples. Now maybe you can go a step back and consider using AWK associative arrays, or a multi-line environment variable instead of dynamically created variables.
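
If dynamically created variables are used anyway, it helps to hide the quoting in two small functions and to reject keys that are not valid in a variable name. The following is a sketch for ["BASH"] with invented names (valid_key, set_home, get_home); reading uses the ${!name} indirect expansion instead of a second eval.

```shell
# valid_key KEY: fail unless KEY consists only of letters, digits and underscores
valid_key() {
    case $1 in
        ''|*[!a-zA-Z0-9_]*) return 1;;
    esac
}

# set_home USER DIR: store DIR in the variable homedir_USER
set_home() {
    valid_key "$1" || return 1
    eval "homedir_$1=\$2"       # eval sees: homedir_jim=$2
}

# get_home USER: print the content of the variable homedir_USER
get_home() {
    valid_key "$1" || return 1
    local name=homedir_$1
    printf '%s\n' "${!name}"
}

set_home jim "/home/jim"
echo "home directory of jim is $(get_home jim)"
set_home hong-hu /home/hong-hu || echo "cannot store key: hong-hu"
```

Because the value is passed as \$2 rather than pasted into the eval string, it may contain spaces or quotes without breaking anything.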

Anchor(faq45)

45. How can I ensure that only one instance of a script is running at a time (mutual exclusion)?

We need some means of mutual exclusion. One easy way is to use a "lock": any number of processes can try to acquire the lock simultaneously, but only one of them will succeed.

How can we implement this using shell scripts? Some people suggest creating a lock file, and checking for its presence:

 # locking example -- WRONG

 lockfile=/tmp/myscript.lock
 if [ -f "$lockfile" ]
 then                      # lock is already held
     echo >&2 "cannot acquire lock, giving up: $lockfile"
     exit 0
 else                      # nobody owns the lock
     > "$lockfile"         # create the file
     #...continue script
 fi

This example does not work, because there is a time window between checking and creating the file. Assume two processes are running the code at the same time. Both check if the lockfile exists, and both get the result that it does not exist. Now both processes assume they have acquired the lock -- a disaster waiting to happen. We need an atomic check-and-create operation, and fortunately there is one: mkdir, the command to create a directory:

 # locking example -- CORRECT

 lockdir=/tmp/myscript.lock
 if mkdir "$lockdir"
 then    # directory did not exist, but was created successfully
     echo >&2 "successfully acquired lock: $lockdir"
     # continue script
 else
     echo >&2 "cannot acquire lock, giving up on $lockdir"
     exit 0
 fi

The advantage over using a lock file is, that even when two processes call mkdir at the same time, only one process can succeed at most. This atomicity of check-and-create is ensured at the operating system kernel level.

Note that we cannot use "mkdir -p" to automatically create missing path components: "mkdir -p" does not return an error if the directory exists already, but that's the feature we rely upon to ensure mutual exclusion.

Now let's spice up this example by automatically removing the lock when the script finishes:

 lockdir=/tmp/myscript.lock
 if mkdir "$lockdir"
 then
     echo >&2 "successfully acquired lock"
 
     # Remove lockdir when the script finishes, or when it receives a signal
     trap 'rm -rf "$lockdir"' 0    # remove directory when script finishes
     trap "exit 2" 1 2 3 15        # terminate script when receiving signal
 
     # Optionally create temporary files in this directory, because
     # they will be removed automatically:
     tmpfile=$lockdir/filelist
 
 else
     echo >&2 "cannot acquire lock, giving up on $lockdir"
     exit 1
 fi

This example provides mutual exclusion and cleans up after itself in the common cases. One disadvantage remains: a stale lock directory is left behind when the script is killed by a signal that is not caught (or that cannot be caught, such as signal 9, SIGKILL). Still, it is a good step towards reliable mutual exclusion.
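Giving up at the first failed mkdir is not always what a script wants. As a rough sketch (the helper name acquire_lock, its variable names and its default of five tries are made up for illustration and are not part of the answer above), the same mkdir test can be wrapped in a small retry loop:

```shell
# Hypothetical helper: try to acquire a mkdir-based lock, retrying a few times.
# Usage: acquire_lock DIR [TRIES]
acquire_lock() {
    lock_dir=$1
    lock_tries=${2:-5}
    while [ "$lock_tries" -gt 0 ]; do
        if mkdir "$lock_dir" 2>/dev/null; then
            return 0            # we created the directory, so we own the lock
        fi
        lock_tries=$((lock_tries - 1))
        # wait a second before the next attempt, but not after the last one
        [ "$lock_tries" -gt 0 ] && sleep 1
    done
    return 1                    # somebody else still holds the lock
}
```

A script would call it as "acquire_lock /tmp/myscript.lock || exit 1" and then install the traps shown above. The retry loop does nothing about stale locks; it only waits for a lock that is released normally.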

Instead of using mkdir we could also have used the program to create a symbolic link, ln -s.

Anchor(faq46)

46. I want to check to see whether a word is in a list (or an element is a member of a set).

Let's suppose you have your "list" stored as a big string of words, with spaces in between them. (That's the most common case when people are asking this one.) What you actually want to do is determine whether the string " foo " (note the spaces around it) appears in the list. But since your list may not have leading/trailing spaces, you have to add them as well. So, here's the most portable way to do it:

  if echo " $list " | grep " foo " >/dev/null; then ....

GNU grep has a -w option (match whole words only) which lets you avoid the spaces:

  if echo "$list" | grep -q -w "foo"; then ....

Finally, if you want to use Bash builtins, you can do it thus:

  if [[ " $list " = *\ foo\ * ]]; then ....

This is basically the same as the original grep -- we surround both the list and the word (foo) with spaces, and then do a simple text matching.
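The same surrounding-spaces idea also works without grep and without [[, using a plain case statement. The following is only a sketch; the function name in_list is invented here, and it assumes the list is separated by single spaces as described above:

```shell
# Hypothetical helper: succeed if word $1 appears in the space-separated list $2.
in_list() {
    case " $2 " in
        *" $1 "*) return 0 ;;   # found " word " somewhere in " list "
        *)        return 1 ;;
    esac
}

list="foo bar baz"
if in_list bar "$list"; then
    echo "bar is in the list"
fi
if ! in_list ba "$list"; then
    echo "ba is not in the list"
fi
```

Because the word is quoted inside the pattern, characters such as * in the word are matched literally.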

Anchor(faq47)

47. How can I redirect stderr to a pipe?

A pipe can only carry stdout of a program. To pipe stderr through it, you need to redirect stderr to the same destination as stdout. Optionally you can close stdout or redirect it to /dev/null to only get stderr. Some sample code:

# - 'myprog' is an example for a program that outputs both, stdout and
#   stderr
# - after the pipe I will just use a 'cat', of course you can put there
#   what you want

# version 1: redirect stderr towards the pipe while stdout survives (both come
# mixed)
myprog 2>&1 | cat                                                               
                                                                                
# version 2: redirect stderr towards the pipe without getting stdout (it's
# redirected to /dev/null)
myprog 2>&1 >/dev/null | cat
#Note that '>/dev/null' comes after '2>&1', otherwise the stderr will also be directed to /dev/null
                                                                                
# version 3: redirect stderr towards the pipe while the "original" stdout gets
# closed
myprog 2>&1 >&- | cat

Anchor(faq48)

48. Why should I never use eval?

"eval" is a common misspelling of "evil". The section dealing with spaces in file names used to include the following quote "helpful tool (which is probably not as safe as the \0 technique)", end quote.

    Syntax : nasty_find_all [path] [command] <maxdepth>

    #This code is evil and must never be used
    export IFS=" "
    [ -z "$3" ] && set -- "$1" "$2" 1
    FILES=`find "$1" -maxdepth "$3" -type f -printf "\"%p\" "`
    #warning, evilness
    eval FILES=($FILES)
    for ((I=0; I < ${#FILES[@]}; I++))
    do
        eval "$2 \"${FILES[I]}\""
    done
    unset IFS

This script is supposed to recursively search for files with newlines and/or spaces in them, arguing that find -print0 | xargs -0 was unsuitable for some purposes such as multiple commands. It was followed by an instructional description on all the lines involved, which we'll skip.

In its defense, it works:

$ ls -lR
.:
total 8
drwxr-xr-x  2 vidar users 4096 Nov 12 21:51 dir with spaces
-rwxr-xr-x  1 vidar users  248 Nov 12 21:50 nasty_find_all

./dir with spaces:
total 0
-rw-r--r--  1 vidar users 0 Nov 12 21:51 file?with newlines
$ ./nasty_find_all . echo 3
./nasty_find_all
./dir with spaces/file
with newlines
$

But consider this:

$ touch "\"); ls -l $'\x2F'; #"

You just created a file called "); ls -l $'\x2F'; #

Now FILES will contain ""); ls -l $'\x2F'; #. When we do eval FILES=($FILES), it becomes

FILES=(""); ls -l $'\x2F'; #"

Which becomes the two statements FILES=(""); and ls -l / . Congratulations, you just allowed execution of arbitrary commands.

$ touch "\"); ls -l $'\x2F'; #"
$ ./nasty_find_all . echo 3
total 1052
-rw-r--r--   1 root root 1018530 Apr  6  2005 System.map
drwxr-xr-x   2 root root    4096 Oct 26 22:05 bin
drwxr-xr-x   3 root root    4096 Oct 26 22:05 boot
drwxr-xr-x  17 root root   29500 Nov 12 20:52 dev
drwxr-xr-x  68 root root    4096 Nov 12 20:54 etc
drwxr-xr-x   9 root root    4096 Oct  5 11:37 home
drwxr-xr-x  10 root root    4096 Oct 26 22:05 lib
drwxr-xr-x   2 root root    4096 Nov  4 00:14 lost+found
drwxr-xr-x   6 root root    4096 Nov  4 18:22 mnt
drwxr-xr-x  11 root root    4096 Oct 26 22:05 opt
dr-xr-xr-x  82 root root       0 Nov  4 00:41 proc
drwx------  26 root root    4096 Oct 26 22:05 root
drwxr-xr-x   2 root root    4096 Nov  4 00:34 sbin
drwxr-xr-x   9 root root       0 Nov  4 00:41 sys
drwxrwxrwt   8 root root    4096 Nov 12 21:55 tmp
drwxr-xr-x  15 root root    4096 Oct 26 22:05 usr
drwxr-xr-x  13 root root    4096 Oct 26 22:05 var
./nasty_find_all
./dir with spaces/file
with newlines
./
$

It doesn't take much imagination to replace ls -l with rm -rf or worse.

One might think these circumstances are obscure, but one should not be fooled. All it takes is one malicious user or, perhaps more likely, one benign user who left the terminal unlocked when going to the bathroom, who wrote a funny PHP upload script that doesn't sanity-check file names, who made the same mistake as oneself in allowing arbitrary code execution (now, instead of being limited to the www-user, an attacker can use nasty_find_all to traverse chroot jails and/or gain additional privileges), or who uses an IRC or IM client that's too liberal in the filenames it accepts for file transfers or conversation logs.
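For comparison, here is a minimal sketch of the same job done without eval. The name safe_find_all is made up, and unlike the script above it accepts only a single command name (no embedded arguments), because it hands each file name to the command directly through find -exec:

```shell
# Hypothetical replacement for nasty_find_all: no eval, no word splitting.
# Usage: safe_find_all PATH COMMAND [MAXDEPTH]
safe_find_all() {
    # -maxdepth is a GNU/BSD find extension, just as in the script above.
    # find passes every file name to COMMAND as one argument, whatever
    # characters the name contains.
    find "$1" -maxdepth "${3:-1}" -type f -exec "$2" {} \;
}
```

With this version, the file called "); ls -l $'\x2F'; # is just another argument to the command; nothing in its name is ever parsed as shell code.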

Anchor(faq49)

49. How can I view periodic updates/appends to a file? (ex: growing log file)

tail -f will show you the growing log file. On some systems (e.g. OpenBSD), this will automatically track a rotated log file to the new file with the same name (which is usually what you want). To get the equivalent functionality on GNU systems, use tail --follow=name instead.

This is helpful if you need to view only the updates to the file after your last view.

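# Start by setting n=+1 (the leading + makes tail count from the top of the file)
   tail -n $n testfile; n="+$(( $(wc -l < testfile) + 1 ))"

Every invocation of this gives the update to the file from where we stopped last. If you know the line number from where you want to start, set n to that number with a leading +. (Lines appended between the tail and the wc calls are missed.)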

Anchor(faq50)

50. I'm trying to construct a command dynamically, but I can't figure out how to deal with quoted multi-word arguments.

Some people attempt to do things like this:

    # Non-working example
    args="-s 'The subject' $address"
    mail $args < $body

This fails because of word-splitting. When $args is evaluated, it becomes four words: 'The is the second word, and subject' is the third word.

What's needed is a way to maintain each word as a separate item, even if that word contains multiple spaces. Quotes won't do it, but an array will.

    # Working example
    args=(-s "The subject" "$address")
    mail "${args[@]}" < "$body"

Usually, this question arises when someone is trying to use dialog to construct a menu on the fly. For an example of how to do this properly, see [#faq40 FAQ #40] above.
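To see the difference without sending any mail, the two forms can be fed to a small function that only reports what it received. This is an illustrative sketch: count_args and the address are invented for the demonstration, and the array part needs bash (or another shell with arrays):

```shell
#!/usr/bin/env bash
# count_args is a hypothetical stand-in for mail: it only reports its arguments.
count_args() {
    echo "$# arguments"
    printf '  <%s>\n' "$@"
}

address="someone@example.com"      # made-up address

# String version: the quotes inside the string are ordinary characters,
# so the unquoted expansion is split at every space.
args_string="-s 'The subject' $address"
count_args $args_string            # reports 4 arguments

# Array version: each element stays one word.
args_array=(-s "The subject" "$address")
count_args "${args_array[@]}"      # reports 3 arguments
```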

Anchor(faq51)

51. I want history-search just like in tcsh. How can I bind it to the up and down keys?

Just add the following to /etc/inputrc or your ~/.inputrc

"\e[A":history-search-backward
"\e[B":history-search-forward

Anchor(faq52)

52. How do I convert a file in DOS format to UNIX format (remove CRLF line terminators)?

All these are from the sed one-liners page

sed 's/.$//' dosfile              # assumes that all lines end with CR/LF
sed 's/^M$//' dosfile             # in bash/tcsh, press Ctrl-V then Ctrl-M
sed 's/\x0D$//' dosfile           # \x0D is understood by GNU sed

Some distributions have a dos2unix command which can do this. In vim, you can use :set fileformat=unix and then write the file.
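If neither dos2unix nor a suitable sed is at hand, tr can do a cruder version of the same job. This is a sketch rather than one of the sed one-liners; the function name strip_cr is made up, and note that it removes every carriage return, including any in the middle of a line:

```shell
# Hypothetical filter: delete all carriage returns from standard input.
strip_cr() {
    tr -d '\r'
}

# Example: two DOS-style lines go in, two Unix-style lines come out.
printf 'first line\r\nsecond line\r\n' | strip_cr
```

Since it is a filter, a file would be converted with something like "strip_cr < dosfile > unixfile".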

Anchor(faq53)

53. I have a fancy prompt with colors, and now bash doesn't seem to know how wide my terminal is. Lines wrap around incorrectly.

You must put \[ and \] around any non-printing escape sequences in your prompt. Thus:

BLUE=$(tput setaf 4)
PURPLE=$(tput setaf 5)
BLACK=$(tput setaf 0)
PS1='\[$BLUE\]\h:\[$PURPLE\]\w\[$BLACK\]\$ '

Without the \[ \], bash will assume that the bytes which make up the color escape sequences take up space on the screen, so it will lose track of where the cursor actually is.

Anchor(faq54)

54. How can I tell whether a variable contains a valid number?

First, you have to define what you mean by "number". The most common case seems to be that, when people ask this, they actually mean "a non-negative integer, with no leading + sign".

if [[ $foo = *[^0-9]* ]]; then
   echo "'$foo' has a non-digit somewhere in it"
else
   echo "'$foo' is strictly numeric"
fi

This can be done in legacy Bourne shell as well, using case:

case "$foo" in
    *[^0-9]*) echo "'$foo' has a non-digit somewhere in it" ;;
    *) echo "'$foo' is strictly numeric" ;;
esac
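One detail of both tests above: an empty string contains no non-digit, so it is reported as "strictly numeric". If that is not wanted, the case version can reject it explicitly. A small sketch (the function name is_uint is invented here):

```shell
# Hypothetical helper: succeed only for a non-empty string of digits.
is_uint() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *)           return 0 ;;
    esac
}

for candidate in 42 007 "" -5 12a; do
    if is_uint "$candidate"; then
        echo "'$candidate' is a non-negative integer"
    else
        echo "'$candidate' is not"
    fi
done
```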

If what you actually mean is "a valid floating-point number" or something else more complex, then you might prefer to use a regular expression. Bash version 3 and above have regular expression support in the [[ command:

if [[ $foo =~ ^[-+]?[0-9]+\(\.[0-9]+\)?$ ]]; then
    echo "'$foo' looks rather like a number"
else
    echo "'$foo' doesn't look particularly numeric to me"
fi

If you don't have bash version 3, then you would use egrep:

if echo "$foo" | egrep '^[-+]?[0-9]+(\.[0-9]+)?$' >/dev/null; then
    echo "'$foo' might be a number"
else
    echo "'$foo' might not be a number"
fi

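Note that the parentheses in the egrep regular expression don't require backslashes in front of them, whereas the ones in the bash command above have them. That form was written for early bash 3 releases; from bash 3.2 on, a backslash-escaped parenthesis matches a literal parenthesis, so there the parentheses must be left unescaped (or the pattern stored in a variable and expanded unquoted).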
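One way to sidestep quoting questions inside [[ is to keep the regular expression in a variable and expand it unquoted on the right-hand side of =~. A minimal sketch, assuming a bash with =~ support; the names number_re and looks_numeric are made up for the example:

```shell
#!/usr/bin/env bash
# The pattern is an ordinary string here, so it needs no shell escaping
# beyond the single quotes.
number_re='^[-+]?[0-9]+(\.[0-9]+)?$'

# Hypothetical helper: succeed if $1 matches the pattern above.
looks_numeric() {
    [[ $1 =~ $number_re ]]
}

for candidate in 42 -3.14 +7 1. abc; do
    if looks_numeric "$candidate"; then
        echo "'$candidate' looks rather like a number"
    else
        echo "'$candidate' doesn't look particularly numeric"
    fi
done
```

The pattern is the same one used in the egrep example, so both tests accept and reject the same strings.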

Anchor(faq55)

55. Tell me all about 2>&1 -- what's the difference between 2>&1 >foo and >foo 2>&1, and when do I use which?

Bash processes all redirections from left to right, in order. And the order is significant. Moving them around within a command may change the results of that command.

Here's a simple example:

foo() {
  echo "This is stdout"
  echo "This is stderr" 1>&2
}
foo >/dev/null 2>&1             # produces no output
foo 2>&1 >/dev/null             # writes "This is stderr" on the screen

Why do the results differ? In the first case, >/dev/null is performed first, and therefore the standard output of the command is sent to /dev/null. Then, the 2>&1 is performed, which causes standard error to be sent to the same place that standard output is already going. So both of them are discarded.

In the second example, 2>&1 is performed first. This means standard error is sent to wherever standard output happens to be going -- in this case, the user's terminal. Then, standard output is sent to /dev/null and is therefore discarded. So when we run foo the second time, we see only its standard error, not its standard output.
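The difference can also be checked without watching the screen, by capturing what each variant lets through. In this sketch the variable names first and second are arbitrary; inside $( ), "wherever standard output happens to be going" is the capture itself rather than the terminal:

```shell
foo() {
  echo "This is stdout"
  echo "This is stderr" 1>&2
}

first=$(foo >/dev/null 2>&1)     # both streams end up in /dev/null
second=$(foo 2>&1 >/dev/null)    # stderr follows the old stdout, then stdout is discarded

printf 'first:  [%s]\n' "$first"
printf 'second: [%s]\n' "$second"
```

This prints an empty pair of brackets for first, and [This is stderr] for second.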

There are times when we really do want 2>&1 to appear first -- for one example of this, see [#faq40 FAQ 40].

There are other times when we may use 2>&1 without any other redirections. Consider:

find ... 2>&1 | grep "some error"

In this example, we want to search find's standard error (as well as its standard output) for the string "some error". The 2>&1 in the piped command forces standard error to go into the pipe along with standard output. (When pipes and redirections are mixed in this way, remember: the pipe is done first, before any redirections. So find's standard output is already set to point to the pipe before we process the 2>&1 redirection.)

If we wanted to read only standard error in the pipe, and discard standard output, we could do it like this:

find ... 2>&1 >/dev/null | grep "some error"

The redirections in that example are processed thus:

First, the pipe is created. find's output is sent to it.
Next, 2>&1 causes find's standard error to go to the pipe as well.
Finally, >/dev/null causes find's standard output to be discarded, leaving only stderr going into the pipe.

A related question is [#faq59 FAQ #59], which discusses how to send stderr to a pipeline, while leaving stdout unpiped.

Anchor(faq56)

56. How can I untar or unzip multiple tarballs at once?

As the tar command was originally designed to read from and write to tape devices (tar - Tape ARchiver), you can specify only filenames to put inside an archive or to extract out of an archive (e.g. tar x myfileonthe.tape). There is an option to tell tar that the archive is not on some tape, but in a file: -f. This option takes exactly one argument: the filename of the file containing the archive. All other (following) filenames are taken to be archive members:

    tar -x -f backup.tar myfile.txt
    # OR (more common syntax IMHO)
    tar xf backup.tar myfile.txt

Now here's a common mistake -- imagine a directory containing the following archive-files you want to extract all at once:

    $ ls
    backup1.tar backup2.tar backup3.tar

Maybe you think of tar xf *.tar. Let's see:

    $ tar xf *.tar
    tar: backup2.tar: Not found in archive
    tar: backup3.tar: Not found in archive
    tar: Error exit delayed from previous errors

What happened? The shell replaced your *.tar by the matching filenames. You really wrote:

    tar xf backup1.tar backup2.tar backup3.tar

And as we saw earlier, it means: "extract the files backup2.tar and backup3.tar from the archive backup1.tar", which will of course only succeed when there are such filenames stored in the archive.

The solution is relatively easy: extract the contents of all archives one at a time. As we use a UNIX shell and we are lazy, we do that with a loop:

    for tarname in *.tar; do
      tar xf "$tarname"
    done

What happens? The for-loop will iterate through all filenames matching *.tar and call tar xf for each of them. That way you extract all archives one-by-one and you even do it automagically.
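The loop can be hardened a little. The following sketch (the function name extract_all_tars is made up) skips the literal *.tar that the shell leaves behind when nothing matches, and remembers whether any extraction failed:

```shell
# Hypothetical helper: extract every .tar archive found in directory $1.
# The body is a subshell, so the cd does not affect the caller.
extract_all_tars() (
    cd "$1" || exit 1
    status=0
    for tarname in *.tar; do
        [ -f "$tarname" ] || continue    # the glob matched nothing
        tar xf "$tarname" || status=1    # keep going, but remember failures
    done
    exit "$status"
)
```

It would be called as "extract_all_tars ." for the current directory.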

The second common archive type these days is ZIP. The command to extract contents from a ZIP file is unzip (who would have guessed that!). The problem here is the very same: unzip takes only one argument specifying the ZIP file. So, you solve it the very same way:

    for zipfile in *.zip; do
      unzip "$zipfile"
    done

Not enough? Ok. There's another option with unzip: it can take shell-like patterns to specify the ZIP-file names. And to avoid interpretation of those patterns by the shell, you need to quote them. unzip itself and not the shell will interpret *.zip in this case:

    unzip "*.zip"
    # OR, to make more clear what we do:
    unzip \*.zip

(This feature of unzip derives mainly from its origins as an MS-DOS program. MS-DOS's command interpreter does not perform glob expansions, so every MS-DOS program must be able to expand wildcards into a list of filenames. This feature was left in the Unix version, and as we just demonstrated, it can occasionally be useful.)

Anchor(faq57)

57. How can I group entries in a file by common prefixes?

as in, convert:

    foo: entry1
    bar: entry2
    foo: entry3
    baz: entry4

to:

    foo: entry1 entry3
    bar: entry2
    baz: entry4

there are two simple general methods for this:

a) sort the file, and then iterate over it, collecting entries until the prefix changes, and then print the collected entries with the previous prefix
b) iterate over the file, collecting entries for each prefix in an array indexed by the prefix

a basic implementation of a) in bash:

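old=xxx ; stuff=
# the extra "xxx" line is a sentinel that flushes the last group
(sort file ; echo xxx) | while read prefix line ; do
        if [[ $prefix = "$old" ]] ; then
                stuff="$stuff $line"
        else
                # print the finished group, except before the very first one
                [[ $old != xxx ]] && echo "$old $stuff"
                old="$prefix"
                stuff="$line"       # start the new group with this entry
        fi
done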

and a basic implementation of b) in awk:

    {
        a[$1] = a[$1] " " $2
    }
    END{
        for (x in a) print x, a[x]
    }

usage:

    awk '{a[$1] = a[$1] " " $2}END{for (x in a) print x, a[x]}' file

Anchor(faq58)

58. Can bash handle binary data?

The answer is, basically, no. While bash won't have as many problems with it as older shells, it still can't process arbitrary binary data; more specifically, shell variables are not 100% binary clean, so you can't store binary files in them. One instance where that would sometimes be handy is storing small temporary bitmaps while working with netpbm. (Here I resorted to adding an extra pnmnoraw to the pipe, creating larger ASCII files that bash has no problems storing.)

if you are feeling adventurous, consider this experiment:

    # bindec.bash, attempt to decode binary data to ascii decimals
    while read -n1 x ;do
        case "$x" in
            '') echo empty ;;
            # insert the 256 lines generated by the following oneliner here:
            # for x in $(seq 0 255) ;do echo "        $'\\$(printf %o $x)') echo $x;;" ;done
        esac
    done

and then pipe binary data into it, maybe like so:

    for x in $(seq 0 255) ;do echo -ne "\\$(printf %o $x)" ;done | bash bindec.bash | nl | less

this suggests that the 0 character is skipped entirely, because we can't create it with the input generation, and a few others are read as empty strings, giving us this list of binary values bash doesn't like:

    0, 1, 8, 9 (decimal)

enough to conveniently corrupt most binary files we try to process

(note that this refers to storing them in variables... moving data between programs using pipes is always binary clean)
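The pnmnoraw workaround generalizes: convert the data to plain text before it goes into a variable, and convert it back when it is needed. Below is a sketch using od and printf; the helper names to_hex and from_hex are invented here, and the loop is slow, so it only suits small amounts of data:

```shell
# Hypothetical helper: encode standard input as a string of hex digits.
to_hex() {
    od -An -v -tx1 | tr -d ' \t\n'
}

# Hypothetical helper: decode a hex string ($1) back to raw bytes on stdout.
from_hex() {
    hex=$1
    while [ -n "$hex" ]; do
        byte=${hex%"${hex#??}"}     # the first two hex digits
        hex=${hex#??}               # the rest of the string
        # turn the byte value into an octal escape and let printf emit it
        printf "\\$(printf '%03o' "0x$byte")"
    done
}

# The variable holds only hex digits, so the 0 byte survives the round trip.
stored=$(printf 'a\0b\001\n' | to_hex)
from_hex "$stored" | od -c
```

The data stays binary clean because it only ever travels through pipes in raw form; the variable sees nothing but the characters 0-9 and a-f.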

Anchor(faq59)

59. I'd like to pipe stderr only but keep stdout intact.

This has an obvious application with e.g. dialog, which draws (using ncurses) windows onto the screen via stdout, and returns its result on stderr. This is a little inconvenient, because it may seem to require a temporary file, which we would like to avoid. (Although this is not necessary -- see [#faq40 FAQ #40] for more examples of using dialog specifically!)

On [http://www.tldp.org/LDP/abs/html/io-redirection.html TLDP], I've found the following trick:

# Redirecting only stderr to a pipe.

exec 3>&1                              # Save current "value" of stdout.
ls -l 2>&1 >&3 3>&- | grep bad 3>&-    # Close fd 3 for 'grep' (but not 'ls').
#              ^^^^   ^^^^
exec 3>&-                              # Now close it for the remainder of the script.

# Thanks, S.C.

To show it as a dialog one-liner:

exec 3>&1
dialog --menu Title 0 0 0 FirstItem FirstDescription 2>&1 >&3 3>&- | sed 's/First/Only/'
exec 3>&-

This will have the dialog window working properly, yet it will be the output of dialog (returned to stderr) being altered by the sed. Cheers.
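The same trick can be packaged so that fd 3 only exists for the duration of one command, instead of being opened and closed with exec. This is a sketch under that assumption; filter_stderr and prog are hypothetical names:

```shell
# Hypothetical helper: run a command, pass its stderr through sed,
# and leave its stdout alone.
# Usage: filter_stderr SED_SCRIPT COMMAND [ARGS...]
filter_stderr() {
    script=$1
    shift
    # Inside the braces fd 3 is a copy of the original stdout.  The command's
    # stderr goes into the pipe, its stdout goes back to fd 3, and fd 3 is
    # closed for both the command and sed.
    { "$@" 2>&1 >&3 3>&- | sed "$script" 3>&-; } 3>&1
}

# prog stands in for something like dialog: it writes to both streams.
prog() {
    echo "out"
    echo "err" >&2
}

filter_stderr 's/^/E: /' prog
```

This prints "out" untouched and "E: err" after filtering; the order of the two lines is not guaranteed, because they reach the terminal by different routes.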

Anchor(faq60)

60. I'm trying to write a script that will change directory, but after the script finishes, I'm back where I started!

Consider this:

   #!/bin/sh
   cd /tmp

If one executes this simple script, what happens? Bash forks, and the parent waits. The child executes the script, including the chdir(2) system call, and then exits. The parent, which was waiting for the child, harvests the child's exit status (presumably 0 for success), and then bash carries on with the next command.

Since the chdir was done by a child process, it has no effect on the parent.

Moreover, there is no conceivable way you can ever have a child process affect any part of the parent's environment, which includes its variables as well as its current working directory.

So, how does one go about it? You can still have the cd command in an external file, but you can't run it as a script. Instead, you must source it (or "dot it in", using the . command, which is a synonym for source).

   echo 'cd /tmp' > $HOME/mycd
   source $HOME/mycd
   pwd                          # Now, we're in /tmp
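A shell function is another way to get the same effect, because a function runs inside the current shell just as a sourced file does. A minimal sketch, with the function name goto_root chosen only for this example:

```shell
# A child process cannot change our directory ...
start=$PWD
sh -c 'cd /'
echo "after child:    $PWD"      # still the starting directory

# ... but a function runs in the current shell, so its cd sticks.
goto_root() {
    cd / || return 1
}
goto_root
echo "after function: $PWD"      # now /
```

Such a function would normally be defined in a startup file like ~/.bashrc so that it is available in every interactive shell.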

Anchor(faq61)

61. Is there a list of which features were added to specific releases of Bash?

[http://cnswww.cns.cwru.edu/~chet/bash/NEWS NEWS]: a file tersely listing the notable changes between the current and previous versions
[http://cnswww.cns.cwru.edu/~chet/bash/CHANGES CHANGES]: a complete bash change history
[http://cnswww.cns.cwru.edu/~chet/bash/COMPAT COMPAT]: compatibility issues between bash3 and previous versions

-  ⇤ ← Revision 54 as of 2006-02-09 14:35:50 → 
  Size: 0
  Editor: m4n
  Comment: pat - this is not just tail -f
+   ← Revision 100 as of 2006-09-07 19:17:23 → ⇥
  Size: 90973
  Editor: brick
  Comment:
 Line 1:
+#pragma section-numbers 2

= BASH Frequently Asked Questions =

These are answers to frequently asked questions on channel #bash on the [http://www.freenode.net/ freenode] IRC network.  These answers are contributed by the regular members of the channel (originally heiner, and then others including greycat and r00t), and by users like you.  If you find something inaccurate or simply misspelled, please feel free to correct it!

All the information here is presented without any warranty or guarantee of accuracy.  Use it at your own risk.  When in doubt, please consult the man pages or the GNU info pages as the authoritative references.

["BASH"] is a BourneShell compatible shell, which adds many new features to its ancestor. Most of them are available in the KornShell, too. If a question is not strictly shell specific, but rather related to Unix, it may be in the UnixFaq.

If you want to help, you can add new questions with answers here, or try to answer one of the BashOpenQuestions.

[[TableOfContents]]

[[Anchor(faq1)]]
== How can I read a file line-by-line? ==
{{{
    while read line
    do
        echo "$line"
    done < "$file"
}}}

The {{{read}}} command still modifies each line read, e.g. it removes all leading whitespace characters (blanks, tab characters). If that is not desired, the IFS (internal field separator) variable has to be cleared:

{{{
    OIFS=$IFS; IFS=
    while read line
    do
        echo "$line"
    done < "$file"
    IFS=$OIFS
}}}

As a feature, the {{{read}}} command concatenates lines that end with a backslash '\' character to one single line. To disable this feature, KornShell and ["BASH"] have {{{read -r}}}:

{{{
    OIFS=$IFS; IFS=
    while read -r line
    do
        echo "$line"
    done < "$file"
    IFS=$OIFS
}}}
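A common variation sets {{{IFS}}} only for the {{{read}}} command itself, so nothing has to be saved and restored afterwards. The sketch below uses an invented function name ({{{print_lines}}}); the extra {{{[ -n "$line" ]}}} test is an addition that also picks up a last line lacking its trailing newline:

```shell
# Hypothetical helper: print the file named by $1 line by line, unmodified.
print_lines() {
    # IFS= applies to this read only; -r keeps backslashes as they are.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "$line"
    done < "$1"
}
```

Calling {{{print_lines "$file"}}} then behaves like the loop above, but the caller's {{{IFS}}} is never touched.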

Note that reading a file line by line this way is ''very slow'' for large files. Consider using e.g. ["AWK"] instead if you get performance problems.

One may also read from a command instead of a regular file:

{{{
    some command | while read line; do
       other commands
    done
}}}

That may cause problems later on if the commands inside the body of the loop attempt to set variables which need to be used outside the loop; in that case, see [#faq24 FAQ 24].

Sometimes it's useful to read a file into an array, one array element per line.  You can do that with the following example:

{{{
    O=$IFS IFS=$'\n' arr=($(< myfile)) IFS=$O
}}}

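This temporarily changes the Internal Field Separator to a newline, so that each line is treated as one field when the shell splits the output of {{{$(< myfile)}}} into words.  Then it populates the array {{{arr}}} with the fields.  Then it sets {{{IFS}}} back to what it was before.  Note that empty lines are dropped, and that glob characters in the lines are expanded.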

[[Anchor(faq2)]]
== How can I remove the last character of a line? ==
Using bash and ksh extended parameter substitution:

{{{
    var=${var%?}
}}}

Remember that ${var%foo} removes foo from the end, and ${var#foo} removes foo from the beginning, of {{{var}}}.  As a mnemonic, # appears to the left of % on the keyboard (US keyboards, at least).

More portable, but slower:

{{{
    var=`expr "$var" : '\(.*\).'`
}}}

or (using {{{sed}}}):

{{{
    var=`echo "$var" | sed 's/.$//'`
}}}

[[Anchor(faq3)]]
== How can I insert a blank character after each character? ==
{{{
    sed 's/./& /g'
}}}

Example:

{{{
    $ echo "testing" | sed 's/./& /g'
    t e s t i n g
}}}

[[Anchor(faq4)]]
== How can I check whether a directory is empty or not? ==
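The following idea relies on the directory's link count, which on traditional Unix file systems is 2 (for "." and for the entry in the parent directory) plus the number of subdirectories. It therefore only detects whether the directory contains subdirectories; a directory holding only regular files is still reported as empty: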

{{{
    find "$dir" -maxdepth 0 -links 2 \
     -exec echo "empty directory: {}" \;
}}}

Conversely, to find a non-empty directory:

{{{
    find "$dir" -maxdepth 0 -links +2 \
     -exec echo "directory is non-empty" \;
}}}

Most modern systems have an "ls -A" which explicitly omits "." and ".." from the directory listing:

{{{
    if [ -n "$(ls -A somedir)" ]
    then
        echo directory is non-empty
    fi
}}}

This can be shortened to:

{{{
    if [ "$(ls -A somedir)" ]
    then
        echo directory is non-empty
    fi
}}}

Another way, using Bash features, involves setting the special shell option which changes the behavior of globbing.  Some people prefer to avoid this approach, because it's so drastically different and could severely alter the behavior of scripts.

Nevertheless, if you're willing to use this approach, it does greatly simplify this particular task:

{{{
    shopt -s nullglob
    if [[ -z $(echo *) ]]; then
        echo directory is empty
    fi
}}}

It also simplifies various other operations:

{{{
    shopt -s nullglob
    for i in *.zip; do
        blah blah "$i"  # No need to check $i is a file.
    done
}}}

Without the {{{shopt}}}, that would have to be:

{{{
    for i in *.zip; do
        [[ -f $i ]] || continue  # If no .zip files, i becomes *.zip
        blah blah "$i"
    done
}}}

(You may want to use the latter anyway, if there's a possibility that the glob  may match directories in addition to files.)
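If neither {{{ls -A}}} nor a changed globbing option is acceptable, the test can be written with three ordinary globs that together cover every name except "." and "..". This is a sketch; the function name {{{dir_is_empty}}} is made up:

```shell
# Hypothetical helper: succeed if directory $1 contains no entries at all.
dir_is_empty() {
    # *       matches names not starting with a dot
    # .[!.]*  matches .x, .xy, ...  (but not . or ..)
    # ..?*    matches ..x, ..xy, ...
    for entry in "$1"/* "$1"/.[!.]* "$1"/..?*; do
        # an unmatched glob stays literal, so check that the entry exists
        if [ -e "$entry" ] || [ -L "$entry" ]; then
            return 1
        fi
    done
    return 0
}
```

The {{{-L}}} test is there so that a dangling symbolic link still counts as an entry.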

[[Anchor(faq5)]]
== How can I convert all upper-case file names to lower case? ==
{{{
# tolower - convert file names to lower case

for file in *
do
    [ -f "$file" ] || continue                  # ignore non-existing names
    newname=$(echo "$file" | tr '[A-Z]' '[a-z]') # lower-case version of file name
    [ "$file" = "$newname" ] && continue        # nothing to do
    [ -f "$newname" ] && continue               # do not overwrite existing files
    mv "$file" "$newname"
done
}}}

Purists will insist on using
{{{
tr '[[:upper:]]' '[[:lower:]]'
}}}
in the above code, in case of non-ASCII (e.g. accented) letters in locales which have them.

This technique can also be used to replace all unwanted characters in a file name e.g. with '_' (underscore). The script is the same as above, only the "newname=..." line has changed.

{{{
# renamefiles - rename files whose name contain unusual characters
for file in *
do
    [ -f "$file" ] || continue                  # ignore non-existing names
    newname=$(echo "$file" | sed 's/[^a-zA-Z0-9_.]/_/g')
    [ "$file" = "$newname" ] && continue        # nothing to do
    [ -f "$newname" ] && continue               # do not overwrite existing files
    mv "$file" "$newname"
done
}}}

The character class in {{{[]}}} contains all allowed characters; modify it as needed.

[[Anchor(faq6)]]
== How can I use a logical AND in a shell pattern (glob)? ==
That can be achieved through the !() extglob operator. You'll need {{{extglob}}} set.  It can be checked with:
{{{
$ shopt extglob
}}}

and set with:
{{{
$ shopt -s extglob
}}}

To warm up, we'll move all files starting with foo AND not ending with .d to directory foo_thursday.d:
{{{
$ mv foo!(*.d) foo_thursday.d
}}}

For the general case:

Delete all files containing Pink_Floyd AND not containing The_Final_Cut:

{{{
$ rm !(!(*Pink_Floyd*)|*The_Final_Cut*)
}}}

By the way: these kinds of patterns can be used with KornShell and KornShell93, too. They don't have to be enabled there, but are the default patterns.

[[Anchor(faq7)]]
== Is there a function to return the length of a string? ==
The fastest way, not requiring external programs (but usable only with ["BASH"] and KornShell):
{{{
${#varname}
}}}

or

{{{
expr "$varname" : '.*'
}}}

({{{expr}}} prints the number of characters matching the pattern {{{.*}}}, which is the length of the string)

or

{{{
expr length "$varname"
}}}

(for a BSD/GNU version of {{{expr}}}. Do not use this, because it is not ["POSIX"]).

[[Anchor(faq8)]]
== How can I recursively search all files for a string? ==
On most recent systems (GNU/Linux/BSD), you would use {{{grep -r pattern .}}} to search all files from the current directory (.) downward.

You can use {{{find}}} if your {{{grep}}} lacks -r:
{{{
    find . -type f -exec grep -l "$search" '{}' \;
}}}

The {} characters will be replaced with the current file name.

This command is slower than it needs to be, because {{{find}}} will call {{{grep}}} with only one file name, resulting in many {{{grep}}} invocations (one per file). Since {{{grep}}} accepts multiple file names on the command line, {{{find}}} can be instrumented to call it with several file names at once:
{{{
    find . -type f -exec grep -l "$search" '{}' \+
}}}

The trailing '+' character instructs {{{find}}} to call {{{grep}}} with as many file names as possible, saving processes and resulting in faster execution. This example works for POSIX {{{find}}}, e.g. with Solaris.

GNU find uses a helper program called {{{xargs}}} for the same purpose:
{{{
    find . -type f -print0 | xargs -0 grep -l "$search"
}}}

The {{{-print0}}} / {{{-0}}} options ensure that any file name can be processed, even ones containing blanks, TAB characters, or new-lines.

90% of the time, all you need is:

Have grep recurse and print the lines (GNU grep):
{{{
    grep -r "$search" .
}}}

Have grep recurse and print only the names (GNU grep):
{{{
    grep -r -l "$search" .
}}}

The {{{find}}} command can be used to run arbitrary commands on every file in a directory (including sub-directories). Replace {{{grep}}} with the command of your choice. The curly braces {} will be replaced with the current file name in the case above.

(Note that they must be escaped in some shells, but not in ["BASH"].)

[[Anchor(faq9)]]
== My command line produces no output: tail -f logfile | grep 'ssh' ==
Most standard Unix commands buffer their output if used non-interactively. This means that they don't write each character (or even each line) as soon as it is ready, but collect a larger amount (e.g. 4 kilobytes) before printing it.
In the case above, the {{{tail}}} command buffers its output, and therefore {{{grep}}} only gets its input in e.g. 4K blocks.

Unfortunately there's no easy solution to this, because the behaviour of the standard programs would need to be changed.  *See bottom of section before taking 'no easy solution' to heart*

Some programs provide special command line options for this purpose, e.g.

||grep (e.g. GNU version 2.5.1)||{{{--line-buffered}}}||
||sed (e.g. GNU version 4.0.6)||{{{-u,--unbuffered}}}||
||awk (some GNU versions)||{{{-W interactive, or use the fflush() function}}}||
||tcpdump, tethereal||{{{-l}}}||

The {{{expect}}} package (http://expect.nist.gov/) has an {{{unbuffer}}} example program, which can help here. It disables buffering for the output of a program.

Example usage:

{{{
    unbuffer tail -f logfile | grep 'ssh'
}}}

There is another option when you have more control over the creation of the log file. If you would like to {{{grep}}} the real-time log of a text interface program which does buffered session logging by default (or you were using {{{script}}} to make a session log), then try this instead:

{{{
   $ program | tee -a program.log

   In another window:
   $ tail -f program.log | grep whatever
}}}

Apparently this works because {{{tee}}} produces unbuffered output.  This has only been tested on GNU {{{tee}}}, YMMV.

A solution to this is to use the 'less' command in follow mode.  This is simple to do!
{{{
   $ less program.log
}}}
Then enter your search pattern (/ is search in less, like vi)
   /ssh

Next, put less into follow mode by issuing shift+f

That's all there is to it!
[[Anchor(faq10)]]
== How can I recreate a directory structure, without the files? ==
With the {{{cpio}}} program:
{{{
    cd "$srcdir"
    find . -type d -print | cpio -pdumv "$dstdir"
}}}

or with GNU-{{{tar}}}, and less obscure syntax:

{{{
    cd "$srcdir"
    find . -type d -print | tar c --files-from - --no-recursion | tar x --directory "$dstdir"
}}}

This creates a list of directory names with find, non-recursively adds just the directories to an archive, and pipes it to a second tar instance to extract it at the target location.

[[Anchor(faq11)]]
== How can I print the n'th line of a file? ==
The dirty (but not quick) way would be {{{sed -n ${n}p "$file"}}} but this reads the whole input file, even if you only wanted the third line.

A faster variant makes {{{sed}}} quit as soon as the line has been printed. The {{{-n}}} option suppresses the default output; at line {{{$n}}} the {{{p}}} command prints the line, and the {{{q}}} command that follows quits the program:

{{{
    sed -n "$n{p;q;}" "$file"
}}}
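
Wrapped in a function, the same command is easier to reuse. This is a minimal sketch; the function name {{{nth_line}}} and the sample file are made up for the illustration:

```shell
# nth_line N FILE - print line N of FILE and stop reading (hypothetical helper)
nth_line() {
    sed -n "${1}{p;q;}" "$2"
}

tmp=$(mktemp)
printf '%s\n' alpha beta gamma delta > "$tmp"
nth_line 3 "$tmp"     # prints: gamma
rm -f "$tmp"
```

If the file has fewer than N lines, nothing is printed.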

[[Anchor(faq12)]]
== A program (e.g. a file manager) lets me define an external command that an argument will be appended to - but I need that argument somewhere in the middle... ==
{{{
    sh -c 'echo "$1"' -- hello
}}}
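
How this works: {{{sh -c}}} takes the next argument as the command string. The argument after that becomes {{{$0}}} (it is only a placeholder), and everything the calling program appends ends up in {{{$1}}}, {{{$2}}} and so on, which the command string can use wherever it likes. A small sketch with made-up strings; the word {{{sh}}} is used as the placeholder here, but any word will do:

```shell
# The calling program would append its argument ("report.txt" here) at the end;
# the command string places it in the middle via "$1".
sh -c 'echo "before $1 after"' sh report.txt
```

In the program's configuration you would enter everything except the final argument, which the program appends by itself.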

[[Anchor(faq13)]]
== How can I concatenate two variables? ==
There is no concatenation operator for strings (either literal or variable dereferences) in the shell. The strings are just written one after the other:

{{{
    var=$var1$var2
}}}

If the right-hand side contains whitespace characters, it needs to be quoted:

{{{
    var="$var1 - $var2"
}}}

Braces can be used to disambiguate the right-hand side:

{{{
    var=${var1}xyzzy
    # without braces, var1xyzzy would be interpreted as a variable name
    # Another equivalent way would be:
    var="$var1"xyzzy
}}}

CommandSubstitution can be used as well. The following line creates a log file name {{{logname}}} containing the current date, resulting in names like e.g. {{{log.2004-07-26}}}:

{{{
    logname="log.$(date +%Y-%m-%d)"
}}}

Appending data to the end of a string doesn't require any black magic, either.

{{{
    string="$string more data here"
}}}

Bash 3.1 has a new += operator that you may see from time to time:

{{{
    string+=" more data here"     # EXTREMELY non-portable!
}}}

It's generally best to use the portable syntax.

[[Anchor(faq14)]]
== How can I redirect the output of several commands at once? ==
Redirecting the standard output of a single command is as easy as
{{{
    date > file
}}}

To redirect standard error:
{{{
    date 2> file
}}}

To redirect both:
{{{
    date > file 2>&1
}}}

In a loop or other larger code structure:
{{{
    for i in $list; do
        echo "Now processing $i"
        # more stuff here...
    done > file 2>&1
}}}

However, this can become tedious if the output of many programs should be redirected. If all output of a script should go into a file (e.g. a log file), the {{{exec}}} command can be used:

{{{
    # redirect both standard output and standard error to "log.txt"
    exec > log.txt 2>&1
    # all output including stderr now goes into "log.txt"
}}}

Otherwise command grouping helps:

{{{
    {
        date
        # some other command
        echo done
    } > messages.log 2>&1
}}}

In this example, the output of all commands within the curly braces is redirected to the file {{{messages.log}}}.
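
If only part of a script should be logged with {{{exec}}}, the redirection can be undone later by saving the original file descriptors first. A minimal sketch; the choice of FDs 3 and 4 and the temporary log file are arbitrary assumptions for this illustration:

```shell
log=$(mktemp)              # stand-in for a real log file name

exec 3>&1 4>&2             # save the original stdout and stderr as FDs 3 and 4
exec > "$log" 2>&1         # from here on, all output goes to the log

echo "this line is logged"
echo "an error message" >&2    # stderr is logged as well

exec 1>&3 2>&4 3>&- 4>&-   # restore stdout/stderr and close the copies
echo "this line appears on the original stdout"
```

The same save-and-restore idea is used for standard input in [#faq24 FAQ #24].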

[[Anchor(faq15)]]
== How can I run a command on all files with the extension .gz? ==
Often a command already accepts several files as arguments, e.g.

{{{
    zcat *.gz
}}}

(On some systems, you would use {{{gzcat}}} instead of {{{zcat}}}.  If neither is available, or if you don't care to play guessing games, just use {{{gzip -dc}}} instead.)  If an explicit loop is desired, or if your command does not accept multiple filename arguments in one invocation, the {{{for}}} loop can be used:

{{{
    for file in *.gz
    do
        echo "$file"
        # do something with "$file"
    done
}}}

To do it recursively, you should use a loop, plus the find command:

{{{
    while IFS= read -r file; do
        echo "$file"
        # do something with "$file"
    done < <(find . -name '*.gz' -print)
}}}

For more hints in this direction, see [#faq20 FAQ #20], below.  To see why the find command comes after the loop instead of before it, see [#faq24 FAQ #24].

[[Anchor(faq16)]]
== How can I remove a file name extension from a string, e.g. file.tar to file? ==
The easiest (and fastest) way is to use the following:

{{{
    $ name="file.tar"
    $ echo "${name%.tar}"
    file
}}}

The {{{${var%pattern}}}} syntax removes the pattern from the end of the variable. {{{${var#pattern}}}} would remove pattern from the start of the string. This could be used to rename all files from "*.doc" to "*.txt":

{{{
    for file in *.doc
    do
        mv "$file" "${file%.doc}".txt
    done
}}}

There's more to ParameterSubstitution, e.g. {{{${var%%pattern}, ${var##pattern}, ${var//old/new}}}}.
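
The difference between the single and the doubled operators is whether the shortest or the longest match of the pattern is removed. A short sketch with a made-up path (only the four removal operators are shown; {{{${var//old/new}}}} is not covered here):

```shell
path=/home/user/archive.tar.gz

echo "${path%.*}"     # shortest suffix match removed:  /home/user/archive.tar
echo "${path%%.*}"    # longest suffix match removed:   /home/user/archive
echo "${path#*/}"     # shortest prefix match removed:  home/user/archive.tar.gz
echo "${path##*/}"    # longest prefix match removed:   archive.tar.gz
```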

Note that this extended form of ParameterSubstitution works with ["BASH"], KornShell, KornShell93, but not with the older BourneShell. If the code needs to be portable to that shell as well, {{{sed}}} could be used to remove the filename extension part:

{{{
    for file in *.doc
    do
        base=`echo "$file" | sed 's/\.[^.]*$//'`    # remove everything starting with last '.'
        mv "$file" "$base".txt
    done
}}}

Finally, some GNU/Linux/BSD systems offer a {{{rename}}} command.  There are multiple different {{{rename}}} commands out there with contradictory syntaxes.  Consult your man pages to see which one you have (if any).

[[Anchor(faq17)]]
== How can I group expressions, e.g. (A AND B) OR C? ==
The TestCommand {{{[}}} uses parentheses () for expression grouping. Given that "AND" is "-a", and "OR" is "-o", the following expression

{{{
    (0<n AND n<=10) OR n=-1
}}}

can be written as follows:

{{{
    if [ \( $n -gt 0 -a $n -le 10 \) -o $n -eq -1 ]
    then
        echo "0 < $n <= 10, or $n=-1"
    else
        echo "invalid number: $n"
    fi
}}}

Note that the parentheses have to be quoted: \(, '(' or "(".

["BASH"] and KornShell have different, more powerful comparison commands with slightly different (easier) quoting:
 * ArithmeticExpression for arithmetic expressions, and
 * NewTestCommand for string (and file) expressions.

Examples:
{{{
    if (( (n>0 && n<10) || n == -1 ))
    then echo "0 < $n < 10, or n==-1"
    fi
}}}

or
{{{
    if [[ ( -f $localconfig && -f $globalconfig ) || -n $noconfig ]]
    then echo "configuration ok (or not used)"
    fi
}}}

Note that the distinction between numeric and string comparisons is strict. Consider the following example:
{{{
    n=3
    if [[ $n > 0 && $n < 10 ]]
    then echo "$n is between 0 and 10"
    else echo "ERROR: invalid number: $n"
    fi
}}}

The output will be "ERROR: ....", because in a ''string comparison'' "3" is bigger than "10": the first character "3" already sorts after "1", so the following character "0" is never considered. Changing the square brackets to double parentheses {{{((}}} makes the example work as expected.
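
The following sketch (for ["BASH"]; the variable names are made up) runs the same range check three ways, so that the difference between string and numeric comparison can be seen side by side:

```shell
n=3

# Arithmetic evaluation: numeric comparison
if (( n > 0 && n < 10 )); then arithmetic=true; else arithmetic=false; fi

# [[ with < and >: string comparison, "3" sorts after "10"
if [[ $n > 0 && $n < 10 ]]; then string=true; else string=false; fi

# [[ with -gt and -lt: numeric comparison again
if [[ $n -gt 0 && $n -lt 10 ]]; then numeric_ops=true; else numeric_ops=false; fi

echo "arithmetic: $arithmetic, string: $string, -gt/-lt: $numeric_ops"
```

Only the string comparison rejects the value 3.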

[[Anchor(faq18)]]
== How can I use numbers with leading zeros in a loop, e.g. 01, 02? ==
As always, there are different ways to solve the problem, each with its own advantages and disadvantages.

If there are not many numbers, BraceExpansion can be used:
{{{
    for i in 0{1,2,3,4,5,6,7,8,9} 10
    do
        echo $i
    done
}}}

Output:
{{{
01
02
03
04
[...]
}}}

This gets tedious for large sequences, but there are other ways, too.  If the command {{{seq}}} is available, you can use it as follows:
{{{
    seq -w 1 10
}}}

or, for arbitrary numbers of leading zeros (here: 3):

{{{
    seq -f "%03g" 1 10
}}}

If you have the {{{printf}}} command (which is a Bash builtin, and is also POSIX standard), it can be used to format a number, too:

{{{
    for ((i=1; i<=10; i++))
    do
        printf "%02d " "$i"
    done
}}}

The KornShell and KornShell93 have the {{{typeset}}} command to specify the number of leading zeros:

{{{
    $ typeset -Z3 i=4
    $ echo $i
    004
}}}

Finally, the following example works with any BourneShell derived shell to zero-pad each line to three bytes:

{{{
i=0
while test $i -le 10
do
    echo "00$i"
    i=`expr $i + 1`
done |
    sed 's/.*\(...\)$/\1/g'
}}}

In this example, the number of '.' inside the parentheses in the {{{sed}}} statement determines how many total bytes from the {{{echo}}} command (at the end of each line) will be kept and printed.

One more addendum: in Bash 3, you can use:
{{{
printf "%03d \n" {1..300}
}}}

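This is slightly easier in some cases.

You can also combine the {{{printf}}} command with xargs and wget to fetch numbered files. Note that brace expansion is performed before variable expansion, so a range with variable bounds, as in {{{printf "%03d\n" {$START..$END}}}}, does not work. Use a loop instead when the bounds are stored in variables:

{{{
for ((i=START; i<=END; i++)); do printf "%03d\n" "$i"; done | xargs -I% wget "$LOCATION/%"
}}}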

[[Anchor(faq19)]]
== How can I split a file into line ranges, e.g. lines 1-10, 11-20, 21-30? ==
The {{{split}}} utility can be used for this purpose. The long options shown here are GNU extensions; POSIX only specifies the short form {{{split -l 10}}}:

{{{
    split --lines 10 --numeric-suffixes input.txt output-
}}}

For more flexibility you can use {{{sed}}}.  The {{{sed}}} command can print e.g. the line number range 1-10:
{{{
    sed -n '1,10p'
}}}

This stops {{{sed}}} from printing each line ({{{-n}}}). Instead it only processes the lines in the range 1-10 ("1,10"), and prints them ("p"). {{{sed}}} still reads the input until the end, although we are only interested in lines 1 through 10. We can speed this up by making {{{sed}}} terminate immediately after printing line 10:

{{{
    sed -n -e '1,10p' -e '10q'
}}}

Now the command will quit after reading line 10 ("10q"). The {{{-e}}} arguments indicate a script (instead of a file name). The same can be written a little shorter:

{{{
    sed -n '1,10p;10q'
}}}
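
The print-then-quit idiom generalizes to any range. A minimal sketch; the function name {{{print_range}}} and the sample file are assumptions made for this example:

```shell
# print_range FIRST LAST FILE - print lines FIRST..LAST, then stop reading
# (hypothetical helper)
print_range() {
    sed -n "${1},${2}p;${2}q" "$3"
}

tmp=$(mktemp)
printf 'line %s\n' 1 2 3 4 5 6 > "$tmp"
print_range 2 4 "$tmp"      # prints "line 2" through "line 4"
rm -f "$tmp"
```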

We can now use this to print an arbitrary range of a file (specified by line number):

{{{
file=/etc/passwd
range=10
firstline=1
maxlines=$(wc -l < "$file") # count number of lines
while ((firstline <= maxlines))
do
    ((lastline = firstline + range - 1))
    sed -n -e "$firstline,${lastline}p" -e "${lastline}q" "$file"
    ((firstline = lastline + 1))
done
}}}

This example uses ["BASH"] and KornShell ArithmeticExpressions, which older [wiki:Self:BourneShell Bourne shells] do not have. In that case the following example should be used instead:

{{{
file=/etc/passwd
range=10
firstline=1
maxlines=`wc -l < "$file"` # count number of lines
while [ $firstline -le $maxlines ]
do
    lastline=`expr $firstline + $range - 1`
    sed -n -e "$firstline,${lastline}p" -e "${lastline}q" "$file"
    firstline=`expr $lastline + 1`
done
}}}

[[Anchor(faq20)]]
== How can I find and deal with file names containing newlines, spaces or both? ==
The preferred method is still to use

{{{
    find ... -exec command {} \;
}}}

or, if you need to handle filenames ''en masse'':

{{{
    find ... -print0 | xargs -0 command
}}}

for GNU {{{find}}}/{{{xargs}}}, or (POSIX {{{find}}}):

{{{
    find ... -exec command {} +
}}}

Use that unless you really can't.

Another way to deal with files with spaces in their names is to use the shell's filename expansion (["globbing"]).  This has the disadvantage of not working recursively (except with zsh's extensions), but if you just need to process all the files in a single directory, it works fantastically well.

This example changes all the *.mp3 files in the current directory to use underscores in their names instead of spaces.  (But it will not work in the original BourneShell.)

{{{
for file in *.mp3; do
    mv "$file" "${file// /_}"
done
}}}

You could do the same thing for all files (regardless of extension) by using

{{{
for file in *\ *; do
}}}

instead of *.mp3.

Another way to handle filenames recursively involves using the {{{-print0}}} option of {{{find}}} (a GNU/BSD extension), together with bash's {{{-d}}} option for read:

{{{
unset a i
while IFS= read -r -d '' file; do
  a[i++]="$file"        # or however you want to process each file
done < <(find /tmp -type f -print0)
}}}

The preceding example reads all the files under /tmp (recursively) into an array, even if they have newlines or other whitespace in their names, by forcing {{{read}}} to use the NUL byte (\0) as its line delimiter.  Since NUL is not a valid byte in Unix filenames, this is the safest approach besides using {{{find -exec}}}.
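
To see the difference, here is a self-contained sketch (for ["BASH"], assuming a {{{find}}} with {{{-print0}}}) that creates three oddly named files in a temporary directory and counts them both ways. The directory and file names are made up for the illustration:

```shell
dir=$(mktemp -d)
touch "$dir/plain" "$dir/with space" "$dir/with
newline"

# NUL-delimited: one array element per file, whatever the name contains
files=()
while IFS= read -r -d '' f; do
    files+=("$f")
done < <(find "$dir" -type f -print0)

# Line-based count for comparison: the embedded newline adds a bogus entry
naive=$(find "$dir" -type f -print | wc -l)

echo "NUL-delimited: ${#files[@]} files, line-based: $naive lines"
rm -rf "$dir"
```

The NUL-delimited loop finds 3 files, while the line-based count sees 4 lines.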



[[Anchor(faq21)]]
== How can I replace a string with another string in all files? ==
{{{sed}}} is a good command to replace strings, e.g.

{{{
    sed 's/olddomain\.com/newdomain\.com/g' input > output
}}}

To replace a string in all files of the current directory:

{{{
    for i in *; do
        sed 's/old/new/g' "$i" > atempfile && mv atempfile "$i"
    done
}}}
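
The fixed name {{{atempfile}}} in the loop above can collide with an existing file of that name. A minimal sketch that lets {{{mktemp}}} pick a unique name instead ({{{mktemp}}} is widely available, but not required by POSIX; the function name {{{replace_in_file}}} is made up for this example):

```shell
# replace_in_file OLD NEW FILE - hypothetical helper. OLD is a sed regular
# expression; neither OLD nor NEW may contain "/".
replace_in_file() {
    tmp=$(mktemp) || return 1
    # Write the result back with cat, so the file keeps its permissions.
    if sed "s/$1/$2/g" "$3" > "$tmp" && cat "$tmp" > "$3"; then
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

demo=$(mktemp)
echo "mail admin@olddomain.com" > "$demo"
replace_in_file 'olddomain\.com' 'newdomain.com' "$demo"
cat "$demo"       # prints: mail admin@newdomain.com
rm -f "$demo"
```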

GNU sed 4.x has a special {{{-i}}} flag which makes the temp file unnecessary. The flag is not part of POSIX; other {{{sed}}} implementations either lack it or (like BSD sed) use a slightly different syntax:

{{{
   for i in *; do
      sed -i 's/old/new/g' "$i"
   done
}}}

Those of you who have perl 5 can accomplish the same thing using this code:

{{{
    perl -pi -e 's/old/new/g' *
}}}

Recursively:

{{{
    find . -type f -print0 | xargs -0 perl -pi -e 's/old/new/g'
}}}

To replace, for example, every "unsigned" with "unsigned long", unless it is already followed by "int", "short", "long" or "char":

{{{
    find . -type f ! -name '*.bak' -exec perl -i.bak -pe 's/\bunsigned\b(?!\s+(int|short|long|char))/unsigned long/g' {} +
}}}


Finally, here's a script that some people may find useful:

{{{
    :
    # chtext - change text in several files

    # neither string may contain '|' unquoted
    old='olddomain\.com'
    new='newdomain\.com'

    # if no files were specified on the command line, use all files:
    [ $# -lt 1 ] && set -- *

    for file
    do
        [ -f "$file" ] || continue # do not process e.g. directories
        [ -r "$file" ] || continue # cannot read file - ignore it
        # Replace string, write output to temporary file. Terminate script in case of errors
        sed "s|$old|$new|g" "$file" > "$file"-new || exit
        # If the file has changed, overwrite original file. Otherwise remove copy
        if cmp "$file" "$file"-new >/dev/null 2>&1
        then rm "$file"-new              # file has not changed
        else mv "$file"-new "$file"      # file has changed: overwrite original file
        fi
    done
}}}

If the code above is put into a script file (e.g. {{{chtext}}}), the resulting script can be used to change a text e.g. in all HTML files of the current and all subdirectories:

{{{
    find . -type f -name '*.html' -exec chtext {} \;
}}}

Many optimizations are possible:
 * use another {{{sed}}} separator character than '|', e.g. ^A (ASCII 1)
 * some implementations of {{{sed}}} (e.g. GNU sed) have an "-i" option that can change a file in-place; no temporary file is necessary in that case
 * the {{{find}}} command above could pass several files to each {{{chtext}}} invocation, using either {{{xargs}}} or the {{{-exec ... {} +}}} form of POSIX find

Note: {{{set -- *}}} in the code above is safe with respect to files whose names contain spaces.  The expansion of * by {{{set}}} is the same as the expansion done by {{{for}}}, and filenames will be preserved properly as individual parameters, and not broken into words on whitespace.

A more sophisticated example of {{{chtext}}} is here: http://www.shelldorado.com/scripts/cmds/chtext

[[Anchor(faq22)]]
== How can I calculate with floating point numbers instead of just integers? ==
["BASH"] does not have built-in floating point arithmetic:

{{{
    $ echo $((10/3))
    3
}}}

For better precision, an external program must be used, e.g. {{{bc}}}, {{{awk}}} or {{{dc}}}:

{{{
    $ echo "scale=3; 10/3" | bc
    3.333
}}}

The "scale=3" command notifies {{{bc}}} that three digits of precision after the decimal point are required.

{{{awk}}} can be used for calculations, too:

{{{
    $ awk 'BEGIN {printf "%.3f\n", 10 / 3}' /dev/null
    3.333
}}}

There is a subtle but important difference between the {{{bc}}} and the {{{awk}}} solution here: {{{bc}}} reads commands and expressions ''from standard input''. {{{awk}}} on the other hand evaluates the expression as ''part of the program''. Expressions on standard input are ''not'' evaluated, i.e. {{{echo 10/3 | awk '{print $0}'}}} will print {{{10/3}}} instead of the evaluated result of the expression.

This explains why the example uses {{{/dev/null}}} as an input file for {{{awk}}}: the program evaluates the {{{BEGIN}}} action, evaluating the expression and printing the result. Afterwards the work is already done: it reads its input file, immediately gets an end-of-file indication, and terminates. Without an input file, some older {{{awk}}} implementations would wait for data on standard input; a POSIX {{{awk}}} exits without reading any input when the program consists only of {{{BEGIN}}} actions.
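
Because {{{awk}}} evaluates expressions that are part of the program, shell values have to be handed over explicitly, for example with {{{-v}}}. A minimal sketch, assuming a POSIX {{{awk}}}; the function names {{{fdiv}}} and {{{fless}}} are made up for this illustration:

```shell
# fdiv A B - print A/B with three decimals (hypothetical helper)
fdiv() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.3f\n", a / b }'
}

# fless A B - succeed if A < B numerically, for use in shell conditionals
fless() {
    awk -v a="$1" -v b="$2" 'BEGIN { exit !(a < b) }'
}

fdiv 10 3                                     # prints: 3.333
if fless 2.5 10; then echo "2.5 is less than 10"; fi
```

The second function is handy because the shell's own comparison operators only understand integers.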

Newer versions of KornShell93 have built-in floating point arithmetic, together with mathematical functions like {{{sin()}}} or {{{cos()}}} .

[[Anchor(faq23)]]
== How do I append a string to the contents of a variable? ==
The shell doesn't have a string concatenation operator like Java ("+") or Perl ("."). The following example shows how to append the string ".2004-08-15" to the contents of the shell variable {{{filename}}}:

{{{
    filename="$filename.2004-08-15"
}}}

If the variable name and the string to append could be confused, the variable name can be enclosed in braces, e.g.

{{{
    filename="${filename}old"
}}}

instead of {{{filename=$filenameold}}}, which would expand the (probably unset) variable {{{filenameold}}}.

[[Anchor(faq24)]]
== I set variables in a loop. Why do they suddenly disappear after the loop terminates? ==

The following command always prints "total number of lines: 0", although the variable {{{linecnt}}} has a larger value in the {{{while}}} loop:

{{{
    linecnt=0
    cat /etc/passwd | while read line
    do
        linecnt=`expr $linecnt + 1`
    done
    echo "total number of lines: $linecnt"
}}}

The reason for this surprising behaviour is that a {{{while/for/until}}} loop runs in a subshell when its input or output is redirected from a pipeline. For the {{{while}}} loop above, a new subshell with its own copy of the variable {{{linecnt}}} is created (initial value, taken from the parent shell: "0"). This copy then is used for counting. When the {{{while}}} loop is finished, the subshell copy is discarded, and the original variable {{{linecnt}}} of the parent (whose value has not changed) is used in the {{{echo}}} command.

Which shells create a new process for a loop, and when, differs:
 * BourneShell creates it when the input or output is redirected, either by using a pipeline or by a redirection operator ('<', '>').
 * ["BASH"] creates a new process only if the loop is part of a pipeline
 * KornShell creates it only if the loop is part of a pipeline, but ''not'' if the loop is the last part of it.

To solve this, either use a method that works without a subshell (shown below), or make sure you do all processing inside that subshell (a bit of a kludge, but easier to work with):

{{{
    linecnt=0
    cat /etc/passwd |
    (
        while read line ; do
                linecnt="$((linecnt+1))"
        done
        echo "total number of lines: $linecnt"
    )
}}}

To avoid the subshell completely (not easily possible if the other part of the pipe is a command!), use redirection instead of a pipeline. This avoids the problem in ["BASH"] and KornShell, but not in BourneShell:

{{{
    linecnt=0
    while read line ; do
        linecnt="$((linecnt+1))"
    done < /etc/passwd
    echo "total number of lines: $linecnt"
}}}
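
Both variants can be compared directly in one script. A small sketch with a made-up three-line input file; the result of the piped variant depends on the shell, as explained above:

```shell
input=$(mktemp)
printf '%s\n' one two three > "$input"

# Pipeline: in bash (default settings) the loop runs in a subshell,
# so the increments are lost when the pipeline ends.
piped=0
cat "$input" | while read -r line; do
    piped=$((piped + 1))
done

# Redirection: the loop runs in the current shell, so the count survives.
redirected=0
while read -r line; do
    redirected=$((redirected + 1))
done < "$input"

echo "piped: $piped, redirected: $redirected"   # bash: piped: 0, redirected: 3
rm -f "$input"
```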

A portable and common work-around is to redirect the input of the {{{read}}} command using {{{exec}}}:

{{{
    linecnt=0
    exec < /etc/passwd    # redirect standard input from the file /etc/passwd
    while read line       # "read" gets its input from the file /etc/passwd
    do
        linecnt=`expr $linecnt + 1`
    done
    echo "total number of lines: $linecnt"
}}}

This works as expected, and prints a line count for the file /etc/passwd. But the input is redirected from that file permanently. What if we need to read the original standard input sometime later again? In that case we have to save a copy of the original standard input file descriptor, which we later can restore:

{{{
    exec 3<&0    # save original standard input file descriptor "0" as FD "3"
    exec 0</etc/passwd    # redirect standard input from the file /etc/passwd

    linecnt=0
    while read line       # "read" gets its input from the file /etc/passwd
    do
        linecnt=`expr $linecnt + 1`
    done

    exec 0<&3   # restore saved standard input (fd 0) from file descriptor "3"
    exec 3<&-   # close the no longer needed file descriptor "3"

    echo "total number of lines: $linecnt"
}}}

Consecutive {{{exec}}} commands can be combined into one command; the redirections are processed left-to-right:

{{{
    exec 3<&0
    exec 0</etc/passwd
    _...read redirected standard input..._
    exec 0<&3
    exec 3<&-
}}}

is equivalent to

{{{
    exec 3<&0 0</etc/passwd
    _...read redirected standard input..._
    exec 0<&3 3<&-
}}}
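
A variation on the same technique leaves standard input alone altogether: open the file on a separate file descriptor and let {{{read}}} use only that one. A minimal sketch; FD 3 and the temporary input file are arbitrary choices for this example:

```shell
input=$(mktemp)
printf '%s\n' one two three > "$input"

linecnt=0
exec 3< "$input"              # open the file on FD 3; stdin (FD 0) is untouched
while read -r line <&3        # read from FD 3 only
do
    linecnt=$((linecnt + 1))
done
exec 3<&-                     # close FD 3 again

echo "total number of lines: $linecnt"
rm -f "$input"
```

Commands inside the loop can still read from the original standard input, because nothing had to be saved or restored.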

[[Anchor(faq25)]]
== How can I access positional parameters after $9? ==
Use {{{${10}}}} instead of {{{$10}}}. This works for ["BASH"] and KornShell, but not for older BourneShell implementations. Another way to access arbitrary positional parameters after $9 is to use {{{for}}}, e.g. to get the last parameter:

{{{
    for last
    do
        : # nothing
    done

    echo "last argument is: $last"
}}}

To get an argument by number, we can use a counter:

{{{
    n=12        # This is the number of the argument we are interested in
    i=1
    for arg
    do
        if [ $i -eq $n ]
        then
            argn=$arg
            break
        fi
        i=`expr $i + 1`
    done
    echo "argument number $n is: $argn"
}}}

This has the advantage of not "consuming" the arguments. If this is no problem, the {{{shift}}} command discards the first positional arguments:

{{{
    shift 11
    echo "the 12th argument is: $1"
}}}
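
In ["BASH"] itself, neither a loop nor {{{shift}}} is needed: the positional parameters can be sliced like an array. A short sketch, not portable to BourneShell; the function names {{{nth_arg}}} and {{{last_arg}}} are made up for this illustration:

```shell
# nth_arg N ARGS... - print the Nth of ARGS (hypothetical helper)
nth_arg() {
    local n=$1
    shift
    printf '%s\n' "${@:n:1}"      # one parameter, starting at position n
}

# last_arg ARGS... - print the last of ARGS (needs at least one argument)
last_arg() {
    printf '%s\n' "${@: -1}"      # note the space before -1
}

nth_arg 12 a b c d e f g h i j k l m     # prints: l
last_arg a b c                           # prints: c
```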

Although direct access to any positional argument is possible this way, it's hardly needed. The common way is to use {{{getopts}}} to process command line options (e.g. "-l", or "-o filename"), and then use either {{{for}}} or {{{while}}} to process all arguments in turn. An explanation of how to process command line arguments is available here: http://www.shelldorado.com/goodcoding/cmdargs.html

[[Anchor(faq26)]]
== How can I randomize (shuffle) the order of lines in a file? ==
{{{
    randomize(){
        while IFS= read -r l ; do echo "0$RANDOM $l" ; done |
        sort -n |
        cut -d" " -f2-
    }
}}}

Note: the leading 0 makes sure the sort key is still a number if the shell doesn't support {{{$RANDOM}}}. {{{$RANDOM}}} is supported by ["BASH"], KornShell and KornShell93, but not by BourneShell, and it is not required by the POSIX shell standard.

The same idea (printing random numbers in front of a line, and sorting the lines on that column) using other programs:
{{{
    awk '
        BEGIN { srand() }
        { print rand() "\t" $0 }
    ' |
    sort -n |    # Sort numerically on first (random number) column
    cut -f2-     # Remove sorting column
}}}

This is faster than the previous solution, but will not work for very old AWK implementations (try "nawk", or "gawk", if available).

A related question we frequently see is, "How can I print a random line from a file?"  The problem here is that you need to know in advance how many lines the file contains.  Lacking that knowledge, you have to read the entire file through once just to count them -- or, you have to suck the entire file into memory.  Let's explore both of these approaches.

{{{
   n=$(wc -l < "$file")        # Count number of lines.
   r=$((RANDOM % n + 1))       # Random number from 1..n.
   sed -n "$r{p;q;}" "$file"   # Print the r'th line.
}}}

(These examples use the answer from [#faq11 FAQ 11] to print the n'th line.)  The first one's pretty straightforward -- we use {{{wc}}} to count the lines, choose a random number, and then use {{{sed}}} to print the line.  If we already happened to know how many lines were in the file, we could skip the {{{wc}}} command, and this would be a very efficient approach.

The next example sucks the entire file into memory.  This approach saves time reopening the file, but obviously uses more memory.

{{{
   oIFS=$IFS IFS=$'\n' lines=($(<"$file")) IFS=$oIFS
   n=${#lines[@]}
   r=$((RANDOM % n))
   echo "${lines[r]}"
}}}

Note that we don't add 1 to the random number in this example, because the array of lines is indexed counting from 0.
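
A third technique, different from the two approaches above, reads the file only once and keeps a single line in memory: while reading, line number k replaces the current pick with probability 1/k. A minimal {{{awk}}} sketch of this idea; the function name {{{random_line}}} is made up, and because {{{srand()}}} seeds from the clock, two calls within the same second may return the same line:

```shell
# random_line FILE - print one line of FILE chosen at random, in one pass
# (hypothetical helper)
random_line() {
    awk 'BEGIN { srand() }
         rand() * NR < 1 { pick = $0 }     # true with probability 1/NR
         END { if (NR > 0) print pick }' "$1"
}

tmp=$(mktemp)
printf '%s\n' red green blue > "$tmp"
random_line "$tmp"
rm -f "$tmp"
```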

Also, some people want to choose a random file from a directory (for a signature on an e-mail, or to choose a random song to play, or a random image to display, etc.).  A similar technique can be used:

{{{
    files=(*.ogg)               # Or *.gif, or *
    n=${#files[@]}              # For aesthetics
    xmms "${files[RANDOM % n]}" # Choose a random element
}}}

[[Anchor(faq27)]]
== How can two processes communicate using named pipes (fifos)? ==
NamedPipes, also known as FIFOs ("First In First Out") are well suited for inter-process communication. The advantage over using files as a means of communication is that processes are synchronized by pipes: a process writing to a pipe blocks if there is no reader, and a process reading from a pipe blocks if there is no writer.

Here is a small example of a server process communicating with a client process. The server sends commands to the client, and the client acknowledges each command:

'''Server'''
{{{
#! /bin/sh
# server - communication example

# Create a FIFO. Some systems don't have a "mkfifo" command, but use
# "mknod pipe p" instead

mkfifo pipe

while sleep 1
do
    echo "server: sending GO to client"

    # The following command will cause this process to block (wait)
    # until another process reads from the pipe
    echo GO > pipe

    # A client read the string! Now wait for its answer. The "read"
    # command again will block until the client wrote something
    read answer < pipe

    # The client answered!
    echo "server: got answer: $answer"
done
}}}

'''Client'''
{{{
#! /bin/sh
# client

# We cannot start working until the server has created the pipe...
until [ -p pipe ]
do
    sleep 1;    # wait for server to create pipe
done

# Now communicate...

while sleep 1
do
    echo "client: waiting for data"

    # Wait until the server sends us one line of data:
    read data < pipe

    # Received one line!
    echo "client: read <$data>, answering"

    # Now acknowledge that we got the data. This command
    # again will block until the server read it.
    echo ACK > pipe
done
}}}

Write both examples to files {{{server}}} and {{{client}}} respectively, and start them concurrently to see it working:

{{{
    $ chmod +x server client
    $ ./server & ./client &
    server: sending GO to client
    client: waiting for data
    client: read <GO>, answering
    server: got answer: ACK
    server: sending GO to client
    client: waiting for data
    client: read <GO>, answering
    server: got answer: ACK
    server: sending GO to client
    client: waiting for data
    [...]
}}}
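
The blocking behaviour can also be tried out in a single script that terminates by itself. A minimal sketch, assuming {{{mkfifo}}} and {{{mktemp -d}}} are available; the FIFO lives in a private temporary directory so that its name cannot clash with anything:

```shell
dir=$(mktemp -d)             # private directory for the FIFO
fifo=$dir/pipe
mkfifo "$fifo"

# Writer: runs in the background and blocks until a reader opens the FIFO.
( echo GO > "$fifo" ) &

# Reader: blocks until the writer has opened the FIFO, then reads one line.
read -r msg < "$fifo"
wait                         # wait for the background writer to finish

echo "received: $msg"
rm -rf "$dir"
```

Neither side needs a {{{sleep}}}: whichever process arrives first simply waits for the other.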

[[Anchor(faq28)]]
== How do I determine the location of my script?  I want to read some config files from the same place. ==
This is a complex question because there's no single right answer to it. Even worse: it's not possible to find the location reliably in 100% of all cases. All ways of finding a script's location depend on the name of the script, as seen in the predefined variable {{{$0}}}. But providing the script name in {{{$0}}} is only a (very common) convention, not a requirement.

A common but suspect answer is "in some shells, {{{$0}}} is always an absolute path, even if you invoke the script using a relative path, or no path at all". That's not the case in ["BASH"], and it isn't reliable across shells in general; some of them return the actual command typed in by the user instead of the fully qualified path.  In those cases, if all you want is the fully qualified version of {{{$0}}}, you can use something like this (["BASH"] and KornShell, not BourneShell):

{{{
  [[ $0 = /* ]] && echo "$0" || echo "$PWD/$0"
}}}

Or the BourneShell version:

{{{
  case $0 in /*) echo "$0";; *) echo "`pwd`/$0";; esac
}}}

However, this approach has some major drawbacks. The most important one is that the script name (as seen in {{{$0}}}) may not be relative to the current working directory, but relative to a directory from the program search path {{{$PATH}}} (this is often seen with KornShell).

Another drawback is that there is really no guarantee that your script is still in the same place it was when it first started executing.  Suppose your script is loaded from a temporary file which is then unlinked immediately... your script might not even exist on disk any more!  The script could also have been moved to a different location while it was executing.  Or (and this is most likely by far...) there might be multiple links to the script from multiple locations, one of them being a simple symlink from a common {{{PATH}}} directory like {{{/usr/local/bin}}}, which is how it's being invoked.  Your script might be in {{{/opt/foobar/bin/script}}} but the naive approach of reading {{{$0}}} won't tell you that.

(For a more general discussion of the Unix file system and how symbolic links affect your ability to know where you are at any given moment, see [http://www.cs.bell-labs.com/sys/doc/lexnames.html this Plan 9 paper].)

So if the name in {{{$0}}} is a relative one, i.e. does not start with '/', we can still try to search the script like the shell would have done: in all directories from {{{$PATH}}}.

The following script shows how this could be done:

{{{
    myname=$0
    if [ -s "$myname" ] && [ -x "$myname" ]
    then                   # $myname is already a valid file name
        mypath=$myname
    else
        case "$myname" in
        /*) exit 1;;             # absolute path - do not search PATH
        *)
            # Search all directories from the PATH variable. Take
            # care to interpret leading and trailing ":" as meaning
            # the current directory; the same is true for "::" within
            # the PATH.

            for dir in `echo "$PATH" | sed 's/^:/.:/g;s/::/:.:/g;s/:$/:./;s/:/ /g'`
            do
                [ -f "$dir/$myname" ] || continue # no file
                [ -x "$dir/$myname" ] || continue # not executable
                mypath=$dir/$myname
                break           # only return first matching file
            done
            ;;
        esac
    fi

    if [ -f "$mypath" ]
    then
        : # echo >&2 "DEBUG: mypath=<$mypath>"
    else
        echo >&2 "cannot find full path name: $myname"
        exit 1
    fi

    echo >&2 "path of this script: $mypath"
}}}

Note that {{{$mypath}}} is not necessarily an absolute path name. It still can contain relative parts like {{{../bin/myscript}}}.
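
If an absolute directory name is wanted, the relative parts can be resolved by changing into the directory in a subshell and asking for the physical working directory. A minimal sketch; the function name {{{abs_dir}}} is made up, and all the caveats about {{{$0}}} discussed above still apply:

```shell
# abs_dir PATH - print the physical, absolute directory containing PATH
# (hypothetical helper; the directory must exist and be accessible)
abs_dir() {
    ( CDPATH= cd -- "$(dirname -- "$1")" && pwd -P )
}

script_dir=$(abs_dir "$0")
echo "script directory (best guess): $script_dir"
```

The subshell keeps the {{{cd}}} from affecting the rest of the script, and the empty {{{CDPATH}}} prevents {{{cd}}} from searching other directories.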

Generally storing data files in the same directory as their scripts is a bad practice. The Unix file system layout assumes that files in one place (e.g. /bin) are executable programs, while files in another place (e.g. /etc) are data files.  (Let's ignore legacy Unix systems with programs in /etc for the moment, shall we....)

It really makes the most sense to keep your script's configuration in a single, static location such as {{{$SCRIPTROOT/etc/foobar.conf}}}. If you need to define multiple configuration files, then you can have a directory (say, {{{/var/lib/foobar}}} or {{{/usr/local/lib/foobar}}}), and read that directory's location from a variable in {{{/etc/foobar.conf}}}.  If you don't even want that much to be hard-coded, you could pass the location of {{{foobar.conf}}} as a parameter to the script.  If you need the script to assume certain defaults in the absence of {{{/etc/foobar.conf}}}, you can put defaults in the script itself, and/or fall back to something like {{{$HOME/.foobar.conf}}} if {{{/etc/foobar.conf}}} is missing.  (This depends on what your script does. In some cases, it may make more sense to abort gracefully.)

[[Anchor(faq29)]]
== How can I display the value of a symbolic link on standard output? ==
The external command {{{readlink}}} can be used to display the value of a symbolic link.

{{{
$ readlink /bin/sh
bash
}}}

You can also use GNU find's %l directive, which is especially useful if you need to resolve links in batches:

{{{
$ find /bin/ -type l -printf '%p points to %l\n'
/bin/sh points to bash
/bin/bunzip2 points to bzip2
...
}}}

If your system lacks {{{readlink}}}, you can use a function like this one:
{{{
readlink() {
    local path=$1 ll

    if [ -L "$path" ]; then
        ll="$(LC_ALL=C ls -l "$path" 2> /dev/null)" &&
        echo "${ll/* -> }"
    else
        return 1
    fi
}
}}}

[[Anchor(faq30)]]
== How can I rename all my *.foo files to *.bar? ==
Some GNU/Linux distributions have a rename command, which you can use for this purpose; however, the syntax differs from one distribution to the next, so it's not a portable answer.

You can do it in POSIX shells like this:

{{{
for f in *.foo; do mv "$f" "${f%.foo}.bar"; done
}}}

This invokes the external command {{{mv}}} once for each file, so it may not be as efficient as some of the {{{rename}}} implementations.

If you want to do it recursively, then it becomes much more challenging.  This example works (in ["BASH"]) as long as no files have newlines in their names:

{{{
find . -name '*.foo' -print | while IFS=$'\n' read -r f; do
  mv "$f" "${f%.foo}.bar"
done
}}}

Another common form of this question is "How do I rename all my MP3 files so that they have underscores instead of spaces?"  You can use this:

{{{
for f in *\ *.mp3; do mv "$f" "${f// /_}"; done
}}}
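
One thing the one-liners above do not handle: when no file matches, the glob stays unexpanded and {{{mv}}} is called with the literal pattern. A slightly more defensive sketch follows; {{{rename_ext}}} is a made-up helper name, and refusing to overwrite existing targets is a design choice, not a requirement.

```shell
# rename_ext DIR OLD NEW: rename DIR/*.OLD to DIR/*.NEW (hypothetical helper).
rename_ext() {
    dir=$1 old=$2 new=$3
    for f in "$dir"/*."$old"; do
        # An unmatched glob stays literal; skip it.
        [ -e "$f" ] || [ -L "$f" ] || continue
        target=${f%."$old"}.$new
        if [ -e "$target" ]; then
            echo >&2 "skipping $f: $target already exists"
            continue
        fi
        mv -- "$f" "$target"
    done
}

# Example: rename_ext . foo bar
```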

[[Anchor(faq31)]]
== What is the difference between the old and new test commands ([ and [[)? ==
{{{[}}} ("test" command) and {{{[[}}} ("new test" command) are both used to evaluate expressions. Some examples:

{{{
    if [ -z "$variable" ]
    then
        echo "variable is empty!"
    fi

    if [ ! -f "$filename" ]
    then
        echo "not a valid, existing file name: $filename"
    fi
}}}

and

{{{
    if [[ ! -e $file ]]
    then
        echo "directory entry does not exist: $file"
    fi

    if [[ $file0 -nt $file1 ]]
    then
        echo "file $file0 is newer than $file1"
    fi
}}}

To cut a long story short: {{{[}}} implements the old, portable syntax of the command. Although all modern shells have built-in implementations, there usually still is an external executable of that name, e.g. {{{/bin/[}}}. {{{[[}}} is a new, improved version of it, which is a keyword, not a program. This has beneficial effects on ease of use; see below. {{{[[}}} is understood by KornShell, ["BASH"] (e.g. 2.03), and KornShell93, but not by the older BourneShell, and it is not part of the ["POSIX"] shell specification.

Although {{{[}}} and {{{[[}}} have much in common, and share many expression operators like "-f", "-s", "-n", "-z", there are some notable differences. Here is a comparison list:

||'''Feature'''||'''new test''' {{{[[}}}||'''old test''' {{{[}}}||'''Example'''||
||<rowspan="4">string comparison||>||(not available)||-||
||<||(not available)||-||
||== (or =)||=||-||
||!=||!=||-||
||<rowspan="2">expression grouping||&&||-a||{{{[[ -n $var && -f $var ]] && echo "$var is a file"}}}||
||{{{||}}}||-o||-||
||Pattern matching||=||(not available)||{{{[[ $name = a* ]] || echo "name does not start with an 'a': $name"}}}||
||In-process regular expression matching||=~||(not available)||{{{re='^Fri ... 13 '; [[ $(date) =~ $re ]] && echo "It's Friday the 13th!"}}}||

Special primitives that {{{[[}}} is defined to have, but {{{[}}} may be lacking (depending on the implementation):

||'''Description'''||'''Primitive'''||'''Example'''||
||entry (file or directory) exists||-e||{{{[[ -e $config ]] && echo "config file exists: $config"}}}||
||file is newer/older than other file||-nt / -ot||{{{[[ $file0 -nt $file1 ]] && echo "$file0 is newer than $file1"}}}||
||two files are the same||-ef||{{{[[ $input -ef $output ]] && { echo "will not overwrite input file: $input"; exit 1; } }}}||
||negation||!||-||

But there are more subtle differences.
 * No field splitting will be done for {{{[[}}} (and therefore many arguments need not be quoted)

 {{{
 file="file name"
 [[ -f $file ]] && echo "$file is a file"}}}

 will work even though $file is not quoted and contains whitespace. With {{{[}}} the variable needs to be quoted:

 {{{
 file="file name"
 [ -f "$file" ] && echo "$file is a file"}}}

 This makes {{{[[}}} easier to use and less error prone.

 * No file name generation will be done for {{{[[}}}. Therefore the following line tries to match the contents of the variable $path with the pattern {{{/*}}}

 {{{
 [[ $path = /* ]] && echo "\$path starts with a forward slash /: $path"}}}

 The next command most likely will result in an error, because {{{/*}}} is subject to file name generation:

 {{{
 [ $path = /* ] && echo "this does not work"}}}

 {{{[[}}} is strictly used for strings and files. If you want to compare numbers, use ArithmethicExpression ((''expression'')), e.g.

 {{{
 i=0
 while ((i<10))
 do
    echo $i
    ((i=$i+1))
 done}}}

When should the new test command {{{[[}}} be used, and when the old one {{{[}}}? If portability to the BourneShell is a concern, the old syntax should be used. If on the other hand the script requires ["BASH"] or KornShell, the new syntax could be preferable.
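
If a script has to stay portable, the pattern matching that {{{[[}}} offers can usually be replaced with {{{case}}}, which even the old BourneShell understands. A minimal sketch ({{{is_absolute}}} is just an illustrative name):

```shell
# Portable counterpart of: [[ $path = /* ]]
is_absolute() {
    case $1 in
        /*) return 0 ;;    # starts with a forward slash
        *)  return 1 ;;
    esac
}

# Example: is_absolute "$path" || path=$PWD/$path
```

The word after {{{case}}} is not subject to field splitting or file name generation, so no quoting accident can happen here either.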

[[Anchor(faq32)]]
== How can I redirect the output of 'time' to a variable or file? ==
The reason that 'time' needs special care for redirecting its output is one of those mysteries of the universe.  The mystery will probably be solved around the same time we find dark matter.

 * File Redirection
{{{
     bash -c "time ls" > /path/to/foo 2>&1
     ( time ls ) > /path/to/foo 2>&1
     { time ls; } > /path/to/foo 2>&1
}}}

 * Variable Redirection
{{{
     foo=$( bash -c "time ls" 2>&1 )
     foo=$( ( time ls ) 2>&1 )
     foo=$( { time ls; } 2>&1 )
}}}

Note: Using 'bash -c' and ( ) creates a subshell, using { } does not. Do with that as you wish.
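
In Bash, {{{time}}} is a shell keyword, and its report is written to the shell's standard error rather than to that of the timed command. That is why the redirection has to be applied to an enclosing group. The following sketch captures only the elapsed time and throws the command's own output away; it assumes Bash, and {{{time_it}}} is a hypothetical helper name.

```shell
# time_it COMMAND [ARGS...]: print the elapsed real time of COMMAND in seconds.
# Assumes bash: relies on the 'time' keyword and the TIMEFORMAT variable.
time_it() {
    local TIMEFORMAT=%R                  # report only the elapsed real time
    { time "$@" >/dev/null 2>&1; } 2>&1  # the group's stderr carries the report
}

# Example: elapsed=$(time_it sleep 1); echo "took ${elapsed}s"
```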

[[Anchor(faq33)]]
== How can I find a process id for a process given its name? ==
Usually a process is referred to using its process id (PID), and the {{{ps}}} command can display the information for any process given its process id, e.g.

{{{
    $ echo $$         # my process id
    21796
    $ ps -p 21796
    PID TTY          TIME CMD
    21796 pts/5    00:00:00 ksh
}}}

Frequently, however, only the name of a process is known, not its process id. Some operating systems, e.g. Solaris, BSD, and some versions of Linux, have a dedicated command to search for a process given its name, called "pgrep":

{{{
    $ pgrep init
    1
}}}

Often there is an even more specialized program available to not just find the process id of a process given its name, but also to send a signal to it:

{{{
    $ pkill myprocess
}}}

Some systems also provide {{{pidof}}}.  It differs from {{{pgrep}}} in that multiple output process IDs are only space separated, not newline separated.

{{{
    $ pidof cron
    5392
}}}

If these programs are not available, a user can search the output of the ps(1) command using {{{grep}}}.

The major problem when grepping the ps output is that grep ''may'' match its own ps entry (try: ps aux | grep init).  To make matters worse, this does not happen every time; the technical name for this is a "race condition".  There are several ways to avoid this:

 * Using grep -v at the end
{{{
     ps aux | grep name | grep -v grep
}}}

This will throw away all lines containing "grep" from the output. Disadvantage: the exit status is always that of the grep -v, so you can't use it to check, for example, whether a specific process exists.

 * Using grep -v in the middle
{{{
     ps aux | grep -v grep | grep name
}}}

This does exactly the same, except that the exit status of "grep name" is accessible and tells you whether "name" is a process in the ps output or not. It still has the disadvantage of starting an additional process (grep -v).

 * Using [] in grep
{{{
     ps aux | grep '[n]ame'
}}}

This spawns only the one grep process that is needed. The trick is to use a {{{[]}}} character class (regular expressions). Putting only one character in a character class normally makes no sense at all, because {{{[c]}}} always matches just "c". That is the case here, too: {{{grep '[n]ame'}}} searches for "name". But grep's own process list entry contains what you executed ("grep [n]ame") and not "grep name", so it will not match itself. The quotes keep the shell from treating {{{[n]ame}}} as a file name pattern.

===BEGIN greycat rant===

Most of the time when someone asks a question like this, it's because they want to manage a long-running daemon using primitive shell scripting techniques.  Common variants are "How can I get the PID of my foobard process.... so I can start one if it's not already running" or "How can I get the PID of my foobard process... because I want to prevent the foobard script from running if foobard is already active."  Both of these questions will lead to seriously flawed production systems.

If what you really want is to restart your daemon whenever it dies, just do this:

{{{
#!/bin/sh
while true; do
   mydaemon --in-the-foreground
done
}}}

where --in-the-foreground is whatever switch, if any, you must give to the daemon to PREVENT IT from automatically backgrounding itself.  (Often, -d does this and has the additional benefit of running the daemon with increased verbosity.)  Self-daemonizing programs may or may not be the target of a future greycat rant....

If that's too simplistic, look into [http://cr.yp.to/daemontools.html daemontools] or [http://smarden.org/runit/ runit], which are programs for managing services.

If what you really want is to prevent multiple instances of your program from running, then the only sure way to do that is by using a lock.  For details on doing this, see [#faq45 FAQ 45].

===END greycat rant===

[[Anchor(faq34)]]
== Can I do a spinner in Bash? ==
Sure.
{{{
    i=1
    sp='/-\|'
    printf ' '
    while true
    do
        printf '\b%s' "${sp:i++%${#sp}:1}"
    done
}}}

You can also use \r instead of \b.  You can use pretty much any character sequence you want as well.  If you want it to slow down, put a {{{sleep}}} command inside the loop.

[[Anchor(faq35)]]
== How can I handle command-line arguments to my script easily? ==
Well, that depends a great deal on what you want to do with them.  Here's a general template that might help for the simple cases:

{{{
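    while [[ $1 == -* ]]; do
        case "$1" in
          -h|--help) show_help; exit 0;;
          -v) verbose=1; shift;;
          -f) output_file=$2; shift 2;;
          --) shift; break;;                             # end of options
          -*) echo >&2 "unknown option: $1"; exit 1;;    # avoid looping forever
        esac
    done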
    # Now all of the remaining arguments are the filenames which followed
    # the optional switches.  You can process those with "for i" or "$@".
}}}

For more complex/generalized cases, or if you want things like "-xvf" to be handled as three separate flags, you can use getopts or getopt.  (Heiner, that's your cue....)
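
Until then, here is a rough sketch of the same template written with the {{{getopts}}} builtin, which handles clustered flags like "-vf file" on its own. The option letters match the template above; the function name {{{parse_args}}} and the variable {{{remaining}}} are made up for this example.

```shell
# Hypothetical option set: -h (help), -v (flag), -f FILE (takes an argument).
parse_args() {
    verbose=0
    output_file=
    OPTIND=1                      # reset so the function can be called again
    while getopts ":hvf:" opt; do
        case $opt in
            h)  echo "usage: myscript [-v] [-f file] [file ...]"; return 0 ;;
            v)  verbose=1 ;;
            f)  output_file=$OPTARG ;;
            :)  echo >&2 "option -$OPTARG requires an argument"; return 1 ;;
            \?) echo >&2 "unknown option: -$OPTARG"; return 1 ;;
        esac
    done
    shift $((OPTIND - 1))         # drop the options that were processed
    remaining=$#                  # number of non-option arguments left
}

# Example: parse_args "$@" || exit 1
```

The leading colon in the option string puts {{{getopts}}} into its silent mode, so the script prints its own error messages. Note that {{{getopts}}} only knows single-letter options; long options like --help need something else.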

[[Anchor(faq36)]]
== How can I get all lines that are: in both of two files (set intersection) or in only one of two files (set subtraction). ==

Use the comm(1) command.

{{{
  # intersection of file1 and file2
  comm -12 <(sort file1) <(sort file2)
  # subtraction of file1 from file2
  comm -13 <(sort file1) <(sort file2)
}}}

Read the comm(1) manpage for details.

If for some reason you lack the core comm(1) program, you can use these other methods:

Here is an amazingly simple and fast implementation that took just 20 seconds to match a 30k line file against a 400k line file for me.

Note that it probably only works with GNU grep, and that the file specified with -f will be loaded into RAM, so it doesn't scale to very large files.

It has grep read one of the sets as a pattern list from a file (-f), interpret the patterns as plain strings rather than regexps (-F), and match only whole lines (-x).

{{{
  # intersection of file1 and file2
  grep -xF -f file1 file2
  # subtraction of file1 from file2
  grep -vxF -f file1 file2
}}}

An implementation using sort and uniq:

{{{
  # intersection of file1 and file2
  # (assuming neither file1 nor file2 contains repeated lines)
  sort file1 file2 | uniq -d
  # file1 - file2 (subtraction)
  sort file1 file2 file2 | uniq -u
  # same way for file2 - file1: change the last file2 to file1
  sort file1 file2 file1 | uniq -u
}}}

Another implementation of subtraction:
{{{
  cat file1 file1 file2 | sort | uniq -c |
  awk '{ if ($1 == 2) { $1 = ""; print; } }'
}}}

This may introduce an extra space at the start of the line; if that's a problem, just strip it away.

Also, this approach assumes that neither file1 nor file2 has any duplicates in it.

Finally, it sorts the output for you.  If that's a problem, then you'll have to abandon this approach altogether.  Perhaps you could use awk's associative arrays (or perl's hashes or tcl's arrays) instead.
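
A minimal sketch of the awk approach, which keeps the lines in the order of the first file and does not mind duplicates. The function names are made up, and the sketch assumes two different, ordinary file names (awk treats arguments containing "=" as variable assignments). The whole second file is held in memory.

```shell
# intersect FILE1 FILE2: lines of FILE1 that also appear in FILE2.
intersect() {
    awk 'FILENAME == ARGV[1] { seen[$0] = 1; next } $0 in seen' "$2" "$1"
}

# subtract FILE1 FILE2: lines of FILE1 that do not appear in FILE2.
subtract() {
    awk 'FILENAME == ARGV[1] { seen[$0] = 1; next } !($0 in seen)' "$2" "$1"
}
```

awk reads FILE2 first and remembers every line as an array index; while reading FILE1 it only has to look each line up.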

[[Anchor(faq37)]]
== How can I print text in various colors? ==
''Do not'' hard-code ANSI color escape sequences in your program!  The {{{tput}}} command lets you interact with the terminal database in a sane way.

{{{
  tput setaf 1; echo this is red
  tput setaf 2; echo this is green
  tput setaf 0; echo now we are back in black
}}}

{{{tput}}} reads the terminfo database which contains all the escape codes necessary for interacting with your terminal, as defined by the {{{$TERM}}} variable.  For more details, see the {{{terminfo(5)}}} man page.

If you don't know in advance what your user's terminal's default text color is, you can use {{{tput sgr0}}} to reset the colors to their default settings.  This also removes boldface ({{{tput bold}}}), etc.
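
If a script prints many colored messages, it may be nicer to ask {{{tput}}} only once and keep the results in variables. A small sketch follows; the {{{warn}}} function is hypothetical, and the fallback to empty strings is an assumption about what you want when the terminal has no color support or {{{$TERM}}} is not set.

```shell
# Look the sequences up once; fall back to empty strings when tput or the
# terminal type cannot provide them.
red=$(tput setaf 1 2>/dev/null) || red=
reset=$(tput sgr0 2>/dev/null) || reset=

# warn MESSAGE...: print MESSAGE in red on standard error, then reset.
warn() {
    printf '%s%s%s\n' "$red" "$*" "$reset" >&2
}

# Example: warn "disk almost full"
```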

[[Anchor(faq38)]]
== How do Unix file permissions work? ==
See ["Permissions"].

[[Anchor(faq39)]]
== What are all the dot-files that bash reads? ==
See DotFiles.

[[Anchor(faq40)]]
== How do I use dialog to get input from the user? ==

{{{
  foo=$(dialog --inputbox "text goes here" 8 40 2>&1 >/dev/tty)
  echo "The user typed '$foo'"
}}}

The redirection here is a bit tricky.

 1. The {{{foo=$(command)}}} is set up first, so the standard output of the command is being captured by bash.

 1. Inside the command, the {{{2>&1}}} causes standard error to be sent to where standard out is going -- in other words, stderr will now be captured.

 1. {{{>/dev/tty}}} sends standard output to the terminal, so the dialog box will be seen by the user.  Standard error will still be captured, however.

Another common {{{dialog(1)}}}-related question is how to dynamically generate a dialog command that has items which must be quoted (either because they're empty strings, or because they contain internal white space).  One ''can'' use {{{eval}}} for that purpose, but the cleanest way to achieve this goal is to use an array.

{{{
  unset m; i=0
  words=(apple banana cherry "dog droppings")
  for w in "${words[@]}"; do
    m[i++]=$w; m[i++]=""
  done
  dialog --menu "Which one?" 12 70 9 "${m[@]}"
}}}

In the previous example, the {{{for}}} loop that populates the '''m''' array could have been reading from a pipeline, a file, etc.

Recall that the construction {{{"${m[@]}"}}} expands to the entire contents of an array, but with each element implicitly quoted.  It's analogous to the {{{"$@"}}} construct for handling positional parameters.  For more details, see [#faq50 FAQ50] below.

Here's another example, using filenames:

{{{
    files=(*.mp3)	# These may contain spaces, apostrophes, etc.
    cmd=(dialog --menu "Select one:" 22 76 16); n=6
    i=0
    for f in "${files[@]}"; do
        cmd[n++]=$((i++)); cmd[n++]="$f"
    done
    choice=$("${cmd[@]}" 2>&1 >/dev/tty)
}}}

The user's choice will be stored in the {{{choice}}} variable, as an integer, which can in turn be used as an index into the {{{files}}} array.

A separate but useful function of dialog is to track the progress of a process that produces output. Below is an example that uses dialog to track processes writing to a log file. In the dialog window, there is a tailbox where output is shown, and a msgbox with a clickable OK button. Clicking the button will cause the trap to execute, removing the tempfile and destroying the tail process.

{{{
  # you cannot tail a nonexistent file, so always make sure it exists first!
  rm -f dialog-tail.log; echo Initialize log >> dialog-tail.log
  date >> dialog-tail.log
  tempfile=`tempfile 2>/dev/null` || tempfile=/tmp/test$$
  trap "rm -f $tempfile" 0 1 2 5 15
  dialog --title "TAIL BOXES" \
        --begin 10 10 --tailboxbg dialog-tail.log 8 58 \
        --and-widget \
        --begin 3 10 --msgbox "Press OK " 5 30 \
        2>$tempfile &
  mypid=$!;
  for i in 1 2 3;  do echo $i >> dialog-tail.log; sleep 1; done
  echo Done. >> dialog-tail.log
  wait $mypid;

}}}

[[Anchor(faq41)]]
== How do I determine whether a variable contains a substring? ==

{{{
  if [[ $foo = *bar* ]]
}}}

The above works in virtually all versions of Bash.  Bash version 3 also allows regular expressions:

{{{
  if [[ $foo =~ ab*c ]]   # bash 3, matches abbbbcde, or ac, etc.
}}}

If you are programming in the BourneShell instead of Bash, there is a more portable (but less pretty) syntax:

{{{
  case "$foo" in
    *bar*) .... ;;
  esac
}}}

This should allow you to match variables against globbing-style patterns.  If you need a portable way to match variables against regular expressions, use {{{grep}}} or {{{egrep}}}.

{{{
  if echo "$foo" | egrep some-regex >/dev/null; then ...
}}}
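
When the substring itself is stored in a variable, the {{{case}}} form can be wrapped in a small function. This is only a sketch ({{{contains}}} is a made-up name); the point is that quoting the variable inside the pattern makes its content match literally, even if it contains glob characters such as "*".

```shell
# contains STRING SUBSTRING: succeed if SUBSTRING occurs in STRING.
contains() {
    case $1 in
        *"$2"*) return 0 ;;   # "$2" is quoted, so it is matched literally
        *)      return 1 ;;
    esac
}

# Example: if contains "$foo" "bar"; then echo "found"; fi
```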

[[Anchor(faq42)]]
== How can I find out if a process is still running? ==

The {{{kill}}} command is used to send signals to a running process.  As a convenience function, the signal "0", which does not exist, can be used to find out if a process is still running:

 {{{
 myprog &          # Start program in the background
 daemonpid=$!      # ...and save its process id

 while sleep 60
 do
     if kill -0 $daemonpid       # Is the process still alive?
     then
         echo >&2 "OK - process is still running"
     else
         echo >&2 "ERROR - process $daemonpid is no longer running!"
         break
     fi
 done}}}
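
The check can be put into a small function so that the error message from {{{kill}}} does not clutter the output. A sketch, with {{{is_running}}} as a hypothetical name. One caveat: {{{kill -0}}} also fails when the process exists but belongs to another user and you lack permission to signal it, so this is only reliable for your own processes.

```shell
# is_running PID: succeed if a signal could be sent to process PID.
is_running() {
    kill -0 "$1" 2>/dev/null
}

# Example:
#   myprog & daemonpid=$!
#   is_running "$daemonpid" || echo >&2 "myprog died"
```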

[[Anchor(faq43)]]
== How can I use array variables? ==

BASH and KornShell already have one-dimensional arrays indexed by a numerical expression, e.g.

 {{{
 host[0]="micky"
 host[1]="minnie"
 host[2]="goofy"
 i=0
 while (($i < ${#host[@]} ))
 do
     echo "host number $i is ${host[i++]}"
 done}}}

The awkward expression {{{ ${#host[@]} }}} returns the number of elements for the array {{{host}}}.

It's possible to assign multiple values to an array at once, but the syntax differs from BASH to KornShell:

 {{{
 # BASH
 array=(one two three four)
 # KornShell
 set -A array -- one two three four}}}
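
When the index itself is not needed, looping over {{{"${host[@]}"}}} is usually simpler than counting. The following sketch assumes BASH; the host names are the ones used above, plus one with a space in it to show that the quotes keep each element intact.

```shell
# Assumes bash (array syntax).
host=(micky minnie "goofy dog")
host[${#host[@]}]=pluto     # append; valid only while the array has no gaps

for h in "${host[@]}"; do   # each element stays one word, spaces included
    printf 'host: %s\n' "$h"
done
```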

[[Anchor(faq44)]]
== How can I use associative arrays or variable variables? ==

Sometimes it's convenient to have associative arrays, arrays indexed by a string. Perl calls them "hashes". KornShell93 already supports this kind of array:

 {{{
 # KornShell93 script - does not work with BASH
 typeset -A homedir             # Declare KornShell93 associative array
 homedir[jim]=/home/jim
 homedir[silvia]=/home/silvia
 homedir[alex]=/home/alex
 
 for user in ${!homedir[@]}     # Enumerate all indices (user names)
 do
     echo "Home directory of user $user is ${homedir[$user]}"
 done}}}

BASH versions before 4.0 (including version 3.x) do not support them. However, we could simulate this kind of array by dynamically creating variables like in the following example:

 {{{
 for user in jim silvia alex
 do
     eval homedir_$user=/home/$user
 done}}}

This creates the variables

 {{{
 homedir_jim=/home/jim
 homedir_silvia=/home/silvia
 homedir_alex=/home/alex}}}

with the corresponding content. Note the use of the {{{eval}}} command, which interprets a command line not just one time like the shell usually does, but '''twice'''. In the first step, the shell uses the input {{{homedir_$user=/home/$user}}} to create a new line {{{homedir_jim=/home/jim}}}. In the second step, caused by {{{eval}}}, this variable assignment is executed, actually creating the variable.

Print the variables using

 {{{
 for user in jim silvia alex
 do
     varname=homedir_$user              # e.g. "homedir_jim"
     eval varcontent='$'$varname        # e.g. "/home/jim"
     echo "home directory of $user is $varcontent"
 done}}}

The {{{eval}}} line needs some explanation.  In a first step the shell performs its usual expansions (here, parameter expansion of {{{$varname}}}):

 {{{
 eval varcontent='$'$varname}}}

becomes

 {{{
 eval varcontent=$homedir_jim}}}

In a second step the {{{eval}}} re-evaluates the line, and converts this to

 {{{
 varcontent=/home/jim}}}

Before starting to use dynamically created variables, think again of a simpler approach. If it still seems to be the best thing to do, have a look at the following disadvantages:

 1. it's hard to read and to maintain
 1. the variable names must match the regular expression ^[a-zA-Z_][a-zA-Z_0-9]*$ , i.e. a variable name cannot contain arbitrary characters but only letters, digits, and underscores. In the example above we e.g. could not have processed the home directory of a user named {{{hong-hu}}}, because a dash '-' cannot be part of a variable name.
 1. Quoting is hard to get right. If the content string (as opposed to the variable name) can contain whitespace characters, it's hard to quote it correctly so that they are preserved.

Here is the summary.  "{{{var}}}" is a constant prefix, "{{{$index}}}" contains index string, "{{{$content}}}" is the string to store.  Note that quoting is absolutely essential here. A missing backslash \ or a wrong type of quote (e.g. apostrophes '...' instead of quotation marks "...") can (and probably will) cause the examples to fail:

 * Set variables

  {{{
  eval "var$index=\$content"    # index must only contain characters from [a-zA-Z0-9_]}}}

 * Print variable content

  {{{
  eval "echo \"var$index=\$var$index\""}}}

 * Check if a variable is empty

  {{{
  if eval "[ -z \"\$var$index\" ]"
  then echo "variable is empty: var$index"
  fi}}}

You've seen the examples. Now maybe you can go a step back and consider using AWK associative arrays, or a multi-line environment variable instead of dynamically created variables.
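
If you decide to use dynamically created variables anyway, it may help to hide the {{{eval}}} lines in two small functions that refuse any index which is not a plain word. This is only a sketch: the names {{{valid_name}}}, {{{set_var}}} and {{{get_var}}} and the prefix {{{var_}}} are made up for illustration.

```shell
# valid_name STRING: succeed only if STRING consists of letters, digits
# and underscores, so that it is safe to embed in a variable name.
valid_name() {
    case $1 in
        ''|*[!a-zA-Z0-9_]*) return 1 ;;
        *) return 0 ;;
    esac
}

# set_var INDEX CONTENT: store CONTENT in the variable var_INDEX.
set_var() {
    valid_name "$1" || return 1
    eval "var_$1=\$2"       # eval sees: var_INDEX=$2
}

# get_var INDEX: print the content of var_INDEX.
get_var() {
    valid_name "$1" || return 1
    eval "printf '%s\n' \"\$var_$1\""
}

# Example: set_var jim /home/jim; get_var jim
```

Because the content is only expanded by the second evaluation (note the backslash in front of {{{$2}}}), it can contain spaces, quotes or dollar signs without being interpreted.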

[[Anchor(faq45)]]
== How can I ensure that only one instance of a script is running at a time (mutual exclusion)? ==

We need some means of '''mutual exclusion'''. One easy way is to use a "lock": any number of processes can try to acquire the lock simultaneously, but only one of them will succeed.

How can we implement this using shell scripts? Some people suggest creating a lock file, and checking for its presence:

 {{{
 # locking example -- WRONG

 lockfile=/tmp/myscript.lock
 if [ -f "$lockfile" ]
 then                      # lock is already held
     echo >&2 "cannot acquire lock, giving up: $lockfile"
     exit 0
 else                      # nobody owns the lock
     > "$lockfile"         # create the file
     #...continue script
 fi}}}

This example '''does not work''', because there is a time window between checking and creating the file. Assume two processes are running the code at the same time. Both check if the lockfile exists, and both get the result that it does not exist. Now both processes assume they have acquired the lock -- a disaster waiting to happen. We need an atomic check-and-create operation, and fortunately there is one: {{{mkdir}}}, the command to create a directory:

 {{{
 # locking example -- CORRECT

 lockdir=/tmp/myscript.lock
 if mkdir "$lockdir"
 then    # directory did not exist, but was created successfully
     echo >&2 "successfully acquired lock: $lockdir"
     # continue script
 else
     echo >&2 "cannot acquire lock, giving up on $lockdir"
     exit 0
 fi}}}

The advantage over using a lock file is that even when two processes call {{{mkdir}}} at the same time, at most one of them can succeed. This atomicity of check-and-create is ensured at the operating system kernel level.

Note that we cannot use "mkdir -p" to automatically create missing path components: "mkdir -p" does not return an error if the directory exists already, but that's the feature we rely upon to ensure mutual exclusion.

Now let's spice up this example by automatically removing the lock when the script finishes:

 {{{
 lockdir=/tmp/myscript.lock
 if mkdir "$lockdir"
 then
     echo >&2 "successfully acquired lock"
 
     # Remove lockdir when the script finishes, or when it receives a signal
     trap 'rm -rf "$lockdir"' 0    # remove directory when script finishes
     trap "exit 2" 1 2 3 15        # terminate script when receiving signal
 
     # Optionally create temporary files in this directory, because
     # they will be removed automatically:
     tmpfile=$lockdir/filelist
 
 else
     echo >&2 "cannot acquire lock, giving up on $lockdir"
     exit 0
 fi}}}

This example provides reliable mutual exclusion. There is still the disadvantage that a ''stale'' lock directory could remain when the script is terminated by a signal that is not caught or cannot be caught (signal 9, SIGKILL).

Instead of using {{{mkdir}}} we could also have used the program to create a symbolic link, {{{ln -s}}}.
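
A rough sketch of the {{{ln -s}}} variant follows. Creating a symbolic link fails if the name already exists, and the link's value can carry a note about who holds the lock. The function names and the {{{pid-NNN}}} link value are assumptions made for this example; the sketch also assumes that the link value does not happen to name an existing directory, because {{{ln}}} would then create the new link inside that directory.

```shell
# acquire_lock PATH: try to create the lock; fail if it is already held.
acquire_lock() {
    ln -s "pid-$$" "$1" 2>/dev/null
}

# release_lock PATH: remove the lock again.
release_lock() {
    rm -f "$1"
}

# Possible usage:
#   lock=/tmp/myscript.lock
#   acquire_lock "$lock" || { echo >&2 "cannot acquire lock: $lock"; exit 0; }
#   trap 'release_lock "$lock"' 0
```

{{{ls -l}}} on the lock then shows which process created it.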

[[Anchor(faq46)]]
== I want to check to see whether a word is in a list (or an element is a member of a set). ==

Let's suppose you have your "list" stored as a big string of words, with spaces in between them.  (That's the most common case when people are asking this one.)  What you actually want to do is determine whether the string " foo " (note the spaces around it) appears in the list.  But since your list may not have leading/trailing spaces, you have to add them as well.  So, here's the most portable way to do it:

  {{{
  if echo " $list " | grep " foo " >/dev/null; then ....}}}

GNU grep seems to have a special {{{-w}}} extension which lets you avoid the spaces:

  {{{
  if echo "$list" | GNUgrep -q -w "foo"; then ....}}}

Finally, if you want to use Bash builtins, you can do it thus:

  {{{
  if [[ " $list " = *\ foo\ * ]]; then ....}}}

This is basically the same as the original grep -- we surround both the list and the word (foo) with spaces, and then do a simple text matching.
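
The same idea also works in shells without {{{[[}}}, using {{{case}}}. A minimal sketch ({{{in_list}}} is a made-up name); like the examples above, it assumes the words in the list are separated by single spaces.

```shell
# in_list WORD LIST: succeed if WORD is one of the space-separated words in LIST.
in_list() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example: if in_list foo "$list"; then echo "foo is in the list"; fi
```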

[[Anchor(faq47)]]                                                         
== How can I redirect stderr to a pipe? ==
                                                                                
A pipe can only carry stdout of a program. To pipe stderr through it, you
need to redirect stderr to the same destination as stdout. Optionally
you can close stdout or redirect it to /dev/null to only get stderr. Some
sample code:

{{{
# - 'myprog' is an example for a program that outputs both, stdout and
#   stderr
# - after the pipe I will just use a 'cat', of course you can put there
#   what you want

# version 1: redirect stderr towards the pipe while stdout survives (both come
# mixed)
myprog 2>&1 | cat                                                               
                                                                                
# version 2: redirect stderr towards the pipe without getting stdout (it's
# redirected to /dev/null)
myprog 2>&1 >/dev/null | cat
#Note that '>/dev/null' comes after '2>&1', otherwise the stderr will also be directed to /dev/null
                                                                                
# version 3: redirect stderr towards the pipe while the "original" stdout gets
# closed
myprog 2>&1 >&- | cat
}}}
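
The order of the redirections matters, and that is easy to try out. In this sketch a shell function stands in for 'myprog', and command substitution is used instead of a pipe so that the result can be looked at afterwards; the redirections behave the same way in both cases.

```shell
# Hypothetical stand-in for 'myprog': one line on stdout, one on stderr.
myprog() {
    echo "to stdout"
    echo "to stderr" >&2
}

both=$(myprog 2>&1)                  # both streams are captured
only_err=$(myprog 2>&1 >/dev/null)   # stderr only; stdout is discarded
neither=$(myprog >/dev/null 2>&1)    # wrong order for this purpose: nothing

printf 'only_err=<%s> neither=<%s>\n' "$only_err" "$neither"
```

Redirections are processed from left to right: {{{2>&1}}} makes stderr a copy of whatever stdout is at that moment, so a later change to stdout no longer affects it.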




[[Anchor(faq48)]]                                                         
== Why should I never use eval? ==

"eval" is a common misspelling of "evil". The section dealing with spaces in file names used to include the following 
quote "helpful tool (which is probably not as safe as the \0 technique)", end quote. 

{{{
    Syntax : nasty_find_all [path] [command] <maxdepth>
}}}

{{{
    #This code is evil and must never be used
    export IFS=" "
    [ -z "$3" ] && set -- "$1" "$2" 1
    FILES=`find "$1" -maxdepth "$3" -type f -printf "\"%p\" "`
    #warning, evilness
    eval FILES=($FILES)
    for ((I=0; I < ${#FILES[@]}; I++))
    do
        eval "$2 \"${FILES[I]}\""
    done
    unset IFS
}}}

This script is supposed to recursively search for files with newlines and/or spaces in them, arguing that {{{find -print0 | xargs -0}}} was unsuitable for some purposes such as multiple commands. It was followed by an instructional description on all the lines involved, which we'll skip. 

In its defense, it works:
{{{
$ ls -lR
.:
total 8
drwxr-xr-x  2 vidar users 4096 Nov 12 21:51 dir with spaces
-rwxr-xr-x  1 vidar users  248 Nov 12 21:50 nasty_find_all

./dir with spaces:
total 0
-rw-r--r--  1 vidar users 0 Nov 12 21:51 file?with newlines
$ ./nasty_find_all . echo 3
./nasty_find_all
./dir with spaces/file
with newlines
$ 
}}}

But consider this: 
{{{
$ touch "\"); ls -l $'\x2F'; #"
}}}

You just created a file called  {{{ "); ls -l $'\x2F'; #}}}

Now FILES will contain {{{ ""); ls -l $'\x2F'; #}}}. When we do {{{eval FILES=($FILES)}}}, it becomes
{{{
FILES=(""); ls -l $'\x2F'; #"
}}}

Which becomes the two statements {{{ FILES=(""); }}} and {{{ ls -l / }}}. Congratulations, you just allowed execution of arbitrary commands. 

{{{
$ touch "\"); ls -l $'\x2F'; #"
$ ./nasty_find_all . echo 3
total 1052
-rw-r--r--   1 root root 1018530 Apr  6  2005 System.map
drwxr-xr-x   2 root root    4096 Oct 26 22:05 bin
drwxr-xr-x   3 root root    4096 Oct 26 22:05 boot
drwxr-xr-x  17 root root   29500 Nov 12 20:52 dev
drwxr-xr-x  68 root root    4096 Nov 12 20:54 etc
drwxr-xr-x   9 root root    4096 Oct  5 11:37 home
drwxr-xr-x  10 root root    4096 Oct 26 22:05 lib
drwxr-xr-x   2 root root    4096 Nov  4 00:14 lost+found
drwxr-xr-x   6 root root    4096 Nov  4 18:22 mnt
drwxr-xr-x  11 root root    4096 Oct 26 22:05 opt
dr-xr-xr-x  82 root root       0 Nov  4 00:41 proc
drwx------  26 root root    4096 Oct 26 22:05 root
drwxr-xr-x   2 root root    4096 Nov  4 00:34 sbin
drwxr-xr-x   9 root root       0 Nov  4 00:41 sys
drwxrwxrwt   8 root root    4096 Nov 12 21:55 tmp
drwxr-xr-x  15 root root    4096 Oct 26 22:05 usr
drwxr-xr-x  13 root root    4096 Oct 26 22:05 var
./nasty_find_all
./dir with spaces/file
with newlines
./
$
}}}

It doesn't take much imagination to replace {{{ ls -l }}} with {{{ rm -rf }}} or worse. 

One might think these circumstances are obscure, but one should not be tricked by this. All it takes is one malicious user, or perhaps more likely, a benign user who left the terminal unlocked when going to the bathroom, who wrote a funny PHP uploading script that doesn't sanity-check file names, who made the same mistake as oneself in allowing arbitrary code execution (now instead of being limited to the www-user, an attacker can use {{{nasty_find_all}}} to traverse chroot jails and/or gain additional privileges), or who uses an IRC or IM client that's too liberal in the filenames it accepts for file transfers or conversation logs.
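
For comparison, here is a sketch of how the same job could be done without {{{eval}}}, by letting {{{find}}} run the command itself. The name {{{safe_find_all}}} is made up, and unlike the original the command is passed as separate words rather than as one string. {{{-maxdepth}}} is not in POSIX, but the original already depends on GNU find.

```shell
# safe_find_all PATH MAXDEPTH COMMAND [ARGS...]
# Runs COMMAND once per regular file, with the file name as its last argument.
safe_find_all() {
    dir=$1 depth=$2
    shift 2
    # find hands each name to the command as a single argument; the name is
    # never parsed by a shell, so nothing inside it can be executed.
    find "$dir" -maxdepth "$depth" -type f -exec "$@" {} \;
}

# Example: safe_find_all . 3 ls -l
```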

[[Anchor(faq49)]]
== How can I view periodic updates/appends to a file? (ex: growing log file) ==
{{{tail -f}}} will show you the growing log file.  On some systems (e.g. OpenBSD), this will automatically track a rotated log file to the new file with the same name (which is usually what you want).  To get the equivalent functionality on GNU systems, use {{{tail --follow=name}}} instead.

The following is helpful if you need to view only the updates made to the file since you last looked at it.
{{{
# Start by setting n=1
   tail -n $n testfile; n="+$(( $(wc -l < testfile) + 1 ))"
}}}

Every invocation of this gives the update to the file from where we stopped last. If you know the line number from where you want to start, set n to that.

[[Anchor(faq50)]]
== I'm trying to construct a command dynamically, but I can't figure out how to deal with quoted multi-word arguments. ==

Some people attempt to do things like this:
{{{
    # Non-working example
    args="-s 'The subject' $address"
    mail $args < $body
}}}

This fails because of word-splitting.  When {{{$args}}} is evaluated, it becomes four words: {{{'The}}} is the second word, and {{{subject'}}} is the third word.

What's needed is a way to maintain each word as a separate item, even if that word contains multiple spaces.  Quotes won't do it, but an array will.

{{{
    # Working example
    args=(-s "The subject" "$address")
    mail "${args[@]}" < "$body"
}}}

Usually, this question arises when someone is trying to use {{{dialog}}} to construct a menu on the fly.  For an example of how to do this properly, see [#faq40 FAQ #40] above.

[[Anchor(faq51)]]
== I want history-search just like in tcsh. How can I bind it to the up and down keys? ==

Just add the following to /etc/inputrc or your ~/.inputrc:
{{{
"\e[A":history-search-backward
"\e[B":history-search-forward
}}}

[[Anchor(faq52)]]
== How do I convert a file in DOS format to UNIX format (remove CRLF line terminators)? ==

All these are from the sed one-liners page:
{{{
sed 's/.$//' dosfile              # assumes that all lines end with CR/LF
sed 's/^M$//' dosfile             # in bash/tcsh, press Ctrl-V then Ctrl-M
sed 's/\x0D$//' dosfile           # \x escapes are a GNU sed extension
}}}

Some distributions have a ''dos2unix'' command which can do this. In vim, you can use '':set fileformat=unix''
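
If none of those are available, {{{tr}}} can do the job as well. This is a minimal sketch (the function name {{{dos_to_unix}}} is made up); note that it deletes ''every'' carriage return in the file, not only those at the end of a line.

```shell
# Sketch: strip all carriage returns from a file and write the result to stdout.
# dos_to_unix is a hypothetical name used for illustration.
dos_to_unix() {
    tr -d '\r' < "$1"
}
```

Usage would be something like {{{dos_to_unix dosfile > unixfile}}}.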

[[Anchor(faq53)]]
== I have a fancy prompt with colors, and now bash doesn't seem to know how wide my terminal is.  Lines wrap around incorrectly. ==

You must put {{{\[}}} and {{{\]}}} around any non-printing escape sequences in your prompt.  Thus:

{{{
BLUE=$(tput setaf 4)
PURPLE=$(tput setaf 5)
BLACK=$(tput setaf 0)
PS1='\[$BLUE\]\h:\[$PURPLE\]\w\[$BLACK\]\$ '
}}}

Without the {{{\[ \]}}}, bash will assume that the bytes which make up the color escape sequences take up space on the screen, so it will miscalculate where the cursor actually is.
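
Switching back to black at the end of the prompt assumes a light background. A hedged variant, assuming your terminal's terminfo entry provides the {{{sgr0}}} capability, restores the terminal's default attributes instead; the variable name {{{RESET}}} is arbitrary.

```shell
# Sketch of a prompt that resets attributes instead of forcing black text.
# Assumes tput and a terminfo entry with setaf and sgr0.
BLUE=$(tput setaf 4)
PURPLE=$(tput setaf 5)
RESET=$(tput sgr0)
PS1='\[$BLUE\]\h:\[$PURPLE\]\w\[$RESET\]\$ '
```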

[[Anchor(faq54)]]
== How can I tell whether a variable contains a valid number? ==

First, you have to define what you mean by "number".  The most common case seems to be that, when people ask this, they actually mean "a non-negative integer, with no leading + sign".

{{{
if [[ $foo = *[^0-9]* ]]; then
   echo "'$foo' has a non-digit somewhere in it"
else
   echo "'$foo' is strictly numeric"
fi
}}}

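This can be done in legacy Bourne shell as well, using {{{case}}} (note that the portable way to negate a bracket expression in a shell pattern is {{{!}}}, not {{{^}}}):

{{{
case "$foo" in
    *[!0-9]*) echo "'$foo' has a non-digit somewhere in it" ;;
    *) echo "'$foo' is strictly numeric" ;;
esac
}}}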
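
Both tests above report an empty string as "strictly numeric", because an empty string contains no non-digit. If that matters, the empty case can be rejected explicitly. A minimal sketch, with the made-up function name {{{is_uint}}}:

```shell
# Sketch: succeed only for a non-empty string consisting solely of digits.
# is_uint is a hypothetical name used for illustration.
is_uint() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *)           return 0 ;;
    esac
}

if is_uint "$foo"; then
    echo "'$foo' is a non-negative integer"
fi
```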

If what you actually mean is "a valid floating-point number" or something else more complex, then you might prefer to use a regular expression.  Bash version 3 and above have regular expression support in the {{{[[}}} command:

{{{
re='^[-+]?[0-9]+(\.[0-9]+)?$'
if [[ $foo =~ $re ]]; then
    echo "'$foo' looks rather like a number"
else
    echo "'$foo' doesn't look particularly numeric to me"
fi
}}}

If you don't have bash version 3, then you would use {{{egrep}}}:

{{{
if echo "$foo" | egrep '^[-+]?[0-9]+(\.[0-9]+)?$' >/dev/null; then
    echo "'$foo' might be a number"
else
    echo "'$foo' might not be a number"
fi
}}}

Note that the regular expression is the same in both cases.  In the bash version it is stored in a variable and expanded unquoted on the right-hand side of {{{=~}}}, because the quoting rules for a regular expression written inline differ between bash releases: since bash 3.2, quoted parts of the pattern are matched as literal strings.

[[Anchor(faq55)]]
== Tell me all about 2>&1 -- what's the difference between 2>&1 >foo and >foo 2>&1, and when do I use which? ==

Bash processes all redirections from left to right, in order.  And the order is significant.  Moving them around within a command may change the results of that command.

Here's a simple example:

{{{
foo() {
  echo "This is stdout"
  echo "This is stderr" 1>&2
}
foo >/dev/null 2>&1		# produces no output
foo 2>&1 >/dev/null		# writes "This is stderr" on the screen
}}}

Why do the results differ?  In the first case, {{{>/dev/null}}} is performed first, and therefore the standard output of the command is sent to {{{/dev/null}}}.  Then, the {{{2>&1}}} is performed, which causes standard error to be sent to the same place that standard output is ''already'' going.  So both of them are discarded.

In the second example, {{{2>&1}}} is performed first.  This means standard error is sent to wherever standard output happens to be going -- in this case, the user's terminal.  Then, standard output is sent to {{{/dev/null}}} and is therefore discarded.  So when we run {{{foo}}} the second time, we see only its standard error, not its standard output.

There are times when we really do want {{{2>&1}}} to appear first -- for one example of this, see [#faq40 FAQ #40].

There are other times when we may use {{{2>&1}}} without any other redirections.  Consider:

{{{
find ... 2>&1 | grep "some error"
}}}

In this example, we want to search {{{find}}}'s standard error (as well as its standard output) for the string "some error".  The {{{2>&1}}} in the piped command forces standard error to go into the pipe along with standard output.  (When pipes and redirections are mixed in this way, remember: the pipe is done ''first'', before any redirections.  So {{{find}}}'s standard output is already set to point to the pipe before we process the {{{2>&1}}} redirection.)

If we wanted to read ''only'' standard error in the pipe, and discard standard output, we could do it like this:

{{{
find ... 2>&1 >/dev/null | grep "some error"
}}}

The redirections in that example are processed thus:

 1. First, the pipe is created.  {{{find}}}'s output is sent to it.
 1. Next, {{{2>&1}}} causes {{{find}}}'s standard error to go to the pipe as well.
 1. Finally, {{{>/dev/null}}} causes {{{find}}}'s standard output to be discarded, leaving only stderr going into the pipe.
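
A command substitution behaves like the pipe in these steps: the command's standard output is connected to it before any redirections are processed. That makes it easy to check the ordering rules with a short sketch (the variable names are made up for the example):

```shell
foo() {
  echo "This is stdout"
  echo "This is stderr" 1>&2
}

both=$(foo 2>&1)                 # stderr joins stdout: both lines are captured
only_err=$(foo 2>&1 >/dev/null)  # stderr is captured, stdout is discarded
none=$(foo >/dev/null 2>&1)      # both are discarded: nothing is captured

echo "both:     $both"
echo "only_err: $only_err"
echo "none:     $none"
```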

A related question is [#faq59 FAQ #59], which discusses how to send stderr to a pipeline, while leaving stdout unpiped.

[[Anchor(faq56)]]
== How can I untar or unzip multiple tarballs at once? ==

As the {{{tar}}} command was originally designed to read from and write to tape devices (tar - Tape ARchiver), you can specify only filenames to put inside an archive or to extract out of an archive (e.g. {{{tar x myfileonthe.tape}}}). There is an option to tell {{{tar}}} that the archive is not on some tape, but in a file: {{{-f}}}. This option takes exactly one argument: the filename of the file containing the archive. All other (following) filenames are taken to be archive members:
{{{
    tar -x -f backup.tar myfile.txt
    # OR (more common syntax IMHO)
    tar xf backup.tar myfile.txt
}}}

Now here's a common mistake -- imagine a directory containing the following archive-files you want to extract all at once:
{{{
    $ ls
    backup1.tar backup2.tar backup3.tar
}}}

Maybe you think of {{{tar xf *.tar}}}. Let's see:
{{{
    $ tar xf *.tar
    tar: backup2.tar: Not found in archive
    tar: backup3.tar: Not found in archive
    tar: Error exit delayed from previous errors
}}}

What happened? The shell replaced your *.tar by the matching filenames.  You really wrote:
{{{
    tar xf backup1.tar backup2.tar backup3.tar
}}}

And as we saw earlier, it means: "extract the files backup2.tar and backup3.tar from the archive backup1.tar", which will of course only succeed when there are such filenames stored in the archive.

The solution is relatively easy: extract the contents of all archives '''one at a time'''. As we use a UNIX shell and we are lazy, we do that with a loop:
{{{
    for tarname in *.tar; do
      tar xf "$tarname"
    done
}}}

What happens? The for-loop will iterate through all filenames matching {{{*.tar}}} and call {{{tar xf}}} for each of them. That way you extract all archives one-by-one and you even do it automagically.
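
A slightly more defensive variant of the loop is sketched below. It skips the loop body when no file matches, and unpacks each archive into a directory named after it so that the archives can't overwrite each other's files. The function name {{{extract_all}}} is made up, and the sketch assumes a {{{tar}}} that supports {{{-C}}} (GNU and BSD tar do).

```shell
# Sketch: extract every *.tar in the current directory into its own directory.
# extract_all is a hypothetical name used for illustration.
extract_all() {
    for tarname in *.tar; do
        [ -e "$tarname" ] || continue      # the glob matched nothing
        dir=${tarname%.tar}                # backup1.tar -> backup1
        mkdir -p "$dir" && tar -xf "$tarname" -C "$dir"
    done
}
```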

The second common archive type these days is ZIP. The command to extract contents from a ZIP file is {{{unzip}}} (who would have guessed that!). The problem here is the very same: {{{unzip}}} takes only one argument specifying the ZIP file. So, you solve it the very same way:
{{{
    for zipfile in *.zip; do
      unzip "$zipfile"
    done
}}}

Not enough? OK. There's another option with {{{unzip}}}: it can take shell-like patterns to specify the ZIP file names. To avoid interpretation of those patterns by the shell, you need to quote them. {{{unzip}}} itself and '''not''' the shell will interpret {{{*.zip}}} in this case:
{{{
    unzip "*.zip"
    # OR, to make more clear what we do:
    unzip \*.zip
}}}

(This feature of {{{unzip}}} derives mainly from its origins as an MS-DOS program.  MS-DOS's command interpreter does not perform glob expansions, so every MS-DOS program must be able to expand wildcards into a list of filenames.  This feature was left in the Unix version, and as we just demonstrated, it can occasionally be useful.)

[[Anchor(faq57)]]
== How can I group entries in a file by common prefixes? ==
For example, to convert:
{{{
    foo: entry1
    bar: entry2
    foo: entry3
    baz: entry4
}}}
to
{{{
    foo: entry1 entry3
    bar: entry2
    baz: entry4
}}}

There are two simple general methods for this:
 a. sort the file, and then iterate over it, collecting entries until the prefix changes, and then print the collected entries with the previous prefix
 b. iterate over the file, collecting the entries for each prefix in an array indexed by the prefix

A basic implementation of a) in bash:
{{{
old= ; stuff=
# the final echo appends an empty line that flushes the last group
(sort file ; echo) | while read -r prefix line ; do
 if [[ $prefix = "$old" ]] ; then
  stuff="$stuff $line"
 else
  if [[ -n $old ]] ; then echo "$old $stuff" ; fi
  old=$prefix
  stuff=$line
 fi
done
}}}

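And a basic implementation of b) in awk.  Each collected value already starts with a space, so the prefix and the value are printed without an extra separator.  The order in which {{{for (x in a)}}} visits the prefixes is unspecified.
{{{
    {
        a[$1] = a[$1] " " $2
    }
    END{
        for (x in a) print x a[x]
    }
}}}
Usage:
{{{
    awk '{a[$1] = a[$1] " " $2}END{for (x in a) print x a[x]}' file
}}}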
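
If the groups should appear in the order in which each prefix was first seen, as in the desired output at the top of this entry, the awk approach can keep a second array that records that order. A minimal sketch (the function name {{{group_by_prefix}}} is made up):

```shell
# Sketch: group entries by prefix, keeping prefixes in first-seen order.
# group_by_prefix is a hypothetical name used for illustration.
group_by_prefix() {
    awk '
        !($1 in a) { order[++n] = $1 }     # remember each new prefix once
                   { a[$1] = a[$1] " " $2 }
        END        { for (i = 1; i <= n; i++) print order[i] a[order[i]] }
    ' "$@"
}
```

Usage: {{{group_by_prefix file}}}.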

[[Anchor(faq58)]]
== Can bash handle binary data? ==
The answer is, basically, no.
While bash won't have as many problems with it as older shells, it still can't process arbitrary binary data. More specifically, shell variables are not 100% binary clean, so you can't store binary files in them.
One instance where this would sometimes be handy is storing small temporary bitmaps while working with netpbm. Here I resorted to adding an extra {{{pnmnoraw}}} to the pipe, creating (larger) ASCII files that bash has no problems storing.

If you are feeling adventurous, consider this experiment:
{{{
    # bindec.bash, attempt to decode binary data to ascii decimals
    while read -n1 x ;do
        case "$x" in
            '') echo empty ;;
            # insert the 256 lines generated by the following oneliner here:
            # for x in $(seq 0 255) ;do echo "        $'\\$(printf %o $x)') echo $x;;" ;done
        esac
    done
}}}
And then pipe binary data into it, maybe like so:
{{{
    for x in $(seq 0 255) ;do echo -ne "\\$(printf %o $x)" ;done | bash bindec.bash | nl | less
}}}
This suggests that the 0 character is skipped entirely, because we can't create it with the input generation, and that a few others are read as empty strings, giving us this list of binary values bash doesn't like:
{{{
    0, 1, 8, 9 (decimal)
}}}
That is enough to conveniently corrupt most binary files we try to process.

(Note that this refers to storing binary data in variables. Moving data between programs using pipes is always binary clean.)
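
The NUL byte problem is easy to reproduce, and the usual workaround is the same as the {{{pnmnoraw}}} trick: store a text encoding of the data instead of the raw bytes. The sketch below assumes bash and a standard {{{od}}}; the variable names are made up for the example.

```shell
# Command substitution drops NUL bytes, so the variable holds only "ab".
# (bash 4.4 and later also print a warning, which is silenced here.)
{ raw=$(printf 'a\0b'); } 2>/dev/null
echo "${#raw}"        # 2, not 3

# Storing a hex dump instead keeps every byte, at the cost of size.
hex=$(printf 'a\0b' | od -An -tx1 | tr -d ' \n')
echo "$hex"           # 610062
```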

[[Anchor(faq59)]]
== I'd like to pipe stderr only but keep stdout intact. ==
This has an obvious application with e.g. dialog, which draws windows (using ncurses) onto the screen via stdout, and returns its output on stderr. This may be a little inconvenient, because it may lead to a temporary file which we would like to avoid.  (Although this is not necessary -- see [#faq40 FAQ #40] for more examples of using dialog specifically!)

On [http://www.tldp.org/LDP/abs/html/io-redirection.html TLDP], I've found the following trick:
{{{
# Redirecting only stderr to a pipe.

exec 3>&1                              # Save current "value" of stdout.
ls -l 2>&1 >&3 3>&- | grep bad 3>&-    # Close fd 3 for 'grep' (but not 'ls').
#              ^^^^   ^^^^
exec 3>&-                              # Now close it for the remainder of the script.

# Thanks, S.C.
}}}

To show it as a dialog one-liner:
{{{
exec 3>&1
dialog --menu Title 0 0 0 FirstItem FirstDescription 2>&1 >&3 3>&- | sed 's/First/Only/'
exec 3>&-
}}}

This keeps the dialog window working properly, while the output of dialog (which it returns on stderr) is what gets altered by {{{sed}}}.
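
The same redirections can be tried without dialog. The sketch below uses a command group instead of {{{exec}}}, so that file descriptor 3 exists only for the duration of the group; {{{emit}}} and {{{pipe_stderr_only}}} are made-up names for the example.

```shell
# Hypothetical command that writes one line to each stream.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Sketch: pipe only stderr through sed, leaving stdout untouched.
pipe_stderr_only() {
    {
        # Inside the group, fd 3 is a copy of the original stdout.
        emit 2>&1 >&3 3>&- | sed 's/^/filtered: /' 3>&-
    } 3>&1
}

pipe_stderr_only
```

Only the stderr line passes through {{{sed}}} and gains the prefix. The relative order of the two lines on the terminal is not guaranteed, since they travel by different routes.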

[[Anchor(faq60)]]
== I'm trying to write a script that will change directory, but after the script finishes, I'm back where I started! ==

Consider this:

{{{
   #!/bin/sh
   cd /tmp
}}}

If one executes this simple script, what happens?  Bash forks, and the parent waits.  The child executes the script, including the {{{chdir(2)}}} system call, and then exits.  The parent, which was waiting for the child, harvests the child's exit status (presumably 0 for success), and then bash carries on with the next command.

Since the {{{chdir}}} was done by a child process, it has no effect on the parent.

Moreover, there is '''no conceivable way''' you can ''ever'' have a child process affect ''any'' part of the parent's environment, which includes its variables as well as its current working directory.

So, how does one go about it?  You can still have the {{{cd}}} command in an external file, but you can't ''run it'' as a script.  Instead, you must {{{source}}} it (or "dot it in", using the {{{.}}} command, which is a synonym for {{{source}}}).

{{{
   echo 'cd /tmp' > "$HOME/mycd"
   source "$HOME/mycd"
   pwd				# Now, we're in /tmp
}}}
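
A shell function is another way to get the same effect, because a function runs inside the current shell rather than in a child process. The sketch below contrasts a subshell with a function; {{{go_root}}} and the variable names are made up for the example.

```shell
start=$PWD

( cd / )                 # subshell: a child process changes directory, then exits
after_subshell=$PWD      # still the starting directory

go_root() { cd /; }      # hypothetical function; runs in the current shell
go_root
after_function=$PWD      # now /

echo "start:          $start"
echo "after subshell: $after_subshell"
echo "after function: $after_function"
```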

[[Anchor(faq61)]]
== Is there a list of which features were added to specific releases of Bash? ==

  * [http://cnswww.cns.cwru.edu/~chet/bash/NEWS NEWS]: a file tersely listing the notable changes between the current and previous versions
  * [http://cnswww.cns.cwru.edu/~chet/bash/CHANGES CHANGES]: a complete bash change history
  * [http://cnswww.cns.cwru.edu/~chet/bash/COMPAT COMPAT]: compatibility issues between bash3 and previous versions
