Differences between revisions 164 and 216 (spanning 52 versions)
Revision 164 as of 2007-02-03 19:34:48
Size: 114335
Editor: h69-11-204-58
Comment:
Revision 216 as of 2007-04-26 20:18:39
Size: 129871
Editor: GreyCat
Comment: merge #5 into #30; #6 -> #16; #43&44 -> #5&6
Deletions are marked like this. Additions are marked like this.
Line 11: Line 11:
If you can't find the answer you're looking for here, try ["BashPitfalls"].
Line 52: Line 53:
The {{{read}}} command modifies each line read, e.g. it removes all leading whitespace characters (blanks, tab characters). If that is not desired, the {{{IFS}}} variable has to be cleared: The {{{read}}} command modifies each line read, e.g. by default it removes all leading whitespace characters (blanks, tab characters, ... -- basically any leading characters present in IFS). If that is not desired, the {{{IFS}}} variable has to be cleared:
Line 74: Line 75:
Note that reading a file line by line this way is ''very slow'' for large files. Consider using e.g. ["AWK"] instead if you get performance problems. '''Note that reading a file line by line this way is ''very slow'' for large files. Consider using e.g. ["AWK"] instead if you get performance problems.'''
Line 84: Line 86:
That may cause problems later on if the commands inside the body of the loop attempt to set variables which need to be used outside the loop; in that case, see [#faq24 FAQ 24], or use process substitution like: This method is especially useful for processing the output of ''find'' with a block of commands:

{{{
    find . -print0 | while IFS= read -r -d $'\0' file; do
        mv "$file" "${file// /_}"
    done
}}}

This command reads one filename at a time from the find command and renames the file so that its spaces are replaced by underscores.
Note the usage of ''-print0'' in the find command, which uses ''NULL bytes'' as filename delimiters, and ''-d $'\0' ''in the read command to instruct it to read all text into the file variable until it finds a NULL byte. By default, find and read delimit their input with newlines; however, since filenames can potentially contain newlines themselves, this default behaviour will split those filenames with newlines up and cause the command block to fail. Filenames can never contain NULL bytes.
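A minimal sketch of the same idea, assuming a {{{find}}} that supports {{{-print0}}} and a {{{mktemp}}} that supports {{{-d}}}. The scratch directory and the file names are made up for illustration. The loop reads from a process substitution instead of a pipe, so the counter is still set after the loop ends:

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory holding awkward file names.
dir=$(mktemp -d)
touch "$dir/plain" "$dir/with space" "$dir/with"$'\n'"newline"

count=0
# IFS= keeps leading/trailing whitespace; -r keeps backslashes literal;
# -d '' makes read stop at each NUL byte instead of each newline.
while IFS= read -r -d '' file; do
    count=$((count + 1))
done < <(find "$dir" -type f -print0)

echo "found $count files"
rm -rf "$dir"
```

A newline-delimited loop would have counted the third file twice.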


Using a pipe to send find's output into a while loop places the loop in a ''subshell'' and may therefore cause problems later on if the commands inside the body of the loop attempt to set variables which need to be used outside the loop; in that case, see [#faq24 FAQ 24], or use process substitution like:
Line 150: Line 164:
What if you want the exit status of one particular command in a pipeline? Use the {{{PIPESTATUS}}} array (BASH only). Say you want the exit status of {{{grep}}} in the following:

{{{
    grep foo somelogfile | head -5
    result=${PIPESTATUS[0]}
}}}
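Because {{{PIPESTATUS}}} is overwritten by every command that runs, it is usually safest to copy the whole array in one go. A small sketch (the three-stage pipeline is contrived purely to produce known exit codes):

```shell
#!/usr/bin/env bash
false | (exit 3) | true
status=("${PIPESTATUS[@]}")    # copy immediately; the next command resets it
echo "exit codes: ${status[*]}"
```

Here {{{status[0]}}} belongs to {{{false}}}, {{{status[1]}}} to the subshell, and {{{status[2]}}} to {{{true}}}.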

Now, some trickier stuff. Let's say you want ''only'' the stderr, but not stdout. Well, then first you have to decide where you ''do'' want stdout to go:

{{{
    var=$(command 2>&1 >/dev/null) # Save stderr, discard stdout.
    var=$(command 2>&1 >/dev/tty) # Save stderr, send stdout to the terminal.
}}}

It's possible, although considerably harder, to let stdout "fall through" to wherever it would've gone if there hadn't been any redirection. This involves "saving" the current value of stdout, so that it can be used inside the command substitution:

{{{
    exec 3>&1 # Save the place that stdout (1) points to.
    var=$(command 2>&1 1>&3) # Run command. stderr is captured.
    exec 3>&- # Close FD #3.
}}}

What you ''cannot'' do is capture stdout in one variable, and stderr in another, using only FD redirections. You must use a temporary file to achieve that one.
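A sketch of the temporary-file approach, assuming {{{mktemp}}} is available. The function {{{cmd}}} is a made-up stand-in for whatever command is being run:

```shell
#!/usr/bin/env bash
# Hypothetical command that writes to both streams.
cmd() { echo "to stdout"; echo "to stderr" >&2; }

errfile=$(mktemp)          # temporary file that receives stderr
out=$(cmd 2>"$errfile")    # stdout is captured by the command substitution
err=$(<"$errfile")         # read stderr back from the file
rm -f "$errfile"

echo "stdout was: $out"
echo "stderr was: $err"
```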
Line 162: Line 200:

For more examples of sed 1-liners, see [http://www.student.northpark.edu/pemente/sed/sed1line.txt sed 1-liners] or [http://sed.sourceforge.net/sedfaq.html the sed FAQ].
Line 175: Line 215:
Line 237: Line 276:
(You may want to use the latter anyway, if there's a possibility that the glob  may match directories in addition to files.) (You may want to use the latter anyway, if there's a possibility that the glob may match directories in addition to files.)
Line 240: Line 279:
== How can I convert all upper-case file names to lower case? ==
{{{
# tolower - convert file names to lower case

for file in *
do
    [ -f "$file" ] || continue # ignore non-existing names
    newname=$(echo "$file" | tr 'A-Z' 'a-z') # lower-case version of file name
    [ "$file" = "$newname" ] && continue # nothing to do
    [ -f "$newname" ] && continue # do not overwrite existing files
    mv "$file" "$newname"
done
}}}

Purists will insist on using
{{{
tr '[:upper:]' '[:lower:]'
}}}
in the above code, in case of non-ASCII (e.g. accented) letters in locales which have them.

This technique can also be used to replace all unwanted characters in a file name e.g. with '_' (underscore). The script is the same as above, only the "newname=..." line has changed.

{{{
# renamefiles - rename files whose name contain unusual characters
for file in *
do
    [ -f "$file" ] || continue # ignore non-existing names
    newname=$(echo "$file" | sed 's/[^a-zA-Z0-9_.]/_/g')
    [ "$file" = "$newname" ] && continue # nothing to do
    [ -f "$newname" ] && continue # do not overwrite existing files
    mv "$file" "$newname"
done
}}}

The character class in {{{[]}}} contains all allowed characters; modify it as needed.

If you have the utility "mmv" on your machine, you could simply do

{{{
mmv "*" "#l1"
}}}
== How can I use array variables? ==

BASH and KornShell already have one-dimensional arrays indexed by a numerical expression, e.g.

 {{{
 host[0]="micky"
 host[1]="minnie"
 host[2]="goofy"
 i=0
 while (($i < ${#host[@]} ))
 do
     echo "host number $i is ${host[i++]}"
 done}}}

The awkward expression {{{ ${#host[@]} }}} returns the number of elements for the array {{{host}}}. Also noteworthy is the fact that inside the square brackets, {{{i++}}} works as a C programmer would expect. The square brackets in an array reference force an ArithmeticExpression.

It's possible to assign multiple values to an array at once, but the syntax differs from BASH to KornShell:

 {{{
 # BASH
 array=(one two three four)
 # KornShell
 set -A array -- one two three four}}}

Using array elements ''en masse'' is one of the key features. Much like {{{"$@"}}} for the positional parameters, {{{"${arr[@]}"}}} expands the array to a list of words, one array element per word, even if the words contain internal whitespace. For example,

 {{{
 for x in "${arr[@]}"; do
   echo "next element is '$x'"
 done}}}
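The effect of the quotes can be seen by counting words. In this sketch {{{count_args}}} is a made-up helper and the default {{{IFS}}} is assumed:

```shell
#!/usr/bin/env bash
count_args() { echo "$#"; }   # hypothetical helper: prints its argument count

arr=("one two" "three")
quoted=$(count_args "${arr[@]}")
unquoted=$(count_args ${arr[@]})   # deliberately unquoted to show word splitting

echo "quoted: $quoted, unquoted: $unquoted"
```

The quoted form yields one word per element; the unquoted form splits {{{one two}}} into two words.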

If one simply wants to dump the full array, {{{"${arr[*]}"}}} will cause the elements to be concatenated together, with the first character of {{{IFS}}} (a space by default) between them.

 {{{
 arr=(x y z)
 IFS=/; echo "${arr[*]}"; unset IFS
 # prints x/y/z}}}
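Changing {{{IFS}}} globally is easy to get wrong. One possible sketch wraps the join in a function so that the change stays local to it; the name {{{join_by}}} is hypothetical:

```shell
#!/usr/bin/env bash
join_by() {
    local IFS=$1   # only the first character is used as the separator
    shift
    echo "$*"
}

arr=(x y z)
joined=$(join_by / "${arr[@]}")
echo "$joined"
```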

BASH's arrays are also ''sparse''. Elements may be added and deleted out of sequence.

 {{{
 arr=(0 1 2 3)
 arr[42]="what was the question?"
 unset 'arr[2]'
 echo "${arr[*]}"
 # prints 0 1 3 what was the question?}}}

BASH 3.0 added the ability to retrieve the list of index values in an array, rather than just iterating over the elements:

 {{{
 echo ${!arr[*]}
 # using the previous array, prints 0 1 3 42}}}
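Iterating over the indices rather than the values is the usual way to walk a sparse array. A short sketch, assuming Bash 3.0 or newer:

```shell
#!/usr/bin/env bash
arr=(0 1 2 3)
arr[42]="what was the question?"
unset 'arr[2]'

for i in "${!arr[@]}"; do
    printf 'arr[%s] = %s\n' "$i" "${arr[i]}"
done
echo "number of elements: ${#arr[@]}"
```

Note that the element count is 4 even though the highest index is 42.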

[#faq73 Parameter Expansions] may be performed on array elements ''en masse'' as well:

 {{{
 arr=(abc def ghi jkl)
 echo "${arr[@]#?}" # prints bc ef hi kl
 echo "${arr[@]/[aeiou]/}" # prints bc df gh jkl
 }}}

For examples of loading data into arrays, see [#faq1 FAQ #1]. For examples of using arrays to hold complex shell commands, see [#faq50 FAQ #50] and [#faq40 FAQ #40].
Line 284: Line 343:
== How can I use a logical AND in a shell pattern (glob)? ==
That can be achieved through the !() extglob operator. You'll need {{{extglob}}} set. It can be checked with:
{{{
$ shopt extglob
}}}

and set with:
{{{
$ shopt -s extglob
}}}

To warm up, we'll move all files starting with foo AND not ending with .d to directory foo_thursday.d:
{{{
$ mv foo!(*.d) foo_thursday.d
}}}

For the general case:

Delete all files containing Pink_Floyd AND not containing The_Final_Cut:

{{{
$ rm !(!(*Pink_Floyd*)|*The_Final_Cut*)
}}}

By the way: these kinds of patterns can be used with KornShell and KornShell93, too. They don't have to be enabled there, but are the default patterns.
== How can I use associative arrays or variable variables? ==

Sometimes it's convenient to have associative arrays, arrays indexed by a string. Perl calls them "hashes". KornShell93 already supports this kind of array:

 {{{
 # KornShell93 script - does not work with BASH
 typeset -A homedir # Declare KornShell93 associative array
 homedir[jim]=/home/jim
 homedir[silvia]=/home/silvia
 homedir[alex]=/home/alex
 
 for user in ${!homedir[@]} # Enumerate all indices (user names)
 do
     echo "Home directory of user $user is ${homedir[$user]}"
 done}}}

BASH (including version 3.x) does not (yet) support them. However, we could simulate this kind of array by dynamically creating variables like in the following example:

 {{{
 for user in jim silvia alex
 do
     eval homedir_$user=/home/$user
 done}}}

This creates the variables

 {{{
 homedir_jim=/home/jim
 homedir_silvia=/home/silvia
 homedir_alex=/home/alex}}}

with the corresponding content. Note the use of the {{{eval}}} command, which interprets a command line not just one time like the shell usually does, but '''twice'''. In the first step, the shell uses the input {{{homedir_$user=/home/$user}}} to create a new line {{{homedir_jim=/home/jim}}}. In the second step, caused by {{{eval}}}, this variable assignment is executed, actually creating the variable.

Print the variables using

 {{{
 for user in jim silvia alex
 do
     varname=homedir_$user # e.g. "homedir_jim"
     eval varcontent='$'$varname # e.g. "/home/jim"
     echo "home directory of $user is $varcontent"
 done}}}

The {{{eval}}} line needs some explanation. In a first step the shell performs its normal variable expansion:

 {{{
 eval varcontent='$'$varname}}}

becomes

 {{{
 eval varcontent=$homedir_jim}}}

In a second step the {{{eval}}} re-evaluates the line, and converts this to

 {{{
 varcontent=/home/jim}}}
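For ''reading'' such a variable, Bash also offers indirect expansion, which avoids the second {{{eval}}}. This is a sketch for Bash only; the other shells mentioned here may not support this syntax in the same way:

```shell
#!/usr/bin/env bash
homedir_jim=/home/jim        # as created by the assignment loop earlier
user=jim
varname=homedir_$user        # e.g. "homedir_jim"
varcontent=${!varname}       # indirect expansion: value of the variable named by $varname
echo "home directory of $user is $varcontent"
```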

Before starting to use dynamically created variables, think again of a simpler approach. If it still seems to be the best thing to do, have a look at the following disadvantages:

 1. it's hard to read and to maintain
 1. the variable names must match the regular expression ^[a-zA-Z_][a-zA-Z_0-9]* , i.e. a variable name cannot contain arbitrary characters but only letters, digits, and underscores. In the example above we e.g. could not have processed the home directory of a user named {{{hong-hu}}}, because a dash '-' cannot be part of a variable name.
 1. Quoting is hard to get right. If a content (not variable name) string can contain whitespace characters, it's hard to quote it right to preserve it.

Here is the summary. "{{{var}}}" is a constant prefix, "{{{$index}}}" contains index string, "{{{$content}}}" is the string to store. Note that quoting is absolutely essential here. A missing backslash \ or a wrong type of quote (e.g. apostrophes '...' instead of quotation marks "...") can (and probably will) cause the examples to fail:

 * Set variables

  {{{
  eval "var$index=\"$content\"" # index must only contain characters from [a-zA-Z0-9_]}}}

 * Print variable content

  {{{
  eval "echo \"var$index=\$var$index\""}}}

 * Check if a variable is empty

  {{{
  if eval "[ -z \"\$var$index\" ]"
  then echo "variable is empty: var$index"
  fi}}}

You've seen the examples. Now maybe you can go a step back and consider using AWK associative arrays, or a multi-line environment variable instead of dynamically created variables.
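Bash releases newer than the ones discussed here (4.0 and later) do provide associative arrays through {{{declare -A}}}. A sketch, assuming such a Bash is available:

```shell
#!/usr/bin/env bash
# Requires Bash 4.0 or newer.
declare -A homedir
homedir[jim]=/home/jim
homedir[silvia]=/home/silvia
homedir[hong-hu]=/home/hong-hu   # keys are plain strings, so a dash is fine

for user in "${!homedir[@]}"; do
    echo "Home directory of user $user is ${homedir[$user]}"
done
```

The order in which the keys are enumerated is not specified.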
Line 375: Line 493:
== My command line produces no output: tail -f logfile | grep 'ssh' == == My command line produces no output: tail -f logfile | grep 'foo bar' ==
Line 393: Line 511:
    unbuffer tail -f logfile | grep 'ssh'     unbuffer tail -f logfile | grep 'foo bar'
Line 412: Line 530:
   /ssh    /foo bar
Line 444: Line 562:
Another way, more obvious to some, is to grab the last line from a listing of the first n lines:
{{{
   head -n "$n" "$file" | tail -n 1
}}}

Using awk:
{{{
   awk -v n="$n" 'NR==n{print;exit}' "$file"
}}}
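Both approaches can be tried side by side. In this sketch the sample file is created with {{{mktemp}}} purely for illustration, and the line number is handed to awk with {{{-v}}} because awk cannot see shell variables directly:

```shell
#!/usr/bin/env bash
file=$(mktemp)                               # hypothetical sample file
printf '%s\n' alpha beta gamma delta > "$file"
n=3

via_head=$(head -n "$n" "$file" | tail -n 1)
via_awk=$(awk -v n="$n" 'NR==n { print; exit }' "$file")

echo "head|tail: $via_head"
echo "awk:       $via_awk"
rm -f "$file"
```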
Line 451: Line 579:
== How can I concatenate two variables? == == How can I concatenate two variables?  How do I append a string to a variable? ==
Line 464: Line 592:
Braces can be used to disambiguate the right-hand side: If you're appending a string that doesn't "look like" part of a variable name, you just smoosh it all together:

{{{
    var=$var1/.-
}}}

Otherwise, braces or quotes may be used to disambiguate the right-hand side:
Line 468: Line 602:
    # without braces, var1xyzzy would be interpreted as a variable name
    # Another equivalent way would be:
    # Without braces, var1xyzzy would be interpreted as a variable name
Line 471: Line 605:
    # Alternative syntax
Line 479: Line 614:
Appending data to the end of a string doesn't require any black magic, either. There's no difference when the variable name is reused, either:
Line 568: Line 703:
== How can I remove a file name extension from a string, e.g. file.tar to file? ==
The easiest (and fastest) way is to use the following:

{{{
    $ name="file.tar"
    $ echo "${name%.tar}"
    file
}}}

The {{{${var%pattern}}}} syntax removes the pattern from the end of the variable. {{{${var#pattern}}}} would remove pattern from the start of the string. This could be used to rename all files from "*.doc" to "*.txt":

{{{
    for file in *.doc
    do
        mv "$file" "${file%.doc}".txt
    done
}}}

There's more to ParameterSubstitution, e.g. {{{${var%%pattern}, ${var##pattern}, ${var//old/new}}}}.

Note that this extended form of ParameterSubstitution works with ["BASH"], KornShell, KornShell93, but not with the older BourneShell. If the code needs to be portable to that shell as well, {{{sed}}} could be used to remove the filename extension part:

{{{
    for file in *.doc
    do
        base=`echo "$file" | sed 's/\.[^.]*$//'` # remove everything starting with last '.'
        mv "$file" "$base".txt
    done
}}}

Finally, some GNU/Linux/BSD systems offer a {{{rename}}} command. There are multiple different {{{rename}}} commands out there with contradictory syntaxes. Consult your man pages to see which one you have (if any).
== How can I use a logical AND in a shell pattern (glob)? ==
That can be achieved through the !() extglob operator. You'll need {{{extglob}}} set. It can be checked with:
{{{
$ shopt extglob
}}}

and set with:
{{{
$ shopt -s extglob
}}}

To warm up, we'll move all files starting with foo AND not ending with .d to directory foo_thursday.d:
{{{
$ mv foo!(*.d) foo_thursday.d
}}}

For the general case:

Delete all files containing Pink_Floyd AND not containing The_Final_Cut:

{{{
$ rm !(!(*Pink_Floyd*)|*The_Final_Cut*)
}}}
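Before handing such a pattern to {{{rm}}}, it can be checked against a few names with {{{[[ ]]}}}. A sketch; the helper {{{wanted}}} and the file names are invented for the test:

```shell
#!/usr/bin/env bash
shopt -s extglob

# Hypothetical helper: succeeds if the name contains Pink_Floyd
# AND does not contain The_Final_Cut.
wanted() {
    [[ $1 == !(!(*Pink_Floyd*)|*The_Final_Cut*) ]]
}

for name in Pink_Floyd-Animals.ogg Pink_Floyd-The_Final_Cut.ogg Genesis-Duke.ogg; do
    if wanted "$name"; then
        echo "match:    $name"
    else
        echo "no match: $name"
    fi
done
```

Only the first name should be reported as a match.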

By the way: these kinds of patterns can be used with KornShell and KornShell93, too. They don't have to be enabled there, but are the default patterns.
Line 727: Line 856:

I found that on bash 2 you can nest seq in back ticks and this will work as well.

{{{
printf "%03d \n" `seq 300`
}}}
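{{{seq}}} is not a standard tool on every system. Where it is missing, a sketch using Bash's arithmetic {{{for}}} loop avoids the external command altogether ({{{printf -v}}} assumes Bash 3.1 or newer):

```shell
#!/usr/bin/env bash
for ((i = 1; i <= 5; i++)); do
    printf -v padded '%03d' "$i"   # store the zero-padded number in $padded
    echo "$padded"
done
```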
Line 900: Line 1035:
        then rm "$file"-new # file nas not changed         then rm "$file"-new # file has not changed
Line 939: Line 1074:
If you are trying to compare floating point numbers, be aware that a simple ''x < y'' is not supported by all versions of {{{bc}}}. Alternatively, you could use this:

{{{
    if [[ $(bc <<< "1.4 - 2.5") = -* ]]; then
        echo "1.4 is less than 2.5."
    fi
}}}

This example subtracts 2.5 from 1.4, and checks the sign of the result. If it is negative, the former number is less than the latter.
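If {{{bc}}} is not installed, a similar test can be sketched with awk, which compares numeric-looking values numerically. The function name {{{float_lt}}} is made up for this example:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if $1 is numerically less than $2.
float_lt() {
    awk -v a="$1" -v b="$2" 'BEGIN { exit !(a < b) }'
}

if float_lt 1.4 2.5; then
    echo "1.4 is less than 2.5."
fi
```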
Line 953: Line 1098:
== How do I append a string to the contents of a variable? ==
The shell doesn't have a string concatenation operator like Java ("+") or Perl ("."). The following example shows how to append the string ".2004-08-15" to the contents of the shell variable {{{filename}}}:

{{{
    filename="$filename.2004-08-15"
}}}

If the variable name and the string to append could be confused, the variable name can be enclosed in braces, e.g.

{{{
    filename="${filename}old"
}}}

instead of {{{filename=$filenameold}}}
== Removed. ==
Line 1070: Line 1202:
Use {{{${10}}}} instead of {{{$10}}}. This works for ["BASH"] and KornShell, but not for older BourneShell implementations. Another way to access arbitrary positional parameters after $9 is to use {{{for}}}, e.g. to get the last parameter: Use {{{${10} }}}instead of {{{$10}}}. This works for ["BASH"] and KornShell, but not for older BourneShell implementations. Another way to access arbitrary positional parameters after $9 is to use {{{for}}}, e.g. to get the last parameter:
Line 1108: Line 1240:
== How can I randomize (shuffle) the order of lines in a file? == == How can I randomize (shuffle) the order of lines in a file?  (Or select a random line from a file, or select a random file from a directory.) ==
Line 1256: Line 1388:
Or a shell-independent variant (needs a {{{readlink}}} supporting {{{-f}}}, though)

{{{
  readlink -f "$0";
}}}
Line 1341: Line 1479:
== How can I rename all my *.foo files to *.bar? ==
Some GNU/Linux distributions have a rename command, which you can use for this purpose; however, the syntax differs from one distribution to the next, so it's not a portable answer.

You can do it in POSIX shells like this:
== How can I rename all my *.foo files to *.bar?  How can I convert all upper-case file names to lower case? ==

Some GNU/Linux distributions have a rename command, which you can use for the former; however, the syntax differs from one distribution to the next, so it's not a portable answer.  (Consult your system's man pages if you want to learn how to use yours, if you have one at all. It's often perfectly good for one-shot interactive renames, just not in portable scripts.)

You can do mass renames in POSIX shells like this:
Line 1358: Line 1497:
}}}

To convert filenames to lower case:

{{{
# tolower - convert file names to lower case

for file in *
do
    [ -f "$file" ] || continue # ignore non-existing names
    newname=$(echo "$file" | tr 'A-Z' 'a-z') # lower-case version of file name
    [ "$file" = "$newname" ] && continue # nothing to do
    [ -f "$newname" ] && continue # do not overwrite existing files
    mv "$file" "$newname"
done
}}}

Purists will insist on using
{{{
tr '[:upper:]' '[:lower:]'
}}}
in the above code, in case of non-ASCII (e.g. accented) letters in locales which have them. Note that {{{tr}}} can behave ''very'' strangely when using the {{{A-Z}}} range on some systems:

{{{
imadev:~$ echo Hello | tr A-Z a-z
hÉMMÓ
}}}

To make sure you aren't caught by surprise when using {{{tr}}}, either use the fancy range notations, or set your locale to C.

{{{
imadev:~$ echo Hello | LC_ALL=C tr A-Z a-z
hello
imadev:~$ echo Hello | tr '[:upper:]' '[:lower:]'
hello
# Either way is fine here.
}}}

This technique can also be used to replace all unwanted characters in a file name e.g. with '_' (underscore). The script is the same as above, only the "newname=..." line has changed.

{{{
# renamefiles - rename files whose name contain unusual characters
for file in *
do
    [ -f "$file" ] || continue # ignore non-existing names
    newname=$(echo "$file" | sed 's/[^a-zA-Z0-9_.]/_/g')
    [ "$file" = "$newname" ] && continue # nothing to do
    [ -f "$newname" ] && continue # do not overwrite existing files
    mv "$file" "$newname"
done
}}}

The character class in {{{[]}}} contains all allowed characters; modify it as needed.
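In Bash itself the same substitutions can be sketched without {{{tr}}} or {{{sed}}}, using parameter expansion. The lower-casing form assumes Bash 4.0 or newer, and the sample file name is made up:

```shell
#!/usr/bin/env bash
file='My Report (final).TXT'       # hypothetical file name

lower=${file,,}                    # lower-case version (Bash 4.0+)
clean=${file//[^a-zA-Z0-9_.]/_}    # every disallowed character becomes _

echo "$lower"
echo "$clean"
```

Either result could then be used as {{{newname}}} in the loops above.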

If you have the utility "mmv" on your machine, you could simply do

{{{
mmv "*" "#l1"
Line 1567: Line 1764:

To use as a function called from a loop on every iteration, for example:
{{{
sp="/-\|"
sc=0
spin() {
   echo -ne "\b${sp:sc++:1}"
   ((sc==4)) && sc=0
}
}}}
When printing the next output line (i.e. when the spin is over) use: {{{ echo -e "\r$line" }}} or: {{{ echo -en '\r'; echo "$line" }}}
Line 1568: Line 1777:
Line 1590: Line 1800:
    x=1 # Avoids an error if we get no options at all.
Line 1793: Line 2004:
== How can I use array variables? ==

BASH and KornShell already have one-dimensional arrays indexed by a numerical expression, e.g.

 {{{
 host[0]="micky"
 host[1]="minnie"
 host[2]="goofy"
 i=0
 while (($i < ${#host[@]} ))
 do
     echo "host number $i is ${host[i++]}"
 done}}}

The awkward expression {{{ ${#host[@]} }}} returns the number of elements for the array {{{host}}}.

It's possible to assign multiple values to an array at once, but the syntax differs from BASH to KornShell:

 {{{
 # BASH
 array=(one two three four)
 # KornShell
 set -A array -- one two three four}}}
== Removed. ==
Line 1818: Line 2007:
== How can I use associative arrays or variable variables? ==

Sometimes it's convenient to have associative arrays, arrays indexed by a string. Perl calls them "hashes". KornShell93 already supports this kind of array:

 {{{
 # KornShell93 script - does not work with BASH
 typeset -A homedir # Declare KornShell93 associative array
 homedir[jim]=/home/jim
 homedir[silvia]=/home/silvia
 homedir[alex]=/home/alex
 
 for user in ${!homedir[@]} # Enumerate all indices (user names)
 do
     echo "Home directory of user $user is ${homedir[$user]}"
 done}}}

BASH (including version 3.x) does not (yet) support them. However, we could simulate this kind of array by dynamically creating variables like in the following example:

 {{{
 for user in jim silvia alex
 do
     eval homedir_$user=/home/$user
 done}}}

This creates the variables

 {{{
 homedir_jim=/home/jim
 homedir_silvia=/home/silvia
 homedir_alex=/home/alex}}}

with the corresponding content. Note the use of the {{{eval}}} command, which interprets a command line not just one time like the shell usually does, but '''twice'''. In the first step, the shell uses the input {{{homedir_$user=/home/$user}}} to create a new line {{{homedir_jim=/home/jim}}}. In the second step, caused by {{{eval}}}, this variable assignment is executed, actually creating the variable.

Print the variables using

 {{{
 for user in jim silvia alex
 do
     varname=homedir_$user # e.g. "homedir_jim"
     eval varcontent='$'$varname # e.g. "/home/jim"
     echo "home directory of $user is $varcontent"
 done}}}

The {{{eval}}} line needs some explanation. In a first step the shell performs its normal variable expansion:

 {{{
 eval varcontent='$'$varname}}}

becomes

 {{{
 eval varcontent=$homedir_jim}}}

In a second step the {{{eval}}} re-evaluates the line, and converts this to

 {{{
 varcontent=/home/jim}}}

Before starting to use dynamically created variables, think again of a simpler approach. If it still seems to be the best thing to do, have a look at the following disadvantages:

 1. it's hard to read and to maintain
 1. the variable names must match the regular expression ^[a-zA-Z_][a-zA-Z_0-9]* , i.e. a variable name cannot contain arbitrary characters but only letters, digits, and underscores. In the example above we e.g. could not have processed the home directory of a user named {{{hong-hu}}}, because a dash '-' cannot be part of a variable name.
 1. Quoting is hard to get right. If a content (not variable name) string can contain whitespace characters, it's hard to quote it right to preserve it.

Here is the summary. "{{{var}}}" is a constant prefix, "{{{$index}}}" contains index string, "{{{$content}}}" is the string to store. Note that quoting is absolutely essential here. A missing backslash \ or a wrong type of quote (e.g. apostrophes '...' instead of quotation marks "...") can (and probably will) cause the examples to fail:

 * Set variables

  {{{
  eval "var$index=\"$content\"" # index must only contain characters from [a-zA-Z0-9_]}}}

 * Print variable content

  {{{
  eval "echo \"var$index=\$var$index\""}}}

 * Check if a variable is empty

  {{{
  if eval "[ -z \"\$var$index\" ]"
  then echo "variable is empty: var$index"
  fi}}}

You've seen the examples. Now maybe you can go a step back and consider using AWK associative arrays, or a multi-line environment variable instead of dynamically created variables.
== Removed. ==
Line 2022: Line 2128:
Let's suppose you have your "list" stored as a big string of words, with spaces in between them. (That's the most common case when people are asking this one.) What you actually want to do is determine whether the string " foo " (note the spaces around it) appears in the list. But since your list may not have leading/trailing spaces, you have to add them as well. So, here's the most portable way to do it:

  {{{
  if echo " $list " | grep " foo " >/dev/null; then ....}}}

GNU grep seems to have a special {{{-w}}} extension which lets you avoid the spaces:

  {{{
  if echo "$list" | GNUgrep -q -w "foo"; then ....}}}

Finally, if you want to use Bash builtins, you can do it thus:

  {{{
  if [[ " $list " = *\ foo\ * ]]; then ....}}}

This is basically the same as the original grep -- we surround both the list and the word (foo) with spaces, and then do a simple text matching.
The safest way to do this would be to loop over all elements in your set/list and check them for the element/word you are looking for. Say we are looking for the content of bar in the array foo:
   {{{
   for element in "${foo[@]}"; do
      [[ $element = "$bar" ]] && echo "Found $bar."
   done}}}

Or, to stop searching when you find it:
   {{{
   for element in "${foo[@]}"; do
      [[ $element = "$bar" ]] && { echo "Found $bar."; break; }
   done}}}
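The loop can be wrapped in a function so that the result is available as an exit status. A sketch; the name {{{contains}}} and the sample array are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: contains NEEDLE ELEMENT...
# Succeeds if NEEDLE is exactly equal to one of the remaining arguments.
contains() {
    local needle=$1 element
    shift
    for element; do
        [[ $element == "$needle" ]] && return 0   # quoted: literal comparison, not a glob
    done
    return 1
}

foo=("apple pie" banana cherry)
if contains banana "${foo[@]}"; then echo "Found banana."; fi
if ! contains apple "${foo[@]}"; then echo "apple is not an element."; fi
```

Because whole elements are compared, {{{apple}}} does not match the element {{{apple pie}}}.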

If for some reason your list/set is not in an array, but is a string of words, and the element you are searching for is also a word, you can use this:
   {{{
   for element in $foo; do
      [[ $element = "$bar" ]] && echo "Found $bar."
   done}}}

A less safe, but more clever version:
   {{{
   if [[ " $foo " = *\ "$bar"\ * ]]; then
      echo "Found $bar."
   fi}}}

And, if for some reason you don't know the syntax of for well enough, here's how to check your script's parameters for an element. For example, '-v':
   {{{
   for element; do
      [[ $element = '-v' ]] && echo "Switching to verbose mode."
   done}}}

GNU's grep has a {{{\b}}} feature which allegedly matches the edges of words. Using that, one may attempt to replicate the "clever" approach used above, but it is fraught with peril:

   {{{
   # Is 'foo' one of the positional parameters?
   egrep '\bfoo\b' <<<"$@" >/dev/null && echo yes
   # This is where it fails: is '-v' one of the positional parameters?
   egrep '\b-v\b' <<<"$@" >/dev/null && echo yes
   # Unfortunately, \b sees "v" as a separate word.
   # Nobody knows what the hell it's doing with the "-".

   # Is "someword" in the array 'array'?
   egrep '\bsomeword\b' <<<"${array[@]}"
   # Obviously, you can't use this if someword is '-v'!}}}
 
Since this "feature" of GNU grep is both non-portable and poorly defined, we don't recommend using it.
   
Line 2210: Line 2345:
To see what the shell is doing with quotes, use the {{{set -x}}} command in the terminal, or use {{{#!/bin/bash -x}}} in a script.
Line 2416: Line 2553:
as in, convert: As in, one wants to convert:
Line 2430: Line 2567:
there are two simple general methods for this:
        a. sort the file, and then iterate over it, collectin entries until the prefix changes, and then print the collected entries with the previous prefix
        b iterate over the file, collect entries for each prefix in an array indexed by the prefix

a basic implementation of a) in bash:
There are two simple general methods for this:

a. sort the file, and then iterate over it, collecting entries until the prefix changes, and then print the collected entries with the previous prefix
 a. iterate over the file, collect entries for each prefix in an array indexed by the prefix

A basic implementation of '''a''' in bash:
Line 2448: Line 2586:
and a basic implementation of b) in awk: And a basic implementation of '''b''' in awk:
Line 2457: Line 2595:
usage: Written out as a shell command:
Line 2464: Line 2602:
the answer is, basically no...
while bash won't have as much problems with it as older shells, it still can't process arbitrary binary data, and more specifically, shell variables are not 100% binary clean, so you can't store binary files in them.
one instance where such would sometimes be handy is for example storing small temporary bitmaps while working with netpbm... here i resorted to adding an extra pnmnoraw to the pipe, creating (larger) ascii files that bash has no problems storing)

if you are feeling adventurous, consider this experiment:
The answer is, basically, no.

While bash won't have as many problems with it as older shells, it still can't process arbitrary binary data, and more specifically, shell variables are not 100% binary clean, so you can't store binary files in them.

One instance where this would sometimes be handy is storing small temporary bitmaps while working with netpbm. Here I resorted to adding an extra pnmnoraw to the pipe, creating (larger) ASCII files that bash has no problems storing.

If you are feeling adventurous, consider this experiment:
Line 2484: Line 2624:
this suggests that a the 0 character is skipped entirely, because we can't create it with the input generation, enough to conveniently corrupt most binary files we try to process

(note that this refers to storing them in variables... moving data between programs using pipes is always binary clean)

This suggests that the 0 character is skipped entirely, because we can't create it with the input generation, enough to conveniently corrupt most binary files we try to process.

 ''Yes, Bash is written in C, and uses C semantics for handling strings -- including the NUL byte as string terminator -- in its variables. You cannot store NUL in a Bash variable sanely. It simply was never intended to be used for this. - GreyCat''

Note that this refers to storing them in variables... moving data between programs using pipes is always binary clean. Temporary files are also safe, as long as [#faq62 appropriate precautions] are taken when creating them.
Line 2489: Line 2632:
== How can I remove the last character of a line? ==
Using bash and ksh extended parameter substitution:

{{{
    var=${var%?}
}}}

Remember that ${var%foo} removes foo from the end, and ${var#foo} removes foo from the beginning, of {{{var}}}. As a mnemonic, # appears to the left of % on the keyboard (US keyboards, at least).

More portable, but slower:

{{{
    var=`expr "$var" : '\(.*\).'`
}}}

or (using {{{sed}}}):

{{{
    var=`echo "$var" | sed 's/.$//'`
}}}
== Removed. ==
Line 2627: Line 2751:
{{{
read -p 'press enter to continue'
}}}
Line 2704: Line 2831:
scp ~/.ssh/id_rsa.pub me@remote:
ssh me@remote 'cat id_rsa.pub >> .ssh/authorized_keys'
cat ~/.ssh/id_rsa.pub | ssh me@remote "cat >> ~/.ssh/authorized_keys"
Line 2712: Line 2838:
If you're being prompted for a password even with the public key inserted into the remote {{{authorized_keys}}} file, chances are you have a permissions problem on the remote system. Check '''every single directory''' in the full path leading up to the {{{authorized_keys}}} file and make sure they do '''not''' have world- or group-write privilegs. ''E.g.'', if your home directory is {{{/home/fred}}} and {{{/home}}} has group "staff" write privileges, {{{sshd}}} will refuse to honor your key. If you're being prompted for a password even with the public key inserted into the remote {{{authorized_keys}}} file, chances are you have a permissions problem on the remote system. Check '''every single directory''' in the full path leading up to the {{{authorized_keys}}} file and make sure they do '''not''' have world- or group-write privileges. ''E.g.'', if your home directory is {{{/home/fred}}} and {{{/home}}} has group "staff" write privileges, {{{sshd}}} will refuse to honor your key.
Line 2738: Line 2864:
Reading the SOURCECODE of GNU date's date parser reveals that it accepts Unix timestamps prefixed with '@', so: Now, to convert those Unix timestamps back into human-readable values, one needs to use an external tool. One method is to trick GNU {{{date}}} using:
{{{
   date -d "1970-01-01 UTC + 1164128484 seconds"
   # Prints "Tue Nov 21 12:01:24 EST 2006" in the US/Eastern time zone.
}}}

Reading the source code('''!!''') of GNU {{{date}}}'s date parser reveals that it accepts Unix timestamps prefixed with '@', so:
Line 2744: Line 2876:
Another method that was suggested before is to trick GNU date using:
{{{
   date -d "1970-01-01 UTC + 1164128484 seconds"
   # Prints "Tue Nov 21 12:01:24 EST 2006" in the US/Eastern time zone.
}}}

If you don't have GNU date available, an external language such as Perl can be used:
However, this undocumented feature only appears to work in extremely ''new'' versions of GNU {{{date}}}.

If you don't have GNU {{{date}}} available, an external language such as Perl can be used:
Line 2775: Line 2903:
== How do I convert an ASCII character to its decimal (or hexadecimal) value and back? ==
Line 2790: Line 2918:
 
   hex() {
      printf '%x' "'$1"
   }
Line 2846: Line 2978:
== How do I use 'find'?  I can't understand the man page at all! ==

See UsingFind.

[[Anchor(faq74)]]
== How can I use parameter expansion? ==

Parameter expansion is a separate section of the bash manpage ({{{man bash -P 'less -p "^ Parameter Expansion"'}}}). It can be hard to understand parameter expansion without actually using it. (DO NOT think about parameter expansion like a regex. It is different and distinct.)
== How can I use parameter expansion? How can I get substrings? How can I get a file without its extension, or get just a file's extension? ==

Parameter expansion is a separate section of the bash manpage ({{{man bash -P 'less -p "^ Parameter Expansion"'}}} or [http://tiswww.tis.case.edu/~chet/bash/bashref.html#SEC30 see the reference]). It can be hard to understand parameter expansion without actually using it. (DO NOT think about parameter expansion like a regex. It is different and distinct.)
Line 2879: Line 3006:
You cannot nest parameter expansions. If you need to perform two separate expansions, use a temporary variable to hold the result of the first expansion.

You may find it helpful to remember that, on a US keyboard, the "#" is to the left of the "$" symbol and the "%" symbol is to its right; this corresponds to their acting upon the left (beginning) and right (end) parts of the parameter.

Here are a few more examples (but ''please'' see the real documentation for a list of all the features!). I include these mostly so people won't break the wiki again, trying to add new questions that answer this stuff.

{{{
${string:2:1} # The third character of string (0, 1, 2 = third)
${string:1} # The string starting from the second character
  # Note: this is equivalent to ${string#?}
${string%?} # The string with its last character removed.
${string: -1} # The last character of string
${string:(-1)} # The last character of string, alternate syntax
  # Note: string:-1 means something entirely different.

${file%.mp3} # The filename without the .mp3 extension
  # Very useful in loops of the form: for file in *.mp3; do ...
${file%.*} # The filename without its extension (assuming there was
  # only one extension in the first place...).
${file%%.*} # The filename without all of its extensions
${file##*.} # The extension only.
}}}
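
To see these expansions in action, here is a minimal sketch for bash. The values of {{{string}}} and {{{file}}} are made up for the demonstration, and the last two lines show the temporary-variable approach mentioned above, since expansions cannot be nested.

```shell
# Hypothetical sample values, chosen only for illustration.
string="abcdef"
file="/music/song.tar.mp3"

echo "${string:2:1}"    # third character: c
echo "${string:1}"      # from the second character on: bcdef
echo "${string%?}"      # last character removed: abcde
echo "${string: -1}"    # last character only: f

# Two separate expansions need a temporary variable.
name=${file##*/}        # strip the directory part: song.tar.mp3
base=${name%.*}         # then strip the last extension: song.tar
ext=${name##*.}         # the extension only: mp3
echo "$base $ext"
```

Note the space in {{{${string: -1}}}}; without it the shell would read the expansion as the "use a default value" form.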

[[Anchor(faq74)]]
== How do I get the effects of those nifty Bash Parameter Expansions in older shells? ==

The extended forms of ParameterSubstitution work with ["BASH"], KornShell, KornShell93, but not with the older BourneShell. If the code needs to be portable to that shell as well, {{{sed}}} and {{{expr}}} can often be used.

For example, to remove the filename extension part:

{{{
    for file in *.doc
    do
        base=`echo "$file" | sed 's/\.[^.]*$//'` # remove everything starting with last '.'
        mv "$file" "$base".txt
    done
}}}

Another example, this time to remove the last character of a variable:

{{{
    var=`expr "$var" : '\(.*\).'`
}}}

or (using {{{sed}}}):

{{{
    var=`echo "$var" | sed 's/.$//'`
}}}

[[Anchor(faq75)]]
== How do I use 'find'? I can't understand the man page at all! ==

See UsingFind.

[[Anchor(faq76)]]
== How do I get the sum of all the numbers in a column? ==
This and all similar questions are best answered with an ["AWK"] one-liner.

{{{
awk '{sum += $1} END {print sum}' myfile
}}}

A small bit of effort can adapt this to most similar tasks (finding the average, skipping lines with the wrong number of fields, etc.).
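
As a rough sketch of such adaptations, the following computes both the sum and the average of the first column while skipping lines that do not have exactly two fields. The {{{data}}} function and its contents are invented for the example; in practice you would read from your own file.

```shell
# Hypothetical sample input: a number and a label per line, plus one bad line.
data() { printf '%s\n' '10 apples' '20 pears' 'broken' '30 plums'; }

# Sum of column 1, counting only lines with exactly two fields.
data | awk 'NF == 2 { sum += $1 } END { print sum }'

# Average of column 1; the test on n avoids dividing by zero on empty input.
data | awk 'NF == 2 { sum += $1; n++ } END { if (n) print sum / n }'
```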

For more examples of using {{{awk}}}, see [http://www.student.northpark.edu/pemente/awk/awk1line.txt handy one-liners for awk].

[[Anchor(faq77)]]
== How do I log history or "secure" bash against history removal? ==
This is a question which has no answer applicable to bash. You are here because you asked or wanted to know how to find out what a user had executed when they unset or /dev/nulled their shell history. There are several problems with this.

The first issue is:
  kill -9 $$

This innocuous-looking command does what you would presume it to: it kills the current shell off. However, {{{.bash_history}}} is ONLY written to when bash is allowed to exit cleanly. As such, sending SIGKILL to bash will prevent logging to {{{.bash_history}}}.

Users may also set variables that disable shell history, or simply make their {{{.bash_history}}} a symlink to {{{/dev/null}}}. All of these will defeat any attempt to spy on them through their {{{.bash_history}}} file.

The second issue is permissions. The bash shell is executed as a user. This means that the user can read or write all content produced by or handled by the shell. Any location you would try to log to, MUST be writeable to by the user, and not a privileged user. This is because the shell specifically tries to ensure the user does not exceed its privileges. Imagine a regular user writing a root read/write only history. This is creative license for exploiting and gaining escalated privileges on the server, and thus an extremely bad idea.

The third issue is location. Assume that you pursue a chroot jail for your bash users. This is a fantastic idea, and a good step towards securing your server. However, placing your users in a chroot jail adversely affects the ability to log the users' actions. Once jailed, your user can only write to content within its specific jail. This makes finding user-writeable extraneous logs a simple matter, and enables the attacker to find your hidden logs much more easily than would otherwise be the case.

Where does this leave you? Unfortunately, nowhere good, and definitely not what you wanted to know. If you want to record all of the commands issued to BASH by a user, your best bet is to modify BASH so that it actually records them, in '''real time''', as they are executed -- not when the user logs off. This is still not reliable, though, because end users may simply upload their own shell and run ''that'' instead of your hacked BASH. Or they may use one of the other shells already on your system, instead of your hacked BASH. But, for those who absolutely must have some form of patch available, you can use the patch located at http://wooledge.org/~greg/bash_logging.txt (patch submitted by _sho_ -- use at your own risk. The results of a code-review with improvements are here: http://phpfi.com/220302 -- Heiner).

For a more serious approach to this problem, consider BSD process accounting (kernel-based) instead of focusing on shells.

[[Anchor(faq78)]]
== I want to set a user's password using the Unix passwd command, but how do I script that? It doesn't read standard input! ==

OK, first of all, I ''know'' there are going to be some people reading this, right now, who don't even understand the question. Here, this '''does not work''':

{{{
{ echo oldpass; echo newpass; echo newpass; } | passwd
# This DOES NOT WORK!
}}}

Nothing you can do in bash can ''possibly'' work. {{{passwd(1)}}} does not read from standard input. This is ''intentional''. It is for your protection. Passwords were never intended to be put into programs, or generated '''by''' programs. They were intended to be entered only by the fingers of an actual human being, with a functional brain, and never, ever written down anywhere.

Nonetheless, we get hordes of users asking how they can circumvent 35 years of Unix security.

You have three choices. The first is to manually generate your own hashed password strings (for example, using http://wooledge.org/~greg/crypt/ or a similar tool) and then write them to your system's local password-hash file (which may be {{{/etc/passwd}}}, or {{{/etc/shadow}}}, or {{{/etc/master.passwd}}}, or {{{/etc/security/passwd}}}, or ...). This requires that you read the relevant man pages on your system, find out where the password hash goes, what formatting the file requires, and then construct code that writes it out in that format.

The second is to use [http://expect.nist.gov/ expect]. I think it even has this ''exact'' problem as one of its canonical examples.

The third is to use some system-specific tools which may or may not exist on your platform. For example, some GNU/Linux systems have a {{{chpasswd(8)}}} tool which can be coerced into doing these sorts of things.

See also [#faq69 FAQ #69].

[[Anchor(faq79)]]
== How can I grep for lines containing foo AND bar, foo OR bar? ==

Well, for lines containing foo AND bar, two grep statements are needed.

{{{
grep foo | grep bar
}}}

If you prefer, you can achieve this in a single sed or awk statement.

{{{
sed -n '/foo/{/bar/p;}'
awk '/foo/ && /bar/'
}}}

And for lines containing foo OR bar, grep can do it "nicely", but it can also be done with sed, awk, etc.

{{{
egrep 'foo|bar'
grep -E 'foo|bar'
}}}
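
A small sketch to check the difference between the two cases. The {{{sample}}} function is a made-up stand-in for your real input.

```shell
# Hypothetical sample input with one line for each case.
sample() { printf '%s\n' 'foo only' 'bar only' 'foo and bar' 'neither'; }

echo "AND:"
sample | grep foo | grep bar      # only the line containing both words

echo "OR:"
sample | grep -E 'foo|bar'        # every line containing at least one of them
```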

[[Anchor(faq80)]]
== How can I make an alias that takes an argument? ==

You can't. Aliases in bash are extremely rudimentary, and not really suitable to any serious purpose. The bash man page even says so explicitly:

 There is no mechanism for using arguments in the replacement text. If arguments are needed, a shell function should be used (see FUNCTIONS below).

Use a function instead. For example,

{{{
settitle() { case $TERM in *xterm*|*rxvt*) echo -en "\e]2;$1\a";; esac; }
}}}
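
Another sketch of the same idea: the hypothetical {{{mkcd}}} function below uses its argument twice, which an alias could not do. The scratch directory is created with {{{mktemp -d}}}, which is assumed to be available on your system.

```shell
# Create a directory (and any missing parents), then change into it.
mkcd() { mkdir -p -- "$1" && cd -- "$1"; }

scratch=$(mktemp -d) || exit 1    # throwaway location for the demonstration
mkcd "$scratch/new project"
pwd
```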

[[Anchor(faq81)]]
== How can I determine whether a command exists anywhere in my PATH? ==

In BASH, there are a couple builtins that are suitable for this purpose: {{{hash}}} and {{{type}}}. Here's an example using {{{hash}}}:

{{{
if hash qwerty 2>/dev/null; then
  echo qwerty exists
else
  echo qwerty does not exist
fi
}}}
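
The same test can be sketched with {{{type}}}. The helper name {{{have}}} is made up for this example. Keep in mind that {{{type}}} also succeeds for builtins, functions and aliases, not only for files found in PATH.

```shell
# Succeeds if the shell knows how to run the given command name.
have() { type "$1" >/dev/null 2>&1; }

if have ls; then
  echo "ls exists"
else
  echo "ls does not exist"
fi
```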

If these builtins are not available (because you're in a Bourne shell, or whatever), then you may have to rely on the external command {{{which}}} (which is often a csh script, although sometimes a compiled binary). Unfortunately, {{{which}}} does ''not'' set a useful exit code -- and it doesn't even write errors to stderr! Therefore, one must parse its output.

{{{
# Last resort -- using which(1)
x=$(LC_ALL=C which qwerty 2>&1)
case "$x" in
  no\ *\ in\ *) echo qwerty does not exist;;
  *Command\ not\ found.) echo qwerty does not exist;;
  '') echo qwerty does not exist;;
  *) echo qwerty exists;;
esac
}}}

(Also note that its output is ''not'' consistent across platforms. On HP-UX, for example, it prints {{{no qwerty in /path /path /path ...}}}; on OpenBSD, it prints {{{qwerty: Command not found.}}}; and on GNU/Linux, it prints nothing at all.)

BASH Frequently Asked Questions

These are answers to frequently asked questions on channel #bash on the [http://www.freenode.net/ freenode] IRC network. These answers are contributed by the regular members of the channel (originally heiner, and then others including greycat and r00t), and by users like you. If you find something inaccurate or simply misspelled, please feel free to correct it!

All the information here is presented without any warranty or guarantee of accuracy. Use it at your own risk. When in doubt, please consult the man pages or the GNU info pages as the authoritative references.

["BASH"] is a BourneShell compatible shell, which adds many new features to its ancestor. Most of them are available in the KornShell, too. If a question is not strictly shell specific, but rather related to Unix, it may be in the UnixFaq.

If you can't find the answer you're looking for here, try ["BashPitfalls"]. If you want to help, you can add new questions with answers here, or try to answer one of the BashOpenQuestions.

TableOfContents

Anchor(faq1)

1. How can I read a file line-by-line?

    while read line
    do
        echo "$line"
    done < "$file"

If you want to operate on individual fields within each line, you may supply additional variables to read:

    # Input file has 3 columns separated by white space.
    while read first_name last_name phone; do
      ...
    done < "$file"

If the field delimiters are not whitespace, you can set IFS (input field separator):

    while IFS=: read user pass uid gid gecos home shell; do
      ...
    done < /etc/passwd

Also, please note that you do not necessarily need to know how many fields each line of input contains. If you supply more variables than there are fields, the extra variables will be empty. If you supply fewer, the last variable gets "all the rest" of the fields after the preceding ones are satisfied. For example,

    while read first_name last_name junk; do
      ...
    done <<< 'Bob Smith 123 Main Street Elk Grove Iowa 123-555-6789'
    # Inside the loop, first_name will contain "Bob", and
    # last_name will contain "Smith".  The variable "junk" holds
    # everything else.

The read command modifies each line read, e.g. by default it removes all leading whitespace characters (blanks, tab characters, ... -- basically any leading characters present in IFS). If that is not desired, the IFS variable has to be cleared:

    OIFS=$IFS; IFS=
    while read line
    do
        echo "$line"
    done < "$file"
    IFS=$OIFS

As a feature, the read command joins lines that end with a backslash '\' character into a single line. To disable this feature, KornShell and ["BASH"] have read -r:

    OIFS=$IFS; IFS=
    while read -r line
    do
        echo "$line"
    done < "$file"
    IFS=$OIFS

Note that reading a file line by line this way is very slow for large files. Consider using e.g. ["AWK"] instead if you get performance problems.

One may also read from a command instead of a regular file:

    some command | while read line; do
       other commands
    done

This method is especially useful for processing the output of find with a block of commands:

    find . -print0 | while IFS= read -r -d $'\0' file; do
        mv "$file" "${file// /_}"
    done

This command reads one filename at a time from the find command and renames the file so that its spaces are replaced by underscores. Note the usage of -print0 in the find command, which uses NUL bytes as filename delimiters, and -d $'\0' in the read command to instruct it to read all text into the file variable until it finds a NUL byte. By default, find and read delimit their input with newlines; however, since filenames can potentially contain newlines themselves, this default behaviour will split up those filenames and cause the command block to fail. Filenames can never contain NUL bytes.

Using a pipe to send find's output into a while loop places the loop in a subshell and may therefore cause problems later on if the commands inside the body of the loop attempt to set variables which need to be used outside the loop; in that case, see [#faq24 FAQ 24], or use process substitution like:

    while read line; do
        other commands
    done < <(some command)
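
The difference can be sketched as follows, assuming bash with its default settings. The variable names are made up for the demonstration.

```shell
# With a pipe, the loop runs in a subshell, so the counter is lost afterwards.
count=0
printf '%s\n' a b c | while read -r line; do
    count=$((count + 1))
done
piped=$count
echo "after the pipe: $piped"

# With process substitution, the loop runs in the current shell.
count=0
while read -r line; do
    count=$((count + 1))
done < <(printf '%s\n' a b c)
echo "after process substitution: $count"
```

In bash the first counter is still 0 after the loop, while the second one is 3.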

Sometimes it's useful to read a file into an array, one array element per line. You can do that with the following example:

    O=$IFS IFS=$'\n' arr=($(< myfile)) IFS=$O

This temporarily changes the Input Field Separator to a newline, so that each line becomes one field when the shell splits the result of $(< myfile) into words. Then it populates the array arr with the fields. Then it sets the IFS back to what it was before.

This same trick works on a stream of data as well as a file:

    O=$IFS IFS=$'\n' arr=($(find . -type f)) IFS=$O

Of course, this will blow up in your face if the filenames contain newlines; see [#faq20 FAQ 20] for hints on dealing with such filenames.

Anchor(faq2)

2. How can I store the return value of a command in a variable?

Well, that depends on exactly what you mean by that question. Some people want to store the command's output (either stdout, or stdout + stderr); and others want to store the command's exit status (0 to 255, with 0 typically meaning "success").

If you want to capture the output:

    var=$(command)      # stdout only; stderr remains uncaptured
    var=$(command 2>&1) # both stdout and stderr will be captured

If you want the exit status:

    command
    var=$?

If you want both:

    var1=$(command)
    var2=$?            # the assignment to var1 has no effect on command's exit status, which is still in $?

If you don't actually want the exit status, but simply want to take an action upon success or failure:

    if command
    then
        echo "it succeeded"
    else
        echo "it failed"
    fi

Or (shorter):

    command && echo "it succeeded" || echo "it failed"

What if you want the exit status of one particular command in a pipeline? Use the PIPESTATUS array (BASH only). Say you want the exit status of grep in the following:

    grep foo somelogfile | head -5
    result=${PIPESTATUS[0]}
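
A small sketch of how the array behaves, using {{{false}}} and {{{true}}} as stand-ins for real commands. Because every following command overwrites PIPESTATUS, the whole array is copied in one step.

```shell
false | true | true
statuses=("${PIPESTATUS[@]}")    # copy immediately; the next command resets it
echo "first: ${statuses[0]}, last: ${statuses[2]}"
# prints: first: 1, last: 0
```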

Now, some trickier stuff. Let's say you want only the stderr, but not stdout. Well, then first you have to decide where you do want stdout to go:

    var=$(command 2>&1 >/dev/null)  # Save stderr, discard stdout.
    var=$(command 2>&1 >/dev/tty)   # Save stderr, send stdout to the terminal.

It's possible, although considerably harder, to let stdout "fall through" to wherever it would've gone if there hadn't been any redirection. This involves "saving" the current value of stdout, so that it can be used inside the command substitution:

    exec 3>&1                    # Save the place that stdout (1) points to.
    var=$(command 2>&1 1>&3)     # Run command.  stderr is captured.
    exec 3>&-                    # Close FD #3.

What you cannot do is capture stdout in one variable, and stderr in another, using only FD redirections. You must use a temporary file to achieve that one.
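
One way the temporary file approach might look is sketched below. The {{{emit}}} function is a made-up stand-in for your command, and {{{mktemp}}} is assumed to be available (it is common, but not guaranteed on every system).

```shell
# Hypothetical command that writes to both stdout and stderr.
emit() { echo "to stdout"; echo "to stderr" >&2; }

tmp=$(mktemp) || exit 1     # temporary file that will hold stderr
out=$(emit 2>"$tmp")        # stdout goes into the variable, stderr into the file
err=$(cat "$tmp")
rm -f "$tmp"

echo "stdout was: $out"
echo "stderr was: $err"
```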

Anchor(faq3)

3. How can I insert a blank character after each character?

    sed 's/./& /g'

Example:

    $ echo "testing" | sed 's/./& /g'
    t e s t i n g

For more examples of sed 1-liners, see [http://www.student.northpark.edu/pemente/sed/sed1line.txt sed 1-liners] or [http://sed.sourceforge.net/sedfaq.html the sed FAQ].

Anchor(faq4)

4. How can I check whether a directory is empty or not?

We can test for the exit status of ls:

    if ls "$directory"/file.txt; then
         echo "file.txt found!"
    else
         echo "file.txt not found."
    fi

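The following idea relies on the directory's link count, which on traditional Unix filesystems is 2 (for "." and the entry in the parent directory) plus the number of subdirectories. Note that it therefore only notices subdirectories, not regular files: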

    find "$dir" -maxdepth 0 -links 2 \
     -exec echo "empty directory: {}" \;

Conversely, to find a non-empty directory:

    find "$dir" -maxdepth 0 -links +2 \
     -exec echo "directory is non-empty" \;

Most modern systems have an "ls -A" which explicitly omits "." and ".." from the directory listing:

    if [ -n "$(ls -A somedir)" ]
    then
        echo directory is non-empty
    fi

This can be shortened to:

    if [ "$(ls -A somedir)" ]
    then
        echo directory is non-empty
    fi

Another way, using Bash features, involves setting a special shell option, nullglob, which changes the behavior of globbing. Some people prefer to avoid this approach, because it's so drastically different and could severely alter the behavior of scripts.

Nevertheless, if you're willing to use this approach, it does greatly simplify this particular task:

    shopt -s nullglob
    if [[ -z $(echo *) ]]; then
        echo directory is empty
    fi
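
If the side effects of the option are a concern, one possible sketch is to confine it to a subshell. The function name {{{is_empty_dir}}} is made up here; its body is wrapped in parentheses so that the shopt settings do not leak into the rest of the script. It also sets dotglob so that hidden files count as entries.

```shell
# Succeeds if the directory given as $1 contains no entries at all.
is_empty_dir() (
    shopt -s nullglob dotglob
    entries=("$1"/*)
    (( ${#entries[@]} == 0 ))
)

dir=$(mktemp -d) || exit 1     # scratch directory for the demonstration
is_empty_dir "$dir" && echo "directory is empty"
touch "$dir/.hidden"
is_empty_dir "$dir" || echo "directory is non-empty"
rm -rf "$dir"
```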

It also simplifies various other operations:

    shopt -s nullglob
    for i in *.zip; do
        blah blah "$i"  # No need to check $i is a file.
    done

Without the shopt, that would have to be:

    for i in *.zip; do
        [[ -f $i ]] || continue  # If no .zip files, i becomes *.zip
        blah blah "$i"
    done

(You may want to use the latter anyway, if there's a possibility that the glob may match directories in addition to files.)

Anchor(faq5)

5. How can I use array variables?

BASH and KornShell already have one-dimensional arrays indexed by a numerical expression, e.g.

    host[0]="micky"
    host[1]="minnie"
    host[2]="goofy"
    i=0
    while (($i < ${#host[@]} ))
    do
        echo "host number $i is ${host[i++]}"
    done

The awkward expression ${#host[@]} returns the number of elements in the array host. Also noteworthy is the fact that inside the square brackets, i++ works as a C programmer would expect. The square brackets in an array reference force an ArithmeticExpression.

It's possible to assign multiple values to an array at once, but the syntax differs from BASH to KornShell:

    # BASH
    array=(one two three four)
    # KornShell
    set -A array -- one two three four

Using array elements en masse is one of the key features. Much like "$@" for the positional parameters, "${arr[@]}" expands the array to a list of words, one array element per word, even if the words contain internal whitespace. For example,

    for x in "${arr[@]}"; do
      echo "next element is '$x'"
    done

If one simply wants to dump the full array, "${arr[*]}" will cause the elements to be concatenated together, with the first character of IFS (a space by default) between them.

    arr=(x y z)
    IFS=/; echo "${arr[*]}"; unset IFS
    # prints x/y/z

BASH's arrays are also sparse. Elements may be added and deleted out of sequence.

    arr=(0 1 2 3)
    arr[42]="what was the question?"
    unset arr[2]
    echo "${arr[*]}"
    # prints 0 1 3 what was the question?

BASH 3.0 added the ability to retrieve the list of index values in an array, rather than just iterating over the elements:

    echo ${!arr[*]}
    # using the previous array, prints 0 1 3 42

[#faq73 Parameter Expansions] may be performed on array elements en masse as well:

    arr=(abc def ghi jkl)
    echo "${arr[@]#?}"          # prints bc ef hi kl
    echo "${arr[@]/[aeiou]/}"   # prints bc df gh jkl

For examples of loading data into arrays, see [#faq1 FAQ #1]. For examples of using arrays to hold complex shell commands, see [#faq50 FAQ #50] and [#faq40 FAQ #40].

Anchor(faq6)

6. How can I use associative arrays or variable variables?

Sometimes it's convenient to have associative arrays, arrays indexed by a string. Perl calls them "hashes". KornShell93 already supports this kind of array:

    # KornShell93 script - does not work with BASH
    typeset -A homedir             # Declare KornShell93 associative array
    homedir[jim]=/home/jim
    homedir[silvia]=/home/silvia
    homedir[alex]=/home/alex

    for user in ${!homedir[@]}     # Enumerate all indices (user names)
    do
        echo "Home directory of user $user is ${homedir[$user]}"
    done

BASH (including version 3.x) does not (yet) support them. However, we could simulate this kind of array by dynamically creating variables like in the following example:

    for user in jim silvia alex
    do
        eval homedir_$user=/home/$user
    done

This creates the variables

    homedir_jim=/home/jim
    homedir_silvia=/home/silvia
    homedir_alex=/home/alex

with the corresponding content. Note the use of the eval command, which interprets a command line not just one time like the shell usually does, but twice. In the first step, the shell uses the input homedir_$user=/home/$user to create a new line homedir_jim=/home/jim. In the second step, caused by eval, this variable assignment is executed, actually creating the variable.

Print the variables using

    for user in jim silvia alex
    do
        varname=homedir_$user              # e.g. "homedir_jim"
        eval varcontent='$'$varname        # e.g. "/home/jim"
        echo "home directory of $user is $varcontent"
    done

The eval line needs some explanation. In a first step the shell expands $varname, so that

    eval varcontent='$'$varname

becomes

    eval varcontent=$homedir_jim

In a second step the eval re-evaluates the line, and converts this to

    varcontent=/home/jim

Before starting to use dynamically created variables, think again of a simpler approach. If it still seems to be the best thing to do, have a look at the following disadvantages:

  1. It's hard to read and to maintain.
  2. The variable names must match the regular expression ^[a-zA-Z_][a-zA-Z_0-9]* , i.e. a variable name cannot contain arbitrary characters but only letters, digits, and underscores. In the example above we could not, for instance, have processed the home directory of a user named hong-hu, because a dash '-' cannot be part of a variable name.
  3. Quoting is hard to get right. If a content (not variable name) string can contain whitespace characters, it's hard to quote it right to preserve it.

Here is the summary. "var" is a constant prefix, "$index" contains index string, "$content" is the string to store. Note that quoting is absolutely essential here. A missing backslash \ or a wrong type of quote (e.g. apostrophes '...' instead of quotation marks "...") can (and probably will) cause the examples to fail:

  • Set variables
        eval "var$index=\$content"    # index must only contain characters from [a-zA-Z0-9_]
  • Print variable content
        eval "echo \"var$index=\$var$index\""
  • Check if a variable is empty
        if eval "[ -z \"\$var$index\" ]"
        then echo "variable is empty: var$index"
        fi
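
Put together, the technique might be wrapped in two small functions, as in the sketch below. The names {{{valid_key}}}, {{{set_home}}} and {{{get_home}}} are invented for this example. The key is checked first, because anything other than letters, digits and underscores would either fail or be executed by eval.

```shell
# Accept only keys that are safe to use as part of a variable name.
valid_key() { case $1 in ''|*[!a-zA-Z0-9_]*) return 1;; esac; }

# Store $2 in the variable homedir_<key>.
set_home() {
    valid_key "$1" || return 1
    eval "homedir_$1=\$2"
}

# Print the content of the variable homedir_<key>.
get_home() {
    valid_key "$1" || return 1
    eval "printf '%s\n' \"\$homedir_$1\""
}

set_home jim "/home/jim smith"     # content with whitespace is preserved
get_home jim
set_home hong-hu /home/hong-hu || echo "rejected key: hong-hu"
```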

You've seen the examples. Now maybe you can go a step back and consider using AWK associative arrays, or a multi-line environment variable instead of dynamically created variables.

Anchor(faq7)

7. Is there a function to return the length of a string?

The fastest way, not requiring external programs (but usable only with ["BASH"] and KornShell):

    ${#varname}

or

    expr "$varname" : '.*'

(expr prints the number of characters matching the pattern .*, which is the length of the string)

or

    expr length "$varname"

(for a BSD/GNU version of expr. Do not use this, because it is not ["POSIX"]).

Anchor(faq8)

8. How can I recursively search all files for a string?

On most recent systems (GNU/Linux/BSD), you would use grep -r pattern . to search all files from the current directory (.) downward.

You can use find if your grep lacks -r:

    find . -type f -exec grep -l "$search" '{}' \;

The {} characters will be replaced with the current file name.

This command is slower than it needs to be, because find will call grep with only one file name, resulting in many grep invocations (one per file). Since grep accepts multiple file names on the command line, find can be instrumented to call it with several file names at once:

    find . -type f -exec grep -l "$search" '{}' \+

The trailing '+' character instructs find to call grep with as many file names as possible, saving processes and resulting in faster execution. This example works for POSIX find, e.g. with Solaris.

GNU find uses a helper program called xargs for the same purpose:

    find . -type f -print0 | xargs -0 grep -l "$search"

The -print0 / -0 options ensure that any file name can be processed, even ones containing blanks, TAB characters, or new-lines.

90% of the time, all you need is:

Have grep recurse and print the lines (GNU grep):

    grep -r "$search" .

Have grep recurse and print only the names (GNU grep):

    grep -r -l "$search" .

The find command can be used to run arbitrary commands on every file in a directory (including sub-directories). Replace grep with the command of your choice. The curly braces {} will be replaced with the current file name in the case above.

(Note that they must be escaped in some shells, but not in ["BASH"].)

Anchor(faq9)

9. My command line produces no output: tail -f logfile | grep 'foo bar'

Most standard Unix commands buffer their output if used non-interactively. This means that they don't write each character (or even each line) as soon as it is ready, but collect a larger amount (e.g. 4 kilobytes) before printing it. In the case above, the tail command buffers its output, and therefore grep only gets its input in e.g. 4K blocks.

Unfortunately there's no easy solution to this, because the behaviour of the standard programs would need to be changed. *See bottom of section before taking 'no easy solution' to heart*

Some programs provide special command line options for this purpose, e.g.

  • grep (e.g. GNU version 2.5.1): --line-buffered
  • sed (e.g. GNU version 4.0.6): -u, --unbuffered
  • awk (some GNU versions): -W interactive, or use the fflush() function
  • tcpdump, tethereal: -l

The expect package (http://expect.nist.gov/) has an unbuffer example program, which can help here. It disables buffering for the output of a program.

Example usage:

    unbuffer tail -f logfile | grep 'foo bar'

There is another option when you have more control over the creation of the log file. If you would like to grep the real-time log of a text interface program which does buffered session logging by default (or you were using script to make a session log), then try this instead:

   $ program | tee -a program.log

   In another window:
   $ tail -f program.log | grep whatever

Apparently this works because tee produces unbuffered output. This has only been tested on GNU tee, YMMV.

A solution to this is to use the 'less' command in follow mode. This is simple to do!

   $ less program.log

Then enter your search pattern (/ is search in less, like vi)

    /foo bar

Next, put less into follow mode by issuing shift+f

That's all there is to it!

Anchor(faq10)

10. How can I recreate a directory structure, without the files?

With the cpio program:

    cd "$srcdir"
    find . -type d -print | cpio -pdumv "$dstdir"

or with GNU-tar, and less obscure syntax:

    cd "$srcdir"
    find . -type d -print | tar c --files-from - --no-recursion | tar x --directory "$dstdir"

This creates a list of directory names with find, non-recursively adds just the directories to an archive, and pipes it to a second tar instance to extract it at the target location.

Anchor(faq11)

11. How can I print the n'th line of a file?

The dirty (but not quick) way would be sed -n ${n}p "$file" but this reads the whole input file, even if you only wanted the third line.

The following sed command line reads a file printing nothing (-n). At line $n the command "p" is run, printing it, with a "q" afterwards: quit the program.

    sed -n "$n{p;q;}" "$file"

Another way, more obvious to some, is to grab the last line from a listing of the first n lines:

   head -n "$n" "$file" | tail -n 1

Using awk:

   awk -v n="$n" 'NR==n{print;exit}' "$file"
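
The three approaches can be compared with a small sketch like this one. The sample file and the value of n are made up, and {{{mktemp}}} is assumed to be available.

```shell
file=$(mktemp) || exit 1                 # scratch file for the demonstration
printf '%s\n' one two three four > "$file"
n=3

with_sed=$(sed -n "${n}{p;q;}" "$file")
with_head=$(head -n "$n" "$file" | tail -n 1)
with_awk=$(awk -v n="$n" 'NR==n{print;exit}' "$file")
rm -f "$file"

echo "$with_sed / $with_head / $with_awk"    # all three print "three"
```

The awk version passes the shell variable in with -v, because a shell variable is not visible inside the single-quoted awk program.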

Anchor(faq12)

12. A program (e.g. a file manager) lets me define an external command that an argument will be appended to, but I need that argument somewhere in the middle...

    sh -c 'echo "$1"' -- hello
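
This works because the first word after the command string given to sh -c becomes $0 (here the placeholder --), and the argument appended by the program becomes $1, which the quoted command can then use wherever it likes. A small sketch, with a made-up argument:

```shell
# "middle" stands for the argument the calling program would append.
sh -c 'printf "%s\n" "before $1 after"' -- middle
# prints: before middle after
```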

Anchor(faq13)

13. How can I concatenate two variables? How do I append a string to a variable?

There is no concatenation operator for strings (either literal or variable dereferences) in the shell. The strings are just written one after the other:

    var=$var1$var2

If the right-hand side contains whitespace characters, it needs to be quoted:

    var="$var1 - $var2"

If you're appending a string that doesn't "look like" part of a variable name, you just smoosh it all together:

    var=$var1/.-

Otherwise, braces or quotes may be used to disambiguate the right-hand side:

    var=${var1}xyzzy
    # Without braces, var1xyzzy would be interpreted as a variable name

    var="$var1"xyzzy
    # Alternative syntax

CommandSubstitution can be used as well. The following line stores a log file name containing the current date in the variable logname, resulting in names like log.2004-07-26:

    logname="log.$(date +%Y-%m-%d)"

There's no difference when the variable name is reused, either:

    string="$string more data here"

Bash 3.1 has a new += operator that you may see from time to time:

    string+=" more data here"     # EXTREMELY non-portable!

It's generally best to use the portable syntax.

Anchor(faq14)

14. How can I redirect the output of multiple commands at once?

Redirecting the standard output of a single command is as easy as

    date > file

To redirect standard error:

    date 2> file

To redirect both:

    date > file 2>&1

In a loop or other larger code structure:

    for i in $list; do
        echo "Now processing $i"
        # more stuff here...
    done > file 2>&1

However, this can become tedious if the output of many programs should be redirected. If all output of a script should go into a file (e.g. a log file), the exec command can be used:

    # redirect both standard output and standard error to "log.txt"
    exec > log.txt 2>&1
    # all output including stderr now goes into "log.txt"

Otherwise command grouping helps:

    {
        date
        # some other command
        echo done
    } > messages.log 2>&1

In this example, the output of all commands within the curly braces is redirected to the file messages.log.
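If the exec redirection should only cover part of a script, the original descriptors can be saved first and restored afterwards. The sketch below assumes that file descriptors 3 and 4 are free; those numbers and the function name are arbitrary choices for the example.

```bash
# log_section LOGFILE - send output to LOGFILE for a while, then switch back
log_section() {
    local logfile=$1
    exec 3>&1 4>&2              # save the current stdout and stderr
    exec > "$logfile" 2>&1      # from here on, everything goes to the log
    echo "to the log"
    echo "errors too" >&2
    exec 1>&3 2>&4 3>&- 4>&-    # restore both, close the saved copies
    echo "back on the terminal"
}
```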

Anchor(faq15)

15. How can I run a command on all files with the extension .gz?

Often a command already accepts several files as arguments, e.g.

    zcat *.gz

(On some systems, you would use gzcat instead of zcat. If neither is available, or if you don't care to play guessing games, just use gzip -dc instead.) If an explicit loop is desired, or if your command does not accept multiple filename arguments in one invocation, the for loop can be used:

    for file in *.gz
    do
        echo "$file"
        # do something with "$file"
    done

To do it recursively, you should use a loop, plus the find command:

    while IFS= read -r file; do
        echo "$file"
        # do something with "$file"
    done < <(find . -name '*.gz' -print)

For more hints in this direction, see [#faq20 FAQ #20], below. To see why the find command comes after the loop instead of before it, see [#faq24 FAQ #24].

Anchor(faq16)

16. How can I use a logical AND in a shell pattern (glob)?

That can be achieved through the !() extglob operator. You'll need extglob set. It can be checked with:

$ shopt extglob

and set with:

$ shopt -s extglob

To warm up, we'll move all files starting with foo AND not ending with .d to directory foo_thursday.d:

$ mv foo!(*.d) foo_thursday.d

For the general case:

Delete all files containing Pink_Floyd AND not containing The_Final_Cut:

$ rm !(!(*Pink_Floyd*)|*The_Final_Cut*)

By the way: these kinds of patterns can be used with KornShell and KornShell93, too. They don't have to be enabled there, but are the default patterns.
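The general pattern is De Morgan's rule in disguise: NOT (NOT A OR B) is the same as A AND NOT B. To try it out without deleting anything, the pattern can be tested against a few names first. This is a sketch only; the function name wanted and the sample file names are invented for the example.

```bash
shopt -s extglob    # must be enabled when the pattern is matched

# The pattern is kept in a variable, so it is only interpreted at match time.
pattern='!(!(*Pink_Floyd*)|*The_Final_Cut*)'

# wanted NAME - succeed if NAME contains Pink_Floyd AND not The_Final_Cut
wanted() {
    [[ $1 == $pattern ]]
}

for name in Pink_Floyd-Money.ogg Pink_Floyd-The_Final_Cut.ogg Genesis-Mama.ogg; do
    if wanted "$name"; then
        echo "match:    $name"
    else
        echo "no match: $name"
    fi
done
```

Only the first of the three sample names contains Pink_Floyd without also containing The_Final_Cut, so it is the only one reported as a match.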

Anchor(faq17)

17. How can I group expressions, e.g. (A AND B) OR C?

The TestCommand [ uses parentheses () for expression grouping. Given that "AND" is "-a", and "OR" is "-o", the following expression

    (0<n AND n<=10) OR n=-1

can be written as follows:

    if [ \( $n -gt 0 -a $n -le 10 \) -o $n -eq -1 ]
    then
        echo "0 < $n <= 10, or $n=-1"
    else
        echo "invalid number: $n"
    fi

Note that the parentheses have to be quoted: \(, '(' or "(".

["BASH"] and KornShell have different, more powerful comparison commands with slightly different (easier) quoting:

Examples:

    if (( (n>0 && n<10) || n == -1 ))
    then echo "0 < $n < 10, or n==-1"
    fi

or

    if [[ ( -f $localconfig && -f $globalconfig ) || -n $noconfig ]]
    then echo "configuration ok (or not used)"
    fi

Note that the distinction between numeric and string comparisons is strict. Consider the following example:

    n=3
    if [[ $n > 0 && $n < 10 ]]
    then echo "$n is between 0 and 10"
    else echo "ERROR: invalid number: $n"
    fi

The output will be "ERROR: ....", because in a string comparison "3" is bigger than "10": strings are compared character by character, "3" already comes after "1", and the next character "0" is not considered. Changing the double square brackets [[ ]] to double parentheses (( )) makes the example work as expected.
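The difference is easy to see side by side. A minimal sketch for Bash; the two function names are made up for the comparison:

```bash
str_lt() { [[ $1 < $2 ]]; }     # string comparison, character by character
num_lt() { (( $1 < $2 )); }     # arithmetic comparison of integer values

n=3
if str_lt "$n" 10; then echo "as strings: $n < 10"; else echo "as strings: $n is not < 10"; fi
if num_lt "$n" 10; then echo "as numbers: $n < 10"; else echo "as numbers: $n is not < 10"; fi
```

The string version reports that 3 is not less than 10, the arithmetic version reports that it is.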

Anchor(faq18)

18. How can I use numbers with leading zeros in a loop, e.g. 01, 02?

As always, there are different ways to solve the problem, each with its own advantages and disadvantages.

If there are not many numbers, BraceExpansion can be used:

    for i in 0{1,2,3,4,5,6,7,8,9} 10
    do
        echo $i
    done

Output:

01
02
03
04
[...]

This gets tedious for large sequences, but there are other ways, too. If the command seq is available, you can use it as follows:

    seq -w 1 10

or, for arbitrary numbers of leading zeros (here: 3):

    seq -f "%03g" 1 10

If you have the printf command (which is a Bash builtin, and is also POSIX standard), it can be used to format a number, too:

    for ((i=1; i<=10; i++))
    do
        printf "%02d " "$i"
    done

The KornShell and KornShell93 have the typeset command to specify the number of leading zeros:

    $ typeset -Z3 i=4
    $ echo $i
    004

Finally, the following example works with any BourneShell derived shell to zero-pad each line to three bytes:

i=0
while test $i -le 10
do
    echo "00$i"
    i=`expr $i + 1`
done |
    sed 's/.*\(...\)$/\1/g'

In this example, the number of '.' inside the parentheses in the sed statement determines how many total bytes from the echo command (at the end of each line) will be kept and printed.

One more addendum: in Bash 3, you can use:

printf "%03d \n" {1..300}

Which is slightly easier in some cases.

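You can also combine printf with xargs and wget to fetch numbered files. A brace expansion like {$START..$END} cannot be used for this, because Bash performs brace expansion before it expands the variables; seq generates the range instead:

printf "%03d\n" $(seq "$START" "$END") | xargs -I% wget "$LOCATION/%"

This is sometimes a good solution.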

In Bash 2, you can put seq in backticks instead, and this works as well:

printf "%03d \n" `seq 300`
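When the first and last number are held in variables, an arithmetic for loop needs neither seq nor brace expansion. The following is a sketch for Bash; the function name zero_pad_range and the default width of 3 are assumptions made for the example.

```bash
# zero_pad_range START END [WIDTH] - print START..END, padded with zeros
zero_pad_range() {
    local start=$1 end=$2 width=${3:-3} i
    for ((i = start; i <= end; i++)); do
        printf '%0*d\n' "$width" "$i"   # "*" takes the field width from an argument
    done
}

zero_pad_range 8 11
```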

Anchor(faq19)

19. How can I split a file into line ranges, e.g. lines 1-10, 11-20, 21-30?

Some Unix systems provide the split utility for this purpose:

    split --lines 10 --numeric-suffixes input.txt output-

For more flexibility you can use sed. The sed command can print e.g. the line number range 1-10:

    sed -n '1,10p'

This stops sed from printing each line (-n). Instead it only processes the lines in the range 1-10 ("1,10"), and prints them ("p"). sed still reads the input until the end, although we are only interested in lines 1 through 10. We can speed this up by making sed terminate immediately after printing line 10:

    sed -n -e '1,10p' -e '10q'

Now the command will quit after reading line 10 ("10q"). The -e arguments indicate a script (instead of a file name). The same can be written a little shorter:

    sed -n '1,10p;10q'

We can now use this to print an arbitrary range of a file (specified by line number):

file=/etc/passwd
range=10
firstline=1
maxlines=$(wc -l < "$file") # count number of lines
while (($firstline <= $maxlines))
do
    ((lastline=$firstline+$range-1))    # last line of this range, e.g. 10, 20, 30
    sed -n -e "$firstline,${lastline}p" -e "${lastline}q" "$file"
    ((firstline=$lastline+1))           # the next range starts right after it
done

This example uses ["BASH"] and KornShell ArithmeticExpressions, which older [wiki:BourneShell Bourne shells] do not have. In that case the following example should be used instead:

file=/etc/passwd
range=10
firstline=1
maxlines=`wc -l < "$file"` # count number of lines
while [ $firstline -le $maxlines ]
do
    lastline=`expr $firstline + $range - 1`
    sed -n -e "$firstline,${lastline}p" -e "${lastline}q" "$file"
    firstline=`expr $lastline + 1`
done

Anchor(faq20)

20. How can I find and deal with file names containing newlines, spaces or both?

The preferred method is still to use

    find ... -exec command {} \;

or, if you need to handle filenames en masse:

    find ... -print0 | xargs -0 command

for GNU find/xargs, or (POSIX find):

    find ... -exec command {} +

Use that unless you really can't.

Another way to deal with files with spaces in their names is to use the shell's filename expansion (["globbing"]). This has the disadvantage of not working recursively (except with zsh's extensions), but if you just need to process all the files in a single directory, it works fantastically well.

This example changes all the *.mp3 files in the current directory to use underscores in their names instead of spaces. (But it will not work in the original BourneShell.)

for file in *.mp3; do
    mv "$file" "${file// /_}"
done

You could do the same thing for all files (regardless of extension) by using

for file in *\ *; do

instead of *.mp3.

Another way to handle filenames recursively involves using the -print0 option of find (a GNU/BSD extension), together with bash's -d option for read:

unset a i
while IFS= read -r -d '' file; do
  a[i++]="$file"        # or however you want to process each file
done < <(find /tmp -type f -print0)

The preceding example reads the names of all the files under /tmp (recursively) into an array, even if they have newlines or other whitespace in their names, by forcing read to use the NUL byte (\0) as its delimiter instead of a newline. Since NUL is not a valid byte in Unix filenames, this is the safest approach besides using find -exec.
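To convince yourself that this really copes with awkward names, the loop can be tried on a scratch directory. This is a sketch for Bash with a find that supports -print0; the function name collect_files and the global array found are invented for the example.

```bash
# collect_files DIR - fill the global array "found" with all regular files below DIR
collect_files() {
    found=()
    local file
    # IFS= keeps leading and trailing whitespace, -r keeps backslashes,
    # -d '' makes read stop at each NUL byte instead of at each newline
    while IFS= read -r -d '' file; do
        found+=("$file")
    done < <(find "$1" -type f -print0)
}

dir=$(mktemp -d)
touch "$dir/plain" "$dir/with space" "$dir/with
newline"
collect_files "$dir"
echo "found ${#found[@]} files"
rm -rf "$dir"
```

With a newline-delimited loop, the third file would be seen as two separate names.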

Anchor(faq21)

21. How can I replace a string with another string in all files?

sed is a good command to replace strings, e.g.

    sed 's/olddomain\.com/newdomain\.com/g' input > output

To replace a string in all files of the current directory:

    for i in *; do
        sed 's/old/new/g' "$i" > atempfile && mv atempfile "$i"
    done

GNU sed 4.x has a special, nonstandard -i flag which makes the temp file unnecessary (some other sed implementations offer a similar option, but with different syntax):

   for i in *; do
      sed -i 's/old/new/g' "$i"
   done

Those of you who have perl 5 can accomplish the same thing using this code:

    perl -pi -e 's/old/new/g' *

Recursively:

    find . -type f -print0 | xargs -0 perl -pi -e 's/old/new/g'

For example, to replace every "unsigned" with "unsigned long", unless it is already followed by "int", "short", "long" or "char":

    perl -i.bak -pe 's/\bunsigned\b(?!\s+(int|short|long|char))/unsigned long/g' $(find . -type f)

Finally, here's a script that some people may find useful:

    :
    # chtext - change text in several files

    # neither string may contain '|' unquoted
    old='olddomain\.com'
    new='newdomain\.com'

    # if no files were specified on the command line, use all files:
    [ $# -lt 1 ] && set -- *

    for file
    do
        [ -f "$file" ] || continue # do not process e.g. directories
        [ -r "$file" ] || continue # cannot read file - ignore it
        # Replace string, write output to temporary file. Terminate script in case of errors
        sed "s|$old|$new|g" "$file" > "$file"-new || exit
        # If the file has changed, overwrite original file. Otherwise remove copy
        if cmp "$file" "$file"-new >/dev/null 2>&1
        then rm "$file"-new              # file has not changed
        else mv "$file"-new "$file"      # file has changed: overwrite original file
        fi
    done

If the code above is put into a script file (e.g. chtext), the resulting script can be used to change a text e.g. in all HTML files of the current and all subdirectories:

    find . -type f -name '*.html' -exec chtext {} \;

Many optimizations are possible:

  • use another sed separator character than '|', e.g. ^A (ASCII 1)

  • some implementations of sed (e.g. GNU sed) have an "-i" option that can change a file in-place; no temporary file is necessary in that case

  • the find command above could use either xargs or the built-in equivalent of POSIX find (-exec ... {} +)

Note: set -- * in the code above is safe with respect to files whose names contain spaces. The expansion of * by set is the same as the expansion done by for, and filenames will be preserved properly as individual parameters, and not broken into words on whitespace.

A more sophisticated example of chtext is here: http://www.shelldorado.com/scripts/cmds/chtext

Anchor(faq22)

22. How can I calculate with floating point numbers instead of just integers?

["BASH"] does not have built-in floating point arithmetic:

    $ echo $((10/3))
    3

For better precision, an external program must be used, e.g. bc, awk or dc:

    $ echo "scale=3; 10/3" | bc
    3.333

The "scale=3" command notifies bc that three digits of precision after the decimal point are required.

If you are trying to compare floating point numbers, be aware that a simple x < y is not supported by all versions of bc. Alternatively, you could use this:

    if [[ $(bc <<< "1.4 - 2.5") = -* ]]; then
        echo "1.4 is less than 2.5."
    fi

This example subtracts 2.5 from 1.4, and checks the sign of the result. If it is negative, the former number is less than the latter.

awk can be used for calculations, too:

    $ awk 'BEGIN {printf "%.3f\n", 10 / 3}' /dev/null
    3.333

There is a subtle but important difference between the bc and the awk solution here: bc reads commands and expressions from standard input. awk on the other hand evaluates the expression as part of the program. Expressions on standard input are not evaluated, i.e. echo 10/3 | awk '{print $0}' will print 10/3 instead of the evaluated result of the expression.

This explains why the example uses /dev/null as an input file for awk: the program evaluates the BEGIN action, evaluating the expression and printing the result. Afterwards the work is already done: it reads its input (here /dev/null), gets an end-of-file indication, and terminates. If no file had been specified, awk would wait for data on standard input.

Newer versions of KornShell93 have built-in floating point arithmetic, together with mathematical functions like sin() or cos() .
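For comparisons, awk can also do the whole job and report the result through its exit status, which avoids looking at the sign of a difference. A minimal sketch; the function name float_lt is made up, and adding 0 to each value is only there to force a numeric comparison:

```bash
# float_lt A B - succeed if A is numerically less than B
float_lt() {
    awk -v a="$1" -v b="$2" 'BEGIN { exit !(a + 0 < b + 0) }'
}

if float_lt 1.4 2.5; then
    echo "1.4 is less than 2.5."
fi
```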

Anchor(faq23)

23. Removed.

Anchor(faq24)

24. I set variables in a loop. Why do they suddenly disappear after the loop terminates?

The following command always prints "total number of lines: 0", although the variable linecnt has a larger value in the while loop:

    linecnt=0
    cat /etc/passwd | while read line
    do
        linecnt=`expr $linecnt + 1`
    done
    echo "total number of lines: $linecnt"

The reason for this surprising behaviour is that a while/for/until loop runs in a subshell when its input or output is redirected from a pipeline. For the while loop above, a new subshell with its own copy of the variable linecnt is created (initial value, taken from the parent shell: "0"). This copy then is used for counting. When the while loop is finished, the subshell copy is discarded, and the original variable linecnt of the parent (whose value has not changed) is used in the echo command.

It's hard to tell when a shell will create a new process for a loop:

  • BourneShell creates it when the input or output is redirected, either by using a pipeline or by a redirection operator ('<', '>').

  • ["BASH"] creates a new process only if the loop is part of a pipeline.
  • KornShell creates it only if the loop is part of a pipeline, but not if the loop is the last part of it.

To solve this, either use a method that works without a subshell (shown below), or make sure you do all processing inside that subshell (a bit of a kludge, but easier to work with):

    linecnt=0
    cat /etc/passwd |
    (
        while read line ; do
                linecnt="$((linecnt+1))"
        done
        echo "total number of lines: $linecnt"
    )

To avoid the subshell completely (not easily possible if the other part of the pipe is a command!), use redirection, which does not have this problem in ["BASH"] and KornShell (though it still occurs in BourneShell):

    linecnt=0
    while read line ; do
        linecnt="$((linecnt+1))"
    done < /etc/passwd
    echo "total number of lines: $linecnt"

For ["BASH"], when the first part of the pipe is a command, you can use "process substitution". The command used here is a simple "echo -e $'a\nb\nc'" as a substitute for a command with a multiline output:

    while read LINE; do
        echo "-> $LINE"
    done < <(echo -e $'a\nb\nc')

A portable and common work-around is to redirect the input of the read command using exec:

    linecnt=0
    exec < /etc/passwd    # redirect standard input from the file /etc/passwd
    while read line       # "read" gets its input from the file /etc/passwd
    do
        linecnt=`expr $linecnt + 1`
    done
    echo "total number of lines: $linecnt"

This works as expected, and prints a line count for the file /etc/passwd. But the input is redirected from that file permanently. What if we need to read the original standard input sometime later again? In that case we have to save a copy of the original standard input file descriptor, which we later can restore:

    exec 3<&0    # save original standard input file descriptor "0" as FD "3"
    exec 0</etc/passwd    # redirect standard input from the file /etc/passwd

    linecnt=0
    while read line       # "read" gets its input from the file /etc/passwd
    do
        linecnt=`expr $linecnt + 1`
    done

    exec 0<&3   # restore saved standard input (fd 0) from file descriptor "3"
    exec 3<&-   # close the no longer needed file descriptor "3"

    echo "total number of lines: $linecnt"

Consecutive exec commands can be combined into one line, which is interpreted left-to-right:

    exec 3<&0
    exec 0</etc/passwd
    _...read redirected standard input..._
    exec 0<&3
    exec 3<&-

is equivalent to

    exec 3<&0 0</etc/passwd
    _...read redirected standard input..._
    exec 0<&3 3<&-
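The effect of the subshell can be seen directly by running the same loop both ways. This is a sketch for Bash with its default settings; the function names are invented, and printf stands in for any command with multi-line output.

```bash
# Pipeline: the loop runs in a subshell, so the counter of the
# surrounding function is still 0 afterwards.
count_with_pipe() {
    local count=0 line
    printf 'a\nb\nc\n' | while read -r line; do
        count=$((count + 1))
    done
    echo "$count"
}

# Process substitution: the loop runs in the current shell,
# so the counter keeps its final value.
count_with_procsub() {
    local count=0 line
    while read -r line; do
        count=$((count + 1))
    done < <(printf 'a\nb\nc\n')
    echo "$count"
}

echo "pipeline:             $(count_with_pipe)"
echo "process substitution: $(count_with_procsub)"
```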

Anchor(faq25)

25. How can I access positional parameters after $9?

Use ${10} instead of $10. This works for ["BASH"] and KornShell, but not for older BourneShell implementations. Another way to access arbitrary positional parameters after $9 is to use for, e.g. to get the last parameter:

    for last
    do
        : # nothing
    done

    echo "last argument is: $last"

To get an argument by number, we can use a counter:

    n=12        # This is the number of the argument we are interested in
    i=1
    for arg
    do
        if [ $i -eq $n ]
        then
            argn=$arg
            break
        fi
        i=`expr $i + 1`
    done
    echo "argument number $n is: $argn"

This has the advantage of not "consuming" the arguments. If this is no problem, the shift command discards the first positional arguments:

    shift 11
    echo "the 12th argument is: $1"

Although direct access to any positional argument is possible this way, it's rarely needed. The common way is to use the getopts builtin to process command line options (e.g. "-l", or "-o filename"), and then use either for or while to process all remaining arguments in turn. An explanation of how to process command line arguments is available here: http://www.shelldorado.com/goodcoding/cmdargs.html
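A minimal sketch of that pattern, using the two example options mentioned above. The function name parse_args and the variables long and outfile are invented for the illustration:

```bash
# parse_args [-l] [-o FILE] [ARG...] - hypothetical example
parse_args() {
    local opt long=no outfile= OPTIND=1
    while getopts 'lo:' opt; do
        case $opt in
            l) long=yes ;;
            o) outfile=$OPTARG ;;       # the colon in 'o:' means -o takes an argument
            *) echo "usage: parse_args [-l] [-o file] [arg...]" >&2
               return 2 ;;
        esac
    done
    shift $((OPTIND - 1))               # drop the options that were processed

    echo "long=$long outfile=$outfile"
    local arg
    for arg; do                         # all remaining arguments, however many
        echo "arg: $arg"
    done
}

parse_args -l -o "out file" one "two words"
```

No argument is ever addressed by number here, so it does not matter whether there are more than nine of them.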

Anchor(faq26)

26. How can I randomize (shuffle) the order of lines in a file? (Or select a random line from a file, or select a random file from a directory.)

    randomize(){
        while IFS= read -r l ; do echo "0$RANDOM $l" ; done |
        sort -n |
        cut -d" " -f2-
    }

Note: the leading 0 is to make sure it doesn't break if the shell doesn't support $RANDOM, which is supported by ["BASH"], KornShell and KornShell93, but is not required by ["POSIX"] and is missing from BourneShell.

The same idea (printing random numbers in front of a line, and sorting the lines on that column) using other programs:

    awk '
        BEGIN { srand() }
        { print rand() "\t" $0 }
    ' |
    sort -n |    # Sort numerically on first (random number) column
    cut -f2-     # Remove sorting column

This is faster than the previous solution, but will not work for very old AWK implementations (try "nawk", or "gawk", if available).

A related question we frequently see is, "How can I print a random line from a file?" The problem here is that you need to know in advance how many lines the file contains. Lacking that knowledge, you have to read the entire file through once just to count them -- or, you have to suck the entire file into memory. Let's explore both of these approaches.

   n=$(wc -l < "$file")        # Count number of lines.
   r=$((RANDOM % n + 1))       # Random number from 1..n.
   sed -n "$r{p;q;}" "$file"   # Print the r'th line.

(These examples use the answer from [#faq11 FAQ 11] to print the n'th line.) The first one's pretty straightforward -- we use wc to count the lines, choose a random number, and then use sed to print the line. If we already happened to know how many lines were in the file, we could skip the wc command, and this would be a very efficient approach.

The next example sucks the entire file into memory. This approach saves time reopening the file, but obviously uses more memory. (Arguably, on systems with sufficient memory and an effective disk cache, the earlier method has already pulled the file into memory as well; and where memory is too tight for that, you should not read the whole file into an array either.)

   oIFS=$IFS IFS=$'\n' lines=($(<"$file")) IFS=$oIFS
   n=${#lines[@]}
   r=$((RANDOM % n))
   echo "${lines[r]}"

Note that we don't add 1 to the random number in this example, because the array of lines is indexed counting from 0.

Also, some people want to choose a random file from a directory (for a signature on an e-mail, or to choose a random song to play, or a random image to display, etc.). A similar technique can be used:

    files=(*.ogg)               # Or *.gif, or *
    n=${#files[@]}              # For aesthetics
    xmms "${files[RANDOM % n]}" # Choose a random element

Anchor(faq27)

27. How can two processes communicate using named pipes (fifos)?

NamedPipes, also known as FIFOs ("First In First Out"), are well suited for inter-process communication. The advantage over using files as a means of communication is that processes are synchronized by pipes: a process writing to a pipe blocks if there is no reader, and a process reading from a pipe blocks if there is no writer.

Here is a small example of a server process communicating with a client process. The server sends commands to the client, and the client acknowledges each command:

Server

# server - communication example

# Create a FIFO. Some systems don't have a "mkfifo" command, but use
# "mknod pipe p" instead

mkfifo pipe

while sleep 1
do
    echo "server: sending GO to client"

    # The following command will cause this process to block (wait)
    # until another process reads from the pipe
    echo GO > pipe

    # A client read the string! Now wait for its answer. The "read"
    # command again will block until the client wrote something
    read answer < pipe

    # The client answered!
    echo "server: got answer: $answer"
done

Client

# client

# We cannot start working until the server has created the pipe...
until [ -p pipe ]
do
    sleep 1;    # wait for server to create pipe
done

# Now communicate...

while sleep 1
do
    echo "client: waiting for data"

    # Wait until the server sends us one line of data:
    read data < pipe

    # Received one line!
    echo "client: read <$data>, answering"

    # Now acknowledge that we got the data. This command
    # again will block until the server read it.
    echo ACK > pipe
done

Write both examples to files server and client respectively, and start them concurrently to see it working:

    $ chmod +x server client
    $ ./server & ./client &
    server: sending GO to client
    client: waiting for data
    client: read <GO>, answering
    server: got answer: ACK
    server: sending GO to client
    client: waiting for data
    client: read <GO>, answering
    server: got answer: ACK
    server: sending GO to client
    client: waiting for data
    [...]
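The blocking behaviour can also be tried in a single script, with the writer running in the background. This is only a sketch; the function name and the temporary directory are choices made for the example.

```bash
# fifo_demo - send one line through a named pipe and print what arrived
fifo_demo() {
    local dir fifo reply
    dir=$(mktemp -d) || return 1
    fifo=$dir/pipe
    mkfifo "$fifo" || return 1

    echo GO > "$fifo" &         # the writer blocks until a reader opens the pipe
    read -r reply < "$fifo"     # the reader blocks until the writer has written
    wait                        # collect the background writer

    rm -rf "$dir"
    echo "received: $reply"
}

fifo_demo
```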

Anchor(faq28)

28. How do I determine the location of my script? I want to read some config files from the same place.

This is a complex question because there's no single right answer to it. Even worse: it's not possible to find the location reliably in 100% of all cases. All ways of finding a script's location depend on the name of the script, as seen in the predefined variable $0. But providing the script name in $0 is only a (very common) convention, not a requirement.

The suspect answer is "in some shells, $0 is always an absolute path, even if you invoke the script using a relative path, or no path at all". That is not the case in ["BASH"], and it isn't reliable across shells in general; some of them return the actual command typed in by the user instead of the fully qualified path. In those cases, if all you want is the fully qualified version of $0, you can use something like this (["BASH"] and KornShell, non-Bourne):

  [[ $0 = /* ]] && echo "$0" || echo "$PWD/$0"

Or the BourneShell version:

  case $0 in /*) echo "$0";; *) echo "`pwd`/$0";; esac

Or a shell-independent variant (needs a readlink supporting -f, though):

  readlink -f "$0"

However, this approach has some major drawbacks. The most important is that the script name (as seen in $0) may not be relative to the current working directory, but relative to a directory from the program search path $PATH (this is often seen with KornShell).

Another drawback is that there is really no guarantee that your script is still in the same place it was when it first started executing. Suppose your script is loaded from a temporary file which is then unlinked immediately... your script might not even exist on disk any more! The script could also have been moved to a different location while it was executing. Or (and this is most likely by far...) there might be multiple links to the script from multiple locations, one of them being a simple symlink from a common PATH directory like /usr/local/bin, which is how it's being invoked. Your script might be in /opt/foobar/bin/script but the naive approach of reading $0 won't tell you that.

(For a more general discussion of the Unix file system and how symbolic links affect your ability to know where you are at any given moment, see [http://www.cs.bell-labs.com/sys/doc/lexnames.html this Plan 9 paper].)

So if the name in $0 is a relative one, i.e. does not start with '/', we can still try to search the script like the shell would have done: in all directories from $PATH.

The following script shows how this could be done:

    myname=$0
    if [ -s "$myname" ] && [ -x "$myname" ]
    then                   # $myname is already a valid file name
        mypath=$myname
    else
        case "$myname" in
        /*) exit 1;;             # absolute path - do not search PATH
        *)
            # Search all directories from the PATH variable. Take
            # care to interpret leading and trailing ":" as meaning
            # the current directory; the same is true for "::" within
            # the PATH.

            for dir in `echo "$PATH" | sed 's/^:/.:/g;s/::/:.:/g;s/:$/:./;s/:/ /g'`
            do
                [ -f "$dir/$myname" ] || continue # no file
                [ -x "$dir/$myname" ] || continue # not executable
                mypath=$dir/$myname
                break           # only return first matching file
            done
            ;;
        esac
    fi

    if [ -f "$mypath" ]
    then
        : # echo >&2 "DEBUG: mypath=<$mypath>"
    else
        echo >&2 "cannot find full path name: $myname"
        exit 1
    fi

    echo >&2 "path of this script: $mypath"

Note that $mypath is not necessarily an absolute path name. It still can contain relative parts like ../bin/myscript.

Generally storing data files in the same directory as their scripts is a bad practice. The Unix file system layout assumes that files in one place (e.g. /bin) are executable programs, while files in another place (e.g. /etc) are data files. (Let's ignore legacy Unix systems with programs in /etc for the moment, shall we....)

It really makes the most sense to keep your script's configuration in a single, static location such as $SCRIPTROOT/etc/foobar.conf. If you need to define multiple configuration files, then you can have a directory (say, /var/lib/foobar or /usr/local/lib/foobar), and read that directory's location from a variable in /etc/foobar.conf. If you don't even want that much to be hard-coded, you could pass the location of foobar.conf as a parameter to the script. If you need the script to assume certain defaults in the absence of /etc/foobar.conf, you can put defaults in the script itself, and/or fall back to something like $HOME/.foobar.conf if /etc/foobar.conf is missing. (This depends on what your script does. In some cases, it may make more sense to abort gracefully.)
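One way to put that lookup order into code is sketched below. The function name find_config is invented, and the order shown (explicit parameter first, then /etc/foobar.conf, then $HOME/.foobar.conf) is just one reasonable reading of the paragraph above.

```bash
# find_config [FILE] - print the name of the configuration file to use
find_config() {
    local candidate
    for candidate in "$1" /etc/foobar.conf "$HOME/.foobar.conf"; do
        # skip an empty first argument and anything that cannot be read
        if [ -n "$candidate" ] && [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1    # nothing found: use built-in defaults, or abort gracefully
}
```

A script might then call it as conf=$(find_config "$1") and decide what to do if that fails.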

Anchor(faq29)

29. How can I display the value of a symbolic link?

The external command readlink can be used to display the value of a symbolic link.

$ readlink /bin/sh
bash

You can also use GNU find's %l directive, which is especially useful if you need to resolve links in batches:

$ find /bin/ -type l -printf '%p points to %l\n'
/bin/sh points to bash
/bin/bunzip2 points to bzip2
...

If your system lacks readlink, you can use a function like this one:

readlink() {
    local path=$1 ll

    if [ -L "$path" ]; then
        ll="$(LC_ALL=C ls -l "$path" 2> /dev/null)" &&
        echo "${ll/* -> }"
    else
        return 1
    fi
}

Anchor(faq30)

30. How can I rename all my *.foo files to *.bar? How can I convert all upper-case file names to lower case?

Some GNU/Linux distributions have a rename command, which you can use for the former; however, the syntax differs from one distribution to the next, so it's not a portable answer. (Consult your system's man pages if you want to learn how to use yours, if you have one at all. It's often perfectly good for one-shot interactive renames, just not in portable scripts.)

You can do mass renames in POSIX shells like this:

for f in *.foo; do mv "$f" "${f%.foo}.bar"; done

This invokes the external command mv once for each file, so it may not be as efficient as some of the rename implementations.

If you want to do it recursively, then it becomes much more challenging. This example works (in ["BASH"]) as long as no files have newlines in their names:

find . -name '*.foo' -print | while IFS=$'\n' read -r f; do
  mv "$f" "${f%.foo}.bar"
done
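If newlines in file names are a concern, find can pass the names to a small inline script as arguments instead of printing them. This is a sketch of that alternative, not the method above; the function name rename_ext and its three arguments are made up for the example.

```bash
# rename_ext DIR OLD NEW - rename every *.OLD below DIR to *.NEW
rename_ext() {
    local dir=$1 old=$2 new=$3
    find "$dir" -type f -name "*.$old" -exec sh -c '
        old=$1 new=$2
        shift 2                     # what remains are the file names
        for f do
            base=${f%".$old"}       # strip the old extension
            mv -- "$f" "$base.$new"
        done
    ' sh "$old" "$new" {} +
}
```

It would be called as rename_ext . foo bar. Since the names never pass through a pipe, no delimiter has to be chosen at all.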

To convert filenames to lower case:

# tolower - convert file names to lower case

for file in *
do
    [ -f "$file" ] || continue                  # ignore non-existing names
    newname=$(echo "$file" | tr 'A-Z' 'a-z')    # lower-case version of file name
    [ "$file" = "$newname" ] && continue        # nothing to do
    [ -f "$newname" ] && continue               # do not overwrite existing files
    mv "$file" "$newname"
done

Purists will insist on using

tr '[:upper:]' '[:lower:]'

in the above code, in case of non-ASCII (e.g. accented) letters in locales which have them. Note that tr can behave very strangely when using the A-Z range on some systems:

imadev:~$ echo Hello | tr A-Z a-z
hÉMMÓ

To make sure you aren't caught by surprise when using tr, either use the fancy range notations, or set your locale to C.

imadev:~$ echo Hello | LC_ALL=C tr A-Z a-z
hello
imadev:~$ echo Hello | tr '[:upper:]' '[:lower:]'
hello
# Either way is fine here.

This technique can also be used to replace all unwanted characters in a file name e.g. with '_' (underscore). The script is the same as above, only the "newname=..." line has changed.

# renamefiles - rename files whose names contain unusual characters
for file in *
do
    [ -f "$file" ] || continue                  # ignore non-existing names
    newname=$(echo "$file" | sed 's/[^a-zA-Z0-9_.]/_/g')
    [ "$file" = "$newname" ] && continue        # nothing to do
    [ -f "$newname" ] && continue               # do not overwrite existing files
    mv "$file" "$newname"
done

The character class in [] contains all allowed characters; modify it as needed.

If you have the utility "mmv" on your machine, you could simply do

mmv "*" "#l1"

Another common form of this question is "How do I rename all my MP3 files so that they have underscores instead of spaces?" You can use this:

for f in *\ *.mp3; do mv "$f" "${f// /_}"; done

Anchor(faq31)

31. What is the difference between the old and new test commands ([ and [[)?

[ ("test" command) and [[ ("new test" command) are both used to evaluate expressions. Some examples:

    if [ -z "$variable" ]
    then
        echo "variable is empty!"
    fi

    if [ ! -f "$filename" ]
    then
        echo "not a valid, existing file name: $filename"
    fi

and

    if [[ ! -e $file ]]
    then
        echo "directory entry does not exist: $file"
    fi

    if [[ $file0 -nt $file1 ]]
    then
        echo "file $file0 is newer than $file1"
    fi

To cut a long story short: [ implements the old, portable syntax of the command. Although all modern shells have built-in implementations, there usually still is an external executable of that name, e.g. /bin/[. [[ is a new, improved version of it, which is a keyword, not a program. This has beneficial effects on ease of use; see below. [[ is understood by KornShell, KornShell93 and ["BASH"] (e.g. 2.03), but it is not specified by ["POSIX"] and is not understood by the older BourneShell.

Although [ and [[ have much in common, and share many expression operators like "-f", "-s", "-n", "-z", there are some notable differences. Here is a comparison list:

Feature                                 new test [[   old test [        Example
string comparison                       >             (not available)   -
                                        <             (not available)   -
                                        == (or =)     =                 -
                                        !=            !=                -
expression grouping                     &&            -a                [[ -n $var && -f $var ]] && echo "$var is a file"
                                        ||            -o                -
Pattern matching                        == (or =)     (not available)   [[ $name = a* ]] || echo "name does not start with an 'a': $name"
In-process regular expression matching  =~            (not available)   [[ $(date) =~ ^Fri\ ...\ 13 ]] && echo "It's Friday the 13th!"

Special primitives that [[ is defined to have, but [ may be lacking (depending on the implementation):

Description                          Primitive   Example
entry (file or directory) exists     -e          [[ -e $config ]] && echo "config file exists: $config"
file is newer/older than other file  -nt / -ot   [[ $file0 -nt $file1 ]] && echo "$file0 is newer than $file1"
two files are the same               -ef         [[ $input -ef $output ]] && { echo "will not overwrite input file: $input"; exit 1; }
negation                             !           -

But there are more subtle differences.

  • No field splitting will be done for [[ (and therefore many arguments need not be quoted)

     file="file name"
     [[ -f $file ]] && echo "$file is a file"

    will work even though $file is not quoted and contains whitespace. With [ the variable needs to be quoted:

     file="file name"
     [ -f "$file" ] && echo "$file is a file"

    This makes [[ easier to use and less error prone.

  • No file name generation will be done for [[. Therefore the following line tries to match the contents of the variable $path with the pattern /*

     [[ $path = /* ]] && echo "\$path starts with a forward slash /: $path"

    The next command most likely will result in an error, because /* is subject to file name generation:

     [ $path = /* ] && echo "this does not work"

    [[ is strictly used for strings and files. If you want to compare numbers, use an ArithmeticExpression ((expression)), e.g.

     i=0
     while ((i<10))
     do
        echo $i
        ((i=$i+1))
     done

When should the new test command [[ be used, and when the old one [? If portability to the BourneShell is a concern, the old syntax should be used. If on the other hand the script requires ["BASH"] or KornShell, the new syntax could be preferable.
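
The quoting difference described above is easy to see for yourself. A small sketch, assuming Bash; the scratch directory and the variable names are made up for the demonstration:

```shell
demo=$(mktemp -d)
file="$demo/file name"      # contains a space
touch "$file"

# [[ does no word splitting, so the unquoted variable is fine.
if [[ -f $file ]]; then new_test=ok; else new_test=failed; fi

# [ sees the unquoted variable as two words and fails
# (bash complains about a missing binary operator).
if [ -f $file ] 2>/dev/null; then old_unquoted=ok; else old_unquoted=failed; fi

# Quoting makes [ behave.
if [ -f "$file" ]; then old_quoted=ok; else old_quoted=failed; fi

echo "[[ unquoted: $new_test, [ unquoted: $old_unquoted, [ quoted: $old_quoted"
```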

Anchor(faq32)

32. How can I redirect the output of 'time' to a variable or file?

The reason that 'time' needs special care for redirecting its output is one of those mysteries of the universe. It will probably be solved around the same time we find dark matter.

  • File Redirection

     bash -c "time ls" > /path/to/foo 2>&1
     ( time ls ) > /path/to/foo 2>&1
     { time ls; } > /path/to/foo 2>&1
  • Variable Redirection

     foo=$( bash -c "time ls" 2>&1 )
     foo=$( ( time ls ) 2>&1 )
     foo=$( { time ls; } 2>&1 )

Note: Using 'bash -c' and ( ) creates a subshell, using { } does not. Do with that as you wish.
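
A common follow-up is capturing only the timing, while the timed command's own output still goes wherever it normally would. A minimal sketch, assuming Bash (TIMEFORMAT is a Bash variable); sleep 1 is a stand-in for the real command, and the descriptor numbers 3 and 4 are arbitrary:

```shell
TIMEFORMAT=%R                 # report only the elapsed real time, in seconds

exec 3>&1 4>&2                # save the script's stdout and stderr
elapsed=$( { time sleep 1 1>&3 2>&4; } 2>&1 )
exec 3>&- 4>&-                # close the saved descriptors again

echo "the command took $elapsed seconds"
```

The command writes to the saved descriptors, so the only thing left on the group's stderr, and therefore the only thing captured, is what time prints.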

Anchor(faq33)

33. How can I find a process ID for a process given its name?

Usually a process is referred to using its process ID (PID), and the ps command can display the information for any process given its process ID, e.g.

    $ echo $$         # my process id
    21796
    $ ps -p 21796
    PID TTY          TIME CMD
    21796 pts/5    00:00:00 ksh

Frequently, however, only the name of a process is known, not its process ID. Some operating systems, e.g. Solaris, BSD, and some versions of Linux, have a dedicated command to search for a process given its name, called pgrep:

    $ pgrep init
    1

Often there is an even more specialized program available to not just find the process ID of a process given its name, but also to send a signal to it:

    $ pkill myprocess

Some systems also provide pidof. It differs from pgrep in that multiple output process IDs are only space separated, not newline separated.

    $ pidof cron
    5392

If these programs are not available, a user can search the output of the ps(1) command using grep.

The major problem when grepping the ps output is that grep may match its own ps entry (try: ps aux | grep init). To make matters worse, this does not happen every time; the technical name for this is a "race condition". There are several ways to avoid this:

  • Using grep -v at the end

     ps aux | grep name | grep -v grep

will throw away all lines containing "grep" from the output. Disadvantage: You always get the exit status of the grep -v, so you can't e.g. check if a specific process exists.

  • Using grep -v in the middle

     ps aux | grep -v grep | grep name

This does exactly the same, except that the exit status of "grep name" is accessible and tells you whether "name" is a process in ps or not. It still has the disadvantage of starting an extra process (grep -v).

  • Using [] in grep

     ps aux | grep '[n]ame'

This spawns only the one grep process that is needed. The trick is to use a [] character class (regular expressions). Putting only one character in a character class normally makes no sense at all, because [c] will always match just "c". That is the case here too: grep '[n]ame' searches for "name". But as grep's own process list entry is what you executed ("grep [n]ame") and not "grep name", it will not match itself. The quotes keep the shell from expanding [n]ame as a glob if a file called "name" happens to exist in the current directory.
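
Where pgrep and pidof are unavailable, another way around the self-matching problem is to ask ps for only the PID and command-name columns and compare the name exactly with awk. This is a sketch under assumptions: a ps that accepts the POSIX -e, -p and -o options, and command names without spaces in them. pids_of is a made-up name.

```shell
# pids_of is a hypothetical helper: print the PIDs of all processes whose
# command name is exactly $1.
pids_of() {
    ps -e -o pid= -o comm= | awk -v name="$1" '$2 == name { print $1 }'
}

# Demonstrate on a name that is certain to exist: this shell's own.
self=$(ps -p "$$" -o comm=)
pids_of "$self"
```

Because awk compares the command-name column rather than the whole line, the awk process itself can only show up if you search for "awk". Be aware that some systems truncate or otherwise alter the command name that ps reports.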

===BEGIN greycat rant===

Most of the time when someone asks a question like this, it's because they want to manage a long-running daemon using primitive shell scripting techniques. Common variants are "How can I get the PID of my foobard process.... so I can start one if it's not already running" or "How can I get the PID of my foobard process... because I want to prevent the foobard script from running if foobard is already active." Both of these questions will lead to seriously flawed production systems.

If what you really want is to restart your daemon whenever it dies, just do this:

while true; do
   mydaemon --in-the-foreground
done

where --in-the-foreground is whatever switch, if any, you must give to the daemon to PREVENT IT from automatically backgrounding itself. (Often, -d does this and has the additional benefit of running the daemon with increased verbosity.) Self-daemonizing programs may or may not be the target of a future greycat rant....

If that's too simplistic, look into [http://cr.yp.to/daemontools.html daemontools] or [http://smarden.org/runit/ runit], which are programs for managing services.

If what you really want is to prevent multiple instances of your program from running, then the only sure way to do that is by using a lock. For details on doing this, see ProcessManagement or [#faq45 FAQ 45].

===END greycat rant===

Anchor(faq34)

34. Can I do a spinner in Bash?

Sure.

    i=1
    sp="/-\|"
    echo -n ' '
    while true
    do
        echo -en "\b${sp:i++%${#sp}:1}"
    done

You can also use \r instead of \b. You can use pretty much any character sequence you want as well. If you want it to slow down, put a sleep command inside the loop.

To use as a function called from a loop on every iteration, for example:

sp="/-\|"
sc=0
spin() {
   echo -ne "\b${sp:sc++:1}"
   ((sc==4)) && sc=0
}

When printing the next output line (i.e. when the spin is over), use:  echo -e "\r$line"  or:  echo -en '\r'; echo "$line" 

A similar technique can be used to build progress bars.
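
In practice the spinner usually runs only while some background job is active. A rough sketch, assuming Bash and a sleep that accepts fractional seconds; spin_frame is a made-up helper, and sleep 1 stands in for the real job:

```shell
sp='/-\|'

# spin_frame N prints the spinner character for step N (hypothetical helper).
spin_frame() {
    printf '%s' "${sp:$(( $1 % ${#sp} )):1}"
}

sleep 1 &                       # stand-in for the real long-running job
job=$!

i=0
printf ' ' >&2
while kill -0 "$job" 2>/dev/null; do
    printf '\b%s' "$(spin_frame "$i")" >&2
    i=$((i + 1))
    sleep 0.1                   # fractional sleep is not POSIX
done
printf '\b \n' >&2              # erase the spinner
```

The spinner is written to stderr here so that it does not end up in any output the script's caller may be capturing.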

Anchor(faq35)

35. How can I handle command-line arguments to my script easily?

Well, that depends a great deal on what you want to do with them. Here's a general template that might help for the simple cases:

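    while [[ $1 == -* ]]; do
        case "$1" in
          -h|--help) show_help; exit 0;;
          -v) verbose=1; shift;;
          -f) output_file=$2; shift 2;;
          --) shift; break;;                                # explicit end of options
          -*) echo "invalid option: $1" >&2; exit 1;;       # otherwise we would loop forever
        esac
    done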
    # Now all of the remaining arguments are the filenames which followed
    # the optional switches.  You can process those with "for i" or "$@".

For more complex/generalized cases, or if you want things like "-xvf" to be handled as three separate flags, you can use getopts. (NEVER use getopt(1)!)

Here is a simplistic getopts example:

    x=1         # Avoids an error if we get no options at all.
    while getopts "abcf:g:h:" opt; do
      case "$opt" in
        a) echo "You said a";;
        b) echo "You said b";;
        c) echo "You said c";;
        f) echo "You said f, with argument $OPTARG";;
        g) echo "You said g, with argument $OPTARG";;
        h) echo "You said h, with argument $OPTARG";;
      esac
      x=$OPTIND
    done
    shift $((x-1))
    echo "Left overs: $@"
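
The same idea can be wrapped in a function that stores the results in variables and rejects bad options. This is a sketch, assuming Bash; parse_args, verbose, output_file and rest are names invented for the example:

```shell
# parse_args is a hypothetical helper. It sets the globals verbose,
# output_file and rest (the non-option arguments).
parse_args() {
    local opt OPTIND=1          # a local OPTIND lets the function be called repeatedly
    verbose=0 output_file='' rest=()
    # The leading ':' asks getopts to stay quiet so we can report errors ourselves.
    while getopts ':vf:' opt; do
        case $opt in
            v)  verbose=1 ;;
            f)  output_file=$OPTARG ;;
            :)  echo "option -$OPTARG requires an argument" >&2; return 1 ;;
            \?) echo "unknown option -$OPTARG" >&2; return 1 ;;
        esac
    done
    shift $((OPTIND - 1))
    rest=("$@")
}

parse_args -vf out.txt "first file" second
echo "verbose=$verbose output_file=$output_file leftovers=${#rest[@]}"
```

Note that -vf out.txt is handled as the two flags -v and -f, which is the case the hand-written loop cannot deal with.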

Anchor(faq36)

36. How can I get all lines that are: in both of two files (set intersection) or in only one of two files (set subtraction).

Use the comm(1) command.

  # intersection of file1 and file2
  comm -12 <(sort file1) <(sort file2)
  # subtraction of file1 from file2
  comm -13 <(sort file1) <(sort file2)

Read the comm(1) manpage for details.

If for some reason you lack the core comm(1) program, you can use these other methods:

An amazingly simple and fast implementation, which took just 20 seconds to match a 30k line file against a 400k line file for me.

Note that it probably only works with GNU grep, and that the file specified with -f will be loaded into RAM, so it doesn't scale to very large files.

It has grep read one of the sets as a pattern list from a file (-f), and interpret the patterns as plain strings, not regexps (-F), matching only whole lines (-x).

  # intersection of file1 and file2
  grep -xF -f file1 file2
  # subtraction of file1 from file2
  grep -vxF -f file1 file2

An implementation using sort and uniq:

  # intersection of file1 and file2
  # (assuming neither file1 nor file2 contains repeated lines)
  sort file1 file2 | uniq -d
  # file1-file2 (Subtraction)
  sort file1 file2 file2 | uniq -u
  # same way for file2 - file1, change last file2 to file1
  sort file1 file2 file1 | uniq -u

Another implementation of subtraction:

  cat file1 file1 file2 | sort | uniq -c |
  awk '{ if ($1 == 2) { $1 = ""; print; } }'

This may introduce an extra space at the start of the line; if that's a problem, just strip it away.

Also, this approach assumes that neither file1 nor file2 has any duplicates in it.

Finally, it sorts the output for you. If that's a problem, then you'll have to abandon this approach altogether. Perhaps you could use awk's associative arrays (or perl's hashes or tcl's arrays) instead.
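
The awk associative-array idea mentioned above could look roughly like this. It is a sketch rather than a tuned implementation; the file names and contents are made up, and the output keeps the order of the second file:

```shell
demo=$(mktemp -d)
printf '%s\n' pear apple fig > "$demo/file1"
printf '%s\n' fig kiwi apple > "$demo/file2"

# While reading the first file (NR == FNR), remember every line as an array key.
# While reading the second file, print the lines that are (or are not) keys.
both=$(awk 'NR == FNR { seen[$0]; next } $0 in seen' "$demo/file1" "$demo/file2")
only2=$(awk 'NR == FNR { seen[$0]; next } !($0 in seen)' "$demo/file1" "$demo/file2")

echo "in both:       $both"
echo "only in file2: $only2"
```

One caveat: if the first file is empty, NR == FNR is also true for the second file, so nothing is printed at all. Like the grep approach, this holds the whole first file in memory.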

Anchor(faq37)

37. How can I print text in various colors?

Do not hard-code ANSI color escape sequences in your program! The tput command lets you interact with the terminal database in a sane way.

  tput setaf 1; echo this is red
  tput setaf 2; echo this is green
  tput setaf 0; echo now we are back in black

tput reads the terminfo database which contains all the escape codes necessary for interacting with your terminal, as defined by the $TERM variable. For more details, see the terminfo(5) man page.

If you don't know in advance what your user's terminal's default text color is, you can use tput sgr0 to reset the colors to their default settings. This also removes boldface (tput bold), etc.

Anchor(faq38)

38. How do Unix file permissions work?

See ["Permissions"].

Anchor(faq39)

39. What are all the dot-files that bash reads?

See DotFiles.

Anchor(faq40)

40. How do I use dialog to get input from the user?

  foo=$(dialog --inputbox "text goes here" 8 40 2>&1 >/dev/tty)
  echo "The user typed '$foo'"

The redirection here is a bit tricky.

  1. The foo=$(command) is set up first, so the standard output of the command is being captured by bash.

  2. Inside the command, the 2>&1 causes standard error to be sent to where standard out is going -- in other words, stderr will now be captured.

  3. >/dev/tty sends standard output to the terminal, so the dialog box will be seen by the user. Standard error will still be captured, however.
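
The effect of the redirection order can be tried without dialog. In this sketch fake_dialog is a made-up stand-in that "draws" on stdout and reports its answer on stderr, and /dev/null takes the place of /dev/tty so the example also runs without a terminal:

```shell
# fake_dialog is a hypothetical stand-in for dialog.
fake_dialog() {
    echo "screen drawing"
    echo "user answer" >&2
}

# Same order as above: stderr is pointed at the capture first,
# then stdout is sent elsewhere.
answer=$(fake_dialog 2>&1 >/dev/null)

# Reversed order: stdout goes to /dev/null first, then stderr follows it there.
nothing=$(fake_dialog >/dev/null 2>&1)

echo "captured: '$answer', reversed order captured: '$nothing'"
```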

Another common dialog(1)-related question is how to dynamically generate a dialog command that has items which must be quoted (either because they're empty strings, or because they contain internal white space). One can use eval for that purpose, but the cleanest way to achieve this goal is to use an array.

  unset m; i=0
  words=(apple banana cherry "dog droppings")
  for w in "${words[@]}"; do
    m[i++]=$w; m[i++]=""
  done
  dialog --menu "Which one?" 12 70 9 "${m[@]}"

In the previous example, the for loop that populates the m array could instead have been a while loop reading from a pipeline, a file, etc.

Recall that the construction "${m[@]}" expands to the entire contents of an array, but with each element implicitly quoted. It's analogous to the "$@" construct for handling positional parameters. For more details, see [#faq50 FAQ50] below.

Here's another example, using filenames:

    files=(*.mp3)       # These may contain spaces, apostrophes, etc.
    cmd=(dialog --menu "Select one:" 22 76 16); n=6
    i=0
    for f in "${files[@]}"; do
        cmd[n++]=$((i++)); cmd[n++]="$f"
    done
    choice=$("${cmd[@]}" 2>&1 >/dev/tty)

The user's choice will be stored in the choice variable, as an integer, which can in turn be used as an index into the files array.

A separate but useful function of dialog is to track the progress of a process that produces output. Below is an example that uses dialog to track processes writing to a log file. In the dialog window, there is a tailbox where output is stored, and a msgbox with a clickable Quit. Clicking quit will cause trap to execute, removing the tempfile, and destroying the tail process.

  # you cannot tail a nonexistent file, so always ensure it pre-exists!
  rm -f dialog-tail.log; echo Initialize log >> dialog-tail.log
  date >> dialog-tail.log
  tempfile=`tempfile 2>/dev/null` || tempfile=/tmp/test$$
  trap "rm -f $tempfile" 0 1 2 5 15
  dialog --title "TAIL BOXES" \
        --begin 10 10 --tailboxbg dialog-tail.log 8 58 \
        --and-widget \
        --begin 3 10 --msgbox "Press OK " 5 30 \
        2>$tempfile &
  mypid=$!;
  for i in 1 2 3;  do echo $i >> dialog-tail.log; sleep 1; done
  echo Done. >> dialog-tail.log
  wait $mypid;

Anchor(faq41)

41. How do I determine whether a variable contains a substring?

  if [[ $foo = *bar* ]]

The above works in virtually all versions of Bash. Bash version 3 also allows regular expressions:

  if [[ $foo =~ ab*c ]]   # bash 3, matches abbbbcde, or ac, etc.

If you are programming in the BourneShell instead of Bash, there is a more portable (but less pretty) syntax:

  case "$foo" in
    *bar*) .... ;;
  esac

This should allow you to match variables against globbing-style patterns. If you need a portable way to match variables against regular expressions, use grep or egrep.

  if echo "$foo" | egrep some-regex >/dev/null; then ...
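
The portable case form is handy to wrap in a small function. A sketch, with contains being a made-up name; quoting "$2" inside the pattern makes the substring match literally, even when it holds glob characters:

```shell
# contains STRING SUBSTRING: succeed if SUBSTRING occurs in STRING
# (hypothetical helper name).
contains() {
    case $1 in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

if contains "foobarbaz" "bar"; then echo "found bar"; fi
if ! contains "abc" "*"; then echo "no literal * in abc"; fi
```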

Anchor(faq42)

42. How can I find out if a process is still running?

The kill command is used to send signals to a running process. As a convenience function, the signal "0", which does not exist, can be used to find out if a process is still running:

  •  myprog &          # Start program in the background
     daemonpid=$!      # ...and save its process id
    
     while sleep 60
     do
         if kill -0 $daemonpid       # Is the process still alive?
         then
             echo >&2 "OK - process is still running"
         else
             echo >&2 "ERROR - process $daemonpid is no longer running!"
             break
         fi
     done

This is one of those questions that usually masks a much deeper issue. It's rare that someone wants to know whether a process is still running simply to display a red or green light to an operator. More often, there's some ulterior motive, such as the desire to ensure that some daemon which is known to crash frequently is still running, or to ensure mutually exclusive access to a resource, etc. For much better discussion of these issues, see ProcessManagement or [#faq33 FAQ #33].
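
For a child of the current shell, the check can be seen in action like this. It is only an illustration; is_alive is a made-up name and sleep 1 stands in for the real program:

```shell
is_alive() {                    # hypothetical helper name
    kill -0 "$1" 2>/dev/null
}

sleep 1 &
pid=$!

if is_alive "$pid"; then before=running; else before=gone; fi
wait "$pid"                     # blocks until the child has exited
if is_alive "$pid"; then after=running; else after=gone; fi

echo "before wait: $before, after wait: $after"
```

Keep in mind that kill -0 also fails when the process exists but belongs to another user, and that a PID can eventually be reused by an unrelated process.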

Anchor(faq43)

43. Removed.

Anchor(faq44)

44. Removed.

Anchor(faq45)

45. How can I ensure that only one instance of a script is running at a time (mutual exclusion)?

We need some means of mutual exclusion. One easy way is to use a "lock": any number of processes can try to acquire the lock simultaneously, but only one of them will succeed.

How can we implement this using shell scripts? Some people suggest creating a lock file, and checking for its presence:

  •  # locking example -- WRONG
    
     lockfile=/tmp/myscript.lock
     if [ -f "$lockfile" ]
     then                      # lock is already held
         echo >&2 "cannot acquire lock, giving up: $lockfile"
         exit 0
     else                      # nobody owns the lock
         > "$lockfile"         # create the file
         #...continue script
     fi

This example does not work, because there is a time window between checking and creating the file. Assume two processes are running the code at the same time. Both check if the lockfile exists, and both get the result that it does not exist. Now both processes assume they have acquired the lock -- a disaster waiting to happen. We need an atomic check-and-create operation, and fortunately there is one: mkdir, the command to create a directory:

  •  # locking example -- CORRECT
    
     lockdir=/tmp/myscript.lock
     if mkdir "$lockdir"
     then    # directory did not exist, but was created successfully
         echo >&2 "successfully acquired lock: $lockdir"
         # continue script
     else
         echo >&2 "cannot acquire lock, giving up on $lockdir"
         exit 0
     fi

The advantage over using a lock file is that even when two processes call mkdir at the same time, at most one of them can succeed. This atomicity of check-and-create is ensured at the operating system kernel level.

Note that we cannot use "mkdir -p" to automatically create missing path components: "mkdir -p" does not return an error if the directory exists already, but that's the feature we rely upon to ensure mutual exclusion.

Now let's spice up this example by automatically removing the lock when the script finishes:

  •  lockdir=/tmp/myscript.lock
     if mkdir "$lockdir"
     then
         echo >&2 "successfully acquired lock"
     
         # Remove lockdir when the script finishes, or when it receives a signal
         trap 'rm -rf "$lockdir"' 0    # remove directory when script finishes
         trap "exit 2" 1 2 3 15        # terminate script when receiving signal
     
         # Optionally create temporary files in this directory, because
         # they will be removed automatically:
         tmpfile=$lockdir/filelist
     
     else
         echo >&2 "cannot acquire lock, giving up on $lockdir"
         exit 0
     fi

This example provides much more reliable mutual exclusion. There is still the disadvantage that a stale lock directory could remain when the script is terminated by a signal that is not caught (or by signal 9, SIGKILL, which cannot be caught). An example that remedies this (contributed by Charles Duffy) follows:

  • Are we sure this code's correct? There seems to be a discrepancy between the names LOCK_DEFAULT_NAME and DEFAULT_NAME; and it checks for processes in what looks to be a race condition; and it uses the Linux-specific /proc file system and the GNU-specific egrep -o to do so.... I don't trust it. It looks overly complex and fragile. And quite non-portable. -- GreyCat

     LOCK_DEFAULT_NAME=$0
     LOCK_HOSTNAME="$(hostname -f)"
     
     ## function to take the lock if free; will fail otherwise
     function grab-lock {
       local PROGRAMNAME="${1:-$DEFAULT_NAME}"
       local PID=${2:-$$}
       (
         umask 000;
         mkdir -p "/tmp/${PROGRAMNAME}-lock"
         mkdir "/tmp/${PROGRAMNAME}-lock/held" || return 1
         mkdir "/tmp/${PROGRAMNAME}-lock/held/${LOCK_HOSTNAME}--pid-${PID}" && return 0 || return 1
       ) 2>/dev/null
       return $?
     }
     
     ## function to nicely let go of the lock
     function release-lock {
       local PROGRAMNAME="${1:-$DEFAULT_NAME}"
       local PID=${2:-$$}
       (
         rmdir "/tmp/${PROGRAMNAME}-lock/held/${LOCK_HOSTNAME}--pid-${PID}" || true
         rmdir "/tmp/${PROGRAMNAME}-lock/held" && return 0 || return 1
       ) 2>/dev/null
       return $?
     }
     
     ## function to force anyone else off of the lock
     function break-lock {
       local PROGRAMNAME="${1:-$DEFAULT_NAME}"
       (
         [ -d "/tmp/${PROGRAMNAME}-lock/held" ] || return 0
         for DIR in "/tmp/${PROGRAMNAME}-lock/held/${LOCK_HOSTNAME}--pid-"* ; do
           OTHERPID="$(echo $DIR | egrep -o '[0-9]+$')"
           [ -d /proc/${OTHERPID} ] || rmdir $DIR
         done
         rmdir /tmp/${PROGRAMNAME}-lock/held && return 0 || return 1
       ) 2>/dev/null
       return $?
     }
     
     ## function to take the lock nicely, freeing it first if needed
     function get-lock {
       break-lock "$@" && grab-lock "$@"
     }

Instead of using mkdir we could also have used the program to create a symbolic link, ln -s.

For more discussion on these issues, see ProcessManagement.
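
The behaviour of the simple mkdir lock shown earlier can be checked with a few lines. This is a sketch only; acquire_lock, release_lock and the lock path are names invented for the demonstration:

```shell
# acquire_lock and release_lock are hypothetical helper names.
acquire_lock() { mkdir "$1" 2>/dev/null; }
release_lock() { rmdir "$1"; }

lockdir=${TMPDIR:-/tmp}/lockdemo.$$.lock   # made-up path for this demo

if acquire_lock "$lockdir"; then first=acquired;  else first=busy;  fi
if acquire_lock "$lockdir"; then second=acquired; else second=busy; fi
release_lock "$lockdir"
if acquire_lock "$lockdir"; then third=acquired;  else third=busy;  fi
release_lock "$lockdir"

echo "first: $first, second: $second, third: $third"
```

The second attempt fails because the directory already exists, which is exactly the property the lock relies on.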

Anchor(faq46)

46. I want to check to see whether a word is in a list (or an element is a member of a set).

The safest way to do this would be to loop over all elements in your set/list and check them for the element/word you are looking for. Say we are looking for the content of bar in the array foo:

  •    for element in "${foo[@]}"; do
          [[ $element = "$bar" ]] && echo "Found $bar."
       done

Or, to stop searching when you find it:

  •    for element in "${foo[@]}"; do
          [[ $element = "$bar" ]] && { echo "Found $bar."; break; }
       done
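
If you need this test in several places, the loop can be wrapped in a function. A sketch, assuming Bash; in_array is a made-up name. The right-hand side is quoted so that it is compared as a string rather than used as a pattern:

```shell
# in_array NEEDLE ELEMENT...: succeed if NEEDLE equals one of the elements
# (hypothetical helper name).
in_array() {
    local needle=$1 element
    shift
    for element; do
        [[ $element = "$needle" ]] && return 0
    done
    return 1
}

foo=(apple "dog droppings" '*')
in_array "dog droppings" "${foo[@]}" && echo "found it"
in_array dog "${foo[@]}" || echo "'dog' alone is not an element"
```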

If for some reason your list/set is not in an array, but is a string of words, and the element you are searching for is also a word, you can use this:

  •    for element in $foo; do
          [[ $element = "$bar" ]] && echo "Found $bar."
       done

A less safe, but more clever version:

  •    if [[ " $foo " = *\ "$bar"\ * ]]; then
          echo "Found $bar."
       fi

And, if for some reason you don't know the syntax of for well enough, here's how to check your script's parameters for an element. For example, '-v':

  •    for element; do
          [[ $element = '-v' ]] && echo "Switching to verbose mode."
       done

GNU's grep has a \b feature which allegedly matches the edges of words. Using that, one may attempt to replicate the "clever" approach used above, but it is fraught with peril:

  •    # Is 'foo' one of the positional parameters? 
       egrep '\bfoo\b' <<<"$@" >/dev/null && echo yes
       # This is where it fails: is '-v' one of the positional parameters?
       egrep '\b-v\b' <<<"$@" >/dev/null && echo yes
       # Unfortunately, \b sees "v" as a separate word.
       # Nobody knows what the hell it's doing with the "-".
    
       # Is "someword" in the array 'array'?
       egrep '\bsomeword\b' <<<"${array[@]}"
       # Obviously, you can't use this if someword is '-v'!

Since this "feature" of GNU grep is both non-portable and poorly defined, we don't recommend using it.

Anchor(faq47)

47. How can I redirect stderr to a pipe?

A pipe can only carry stdout of a program. To pipe stderr through it, you need to redirect stderr to the same destination as stdout. Optionally you can close stdout or redirect it to /dev/null to only get stderr. Some sample code:

# - 'myprog' is an example for a program that outputs both, stdout and
#   stderr
# - after the pipe I will just use a 'cat', of course you can put there
#   what you want

# version 1: redirect stderr towards the pipe while stdout survives (both come
# mixed)
myprog 2>&1 | cat                                                               
                                                                                
# version 2: redirect stderr towards the pipe without getting stdout (it's
# redirected to /dev/null)
myprog 2>&1 >/dev/null | cat
#Note that '>/dev/null' comes after '2>&1', otherwise the stderr will also be directed to /dev/null
                                                                                
# version 3: redirect stderr towards the pipe while the "original" stdout gets
# closed
myprog 2>&1 >&- | cat

One may also pipe stderr only but keep stdout intact (without a priori knowledge of where the script's output is going). This is a bit trickier.

This has an obvious application with eg. dialog, which draws (using ncurses) windows onto the screen to stdout, and returns output to stderr. This may be a little inconvenient, because it may lead to a necessary temporary file which we may like to evade. (Although this is not necessary -- see [#faq40 FAQ #40] for more examples of using dialog specifically!)

On [http://www.tldp.org/LDP/abs/html/io-redirection.html TLDP], I've found the following trick:

# Redirecting only stderr to a pipe.

exec 3>&1                                       # Save current "value" of stdout.
ls -l /dev/fd/ 2>&1 >&3 3>&- | grep bad 3>&-    # Close fd 3 for 'grep' and 'ls'.
#                       ^^^^            ^^^^
exec 3>&-                                       # Now close it for the remainder of the script.

# Thanks, S.C.

The output of the ls command shows where each file descriptor points to.

The same can be done without exec:

{ ls -l /dev/fd/ 2>&1 1>&3 3>&- | grep bad 3>&-; } 3>&1

To show it as a dialog one-liner:

exec 3>&1
dialog --menu Title 0 0 0 FirstItem FirstDescription 2>&1 >&3 3>&- | sed 's/First/Only/'
exec 3>&-

This lets the dialog window work properly, while the output of dialog (returned on stderr) is the part that gets altered by sed. Cheers.

A similar effect can be achieved with process substitution:

perl -e 'print "stdout\n"; warn "stderr\n"' 2> >(tr a-z A-Z)

This will pipe standard error through the tr command.
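
To convince yourself that only stderr travels through the pipe, the exec-free form can be tried with a test command. In this sketch both is a made-up function that writes one line to each stream, and the final sort is there only to make the order of the two lines predictable:

```shell
# both is a hypothetical test command.
both() {
    echo "to stdout"
    echo "to stderr" >&2
}

# fd 3 temporarily holds the original stdout. Inside the braces, stderr is
# pointed at the pipe, then stdout is pointed back at fd 3, so only stderr
# passes through tr.
result=$( { both 2>&1 1>&3 3>&- | tr a-z A-Z; } 3>&1 | LC_ALL=C sort )
echo "$result"
```

Only the stderr line comes out in upper case; the stdout line bypasses tr untouched.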

Anchor(faq48)

48. Why should I never use eval?

"eval" is a common misspelling of "evil". The section dealing with spaces in file names used to include the following quote "helpful tool (which is probably not as safe as the \0 technique)", end quote.

    Syntax : nasty_find_all [path] [command] <maxdepth>

    #This code is evil and must never be used
    export IFS=" "
    [ -z "$3" ] && set -- "$1" "$2" 1
    FILES=`find "$1" -maxdepth "$3" -type f -printf "\"%p\" "`
    #warning, evilness
    eval FILES=($FILES)
    for ((I=0; I < ${#FILES[@]}; I++))
    do
        eval "$2 \"${FILES[I]}\""
    done
    unset IFS

This script is supposed to recursively search for files with newlines and/or spaces in them, arguing that find -print0 | xargs -0 was unsuitable for some purposes such as multiple commands. It was followed by an instructional description on all the lines involved, which we'll skip.

In its defense, it works:

$ ls -lR
.:
total 8
drwxr-xr-x  2 vidar users 4096 Nov 12 21:51 dir with spaces
-rwxr-xr-x  1 vidar users  248 Nov 12 21:50 nasty_find_all

./dir with spaces:
total 0
-rw-r--r--  1 vidar users 0 Nov 12 21:51 file?with newlines
$ ./nasty_find_all . echo 3
./nasty_find_all
./dir with spaces/file
with newlines
$ 

But consider this:

$ touch "\"); ls -l $'\x2F'; #"

You just created a file called  "); ls -l $'\x2F'; #

Now FILES will contain  ""); ls -l $'\x2F'; #. When we do eval FILES=($FILES), it becomes

FILES=(""); ls -l $'\x2F'; #"

Which becomes the two statements  FILES=("");  and  ls -l / . Congratulations, you just allowed execution of arbitrary commands.

$ touch "\"); ls -l $'\x2F'; #"
$ ./nasty_find_all . echo 3
total 1052
-rw-r--r--   1 root root 1018530 Apr  6  2005 System.map
drwxr-xr-x   2 root root    4096 Oct 26 22:05 bin
drwxr-xr-x   3 root root    4096 Oct 26 22:05 boot
drwxr-xr-x  17 root root   29500 Nov 12 20:52 dev
drwxr-xr-x  68 root root    4096 Nov 12 20:54 etc
drwxr-xr-x   9 root root    4096 Oct  5 11:37 home
drwxr-xr-x  10 root root    4096 Oct 26 22:05 lib
drwxr-xr-x   2 root root    4096 Nov  4 00:14 lost+found
drwxr-xr-x   6 root root    4096 Nov  4 18:22 mnt
drwxr-xr-x  11 root root    4096 Oct 26 22:05 opt
dr-xr-xr-x  82 root root       0 Nov  4 00:41 proc
drwx------  26 root root    4096 Oct 26 22:05 root
drwxr-xr-x   2 root root    4096 Nov  4 00:34 sbin
drwxr-xr-x   9 root root       0 Nov  4 00:41 sys
drwxrwxrwt   8 root root    4096 Nov 12 21:55 tmp
drwxr-xr-x  15 root root    4096 Oct 26 22:05 usr
drwxr-xr-x  13 root root    4096 Oct 26 22:05 var
./nasty_find_all
./dir with spaces/file
with newlines
./
$

It doesn't take much imagination to replace  ls -l  with  rm -rf  or worse.

One might think these circumstances are obscure, but one should not be tricked by this. All it takes is one malicious user or, perhaps more likely, a benign user who left the terminal unlocked when going to the bathroom, who wrote a funny PHP upload script that doesn't sanity-check file names, who made the same mistake as oneself in allowing arbitrary code execution (now, instead of being limited to the www-user, an attacker can use nasty_find_all to traverse chroot jails and/or gain additional privileges), or who uses an IRC or IM client that's too liberal in the filenames it accepts for file transfers or conversation logs.
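
For comparison, here is one way to collect arbitrary file names into an array without eval. It is a sketch, not a drop-in replacement for the script above: it assumes Bash 3.1 or later (for +=), process substitution, and a find with -print0. The scratch directory and file names are made up, including one hostile name of the kind shown earlier.

```shell
dir=$(mktemp -d)
touch "$dir/plain" "$dir/with space" "$dir/"'"); touch PWNED; #'

# The names are only ever treated as data, so nothing in them is executed.
files=()
while IFS= read -r -d '' f; do
    files+=("$f")
done < <(find "$dir" -type f -print0)

for f in "${files[@]}"; do
    printf 'found: %s\n' "$f"
done
```

To run a command on each file, call it directly inside the loop with "$f" as an argument instead of building a string for eval.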

Anchor(faq49)

49. How can I view periodic updates/appends to a file? (ex: growing log file)

tail -f will show you the growing log file. On some systems (e.g. OpenBSD), this will automatically track a rotated log file to the new file with the same name (which is usually what you want). To get the equivalent functionality on GNU systems, use tail --follow=name instead.

This is helpful if you need to view only the updates to the file after your last view.

# Start by setting n=1
   tail -n $n testfile; n="+$(( $(wc -l < testfile) + 1 ))"

Every invocation of this gives the update to the file from where we stopped last. If you know the line number from where you want to start, set n to that.

Anchor(faq50)

50. I'm trying to construct a command dynamically, but I can't figure out how to deal with quoted multi-word arguments.

To see what the shell is doing with quotes, use the set -x command in the terminal, or use #!/bin/bash -x in a script.

Some people attempt to do things like this:

    # Non-working example
    args="-s 'The subject' $address"
    mail $args < $body

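This fails because of word splitting, and because quotes stored inside a variable are plain characters, not syntax. When $args is expanded, it becomes four words: -s is the first word, 'The is the second, subject' is the third, and the address is the fourth.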

What's needed is a way to maintain each word as a separate item, even if that word contains multiple spaces. Quotes won't do it, but an array will.

    # Working example
    args=(-s "The subject" "$address")
    mail "${args[@]}" < "$body"

Usually, this question arises when someone is trying to use dialog to construct a menu on the fly. For an example of how to do this properly, see [#faq40 FAQ #40] above.
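
Counting the arguments that actually arrive makes the difference visible. A small sketch, assuming Bash; count_args and the address are made up for the demonstration:

```shell
count_args() { echo "$#"; }      # hypothetical helper: report how many arguments arrived

address=someone@example.com      # made-up address for the demo

args_string="-s 'The subject' $address"
as_string=$(count_args $args_string)     # unquoted on purpose, to show the splitting

args_array=(-s "The subject" "$address")
as_array=$(count_args "${args_array[@]}")

echo "string: $as_string arguments, array: $as_array arguments"
```

The string version delivers four arguments, the array version the intended three.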

Anchor(faq51)

51. I want history-search just like in tcsh. How can I bind it to the up and down keys?

Just add the following to /etc/inputrc or your ~/.inputrc

"\e[A":history-search-backward
"\e[B":history-search-forward

Anchor(faq52)

52. How do I convert a file in DOS format to UNIX format (remove CRLF line terminators)?

All these are from the sed one-liners page

sed 's/.$//' dosfile              # assumes that all lines end with CR/LF
sed 's/^M$//' dosfile             # in bash/tcsh, press Ctrl-V then Ctrl-M
sed 's/\x0D$//' dosfile           # needs a sed that understands \x escapes, e.g. GNU sed

Some distributions have a dos2unix command which can do this. In vim, you can use :set fileformat=unix and then save the file.
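If none of those is available, tr can do the job as well. This is a minimal sketch (strip_cr is a name invented here); note that it removes every carriage return in the input, not only the ones at the end of a line, which is usually but not always what you want.

```shell
# strip_cr -- hypothetical helper: delete all carriage returns from stdin.
strip_cr() { tr -d '\r'; }

# usage: strip_cr < dosfile > unixfile
```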

Anchor(faq53)

53. I have a fancy prompt with colors, and now bash doesn't seem to know how wide my terminal is. Lines wrap around incorrectly.

You must put \[ and \] around any non-printing escape sequences in your prompt. Thus:

BLUE=$(tput setaf 4)
PURPLE=$(tput setaf 5)
BLACK=$(tput setaf 0)
PS1='\[$BLUE\]\h:\[$PURPLE\]\w\[$BLACK\]\$ '

Without the \[ \], bash will assume that the bytes which make up the color escape sequences take up space on the screen, so it won't know where the cursor actually is.

Anchor(faq54)

54. How can I tell whether a variable contains a valid number?

First, you have to define what you mean by "number". The most common case seems to be that, when people ask this, they actually mean "a non-negative integer, with no leading + sign".

if [[ $foo = *[^0-9]* ]]; then
   echo "'$foo' has a non-digit somewhere in it"
else
   echo "'$foo' is strictly numeric"
fi

This can be done in Korn and legacy Bourne shells as well, using case:

case "$foo" in
    *[!0-9]*) echo "'$foo' has a non-digit somewhere in it" ;;
    *) echo "'$foo' is strictly numeric" ;;
esac
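Be aware that both tests above report an empty string as "strictly numeric", since it contains no non-digit. If that matters, the empty case can be rejected explicitly. A minimal sketch (is_uint is a name invented here, not part of any shell):

```shell
# is_uint STRING -- succeed only if STRING is a non-empty run of digits.
is_uint() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *)           return 0 ;;
    esac
}

if is_uint "$foo"; then
    echo "'$foo' is a non-negative integer"
fi
```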

If what you actually mean is "a valid floating-point number" or something else more complex, then you might prefer to use a regular expression. Bash version 3 and above has regular expression support in the [[ command:

re='^[-+]?[0-9]+(\.[0-9]+)?$'
if [[ $foo =~ $re ]]; then
    echo "'$foo' looks rather like a number"
else
    echo "'$foo' doesn't look particularly numeric to me"
fi

If you don't have bash version 3, then you would use egrep:

if echo "$foo" | egrep '^[-+]?[0-9]+(\.[0-9]+)?$' >/dev/null; then
    echo "'$foo' might be a number"
else
    echo "'$foo' might not be a number"
fi

Note that the regular expression is the same in both cases. In the bash version it is stored in a variable and expanded unquoted, because the rules for quoting and escaping a pattern written directly inside [[ changed between bash 3.1 and 3.2; the variable form behaves the same way in both.

Anchor(faq55)

55. Tell me all about 2>&1 -- what's the difference between 2>&1 >foo and >foo 2>&1, and when do I use which?

Bash processes all redirections from left to right, in order. And the order is significant. Moving them around within a command may change the results of that command.

For newbies who've somehow managed to miss the previous hundred or so examples, here's what you want:

foo >file 2>&1          # Sends both stdout and stderr to file.

Now for the rest of you, here's a simple demonstration of what's happening:

foo() {
  echo "This is stdout"
  echo "This is stderr" 1>&2
}
foo >/dev/null 2>&1             # produces no output
foo 2>&1 >/dev/null             # writes "This is stderr" on the screen

Why do the results differ? In the first case, >/dev/null is performed first, and therefore the standard output of the command is sent to /dev/null. Then, the 2>&1 is performed, which causes standard error to be sent to the same place that standard output is already going. So both of them are discarded.

In the second example, 2>&1 is performed first. This means standard error is sent to wherever standard output happens to be going -- in this case, the user's terminal. Then, standard output is sent to /dev/null and is therefore discarded. So when we run foo the second time, we see only its standard error, not its standard output.

There are times when we really do want 2>&1 to appear first -- for one example of this, see [#faq40 FAQ 40].

There are other times when we may use 2>&1 without any other redirections. Consider:

find ... 2>&1 | grep "some error"

In this example, we want to search find's standard error (as well as its standard output) for the string "some error". The 2>&1 in the piped command forces standard error to go into the pipe along with standard output. (When pipes and redirections are mixed in this way, remember: the pipe is done first, before any redirections. So find's standard output is already set to point to the pipe before we process the 2>&1 redirection.)

If we wanted to read only standard error in the pipe, and discard standard output, we could do it like this:

find ... 2>&1 >/dev/null | grep "some error"

The redirections in that example are processed thus:

  1. First, the pipe is created. find's output is sent to it.

  2. Next, 2>&1 causes find's standard error to go to the pipe as well.

  3. Finally, >/dev/null causes find's standard output to be discarded, leaving only stderr going into the pipe.
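Command substitution behaves like the pipe in these examples: standard output already points at the capture when the redirections are processed. That makes it a convenient way to try the three orderings without a terminal. The variable names below are only for illustration.

```shell
foo() {
  echo "This is stdout"
  echo "This is stderr" 1>&2
}

both=$(foo 2>&1)                 # stdout and stderr are both captured
only_err=$(foo 2>&1 >/dev/null)  # only stderr is captured
nothing=$(foo >/dev/null 2>&1)   # nothing is captured

printf 'both=[%s]\nonly_err=[%s]\nnothing=[%s]\n' "$both" "$only_err" "$nothing"
```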

A related question is [#faq47 FAQ #47], which discusses how to send stderr to a pipeline.

Anchor(faq56)

56. How can I untar or unzip multiple tarballs at once?

As the tar command was originally designed to read from and write to tape devices (tar - Tape ARchiver), you can specify only filenames to put inside an archive or to extract out of an archive (e.g. tar x myfileonthe.tape). There is an option to tell tar that the archive is not on some tape, but in a file: -f. This option takes exactly one argument: the filename of the file containing the archive. All other (following) filenames are taken to be archive members:

    tar -x -f backup.tar myfile.txt
    # OR (more common syntax IMHO)
    tar xf backup.tar myfile.txt

Now here's a common mistake -- imagine a directory containing the following archive-files you want to extract all at once:

    $ ls
    backup1.tar backup2.tar backup3.tar

Maybe you think of tar xf *.tar. Let's see:

    $ tar xf *.tar
    tar: backup2.tar: Not found in archive
    tar: backup3.tar: Not found in archive
    tar: Error exit delayed from previous errors

What happened? The shell replaced your *.tar by the matching filenames. You really wrote:

    tar xf backup1.tar backup2.tar backup3.tar

And as we saw earlier, it means: "extract the files backup2.tar and backup3.tar from the archive backup1.tar", which will of course only succeed when there are such filenames stored in the archive.

The solution is relatively easy: extract the contents of all archives one at a time. As we use a UNIX shell and we are lazy, we do that with a loop:

    for tarname in *.tar; do
      tar xf "$tarname"
    done

What happens? The for-loop will iterate through all filenames matching *.tar and call tar xf for each of them. That way you extract all archives one-by-one and you even do it automagically.
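One wrinkle: if no file matches, the loop still runs once, with the unexpanded pattern *.tar as the "filename". The sketch below skips names that don't exist and reports archives that fail to extract. extract_all is a hypothetical helper name, not a standard command.

```shell
# extract_all ARCHIVE... -- hypothetical helper: extract each tar archive in turn.
extract_all() {
    local tarname status=0
    for tarname in "$@"; do
        # An unmatched glob is passed through literally; skip it.
        [ -e "$tarname" ] || continue
        tar xf "$tarname" || { echo "extract_all: failed on $tarname" >&2; status=1; }
    done
    return "$status"
}

# usage: extract_all *.tar
```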

The second common archive type these days is ZIP. The command to extract contents from a ZIP file is unzip (who would have guessed that!). The problem here is the very same: unzip takes only one argument specifying the ZIP-file. So, you solve it the very same way:

    for zipfile in *.zip; do
      unzip "$zipfile"
    done

Not enough? Ok. There's another option with unzip: it can take shell-like patterns to specify the ZIP-file names. And to avoid interpretation of those patterns by the shell, you need to quote them. unzip itself and not the shell will interpret *.zip in this case:

    unzip "*.zip"
    # OR, to make more clear what we do:
    unzip \*.zip

(This feature of unzip derives mainly from its origins as an MS-DOS program. MS-DOS's command interpreter does not perform glob expansions, so every MS-DOS program must be able to expand wildcards into a list of filenames. This feature was left in the Unix version, and as we just demonstrated, it can occasionally be useful.)

Anchor(faq57)

57. How can I group entries in a file by common prefixes?

As in, one wants to convert:

    foo: entry1
    bar: entry2
    foo: entry3
    baz: entry4

to

    foo: entry1 entry3
    bar: entry2
    baz: entry4

There are two simple general methods for this:

  1. sort the file, and then iterate over it, collecting entries until the prefix changes, and then print the collected entries with the previous prefix
  2. iterate over the file, collect entries for each prefix in an array indexed by the prefix

A basic implementation of method 1 in bash:

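old= ; stuff=
# The trailing "xxx" is a sentinel that flushes the last group.
(sort file ; echo xxx) | while read -r prefix line ; do
        if [[ $prefix = "$old" ]] ; then
                stuff="$stuff $line"
        else
                # Prefix changed: print the finished group (if any) ...
                [[ $old ]] && echo "$old $stuff"
                # ... and start a new one with the current entry.
                old="$prefix"
                stuff="$line"
        fi
done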

And a basic implementation of method 2 in awk:

    {
        a[$1] = a[$1] " " $2
    }
    END{
        for (x in a) print x a[x]
    }

Written out as a shell command:

    awk '{a[$1] = a[$1] " " $2}END{for (x in a) print x a[x]}' file

Anchor(faq58)

58. Can bash handle binary data?

The answer is, basically, no.

While bash won't have as many problems with it as older shells, it still can't process arbitrary binary data, and more specifically, shell variables are not 100% binary clean, so you can't store binary files in them.

One instance where such would sometimes be handy is storing small temporary bitmaps while working with netpbm. Here I resorted to adding an extra pnmnoraw to the pipe, creating (larger) ASCII files that bash has no problems storing.

If you are feeling adventurous, consider this experiment:

    # bindec.bash, attempt to decode binary data to ascii decimals
    IFS=
    while read -n1 x ;do
        case "$x" in
            '') echo empty ;;
            # insert the 256 lines generated by the following oneliner here:
            # for x in $(seq 0 255) ;do echo "        $'\\$(printf %o $x)') echo $x;;" ;done
        esac
    done

and then pipe binary data into it, maybe like so:

    for x in $(seq 0 255) ;do echo -ne "\\$(printf %o $x)" ;done | bash bindec.bash | nl | less

This suggests that the 0 (NUL) character is skipped entirely (or that the input generation cannot produce it), which is enough to conveniently corrupt most binary files we try to process.

  • Yes, Bash is written in C, and uses C semantics for handling strings -- including the NUL byte as string terminator -- in its variables. You cannot store NUL in a Bash variable sanely. It simply was never intended to be used for this. - GreyCat

Note that this refers to storing them in variables... moving data between programs using pipes is always binary clean. Temporary files are also safe, as long as [#faq62 appropriate precautions] are taken when creating them.
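A much smaller experiment shows the same contrast between variables and pipes. This is a sketch, and the exact behavior is an assumption on my part: the bash versions I know of drop the NUL during command substitution, and newer ones also print a warning about it on stderr.

```shell
# A three-byte stream: 'a', NUL, 'b'.
var=$(printf 'a\0b')
echo "characters kept in the variable: ${#var}"    # fewer than 3: the NUL is gone

# The same three bytes survive a trip through a pipe.
echo "bytes through the pipe: $(( $(printf 'a\0b' | wc -c) ))"
```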

Anchor(faq59)

59. Removed.

Anchor(faq60)

60. I'm trying to write a script that will change directory (or set a variable), but after the script finishes, I'm back where I started (or my variable isn't set)!

Consider this:

   #!/bin/sh
   cd /tmp

If one executes this simple script, what happens? Bash forks, and the parent waits. The child executes the script, including the chdir(2) system call, and then exits. The parent, which was waiting for the child, harvests the child's exit status (presumably 0 for success), and then bash carries on with the next command.

Since the chdir was done by a child process, it has no effect on the parent.

Moreover, there is no conceivable way you can ever have a child process affect any part of the parent's environment, which includes its variables as well as its current working directory.

So, how does one go about it? You can still have the cd command in an external file, but you can't run it as a script. Instead, you must source it (or "dot it in", using the . command, which is a synonym for source).

   echo 'cd /tmp' > $HOME/mycd
   source $HOME/mycd
   pwd                          # Now, we're in /tmp

The same thing applies to setting variables. source the file that contains the commands; don't try to run it.

Functions are run in the same shell, so it is possible to put

   mycd() { cd /tmp; }

in .bashrc or similar, and then use mycd to change the directory.
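The difference is easy to observe. In this sketch a subshell, written with parentheses, stands in for the child process that would run a script; the variable names are only for illustration.

```shell
cd /tmp

( cd / )                 # child: the chdir happens in a separate process
after_child=$PWD         # the parent has not moved

mycd() { cd /; }         # function: runs in the current shell
mycd
after_function=$PWD      # now the parent really has moved

echo "after child: $after_child, after function: $after_function"
```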

Anchor(faq61)

61. Is there a list of which features were added to specific releases (versions) of Bash?

Here's a partial list of the changes, in a more compact format:

Feature                                 Added in version
-------------------------------------   ----------------
x+=string                               3.1-alpha1
{x..y}                                  3.0-alpha
${!array[@]}                            3.0-alpha
[[ =~                                   3.0-alpha
<<<                                     2.05b-alpha1
i++                                     2.04-devel
for ((;;))                              2.04-devel
/dev/fd/N, /dev/tcp/host/port, etc.     2.04-devel
a=(*.txt) file expansion                2.03-alpha
extglob                                 2.02-alpha1
[[                                      2.02-alpha1
builtin printf                          2.02-alpha1
$(< filename)                           2.02-alpha1
** (exponentiation)                     2.02-alpha1
\xNNN                                   2.02-alpha1
(( ))                                   2.0-beta2

Anchor(faq62)

62. How do I create a temporary file in a secure manner?

Good question. To be filled in later. (Interim hints: tempfile is not portable. mktemp exists more widely, but it may require a -c switch to create the file in advance; or it may create the file by default and barf if -c is supplied. There does not appear to be any single command that simply works everywhere, without testing various arguments.)
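Until this is filled in properly, here is a minimal sketch. It assumes a mktemp that creates the file itself and prints its name when given a template ending in XXXXXX (the GNU and BSD versions behave this way as far as I know); as noted above, that is not guaranteed everywhere. The prefix myscript is just a placeholder.

```shell
# Assumption: this system's mktemp creates the file and prints its name.
tmpfile=$(mktemp "${TMPDIR:-/tmp}/myscript.XXXXXX") || {
    echo "could not create a temporary file" >&2
    exit 1
}
# Remove the file again when the shell exits.
trap 'rm -f "$tmpfile"' EXIT

echo "some data" > "$tmpfile"
```

Letting mktemp both choose the name and create the file avoids the window in which someone else could create a file with a predictable name first.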

Anchor(faq63)

63. My ssh client hangs when I try to run a remote background job!

The following will not do what you expect:

   ssh me@remotehost 'sleep 120 &'
   # Client hangs for 120 seconds

This is a "feature" of [http://www.openssh.org/ OpenSSH]. The client will not close the connection as long as the remote end's terminal is still in use -- and in the case of sleep 120 &, stdout and stderr are still connected to the terminal.

The immediate answer to your question -- "How do I get the client to disconnect so I can get my shell back?" -- is to kill the ssh client. You can do this with the kill or pkill commands, of course; or by sending the INT signal (usually Ctrl-C) for a non-interactive ssh session (as above); or by pressing <Enter><~><.> (Enter, Tilde, Period) in the client's terminal window for an interactive remote shell.

The long-term workaround for this is to ensure that all the file descriptors are redirected to a log file (or /dev/null) on the remote side:

   ssh me@remotehost 'sleep 120 >/dev/null 2>&1 &'
   # Client should return immediately

This also applies to restarting daemons on some legacy Unix systems.

   ssh root@hp-ux-box   # Interactive shell
   ...                  # Discover that the problem is stale NFS handles
   /sbin/init.d/nfs.client stop   # autofs is managed by this script and
   /sbin/init.d/nfs.client start  # killing it on HP-UX is OK (unlike Linux)
   exit
   # Client hangs -- use Enter ~ . to kill it.

The legacy Unix /sbin/init.d/nfs.client script runs daemons in the background but leaves their stdout and stderr attached to the terminal (and they don't fully self-daemonize). The solution is either to fix the Unix vendor's broken init script, or to kill the ssh client process after this happens. The author of this article uses the latter approach.

Anchor(faq64)

64. Why is it so hard to get an answer to the question that I asked in #bash ?

  • #bash aphorism #1 "The questioner's first description of the problem/question will be misleading."
  • corollary 1.1 "The questioner's second description of the problem/question will also be misleading"
  • corollary 1.2 "The questioner is never precise." Example: they will say "print the file" when they mean print the file's name, rather than printing the file itself.
  • #bash aphorism #2, "The questioner will keep changing their original question until it drives the helpers in the channel insane."
  • #bash aphorism #3, "The data is never formatted in the way that makes it easiest to manipulate :-)"
  • #bash aphorism #4, "30 to 40 percent of the conversations in #bash will be about aphorisms #1 and #2"

Anchor(faq65)

65. Is there a "PAUSE" command in bash like there is in MSDOS batch scripts? To prompt the user to press any key to continue?

No, but you can use these:

echo press enter to continue; read

echo press any key to continue; read -n 1

read -p 'press enter to continue'
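If you use this often, the pieces can be combined into a small function. This is a bash-only sketch, and pause is simply a name chosen to match the DOS command.

```shell
# pause [PROMPT] -- hypothetical helper: wait for a single keypress.
pause() {
    # -n 1: return after one character   -s: do not echo it
    # -r: take backslashes literally     -p: prompt (shown only on a terminal)
    read -r -n 1 -s -p "${1:-Press any key to continue... }"
    echo
}
```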

Anchor(faq66)

66. I want to check if [[ $var == foo || $var == bar || $var = more ]] without repeating $var n times.

   case $var in
      foo|bar|more) ... ;;
   esac
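The case statement can also be wrapped in a function when the result is needed as a condition, for example in an if. In this sketch is_keyword is a made-up name; foo, bar and more are the values from the question.

```shell
# is_keyword STRING -- succeed if STRING is one of the expected values.
is_keyword() {
    case $1 in
        foo|bar|more) return 0 ;;
        *)            return 1 ;;
    esac
}

var=bar
if is_keyword "$var"; then
    echo "'$var' is one of the expected values"
fi
```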

Anchor(faq67)

67. How can I trim leading/trailing white space from one of my variables?

There are a few ways to do this -- none of them elegant.

First, the most portable way would be to use sed:

   x=$(echo "$x" | sed -e 's/^ *//' -e 's/ *$//')
   # Note: this only removes spaces.  For tabs too:
   x=$(echo "$x" | sed -e $'s/^[ \t]*//' -e $'s/[ \t]*$//')
   # Or possibly, with some systems:
   x=$(echo "$x" | sed -e 's/^[[:space:]]\+//' -e 's/[[:space:]]\+$//')

One can achieve the goal using builtins, although at the moment I'm not sure which shells support the following syntax:

   # Remove leading whitespace:
   while [[ $x = [$' \t\n']* ]]; do x=${x#[$' \t\n']}; done
   # And now trailing:
   while [[ $x = *[$' \t\n'] ]]; do x=${x%[$' \t\n']}; done

Of course, the preceding example is pretty slow, because it removes one character at a time, in a loop (although it's good enough in practice for most purposes). If you want something a bit fancier, there's a bash-only solution using extglob:

   shopt -s extglob
   x=${x##*([$' \t\n'])}; x=${x%%*([$' \t\n'])}
   shopt -u extglob

There are many, many other ways to do this. These are not necessarily the most efficient, but they're known to work.
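Here is one more, offered as a sketch rather than a recommendation: it needs neither a loop nor extglob, only nested parameter expansions. The function name trim is invented here, and local is a bash (and common, but not POSIX) feature.

```shell
# trim STRING -- print STRING without leading and trailing whitespace.
trim() {
    local s=$1
    # ${s%%[![:space:]]*} is the leading run of whitespace; strip it.
    s=${s#"${s%%[![:space:]]*}"}
    # ${s##*[![:space:]]} is the trailing run of whitespace; strip it.
    s=${s%"${s##*[![:space:]]}"}
    printf '%s\n' "$s"
}

# usage: x=$(trim "$x")
```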

Anchor(faq68)

68. How do I run a command, and have it abort (timeout) after N seconds?

There are two C programs that can do this: [http://pilcrow.madison.wi.us/ doalarm], and [http://www.porcupine.org/forensics/tct.html timeout]. (Compiling them is beyond the scope of this document; suffice to say, it'll be trivial on GNU/Linux systems, easy on most BSDs, and painful on anything else....)

If you don't have or don't want one of the above two programs, you can use a perl one-liner to set an ALRM and then exec the program you want to run under a time limit. In any case, you must understand what your program does with SIGALRM.

function doalarm () { perl -e 'alarm shift; exec @ARGV' "$@" ; }

doalarm ${NUMBER_OF_SECONDS_BEFORE_ALRMING} program arg arg ...

If you can't or won't install one of these programs (which really should have been included with the basic core Unix utilities 30 years ago!), then the best you can do is an ugly hack like:

   command & pid=$!; { sleep 10 && kill $pid; } &

This will, as you will soon discover, produce quite a mess regardless of whether the timeout condition kicked in or not. Cleaning it up is not something worth my time -- just use doalarm or timeout instead. Really.
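For the curious, here is a somewhat tidier sketch of the same hack. It is still a hack: run_with_timeout is a made-up name, the command runs in the background (so it is of little use for commands that need to read the terminal), a cancelled watchdog leaves its sleep running until it expires, and in principle the PID could be reused by another process before the kill fires.

```shell
# run_with_timeout SECONDS COMMAND [ARG...]
# Hypothetical helper: run COMMAND and send it SIGTERM if it is still
# running after SECONDS. Returns COMMAND's exit status.
run_with_timeout() {
    local secs=$1 pid watchdog status
    shift
    "$@" &
    pid=$!
    # The watchdog's output is redirected so that it does not hold the
    # terminal (or a pipe) open while it sleeps.
    ( sleep "$secs"; kill "$pid" ) >/dev/null 2>&1 &
    watchdog=$!
    wait "$pid"
    status=$?
    kill "$watchdog" 2>/dev/null    # command finished first: cancel the watchdog
    return "$status"
}

# usage: run_with_timeout 10 some_command arg arg ...
```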

Anchor(faq69)

69. I want to automate an ssh (or scp, or sftp) connection, but I don't know how to send the password....

STOP!

First of all, if you actually were to embed your password in a script somewhere, it would be visible to the entire world (or at least, anyone who can read files on your system). This would defeat the entire purpose of having a password on your remote account.

If you understand this and still want to continue, then the next thing you need to do is read and understand the man page for ssh-keygen(1). This will tell you how to generate a public/private key pair (in either RSA or DSA format), and how to use these keys to authenticate to the remote system without sending a password at all.

Since many of you are too lazy to read man pages, and instead prefer to ask us in #bash to read them for you, I'll even give a brief summary of the procedure here:

ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub | ssh me@remote "cat >> ~/.ssh/authorized_keys"
ssh me@remote date     # should not prompt for passWORD,
                       # but your key may have a passPHRASE

If your key has a passphrase on it, and you want to avoid typing it every time, look into ssh-agent(1). It's beyond the scope of this document, though.

If you're being prompted for a password even with the public key inserted into the remote authorized_keys file, chances are you have a permissions problem on the remote system. Check every single directory in the full path leading up to the authorized_keys file and make sure they do not have world- or group-write privileges. E.g., if your home directory is /home/fred and /home has group "staff" write privileges, sshd will refuse to honor your key.

If that's not it, then make sure you didn't spell it authorised_keys. SSH uses the US spelling, authorized_keys.

If you really want to use a password instead of public keys, first have your head examined. Then, if you still want to use a password, use expect(1). And don't ask us for help with it.

Anchor(faq70)

70. How do I convert Unix (epoch) timestamps to human-readable values?

The only sane way to handle time values within a program is to convert them into a linear scale. You can't store "January 17, 2005 at 5:37 PM" in a variable and expect to do anything with it. Therefore, any competent program is going to use time stamps with semantics such as "the number of seconds since point X". These are called epoch timestamps. If the epoch is January 1, 1970 at midnight UTC, then it's also called a "Unix timestamp", because this is how Unix stores all times (such as file modification times).

Standard Unix, unfortunately, has no tools to work with Unix timestamps. (Ironic, eh?) GNU date, and later BSD date, have a %s extension to generate output in Unix timestamp format:

    date +%s    # Prints the current time in Unix format, e.g. 1164128484

This is commonly used in scripts when one requires the interval between two events:

   start=$(date +%s)
   ...
   end=$(date +%s)
   echo "Operation took $((end - start)) seconds."

Now, to convert those Unix timestamps back into human-readable values, one needs to use an external tool. One method is to trick GNU date using:

   date -d "1970-01-01 UTC + 1164128484 seconds"
   # Prints "Tue Nov 21 12:01:24 EST 2006" in the US/Eastern time zone.

Reading the source code(!!) of GNU date's date parser reveals that it accepts Unix timestamps prefixed with '@', so:

   $ date -d "@1164128484"
   # Prints "Tue Nov 21 18:01:24 CET 2006" in the central European time zone

However, this undocumented feature only appears to work in extremely new versions of GNU date.

If you don't have GNU date available, an external language such as Perl can be used:

   perl -le "print scalar localtime 1164128484"
   # Prints "Tue Nov 21 12:01:24 2006"

I used double quotes in these examples so that the time constant could be replaced with a variable reference. See the documentation for date(1) and Perl for details on changing the output format.
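Bash itself has since gained this ability: version 4.2 added a %(fmt)T format to the printf builtin, which formats a Unix timestamp using strftime(3) conversions. A minimal sketch, assuming bash 4.2 or newer (epoch_to_utc is a name invented here):

```shell
# epoch_to_utc SECONDS -- hypothetical helper; requires bash 4.2 or newer.
epoch_to_utc() {
    # The subshell keeps the TZ change from leaking into the caller.
    ( export TZ=UTC; printf '%(%Y-%m-%d %H:%M:%S)T\n' "$1" )
}

epoch_to_utc 1164128484    # prints 2006-11-21 17:01:24
```

Drop the TZ assignment to get local time instead, as in the date examples above.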

Newer versions of Tcl (such as 8.5) have very good support of date and clock functions. See the tclsh man page for usage details. For example:

   echo 'puts [clock format [clock scan "today"]]' | tclsh
   # Prints today's date (the format can be adjusted with parameters to "clock format").
   
   echo 'puts [clock format [clock scan "fortnight"]]' | tclsh
   # Prints the date two weeks from now.
   
   echo 'puts [clock format [clock scan "5 years + 6 months ago"]]' | tclsh
   # Five and a half years ago, compensating for leap days and daylight savings time.

Anchor(faq71)

71. How do I convert an ASCII character to its decimal (or hexadecimal) value and back?

This task is quite easy with the printf builtin. You can either write two simple functions as shown below or use the plain printf constructions alone.

   # chr() - converts decimal value to its ASCII character representation
   # ord() - converts ASCII character to its decimal value
 
   chr() {
     printf \\$(printf '%03o' $1)
   }
 
   ord() {
     printf '%d' "'$1"
   }

   hex() { 
      printf '%x' "'$1"
   }

   # examples:
 
   chr $(ord A)    # -> A
   ord $(chr 65)   # -> 65

The ord function above is quite tricky. It can be re-written in several other ways (use the one that best suits your coding style or your actual needs).

  • Q: Tricky? Rather, it's using a feature that I can't find documented anywhere -- putting a single quote in front of an integer. Neat effect, but how on earth did you find out about it? Source diving? -- GreyCat

    A: It validates The Single Unix Specification: "If the leading character is a single-quote or double-quote, the value shall be the numeric value in the underlying codeset of the character following the single-quote or double-quote." (see [http://www.opengroup.org/onlinepubs/009695399/utilities/printf.html printf()] to know more) -- mjf

   ord() {
     printf '%d' \"$1\"
   }

Or:

   ord() {
     printf '%d' \'$1\'
   }

Or, rather:

   ord() {
     printf '%d' "'$1'"
   }

Etc. All of the above ord functions should work properly. Which one you choose depends heavily on the particular situation.

Anchor(faq72)

72. How can I ensure my environment is configured for cron, batch, and at jobs?

If a shell or other script calling shell commands runs fine interactively but fails due to environment configurations (say: a complex $PATH) when run noninteractively, you'll need to force your environment to be properly configured.

You can write a shell wrapper around your script which configures your environment. You may also want to have a "testenv" script (bash or other scripting language) which tests what shell and environment are present when running under different configurations.

In cron, you can invoke Bash (or the Bourne shell) with the '-c' option, source your init script, then invoke your command, e.g.:

  * * * * *  /bin/bash -c ". myconfig.bashrc; myscript"

Another approach would be to have myscript dot in the configuration file itself, if it's a rather static configuration. (Or, conditionally dot it in, if you find a certain variable to be missing from your environment... the possibilities are numerous.)

The at and batch utilities copy the current environment (except for the variables TERM, DISPLAY and _) as part of the job metadata, and should recreate it when the job is executed. If this isn't the case you'll want to test the environment and/or explicitly initialize it similarly to cron above.

Anchor(faq73)

73. How can I use parameter expansion? How can I get substrings? How can I get a file without its extension, or get just a file's extension?

Parameter expansion is a separate section of the bash manpage (man bash -P 'less -p "^   Parameter Expansion"' or [http://tiswww.tis.case.edu/~chet/bash/bashref.html#SEC30 see the reference]). It can be hard to understand parameter expansion without actually using it. (DO NOT think about parameter expansion like a regex. It is different and distinct.)

Here's an example of how to use parameter expansion with something akin to a hostname (dot-separated components):

parameter     result
-----------   ------------------------------
${NAME}       polish.ostrich.racing.champion
${NAME#*.}           ostrich.racing.champion
${NAME##*.}                         champion
${NAME%%.*}   polish                        
${NAME%.*}    polish.ostrich.racing         

And, here's an example of the parameter expansions for a typical filename.

parameter     result
-----------   --------------------------------------------------------
${FILE}       /usr/share/java-1.4.2-sun/demo/applets/Clock/Clock.class
${FILE#*/}     usr/share/java-1.4.2-sun/demo/applets/Clock/Clock.class
${FILE##*/}                                                Clock.class
${FILE%%/*}                                                           
${FILE%/*}    /usr/share/java-1.4.2-sun/demo/applets/Clock            

You cannot nest parameter expansions. If you need to perform two separate expansions, use a temporary variable to hold the result of the first expansion.

You may find it helpful to remember that, on your keyboard, the "#" is to the left of the "$" symbol and the "%" symbol is to its right; this corresponds with their acting upon the left (beginning) and right (end) parts of the parameter.

Here are a few more examples (but please see the real documentation for a list of all the features!). I include these mostly so people won't break the wiki again, trying to add new questions that answer this stuff.

${string:2:1}   # The third character of string (0, 1, 2 = third)
${string:1}     # The string starting from the second character
                # Note: this is equivalent to ${string#?}
${string%?}     # The string with its last character removed.
${string: -1}   # The last character of string
${string:(-1)}  # The last character of string, alternate syntax
                # Note: string:-1 means something entirely different.

${file%.mp3}    # The filename without the .mp3 extension
                # Very useful in loops of the form: for file in *.mp3; do ...
${file%.*}      # The filename without its extension (assuming there was
                # only one extension in the first place...).
${file%%.*}     # The filename without all of its extensions
${file##*.}     # The extension only.
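Putting several of these together, and using a temporary variable because expansions cannot be nested, a path can be split into its parts like this. The variable names dir, base, stem and ext are only for illustration; the path is the one from the table above.

```shell
FILE=/usr/share/java-1.4.2-sun/demo/applets/Clock/Clock.class

dir=${FILE%/*}       # /usr/share/java-1.4.2-sun/demo/applets/Clock
base=${FILE##*/}     # Clock.class
stem=${base%.*}      # Clock   (second expansion, applied to the temporary variable)
ext=${base##*.}      # class

echo "$dir | $base | $stem | $ext"
```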

Anchor(faq74)

74. How do I get the effects of those nifty Bash Parameter Expansions in older shells?

The extended forms of ParameterSubstitution work with ["BASH"], KornShell, KornShell93, but not with the older BourneShell. If the code needs to be portable to that shell as well, sed and expr can often be used.

For example, to remove the filename extension part:

    for file in *.doc
    do
        base=`echo "$file" | sed 's/\.[^.]*$//'`    # remove everything starting with last '.'
        mv "$file" "$base".txt
    done

Another example, this time to remove the last character of a variable:

    var=`expr "$var" : '\(.*\).'`

or (using sed):

    var=`echo "$var" | sed 's/.$//'`

Anchor(faq75)

75. How do I use 'find'? I can't understand the man page at all!

See UsingFind.

Anchor(faq76)

76. How do I get the sum of all the numbers in a column?

This and all similar questions are best answered with an ["AWK"] one-liner.

    awk '{sum += $1} END {print sum}' myfile

A small bit of effort can adapt this to most similar tasks (finding the average, skipping lines with the wrong number of fields, etc.).
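As an illustration of such an adaptation, this sketch computes the average of the first column and skips lines that do not have exactly one field. The function name column_average is invented here.

```shell
# column_average [FILE...] -- hypothetical helper: mean of the first column,
# ignoring lines that do not have exactly one field.
column_average() {
    awk 'NF == 1 { sum += $1; n++ } END { if (n > 0) print sum / n }' "$@"
}

printf '1\n2\nbad line here\n3\n' | column_average    # prints 2
```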

For more examples of using awk, see [http://www.student.northpark.edu/pemente/awk/awk1line.txt handy one-liners for awk].

Anchor(faq77)

77. How do I log history or "secure" bash against history removal?

This is a question which has no answer applicable to bash. You are here because you asked or wanted to know how to find out what a user had executed when they unset or /dev/nulled their shell history. There are several problems with this.

The first issue is:

  • kill -9 $$

This innocuous-looking command does what you would presume it to: it kills the current shell off. However, the .bash_history is ONLY written to when bash is allowed to exit cleanly. As such, sending SIGKILL to bash will prevent logging to .bash_history.

Users may also set variables that disable shell history, or simply make their .bash_history a symlink to /dev/null. All of these will defeat any attempt to spy on them through their .bash_history file.

The second issue is permissions. The bash shell is executed as a user. This means that the user can read or write all content produced by or handled by the shell. Any location you would try to log to, MUST be writeable to by the user, and not a privileged user. This is because the shell specifically tries to ensure the user does not exceed its privileges. Imagine a regular user writing a root read/write only history. This is creative license for exploiting and gaining escalated privileges on the server, and thus an extremely bad idea.

The third issue is location. Assume that you pursue a chroot jail for your bash users. This is a fantastic idea, and a good step towards securing your server. However, placing your users in a chroot jail adversely affects the ability to log the users' actions. Once jailed, your user can only write to content within its specific jail. This makes finding user-writeable extraneous logs a simple matter, and enables the attacker to find your hidden logs much more easily than would otherwise be the case.

Where does this leave you? Unfortunately, nowhere good, and definitely not what you wanted to know. If you want to record all of the commands issued to BASH by a user, your best bet is to modify BASH so that it actually records them, in real time, as they are executed -- not when the user logs off. This is still not reliable, though, because end users may simply upload their own shell and run that instead of your hacked BASH. Or they may use one of the other shells already on your system, instead of your hacked BASH. But, for those who absolutely must have some form of patch available, you can use the patch located at http://wooledge.org/~greg/bash_logging.txt (patch submitted by _sho_ -- use at your own risk. The results of a code-review with improvements are here: http://phpfi.com/220302 -- Heiner).

For a more serious approach to this problem, consider BSD process accounting (kernel-based) instead of focusing on shells.

Anchor(faq78)

78. I want to set a user's password using the Unix passwd command, but how do I script that? It doesn't read standard input!

OK, first of all, I know there are going to be some people reading this, right now, who don't even understand the question. Here, this does not work:

{ echo oldpass; echo newpass; echo newpass; } | passwd
# This DOES NOT WORK!

Nothing you can do in bash can possibly work. passwd(1) does not read from standard input. This is intentional. It is for your protection. Passwords were never intended to be put into programs, or generated by programs. They were intended to be entered only by the fingers of an actual human being, with a functional brain, and never, ever written down anywhere.

Nonetheless, we get hordes of users asking how they can circumvent 35 years of Unix security.

You have three choices. The first is to manually generate your own hashed password strings (for example, using http://wooledge.org/~greg/crypt/ or a similar tool) and then write them to your system's local password-hash file (which may be /etc/passwd, or /etc/shadow, or /etc/master.passwd, or /etc/security/passwd, or ...). This requires that you read the relevant man pages on your system, find out where the password hash goes, what formatting the file requires, and then construct code that writes it out in that format.

The second is to use [http://expect.nist.gov/ expect]. I think it even has this exact problem as one of its canonical examples.

The third is to use some system-specific tools which may or may not exist on your platform. For example, some GNU/Linux systems have a chpasswd(8) tool which can be coerced into doing these sorts of things.
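
As a rough sketch of that third option, assuming a GNU/Linux system whose chpasswd(8) reads user:password pairs from standard input and is run as root. The function name set_password and the CHPASSWD_CMD override are made up for this illustration; neither is part of any real tool.

```shell
# Sketch only: assumes chpasswd(8) on a GNU/Linux system, run as root.
# set_password and CHPASSWD_CMD are hypothetical names used for illustration.
set_password() {
    local user=$1 pass=$2
    # chpasswd expects one "user:password" pair per line on standard input.
    printf '%s:%s\n' "$user" "$pass" | "${CHPASSWD_CMD:-chpasswd}"
}

# Example call (needs root, so it is left commented out):
# set_password someuser 'new password here'
```

Keep in mind that this puts a plain-text password into a script, which is exactly what passwd(1) was designed to prevent.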

See also [#faq69 FAQ #69].

Anchor(faq79)

79. How can I grep for lines containing foo AND bar, foo OR bar?

Well, for lines containing foo AND bar, two grep statements are needed.

grep foo | grep bar

If you prefer, you can achieve this in a single sed or awk statement.

sed -n '/foo/{/bar/p;}'
awk '/foo/ && /bar/'

And for lines containing foo OR bar, grep can do it "nicely", but it can also be done with sed, awk, etc.

egrep 'foo|bar'
grep -E 'foo|bar'
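
For comparison, here is one way the OR case might look in awk and sed. This is only an illustration; the sample input and the variable names are invented for it.

```shell
# Sample input, invented for this illustration.
input='foo only
bar only
foo and bar
neither'

# awk: print lines matching either pattern.
or_awk=$(printf '%s\n' "$input" | awk '/foo/ || /bar/')

# sed: lines matching foo branch to the end of the script and are printed;
# of the remaining lines, those that do not match bar are deleted.
or_sed=$(printf '%s\n' "$input" | sed -e '/foo/b' -e '/bar/!d')

# The two-grep AND pipeline, run on the same input for comparison.
and_grep=$(printf '%s\n' "$input" | grep foo | grep bar)

printf '%s\n' "$or_awk"
```

Both OR variants print each matching line once, even when it contains foo and bar together.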

Anchor(faq80)

80. How can I make an alias that takes an argument?

You can't. Aliases in bash are extremely rudimentary, and not really suitable to any serious purpose. The bash man page even says so explicitly:

  • There is no mechanism for using arguments in the replacement text. If arguments are needed, a shell function should be used (see FUNCTIONS below).

Use a function instead. For example,

settitle() { case $TERM in *xterm*|*rxvt*) echo -en "\e]2;$1\a";; esac; }
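
To make the point concrete, here is a small sketch of a function doing what people usually want from an "alias with an argument". The name mkcd is made up for this example.

```shell
# mkcd is a hypothetical example name: create a directory and change into it.
# "$1" is the first argument, which an alias has no way to reference.
mkcd() {
    mkdir -p -- "$1" && cd -- "$1"
}

# Usage: mkcd some/new/directory
```

Once defined (for instance in ~/.bashrc), the function is invoked exactly like an alias would be.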

Anchor(faq81)

81. How can I determine whether a command exists anywhere in my PATH?

In BASH, there are a couple of builtins that are suitable for this purpose: hash and type. Here's an example using hash:

if hash qwerty 2>/dev/null; then
  echo qwerty exists
else
  echo qwerty does not exist
fi
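
The same test can be written with type, the other builtin mentioned above. A minimal sketch follows; the helper names have_command and in_path are invented here. Plain type also succeeds for aliases, functions and builtins, whereas bash's type -P forces a PATH search.

```shell
# have_command and in_path are hypothetical helper names.

# True if the shell can resolve the name at all
# (alias, function, builtin or file in PATH).
have_command() {
    type "$1" >/dev/null 2>&1
}

# Bash only: -P forces a PATH search, so only files found via PATH count.
in_path() {
    type -P "$1" >/dev/null 2>&1
}

if have_command qwerty; then
    echo "qwerty exists"
else
    echo "qwerty does not exist"
fi
```

Which of the two is appropriate depends on whether a shell function or builtin named qwerty should count as "existing".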

If these builtins are not available (because you're in a Bourne shell, or whatever), then you may have to rely on the external command which(1), which is often a csh script, although sometimes a compiled binary. Unfortunately, which does not set a useful exit code -- and it doesn't even write errors to stderr! Therefore, one must parse its output.

# Last resort -- using which(1)
x=$(LC_ALL=C which qwerty 2>&1)
case "$x" in
  no\ *\ in\ *)           echo qwerty does not exist;;
  *Command\ not\ found.)  echo qwerty does not exist;;
  '')                     echo qwerty does not exist;;
  *)                      echo qwerty exists;;
esac

(Also note that its output is not consistent across platforms. On HP-UX, for example, it prints "no qwerty in /path /path /path ..."; on OpenBSD, it prints "qwerty: Command not found."; and on GNU/Linux, it prints nothing at all.)

BashFAQ (last edited 2021-05-27 20:31:17 by GreyCat)