#pragma section-numbers 2

= BASH Frequently Asked Questions =

These are answers to frequently asked questions on channel #bash on the [http://www.freenode.net/ freenode] IRC network. These answers are contributed by the regular members of the channel (originally heiner, and then others including greycat and r00t), and by users like you. If you find something inaccurate or simply misspelled, please feel free to correct it!

All the information here is presented without any warranty or guarantee of accuracy. Use it at your own risk. When in doubt, please consult the man pages or the GNU info pages as the authoritative references.

["BASH"] is a BourneShell compatible shell, which adds many new features to its ancestor. Most of them are available in the KornShell, too. If a question is not strictly shell specific, but rather related to Unix, it may be in the UnixFaq.

If you want to help, you can add new questions with answers here, or try to answer one of the BashOpenQuestions.

[[TableOfContents]]

[[Anchor(faq1)]]
== How can I read a file line-by-line? ==
{{{
while read line
do
    echo "$line"
done < "$file"
}}}

If you want to operate on individual fields within each line, you may supply additional variables to {{{read}}}:
{{{
# Input file has 3 columns separated by white space.
while read first_name last_name phone; do
    ...
done < "$file"
}}}

If the field delimiters are not whitespace, you can set {{{IFS}}} (input field separator):
{{{
while IFS=: read user pass uid gid gecos home shell; do
    ...
done < /etc/passwd
}}}

Also, please note that you do ''not'' necessarily need to know how many fields each line of input contains. If you supply more variables than there are fields, the extra variables will be empty. If you supply fewer, the last variable gets "all the rest" of the fields after the preceding ones are satisfied. For example,
{{{
while read first_name last_name junk; do
    ...
done <<< 'Bob Smith 123 Main Street Elk Grove Iowa 123-555-6789'
# Inside the loop, first_name will contain "Bob", and
# last_name will contain "Smith". The variable "junk" holds
# everything else.
}}}

The {{{read}}} command modifies each line read, e.g. it removes all leading whitespace characters (blanks, tab characters). If that is not desired, the {{{IFS}}} variable has to be cleared:
{{{
OIFS=$IFS; IFS=
while read line
do
    echo "$line"
done < "$file"
IFS=$OIFS
}}}

As a feature, the {{{read}}} command concatenates lines that end with a backslash '\' character to one single line. To disable this feature, KornShell and ["BASH"] have {{{read -r}}}:
{{{
OIFS=$IFS; IFS=
while read -r line
do
    echo "$line"
done < "$file"
IFS=$OIFS
}}}

Note that reading a file line by line this way is ''very slow'' for large files. Consider using e.g. ["AWK"] instead if you get performance problems.

One may also read from a command instead of a regular file:
{{{
some command | while read line; do
    other commands
done
}}}

That may cause problems later on if the commands inside the body of the loop attempt to set variables which need to be used outside the loop; in that case, see [#faq24 FAQ 24], or use process substitution like:
{{{
while read line; do
    other commands
done < <(some command)
}}}

Sometimes it's useful to read a file into an array, one array element per line. You can do that with the following example:
{{{
O=$IFS IFS=$'\n' arr=($(< myfile)) IFS=$O
}}}

This temporarily changes the Input Field Separator to a newline, so that each line is treated as one field when the shell splits the result of the command substitution into words. Then it populates the array {{{arr}}} with the fields. Then it sets the {{{IFS}}} back to what it was before.

This same trick works on a stream of data as well as a file:
{{{
O=$IFS IFS=$'\n' arr=($(find . -type f)) IFS=$O
}}}

Of course, this will blow up in your face if the files contain newlines; see [#faq20 FAQ 20] for hints on dealing with such files.
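The pieces described above (an empty {{{IFS}}}, {{{read -r}}}, and one array element per line) can be combined into a single loop. The following is only a sketch for ["BASH"]; the function name {{{read_lines}}} and the array name {{{lines}}} are made up for this example. Because {{{IFS=}}} is written as a prefix of the {{{read}}} command, it applies to that command only, so there is nothing to save and restore.

```shell
# read_lines FILE: store every line of FILE, unmodified, in the array "lines".
# (read_lines and lines are hypothetical names used only in this sketch.)
read_lines() {
    local line
    lines=()
    # IFS= keeps leading and trailing whitespace; -r keeps backslashes literal.
    # The "|| [ -n "$line" ]" part also keeps a final line that lacks a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        lines[${#lines[@]}]=$line
    done < "$1"
}
```

After calling {{{read_lines "$file"}}}, the number of lines is {{{${#lines[@]}}}} and the first line is {{{${lines[0]}}}}. Unlike the {{{arr=($(< myfile))}}} approach, this keeps empty lines and does not expand glob characters found in the file.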
[[Anchor(faq2)]]
== How can I store the return value of a command in a variable? ==
Well, that depends on exactly what you mean by that question. Some people want to store the command's ''output'' (either stdout, or stdout + stderr); and others want to store the command's ''exit status'' (0 to 255, with 0 typically meaning "success").

If you want to capture the output:
{{{
var=$(command)        # stdout only; stderr remains uncaptured
var=$(command 2>&1)   # both stdout and stderr will be captured
}}}

If you want the exit status:
{{{
command
var=$?
}}}

If you want both:
{{{
var1=$(command)
var2=$?   # the assignment to var1 has no effect on command's exit status, which is still in $?
}}}

If you don't ''actually'' want the exit status, but simply want to take an action upon success or failure:
{{{
if command
then
    echo "it succeeded"
else
    echo "it failed"
fi
}}}

[[Anchor(faq3)]]
== How can I insert a blank character after each character? ==
{{{
sed 's/./& /g'
}}}

Example:
{{{
$ echo "testing" | sed 's/./& /g'
t e s t i n g
}}}

[[Anchor(faq4)]]
== How can I check whether a directory is empty or not? ==
We can test for the exit status of ls:
{{{
if ls "$directory"/file.txt; then
    echo "file.txt found!"
else
    echo "file.txt not found."
fi
}}}

The following idea checks the link count of the specified directory. Be aware that on traditional Unix file systems the link count of a directory is 2 plus the number of its subdirectories, so this only detects a directory without ''subdirectories''; regular files are not counted:
{{{
find "$dir" -maxdepth 0 -links 2 \
    -exec echo "empty directory: {}" \;
}}}

Conversely, to find a non-empty directory:
{{{
find "$dir" -maxdepth 0 -links +2 \
    -exec echo "directory is non-empty" \;
}}}

Most modern systems have an "ls -A" which explicitly omits "." and ".." from the directory listing:
{{{
if [ -n "$(ls -A somedir)" ]
then
    echo directory is non-empty
fi
}}}

This can be shortened to:
{{{
if [ "$(ls -A somedir)" ]
then
    echo directory is non-empty
fi
}}}

Another way, using Bash features, involves setting the special shell option which changes the behavior of globbing.
Some people prefer to avoid this approach, because it's so drastically different and could severely alter the behavior of scripts. Nevertheless, if you're willing to use this approach, it does greatly simplify this particular task:
{{{
shopt -s nullglob
if [[ -z $(echo *) ]]; then
    echo directory is empty
fi
}}}

It also simplifies various other operations:
{{{
shopt -s nullglob
for i in *.zip; do
    blah blah "$i"   # No need to check $i is a file.
done
}}}

Without the {{{shopt}}}, that would have to be:
{{{
for i in *.zip; do
    [[ -f $i ]] || continue   # If no .zip files, i becomes *.zip
    blah blah "$i"
done
}}}

(You may want to use the latter anyway, if there's a possibility that the glob may match directories in addition to files.)

[[Anchor(faq5)]]
== How can I convert all upper-case file names to lower case? ==
{{{
# tolower - convert file names to lower case
for file in *
do
    [ -f "$file" ] || continue                     # ignore non-existing names
    newname=$(echo "$file" | tr '[A-Z]' '[a-z]')   # lower-case version of file name
    [ "$file" = "$newname" ] && continue           # nothing to do
    [ -f "$newname" ] && continue                  # do not overwrite existing files
    mv "$file" "$newname"
done
}}}

Purists will insist on using
{{{
tr '[[:upper:]]' '[[:lower:]]'
}}}
in the above code, in case of non-ASCII (e.g. accented) letters in locales which have them.

This technique can also be used to replace all unwanted characters in a file name e.g. with '_' (underscore). The script is the same as above, only the "newname=..." line has changed.
{{{
# renamefiles - rename files whose name contain unusual characters
for file in *
do
    [ -f "$file" ] || continue            # ignore non-existing names
    newname=$(echo "$file" | sed 's/[^a-zA-Z0-9_.]/_/g')
    [ "$file" = "$newname" ] && continue  # nothing to do
    [ -f "$newname" ] && continue         # do not overwrite existing files
    mv "$file" "$newname"
done
}}}

The character class in {{{[]}}} contains all allowed characters; modify it as needed.
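In ["BASH"], the same replacement can probably be done without starting {{{sed}}} for every file, using the {{{${var//pattern/replacement}}}} form of parameter expansion. The sketch below only computes the new name; the function name {{{sanitize_name}}} is made up for this example, and the ranges {{{a-z}}} and {{{A-Z}}} may match additional letters in some locales.

```shell
# sanitize_name NAME: print NAME with every character outside the
# allowed set replaced by an underscore. Nothing is renamed here.
# (sanitize_name is a hypothetical helper used only in this sketch.)
sanitize_name() {
    local name=$1
    printf '%s\n' "${name//[^a-zA-Z0-9_.]/_}"
}
```

Inside the loop above, {{{newname=$(sanitize_name "$file")}}} would then take the place of the {{{echo ... | sed}}} pipeline.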
If you have the utility "mmv" on your machine, you could simply do
{{{
mmv "*" "#l1"
}}}

[[Anchor(faq6)]]
== How can I use a logical AND in a shell pattern (glob)? ==
That can be achieved through the !() extglob operator. You'll need {{{extglob}}} set. It can be checked with:
{{{
$ shopt extglob
}}}

and set with:
{{{
$ shopt -s extglob
}}}

To warm up, we'll move all files starting with foo AND not ending with .d to directory foo_thursday.d:
{{{
$ mv foo!(*.d) foo_thursday.d
}}}

For the general case: Delete all files containing Pink_Floyd AND not containing The_Final_Cut:
{{{
$ rm !(!(*Pink_Floyd*)|*The_Final_Cut*)
}}}

By the way: these kinds of patterns can be used with KornShell and KornShell93, too. They don't have to be enabled there, but are the default patterns.

[[Anchor(faq7)]]
== Is there a function to return the length of a string? ==
The fastest way, not requiring external programs (but usable only with ["BASH"] and KornShell):
{{{
${#varname}
}}}

or
{{{
expr "$varname" : '.*'
}}}

({{{expr}}} prints the number of characters matching the pattern {{{.*}}}, which is the length of the string)

or
{{{
expr length "$varname"
}}}

(for a BSD/GNU version of {{{expr}}}. Do not use this, because it is not ["POSIX"]).

[[Anchor(faq8)]]
== How can I recursively search all files for a string? ==
On most recent systems (GNU/Linux/BSD), you would use {{{grep -r pattern .}}} to search all files from the current directory (.) downward.

You can use {{{find}}} if your {{{grep}}} lacks -r:
{{{
find . -type f -exec grep -l "$search" '{}' \;
}}}

The {} characters will be replaced with the current file name.

This command is slower than it needs to be, because {{{find}}} will call {{{grep}}} with only one file name, resulting in many {{{grep}}} invocations (one per file). Since {{{grep}}} accepts multiple file names on the command line, {{{find}}} can be instrumented to call it with several file names at once:
{{{
find . -type f -exec grep -l "$search" '{}' \+
}}}

The trailing '+' character instructs {{{find}}} to call {{{grep}}} with as many file names as possible, saving processes and resulting in faster execution. This example works for POSIX {{{find}}}, e.g. with Solaris.

GNU find uses a helper program called {{{xargs}}} for the same purpose:
{{{
find . -type f -print0 | xargs -0 grep -l "$search"
}}}

The {{{-print0}}} / {{{-0}}} options ensure that any file name can be processed, even ones containing blanks, TAB characters, or new-lines.

90% of the time, all you need is:

Have grep recurse and print the lines (GNU grep):
{{{
grep -r "$search" .
}}}

Have grep recurse and print only the names (GNU grep):
{{{
grep -r -l "$search" .
}}}

The {{{find}}} command can be used to run arbitrary commands on every file in a directory (including sub-directories). Replace {{{grep}}} with the command of your choice. The curly braces {} will be replaced with the current file name in the case above.

(Note that they must be escaped in some shells, but not in ["BASH"].)

[[Anchor(faq9)]]
== My command line produces no output: tail -f logfile | grep 'ssh' ==
Most standard Unix commands buffer their output if used non-interactively. This means that they don't write each character (or even each line) as soon as it is ready, but collect a larger amount (e.g. 4 kilobytes) before printing it. In the case above, the {{{tail}}} command buffers its output, and therefore {{{grep}}} only gets its input in e.g. 4K blocks.

Unfortunately there's no easy solution to this, because the behaviour of the standard programs would need to be changed. (See the bottom of this section before taking "no easy solution" to heart.)

Some programs provide special command line options for this purpose, e.g.

||grep (e.g. GNU version 2.5.1)||{{{--line-buffered}}}||
||sed (e.g. GNU version 4.0.6)||{{{-u,--unbuffered}}}||
||awk (some GNU versions)||{{{-W interactive, or use the fflush() function}}}||
||tcpdump, tethereal||{{{-l}}}||

The {{{expect}}} package (http://expect.nist.gov/) has an {{{unbuffer}}} example program, which can help here. It disables buffering for the output of a program. Example usage:
{{{
unbuffer tail -f logfile | grep 'ssh'
}}}

There is another option when you have more control over the creation of the log file. If you would like to {{{grep}}} the real-time log of a text interface program which does buffered session logging by default (or you were using {{{script}}} to make a session log), then try this instead:
{{{
$ program | tee -a program.log

In another window:
$ tail -f program.log | grep whatever
}}}

Apparently this works because {{{tee}}} produces unbuffered output. This has only been tested on GNU {{{tee}}}, YMMV.

Another solution is to use the {{{less}}} command in follow mode. This is simple to do!
{{{
$ less program.log
}}}

Then enter your search pattern ({{{/}}} is search in less, like vi), e.g. {{{/ssh}}}. Next, put less into follow mode by pressing shift+f. That's all there is to it!

[[Anchor(faq10)]]
== How can I recreate a directory structure, without the files? ==
With the {{{cpio}}} program:
{{{
cd "$srcdir"
find . -type d -print | cpio -pdumv "$dstdir"
}}}

or with GNU-{{{tar}}}, and less obscure syntax:
{{{
cd "$srcdir"
find . -type d -print | tar c --files-from - --no-recursion | tar x --directory "$dstdir"
}}}

This creates a list of directory names with find, non-recursively adds just the directories to an archive, and pipes it to a second tar instance to extract it at the target location.

[[Anchor(faq11)]]
== How can I print the n'th line of a file? ==
The dirty (but not quick) way would be {{{sed -n ${n}p "$file"}}} but this reads the whole input file, even if you only wanted the third line.

The following {{{sed}}} command line reads a file printing nothing (-n).
At line $n the command "p" is run, printing it, with a "q" afterwards: quit the program.
{{{
sed -n "$n{p;q;}" "$file"
}}}

[[Anchor(faq12)]]
== A program (e.g. a file manager) lets me define an external command that an argument will be appended to - but I need that argument somewhere in the middle... ==
{{{
sh -c 'echo "$1"' -- hello
}}}

[[Anchor(faq13)]]
== How can I concatenate two variables? ==
There is no concatenation operator for strings (either literal or variable dereferences) in the shell. The strings are just written one after the other:
{{{
var=$var1$var2
}}}

If the right-hand side contains whitespace characters, it needs to be quoted:
{{{
var="$var1 - $var2"
}}}

Braces can be used to disambiguate the right-hand side:
{{{
var=${var1}xyzzy   # without braces, var1xyzzy would be interpreted as a variable name

# Another equivalent way would be:
var="$var1"xyzzy
}}}

CommandSubstitution can be used as well. The following line creates a log file name {{{logname}}} containing the current date, resulting in names like e.g. {{{log.2004-07-26}}}:
{{{
logname="log.$(date +%Y-%m-%d)"
}}}

Appending data to the end of a string doesn't require any black magic, either.
{{{
string="$string more data here"
}}}

Bash 3.1 has a new += operator that you may see from time to time:
{{{
string+=" more data here"   # EXTREMELY non-portable!
}}}

It's generally best to use the portable syntax.

[[Anchor(faq14)]]
== How can I redirect the output of multiple commands at once? ==
Redirecting the standard output of a single command is as easy as
{{{
date > file
}}}

To redirect standard error:
{{{
date 2> file
}}}

To redirect both:
{{{
date > file 2>&1
}}}

In a loop or other larger code structure:
{{{
for i in $list; do
    echo "Now processing $i"
    # more stuff here...
done > file 2>&1
}}}

However, this can become tedious if the output of many programs should be redirected. If all output of a script should go into a file (e.g.
a log file), the {{{exec}}} command can be used:
{{{
# redirect both standard output and standard error to "log.txt"
exec > log.txt 2>&1
# all output including stderr now goes into "log.txt"
}}}

Otherwise command grouping helps:
{{{
{
    date
    # some other command
    echo done
} > messages.log 2>&1
}}}

In this example, the output of all commands within the curly braces is redirected to the file {{{messages.log}}}.

[[Anchor(faq15)]]
== How can I run a command on all files with the extension .gz? ==
Often a command already accepts several files as arguments, e.g.
{{{
zcat *.gz
}}}

(On some systems, you would use {{{gzcat}}} instead of {{{zcat}}}. If neither is available, or if you don't care to play guessing games, just use {{{gzip -dc}}} instead.)

If an explicit loop is desired, or if your command does not accept multiple filename arguments in one invocation, the {{{for}}} loop can be used:
{{{
for file in *.gz
do
    echo "$file"
    # do something with "$file"
done
}}}

To do it recursively, you should use a loop, plus the find command:
{{{
while read file; do
    echo "$file"
    # do something with "$file"
done < <(find . -name '*.gz' -print)
}}}

For more hints in this direction, see [#faq20 FAQ #20], below. To see why the find command comes after the loop instead of before it, see [#faq24 FAQ #24].

[[Anchor(faq16)]]
== How can I remove a file name extension from a string, e.g. file.tar to file? ==
The easiest (and fastest) way is to use the following:
{{{
$ name="file.tar"
$ echo "${name%.tar}"
file
}}}

The {{{${var%pattern}}}} syntax removes the pattern from the end of the variable. {{{${var#pattern}}}} would remove pattern from the start of the string. This could be used to rename all files from "*.doc" to "*.txt":
{{{
for file in *.doc
do
    mv "$file" "${file%.doc}".txt
done
}}}

There's more to ParameterSubstitution, e.g. {{{${var%%pattern}, ${var##pattern}, ${var//old/new}}}}.
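As a small illustration of these forms, the following sketch takes a path apart. The path and all variable names are made up for this example: a single {{{%}}} or {{{#}}} removes the shortest match, a doubled one removes the longest match, and {{{//}}} replaces every occurrence.

```shell
# Example values only; the path and all variable names are illustrative.
path=/home/user/archive.tar.gz

file=${path##*/}     # longest match of "*/" removed from the start: archive.tar.gz
dir=${path%/*}       # shortest match of "/*" removed from the end:  /home/user
base=${file%%.*}     # longest match of ".*" removed from the end:   archive
ext=${file#*.}       # shortest match of "*." removed from the start: tar.gz
last=${file##*.}     # longest match of "*." removed from the start:  gz
under=${file//./_}   # every "." replaced by "_":                     archive_tar_gz
```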
Note that this extended form of ParameterSubstitution works with ["BASH"], KornShell, KornShell93, but not with the older BourneShell. If the code needs to be portable to that shell as well, {{{sed}}} could be used to remove the filename extension part:
{{{
for file in *.doc
do
    base=`echo "$file" | sed 's/\.[^.]*$//'`   # remove everything starting with last '.'
    mv "$file" "$base".txt
done
}}}

Finally, some GNU/Linux/BSD systems offer a {{{rename}}} command. There are multiple different {{{rename}}} commands out there with contradictory syntaxes. Consult your man pages to see which one you have (if any).

[[Anchor(faq17)]]
== How can I group expressions, e.g. (A AND B) OR C? ==
The TestCommand {{{[}}} uses parentheses () for expression grouping. Given that "AND" is "-a", and "OR" is "-o", the following expression
{{{
(0<n AND n<=10) OR n=-1
}}}

can be written as follows:
{{{
if [ \( $n -gt 0 -a $n -le 10 \) -o $n -eq -1 ]
then
    echo "0 < $n <= 10, or $n=-1"
else
    echo "invalid number: $n"
fi
}}}

Note that the parentheses have to be quoted: \(, '(' or "(".

["BASH"] and KornShell have different, more powerful comparison commands with slightly different (easier) quoting:
 * ArithmeticExpression for arithmetic expressions, and
 * NewTestCommand for string (and file) expressions.

Examples:
{{{
if (( (n>0 && n<10) || n == -1 ))
then
    echo "0 < $n < 10, or n==-1"
fi
}}}

or
{{{
if [[ ( -f $localconfig && -f $globalconfig ) || -n $noconfig ]]
then
    echo "configuration ok (or not used)"
fi
}}}

Note that the distinction between numeric and string comparisons is strict. Consider the following example:
{{{
n=3
if [[ $n > 0 && $n < 10 ]]
then
    echo "$n is between 0 and 10"
else
    echo "ERROR: invalid number: $n"
fi
}}}

The output will be "ERROR: ....", because in a ''string comparison'' "3" is bigger than "10", because "3" already comes after "1", and the next character "0" is not considered. Changing the square brackets to double parentheses {{{((}}} makes the example work as expected.
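The difference can be seen side by side in the following sketch for ["BASH"]. The variable names {{{as_string}}} and {{{as_number}}} are made up for this example; the same value of {{{n}}} is compared once as a string and once as a number.

```shell
n=3

# String comparison: "3" sorts after "10", so this test is false.
if [[ $n < 10 ]]; then as_string=yes; else as_string=no; fi

# Arithmetic comparison: 3 is less than 10, so this test is true.
if (( n < 10 )); then as_number=yes; else as_number=no; fi

echo "string comparison: $as_string, numeric comparison: $as_number"
```

With the portable TestCommand, the numeric operators ({{{-lt}}}, {{{-le}}}, {{{-gt}}}, ...) shown at the start of this answer give the numeric result as well.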
[[Anchor(faq18)]]
== How can I use numbers with leading zeros in a loop, e.g. 01, 02? ==
As always, there are different ways to solve the problem, each with its own advantages and disadvantages.

If there are not many numbers, BraceExpansion can be used:
{{{
for i in 0{1,2,3,4,5,6,7,8,9} 10
do
    echo $i
done
}}}

Output:
{{{
01
02
03
[...]
}}}

This gets tedious for large sequences, but there are other ways, too. If the command {{{seq}}} is available, you can use it as follows:
{{{
seq -w 1 10
}}}

or, for arbitrary numbers of leading zeros (here: 3):
{{{
seq -f "%03g" 1 10
}}}

If you have the {{{printf}}} command (which is a Bash builtin, and is also POSIX standard), it can be used to format a number, too:
{{{
for ((i=1; i<=10; i++))
do
    printf "%02d " "$i"
done
}}}

The KornShell and KornShell93 have the {{{typeset}}} command to specify the number of leading zeros:
{{{
$ typeset -Z3 i=4
$ echo $i
004
}}}

Finally, the following example works with any BourneShell derived shell to zero-pad each line to three bytes:
{{{
i=0
while test $i -le 10
do
    echo "00$i"
    i=`expr $i + 1`
done | sed 's/.*\(...\)$/\1/g'
}}}

In this example, the number of '.' inside the parentheses in the {{{sed}}} statement determines how many total bytes from the {{{echo}}} command (at the end of each line) will be kept and printed.

One more addendum: in Bash 3, you can use:
{{{
printf "%03d \n" {1..300}
}}}

Which is slightly easier in some cases.

Also you can use the {{{printf}}} command with xargs and wget to fetch files. Brace expansion is performed before variables are expanded, so {{{{$START..$END}}}} does not produce a sequence; use a loop when the limits are stored in variables:
{{{
for ((i=START; i<=END; i++))
do
    printf "%03d\n" "$i"
done | xargs -I% wget "$LOCATION/%"
}}}

Sometimes a good solution.

[[Anchor(faq19)]]
== How can I split a file into line ranges, e.g. lines 1-10, 11-20, 21-30? ==
Some Unix systems provide the {{{split}}} utility for this purpose:
{{{
split --lines 10 --numeric-suffixes input.txt output-
}}}

For more flexibility you can use {{{sed}}}. The {{{sed}}} command can print e.g.
the line number range 1-10:
{{{
sed -n '1,10p'
}}}

This stops {{{sed}}} from printing each line ({{{-n}}}). Instead it only processes the lines in the range 1-10 ("1,10"), and prints them ("p"). {{{sed}}} still reads the input until the end, although we are only interested in lines 1 through 10. We can speed this up by making {{{sed}}} terminate immediately after printing line 10:
{{{
sed -n -e '1,10p' -e '10q'
}}}

Now the command will quit after reading line 10 ("10q"). The {{{-e}}} arguments indicate a script (instead of a file name). The same can be written a little shorter:
{{{
sed -n '1,10p;10q'
}}}

We can now use this to print an arbitrary range of a file (specified by line number):
{{{
file=/etc/passwd
range=10
firstline=1
maxlines=$(wc -l < "$file")   # count number of lines
while ((firstline <= maxlines))
do
    ((lastline = firstline + range - 1))   # last line of the current block
    sed -n -e "$firstline,${lastline}p" -e "${lastline}q" "$file"
    ((firstline = lastline + 1))           # the next block starts right after it
done
}}}

This example uses ["BASH"] and KornShell ArithmeticExpressions, which older [wiki:Self:BourneShell Bourne shells] do not have. In that case the following example should be used instead:
{{{
file=/etc/passwd
range=10
firstline=1
maxlines=`wc -l < "$file"`   # count number of lines
while [ $firstline -le $maxlines ]
do
    lastline=`expr $firstline + $range - 1`
    sed -n -e "$firstline,${lastline}p" -e "${lastline}q" "$file"
    firstline=`expr $lastline + 1`
done
}}}

[[Anchor(faq20)]]
== How can I find and deal with file names containing newlines, spaces or both? ==
The preferred method is still to use
{{{
find ... -exec command {} \;
}}}

or, if you need to handle filenames ''en masse'':
{{{
find ... -print0 | xargs -0 command
}}}

for GNU {{{find}}}/{{{xargs}}}, or (POSIX {{{find}}}):
{{{
find ... -exec command {} +
}}}

Use that unless you really can't.

Another way to deal with files with spaces in their names is to use the shell's filename expansion (["globbing"]).
This has the disadvantage of not working recursively (except with zsh's extensions), but if you just need to process all the files in a single directory, it works fantastically well.

This example changes all the *.mp3 files in the current directory to use underscores in their names instead of spaces. (But it will not work in the original BourneShell.)
{{{
for file in *.mp3; do
    mv "$file" "${file// /_}"
done
}}}

You could do the same thing for all files (regardless of extension) by using
{{{
for file in *\ *; do
}}}

instead of *.mp3.

Another way to handle filenames recursively involves using the {{{-print0}}} option of {{{find}}} (a GNU/BSD extension), together with bash's {{{-d}}} option for read:
{{{
unset a i
while read -d $'\0' file; do
    a[i++]="$file"   # or however you want to process each file
done < <(find /tmp -type f -print0)
}}}

The preceding example reads all the files under /tmp (recursively) into an array, even if they have newlines or other whitespace in their names, by forcing {{{read}}} to use the NUL byte (\0) as its word delimiter. Since NUL is not a valid byte in Unix filenames, this is the safest approach besides using {{{find -exec}}}.

[[Anchor(faq21)]]
== How can I replace a string with another string in all files? ==
{{{sed}}} is a good command to replace strings, e.g.
{{{
sed 's/olddomain\.com/newdomain\.com/g' input > output
}}}

To replace a string in all files of the current directory:
{{{
for i in *; do
    sed 's/old/new/g' "$i" > atempfile && mv atempfile "$i"
done
}}}

GNU sed 4.x (but no other version of sed) has a special {{{-i}}} flag which makes the temp file unnecessary:
{{{
for i in *; do
    sed -i 's/old/new/g' "$i"
done
}}}

Those of you who have perl 5 can accomplish the same thing using this code:
{{{
perl -pi -e 's/old/new/g' *
}}}

Recursively:
{{{
find . -type f -print0 | xargs -0 perl -pi -e 's/old/new/g'
}}}

To replace for example all "unsigned" with "unsigned long", if it is not "unsigned int" or "unsigned long" ...:
{{{
perl -i.bak -pe 's/\bunsigned\b(?!\s+(int|short|long|char))/unsigned long/g' $(find . -type f)
}}}

Finally, here's a script that some people may find useful:
{{{
:
# chtext - change text in several files

# neither string may contain '|' unquoted
old='olddomain\.com'
new='newdomain\.com'

# if no files were specified on the command line, use all files:
[ $# -lt 1 ] && set -- *

for file
do
    [ -f "$file" ] || continue   # do not process e.g. directories
    [ -r "$file" ] || continue   # cannot read file - ignore it
    # Replace string, write output to temporary file. Terminate script in case of errors
    sed "s|$old|$new|g" "$file" > "$file"-new || exit
    # If the file has changed, overwrite original file. Otherwise remove copy
    if cmp "$file" "$file"-new >/dev/null 2>&1
    then
        rm "$file"-new            # file has not changed
    else
        mv "$file"-new "$file"    # file has changed: overwrite original file
    fi
done
}}}

If the code above is put into a script file (e.g. {{{chtext}}}), the resulting script can be used to change a text e.g. in all HTML files of the current and all subdirectories:
{{{
find . -type f -name '*.html' -exec chtext {} \;
}}}

Many optimizations are possible:
 * use another {{{sed}}} separator character than '|', e.g. ^A (ASCII 1)
 * some implementations of {{{sed}}} (e.g. GNU sed) have an "-i" option that can change a file in-place; no temporary file is necessary in that case
 * the {{{find}}} command above could use either {{{xargs}}} or the {{{-exec ... +}}} form of POSIX find, which passes many file names to a single invocation

Note: {{{set -- *}}} in the code above is safe with respect to files whose names contain spaces. The expansion of * by {{{set}}} is the same as the expansion done by {{{for}}}, and filenames will be preserved properly as individual parameters, and not broken into words on whitespace.
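The claim about {{{set -- *}}} can be checked with a small experiment. This is only a sketch: the function name {{{count_and_first}}} is made up for this example, and the function body runs in a subshell so that neither the {{{cd}}} nor the changed positional parameters leak into the caller.

```shell
# count_and_first DIR: print the number of entries in DIR and then the
# first entry, after loading them into the positional parameters.
# (count_and_first is a hypothetical helper used only in this sketch.)
count_and_first() (
    cd "$1" || exit 1
    set -- *                  # each file name becomes exactly one parameter
    printf '%s\n' "$#" "$1"
)
```

For a directory that contains only the two files {{{a file with spaces.txt}}} and {{{b.txt}}}, this should print 2 followed by the complete first name, showing that the name was not split at its spaces.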
A more sophisticated example of {{{chtext}}} is here: http://www.shelldorado.com/scripts/cmds/chtext

[[Anchor(faq22)]]
== How can I calculate with floating point numbers instead of just integers? ==
["BASH"] does not have built-in floating point arithmetic:
{{{
$ echo $((10/3))
3
}}}

For better precision, an external program must be used, e.g. {{{bc}}}, {{{awk}}} or {{{dc}}}:
{{{
$ echo "scale=3; 10/3" | bc
3.333
}}}

The "scale=3" command notifies {{{bc}}} that three digits of precision after the decimal point are required.

{{{awk}}} can be used for calculations, too:
{{{
$ awk 'BEGIN {printf "%.3f\n", 10 / 3}' /dev/null
3.333
}}}

There is a subtle but important difference between the {{{bc}}} and the {{{awk}}} solution here: {{{bc}}} reads commands and expressions ''from standard input''. {{{awk}}} on the other hand evaluates the expression as ''part of the program''. Expressions on standard input are ''not'' evaluated, i.e. {{{echo 10/3 | awk '{print $0}'}}} will print {{{10/3}}} instead of the evaluated result of the expression.

This explains why the example uses {{{/dev/null}}} as an input file for {{{awk}}}: the program evaluates the {{{BEGIN}}} action, evaluating the expression and printing the result. Afterwards the work is already done: it reads its standard input, gets an end-of-file indication, and terminates. If no file had been specified, {{{awk}}} would wait for data on standard input.

Newer versions of KornShell93 have built-in floating point arithmetic, together with mathematical functions like {{{sin()}}} or {{{cos()}}} .

[[Anchor(faq23)]]
== How do I append a string to the contents of a variable? ==
The shell doesn't have a string concatenation operator like Java ("+") or Perl (".").
The following example shows how to append the string ".2004-08-15" to the contents of the shell variable {{{filename}}}:
{{{
filename="$filename.2004-08-15"
}}}

If the variable name and the string to append could be confused, the variable name can be enclosed in braces, e.g.
{{{
filename="${filename}old"
}}}

instead of {{{filename=$filenameold}}}

[[Anchor(faq24)]]
== I set variables in a loop. Why do they suddenly disappear after the loop terminates? ==
The following command always prints "total number of lines: 0", although the variable {{{linecnt}}} has a larger value in the {{{while}}} loop:
{{{
linecnt=0
cat /etc/passwd | while read line
do
    linecnt=`expr $linecnt + 1`
done
echo "total number of lines: $linecnt"
}}}

The reason for this surprising behaviour is that a {{{while/for/until}}} loop runs in a subshell when its input or output is redirected from a pipeline. For the {{{while}}} loop above, a new subshell with its own copy of the variable {{{linecnt}}} is created (initial value, taken from the parent shell: "0"). This copy then is used for counting. When the {{{while}}} loop is finished, the subshell copy is discarded, and the original variable {{{linecnt}}} of the parent (whose value has not changed) is used in the {{{echo}}} command.

It's hard to tell when a shell will create a new process for a loop:
 * BourneShell creates it when the input or output is redirected, either by using a pipeline or by a redirection operator ('<', '>').
 * ["BASH"] creates a new process only if the loop is part of a pipeline
 * KornShell creates it only if the loop is part of a pipeline, but ''not'' if the loop is the last part of it.
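A quick way to find out what your own shell does is to run the same counting loop twice, once at the end of a pipeline and once with redirected input. The following is only a sketch; the variable names {{{piped}}} and {{{redirected}}} are made up for this example.

```shell
# 1) The loop is the last part of a pipeline.
count=0
printf 'a\nb\nc\n' | while read -r line; do
    count=$((count + 1))
done
piped=$count        # stays 0 when the loop ran in a subshell (Bash by default)

# 2) The loop reads from a here-document; no pipeline is involved.
count=0
while read -r line; do
    count=$((count + 1))
done <<EOF
a
b
c
EOF
redirected=$count   # 3 when the loop ran in the current shell

echo "pipeline: $piped, redirection: $redirected"
```

If the first number is 0, variables set inside a piped loop are lost in that shell, and one of the techniques described in this answer is needed.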
To solve this, either use a method that works without a subshell (shown below), or make sure you do all processing inside that subshell (a bit of a kludge, but easier to work with):
{{{
linecnt=0
cat /etc/passwd | (
    while read line ; do
        linecnt="$((linecnt+1))"
    done
    echo "total number of lines: $linecnt"
)
}}}

To avoid the subshell completely (not easily possible if the other part of the pipe is a command!), use redirection, which does not have this problem at least for ["BASH"] and KornShell (but still for BourneShell):
{{{
linecnt=0
while read line ; do
    linecnt="$((linecnt+1))"
done < /etc/passwd
echo "total number of lines: $linecnt"
}}}

For ["BASH"], when the first part of the pipe is a command, you can use "process substitution". The command used here is a simple "echo -e $'a\nb\nc'" as a substitute for a command with a multiline output:
{{{
while read LINE; do
    echo "-> $LINE"
done < <(echo -e $'a\nb\nc')
}}}

A portable and common work-around is to redirect the input of the {{{read}}} command using {{{exec}}}:
{{{
linecnt=0
exec < /etc/passwd   # redirect standard input from the file /etc/passwd
while read line      # "read" gets its input from the file /etc/passwd
do
    linecnt=`expr $linecnt + 1`
done
echo "total number of lines: $linecnt"
}}}

This works as expected, and prints a line count for the file /etc/passwd. But the input is redirected from that file permanently. What if we need to read the original standard input sometime later again?
In that case we have to save a copy of the original standard input file descriptor, which we later can restore:
{{{
exec 3<&0            # save original standard input file descriptor "0" as FD "3"
exec 0</etc/passwd   # redirect standard input from the file /etc/passwd

linecnt=0
while read line      # "read" gets its input from the file /etc/passwd
do
    linecnt=`expr $linecnt + 1`
done

exec 0<&3            # restore saved standard input (fd 0) from file descriptor "3"
exec 3<&-            # close the no longer needed file descriptor "3"

echo "total number of lines: $linecnt"
}}}

Subsequent {{{exec}}} commands can be combined into one line, which is interpreted left-to-right:
{{{
exec 3<&0
exec 0</etc/passwd
_...read redirected standard input..._
exec 0<&3
exec 3<&-
}}}

is equivalent to
{{{
exec 3<&0 0</etc/passwd
_...read redirected standard input..._
exec 0<&3 3<&-
}}}

[[Anchor(faq25)]]
== How can I access positional parameters after $9? ==
Use {{{${10}}}} instead of {{{$10}}}. This works for ["BASH"] and KornShell, but not for older BourneShell implementations. Another way to access arbitrary positional parameters after $9 is to use {{{for}}}, e.g. to get the last parameter:
{{{
for last
do
    : # nothing
done

echo "last argument is: $last"
}}}

To get an argument by number, we can use a counter:
{{{
n=12   # This is the number of the argument we are interested in
i=1
for arg
do
    if [ $i -eq $n ]
    then
        argn=$arg
        break
    fi
    i=`expr $i + 1`
done
echo "argument number $n is: $argn"
}}}

This has the advantage of not "consuming" the arguments. If this is no problem, the {{{shift}}} command discards the first positional arguments:
{{{
shift 11
echo "the 12th argument is: $1"
}}}

Although direct access to any positional argument is possible this way, it's hardly needed. The common way is to use the {{{getopts}}} shell builtin to process command line options (e.g. "-l", or "-o filename"), and then use either {{{for}}} or {{{while}}} to process all arguments in turn.
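The usual pattern looks roughly like the following sketch for ["BASH"]. The function name {{{parse_args}}}, the options "-l" and "-o filename", and the variable names are taken from the text above or made up for this example; they are not part of any real program.

```shell
# parse_args [-l] [-o filename] [arguments...]
# Sets "long" and "outfile" from the options and stores the remaining
# arguments in the array "rest". All names here are illustrative.
parse_args() {
    local opt OPTIND=1        # reset OPTIND so the function can be called again
    long=no
    outfile=
    while getopts 'lo:' opt; do
        case $opt in
            l) long=yes ;;
            o) outfile=$OPTARG ;;
            *) return 1 ;;    # getopts has already printed an error message
        esac
    done
    shift $((OPTIND - 1))     # discard the options that were processed
    rest=("$@")               # everything left over, one element per argument
}
```

After {{{parse_args -l -o out.txt a "b c"}}}, the remaining arguments can be processed in turn with {{{for arg in "${rest[@]}"}}}.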
An explanation of how to process command line arguments is available here: http://www.shelldorado.com/goodcoding/cmdargs.html

[[Anchor(faq26)]]
== How can I randomize (shuffle) the order of lines in a file? ==
{{{
randomize(){
    while read l ; do echo "0$RANDOM $l" ; done | sort -n | cut -d" " -f2-
}
}}}
Note: the leading 0 is to make sure it doesn't break if the shell doesn't support $RANDOM, which is supported by ["BASH"], KornShell, KornShell93 and ["POSIX"] shell, but not BourneShell.

The same idea (printing random numbers in front of a line, and sorting the lines on that column) using other programs:
{{{
awk '
    BEGIN { srand() }
    { print rand() "\t" $0 }
' |
sort -n |    # Sort numerically on first (random number) column
cut -f2-     # Remove sorting column
}}}
This is faster than the previous solution, but will not work for very old AWK implementations (try "nawk", or "gawk", if available).

A related question we frequently see is, "How can I print a random line from a file?" The problem here is that you need to know in advance how many lines the file contains. Lacking that knowledge, you have to read the entire file through once just to count them -- or, you have to suck the entire file into memory. Let's explore both of these approaches.
{{{
n=$(wc -l < "$file")        # Count number of lines.
r=$((RANDOM % n + 1))       # Random number from 1..n.
sed -n "$r{p;q;}" "$file"   # Print the r'th line.
}}}
(These examples use the answer from [#faq11 FAQ 11] to print the n'th line.)

The first one's pretty straightforward -- we use {{{wc}}} to count the lines, choose a random number, and then use {{{sed}}} to print the line. If we already happened to know how many lines were in the file, we could skip the {{{wc}}} command, and this would be a very efficient approach.

The next example sucks the entire file into memory. This approach saves time reopening the file, but obviously uses more memory.
{{{ oIFS=$IFS IFS=$'\n' lines=($(<"$file")) IFS=$oIFS n=${#lines[@]} r=$((RANDOM % n)) echo "${lines[r]}" }}} Note that we don't add 1 to the random number in this example, because the array of lines is indexed counting from 0. Also, some people want to choose a random file from a directory (for a signature on an e-mail, or to chose a random song to play, or a random image to display, etc.). A similar technique can be used: {{{ files=(*.ogg) # Or *.gif, or * n=${#files[@]} # For aesthetics xmms "${files[RANDOM % n]}" # Choose a random element }}} [[Anchor(faq27)]] == How can two processes communicate using named pipes (fifos)? == NamedPipes, also known as FIFOs ("First In First Out") are well suited for inter-process communication. The advantage over using files as a means of communication is, that processes are synchronized by pipes: a process writing to a pipe blocks if there is no reader, and a process reading from a pipe blocks if there is no writer. Here is a small example of a server process communicating with a client process. The server sends commands to the client, and the client acknowledges each command: '''Server''' {{{ #! /bin/sh # server - communication example # Create a FIFO. Some systems don't have a "mkfifo" command, but use # "mknod pipe p" instead mkfifo pipe while sleep 1 do echo "server: sending GO to client" # The following command will cause this process to block (wait) # until another process reads from the pipe echo GO > pipe # A client read the string! Now wait for its answer. The "read" # command again will block until the client wrote something read answer < pipe # The client answered! echo "server: got answer: $answer" done }}} '''Client''' {{{ #! /bin/sh # client # We cannot start working until the server has created the pipe... until [ -p pipe ] do sleep 1; # wait for server to create pipe done # Now communicate... 
while sleep 1 do echo "client: waiting for data" # Wait until the server sends us one line of data: read data < pipe # Received one line! echo "client: read <$data>, answering" # Now acknowledge that we got the data. This command # again will block until the server read it. echo ACK > pipe done }}} Write both examples to files {{{server}}} and {{{client}}} respectively, and start them concurrently to see it working: {{{ $ chmod +x server client $ server & client & server: sending GO to client client: waiting for data client: read <GO>, answering server: got answer: ACK server: sending GO to client client: waiting for data client: read <GO>, answering server: got answer: ACK server: sending GO to client client: waiting for data [...] }}} [[Anchor(faq28)]] == How do I determine the location of my script? I want to read some config files from the same place. == This is a complex question because there's no single right answer to it. Even worse: it's not possible to find the location reliably in 100% of all cases. All ways of finding a script's location depend on the name of the script, as seen in the predefined variable {{{$0}}}. But providing the script name in {{{$0}}} is only a (very common) convention, not a requirement. The suspect answer is "in some shells, $0 is always an absolute path, even if you invoke the script using a relative path, or no path at all". That's not the case in ["BASH"]. But this isn't reliable across shells; some of them return the actual command typed in by the user instead of the fully qualified path. In those cases, if all you want is the fully qualified version of $0, you can use something like this (["POSIX"], non-Bourne): {{{ [[ $0 = /* ]] && echo $0 || echo $PWD/$0 }}} Or the BourneShell version: {{{ case $0 in /*) echo $0;; *) echo `pwd`/$0;; esac }}} However, this approach has some major drawbacks. 
The most important is, that the script name (as seen in {{{$0}}}) may not be relative to the current working directory, but relative to a directory from the program search path {{{$PATH}}} (this is often seen with KornShell). Another drawback is that there is really no guarantee that your script is still in the same place it was when it first started executing. Suppose your script is loaded from a temporary file which is then unlinked immediately... your script might not even exist on disk any more! The script could also have been moved to a different location while it was executing. Or (and this is most likely by far...) there might be multiple links to the script from multiple locations, one of them being a simple symlink from a common {{{PATH}}} directory like {{{/usr/local/bin}}}, which is how it's being invoked. Your script might be in {{{/opt/foobar/bin/script}}} but the naive approach of reading {{{$0}}} won't tell you that. (For a more general discussion of the Unix file system and how symbolic links affect your ability to know where you are at any given moment, see [http://www.cs.bell-labs.com/sys/doc/lexnames.html this Plan 9 paper].) So if the name in {{{$0}}} is a relative one, i.e. does not start with '/', we can still try to search the script like the shell would have done: in all directories from {{{$PATH}}}. The following script shows how this could be done: {{{ myname=$0 if [ -s "$myname" ] && [ -x "$myname" ] then # $myname is already a valid file name mypath=$myname else case "$myname" in /*) exit 1;; # absolute path - do not search PATH *) # Search all directories from the PATH variable. Take # care to interpret leading and trailing ":" as meaning # the current directory; the same is true for "::" within # the PATH. 
for dir in `echo "$PATH" | sed 's/^:/.:/g;s/::/:.:/g;s/:$/:./;s/:/ /g'` do [ -f "$dir/$myname" ] || continue # no file [ -x "$dir/$myname" ] || continue # not executable mypath=$dir/$myname break # only return first matching file done ;; esac fi if [ -f "$mypath" ] then : # echo >&2 "DEBUG: mypath=<$mypath>" else echo >&2 "cannot find full path name: $myname" exit 1 fi echo >&2 "path of this script: $mypath" }}} Note that {{{$mypath}}} is not necessarily an absolute path name. It still can contain relative parts like {{{../bin/myscript}}}. Generally storing data files in the same directory as their scripts is a bad practice. The Unix file system layout assumes that files in one place (e.g. /bin) are executable programs, while files in another place (e.g. /etc) are data files. (Let's ignore legacy Unix systems with programs in /etc for the moment, shall we....) It really makes the most sense to keep your script's configuration in a single, static location such as {{{$SCRIPTROOT/etc/foobar.conf}}}. If you need to define multiple configuration files, then you can have a directory (say, {{{/var/lib/foobar}}} or {{{/usr/local/lib/foobar}}}), and read that directory's location from a variable in {{{/etc/foobar.conf}}}. If you don't even want that much to be hard-coded, you could pass the location of {{{foobar.conf}}} as a parameter to the script. If you need the script to assume certain default in the absence of {{{/etc/foobar.conf}}}, you can put defaults in the script itself, and/or fall back to something like {{{$HOME/.foobar.conf}}} if {{{/etc/foobar.conf}}} is missing. (This depends on what your script does. In some cases, it may make more sense to abort gracefully.) [[Anchor(faq29)]] == How can I display value of a symbolic link on standard output? == The external command {{{readlink}}} can be used to display the value of a symbolic link. 
{{{ $ readlink /bin/sh bash }}} you can also use GNU find's %l directive, which is especially useful if you need to resolve links in batches: {{{ $ find /bin/ -type l -printf '%p points to %l\n' /bin/sh points to bash /bin/bunzip2 points to bzip2 ... }}} If your system lacks {{{readlink}}}, you can use a function like this one: {{{ readlink() { local path=$1 ll if [ -L "$path" ]; then ll="$(LC_ALL=C ls -l "$path" 2> /dev/null)" && echo "${ll/* -> }" else return 1 fi } }}} [[Anchor(faq30)]] == How can I rename all my *.foo files to *.bar? == Some GNU/Linux distributions have a rename command, which you can use for this purpose; however, the syntax differs from one distribution to the next, so it's not a portable answer. You can do it in POSIX shells like this: {{{ for f in *.foo; do mv "$f" "${f%.foo}.bar"; done }}} This invokes the external command {{{mv}}} once for each file, so it may not be as efficient as some of the {{{rename}}} implementations. If you want to do it recursively, then it becomes much more challenging. This example works (in ["BASH"]) as long as no files have newlines in their names: {{{ find . -name '*.foo' -print | while IFS=$'\n' read -r f; do mv "$f" "${f%.foo}.bar" done }}} Another common form of this question is "How do I rename all my MP3 files so that they have underscores instead of spaces?" You can use this: {{{ for f in *\ *.mp3; do mv "$f" "${f// /_}"; done }}} [[Anchor(faq31)]] == What is the difference between the old and new test commands ([ and [[)? == {{{[}}} ("test" command) and {{{[[}}} ("new test" command) are both used to evaluate expressions. Some examples: {{{ if [ -z "$variable" ] then echo "variable is empty!" 
fi

if [ ! -f "$filename" ]
then
    echo "not a valid, existing file name: $filename"
fi
}}}
and
{{{
if [[ ! -e $file ]]
then
    echo "directory entry does not exist: $file"
fi

if [[ $file0 -nt $file1 ]]
then
    echo "file $file0 is newer than $file1"
fi
}}}
To cut a long story short: {{{[}}} implements the old, portable syntax of the command. Although all modern shells have built-in implementations, there usually still is an external executable of that name, e.g. {{{/bin/[}}}. {{{[[}}} is a new, improved version of it, which is a keyword, not a program. This has beneficial effects on the ease of use; see below. {{{[[}}} is understood by KornShell, ["BASH"] (e.g. 2.03), KornShell93, ["POSIX"] shell, but not by the older BourneShell.

Although {{{[}}} and {{{[[}}} have much in common, and share many expression operators like "-f", "-s", "-n", "-z", there are some notable differences. Here is a comparison list:

||'''Feature'''||'''new test''' {{{[[}}}||'''old test''' {{{[}}}||'''Example'''||
||<rowspan="4">string comparison||>||(not available)||-||
||<||(not available)||-||
||== (or =)||=||-||
||!=||!=||-||
||<rowspan="2">expression grouping||&&||-a||{{{[[ -n $var && -f $var ]] && echo "$var is a file"}}}||
||{{{||}}}||-o||-||
||Pattern matching||=||(not available)||{{{[[ $name = a* ]] || echo "name does not start with an 'a': $name"}}}||
||In-process regular expression matching||=~||(not available)||{{{[[ $(date) =~ '^Fri ...
13 ' ]] && echo "It's Friday the 13th!"}}}|| Special primitives that {{{[[}}} is defined to have, but {{{[}}} may be lacking (depending on the implementation): ||'''Description'''||'''Primitive'''||'''Example'''|| ||entry (file or directory) exists||-e||{{{[[ -e $config ]] && echo "config file exists: $config"}}}|| ||file is newer/older than other file||-nt / -ot||{{{[[ $file0 -nt $file1 ]] && echo "$file0 is newer than $file1"}}}|| ||two files are the same||-ef||{{{[[ $input -ef $output ]] && { echo "will not overwrite input file: $input"; exit 1; } }}}|| ||negation||!||-|| But there are more subtle differences. * No field splitting will be done for {{{[[}}} (and therefore many arguments need not to be quoted) {{{ file="file name" [[ -f $file ]] && echo "$file is a file"}}} will work even though $file is not quoted and contains whitespace. With {{{[}}} the variable needs to be quoted: {{{ file="file name" [ -f "$file" ] && echo "$file is a file"}}} This makes {{{[[}}} easier to use and less error prone. * No file name generation will be done for {{{[[}}}. Therefore the following line tries to match the contents of the variable $path with the pattern {{{/*}}} {{{ [[ $path = /* ]] && echo "\$path starts with a forward slash /: $path"}}} The next command most likely will result in an error, because {{{/*}}} is subject to file name generation: {{{ [ $path = /* ] && echo "this does not work"}}} {{{[[}}} is strictly used for strings and files. If you want to compare numbers, use ArithmethicExpression ((''expression'')), e.g. {{{ i=0 while ((i<10)) do echo $i ((i=$i+1)) done}}} When should the new test command {{{[[}}} be used, and when the old one {{{[}}}? If portability to the BourneShell is a concern, the old syntax should be used. If on the other hand the script requires ["BASH"] or KornShell, the new syntax could be preferable. [[Anchor(faq32)]] == How can I redirect the output of 'time' to a variable or file? 
==
The reason that 'time' needs special care for redirecting its output is one of those mysteries of the universe. The answer will probably be found around the same time we find dark matter.

 * File Redirection
 {{{
bash -c "time ls" > /path/to/foo 2>&1
( time ls ) > /path/to/foo 2>&1
{ time ls; } > /path/to/foo 2>&1
}}}
 * Variable Redirection
 {{{
foo=$( bash -c "time ls" 2>&1 )
foo=$( ( time ls ) 2>&1 )
foo=$( { time ls; } 2>&1 )
}}}
Note: Using 'bash -c' and ( ) creates a subshell, using { } does not. Do with that as you wish.

[[Anchor(faq33)]]
== How can I find a process ID for a process given its name? ==
Usually a process is referred to using its process ID (PID), and the {{{ps}}} command can display the information for any process given its process ID, e.g.
{{{
$ echo $$         # my process id
21796
$ ps -p 21796
  PID TTY          TIME CMD
21796 pts/5    00:00:00 ksh
}}}
Frequently, however, only the name of a process is known, not its process ID. Some operating systems, e.g. Solaris, BSD, and some versions of Linux, have a dedicated command to search for a process by name, called {{{pgrep}}}:
{{{
$ pgrep init
1
}}}
Often there is an even more specialized program available to not just find the process ID of a process given its name, but also to send a signal to it:
{{{
$ pkill myprocess
}}}
Some systems also provide {{{pidof}}}. It differs from {{{pgrep}}} in that multiple output process IDs are only space separated, not newline separated.
{{{
$ pidof cron
5392
}}}
If these programs are not available, a user can search the output of the ps(1) command using {{{grep}}}.

The major problem when grepping the ps output is that grep ''may'' match its own ps entry (try: ps aux | grep init). To make matters worse, this does not happen every time; the technical name for this is a "race condition". There are several ways to avoid this:

 * Using grep -v at the end
 {{{
ps aux | grep name | grep -v grep
}}}
 will throw away all lines containing "grep" from the output.
 Disadvantage: You always get the exit status of the grep -v, so you can't e.g. check if a specific process exists.
 * Using grep -v in the middle
 {{{
ps aux | grep -v grep | grep name
}}}
 This does exactly the same, except that the exit status of "grep name" is accessible and tells you whether "name is a process in ps" or "name is not a process in ps". It still has the disadvantage of starting an additional process (grep -v).
 * Using [] in grep
 {{{
ps aux | grep '[n]ame'
}}}
 This spawns only the needed grep process. The trick is to use the {{{[]}}} character class (regular expressions). Putting only one character in a character class normally makes no sense at all, because a {{{[c]}}} will always match just a "c". In this case, it's the same: {{{grep '[n]ame'}}} searches for "name". But as grep's own process list entry is what you executed ("grep [n]ame") and not "grep name", it will not match itself. The quotes keep the shell from treating {{{[n]ame}}} as a glob pattern and replacing it with a matching file name.

===BEGIN greycat rant===

Most of the time when someone asks a question like this, it's because they want to manage a long-running daemon using primitive shell scripting techniques. Common variants are "How can I get the PID of my foobard process.... so I can start one if it's not already running" or "How can I get the PID of my foobard process... because I want to prevent the foobard script from running if foobard is already active." Both of these questions will lead to seriously flawed production systems.

If what you really want is to restart your daemon whenever it dies, just do this:
{{{
#!/bin/sh
while true; do
    mydaemon --in-the-foreground
done
}}}
where --in-the-foreground is whatever switch, if any, you must give to the daemon to PREVENT IT from automatically backgrounding itself. (Often, -d does this and has the additional benefit of running the daemon with increased verbosity.) Self-daemonizing programs may or may not be the target of a future greycat rant....
If that's too simplistic, look into [http://cr.yp.to/daemontools.html daemontools] or [http://smarden.org/runit/ runit], which are programs for managing services.

If what you really want is to prevent multiple instances of your program from running, then the only sure way to do that is by using a lock. For details on doing this, see ProcessManagement or [#faq45 FAQ 45].

===END greycat rant===

[[Anchor(faq34)]]
== Can I do a spinner in Bash? ==
Sure.
{{{
i=1
sp="/-\|"
echo -n ' '
while true
do
    echo -en "\b${sp:i++%${#sp}:1}"
done
}}}
You can also use \r instead of \b. You can use pretty much any character sequence you want as well. If you want it to slow down, put a {{{sleep}}} command inside the loop. A similar technique can be used to build progress bars.

[[Anchor(faq35)]]
== How can I handle command-line arguments to my script easily? ==
Well, that depends a great deal on what you want to do with them. Here's a general template that might help for the simple cases:
{{{
while [[ $1 == -* ]]; do
    case "$1" in
      -h|--help) show_help; exit 0;;
      -v) verbose=1; shift;;
      -f) output_file=$2; shift 2;;
      *) echo "unknown option: $1" >&2; exit 1;;   # without this, an unknown option loops forever
    esac
done
# Now all of the remaining arguments are the filenames which followed
# the optional switches. You can process those with "for i" or "$@".
}}}
For more complex/generalized cases, or if you want things like "-xvf" to be handled as three separate flags, you can use getopts or getopt. (Heiner, that's your cue....)

[[Anchor(faq36)]]
== How can I get all lines that are: in both of two files (set intersection) or in only one of two files (set subtraction). ==
Use the comm(1) command.
{{{
# intersection of file1 and file2
comm -12 <(sort file1) <(sort file2)
# subtraction of file1 from file2
comm -13 <(sort file1) <(sort file2)
}}}
Read the comm(1) manpage for details.
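To see what the two {{{comm}}} invocations above produce, here is a small sketch with made-up sample data. The scratch directory and the contents of the two files are assumptions chosen purely for illustration.

```shell
#!/bin/bash
# Sketch only: file1 and file2 are sample files created in a scratch
# directory for this illustration.
dir=$(mktemp -d) || exit 1
printf '%s\n' cherry apple banana > "$dir/file1"
printf '%s\n' banana date cherry  > "$dir/file2"

# comm requires sorted input, hence the process substitutions.
both=$(comm -12 <(sort "$dir/file1") <(sort "$dir/file2"))
only2=$(comm -13 <(sort "$dir/file1") <(sort "$dir/file2"))

echo "in both files:"; echo "$both"
echo "only in file2:"; echo "$only2"
rm -rf "$dir"
```

With this data, "banana" and "cherry" are reported as common to both files, and "date" as the only line left after subtracting file1 from file2.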
If for some reason you lack the core comm(1) program, you can use these other methods:

An amazingly simple and fast implementation, which took just 20 seconds to match a 30k line file against a 400k line file for me. Note that it probably only works with GNU grep, and that the file specified with -f will be loaded into RAM, so it doesn't scale to very large files. It has grep read one of the sets as a pattern list from a file (-f), interpret the patterns as plain strings rather than regexps (-F), and match only whole lines (-x).
{{{
# intersection of file1 and file2
grep -xF -f file1 file2
# subtraction of file1 from file2
grep -vxF -f file1 file2
}}}
An implementation using sort and uniq:
{{{
# intersection of file1 and file2
# (assuming neither file1 nor file2 contains repeated lines)
sort file1 file2 | uniq -d
# file1-file2 (Subtraction)
sort file1 file2 file2 | uniq -u
# same way for file2 - file1, change last file2 to file1
sort file1 file2 file1 | uniq -u
}}}
Another implementation of subtraction:
{{{
cat file1 file1 file2 | sort | uniq -c |
awk '{ if ($1 == 2) { $1 = ""; print; } }'
}}}
This may introduce an extra space at the start of the line; if that's a problem, just strip it away.

Also, this approach assumes that neither file1 nor file2 has any duplicates in it.

Finally, it sorts the output for you. If that's a problem, then you'll have to abandon this approach altogether. Perhaps you could use awk's associative arrays (or perl's hashes or tcl's arrays) instead.

[[Anchor(faq37)]]
== How can I print text in various colors? ==
''Do not'' hard-code ANSI color escape sequences in your program! The {{{tput}}} command lets you interact with the terminal database in a sane way.
{{{
tput setaf 1; echo this is red
tput setaf 2; echo this is green
tput setaf 0; echo now we are back in black
}}}
{{{tput}}} reads the terminfo database which contains all the escape codes necessary for interacting with your terminal, as defined by the {{{$TERM}}} variable.
For more details, see the {{{terminfo(5)}}} man page.

If you don't know in advance what your user's terminal's default text color is, you can use {{{tput sgr0}}} to reset the colors to their default settings. This also removes boldface ({{{tput bold}}}), etc.

[[Anchor(faq38)]]
== How do Unix file permissions work? ==
See ["Permissions"].

[[Anchor(faq39)]]
== What are all the dot-files that bash reads? ==
See DotFiles.

[[Anchor(faq40)]]
== How do I use dialog to get input from the user? ==
{{{
foo=$(dialog --inputbox "text goes here" 8 40 2>&1 >/dev/tty)
echo "The user typed '$foo'"
}}}
The redirection here is a bit tricky.

 1. The {{{foo=$(command)}}} is set up first, so the standard output of the command is being captured by bash.
 1. Inside the command, the {{{2>&1}}} causes standard error to be sent to where standard out is going -- in other words, stderr will now be captured.
 1. {{{>/dev/tty}}} sends standard output to the terminal, so the dialog box will be seen by the user. Standard error will still be captured, however.

Another common {{{dialog(1)}}}-related question is how to dynamically generate a dialog command that has items which must be quoted (either because they're empty strings, or because they contain internal white space). One ''can'' use {{{eval}}} for that purpose, but the cleanest way to achieve this goal is to use an array.
{{{
unset m; i=0
words=(apple banana cherry "dog droppings")
for w in "${words[@]}"; do
    m[i++]=$w; m[i++]=""
done
dialog --menu "Which one?" 12 70 9 "${m[@]}"
}}}
In the previous example, the {{{for}}} loop that populates the '''m''' array could just as well have been a {{{while read}}} loop reading from a pipeline, a file, etc.

Recall that the construction {{{"${m[@]}"}}} expands to the entire contents of an array, but with each element implicitly quoted. It's analogous to the {{{"$@"}}} construct for handling positional parameters. For more details, see [#faq50 FAQ50] below.
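The difference that the quotes around {{{"${m[@]}"}}} make can be checked with a small sketch. The helper {{{count_args}}} is hypothetical and exists only to report how many arguments it receives; the array contents mirror the menu items used above.

```shell
#!/bin/bash
# Sketch only: count_args is a hypothetical helper that reports how many
# arguments it received.
count_args() { echo "$#"; }

m=(apple "" "dog droppings" "")

quoted=$(count_args "${m[@]}")    # one argument per element, empty strings kept
unquoted=$(count_args ${m[@]})    # word splitting: empty strings vanish, "dog droppings" splits

echo "quoted:   $quoted arguments"
echo "unquoted: $unquoted arguments"
```

The quoted form passes four arguments, one per array element. The unquoted form passes only three (apple, dog, droppings), which would garble the tag/item pairs that {{{dialog --menu}}} expects.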
Here's another example, using filenames:
{{{
files=(*.mp3)       # These may contain spaces, apostrophes, etc.
cmd=(dialog --menu "Select one:" 22 76 16); n=6
i=0
for f in "${files[@]}"; do
    cmd[n++]=$((i++)); cmd[n++]="$f"
done
choice=$("${cmd[@]}" 2>&1 >/dev/tty)
}}}
The user's choice will be stored in the {{{choice}}} variable, as an integer, which can in turn be used as an index into the {{{files}}} array.

A separate but useful function of dialog is to track the progress of a process that produces output. Below is an example that uses dialog to track processes writing to a log file. In the dialog window, there is a tailbox where output is shown, and a msgbox with a clickable Quit. Clicking Quit will cause the trap to execute, removing the tempfile and destroying the tail process.
{{{
# you cannot tail a nonexistent file, so always ensure it pre-exists!
rm -f dialog-tail.log; echo Initialize log >> dialog-tail.log
date >> dialog-tail.log
tempfile=`tempfile 2>/dev/null` || tempfile=/tmp/test$$
trap "rm -f $tempfile" 0 1 2 5 15
dialog --title "TAIL BOXES" \
       --begin 10 10 --tailboxbg dialog-tail.log 8 58 \
       --and-widget \
       --begin 3 10 --msgbox "Press OK " 5 30 \
       2>$tempfile &
mypid=$!;
for i in 1 2 3; do echo $i >> dialog-tail.log; sleep 1; done
echo Done. >> dialog-tail.log
wait $mypid;
}}}

[[Anchor(faq41)]]
== How do I determine whether a variable contains a substring? ==
{{{
if [[ $foo = *bar* ]]
}}}
The above works in virtually all versions of Bash. Bash version 3 also allows regular expressions:
{{{
if [[ $foo =~ ab*c ]]   # bash 3, matches abbbbcde, or ac, etc.
}}}
If you are programming in the BourneShell instead of Bash, there is a more portable (but less pretty) syntax:
{{{
case "$foo" in
    *bar*) .... ;;
esac
}}}
This should allow you to match variables against globbing-style patterns. If you need a portable way to match variables against regular expressions, use {{{grep}}} or {{{egrep}}}.
{{{
if echo "$foo" | egrep some-regex >/dev/null; then ...
}}}

[[Anchor(faq42)]]
== How can I find out if a process is still running? ==
The {{{kill}}} command is used to send signals to a running process. As a convenience function, the signal "0", which does not exist, can be used to find out if a process is still running:
{{{
myprog &          # Start program in the background
daemonpid=$!      # ...and save its process id

while sleep 60
do
    if kill -0 $daemonpid       # Is the process still alive?
    then
        echo >&2 "OK - process is still running"
    else
        echo >&2 "ERROR - process $daemonpid is no longer running!"
        break
    fi
done}}}
This is one of those questions that usually masks a much deeper issue. It's rare that someone wants to know whether a process is still running simply to display a red or green light to an operator.

More often, there's some ulterior motive, such as the desire to ensure that some daemon which is known to crash frequently is still running, or to ensure mutually exclusive access to a resource, etc. For a much better discussion of these issues, see ProcessManagement or [#faq33 FAQ #33].

[[Anchor(faq43)]]
== How can I use array variables? ==
BASH and KornShell already have one-dimensional arrays indexed by a numerical expression, e.g.
{{{
host[0]="micky"
host[1]="minnie"
host[2]="goofy"
i=0
while (($i < ${#host[@]} ))
do
    echo "host number $i is ${host[i++]}"
done}}}
The awkward expression {{{ ${#host[@]} }}} returns the number of elements for the array {{{host}}}.

It's possible to assign multiple values to an array at once, but the syntax differs from BASH to KornShell:
{{{
# BASH
array=(one two three four)
# KornShell
set -A array -- one two three four}}}

[[Anchor(faq44)]]
== How can I use associative arrays or variable variables? ==
Sometimes it's convenient to have associative arrays, arrays indexed by a string. Perl calls them "hashes".
KornShell93 already supports this kind of array:
{{{
# KornShell93 script - does not work with BASH
typeset -A homedir             # Declare KornShell93 associative array
homedir[jim]=/home/jim
homedir[silvia]=/home/silvia
homedir[alex]=/home/alex

for user in ${!homedir[@]}     # Enumerate all indices (user names)
do
    echo "Home directory of user $user is ${homedir[$user]}"
done}}}
BASH (including version 3.x) does not (yet) support them. However, we could simulate this kind of array by dynamically creating variables like in the following example:
{{{
for user in jim silvia alex
do
    eval homedir_$user=/home/$user
done}}}
This creates the variables
{{{
homedir_jim=/home/jim
homedir_silvia=/home/silvia
homedir_alex=/home/alex}}}
with the corresponding content. Note the use of the {{{eval}}} command, which interprets a command line not just one time like the shell usually does, but '''twice'''. In the first step, the shell uses the input {{{homedir_$user=/home/$user}}} to create a new line {{{homedir_jim=/home/jim}}}. In the second step, caused by {{{eval}}}, this variable assignment is executed, actually creating the variable.

Print the variables using
{{{
for user in jim silvia alex
do
    varname=homedir_$user          # e.g. "homedir_jim"
    eval varcontent='$'$varname    # e.g. "/home/jim"
    echo "home directory of $user is $varcontent"
done}}}
The {{{eval}}} line needs some explanation. In a first step the shell expands {{{$varname}}} (the quoted {{{'$'}}} is left as a literal dollar sign), so that
{{{
eval varcontent='$'$varname}}}
becomes
{{{
eval varcontent=$homedir_jim}}}
In a second step the {{{eval}}} re-evaluates the line, and converts this to
{{{
varcontent=/home/jim}}}
Before starting to use dynamically created variables, think again of a simpler approach. If it still seems to be the best thing to do, have a look at the following disadvantages:

 1. it's hard to read and to maintain
 1. the variable names must match the regular expression ^[a-zA-Z_][a-zA-Z_0-9]* , i.e.
 a variable name cannot contain arbitrary characters but only letters, digits, and underscores. In the example above we e.g. could not have processed the home directory of a user named {{{hong-hu}}}, because a dash '-' is not a valid part of a variable name.
 1. Quoting is hard to get right. If a content (not variable name) string can contain whitespace characters, it's hard to quote it right to preserve it.

Here is the summary. "{{{var}}}" is a constant prefix, "{{{$index}}}" contains index string, "{{{$content}}}" is the string to store. Note that quoting is absolutely essential here. A missing backslash \ or a wrong type of quote (e.g. apostrophes '...' instead of quotation marks "...") can (and probably will) cause the examples to fail:

 * Set variables
 {{{
eval "var$index=\$content"       # index must only contain characters from [a-zA-Z0-9_]
                                 # \$content is expanded by eval, so its value is never re-parsed}}}
 * Print variable content
 {{{
eval "echo \"var$index=\$var$index\""}}}
 * Check if a variable is empty
 {{{
if eval "[ -z \"\$var$index\" ]"
then
    echo "variable is empty: var$index"
fi}}}

You've seen the examples. Now maybe you can go a step back and consider using AWK associative arrays, or a multi-line environment variable instead of dynamically created variables.

[[Anchor(faq45)]]
== How can I ensure that only one instance of a script is running at a time (mutual exclusion)? ==
We need some means of '''mutual exclusion'''. One easy way is to use a "lock": any number of processes can try to acquire the lock simultaneously, but only one of them will succeed.

How can we implement this using shell scripts? Some people suggest creating a lock file, and checking for its presence:
{{{
# locking example -- WRONG
lockfile=/tmp/myscript.lock
if [ -f "$lockfile" ]
then
    # lock is already held
    echo >&2 "cannot acquire lock, giving up: $lockfile"
    exit 0
else
    # nobody owns the lock
    > "$lockfile"    # create the file
    #...continue script
fi}}}
This example '''does not work''', because there is a time window between checking and creating the file.
Assume two processes are running the code at the same time. Both check if the lockfile exists, and both get the result that it does not exist. Now both processes assume they have acquired the lock -- a disaster waiting to happen. We need an atomic check-and-create operation, and fortunately there is one: {{{mkdir}}}, the command to create a directory: {{{ # locking example -- CORRECT lockdir=/tmp/myscript.lock if mkdir "$lockdir" then # directory did not exist, but was created successfully echo >&2 "successfully acquired lock: $lockdir" # continue script else echo >&2 "cannot acquire lock, giving up on $lockdir" exit 0 fi}}} The advantage over using a lock file is, that even when two processes call {{{mkdir}}} at the same time, only one process can succeed at most. This atomicity of check-and-create is ensured at the operating system kernel level. Note that we cannot use "mkdir -p" to automatically create missing path components: "mkdir -p" does not return an error if the directory exists already, but that's the feature we rely upon to ensure mutual exclusion. Now let's spice up this example by automatically removing the lock when the script finishes: {{{ lockdir=/tmp/myscript.lock if mkdir "$lockdir" then echo >&2 "successfully acquired lock" # Remove lockdir when the script finishes, or when it receives a signal trap 'rm -rf "$lockdir"' 0 # remove directory when script finishes trap "exit 2" 1 2 3 15 # terminate script when receiving signal # Optionally create temporary files in this directory, because # they will be removed automatically: tmpfile=$lockdir/filelist else echo >&2 "cannot acquire lock, giving up on $lockdir" exit 0 fi}}} This example provides reliable mutual exclusion. There is still the disadvantage that a ''stale'' lock file could remain when the script is terminated with a signal not caught (or signal 9, SIGKILL), but it's a good step towards reliable mutual exclusion. 
An example that remedies this (contributed by Charles Duffy) follows: ''Are we sure this code's correct? There seems to be a discrepancy between the names LOCK_DEFAULT_NAME and DEFAULT_NAME; and it checks for processes in what looks to be a race condition; and it uses the Linux-specific /proc file system and the GNU-specific egrep -o to do so.... I don't trust it. It looks overly complex and fragile. And quite non-portable. -- GreyCat'' {{{ LOCK_DEFAULT_NAME=$0 LOCK_HOSTNAME="$(hostname -f)" ## function to take the lock if free; will fail otherwise function grab-lock { local PROGRAMNAME="${1:-$DEFAULT_NAME}" local PID=${2:-$$} ( umask 000; mkdir -p "/tmp/${PROGRAMNAME}-lock" mkdir "/tmp/${PROGRAMNAME}-lock/held" || return 1 mkdir "/tmp/${PROGRAMNAME}-lock/held/${LOCK_HOSTNAME}--pid-${PID}" && return 0 || return 1 ) 2>/dev/null return $? } ## function to nicely let go of the lock function release-lock { local PROGRAMNAME="${1:-$DEFAULT_NAME}" local PID=${2:-$$} ( rmdir "/tmp/${PROGRAMNAME}-lock/held/${LOCK_HOSTNAME}--pid-${PID}" || true rmdir "/tmp/${PROGRAMNAME}-lock/held" && return 0 || return 1 ) 2>/dev/null return $? } ## function to force anyone else off of the lock function break-lock { local PROGRAMNAME="${1:-$DEFAULT_NAME}" ( [ -d "/tmp/${PROGRAMNAME}-lock/held" ] || return 0 for DIR in "/tmp/${PROGRAMNAME}-lock/held/${LOCK_HOSTNAME}--pid-"* ; do OTHERPID="$(echo $DIR | egrep -o '[0-9]+$')" [ -d /proc/${OTHERPID} ] || rmdir $DIR done rmdir /tmp/${PROGRAMNAME}-lock/held && return 0 || return 1 ) 2>/dev/null return $? } ## function to take the lock nicely, freeing it first if needed function get-lock { break-lock "$@" && grab-lock "$@" } }}} Instead of using {{{mkdir}}} we could also have used the program to create a symbolic link, {{{ln -s}}}. For more discussion on these issues, see ProcessManagement. [[Anchor(faq46)]] == I want to check to see whether a word is in a list (or an element is a member of a set). 
== Let's suppose you have your "list" stored as a big string of words, with spaces in between them. (That's the most common case when people are asking this one.) What you actually want to do is determine whether the string " foo " (note the spaces around it) appears in the list. But since your list may not have leading/trailing spaces, you have to add them as well. So, here's the most portable way to do it: {{{ if echo " $list " | grep " foo " >/dev/null; then ....}}} GNU grep seems to have a special {{{-w}}} extension which lets you avoid the spaces: {{{ if echo "$list" | GNUgrep -q -w "foo"; then ....}}} Finally, if you want to use Bash builtins, you can do it thus: {{{ if [[ " $list " = *\ foo\ * ]]; then ....}}} This is basically the same as the original grep -- we surround both the list and the word (foo) with spaces, and then do a simple text matching. [[Anchor(faq47)]] == How can I redirect stderr to a pipe? == A pipe can only carry stdout of a program. To pipe stderr through it, you need to redirect stderr to the same destination as stdout. Optionally you can close stdout or redirect it to /dev/null to only get stderr. Some sample code: {{{ # - 'myprog' is an example for a program that outputs both, stdout and # stderr # - after the pipe I will just use a 'cat', of course you can put there # what you want # version 1: redirect stderr towards the pipe while stdout survives (both come # mixed) myprog 2>&1 | cat # version 2: redirect stderr towards the pipe without getting stdout (it's # redirected to /dev/null) myprog 2>&1 >/dev/null | cat #Note that '>/dev/null' comes after '2>&1', otherwise the stderr will also be directed to /dev/null # version 3: redirect stderr towards the pipe while the "original" stdout gets # closed myprog 2>&1 >&- | cat }}} One may also pipe stderr only but keep stdout intact (without ''a priori'' knowledge of where the script's output is going). This is a bit trickier. This has an obvious application with eg. 
dialog, which draws (using ncurses) windows onto the screen to stdout, and returns output to stderr. This may be a little inconvenient, because it may lead to a necessary temporary file which we may like to evade. (Although this is not necessary -- see [#faq40 FAQ #40] for more examples of using dialog specifically!) On [http://www.tldp.org/LDP/abs/html/io-redirection.html TLDP], I've found following trick: {{{ # Redirecting only stderr to a pipe. exec 3>&1 # Save current "value" of stdout. ls -l /dev/fd/ 2>&1 >&3 3>&- | grep bad 3>&- # Close fd 3 for 'grep' and 'ls'. # ^^^^ ^^^^ exec 3>&- # Now close it for the remainder of the script. # Thanks, S.C. }}} The output of the ls command shows where each file descriptor points to. The same can be done without exec: {{{ { ls -l /dev/fd/ 2>&1 1>&3 3>&- | grep bad 3>&-; } 3>&1 }}} To show it as a dialog one-liner: {{{ exec 3>&1 dialog --menu Title 0 0 0 FirstItem FirstDescription 2>&1 >&3 3>&- | sed 's/First/Only/' exec 3>&- }}} This will have the dialog window working properly, yet it will be the output of dialog (returned to stderr) being altered by the sed. Cheers. A similar effect can be achieved with process substitution: {{{ perl -e 'print "stdout\n"; warn "stderr\n"' 2> >(tr a-z A-Z) }}} This will pipe standard error through the tr command. [[Anchor(faq48)]] == Why should I never use eval? == "eval" is a common misspelling of "evil". The section dealing with spaces in file names used to include the following quote "helpful tool (which is probably not as safe as the \0 technique)", end quote. 
{{{ Syntax : nasty_find_all [path] [command] <maxdepth> }}} {{{ #This code is evil and must never be used export IFS=" " [ -z "$3" ] && set -- "$1" "$2" 1 FILES=`find "$1" -maxdepth "$3" -type f -printf "\"%p\" "` #warning, evilness eval FILES=($FILES) for ((I=0; I < ${#FILES[@]}; I++)) do eval "$2 \"${FILES[I]}\"" done unset IFS }}} This script is supposed to recursively search for files with newlines and/or spaces in them, arguing that {{{find -print0 | xargs -0}}} was unsuitable for some purposes such as multiple commands. It was followed by an instructional description on all the lines involved, which we'll skip. To its defense, it works: {{{ $ ls -lR .: total 8 drwxr-xr-x 2 vidar users 4096 Nov 12 21:51 dir with spaces -rwxr-xr-x 1 vidar users 248 Nov 12 21:50 nasty_find_all ./dir with spaces: total 0 -rw-r--r-- 1 vidar users 0 Nov 12 21:51 file?with newlines $ ./nasty_find_all . echo 3 ./nasty_find_all ./dir with spaces/file with newlines $ }}} But consider this: {{{ $ touch "\"); ls -l $'\x2F'; #" }}} You just created a file called {{{ "); ls -l $'\x2F'; #}}} Now FILES will contain {{{ ""); ls -l $'\x2F'; #}}}. When we do {{{eval FILES=($FILES)}}}, it becomes {{{ FILES=(""); ls -l $'\x2F'; #" }}} Which becomes the two statements {{{ FILES=(""); }}} and {{{ ls -l / }}}. Congratulations, you just allowed execution of arbitrary commands. {{{ $ touch "\"); ls -l $'\x2F'; #" $ ./nasty_find_all . 
echo 3 total 1052 -rw-r--r-- 1 root root 1018530 Apr 6 2005 System.map drwxr-xr-x 2 root root 4096 Oct 26 22:05 bin drwxr-xr-x 3 root root 4096 Oct 26 22:05 boot drwxr-xr-x 17 root root 29500 Nov 12 20:52 dev drwxr-xr-x 68 root root 4096 Nov 12 20:54 etc drwxr-xr-x 9 root root 4096 Oct 5 11:37 home drwxr-xr-x 10 root root 4096 Oct 26 22:05 lib drwxr-xr-x 2 root root 4096 Nov 4 00:14 lost+found drwxr-xr-x 6 root root 4096 Nov 4 18:22 mnt drwxr-xr-x 11 root root 4096 Oct 26 22:05 opt dr-xr-xr-x 82 root root 0 Nov 4 00:41 proc drwx------ 26 root root 4096 Oct 26 22:05 root drwxr-xr-x 2 root root 4096 Nov 4 00:34 sbin drwxr-xr-x 9 root root 0 Nov 4 00:41 sys drwxrwxrwt 8 root root 4096 Nov 12 21:55 tmp drwxr-xr-x 15 root root 4096 Oct 26 22:05 usr drwxr-xr-x 13 root root 4096 Oct 26 22:05 var ./nasty_find_all ./dir with spaces/file with newlines ./ $ }}} It doesn't take much imagination to replace {{{ ls -l }}} with {{{ rm -rf }}} or worse. One might think these circumstances are obscure, but one should not be tricked by this. All it takes is one malicious user, or perhaps more likely, a benign user who left the terminal unlocked when going to the bathroom, wrote a funny php uploading script that doesn't sanity check file names or who made the same mistake as oneself in allowing arbitrary code execution (now instead of being limited to the www-user, an attacker can use {{{nasty_find_all}}} to traverse chroot jails and/or gain additional privileges), uses an IRC or IM client that's too liberal in the filenames it accepts for file transfers or conversation logs, etc. [[Anchor(faq49)]] == How can I view periodic updates/appends to a file? (ex: growing log file) == {{{tail -f}}} will show you the growing log file. On some systems (e.g. OpenBSD), this will automatically track a rotated log file to the new file with the same name (which is usually what you want). To get the equivalent functionality on GNU systems, use {{{tail --follow=name}}} instead. 
The following is helpful if you need to view only the updates made to the file since you last looked at it.
{{{
# Start by setting n=1
tail -n "$n" testfile; n="+$(( $(wc -l < testfile) + 1 ))"
}}}
Every invocation of this prints whatever was added to the file since the previous invocation. If you know the line number from which you want to start, set n to that.

[[Anchor(faq50)]]
== I'm trying to construct a command dynamically, but I can't figure out how to deal with quoted multi-word arguments. ==
Some people attempt to do things like this:
{{{
# Non-working example
args="-s 'The subject' $address"
mail $args < "$body"
}}}
This fails because of word-splitting. When {{{$args}}} is evaluated, it becomes four words: {{{'The}}} is the second word, and {{{subject'}}} is the third word.

What's needed is a way to maintain each word as a separate item, even if that word contains multiple spaces. Quotes won't do it, but an array will.
{{{
# Working example
args=(-s "The subject" "$address")
mail "${args[@]}" < "$body"
}}}
Usually, this question arises when someone is trying to use {{{dialog}}} to construct a menu on the fly. For an example of how to do this properly, see [#faq40 FAQ #40] above.

[[Anchor(faq51)]]
== I want history-search just like in tcsh. How can I bind it to the up and down keys? ==
Just add the following to /etc/inputrc or your ~/.inputrc:
{{{
"\e[A":history-search-backward
"\e[B":history-search-forward
}}}

[[Anchor(faq52)]]
== How do I convert a file in DOS format to UNIX format (remove CRLF line terminators)? ==
All of these are from the sed one-liners page:
{{{
sed 's/.$//' dosfile        # assumes that all lines end with CR/LF
sed 's/^M$//' dosfile       # in bash/tcsh, press Ctrl-V then Ctrl-M
sed 's/\x0D$//' dosfile     # \x escapes are a GNU sed extension
}}}
Some distributions have a ''dos2unix'' command which can do this. In vim, you can use '':set fileformat=unix'' and then write the file.

[[Anchor(faq53)]]
== I have a fancy prompt with colors, and now bash doesn't seem to know how wide my terminal is. Lines wrap around incorrectly. ==
You must put {{{\[}}} and {{{\]}}} around any non-printing escape sequences in your prompt. Thus:
{{{
BLUE=$(tput setaf 4)
PURPLE=$(tput setaf 5)
BLACK=$(tput setaf 0)
PS1='\[$BLUE\]\h:\[$PURPLE\]\w\[$BLACK\]\$ '
}}}
Without the {{{\[ \]}}}, bash assumes that the bytes which make up the color escape sequences take up space on the screen, so it cannot know where the cursor actually is.

[[Anchor(faq54)]]
== How can I tell whether a variable contains a valid number? ==
First, you have to define what you mean by "number". The most common case seems to be that, when people ask this, they actually mean "a non-negative integer, with no leading + sign".
{{{
if [[ $foo = *[^0-9]* ]]; then
    echo "'$foo' has a non-digit somewhere in it"
else
    echo "'$foo' is strictly numeric"
fi
}}}
Note that an empty {{{$foo}}} contains no non-digit either, so it passes this test; check for that separately if it matters.

This can be done in Korn and legacy Bourne shells as well, using {{{case}}}:
{{{
case "$foo" in
    *[!0-9]*) echo "'$foo' has a non-digit somewhere in it" ;;
    *) echo "'$foo' is strictly numeric" ;;
esac
}}}
If what you actually mean is "a valid floating-point number" or something else more complex, then you might prefer to use a regular expression. Bash version 3 and above have regular expression support in the [[ command:
{{{
re='^[-+]?[0-9]+(\.[0-9]+)?$'
if [[ $foo =~ $re ]]; then
    echo "'$foo' looks rather like a number"
else
    echo "'$foo' doesn't look particularly numeric to me"
fi
}}}
If you don't have bash version 3, then you would use {{{egrep}}}:
{{{
if echo "$foo" | egrep '^[-+]?[0-9]+(\.[0-9]+)?$' >/dev/null; then
    echo "'$foo' might be a number"
else
    echo "'$foo' might not be a number"
fi
}}}
Note that the bash example stores the regular expression in a variable and expands it unquoted. How backslashes and quotes written directly on the right-hand side of {{{=~}}} are treated differs between bash releases; with the variable, the pattern can be written exactly as it is for {{{egrep}}}, with no backslashes in front of the parentheses.

[[Anchor(faq55)]]
== Tell me all about 2>&1 -- what's the difference between 2>&1 >foo and >foo 2>&1, and when do I use which? ==
Bash processes all redirections from left to right, in order. And the order is significant.
Moving them around within a command may change the results of that command. For newbies who've somehow managed to miss the previous hundred or so examples, here's what you want: {{{ foo >file 2>&1 # Sends both stdout and stderr to file. }}} Now for the rest of you, here's a simple demonstration of what's happening: {{{ foo() { echo "This is stdout" echo "This is stderr" 1>&2 } foo >/dev/null 2>&1 # produces no output foo 2>&1 >/dev/null # writes "This is stderr" on the screen }}} Why do the results differ? In the first case, {{{>/dev/null}}} is performed first, and therefore the standard output of the command is sent to {{{/dev/null}}}. Then, the {{{2>&1}}} is performed, which causes standard error to be sent to the same place that standard output is ''already'' going. So both of them are discarded. In the second example, {{{2>&1}}} is performed first. This means standard error is sent to wherever standard output happens to be going -- in this case, the user's terminal. Then, standard output is sent to {{{/dev/null}}} and is therefore discarded. So when we run {{{foo}}} the second time, we see only its standard error, not its standard output. There are times when we really do want {{{2>&1}}} to appear first -- for one example of this, see [#faq40 FAQ 40]. There are other times when we may use {{{2>&1}}} without any other redirections. Consider: {{{ find ... 2>&1 | grep "some error" }}} In this example, we want to search {{{find}}}'s standard error (as well as its standard output) for the string "some error". The {{{2>&1}}} in the piped command forces standard error to go into the pipe along with standard output. (When pipes and redirections are mixed in this way, remember: the pipe is done ''first'', before any redirections. So {{{find}}}'s standard output is already set to point to the pipe before we process the {{{2>&1}}} redirection.) If we wanted to read ''only'' standard error in the pipe, and discard standard output, we could do it like this: {{{ find ... 
2>&1 >/dev/null | grep "some error" }}} The redirections in that example are processed thus: 1. First, the pipe is created. {{{find}}}'s output is sent to it. 1. Next, {{{2>&1}}} causes {{{find}}}'s standard error to go to the pipe as well. 1. Finally, {{{>/dev/null}}} causes {{{find}}}'s standard output to be discarded, leaving only stderr going into the pipe. A related question is [#faq47 FAQ #47], which discusses how to send stderr to a pipeline. [[Anchor(faq56)]] == How can I untar or unzip multiple tarballs at once? == As the {{{tar}}} command was originally designed to read from and write to tape devices (tar - Tape ARchiver), you can specify only filenames to put inside an archive or to extract out of an archive (e.g. {{{tar x myfileonthe.tape}}}). There is an option to tell {{{tar}}} that the archive is not on some tape, but in a file: {{{-f}}}. This option takes exactly one argument: the filename of the file containing the archive. All other (following) filenames are taken to be archive members: {{{ tar -x -f backup.tar myfile.txt # OR (more common syntax IMHO) tar xf backup.tar myfile.txt }}} Now here's a common mistake -- imagine a directory containing the following archive-files you want to extract all at once: {{{ $ ls backup1.tar backup2.tar backup3.tar }}} Maybe you think of {{{tar xf *.tar}}}. Let's see: {{{ $ tar xf *.tar tar: backup2.tar: Not found in archive tar: backup3.tar: Not found in archive tar: Error exit delayed from previous errors }}} What happened? The shell replaced your *.tar by the matching filenames. You really wrote: {{{ tar xf backup1.tar backup2.tar backup3.tar }}} And as we saw earlier, it means: "extract the files backup2.tar and backup3.tar from the archive backup1.tar", which will of course only succeed when there are such filenames stored in the archive. The solution is relatively easy: extract the contents of all archives '''one at a time'''. 
As we use a UNIX shell and we are lazy, we do that with a loop:
{{{
for tarname in *.tar; do
    tar xf "$tarname"
done
}}}
What happens? The for-loop will iterate through all filenames matching {{{*.tar}}} and call {{{tar xf}}} for each of them. That way you extract all archives one-by-one and you even do it automagically.

The second common archive type these days is ZIP. The command to extract contents from a ZIP file is {{{unzip}}} (who would have guessed that!). The problem here is the very same: {{{unzip}}} takes only one argument specifying the ZIP-file. So, you solve it the very same way:
{{{
for zipfile in *.zip; do
    unzip "$zipfile"
done
}}}
Not enough? Ok. There's another option with {{{unzip}}}: it can take shell-like patterns to specify the ZIP-file names. To avoid interpretation of those patterns by the shell, you need to quote them. {{{unzip}}} itself and '''not''' the shell will interpret {{{*.zip}}} in this case:
{{{
unzip "*.zip"
# OR, to make more clear what we do:
unzip \*.zip
}}}
(This feature of {{{unzip}}} derives mainly from its origins as an MS-DOS program. MS-DOS's command interpreter does not perform glob expansions, so every MS-DOS program must be able to expand wildcards into a list of filenames. This feature was left in the Unix version, and as we just demonstrated, it can occasionally be useful.)

[[Anchor(faq57)]]
== How can I group entries in a file by common prefixes? ==
As in, convert:
{{{
foo: entry1
bar: entry2
foo: entry3
baz: entry4
}}}
to
{{{
foo: entry1 entry3
bar: entry2
baz: entry4
}}}
There are two simple general methods for this:

 a. sort the file, and then iterate over it, collecting entries until the prefix changes, and then print the collected entries with the previous prefix
 b. iterate over the file, collect entries for each prefix in an array indexed by the prefix

A basic implementation of a) in bash:
{{{
old= ; stuff=
# the final "xxx" line is a sentinel that flushes the last group;
# it must not occur as a real prefix in the file
(sort file ; echo xxx) | while read -r prefix line ; do
    if [[ $prefix = "$old" ]] ; then
        stuff="$stuff $line"
    else
        # prefix changed: print the finished group (nothing on the first line)
        if [[ -n $old ]] ; then echo "$old $stuff" ; fi
        old="$prefix"
        stuff="$line"    # start the new group with the current entry
    fi
done
}}}
And a basic implementation of b) in awk:
{{{
{
    a[$1] = a[$1] " " $2
}
END {
    for (x in a) print x, a[x]
}
}}}
Usage:
{{{
awk '{a[$1] = a[$1] " " $2}END{for (x in a) print x, a[x]}' file
}}}

[[Anchor(faq58)]]
== Can bash handle binary data? ==
The answer is, basically, no.

While bash won't have as many problems with it as older shells, it still can't process arbitrary binary data. More specifically, shell variables are not 100% binary clean, so you can't store binary files in them.

One instance where this would sometimes be handy is storing small temporary bitmaps while working with netpbm. Here I resorted to adding an extra pnmnoraw to the pipe, creating (larger) ASCII files that bash has no problems storing.

If you are feeling adventurous, consider this experiment:
{{{
# bindec.bash, attempt to decode binary data to ascii decimals
IFS=
while read -r -n1 x ;do
    case "$x" in
        '') echo empty ;;
        # insert the 256 lines generated by the following oneliner here:
        # for x in $(seq 0 255) ;do echo "        $'\\$(printf %o $x)') echo $x;;" ;done
    esac
done
}}}
and then pipe binary data into it, maybe like so:
{{{
for x in $(seq 0 255) ;do echo -ne "\\$(printf %o $x)" ;done | bash bindec.bash | nl | less
}}}
This suggests that the 0 character is skipped entirely, or that the input generation above cannot create it; either way, that is enough to conveniently corrupt most binary files we try to process

(note that this refers to storing them in variables...
moving data between programs using pipes is always binary clean) [[Anchor(faq59)]] == How can I remove the last character of a line? == Using bash and ksh extended parameter substitution: {{{ var=${var%?} }}} Remember that ${var%foo} removes foo from the end, and ${var#foo} removes foo from the beginning, of {{{var}}}. As a mnemonic, # appears to the left of % on the keyboard (US keyboards, at least). More portable, but slower: {{{ var=`expr "$var" : '\(.*\).'` }}} or (using {{{sed}}}): {{{ var=`echo "$var" | sed 's/.$//'` }}} [[Anchor(faq60)]] == I'm trying to write a script that will change directory (or set a variable), but after the script finishes, I'm back where I started (or my variable isn't set)! == Consider this: {{{ #!/bin/sh cd /tmp }}} If one executes this simple script, what happens? Bash forks, and the parent waits. The child executes the script, including the {{{chdir(2)}}} system call, and then exits. The parent, which was waiting for the child, harvests the child's exit status (presumably 0 for success), and then bash carries on with the next command. Since the {{{chdir}}} was done by a child process, it has no effect on the parent. Moreover, there is '''no conceivable way''' you can ''ever'' have a child process affect ''any'' part of the parent's environment, which includes its variables as well as its current working directory. So, how does one go about it? You can still have the {{{cd}}} command in an external file, but you can't ''run it'' as a script. Instead, you must {{{source}}} it (or "dot it in", using the {{{.}}} command, which is a synonym for {{{source}}}). {{{ echo 'cd /tmp' > $HOME/mycd source $HOME/mycd pwd # Now, we're in /tmp }}} The same thing applies to setting variables. {{{source}}} the file that contains the commands; don't try to run it. [[Anchor(faq61)]] == Is there a list of which features were added to specific releases of Bash? 
==
 * [http://cnswww.cns.cwru.edu/~chet/bash/NEWS NEWS]: a file tersely listing the notable changes between the current and previous versions
 * [http://cnswww.cns.cwru.edu/~chet/bash/CHANGES CHANGES]: a complete bash change history
 * [http://cnswww.cns.cwru.edu/~chet/bash/COMPAT COMPAT]: compatibility issues between bash3 and previous versions

Here's a ''partial'' list of the changes, in a more compact format:

||'''Feature'''||'''Added in version'''||
||x+=string||3.1-alpha1||
||{x..y}||3.0-alpha||
||${!array[@]}||3.0-alpha||
||[[ =~||3.0-alpha||
||<<<||2.05b-alpha1||
||i++||2.04-devel||
||for ((;;))||2.04-devel||
||/dev/fd/N, /dev/tcp/host/port, etc.||2.04-devel||
||a=(*.txt) file expansion||2.03-alpha||
||extglob||2.02-alpha1||
||[[||2.02-alpha1||
||builtin printf||2.02-alpha1||
||$(< filename)||2.02-alpha1||
||** (exponentiation)||2.02-alpha1||
||\xNNN||2.02-alpha1||
||(( ))||2.0-beta2||

[[Anchor(faq62)]]
== How do I create a temporary file in a secure manner? ==
Good question. To be filled in later. (Interim hints: {{{tempfile}}} is not portable. {{{mktemp}}} exists more widely, but it may require a {{{-c}}} switch to create the file in advance; or it may create the file by default and barf if {{{-c}}} is supplied. There does not appear to be any single command that simply ''works'' everywhere, without testing various arguments.)

[[Anchor(faq63)]]
== My ssh client hangs when I try to run a remote background job! ==
The following will not do what you expect:
{{{
ssh me@remotehost 'sleep 120 &'
# Client hangs for 120 seconds
}}}
This is a "feature" of [http://www.openssh.org/ OpenSSH]. The client will not close the connection as long as the remote end's terminal is still in use -- and in the case of {{{sleep 120 &}}}, stdout and stderr are still connected to the terminal.

The immediate answer to your question -- "How do I get the client to disconnect so I can get my shell back?" -- is to kill the ssh client.
You can do this with the {{{kill}}} or {{{pkill}}} commands, of course; or by sending the INT signal (usually Ctrl-C) for a non-interactive ssh session (as above); or by pressing '''<Enter><~><.>''' (Enter, Tilde, Period) in the client's terminal window for an interactive remote shell. The long-term workaround for this is to ensure that all the file descriptors are redirected to a log file (or {{{/dev/null}}}) on the remote side: {{{ ssh me@remotehost 'sleep 120 >/dev/null 2>&1 &' # Client should return immediately }}} This also applies to restarting daemons on some legacy Unix systems. {{{ ssh root@hp-ux-box # Interactive shell ... # Discover that the problem is stale NFS handles /sbin/init.d/nfs.client stop # autofs is managed by this script and /sbin/init.d/nfs.client start # killing it on HP-UX is OK (unlike Linux) exit # Client hangs -- use Enter ~ . to kill it. }}} The legacy Unix {{{/sbin/init.d/nfs.client}}} script runs daemons in the background but leaves their stdout and stderr attached to the terminal (and they don't fully self-daemonize). The solution is either to fix the Unix vendor's broken init script, or to kill the ssh client process after this happens. The author of this article uses the latter approach. [[Anchor(faq64)]] == Why is it so hard to get an answer to the question that I asked in #bash ? == * #bash aphorism #1 "The questioner's first description of the problem/question will be misleading." * corollary 1.1 "The questioner's second description of the problem/question will also be misleading" * corollary 1.2 "The questioner is never precise" ex: will say "print the file" when they mean print the file's name, rather than printing the file itself." * #bash aphorism #2, "The questioner will keep changing their original question until it drives the helpers in the channel insane." 
* #bash aphorism #3, "The data is never formatted in the way that makes it easiest to manipulate :-)" * #bash aphorism #4, "30 to 40 percent of the conversations in #bash will be about aphorisms #1 and #2" [[Anchor(faq65)]] == Is there a "PAUSE" command in bash like there is in MSDOS batch scripts? To prompt the user to press any key to continue? == No, but you can use these: {{{ echo press enter to continue; read }}} {{{ echo press any key to continue; read -n 1 }}} [[Anchor(faq66)]] == I want to check if [[ $var == foo || $var == bar || $var = more ]] without repeating $var n times. == {{{ case $var in foo|bar|more) ... ;; esac }}} [[Anchor(faq67)]] == How can I trim leading/trailing white space from one of my variables? == There are a few ways to do this -- none of them elegant. First, the most portable way would be to use sed: {{{ x=$(echo "$x" | sed -e 's/^ *//' -e 's/ *$//') # Note: this only removes spaces. For tabs too: x=$(echo "$x" | sed -e $'s/^[ \t]*//' -e $'s/[ \t]*$//') # Or possibly, with some systems: x=$(echo "$x" | sed -e 's/^[[:space:]]\+//' -e 's/[[:space:]]\+$//') }}} One can achieve the goal using builtins, although at the moment I'm not sure which shells the following syntax supports: {{{ # Remove leading whitespace: while [[ $x = [$' \t\n']* ]]; do x=${x#[$' \t\n']}; done # And now trailing: while [[ $x = *[$' \t\n'] ]]; do x=${x%[$' \t\n']}; done }}} Of course, the preceding example is pretty slow, because it removes one character at a time, in a loop (although it's good enough in practice for most purposes). If you want something a bit fancier, there's a bash-only solution using extglob: {{{ shopt -s extglob x=${x##*([$' \t\n'])}; x=${x%%*([$' \t\n'])} shopt -u extglob }}} There are many, many other ways to do this. These are not necessarily the most efficient, but they're known to work. [[Anchor(faq68)]] == How do I run a command, and have it abort (timeout) after N seconds? 
== There are two C programs that can do this: [http://pilcrow.madison.wi.us/ doalarm], and [http://www.porcupine.org/forensics/tct.html timeout]. (Compiling them is beyond the scope of this document; suffice to say, it'll be trivial on GNU/Linux systems, easy on most BSDs, and painful on anything else....) If you don't have or don't want one of the above two programs, you can use a perl one-liner to set an ALRM and then exec the program you want to run under a time limit. In any case, you must understand what your program does with SIGALRM. {{{ function doalarm () { perl -e 'alarm shift; exec @ARGV' "$@" ; } doalarm ${NUMBER_OF_SECONDS_BEFORE_ALRMING} program arg arg ... }}} If you can't or won't install one of these programs (which ''really'' should have been included with the basic core Unix utilities 30 years ago!), then the best you can do is an ugly hack like: {{{ command & pid=$!; { sleep 10 && kill $pid; } & }}} This will, as you will soon discover, produce quite a mess regardless of whether the timeout condition kicked in or not. Cleaning it up is not something worth my time -- just use {{{doalarm}}} or {{{timeout}}} instead. Really. [[Anchor(faq69)]] == I want to automate an ssh (or scp, or sftp) connection, but I don't know how to send the password.... == '''STOP!''' First of all, if you actually were to embed your password in a script somewhere, it would be visible to the entire world (or at least, anyone who can read files on your system). This would defeat the entire purpose of having a password on your remote account. If you understand this and still want to continue, then the next thing you need to do is read and understand the man page for {{{ssh-keygen(1)}}}. This will tell you how to generate a public/private key pair (in either RSA or DSA format), and how to use these keys to authenticate to the remote system without sending a password at all. 
Since many of you are too lazy to read man pages, and instead prefer to ask us in #bash to read them for you, I'll even give a brief summary of the procedure here:
{{{
ssh-keygen -t rsa
scp ~/.ssh/id_rsa.pub me@remote:
ssh me@remote 'cat id_rsa.pub >> .ssh/authorized_keys'
ssh me@remote date     # should not prompt for passWORD,
                       # but your key may have a passPHRASE
}}}
If your key has a passphrase on it, and you want to avoid typing it every time, look into {{{ssh-agent(1)}}}. It's beyond the scope of this document, though.

If you're being prompted for a password even with the public key inserted into the remote {{{authorized_keys}}} file, chances are you have a permissions problem on the remote system. Check '''every single directory''' in the full path leading up to the {{{authorized_keys}}} file and make sure they do '''not''' have world- or group-write privileges. ''E.g.'', if your home directory is {{{/home/fred}}} and {{{/home}}} has group "staff" write privileges, {{{sshd}}} will refuse to honor your key.

If that's not it, then make sure you didn't spell it ''authorised_keys''. SSH uses the US spelling, ''authorized_keys''.

If you ''really'' want to use a password instead of public keys, first have your head examined. Then, if you ''still'' want to use a password, use {{{expect(1)}}}. And don't ask us for help with it.

[[Anchor(faq70)]]
== How do I convert Unix (epoch) timestamps to human-readable values? ==
The only sane way to handle time values within a program is to convert them into a linear scale. You can't store "January 17, 2005 at 5:37 PM" in a variable and expect to do anything with it. Therefore, any competent program is going to use time stamps with semantics such as "the number of seconds since point X". These are called ''epoch'' timestamps. If the epoch is January 1, 1970 at midnight UTC, then it's also called a "Unix timestamp", because this is how Unix stores all times (such as file modification times).
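To make the idea of a linear time scale concrete, here is a small worked example in plain shell arithmetic. The sample value 1164128484 is arbitrary and the variable names are only illustrative; the sketch assumes a shell with POSIX {{{$(( ))}}} arithmetic and does not rely on any date tool.

```shell
ts=1164128484                      # sample Unix timestamp (seconds since the epoch)
days=$(( ts / 86400 ))             # whole days since 1970-01-01 00:00:00 UTC
rest=$(( ts % 86400 ))             # seconds elapsed within the current UTC day
hours=$(( rest / 3600 ))
minutes=$(( rest % 3600 / 60 ))
seconds=$(( rest % 60 ))
printf '%d days + %02d:%02d:%02d UTC\n' "$days" "$hours" "$minutes" "$seconds"
# prints "13473 days + 17:01:24 UTC"
```

Splitting off the time of day is simple division. Turning the day count into a calendar date (months, leap years, time zones) is the hard part, which is why a dedicated tool is normally used for it.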
Standard Unix, unfortunately, has ''no'' tools to work with Unix timestamps. (Ironic, eh?) GNU date, and later BSD date, have a {{{%s}}} extension to generate output in Unix timestamp format:
{{{
date +%s    # Prints the current time in Unix format, e.g. 1164128484
}}}
This is commonly used in scripts when one requires the ''interval'' between two events:
{{{
start=$(date +%s)
...
end=$(date +%s)
echo "Operation took $((end - start)) seconds."
}}}
Reading the source code of GNU date's date parser reveals that it accepts Unix timestamps prefixed with '@', so:
{{{
$ date -d "@1164128484"
# Prints "Tue Nov 21 18:01:24 CET 2006" in the central European time zone
}}}
Another method that was suggested before is to trick GNU date using:
{{{
date -d "1970-01-01 UTC + 1164128484 seconds"
# Prints "Tue Nov 21 12:01:24 EST 2006" in the US/Eastern time zone.
}}}
If you don't have GNU date available, Perl can also be used:
{{{
perl -le "print scalar localtime 1164128484"
# Prints "Tue Nov 21 12:01:24 2006"
}}}
I used double quotes in these examples so that the time constant could be replaced with a variable reference. See the documentation for {{{date(1)}}} and Perl for details on changing the output format.
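The GNU date and Perl approaches above can be combined into one helper that uses whichever is available. This is a sketch, not a portable standard: the function name {{{epoch_to_human}}} is made up, and it assumes that only GNU date accepts {{{--version}}}, which is used here to decide whether {{{-d "@..."}}} is safe to try.

```shell
# epoch_to_human SECONDS: hypothetical helper; prints local time for a Unix timestamp.
epoch_to_human() {
    if date --version >/dev/null 2>&1; then
        # Assumption: a date that understands --version is GNU date,
        # which accepts timestamps prefixed with '@'.
        date -d "@$1"
    else
        # Fallback for systems without GNU date; the timestamp is passed
        # as an argument so no quoting tricks are needed.
        perl -le 'print scalar localtime shift' "$1"
    fi
}

human=$(epoch_to_human 1164128484)
echo "$human"
```

The exact text printed depends on the local time zone and on which branch is taken, so a script should not parse this output; keep the numeric timestamp for any further calculations.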