94705
Comment: delete bogus "questions" with no answers
|
101131
table showing some of the important changes
|
Line 24: | Line 24: |
The {{{read}}} command still modifies each line read, e.g. it removes all leading whitespace characters (blanks, tab characters). If that is not desired, the IFS (internal field separator) variable has to be cleared: | If you want to operate on individual fields within each line, you may supply additional variables to {{{read}}}: {{{ # Input file has 3 columns separated by white space. while read first_name last_name phone; do ... done < "$file" }}} If the field delimiters are not whitespace, you can set {{{IFS}}} (input field separator): {{{ while IFS=: read user pass uid gid gecos home shell; do ... done < /etc/passwd }}} Also, please note that you do ''not'' necessarily need to know how many fields each line of input contains. If you supply more variables than there are fields, the extra variables will be empty. If you supply fewer, the last variable gets "all the rest" of the fields after the preceding ones are satisfied. For example, {{{ while read first_name last_name junk; do ... done <<< 'Bob Smith 123 Main Street Elk Grove Iowa 123-555-6789' # Inside the loop, first_name will contain "Bob", and # last_name will contain "Smith". The variable "junk" holds # everything else. }}} The {{{read}}} command modifies each line read, e.g. it removes all leading whitespace characters (blanks, tab characters). If that is not desired, the {{{IFS}}} variable has to be cleared: |
Line 56: | Line 84: |
That may cause problems later on if the commands inside the body of the loop attempt to set variables which need to be used outside the loop; in that case, see [#faq24 FAQ 24]. | That may cause problems later on if the commands inside the body of the loop attempt to set variables which need to be used outside the loop; in that case, see [#faq24 FAQ 24], or use process substitution like: {{{ while read line; do other commands done < <(some command) }}} |
Line 67: | Line 101: |
== How can I remove the last character of a line? == Using bash and ksh extended parameter substitution: {{{ var=${var%?} }}} Remember that ${var%foo} removes foo from the end, and ${var#foo} removes foo from the beginning, of {{{var}}}. As a mnemonic, # appears to the left of % on the keyboard (US keyboards, at least). More portable, but slower: {{{ var=`expr "$var" : '\(.*\).'` }}} or (using {{{sed}}}): {{{ var=`echo "$var" | sed 's/.$//'` |
== How can I store the return value of a command in a variable? == Well, that depends on exactly what you mean by that question. Some people want to store the command's ''output'' (either stdout, or stdout + stderr); and others want to store the command's ''exit status'' (0 to 255, with 0 typically meaning "success"). If you want to capture the output: {{{ var=$(command) # stdout only; stderr remains uncaptured var=$(command 2>&1) # both stdout and stderr will be captured }}} If you want the exit status: {{{ command var=$? }}} If you want both: {{{ var1=$(command) var2=$? # the assignment to var1 has no effect on command's exit status, which is still in $? }}} If you don't ''actually'' want the exit status, but simply want to take an action upon success or failure: {{{ if command then echo "it succeeded" else echo "it failed" fi |
Line 103: | Line 151: |
We can test for the exit status of ls: {{{ if ls "$directory"/file.txt; then echo "file.txt found!" else echo "file.txt not found." fi }}} |
|
Line 414: | Line 473: |
== How can I redirect the output of several commands at once? == | == How can I redirect the output of multiple commands at once? == |
Line 930: | Line 989: |
}}} For ["BASH"], when the first part of the pipe is a command, you can use "process substitution". The command used here is a simple "echo -e $'a\nb\nc'" as a substitute for a command with a multiline output: {{{ while read LINE; do echo "-> $LINE" done < <(echo -e $'a\nb\nc') |
|
Line 1470: | Line 1537: |
sp="/-|-\|" | sp="/-\|" |
Line 1479: | Line 1546: |
A similar technique can be used to build progress bars. | |
Line 1910: | Line 1978: |
One may also pipe stderr only but keep stdout intact (without ''a priori'' knowledge of where the script's output is going). This is a bit trickier. This has an obvious application with eg. dialog, which draws (using ncurses) windows onto the screen to stdout, and returns output to stderr. This may be a little inconvenient, because it may lead to a necessary temporary file which we may like to evade. (Although this is not necessary -- see [#faq40 FAQ #40] for more examples of using dialog specifically!) On [http://www.tldp.org/LDP/abs/html/io-redirection.html TLDP], I've found following trick: {{{ # Redirecting only stderr to a pipe. exec 3>&1 # Save current "value" of stdout. ls -l 2>&1 >&3 3>&- | grep bad 3>&- # Close fd 3 for 'grep' (but not 'ls'). # ^^^^ ^^^^ exec 3>&- # Now close it for the remainder of the script. # Thanks, S.C. }}} To show it as a dialog one-liner: {{{ exec 3>&1 dialog --menu Title 0 0 0 FirstItem FirstDescription 2>&1 >&3 3>&- | sed 's/First/Only/' exec 3>&- }}} This will have the dialog window working properly, yet it will be the output of dialog (returned to stderr) being altered by the sed. Cheers. |
|
Line 2085: | Line 2175: |
This can be done in legacy Bourne shell as well, using {{{case}}}: | This can be done in Korn and legacy Bourne shells as well, using {{{case}}}: |
Line 2089: | Line 2179: |
*[^0-9]*) echo "'$foo' has a non-digit somewhere in it" ;; | *[!0-9]*) echo "'$foo' has a non-digit somewhere in it" ;; |
Line 2158: | Line 2248: |
A related question is [#faq59 FAQ #59], which discusses how to send stderr to a pipeline, while leaving stdout unpiped. | A related question is [#faq47 FAQ #47], which discusses how to send stderr to a pipeline. |
Line 2291: | Line 2381: |
== I'd like to pipe stderr only but keep stdout intact. == This has an obvious application with eg. dialog, which draws (using ncurses) windows onto the screen to stdout, and returns output to stderr. This may be a little inconvenient, because it may lead to a necessary temporary file which we may like to evade. (Although this is not necessary -- see [#faq40 FAQ #40] for more examples of using dialog specifically!) On [http://www.tldp.org/LDP/abs/html/io-redirection.html TLDP], I've found following trick: {{{ # Redirecting only stderr to a pipe. exec 3>&1 # Save current "value" of stdout. ls -l 2>&1 >&3 3>&- | grep bad 3>&- # Close fd 3 for 'grep' (but not 'ls'). # ^^^^ ^^^^ exec 3>&- # Now close it for the remainder of the script. # Thanks, S.C. }}} To show it as a dialog one-liner: {{{ exec 3>&1 dialog --menu Title 0 0 0 FirstItem FirstDescription 2>&1 >&3 3>&- | sed 's/First/Only/' exec 3>&- }}} This will have the dialog window working properly, yet it will be the output of dialog (returned to stderr) being altered by the sed. Cheers. |
== How can I remove the last character of a line? == Using bash and ksh extended parameter substitution: {{{ var=${var%?} }}} Remember that ${var%foo} removes foo from the end, and ${var#foo} removes foo from the beginning, of {{{var}}}. As a mnemonic, # appears to the left of % on the keyboard (US keyboards, at least). More portable, but slower: {{{ var=`expr "$var" : '\(.*\).'` }}} or (using {{{sed}}}): {{{ var=`echo "$var" | sed 's/.$//'` }}} |
Line 2347: | Line 2434: |
Here's a ''partial'' list of the changes, in a more compact format: ||'''Feature'''||'''Added in version'''|| ||x+=string||3.1-alpha1|| ||{x..y}||3.0-alpha|| ||${!array[@]}||3.0-alpha|| ||[[ =~||3.0-alpha|| ||<<<||2.05b-alpha1|| ||i++||2.04-devel|| ||for ((;;))||2.04-devel|| ||/dev/fd/N, /dev/tcp/host/port, etc.||2.04-devel|| ||a=(*.txt) file expansion||2.03-alpha|| ||extglob||2.02-alpha1|| ||[[||2.02-alpha1|| ||builtin printf||2.02-alpha1|| ||$(< filename)||2.02-alpha1|| ||** (exponentiation)||2.02-alpha1|| ||\xNNN||2.02-alpha1|| ||(( ))||2.0-beta2|| |
|
Line 2387: | Line 2494: |
* #bash aphorism #1 "the original english description of the problem/question is ALWAYS wrong!" * corollary to aphorism #1 "the questioner is never precise" ex: will say "print the file" when they mean print the file's name, rather than printing the file itself. * #bash aphorism #2, the questioner will keep changing their original question until it drives the helpers in the channel insane. * #bash aphorism #3, the data is never formatted in the way that makes it easiest to manipulate :-) * #bash aphorism #4, 30 to 40 percent of the conversations in #bash will be about the 3 aphorisms above |
* #bash aphorism #1 "The questioner's first description of the problem/question will be misleading." * corollary 1.1 "The questioner's second description of the problem/question will also be misleading" * corollary 1.2 "The questioner is never precise" ex: will say "print the file" when they mean print the file's name, rather than printing the file itself." * #bash aphorism #2, "The questioner will keep changing their original question until it drives the helpers in the channel insane." * #bash aphorism #3, "The data is never formatted in the way that makes it easiest to manipulate :-)" * #bash aphorism #4, "30 to 40 percent of the conversations in #bash will be about aphorisms #1 and #2" [[Anchor(faq65)]] == Is there a "PAUSE" command in bash like there is in MSDOS batch scripts? == No, but you can use these: {{{ echo press enter to continue; read }}} {{{ echo press any key to continue; read -n 1 }}} [[Anchor(faq66)]] == I want to check if [[ $var == foo || $var == bar || $var = more ]] without repeating $var n times. == {{{ case $var in foo|bar|more) ... ;; esac }}} [[Anchor(faq67)]] == How can I trim leading/trailing white space from one of my variables? == There are a few ways to do this -- none of them elegant. First, the most portable way would be to use sed: {{{ x=$(echo "$x" | sed -e 's/^ *//' -e 's/ *$//') # Note: this only removes spaces. 
For tabs too: x=$(echo "$x" | sed -e $'s/^[ \t]*//' -e $'s/[ \t]*$//') # Or possibly, with some systems: x=$(echo "$x" | sed -e 's/^[[:space:]]\+//' -e 's/[[:space:]]\+$//') }}} One can achieve the goal using builtins, although at the moment I'm not sure which shells the following syntax supports: {{{ # Remove leading whitespace: while [[ $x = [$' \t\n']* ]]; do x=${x#[$' \t\n']}; done # And now trailing: while [[ $x = *[$' \t\n'] ]]; do x=${x%[$' \t\n']}; done }}} Of course, the preceding example is pretty slow, because it removes one character at a time, in a loop (although it's good enough in practice for most purposes). If you want something a bit fancier, there's a bash-only solution using extglob: {{{ shopt -s extglob x=${x##*([$' \t\n'])}; x=${x%%*([$' \t\n'])} shopt -u extglob }}} There are many, many other ways to do this. These are not necessarily the most efficient, but they're known to work. [[Anchor(faq68)]] == How do I run a command, and have it abort (timeout) after N seconds? == There are two C programs that can do this: [http://pilcrow.madison.wi.us/ doalarm], and [http://www.porcupine.org/forensics/tct.html timeout]. (Compiling them is beyond the scope of this document; suffice to say, it'll be trivial on GNU/Linux systems, easy on most BSDs, and painful on anything else....) If you don't have or don't want one of the above two programs, you can use a perl one-liner to set an ALRM and then exec the program you want to run under a time limit. In any case, you must understand what your program does with SIGALRM. {{{ function doalarm () { perl -e 'alarm shift; exec @ARGV' "$@" ; } doalarm ${NUMBER_OF_SECONDS_BEFORE_ALRMING} program arg arg ... 
}}} If you can't or won't install one of these programs (which ''really'' should have been included with the basic core Unix utilities 30 years ago!), then the best you can do is an ugly hack like: {{{ command & pid=$!; { sleep 10 && kill $pid; } & }}} This will, as you will soon discover, produce quite a mess regardless of whether the timeout condition kicked in or not. Cleaning it up is not something worth my time -- just use {{{doalarm}}} or {{{timeout}}} instead. Really. |
BASH Frequently Asked Questions
These are answers to frequently asked questions on channel #bash on the [http://www.freenode.net/ freenode] IRC network. These answers are contributed by the regular members of the channel (originally heiner, and then others including greycat and r00t), and by users like you. If you find something inaccurate or simply misspelled, please feel free to correct it!
All the information here is presented without any warranty or guarantee of accuracy. Use it at your own risk. When in doubt, please consult the man pages or the GNU info pages as the authoritative references.
["BASH"] is a BourneShell compatible shell, which adds many new features to its ancestor. Most of them are available in the KornShell, too. If a question is not strictly shell specific, but rather related to Unix, it may be in the UnixFaq.
If you want to help, you can add new questions with answers here, or try to answer one of the BashOpenQuestions.
1. How can I read a file line-by-line?
while read line
do
    echo "$line"
done < "$file"
If you want to operate on individual fields within each line, you may supply additional variables to read:
# Input file has 3 columns separated by white space.
while read first_name last_name phone; do
    ...
done < "$file"
If the field delimiters are not whitespace, you can set IFS (input field separator):
while IFS=: read user pass uid gid gecos home shell; do ... done < /etc/passwd
Also, please note that you do not necessarily need to know how many fields each line of input contains. If you supply more variables than there are fields, the extra variables will be empty. If you supply fewer, the last variable gets "all the rest" of the fields after the preceding ones are satisfied. For example,
while read first_name last_name junk; do
    ...
done <<< 'Bob Smith 123 Main Street Elk Grove Iowa 123-555-6789'
# Inside the loop, first_name will contain "Bob", and
# last_name will contain "Smith". The variable "junk" holds
# everything else.
The read command modifies each line read, e.g. it removes all leading and trailing whitespace characters (blanks, tab characters). If that is not desired, the IFS variable has to be cleared:
OIFS=$IFS; IFS=
while read line
do
    echo "$line"
done < "$file"
IFS=$OIFS
As a feature, the read command concatenates lines that end with a backslash '\' character to one single line. To disable this feature, KornShell and ["BASH"] have read -r:
OIFS=$IFS; IFS=
while read -r line
do
    echo "$line"
done < "$file"
IFS=$OIFS
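As a hedged alternative to saving and restoring IFS by hand, the assignment can be scoped to the read command alone. This is a minimal sketch; the temporary file and its two sample lines are made up for illustration:

```shell
# Hypothetical sample input: one line with leading blanks, one with a backslash.
tmpfile=$(mktemp)
printf '%s\n' '  two leading spaces' 'a\tb stays literal' > "$tmpfile"

out=
# IFS= applies only to this read; -r keeps backslashes untouched.
while IFS= read -r line
do
    out="$out[$line]"
done < "$tmpfile"
rm -f "$tmpfile"

printf '%s\n' "$out"
```

Because the assignment is a prefix of read, the global IFS is never changed, so nothing has to be restored afterwards.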
Note that reading a file line by line this way is very slow for large files. Consider using e.g. ["AWK"] instead if you get performance problems.
One may also read from a command instead of a regular file:
some command | while read line; do
    other commands
done
That may cause problems later on if the commands inside the body of the loop attempt to set variables which need to be used outside the loop; in that case, see [#faq24 FAQ 24], or use process substitution like:
while read line; do
    other commands
done < <(some command)
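The difference between the two forms can be seen in a small bash sketch (process substitution is not available in a plain POSIX sh). The counter names and the three sample lines are illustrative assumptions:

```shell
# Pipe form: in bash (without the lastpipe option) the loop runs in a subshell,
# so changes to "piped" are lost when the loop ends.
piped=0
printf '%s\n' one two three | while read -r line; do
    piped=$((piped + 1))
done

# Process substitution: the loop runs in the current shell,
# so "kept" still holds its value afterwards.
kept=0
while read -r line; do
    kept=$((kept + 1))
done < <(printf '%s\n' one two three)

echo "after pipe: $piped, after process substitution: $kept"
```

Other shells (ksh, for example) run the last element of a pipeline in the current shell, so the first counter may behave differently there.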
Sometimes it's useful to read a file into an array, one array element per line. You can do that with the following example:
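O=$IFS
IFS=$'\n'
arr=($(< myfile))
IFS=$O
This temporarily changes the Input Field Separator to a newline, so that each line will be considered one field when the output of $(< myfile) is word-split. Then it populates the array arr with the fields. Then it sets the IFS back to what it was before. Note that empty lines are dropped by the word splitting, and glob characters in the file are still expanded.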
2. How can I store the return value of a command in a variable?
Well, that depends on exactly what you mean by that question. Some people want to store the command's output (either stdout, or stdout + stderr); and others want to store the command's exit status (0 to 255, with 0 typically meaning "success").
If you want to capture the output:
var=$(command)        # stdout only; stderr remains uncaptured
var=$(command 2>&1)   # both stdout and stderr will be captured
If you want the exit status:
command
var=$?
If you want both:
var1=$(command)
var2=$?   # the assignment to var1 has no effect on command's exit status, which is still in $?
If you don't actually want the exit status, but simply want to take an action upon success or failure:
if command
then
    echo "it succeeded"
else
    echo "it failed"
fi
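Putting the cases together, a minimal sketch. The stand-in command run through sh -c is hypothetical; it simply prints one word and exits with status 3:

```shell
# Hypothetical stand-in for a real command: prints "hello", exits with 3.
output=$(sh -c 'echo hello; exit 3')
status=$?      # must be read immediately, before any other command runs

if [ "$status" -eq 0 ]; then
    echo "succeeded with output: $output"
else
    echo "failed with status $status and output: $output"
fi
```

The exit status has to be saved right after the assignment, because every later command (even a test) overwrites $?.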
3. How can I insert a blank character after each character?
sed 's/./& /g'
Example:
$ echo "testing" | sed 's/./& /g'
t e s t i n g
4. How can I check whether a directory is empty or not?
We can test for the exit status of ls:
if ls "$directory"/file.txt; then
    echo "file.txt found!"
else
    echo "file.txt not found."
fi
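The following idea relies on the directory's hard-link count rather than on its entries. On traditional Unix file systems a directory with no subdirectories has exactly two links ("." and its name in the parent), so this reports a directory holding only regular files as empty as well:
find "$dir" -maxdepth 0 -links 2 -exec echo "empty directory: {}" \;
Conversely, to find a directory that contains at least one subdirectory:
find "$dir" -maxdepth 0 -links +2 -exec echo "directory is non-empty" \;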
Most modern systems have an "ls -A" which explicitly omits "." and ".." from the directory listing:
if [ -n "$(ls -A somedir)" ]
then
    echo directory is non-empty
fi
This can be shortened to:
if [ "$(ls -A somedir)" ]
then
    echo directory is non-empty
fi
Another way, using Bash features, involves setting the special shell option which changes the behavior of globbing. Some people prefer to avoid this approach, because it's so drastically different and could severely alter the behavior of scripts.
Nevertheless, if you're willing to use this approach, it does greatly simplify this particular task:
shopt -s nullglob
if [[ -z $(echo *) ]]; then
    echo directory is empty
fi
It also simplifies various other operations:
shopt -s nullglob
for i in *.zip; do
    blah blah "$i"   # No need to check $i is a file.
done
Without the shopt, that would have to be:
for i in *.zip; do
    [[ -f $i ]] || continue   # If no .zip files, i becomes *.zip
    blah blah "$i"
done
(You may want to use the latter anyway, if there's a possibility that the glob may match directories in addition to files.)
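The ls -A test shown earlier can be wrapped in a small function. This is only a sketch: the function name is_empty_dir is made up here, and it assumes an ls that supports -A:

```shell
# is_empty_dir DIR: succeed if DIR is a directory with no entries
# other than "." and "..". (Hypothetical helper name.)
is_empty_dir() {
    [ -d "$1" ] || return 2
    [ -z "$(ls -A -- "$1")" ]
}

dir=$(mktemp -d)
is_empty_dir "$dir" && first=empty || first=non-empty
touch "$dir/.hidden"          # dot files count as entries too
is_empty_dir "$dir" && second=empty || second=non-empty
rm -rf "$dir"

echo "before: $first, after creating a dot file: $second"
```

Unlike the nullglob approach, this does not change any shell option, and it also notices hidden files.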
5. How can I convert all upper-case file names to lower case?
# tolower - convert file names to lower case
for file in *
do
    [ -f "$file" ] || continue                     # ignore non-existing names
    newname=$(echo "$file" | tr '[A-Z]' '[a-z]')   # lower-case version of file name
    [ "$file" = "$newname" ] && continue           # nothing to do
    [ -f "$newname" ] && continue                  # do not overwrite existing files
    mv "$file" "$newname"
done
Purists will insist on using
tr '[[:upper:]]' '[[:lower:]]'
in the above code, in case of non-ASCII (e.g. accented) letters in locales which have them.
This technique can also be used to replace all unwanted characters in a file name e.g. with '_' (underscore). The script is the same as above, only the "newname=..." line has changed.
# renamefiles - rename files whose names contain unusual characters
for file in *
do
    [ -f "$file" ] || continue                     # ignore non-existing names
    newname=$(echo "$file" | sed 's/[^a-zA-Z0-9_.]/_/g')
    [ "$file" = "$newname" ] && continue           # nothing to do
    [ -f "$newname" ] && continue                  # do not overwrite existing files
    mv "$file" "$newname"
done
The character class in [] contains all allowed characters; modify it as needed.
6. How can I use a logical AND in a shell pattern (glob)?
That can be achieved through the !() extglob operator. You'll need extglob set. It can be checked with:
$ shopt extglob
and set with:
$ shopt -s extglob
To warm up, we'll move all files starting with foo AND not ending with .d to directory foo_thursday.d:
$ mv foo!(*.d) foo_thursday.d
For the general case:
Delete all files containing Pink_Floyd AND not containing The_Final_Cut:
$ rm !(!(*Pink_Floyd*)|*The_Final_Cut*)
By the way: these kinds of patterns can be used with KornShell and KornShell93, too. They don't have to be enabled there, but are the default patterns.
7. Is there a function to return the length of a string?
The fastest way, not requiring external programs (but usable only with ["BASH"] and KornShell):
${#varname}
or
expr "$varname" : '.*'
(expr prints the number of characters matching the pattern .*, which is the length of the string)
or
expr length "$varname"
(for a BSD/GNU version of expr. Do not use this, because it is not ["POSIX"]).
8. How can I recursively search all files for a string?
On most recent systems (GNU/Linux/BSD), you would use grep -r pattern . to search all files from the current directory (.) downward.
You can use find if your grep lacks -r:
find . -type f -exec grep -l "$search" '{}' \;
The {} characters will be replaced with the current file name.
This command is slower than it needs to be, because find will call grep with only one file name, resulting in many grep invocations (one per file). Since grep accepts multiple file names on the command line, find can be instrumented to call it with several file names at once:
find . -type f -exec grep -l "$search" '{}' \+
The trailing '+' character instructs find to call grep with as many file names as possible, saving processes and resulting in faster execution. This example works for POSIX find, e.g. with Solaris.
GNU find uses a helper program called xargs for the same purpose:
find . -type f -print0 | xargs -0 grep -l "$search"
The -print0 / -0 options ensure that any file name can be processed, even ones containing blanks, TAB characters, or new-lines.
90% of the time, all you need is:
Have grep recurse and print the lines (GNU grep):
grep -r "$search" .
Have grep recurse and print only the names (GNU grep):
grep -r -l "$search" .
The find command can be used to run arbitrary commands on every file in a directory (including sub-directories). Replace grep with the command of your choice. The curly braces {} will be replaced with the current file name in the case above.
(Note that they must be escaped in some shells, but not in ["BASH"].)
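As a self-contained illustration of the find ... -exec ... + form, the following sketch builds a throwaway directory tree (all file names and the search string are invented for the example) and lists the files that contain the string:

```shell
search=needle
top=$(mktemp -d)
mkdir -p "$top/sub dir"
echo "a needle in here" > "$top/sub dir/match.txt"
echo "nothing to see"   > "$top/other.txt"

# -exec ... {} + passes many file names to each grep invocation.
# The cd happens inside the command substitution, so the caller's
# working directory is unchanged.
found=$(cd "$top" && find . -type f -exec grep -l "$search" {} +)
rm -rf "$top"

printf '%s\n' "$found"
```

The file name containing a space is handled correctly, because find hands the names to grep directly instead of passing them through the shell.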
9. My command line produces no output: tail -f logfile | grep 'ssh'
Most standard Unix commands buffer their output if used non-interactively. This means that they don't write each character (or even each line) as soon as it is ready, but collect a larger amount (e.g. 4 kilobytes) before printing it. In the case above, the tail command buffers its output, and therefore grep only gets its input in e.g. 4K blocks.
Unfortunately there's no easy solution to this, because the behaviour of the standard programs would need to be changed. *See bottom of section before taking 'no easy solution' to heart*
Some programs provide special command line options for this purpose, e.g.
grep (e.g. GNU version 2.5.1) | --line-buffered
sed (e.g. GNU version 4.0.6) | -u, --unbuffered
awk (some GNU versions) | -W interactive, or use the fflush() function
tcpdump, tethereal | -l
The expect package (http://expect.nist.gov/) has an unbuffer example program, which can help here. It disables buffering for the output of a program.
Example usage:
unbuffer tail -f logfile | grep 'ssh'
There is another option when you have more control over the creation of the log file. If you would like to grep the real-time log of a text interface program which does buffered session logging by default (or you were using script to make a session log), then try this instead:
$ program | tee -a program.log
In another window:
$ tail -f program.log | grep whatever
Apparently this works because tee produces unbuffered output. This has only been tested on GNU tee, YMMV.
Another solution is to use the less command in follow mode:
$ less program.log
Then enter your search pattern (/ is search in less, like vi):
/ssh
Next, put less into follow mode by pressing Shift+F.
That's all there is to it!
10. How can I recreate a directory structure, without the files?
With the cpio program:
cd "$srcdir"
find . -type d -print | cpio -pdumv "$dstdir"
or with GNU-tar, and less obscure syntax:
cd "$srcdir"
find . -type d -print | tar c --files-from - --no-recursion | tar x --directory "$dstdir"
This creates a list of directory names with find, non-recursively adds just the directories to an archive, and pipes it to a second tar instance to extract it at the target location.
11. How can I print the n'th line of a file?
The dirty (but not quick) way would be sed -n ${n}p "$file" but this reads the whole input file, even if you only wanted the third line.
The following sed command line reads a file printing nothing (-n). At line $n the command "p" is run, printing it, with a "q" afterwards: quit the program.
sed -n "$n{p;q;}" "$file"
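The early-exit sed command can be wrapped in a function. A minimal sketch; the function name nth_line and the four-line sample file are assumptions made for the example:

```shell
# nth_line N FILE: print line N of FILE, stopping as soon as it is printed.
# (Hypothetical helper name.)
nth_line() {
    sed -n "$1{p;q;}" "$2"
}

tmpfile=$(mktemp)
printf '%s\n' alpha beta gamma delta > "$tmpfile"
third=$(nth_line 3 "$tmpfile")
rm -f "$tmpfile"

echo "$third"
```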
12. A program (e.g. a file manager) lets me define an external command that an argument will be appended to - but I need that argument somewhere in the middle...
sh -c 'echo "$1"' -- hello
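The idea is to wrap the real command in sh -c: whatever the calling program appends becomes the positional parameter $1 of the inline script, which can then place it anywhere. A sketch, with the surrounding words chosen arbitrarily; the -- only fills the $0 slot so that the appended argument lands in $1:

```shell
# The calling program would append its argument (here: hello) at the end.
result=$(sh -c 'echo "before $1 after"' -- hello)
echo "$result"
```

In the program's configuration, everything up to and including the -- would be entered as the external command.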
13. How can I concatenate two variables?
There is no concatenation operator for strings (either literal or variable dereferences) in the shell. The strings are just written one after the other:
var=$var1$var2
If the right-hand side contains whitespace characters, it needs to be quoted:
var="$var1 - $var2"
Braces can be used to disambiguate the right-hand side:
var=${var1}xyzzy    # without braces, var1xyzzy would be interpreted as a variable name
# Another equivalent way would be:
var="$var1"xyzzy
CommandSubstitution can be used as well. The following line creates a log file name logname containing the current date, resulting in names like e.g. log.2004-07-26:
logname="log.$(date +%Y-%m-%d)"
Appending data to the end of a string doesn't require any black magic, either.
string="$string more data here"
Bash 3.1 has a new += operator that you may see from time to time:
string+=" more data here" # EXTREMELY non-portable!
It's generally best to use the portable syntax.
14. How can I redirect the output of multiple commands at once?
Redirecting the standard output of a single command is as easy as
date > file
To redirect standard error:
date 2> file
To redirect both:
date > file 2>&1
In a loop or other larger code structure:
for i in $list; do
    echo "Now processing $i"
    # more stuff here...
done > file 2>&1
However, this can become tedious if the output of many programs should be redirected. If all output of a script should go into a file (e.g. a log file), the exec command can be used:
# redirect both standard output and standard error to "log.txt"
exec > log.txt 2>&1
# all output including stderr now goes into "log.txt"
Otherwise command grouping helps:
{
    date
    # some other command
    echo done
} > messages.log 2>&1
In this example, the output of all commands within the curly braces is redirected to the file messages.log.
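A runnable sketch of command grouping, using a temporary file in place of a real log (the file and the two messages are invented for the demonstration):

```shell
logfile=$(mktemp)   # hypothetical log file for the demonstration

# Both streams of every command in the group go to the same file.
{
    echo "to stdout"
    echo "to stderr" >&2
} > "$logfile" 2>&1

captured=$(cat "$logfile")
rm -f "$logfile"
printf '%s\n' "$captured"
```

The redirection is written once, after the closing brace, and applies to the group as a whole.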
15. How can I run a command on all files with the extension .gz?
Often a command already accepts several files as arguments, e.g.
zcat *.gz
(On some systems, you would use gzcat instead of zcat. If neither is available, or if you don't care to play guessing games, just use gzip -dc instead.) If an explicit loop is desired, or if your command does not accept multiple filename arguments in one invocation, the for loop can be used:
for file in *.gz
do
    echo "$file"
    # do something with "$file"
done
To do it recursively, you should use a loop, plus the find command:
while read file; do
    echo "$file"
    # do something with "$file"
done < <(find . -name '*.gz' -print)
For more hints in this direction, see [#faq20 FAQ #20], below. To see why the find command comes after the loop instead of before it, see [#faq24 FAQ #24].
16. How can I remove a file name extension from a string, e.g. file.tar to file?
The easiest (and fastest) way is to use the following:
$ name="file.tar"
$ echo "${name%.tar}"
file
The ${var%pattern} syntax removes the pattern from the end of the variable. ${var#pattern} would remove pattern from the start of the string. This could be used to rename all files from "*.doc" to "*.txt":
for file in *.doc
do
    mv "$file" "${file%.doc}".txt
done
There's more to ParameterSubstitution, e.g. ${var%%pattern}, ${var##pattern}, ${var//old/new}.
Note that this extended form of ParameterSubstitution works with ["BASH"], KornShell, KornShell93, but not with the older BourneShell. If the code needs to be portable to that shell as well, sed could be used to remove the filename extension part:
for file in *.doc
do
    base=`echo "$file" | sed 's/\.[^.]*$//'`   # remove everything starting with last '.'
    mv "$file" "$base".txt
done
Finally, some GNU/Linux/BSD systems offer a rename command. There are multiple different rename commands out there with contradictory syntaxes. Consult your man pages to see which one you have (if any).
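The single and doubled forms of the % and # operators mentioned above differ in how much they remove. A short sketch, using a made-up path as the example value:

```shell
path=/tmp/archive.tar.gz   # hypothetical example value

name=${path##*/}      # longest match of */ removed from the front
short=${name%.*}      # shortest match of .* removed from the end
base=${name%%.*}      # longest match of .* removed from the end
ext=${name#*.}        # shortest match of *. removed from the front

echo "$name $short $base $ext"
```

Here name is archive.tar.gz, short is archive.tar, base is archive, and ext is tar.gz.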
17. How can I group expressions, e.g. (A AND B) OR C?
The TestCommand [ uses parentheses () for expression grouping. Given that "AND" is "-a", and "OR" is "-o", the following expression
(0<n AND n<=10) OR n=-1
can be written as follows:
if [ \( $n -gt 0 -a $n -le 10 \) -o $n -eq -1 ]
then
    echo "0 < $n <= 10, or $n=-1"
else
    echo "invalid number: $n"
fi
Note that the parentheses have to be quoted: \(, '(' or "(".
["BASH"] and KornShell have different, more powerful comparison commands with slightly different (easier) quoting:
ArithmeticExpression for arithmetic expressions, and
NewTestCommand for string (and file) expressions.
Examples:
if (( (n>0 && n<10) || n == -1 ))
then
    echo "0 < $n < 10, or n==-1"
fi
or
if [[ ( -f $localconfig && -f $globalconfig ) || -n $noconfig ]]
then
    echo "configuration ok (or not used)"
fi
Note that the distinction between numeric and string comparisons is strict. Consider the following example:
n=3
if [[ $n > 0 && $n < 10 ]]
then
    echo "$n is between 0 and 10"
else
    echo "ERROR: invalid number: $n"
fi
The output will be "ERROR: ....", because in a string comparison "3" is bigger than "10": "3" already sorts after "1", and the next character "0" is not considered. Changing the square brackets to double parentheses (( makes the example work as expected.
18. How can I use numbers with leading zeros in a loop, e.g. 01, 02?
As always, there are different ways to solve the problem, each with its own advantages and disadvantages.
If there are not many numbers, BraceExpansion can be used:
for i in 0{1,2,3,4,5,6,7,8,9} 10
do
    echo $i
done
Output:
01
02
03
[...]
This gets tedious for large sequences, but there are other ways, too. If the command seq is available, you can use it as follows:
seq -w 1 10
or, for arbitrary numbers of leading zeros (here: 3):
seq -f "%03g" 1 10
If you have the printf command (which is a Bash builtin, and is also POSIX standard), it can be used to format a number, too:
for ((i=1; i<=10; i++))
do
    printf "%02d " "$i"
done
The KornShell and KornShell93 have the typeset command to specify the number of leading zeros:
$ typeset -Z3 i=4
$ echo $i
004
Finally, the following example works with any BourneShell derived shell to zero-pad each line to three bytes:
i=0
while test $i -le 10
do
    echo "00$i"
    i=`expr $i + 1`
done | sed 's/.*\(...\)$/\1/g'
In this example, the number of '.' inside the parentheses in the sed statement determines how many total bytes from the echo command (at the end of each line) will be kept and printed.
One more addendum: in Bash 3, you can use:
printf "%03d \n" {1..300}
Which is slightly easier in some cases.
Also you can use the printf command with xargs and wget to fetch files:
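# {$START..$END} does not work: brace expansion happens before variable expansion.
for ((i=START; i<=END; i++)); do printf "%03d\n" "$i"; done | xargs -I% wget "$LOCATION/%"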
Sometimes a good solution.
19. How can I split a file into line ranges, e.g. lines 1-10, 11-20, 21-30?
Some Unix systems provide the split utility for this purpose:
split --lines 10 --numeric-suffixes input.txt output-
For more flexibility you can use sed. The sed command can print e.g. the line number range 1-10:
sed -n '1,10p'
This stops sed from printing each line (-n). Instead it only processes the lines in the range 1-10 ("1,10"), and prints them ("p"). sed still reads the input until the end, although we are only interested in lines 1 through 10. We can speed this up by making sed terminate immediately after printing line 10:
sed -n -e '1,10p' -e '10q'
Now the command will quit after reading line 10 ("10q"). The -e arguments indicate a script (instead of a file name). The same can be written a little shorter:
sed -n '1,10p;10q'
We can now use this to print an arbitrary range of a file (specified by line number):
file=/etc/passwd
range=10
firstline=1
maxlines=$(wc -l < "$file")    # count number of lines
while ((firstline <= maxlines))
do
    ((lastline = firstline + range - 1))    # e.g. 1-10, 11-20, 21-30
    sed -n -e "$firstline,${lastline}p" -e "${lastline}q" "$file"
    ((firstline = lastline + 1))
done
This example uses ["BASH"] and KornShell ArithmeticExpressions, which older [wiki:BourneShell Bourne shells] do not have. In that case the following example should be used instead:
file=/etc/passwd
range=10
firstline=1
maxlines=`wc -l < "$file"`    # count number of lines
while [ $firstline -le $maxlines ]
do
    lastline=`expr $firstline + $range - 1`    # e.g. 1-10, 11-20, 21-30
    sed -n -e "$firstline,${lastline}p" -e "${lastline}q" "$file"
    firstline=`expr $lastline + 1`
done
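The sed idiom can also be packaged as a function that prints one arbitrary range. A minimal sketch; the name print_range and the throwaway test file are assumptions for illustration only.

```shell
# print_range FILE FIRST LAST  (hypothetical helper)
# Prints lines FIRST through LAST of FILE and stops reading after LAST.
print_range() {
    sed -n "${2},${3}p;${3}q" "$1"
}

# Demo on a temporary 25-line file containing the numbers 1..25
tmp=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 25; i++) print i }' > "$tmp"
print_range "$tmp" 11 13    # prints 11, 12 and 13, one per line
rm -f "$tmp"
```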
20. How can I find and deal with file names containing newlines, spaces or both?
The preferred method is still to use
find ... -exec command {} \;
or, if you need to handle filenames en masse:
find ... -print0 | xargs -0 command
for GNU find/xargs, or (POSIX find):
find ... -exec command {} +
Use that unless you really can't.
Another way to deal with files with spaces in their names is to use the shell's filename expansion (["globbing"]). This has the disadvantage of not working recursively (except with zsh's extensions), but if you just need to process all the files in a single directory, it works fantastically well.
This example changes all the *.mp3 files in the current directory to use underscores in their names instead of spaces. (But it will not work in the original BourneShell.)
for file in *.mp3; do
    mv "$file" "${file// /_}"
done
You could do the same thing for all files (regardless of extension) by using
for file in *\ *; do
instead of *.mp3.
Another way to handle filenames recursively involves using the -print0 option of find (a GNU/BSD extension), together with bash's -d option for read:
unset a i
while IFS= read -r -d '' file; do
    a[i++]="$file"    # or however you want to process each file
done < <(find /tmp -type f -print0)
The preceding example reads all the files under /tmp (recursively) into an array, even if they have newlines or other whitespace in their names, by forcing read to use the NUL byte (\0) as its line delimiter. (An empty argument to -d means NUL; IFS= and -r keep read from stripping leading/trailing whitespace or interpreting backslashes.) Since NUL is not a valid byte in Unix filenames, this is the safest approach besides using find -exec.
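To see the NUL-delimited loop handle awkward names, here is a small sketch wrapped in a function. The function name collect_files and the global array name found are assumptions for illustration; the += array append needs Bash 3.1 or later.

```shell
# collect_files DIR  (hypothetical helper, Bash only)
# Fills the global array "found" with every regular file below DIR.
collect_files() {
    found=()
    local f
    while IFS= read -r -d '' f; do
        found+=("$f")
    done < <(find "$1" -type f -print0)
}

# Demo: names with a space and with an embedded newline survive intact
dir=$(mktemp -d)
touch "$dir/plain" "$dir/with space" "$dir/with"$'\n'"newline"
collect_files "$dir"
echo "found ${#found[@]} files"
rm -rf "$dir"
```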
21. How can I replace a string with another string in all files?
sed is a good command to replace strings, e.g.
sed 's/olddomain\.com/newdomain\.com/g' input > output
To replace a string in all files of the current directory:
for i in *; do
    sed 's/old/new/g' "$i" > atempfile && mv atempfile "$i"
done
GNU sed 4.x has a special -i flag which makes the temp file unnecessary. This flag is not part of POSIX sed; some other implementations (e.g. BSD sed) have a similar flag, but with a different syntax:
for i in *; do
    sed -i 's/old/new/g' "$i"
done
Those of you who have perl 5 can accomplish the same thing using this code:
perl -pi -e 's/old/new/g' *
Recursively:
find . -type f -print0 | xargs -0 perl -pi -e 's/old/new/g'
To replace, for example, every "unsigned" with "unsigned long", unless it is already followed by "int", "short", "long" or "char":
find . -type f -exec perl -i.bak -pe 's/\bunsigned\b(?!\s+(int|short|long|char))/unsigned long/g' {} +
Finally, here's a script that some people may find useful:
:
# chtext - change text in several files

# neither string may contain '|' unquoted
old='olddomain\.com'
new='newdomain\.com'

# if no files were specified on the command line, use all files:
[ $# -lt 1 ] && set -- *

for file
do
    [ -f "$file" ] || continue    # do not process e.g. directories
    [ -r "$file" ] || continue    # cannot read file - ignore it

    # Replace string, write output to temporary file. Terminate script in case of errors
    sed "s|$old|$new|g" "$file" > "$file"-new || exit

    # If the file has changed, overwrite original file. Otherwise remove copy
    if cmp "$file" "$file"-new >/dev/null 2>&1
    then
        rm "$file"-new            # file has not changed
    else
        mv "$file"-new "$file"    # file has changed: overwrite original file
    fi
done
If the code above is put into a script file (e.g. chtext), the resulting script can be used to change a text e.g. in all HTML files of the current and all subdirectories:
find . -type f -name '*.html' -exec chtext {} \;
Many optimizations are possible:
use a sed separator character other than '|', e.g. ^A (ASCII 1)
some implementations of sed (e.g. GNU sed) have an "-i" option that can change a file in-place; no temporary file is necessary in that case
the find command above could use either xargs or the -exec ... {} + form of POSIX find, so that chtext is started once for many files instead of once per file
Note: set -- * in the code above is safe with respect to files whose names contain spaces. The expansion of * by set is the same as the expansion done by for, and filenames will be preserved properly as individual parameters, and not broken into words on whitespace.
A more sophisticated example of chtext is here: http://www.shelldorado.com/scripts/cmds/chtext
22. How can I calculate with floating point numbers instead of just integers?
["BASH"] does not have built-in floating point arithmetic:
$ echo $((10/3))
3
For better precision, an external program must be used, e.g. bc, awk or dc:
$ echo "scale=3; 10/3" | bc
3.333
The "scale=3" command notifies bc that three digits of precision after the decimal point are required.
awk can be used for calculations, too:
$ awk 'BEGIN {printf "%.3f\n", 10 / 3}' /dev/null
3.333
There is a subtle but important difference between the bc and the awk solution here: bc reads commands and expressions from standard input. awk on the other hand evaluates the expression as part of the program. Expressions on standard input are not evaluated, i.e. echo 10/3 | awk '{print $0}' will print 10/3 instead of the evaluated result of the expression.
This explains why the example uses /dev/null as an input file for awk: the program evaluates the BEGIN action, evaluating the expression and printing the result. Afterwards the work is already done: it reads its standard input, gets an end-of-file indication, and terminates. If no file had been specified, awk would wait for data on standard input.
Newer versions of KornShell93 have built-in floating point arithmetic, together with mathematical functions like sin() or cos() .
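The awk approach is easy to wrap in shell functions. The following is a sketch under stated assumptions: the names fdiv and fless are made up for illustration, values are passed in with awk's -v option, and LC_ALL=C is set so that the decimal separator is always a dot.

```shell
# fdiv A B: print A / B with three decimals (hypothetical helper)
fdiv() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.3f\n", a / b }'
}

# fless A B: succeed (exit status 0) if A < B numerically (hypothetical helper)
fless() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { exit !(a < b) }'
}

fdiv 10 3                                     # prints 3.333
fless 1.5 2.25 && echo "1.5 is less than 2.25"
```

Because everything happens in the BEGIN action, no input file is needed for an awk that follows POSIX; very old implementations may still want /dev/null as described above.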
23. How do I append a string to the contents of a variable?
The shell doesn't have a string concatenation operator like Java ("+") or Perl ("."). The following example shows how to append the string ".2004-08-15" to the contents of the shell variable filename:
filename="$filename.2004-08-15"
If the variable name and the string to append could be confused, the variable name can be enclosed in braces, e.g.
filename="${filename}old"
instead of filename=$filenameold
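In ["BASH"] 3.1 and later there is also a += assignment operator, which appends to the current value. A short sketch (the variable contents are just an example); this operator is not available in BourneShell or plain POSIX sh:

```shell
filename=report
filename+=".2004-08-15"     # same effect as filename="$filename.2004-08-15"
echo "$filename"            # prints report.2004-08-15
```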
24. I set variables in a loop. Why do they suddenly disappear after the loop terminates?
The following command always prints "total number of lines: 0", although the variable linecnt has a larger value in the while loop:
linecnt=0
cat /etc/passwd | while read line
do
    linecnt=`expr $linecnt + 1`
done
echo "total number of lines: $linecnt"
The reason for this surprising behaviour is that a while/for/until loop runs in a subshell when its input or output is redirected from a pipeline. For the while loop above, a new subshell with its own copy of the variable linecnt is created (initial value, taken from the parent shell: "0"). This copy then is used for counting. When the while loop is finished, the subshell copy is discarded, and the original variable linecnt of the parent (whose value has not changed) is used in the echo command.
It's hard to tell when a shell will create a new process for a loop:
BourneShell creates it when the input or output is redirected, either by using a pipeline or by a redirection operator ('<', '>').
["BASH"] creates a new process only if the loop is part of a pipeline.
KornShell creates it only if the loop is part of a pipeline, but not if the loop is the last part of it.
To solve this, either use a method that works without a subshell (shown below), or make sure you do all processing inside that subshell (a bit of a kludge, but easier to work with):
linecnt=0
cat /etc/passwd |
(
    while read line ; do
        linecnt="$((linecnt+1))"
    done
    echo "total number of lines: $linecnt"
)
To avoid the subshell completely (not easily possible if the other part of the pipe is a command!), use redirection, which does not have this problem in ["BASH"] and KornShell (but still does in BourneShell):
linecnt=0
while read line ; do
    linecnt="$((linecnt+1))"
done < /etc/passwd
echo "total number of lines: $linecnt"
For ["BASH"], when the first part of the pipe is a command, you can use "process substitution". The command used here is a simple "echo -e $'a\nb\nc'" as a substitute for a command with a multiline output:
while read LINE; do
    echo "-> $LINE"
done < <(echo -e $'a\nb\nc')
A portable and common work-around is to redirect the input of the read command using exec:
linecnt=0
exec < /etc/passwd    # redirect standard input from the file /etc/passwd
while read line       # "read" gets its input from the file /etc/passwd
do
    linecnt=`expr $linecnt + 1`
done
echo "total number of lines: $linecnt"
This works as expected, and prints a line count for the file /etc/passwd. But the input is redirected from that file permanently. What if we need to read the original standard input sometime later again? In that case we have to save a copy of the original standard input file descriptor, which we later can restore:
exec 3<&0             # save original standard input file descriptor "0" as FD "3"
exec 0</etc/passwd    # redirect standard input from the file /etc/passwd

linecnt=0
while read line       # "read" gets its input from the file /etc/passwd
do
    linecnt=`expr $linecnt + 1`
done

exec 0<&3             # restore saved standard input (fd 0) from file descriptor "3"
exec 3<&-             # close the no longer needed file descriptor "3"

echo "total number of lines: $linecnt"
Subsequent exec commands can be combined into one line, which is interpreted left-to-right:
exec 3<&0
exec 0</etc/passwd
# ...read redirected standard input...
exec 0<&3
exec 3<&-
is equivalent to
exec 3<&0 0</etc/passwd
# ...read redirected standard input...
exec 0<&3 3<&-
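The difference between a piped loop and a redirected loop can be observed directly. A self-contained sketch for ["BASH"] (the variable names piped and redirected are chosen for illustration); it counts the same three lines twice:

```shell
# Loop as the last part of a pipeline: in Bash (default options) it runs in a subshell
piped=0
printf 'a\nb\nc\n' | while IFS= read -r line; do
    piped=$((piped + 1))
done

# Loop fed by process substitution: it runs in the current shell
redirected=0
while IFS= read -r line; do
    redirected=$((redirected + 1))
done < <(printf 'a\nb\nc\n')

# With Bash's default options, piped stays 0 while redirected becomes 3
echo "piped=$piped redirected=$redirected"
```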
25. How can I access positional parameters after $9?
Use ${10} instead of $10. This works for ["BASH"] and KornShell, but not for older BourneShell implementations. Another way to access arbitrary positional parameters after $9 is to use for, e.g. to get the last parameter:
for last
do
    : # nothing
done
echo "last argument is: $last"
To get an argument by number, we can use a counter:
n=12    # This is the number of the argument we are interested in
i=1
for arg
do
    if [ $i -eq $n ]
    then
        argn=$arg
        break
    fi
    i=`expr $i + 1`
done
echo "argument number $n is: $argn"
This has the advantage of not "consuming" the arguments. If this is no problem, the shift command discards the first positional arguments:
shift 11
echo "the 12th argument is: $1"
Although direct access to any positional argument is possible this way, it's hardly needed. The common way is to use the getopts builtin to process command line options (e.g. "-l", or "-o filename"), and then use either for or while to process all arguments in turn. An explanation of how to process command line arguments is available here: http://www.shelldorado.com/goodcoding/cmdargs.html
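In ["BASH"], an argument can also be picked by number with the "${@:offset:length}" expansion, which does not consume the arguments either. A minimal sketch; the function name nth_arg is an assumption for illustration.

```shell
# nth_arg N ARG...  (hypothetical helper, Bash only)
# Prints the N'th of the remaining arguments, or fails if there is none.
nth_arg() {
    local n=$1
    shift
    [ "$n" -ge 1 ] && [ "$n" -le "$#" ] || return 1
    printf '%s\n' "${@:n:1}"
}

nth_arg 12 a b c d e f g h i j k l m    # prints l
```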
26. How can I randomize (shuffle) the order of lines in a file?
randomize() {
    while IFS= read -r l ; do
        echo "0$RANDOM $l"
    done | sort -n | cut -d" " -f2-
}
Note: the leading 0 makes sure the sort key is still a number if the shell doesn't support $RANDOM. $RANDOM is supported by ["BASH"], KornShell and KornShell93, but it is not required by ["POSIX"] and is not available in BourneShell.
The same idea (printing random numbers in front of a line, and sorting the lines on that column) using other programs:
awk '
    BEGIN { srand() }
    { print rand() "\t" $0 }
' |
sort -n |    # Sort numerically on first (random number) column
cut -f2-     # Remove sorting column
This is faster than the previous solution, but will not work for very old AWK implementations (try "nawk", or "gawk", if available).
A related question we frequently see is, "How can I print a random line from a file?" The problem here is that you need to know in advance how many lines the file contains. Lacking that knowledge, you have to read the entire file through once just to count them -- or, you have to suck the entire file into memory. Let's explore both of these approaches.
n=$(wc -l < "$file")         # Count number of lines.
r=$((RANDOM % n + 1))        # Random number from 1..n.
sed -n "$r{p;q;}" "$file"    # Print the r'th line.
(These examples use the answer from [#faq11 FAQ 11] to print the n'th line.) The first one's pretty straightforward -- we use wc to count the lines, choose a random number, and then use sed to print the line. If we already happened to know how many lines were in the file, we could skip the wc command, and this would be a very efficient approach.
The next example sucks the entire file into memory. This approach saves time reopening the file, but obviously uses more memory.
oIFS=$IFS IFS=$'\n' lines=($(<"$file")) IFS=$oIFS
n=${#lines[@]}
r=$((RANDOM % n))
echo "${lines[r]}"
Note that we don't add 1 to the random number in this example, because the array of lines is indexed counting from 0.
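A third possibility, not part of the two approaches described here, is to let awk make the choice in a single pass: each line replaces the current pick with probability 1/NR, so neither a line count nor the whole file in memory is needed. This is a sketch only; the function name random_line is an assumption, and because srand() is seeded from the time of day, two calls within the same second may return the same line.

```shell
# random_line FILE  (hypothetical helper)
# Line number NR replaces the current pick with probability 1/NR.
random_line() {
    awk 'BEGIN { srand() }
         rand() * NR < 1 { pick = $0 }
         END { if (NR) print pick }' "$1"
}

tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"
random_line "$tmp"    # prints one of the three lines
rm -f "$tmp"
```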
Also, some people want to choose a random file from a directory (for a signature on an e-mail, or to choose a random song to play, or a random image to display, etc.). A similar technique can be used:
files=(*.ogg)                  # Or *.gif, or *
n=${#files[@]}                 # For aesthetics
xmms "${files[RANDOM % n]}"    # Choose a random element
27. How can two processes communicate using named pipes (fifos)?
NamedPipes, also known as FIFOs ("First In First Out"), are well suited for inter-process communication. The advantage over using files as a means of communication is that processes are synchronized by pipes: a process writing to a pipe blocks if there is no reader, and a process reading from a pipe blocks if there is no writer.
Here is a small example of a server process communicating with a client process. The server sends commands to the client, and the client acknowledges each command:
Server
# server - communication example

# Create a FIFO. Some systems don't have a "mkfifo" command, but use
# "mknod pipe p" instead
mkfifo pipe

while sleep 1
do
    echo "server: sending GO to client"

    # The following command will cause this process to block (wait)
    # until another process reads from the pipe
    echo GO > pipe

    # A client read the string! Now wait for its answer. The "read"
    # command again will block until the client wrote something
    read answer < pipe

    # The client answered!
    echo "server: got answer: $answer"
done
Client
# client

# We cannot start working until the server has created the pipe...
until [ -p pipe ]
do
    sleep 1    # wait for server to create pipe
done

# Now communicate...
while sleep 1
do
    echo "client: waiting for data"

    # Wait until the server sends us one line of data:
    read data < pipe

    # Received one line!
    echo "client: read <$data>, answering"

    # Now acknowledge that we got the data. This command
    # again will block until the server read it.
    echo ACK > pipe
done
Write both examples to files server and client respectively, and start them concurrently to see it working:
$ chmod +x server client
$ server & client &
server: sending GO to client
client: waiting for data
client: read <GO>, answering
server: got answer: ACK
server: sending GO to client
client: waiting for data
client: read <GO>, answering
server: got answer: ACK
server: sending GO to client
client: waiting for data
[...]
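For a quick experiment that terminates on its own, the same blocking behaviour can be shown in a single script. This is a sketch under assumptions: the FIFO lives in a temporary directory created with mktemp -d, and the writer is a background subshell that sends three lines and then closes the pipe.

```shell
dir=$(mktemp -d)
fifo=$dir/pipe
mkfifo "$fifo"

# Writer: blocks in the open until the reader below opens the FIFO
(
    for i in 1 2 3; do
        echo "message $i"
    done > "$fifo"
) &

# Reader: gets end-of-file once the writer has closed its end
received=0
while IFS= read -r line; do
    echo "got: $line"
    received=$((received + 1))
done < "$fifo"

wait            # collect the background writer
rm -rf "$dir"
```

Opening the FIFO once for the whole loop (rather than once per read, as in the server and client above) keeps the three messages in one stream.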
28. How do I determine the location of my script? I want to read some config files from the same place.
This is a complex question because there's no single right answer to it. Even worse: it's not possible to find the location reliably in 100% of all cases. All ways of finding a script's location depend on the name of the script, as seen in the predefined variable $0. But providing the script name in $0 is only a (very common) convention, not a requirement.
A tempting but suspect answer is "in some shells, $0 is always an absolute path, even if you invoke the script using a relative path, or no path at all". This is not reliable across shells: some of them, including ["BASH"], return the actual command typed in by the user instead of the fully qualified path. In those cases, if all you want is the fully qualified version of $0, you can use something like this (["BASH"] and KornShell, non-Bourne):
[[ $0 = /* ]] && echo "$0" || echo "$PWD/$0"
Or the BourneShell version:
case $0 in
    /*) echo "$0";;
    *)  echo "`pwd`/$0";;
esac
However, this approach has some major drawbacks. The most important is that the script name (as seen in $0) may not be relative to the current working directory, but relative to a directory from the program search path $PATH (this is often seen with KornShell).
Another drawback is that there is really no guarantee that your script is still in the same place it was when it first started executing. Suppose your script is loaded from a temporary file which is then unlinked immediately... your script might not even exist on disk any more! The script could also have been moved to a different location while it was executing. Or (and this is most likely by far...) there might be multiple links to the script from multiple locations, one of them being a simple symlink from a common PATH directory like /usr/local/bin, which is how it's being invoked. Your script might be in /opt/foobar/bin/script but the naive approach of reading $0 won't tell you that.
(For a more general discussion of the Unix file system and how symbolic links affect your ability to know where you are at any given moment, see [http://www.cs.bell-labs.com/sys/doc/lexnames.html this Plan 9 paper].)
So if the name in $0 is a relative one, i.e. does not start with '/', we can still try to search the script like the shell would have done: in all directories from $PATH.
The following script shows how this could be done:
myname=$0
if [ -s "$myname" ] && [ -x "$myname" ]
then                   # $myname is already a valid file name
    mypath=$myname
else
    case "$myname" in
    /*) exit 1;;       # absolute path - do not search PATH
    *)
        # Search all directories from the PATH variable. Take
        # care to interpret leading and trailing ":" as meaning
        # the current directory; the same is true for "::" within
        # the PATH.
        for dir in `echo "$PATH" | sed 's/^:/.:/g;s/::/:.:/g;s/:$/:./;s/:/ /g'`
        do
            [ -f "$dir/$myname" ] || continue    # no file
            [ -x "$dir/$myname" ] || continue    # not executable
            mypath=$dir/$myname
            break                                # only return first matching file
        done
        ;;
    esac
fi

if [ -f "$mypath" ]
then
    : # echo >&2 "DEBUG: mypath=<$mypath>"
else
    echo >&2 "cannot find full path name: $myname"
    exit 1
fi

echo >&2 "path of this script: $mypath"
Note that $mypath is not necessarily an absolute path name. It still can contain relative parts like ../bin/myscript.
Generally storing data files in the same directory as their scripts is a bad practice. The Unix file system layout assumes that files in one place (e.g. /bin) are executable programs, while files in another place (e.g. /etc) are data files. (Let's ignore legacy Unix systems with programs in /etc for the moment, shall we....)
It really makes the most sense to keep your script's configuration in a single, static location such as $SCRIPTROOT/etc/foobar.conf. If you need to define multiple configuration files, then you can have a directory (say, /var/lib/foobar or /usr/local/lib/foobar), and read that directory's location from a variable in /etc/foobar.conf. If you don't even want that much to be hard-coded, you could pass the location of foobar.conf as a parameter to the script. If you need the script to assume certain defaults in the absence of /etc/foobar.conf, you can put defaults in the script itself, and/or fall back to something like $HOME/.foobar.conf if /etc/foobar.conf is missing. (This depends on what your script does. In some cases, it may make more sense to abort gracefully.)
29. How can I display the value of a symbolic link on standard output?
The external command readlink can be used to display the value of a symbolic link.
$ readlink /bin/sh
bash
You can also use GNU find's %l directive, which is especially useful if you need to resolve links in batches:
$ find /bin/ -type l -printf '%p points to %l\n'
/bin/sh points to bash
/bin/bunzip2 points to bzip2
...
If your system lacks readlink, you can use a function like this one:
readlink() {
    local path=$1 ll

    if [ -L "$path" ]; then
        ll="$(LC_ALL=C ls -l "$path" 2> /dev/null)" &&
        echo "${ll/* -> }"
    else
        return 1
    fi
}
30. How can I rename all my *.foo files to *.bar?
Some GNU/Linux distributions have a rename command, which you can use for this purpose; however, the syntax differs from one distribution to the next, so it's not a portable answer.
You can do it in POSIX shells like this:
for f in *.foo; do mv "$f" "${f%.foo}.bar"; done
This invokes the external command mv once for each file, so it may not be as efficient as some of the rename implementations.
If you want to do it recursively, then it becomes much more challenging. This example works (in ["BASH"]) as long as no files have newlines in their names:
find . -name '*.foo' -print | while IFS=$'\n' read -r f; do
    mv "$f" "${f%.foo}.bar"
done
Another common form of this question is "How do I rename all my MP3 files so that they have underscores instead of spaces?" You can use this:
for f in *\ *.mp3; do mv "$f" "${f// /_}"; done
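If newlines in file names are a concern for the recursive case, one option is to let find hand the names to a small inline sh script with -exec ... {} +, so that the names are never passed through a pipe. A sketch only: the function name rename_foo_to_bar is an assumption, and the extensions are hard-coded to match the example above.

```shell
# rename_foo_to_bar DIR  (hypothetical helper)
# Renames every *.foo file below DIR to *.bar, whatever the names contain.
rename_foo_to_bar() {
    find "$1" -type f -name '*.foo' -exec sh -c '
        for f do
            mv -- "$f" "${f%.foo}.bar"
        done
    ' sh {} +
}

# Demo in a throwaway directory
dir=$(mktemp -d)
mkdir "$dir/sub"
touch "$dir/a b.foo" "$dir/sub/c.foo"
rename_foo_to_bar "$dir"
ls -R "$dir"
rm -rf "$dir"
```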
31. What is the difference between the old and new test commands ([ and [[)?
[ ("test" command) and [[ ("new test" command) are both used to evaluate expressions. Some examples:
if [ -z "$variable" ]
then
    echo "variable is empty!"
fi

if [ ! -f "$filename" ]
then
    echo "not a valid, existing file name: $filename"
fi
and
if [[ ! -e $file ]]
then
    echo "directory entry does not exist: $file"
fi

if [[ $file0 -nt $file1 ]]
then
    echo "file $file0 is newer than $file1"
fi
To cut a long story short: [ implements the old, portable syntax of the command. Although all modern shells have built-in implementations, there usually still is an external executable of that name, e.g. /bin/[. [[ is a new improved version of it, which is a keyword, not a program. This has beneficial effects on the ease of use, see below. [[ is understood by KornShell, ["BASH"] (e.g. 2.03) and KornShell93, but it is not specified by ["POSIX"] and is not understood by the older BourneShell.
Although [ and [[ have much in common, and share many expression operators like "-f", "-s", "-n", "-z", there are some notable differences. Here is a comparison list:
Feature | new test [[ | old test [ | Example |
string comparison | > | (not available) | - |
string comparison | < | (not available) | - |
string comparison | == (or =) | = | - |
string comparison | != | != | - |
expression grouping | && | -a | [[ -n $var && -f $var ]] && echo "$var is a file" |
expression grouping | || | -o | - |
Pattern matching | = | (not available) | [[ $name = a* ]] || echo "name does not start with an 'a': $name" |
In-process regular expression matching | =~ | (not available) | [[ $(date) =~ ^Fri\ ...\ 13 ]] && echo "It's Friday the 13th!" |
Special primitives that [[ is defined to have, but [ may be lacking (depending on the implementation):
Description | Primitive | Example |
entry (file or directory) exists | -e | [[ -e $config ]] && echo "config file exists: $config" |
file is newer/older than other file | -nt / -ot | [[ $file0 -nt $file1 ]] && echo "$file0 is newer than $file1" |
two files are the same | -ef | [[ $input -ef $output ]] && { echo "will not overwrite input file: $input"; exit 1; } |
negation | ! | - |
But there are more subtle differences.
No field splitting will be done for [[ (and therefore many arguments need not be quoted):
file="file name"
[[ -f $file ]] && echo "$file is a file"
will work even though $file is not quoted and contains whitespace. With [ the variable needs to be quoted:
file="file name"
[ -f "$file" ] && echo "$file is a file"
This makes [[ easier to use and less error prone.
No file name generation will be done for [[. Therefore the following line tries to match the contents of the variable $path with the pattern /*
[[ $path = /* ]] && echo "\$path starts with a forward slash /: $path"
The next command most likely will result in an error, because /* is subject to file name generation:
[ $path = /* ] && echo "this does not work"
[[ is strictly used for strings and files. If you want to compare numbers, use an ArithmeticExpression ((expression)), e.g.
i=0
while ((i<10))
do
    echo $i
    ((i=$i+1))
done
When should the new test command [[ be used, and when the old one [? If portability to the BourneShell is a concern, the old syntax should be used. If on the other hand the script requires ["BASH"] or KornShell, the new syntax could be preferable.
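When =~ is used in ["BASH"] 3 or later, the parts of the string matched by parenthesized groups are available in the BASH_REMATCH array. The sketch below stores the regular expression in a variable, which sidesteps the quoting differences between Bash versions; the function name parse_date and the variables year, month and day are assumptions for illustration.

```shell
# parse_date YYYY-MM-DD  (hypothetical helper, Bash 3 or later)
# On success sets the global variables year, month and day.
parse_date() {
    local re='^([0-9]{4})-([0-9]{2})-([0-9]{2})$'
    [[ $1 =~ $re ]] || return 1
    year=${BASH_REMATCH[1]}
    month=${BASH_REMATCH[2]}
    day=${BASH_REMATCH[3]}
}

parse_date 2004-08-15 && echo "$year / $month / $day"    # prints 2004 / 08 / 15
```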
32. How can I redirect the output of 'time' to a variable or file?
In ["BASH"], time is a shell keyword rather than an ordinary command: it times a whole pipeline and writes its report to the shell's standard error, not to that of the timed command. A redirection placed directly on the timed command therefore does not capture the report; the time command has to be wrapped in a new shell, a subshell or a command group, and the redirection applied to that.
- File Redirection
bash -c "time ls" > /path/to/foo 2>&1
( time ls ) > /path/to/foo 2>&1
{ time ls; } > /path/to/foo 2>&1
- Variable Redirection
foo=$( bash -c "time ls" 2>&1 )
foo=$( ( time ls ) 2>&1 )
foo=$( { time ls; } 2>&1 )
Note: Using 'bash -c' starts a new shell process and ( ) creates a subshell; using { } does neither. Do with that as you wish.
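If only the elapsed time is wanted, Bash's TIMEFORMAT variable can reduce the report to a single number before it is captured. A sketch under assumptions: the function name time_it and the global variable elapsed are made up for illustration, and the timed command's own output is discarded so that only the report reaches the variable.

```shell
# time_it COMMAND [ARGS...]  (hypothetical helper, Bash only)
# Sets the global variable "elapsed" to the real (wall clock) time in seconds.
time_it() {
    local TIMEFORMAT=%R    # report only the elapsed real time
    elapsed=$( { time "$@" >/dev/null 2>&1; } 2>&1 )
}

time_it sleep 1
echo "sleep 1 took $elapsed seconds"
```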
33. How can I find a process ID for a process given its name?
Usually a process is referred to using its process ID (PID), and the ps command can display the information for any process given its process ID, e.g.
$ echo $$    # my process id
21796
$ ps -p 21796
  PID TTY          TIME CMD
21796 pts/5    00:00:00 ksh
But frequently the process ID for a process is not known, but only its name. Some operating systems, e.g. Solaris, BSD, and some versions of Linux have a dedicated command to search for a process given its name, called pgrep:
$ pgrep init
1
Often there is an even more specialized program available to not just find the process ID of a process given its name, but also to send a signal to it:
$ pkill myprocess
Some systems also provide pidof. It differs from pgrep in that multiple output process IDs are only space separated, not newline separated.
$ pidof cron
5392
If these programs are not available, a user can search the output of the ps(1) command using grep.
The major problem when grepping the ps output is that grep may match its own ps entry (try: ps aux | grep init). To make matters worse, this does not happen every time; the technical name for this is a "race condition". There are several ways to avoid it:
- Using grep -v at the end
ps aux | grep name | grep -v grep
will throw away all lines containing "grep" from the output. Disadvantage: You always get the exit status of the grep -v, so you can't e.g. check if a specific process exists.
- Using grep -v in the middle
ps aux | grep -v grep | grep name
This does exactly the same, except that the exit status of "grep name" is accessible and tells you whether "name" is a process in ps or not. It still has the disadvantage of starting an extra process (grep -v).
- Using [] in grep
ps aux | grep '[n]ame'
This spawns only the needed grep process. The trick is to use a [] character class (regular expressions). Putting only one character in a character class normally makes no sense at all, because [c] will always match just "c". In this case, it's the same: grep '[n]ame' searches for "name". But as grep's own process list entry is what you executed ("grep [n]ame") and not "grep name", it will not match itself. (The quotes keep the shell from treating [n]ame as a glob pattern.)
===BEGIN greycat rant===
Most of the time when someone asks a question like this, it's because they want to manage a long-running daemon using primitive shell scripting techniques. Common variants are "How can I get the PID of my foobard process.... so I can start one if it's not already running" or "How can I get the PID of my foobard process... because I want to prevent the foobard script from running if foobard is already active." Both of these questions will lead to seriously flawed production systems.
If what you really want is to restart your daemon whenever it dies, just do this:
while true; do
    mydaemon --in-the-foreground
done
where --in-the-foreground is whatever switch, if any, you must give to the daemon to PREVENT IT from automatically backgrounding itself. (Often, -d does this and has the additional benefit of running the daemon with increased verbosity.) Self-daemonizing programs may or may not be the target of a future greycat rant....
If that's too simplistic, look into [http://cr.yp.to/daemontools.html daemontools] or [http://smarden.org/runit/ runit], which are programs for managing services.
If what you really want is to prevent multiple instances of your program from running, then the only sure way to do that is by using a lock. For details on doing this, see ProcessManagement or [#faq45 FAQ 45].
===END greycat rant===
34. Can I do a spinner in Bash?
Sure.
i=1
sp="/-\|"
echo -n ' '
while true
do
    echo -en "\b${sp:i++%${#sp}:1}"
done
You can also use \r instead of \b. You can use pretty much any character sequence you want as well. If you want it to slow down, put a sleep command inside the loop. A similar technique can be used to build progress bars.
35. How can I handle command-line arguments to my script easily?
Well, that depends a great deal on what you want to do with them. Here's a general template that might help for the simple cases:
while [[ $1 == -* ]]; do
    case "$1" in
      -h|--help) show_help; exit 0;;
      -v) verbose=1; shift;;
      -f) output_file=$2; shift 2;;
      --) shift; break;;
      -*) echo "invalid option: $1" >&2; exit 1;;    # avoids looping forever on unknown switches
    esac
done
# Now all of the remaining arguments are the filenames which followed
# the optional switches. You can process those with "for i" or "$@".
For more complex/generalized cases, or if you want things like "-xvf" to be handled as three separate flags, you can use getopts or getopt.
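For comparison, here is a sketch of the same three switches handled with the getopts builtin, which also accepts combined flags such as "-vf file". The function name parse_args and the array name remaining are assumptions for illustration; the option letters mirror the template above.

```shell
# parse_args [-h] [-v] [-f FILE] [ARG...]  (hypothetical helper, Bash)
# Sets the globals verbose and output_file, and the array "remaining".
parse_args() {
    verbose=0
    output_file=
    local opt OPTIND=1
    while getopts ':hvf:' opt; do
        case $opt in
            h)  echo "usage: parse_args [-h] [-v] [-f file] [args...]"; return 0;;
            v)  verbose=1;;
            f)  output_file=$OPTARG;;
            :)  echo "option -$OPTARG requires an argument" >&2; return 2;;
            \?) echo "invalid option: -$OPTARG" >&2; return 2;;
        esac
    done
    shift $((OPTIND - 1))    # drop the options that were processed
    remaining=("$@")
}

parse_args -vf out.txt one "two words"
echo "verbose=$verbose output_file=$output_file args=${#remaining[@]}"
```

The leading colon in the option string puts getopts into silent mode, so the script prints its own error messages.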
36. How can I get all lines that are: in both of two files (set intersection) or in only one of two files (set subtraction).
Use the comm(1) command.
# intersection of file1 and file2
comm -12 <(sort file1) <(sort file2)

# subtraction of file1 from file2
comm -13 <(sort file1) <(sort file2)
Read the comm(1) manpage for details.
If for some reason you lack the core comm(1) program, you can use these other methods:
An amazingly simple and fast implementation, which took just 20 seconds to match a 30k line file against a 400k line file for me.
Note that it probably only works with GNU grep, and that the file specified with -f will be loaded into RAM, so it doesn't scale for very large files.
It has grep read one of the sets as a pattern list from a file (-f), and interpret the patterns as plain strings not regexps (-F), matching only whole lines (-x).
# intersection of file1 and file2
grep -xF -f file1 file2

# subtraction of file1 from file2
grep -vxF -f file1 file2
An implementation using sort and uniq:
# intersection of file1 and file2
# (assuming neither file1 nor file2 contains repeated lines)
sort file1 file2 | uniq -d

# file1-file2 (Subtraction)
sort file1 file2 file2 | uniq -u

# same way for file2 - file1, change last file2 to file1
sort file1 file2 file1 | uniq -u
Another implementation of subtraction:
cat file1 file1 file2 | sort | uniq -c | awk '{ if ($1 == 2) { $1 = ""; print; } }'
This may introduce an extra space at the start of the line; if that's a problem, just strip it away.
Also, this approach assumes that neither file1 nor file2 has any duplicates in it.
Finally, it sorts the output for you. If that's a problem, then you'll have to abandon this approach altogether. Perhaps you could use awk's associative arrays (or perl's hashes or tcl's arrays) instead.
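The awk associative array idea can be sketched as follows. The function names intersect and subtract are assumptions for illustration. Both keep the lines in the order in which they appear in the second file, and both rely on the NR == FNR idiom, which assumes that the first file is not empty.

```shell
# intersect FILE1 FILE2: lines of FILE2 that also occur in FILE1 (hypothetical helper)
intersect() {
    awk 'NR == FNR { seen[$0]; next } $0 in seen' "$1" "$2"
}

# subtract FILE1 FILE2: lines of FILE2 that do not occur in FILE1 (hypothetical helper)
subtract() {
    awk 'NR == FNR { seen[$0]; next } !($0 in seen)' "$1" "$2"
}

# Demo with two small throwaway files
f1=$(mktemp) f2=$(mktemp)
printf 'a\nb\nc\n' > "$f1"
printf 'c\nd\na\n' > "$f2"
intersect "$f1" "$f2"    # prints c and a, in file2's order
subtract "$f1" "$f2"     # prints d
rm -f "$f1" "$f2"
```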
37. How can I print text in various colors?
Do not hard-code ANSI color escape sequences in your program! The tput command lets you interact with the terminal database in a sane way.
tput setaf 1; echo this is red
tput setaf 2; echo this is green
tput setaf 0; echo now we are back in black
tput reads the terminfo database which contains all the escape codes necessary for interacting with your terminal, as defined by the $TERM variable. For more details, see the terminfo(5) man page.
If you don't know in advance what your user's terminal's default text color is, you can use tput sgr0 to reset the colors to their default settings. This also removes boldface (tput bold), etc.
38. How do Unix file permissions work?
See ["Permissions"].
39. What are all the dot-files that bash reads?
See DotFiles.
40. How do I use dialog to get input from the user?
foo=$(dialog --inputbox "text goes here" 8 40 2>&1 >/dev/tty)
echo "The user typed '$foo'"
The redirection here is a bit tricky.
The foo=$(command) is set up first, so the standard output of the command is being captured by bash.
Inside the command, the 2>&1 causes standard error to be sent to where standard out is going -- in other words, stderr will now be captured.
>/dev/tty sends standard output to the terminal, so the dialog box will be seen by the user. Standard error will still be captured, however.
Another common dialog(1)-related question is how to dynamically generate a dialog command that has items which must be quoted (either because they're empty strings, or because they contain internal white space). One can use eval for that purpose, but the cleanest way to achieve this goal is to use an array.
unset m; i=0
words=(apple banana cherry "dog droppings")
for w in "${words[@]}"; do
    m[i++]=$w; m[i++]=""
done
dialog --menu "Which one?" 12 70 9 "${m[@]}"
In the previous example, the loop that populates the m array could have been reading from a pipeline, a file, etc.
Recall that the construction "${m[@]}" expands to the entire contents of an array, but with each element implicitly quoted. It's analogous to the "$@" construct for handling positional parameters. For more details, see [#faq50 FAQ50] below.
Here's another example, using filenames:
files=(*.mp3)       # These may contain spaces, apostrophes, etc.
cmd=(dialog --menu "Select one:" 22 76 16); n=6
i=0
for f in "${files[@]}"; do
    cmd[n++]=$((i++)); cmd[n++]="$f"
done
choice=$("${cmd[@]}" 2>&1 >/dev/tty)
The user's choice will be stored in the choice variable, as an integer, which can in turn be used as an index into the files array.
A separate but useful function of dialog is to track progress of a process that produces output. Below is an example that uses dialog to track processes writing to a log file. In the dialog window, there is a tailbox where output is stored, and a msgbox with a clickable Quit. Clicking quit will cause trap to execute, removing the tempfile, and destroying the tail process.
# you cannot tail a nonexistent file, so always ensure it pre-exists!
rm -f dialog-tail.log; echo Initialize log >> dialog-tail.log
date >> dialog-tail.log
tempfile=`tempfile 2>/dev/null` || tempfile=/tmp/test$$
trap "rm -f $tempfile" 0 1 2 5 15
dialog --title "TAIL BOXES" \
       --begin 10 10 --tailboxbg dialog-tail.log 8 58 \
       --and-widget \
       --begin 3 10 --msgbox "Press OK " 5 30 \
       2>$tempfile &
mypid=$!;
for i in 1 2 3; do echo $i >> dialog-tail.log; sleep 1; done
echo Done. >> dialog-tail.log
wait $mypid;
41. How do I determine whether a variable contains a substring?
if [[ $foo = *bar* ]]
The above works in virtually all versions of Bash. Bash version 3 also allows regular expressions:
if [[ $foo =~ ab*c ]] # bash 3, matches abbbbcde, or ac, etc.
If you are programming in the BourneShell instead of Bash, there is a more portable (but less pretty) syntax:
case "$foo" in *bar*) .... ;; esac
This should allow you to match variables against globbing-style patterns. If you need a portable way to match variables against regular expressions, use grep or egrep.
if echo "$foo" | egrep some-regex >/dev/null; then ...
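The portable case form can be wrapped in a small function so that it reads more like a test. This is only a sketch; the name contains is made up here.

```shell
# contains STRING SUBSTRING: succeed if SUBSTRING occurs in STRING.
# (Hypothetical helper name.) The substring is quoted inside the pattern,
# so characters such as * or ? in it are matched literally.
contains() {
    case $1 in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

if contains "foobarbaz" "bar"; then
    echo "found it"
fi
```

Because no external program is started, this is also cheaper than piping into grep.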
42. How can I find out if a process is still running?
The kill command is used to send signals to a running process. As a convenience function, the signal "0", which does not exist, can be used to find out if a process is still running:
myprog &            # Start program in the background
daemonpid=$!        # ...and save its process id

while sleep 60
do
    if kill -0 $daemonpid       # Is the process still alive?
    then
        echo >&2 "OK - process is still running"
    else
        echo >&2 "ERROR - process $daemonpid is no longer running!"
        break
    fi
done
This is one of those questions that usually masks a much deeper issue. It's rare that someone wants to know whether a process is still running simply to display a red or green light to an operator. More often, there's some ulterior motive, such as the desire to ensure that some daemon which is known to crash frequently is still running, or to ensure mutually exclusive access to a resource, etc. For much better discussion of these issues, see ProcessManagement or [#faq33 FAQ #33].
43. How can I use array variables?
BASH and KornShell already have one-dimensional arrays indexed by a numerical expression, e.g.
host[0]="micky"
host[1]="minnie"
host[2]="goofy"
i=0
while (($i < ${#host[@]} ))
do
    echo "host number $i is ${host[i++]}"
done
The awkward expression ${#host[@]} returns the number of elements in the array host.
It's possible to assign multiple values to an array at once, but the syntax differs from BASH to KornShell:
# BASH
array=(one two three four)
# KornShell
set -A array -- one two three four
44. How can I use associative arrays or variable variables?
Sometimes it's convenient to have associative arrays, arrays indexed by a string. Perl calls them "hashes". KornShell93 already supports this kind of array:
# KornShell93 script - does not work with BASH
typeset -A homedir             # Declare KornShell93 associative array
homedir[jim]=/home/jim
homedir[silvia]=/home/silvia
homedir[alex]=/home/alex

for user in ${!homedir[@]}     # Enumerate all indices (user names)
do
    echo "Home directory of user $user is ${homedir[$user]}"
done
BASH (including version 3.x) does not (yet) support them. However, we could simulate this kind of array by dynamically creating variables like in the following example:
for user in jim silvia alex
do
    eval homedir_$user=/home/$user
done
This creates the variables
homedir_jim=/home/jim
homedir_silvia=/home/silvia
homedir_alex=/home/alex
with the corresponding content. Note the use of the eval command, which interprets a command line not just one time like the shell usually does, but twice. In the first step, the shell uses the input homedir_$user=/home/$user to create a new line homedir_jim=/home/jim. In the second step, caused by eval, this variable assignment is executed, actually creating the variable.
Print the variables using
for user in jim silvia alex
do
    varname=homedir_$user          # e.g. "homedir_jim"
    eval varcontent='$'$varname    # e.g. "/home/jim"
    echo "home directory of $user is $varcontent"
done
The eval line needs some explanation. In a first step the shell performs its normal variable expansion, replacing $varname with its value:
eval varcontent='$'$varname
becomes
eval varcontent=$homedir_jim
In a second step the eval re-evaluates the line, and converts this to
varcontent=/home/jim
Before starting to use dynamically created variables, think again of a simpler approach. If it still seems to be the best thing to do, have a look at the following disadvantages:
- it's hard to read and to maintain
- The variable names must match the regular expression ^[a-zA-Z_][a-zA-Z_0-9]*$ , i.e. a variable name cannot contain arbitrary characters but only letters, digits, and underscores. In the example above we could not, for instance, have processed the home directory of a user named hong-hu, because a dash '-' is not a valid part of a variable name.
- Quoting is hard to get right. If a content (not variable name) string can contain whitespace characters, it's hard to quote it right to preserve it.
Here is the summary. "var" is a constant prefix, "$index" contains index string, "$content" is the string to store. Note that quoting is absolutely essential here. A missing backslash \ or a wrong type of quote (e.g. apostrophes '...' instead of quotation marks "...") can (and probably will) cause the examples to fail:
- Set variables
eval "var$index=\"$content\"" # index must only contain characters from [a-zA-Z0-9_]
- Print variable content
eval "echo \"var$index=\$var$index\""
- Check if a variable is empty
if eval "[ -z \"\$var$index\" ]"
then echo "variable is empty: var$index"
fi
You've seen the examples. Now maybe you can go a step back and consider using AWK associative arrays, or a multi-line environment variable instead of dynamically created variables.
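If the script only has to run under bash, two builtin features can replace most uses of eval for this job. The sketch below assumes bash 3.1 or later: printf -v assigns to a variable whose name is computed at run time, and ${!name} reads one. The variable names are the same illustrative ones used above.

```shell
#!/usr/bin/env bash
# Assumes bash 3.1 or later; printf -v and ${!name} are not POSIX sh.
for user in jim silvia alex; do
    # Store a value in the variable named homedir_<user>.
    printf -v "homedir_$user" '%s' "/home/$user"
done

for user in jim silvia alex; do
    varname=homedir_$user
    # ${!varname} expands to the value of the variable named by $varname.
    echo "home directory of $user is ${!varname}"
done
```

The restriction on variable names still applies, but the content no longer passes through a second round of parsing, so whitespace and quotes in it need no special care.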
45. How can I ensure that only one instance of a script is running at a time (mutual exclusion)?
We need some means of mutual exclusion. One easy way is to use a "lock": any number of processes can try to acquire the lock simultaneously, but only one of them will succeed.
How can we implement this using shell scripts? Some people suggest creating a lock file, and checking for its presence:
# locking example -- WRONG

lockfile=/tmp/myscript.lock
if [ -f "$lockfile" ]
then                      # lock is already held
    echo >&2 "cannot acquire lock, giving up: $lockfile"
    exit 0
else                      # nobody owns the lock
    > "$lockfile"         # create the file
    #...continue script
fi
This example does not work, because there is a time window between checking and creating the file. Assume two processes are running the code at the same time. Both check if the lockfile exists, and both get the result that it does not exist. Now both processes assume they have acquired the lock -- a disaster waiting to happen. We need an atomic check-and-create operation, and fortunately there is one: mkdir, the command to create a directory:
# locking example -- CORRECT

lockdir=/tmp/myscript.lock
if mkdir "$lockdir"
then    # directory did not exist, but was created successfully
    echo >&2 "successfully acquired lock: $lockdir"
    # continue script
else
    echo >&2 "cannot acquire lock, giving up on $lockdir"
    exit 0
fi
The advantage over using a lock file is, that even when two processes call mkdir at the same time, only one process can succeed at most. This atomicity of check-and-create is ensured at the operating system kernel level.
Note that we cannot use "mkdir -p" to automatically create missing path components: "mkdir -p" does not return an error if the directory exists already, but that's the feature we rely upon to ensure mutual exclusion.
Now let's spice up this example by automatically removing the lock when the script finishes:
lockdir=/tmp/myscript.lock
if mkdir "$lockdir"
then
    echo >&2 "successfully acquired lock"

    # Remove lockdir when the script finishes, or when it receives a signal
    trap 'rm -rf "$lockdir"' 0    # remove directory when script finishes
    trap "exit 2" 1 2 3 15        # terminate script when receiving signal

    # Optionally create temporary files in this directory, because
    # they will be removed automatically:
    tmpfile=$lockdir/filelist
else
    echo >&2 "cannot acquire lock, giving up on $lockdir"
    exit 0
fi
This example provides reliable mutual exclusion. There is still the disadvantage that a stale lock directory could remain when the script is terminated with a signal not caught (or signal 9, SIGKILL), but it's a good step towards reliable mutual exclusion.
Instead of using mkdir we could also have used the program to create a symbolic link, ln -s.
For more discussion on these issues, see ProcessManagement.
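To see the atomic behaviour of mkdir without starting two scripts, the locking step can be put into functions and called twice in a row. This is only a demonstration sketch: acquire_lock and release_lock are made-up names, and the lock path contains the process ID purely so that the demo cannot collide with anything else. A real script would use a fixed, well-known path instead.

```shell
# acquire_lock DIR: try to take the lock; fail if somebody already holds it.
# release_lock DIR: give the lock back. (Hypothetical helper names.)
acquire_lock() {
    mkdir "$1" 2>/dev/null
}

release_lock() {
    rmdir "$1"
}

# Demo-only path; a real script would use one fixed name here.
lockdir=${TMPDIR:-/tmp}/myscript-demo.$$.lock

if acquire_lock "$lockdir"; then
    echo "first attempt: got the lock"
fi
if ! acquire_lock "$lockdir"; then
    echo "second attempt: lock is already held"
fi
release_lock "$lockdir"
```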
46. I want to check to see whether a word is in a list (or an element is a member of a set).
Let's suppose you have your "list" stored as a big string of words, with spaces in between them. (That's the most common case when people are asking this one.) What you actually want to do is determine whether the string " foo " (note the spaces around it) appears in the list. But since your list may not have leading/trailing spaces, you have to add them as well. So, here's the most portable way to do it:
if echo " $list " | grep " foo " >/dev/null; then ....
GNU grep seems to have a special -w extension which lets you avoid the spaces:
if echo "$list" | GNUgrep -q -w "foo"; then ....
Finally, if you want to use Bash builtins, you can do it thus:
if [[ " $list " = *\ foo\ * ]]; then ....
This is basically the same as the original grep -- we surround both the list and the word (foo) with spaces, and then do a simple text matching.
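The same space-padding trick also works with case, which needs neither grep nor bash's [[ command. A minimal sketch, with in_list as a made-up function name:

```shell
# in_list WORD LIST: succeed if WORD is one of the space-separated words
# in LIST. (Hypothetical helper name; should work in any POSIX shell.)
in_list() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

list="apple banana cherry"
in_list banana "$list" && echo "banana is in the list"
in_list nana "$list" || echo "nana is not in the list"
```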
47. How can I redirect stderr to a pipe?
A pipe can only carry stdout of a program. To pipe stderr through it, you need to redirect stderr to the same destination as stdout. Optionally you can close stdout or redirect it to /dev/null to only get stderr. Some sample code:
# - 'myprog' is an example for a program that outputs both, stdout and
#   stderr
# - after the pipe I will just use a 'cat', of course you can put there
#   what you want

# version 1: redirect stderr towards the pipe while stdout survives (both come
# mixed)
myprog 2>&1 | cat

# version 2: redirect stderr towards the pipe without getting stdout (it's
# redirected to /dev/null)
myprog 2>&1 >/dev/null | cat
# Note that '>/dev/null' comes after '2>&1', otherwise the stderr will also be directed to /dev/null

# version 3: redirect stderr towards the pipe while the "original" stdout gets
# closed
myprog 2>&1 >&- | cat
One may also pipe stderr only but keep stdout intact (without a priori knowledge of where the script's output is going). This is a bit trickier.
This has an obvious application with, e.g., dialog, which draws its windows (using ncurses) on stdout and returns its results on stderr. This can be a little inconvenient, because it seems to call for a temporary file, which we would like to avoid. (Although this is not necessary -- see [#faq40 FAQ #40] for more examples of using dialog specifically!)
On [http://www.tldp.org/LDP/abs/html/io-redirection.html TLDP], I've found the following trick:
# Redirecting only stderr to a pipe.

exec 3>&1                              # Save current "value" of stdout.
ls -l 2>&1 >&3 3>&- | grep bad 3>&-    # Close fd 3 for 'grep' (but not 'ls').
#              ^^^^            ^^^^
exec 3>&-                              # Now close it for the remainder of the script.

# Thanks, S.C.
To show it as a dialog one-liner:
exec 3>&1
dialog --menu Title 0 0 0 FirstItem FirstDescription 2>&1 >&3 3>&- | sed 's/First/Only/'
exec 3>&-
This keeps the dialog window working properly, while the output of dialog (returned on stderr) is what gets altered by sed.
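The fd 3 technique can be tried without dialog by using a stand-in function that writes to both streams. In this sketch the function both is made up, and the redirection is attached to a brace group instead of using exec, so fd 3 exists only for the duration of that one pipeline.

```shell
# both() is a stand-in for any program that writes to stdout and stderr.
both() {
    echo "to stdout"
    echo "to stderr" >&2
}

# For the group, fd 3 is a copy of the original stdout. Inside it, stderr
# is sent into the pipe, stdout is sent back to fd 3, and fd 3 itself is
# closed for the commands on both sides of the pipe.
{ both 2>&1 >&3 3>&- | sed 's/^/filtered: /' 3>&-; } 3>&1
```

Only the stderr line passes through sed. The relative order of the two output lines is not guaranteed, since they travel by different routes.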
48. Why should I never use eval?
"eval" is a common misspelling of "evil". The section dealing with spaces in file names used to include the following quote "helpful tool (which is probably not as safe as the \0 technique)", end quote.
Syntax : nasty_find_all [path] [command] <maxdepth>
# This code is evil and must never be used
export IFS=" "
[ -z "$3" ] && set -- "$1" "$2" 1
FILES=`find "$1" -maxdepth "$3" -type f -printf "\"%p\" "`
# warning, evilness
eval FILES=($FILES)
for ((I=0; I < ${#FILES[@]}; I++))
do
    eval "$2 \"${FILES[I]}\""
done
unset IFS
This script is supposed to recursively search for files with newlines and/or spaces in them, arguing that find -print0 | xargs -0 was unsuitable for some purposes such as multiple commands. It was followed by an instructional description on all the lines involved, which we'll skip.
To its defense, it works:
$ ls -lR
.:
total 8
drwxr-xr-x  2 vidar users 4096 Nov 12 21:51 dir with spaces
-rwxr-xr-x  1 vidar users  248 Nov 12 21:50 nasty_find_all

./dir with spaces:
total 0
-rw-r--r--  1 vidar users 0 Nov 12 21:51 file?with newlines
$ ./nasty_find_all . echo 3
./nasty_find_all
./dir with spaces/file
with newlines
$
But consider this:
$ touch "\"); ls -l $'\x2F'; #"
You just created a file called "); ls -l $'\x2F'; #
Now FILES will contain ""); ls -l $'\x2F'; #. When we do eval FILES=($FILES), it becomes
FILES=(""); ls -l $'\x2F'; #"
Which becomes the two statements FILES=(""); and ls -l / . Congratulations, you just allowed execution of arbitrary commands.
$ touch "\"); ls -l $'\x2F'; #"
$ ./nasty_find_all . echo 3
total 1052
-rw-r--r--   1 root root 1018530 Apr  6  2005 System.map
drwxr-xr-x   2 root root    4096 Oct 26 22:05 bin
drwxr-xr-x   3 root root    4096 Oct 26 22:05 boot
drwxr-xr-x  17 root root   29500 Nov 12 20:52 dev
drwxr-xr-x  68 root root    4096 Nov 12 20:54 etc
drwxr-xr-x   9 root root    4096 Oct  5 11:37 home
drwxr-xr-x  10 root root    4096 Oct 26 22:05 lib
drwxr-xr-x   2 root root    4096 Nov  4 00:14 lost+found
drwxr-xr-x   6 root root    4096 Nov  4 18:22 mnt
drwxr-xr-x  11 root root    4096 Oct 26 22:05 opt
dr-xr-xr-x  82 root root       0 Nov  4 00:41 proc
drwx------  26 root root    4096 Oct 26 22:05 root
drwxr-xr-x   2 root root    4096 Nov  4 00:34 sbin
drwxr-xr-x   9 root root       0 Nov  4 00:41 sys
drwxrwxrwt   8 root root    4096 Nov 12 21:55 tmp
drwxr-xr-x  15 root root    4096 Oct 26 22:05 usr
drwxr-xr-x  13 root root    4096 Oct 26 22:05 var
./nasty_find_all
./dir with spaces/file
with newlines
./
$
It doesn't take much imagination to replace ls -l with rm -rf or worse.
One might think these circumstances are obscure, but one should not be tricked by this. All it takes is one malicious user, or perhaps more likely, a benign user who left the terminal unlocked when going to the bathroom, wrote a funny php uploading script that doesn't sanity check file names or who made the same mistake as oneself in allowing arbitrary code execution (now instead of being limited to the www-user, an attacker can use nasty_find_all to traverse chroot jails and/or gain additional privileges), uses an IRC or IM client that's too liberal in the filenames it accepts for file transfers or conversation logs, etc.
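For comparison, here is a sketch of the \0 technique that needs no eval at all. It assumes bash (for read -d '') and a find that supports -print0, as GNU and BSD find do. The function name for_each_file is made up for this example, and unlike nasty_find_all it takes the command as separate words instead of a string.

```shell
#!/usr/bin/env bash
# Usage: for_each_file DIR COMMAND [ARGS...]
# Runs COMMAND ARGS... FILE once for every regular file below DIR.
# (Hypothetical helper; assumes bash and a find with -print0.)
for_each_file() {
    local dir=$1 file
    shift
    # Each name arrives terminated by a NUL byte, the one character that
    # cannot occur in a path, so nothing ever has to be quoted or re-parsed.
    # Note: the command inherits find's output as its stdin.
    while IFS= read -r -d '' file; do
        "$@" "$file"
    done < <(find "$dir" -type f -print0)
}

for_each_file . printf 'found: %s\n'
```

A file named "); ls -l $'\x2F'; # is simply passed along as one argument here; no part of its name is ever executed.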
49. How can I view periodic updates/appends to a file? (ex: growing log file)
tail -f will show you the growing log file. On some systems (e.g. OpenBSD), this will automatically track a rotated log file to the new file with the same name (which is usually what you want). To get the equivalent functionality on GNU systems, use tail --follow=name instead.
This is helpful if you need to view only the updates to the file after your last view.
# Start by setting n=1
tail -n $n testfile; n="+$(( $(wc -l < testfile) + 1 ))"
Every invocation of this gives the update to the file from where we stopped last. If you know the line number from where you want to start, set n to that.
50. I'm trying to construct a command dynamically, but I can't figure out how to deal with quoted multi-word arguments.
Some people attempt to do things like this:
# Non-working example
args="-s 'The subject' $address"
mail $args < $body
This fails because of word-splitting. When $args is evaluated, it becomes four words: 'The is the second word, and subject' is the third word.
What's needed is a way to maintain each word as a separate item, even if that word contains multiple spaces. Quotes won't do it, but an array will.
# Working example
args=(-s "The subject" "$address")
mail "${args[@]}" < $body
Usually, this question arises when someone is trying to use dialog to construct a menu on the fly. For an example of how to do this properly, see [#faq40 FAQ #40] above.
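An array also makes it easy to add arguments conditionally. In the sketch below, show_args is a made-up stand-in for the real command (mail, dialog, and so on) that just prints what it received; the address, the verbose switch and the -v flag are placeholders, not options of any particular program.

```shell
#!/usr/bin/env bash
# Arrays are a bash/ksh feature; this does not work in plain sh.
# show_args prints each argument it received on its own line, in brackets.
show_args() {
    printf '[%s]\n' "$@"
}

subject="The subject"
address="someone@example.com"    # placeholder address
verbose=yes                      # placeholder switch

args=(-s "$subject")
if [ "$verbose" = yes ]; then
    args=("${args[@]}" -v)       # append an optional (example) flag
fi
args=("${args[@]}" "$address")

show_args "${args[@]}"
```

The subject stays a single argument no matter how many spaces it contains.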
51. I want history-search just like in tcsh. How can I bind it to the up and down keys?
Just add the following to /etc/inputrc or your ~/.inputrc:
"\e[A":history-search-backward
"\e[B":history-search-forward
52. How do I convert a file in DOS format to UNIX format (remove CRLF line terminators)?
All these are from the sed one-liners page:
sed 's/.$//' dosfile              # assumes that all lines end with CR/LF
sed 's/^M$//' dosfile             # in bash/tcsh, press Ctrl-V then Ctrl-M
sed 's/\x0D$//' dosfile
Some distributions have a dos2unix command which can do this. In vim, you can use :set fileformat=unix
53. I have a fancy prompt with colors, and now bash doesn't seem to know how wide my terminal is. Lines wrap around incorrectly.
You must put \[ and \] around any non-printing escape sequences in your prompt. Thus:
BLUE=$(tput setaf 4)
PURPLE=$(tput setaf 5)
BLACK=$(tput setaf 0)
PS1='\[$BLUE\]\h:\[$PURPLE\]\w\[$BLACK\]\$ '
Without the \[ \], bash will think that the bytes which make up the escape sequences for the color codes actually take up space on the screen, so it won't know where the cursor really is.
54. How can I tell whether a variable contains a valid number?
First, you have to define what you mean by "number". The most common case seems to be that, when people ask this, they actually mean "a non-negative integer, with no leading + sign".
if [[ $foo = *[^0-9]* ]]; then
    echo "'$foo' has a non-digit somewhere in it"
else
    echo "'$foo' is strictly numeric"
fi
This can be done in Korn and legacy Bourne shells as well, using case:
case "$foo" in
    *[!0-9]*) echo "'$foo' has a non-digit somewhere in it" ;;
    *) echo "'$foo' is strictly numeric" ;;
esac
If what you actually mean is "a valid floating-point number" or something else more complex, then you might prefer to use a regular expression. Bash version 3 and above have regular expression support in the [[ command:
if [[ $foo =~ ^[-+]?[0-9]+\(\.[0-9]+\)?$ ]]; then
    echo "'$foo' looks rather like a number"
else
    echo "'$foo' doesn't look particularly numeric to me"
fi
If you don't have bash version 3, then you would use egrep:
if echo "$foo" | egrep '^[-+]?[0-9]+(\.[0-9]+)?$' >/dev/null; then
    echo "'$foo' might be a number"
else
    echo "'$foo' might not be a number"
fi
Note that the parentheses in the egrep regular expression don't require backslashes in front of them, whereas the ones in the bash3 command do.
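Since the quoting rules for the right-hand side of =~ have not been the same in every bash release, one way to sidestep the question is to keep the regular expression in a variable and expand it unquoted. The sketch below assumes bash 3.0 or later; is_number is a made-up function name.

```shell
#!/usr/bin/env bash
# is_number STRING: succeed if STRING is an optionally signed integer or
# decimal number. (Hypothetical helper; assumes bash 3.0 or later.)
is_number() {
    # Inside single quotes the pattern is written exactly as egrep would
    # take it, with no extra backslashes in front of the parentheses.
    local re='^[-+]?[0-9]+(\.[0-9]+)?$'
    [[ $1 =~ $re ]]
}

for candidate in 42 -3.14 +7 1.2.3 abc ''; do
    if is_number "$candidate"; then
        echo "'$candidate' looks like a number"
    else
        echo "'$candidate' does not look like a number"
    fi
done
```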
55. Tell me all about 2>&1 -- what's the difference between 2>&1 >foo and >foo 2>&1, and when do I use which?
Bash processes all redirections from left to right, in order. And the order is significant. Moving them around within a command may change the results of that command.
Here's a simple example:
foo() {
    echo "This is stdout"
    echo "This is stderr" 1>&2
}
foo >/dev/null 2>&1             # produces no output
foo 2>&1 >/dev/null             # writes "This is stderr" on the screen
Why do the results differ? In the first case, >/dev/null is performed first, and therefore the standard output of the command is sent to /dev/null. Then, the 2>&1 is performed, which causes standard error to be sent to the same place that standard output is already going. So both of them are discarded.
In the second example, 2>&1 is performed first. This means standard error is sent to wherever standard output happens to be going -- in this case, the user's terminal. Then, standard output is sent to /dev/null and is therefore discarded. So when we run foo the second time, we see only its standard error, not its standard output.
There are times when we really do want 2>&1 to appear first -- for one example of this, see [#faq40 FAQ 40].
There are other times when we may use 2>&1 without any other redirections. Consider:
find ... 2>&1 | grep "some error"
In this example, we want to search find's standard error (as well as its standard output) for the string "some error". The 2>&1 in the piped command forces standard error to go into the pipe along with standard output. (When pipes and redirections are mixed in this way, remember: the pipe is done first, before any redirections. So find's standard output is already set to point to the pipe before we process the 2>&1 redirection.)
If we wanted to read only standard error in the pipe, and discard standard output, we could do it like this:
find ... 2>&1 >/dev/null | grep "some error"
The redirections in that example are processed thus:
First, the pipe is created. find's output is sent to it.
Next, 2>&1 causes find's standard error to go to the pipe as well.
Finally, >/dev/null causes find's standard output to be discarded, leaving only stderr going into the pipe.
A related question is [#faq47 FAQ #47], which discusses how to send stderr to a pipeline.
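The ordering rules can be checked without watching the terminal by capturing the results in variables. Inside $(...), "standard output" is the pipe that the shell reads from, so this sketch shows exactly which stream ends up where. The variable names are only for illustration.

```shell
foo() {
    echo "This is stdout"
    echo "This is stderr" 1>&2
}

both=$(foo 2>&1)                  # stdout and stderr both captured
only_err=$(foo 2>&1 >/dev/null)   # stderr captured, stdout discarded
nothing=$(foo >/dev/null 2>&1)    # both discarded

printf 'both: %s\n' "$both"
printf 'only_err: %s\n' "$only_err"
printf 'nothing: [%s]\n' "$nothing"
```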
56. How can I untar or unzip multiple tarballs at once?
As the tar command was originally designed to read from and write to tape devices (tar - Tape ARchiver), you can specify only filenames to put inside an archive or to extract out of an archive (e.g. tar x myfileonthe.tape). There is an option to tell tar that the archive is not on some tape, but in a file: -f. This option takes exactly one argument: the filename of the file containing the archive. All other (following) filenames are taken to be archive members:
tar -x -f backup.tar myfile.txt
# OR (more common syntax IMHO)
tar xf backup.tar myfile.txt
Now here's a common mistake -- imagine a directory containing the following archive-files you want to extract all at once:
$ ls
backup1.tar backup2.tar backup3.tar
Maybe you think of tar xf *.tar. Let's see:
$ tar xf *.tar
tar: backup2.tar: Not found in archive
tar: backup3.tar: Not found in archive
tar: Error exit delayed from previous errors
What happened? The shell replaced your *.tar by the matching filenames. You really wrote:
tar xf backup1.tar backup2.tar backup3.tar
And as we saw earlier, it means: "extract the files backup2.tar and backup3.tar from the archive backup1.tar", which will of course only succeed when there are such filenames stored in the archive.
The solution is relatively easy: extract the contents of all archives one at a time. As we use a UNIX shell and we are lazy, we do that with a loop:
for tarname in *.tar; do
    tar xf "$tarname"
done
What happens? The for-loop will iterate through all filenames matching *.tar and call tar xf for each of them. That way you extract all archives one-by-one and you even do it automagically.
The second common archive type in these days is ZIP. The command to extract contents from a ZIP file is unzip (who would have guessed that!). The problem here is the very same: unzip takes only one option specifying the ZIP-file. So, you solve it the very same way:
for zipfile in *.zip; do
    unzip "$zipfile"
done
Not enough? Ok. There's another option with unzip: it can take shell-like patterns to specify the ZIP-file names. And to avoid interpretation of those patterns by the shell, you need to quote them. unzip itself and not the shell will interpret *.zip in this case:
unzip "*.zip"
# OR, to make more clear what we do:
unzip \*.zip
(This feature of unzip derives mainly from its origins as an MS-DOS program. MS-DOS's command interpreter does not perform glob expansions, so every MS-DOS program must be able to expand wildcards into a list of filenames. This feature was left in the Unix version, and as we just demonstrated, it can occasionally be useful.)
57. How can I group entries in a file by common prefixes?
As in, convert:
foo: entry1
bar: entry2
foo: entry3
baz: entry4
to
foo: entry1 entry3
bar: entry2
baz: entry4
There are two simple general methods for this:
- a) sort the file, and then iterate over it, collecting entries until the prefix changes, and then print the collected entries with the previous prefix
- b) iterate over the file, collecting entries for each prefix in an array indexed by the prefix
A basic implementation of a) in bash:
old= ; stuff=
(sort file ; echo xxx) | while read prefix line ; do
    if [[ $prefix = "$old" ]] ; then
        stuff="$stuff $line"
    else
        # prefix changed: print the finished group (if any) ...
        [[ $old ]] && echo "$old $stuff"
        # ... and start a new group with the current entry
        old="$prefix"
        stuff="$line"
    fi
done
And a basic implementation of b) in awk:
{
    a[$1] = a[$1] " " $2
}
END {
    for (x in a) print x, a[x]
}
Usage:
awk '{a[$1] = a[$1] " " $2}END{for (x in a) print x, a[x]}' file
58. Can bash handle binary data?
The answer is, basically, no. While bash won't have as many problems with it as older shells, it still can't process arbitrary binary data; more specifically, shell variables are not 100% binary clean, so you can't store binary files in them. One instance where this would sometimes be handy is storing small temporary bitmaps while working with netpbm. Here I resorted to adding an extra pnmnoraw to the pipe, creating (larger) ASCII files that bash has no problems storing.
If you are feeling adventurous, consider this experiment:
# bindec.bash, attempt to decode binary data to ascii decimals
IFS=
while read -n1 x ;do
    case "$x" in
        '') echo empty ;;
        # insert the 256 lines generated by the following oneliner here:
        # for x in $(seq 0 255) ;do echo "        $'\\$(printf %o $x)') echo $x;;" ;done
    esac
done
And then pipe binary data into it, maybe like so:
for x in $(seq 0 255) ;do echo -ne "\\$(printf %o $x)" ;done | bash bindec.bash | nl | less
This suggests that the 0 character is skipped entirely (because we can't create it with the input generation), which is enough to conveniently corrupt most binary files we try to process.
(Note that this refers to storing them in variables. Moving data between programs using pipes is always binary clean.)
59. How can I remove the last character of a line?
Using bash and ksh extended parameter substitution:
var=${var%?}
Remember that ${var%foo} removes foo from the end, and ${var#foo} removes foo from the beginning, of var. As a mnemonic, # appears to the left of % on the keyboard (US keyboards, at least).
More portable, but slower:
var=`expr "$var" : '\(.*\).'`
or (using sed):
var=`echo "$var" | sed 's/.$//'`
60. I'm trying to write a script that will change directory (or set a variable), but after the script finishes, I'm back where I started (or my variable isn't set)!
Consider this:
#!/bin/sh
cd /tmp
If one executes this simple script, what happens? Bash forks, and the parent waits. The child executes the script, including the chdir(2) system call, and then exits. The parent, which was waiting for the child, harvests the child's exit status (presumably 0 for success), and then bash carries on with the next command.
Since the chdir was done by a child process, it has no effect on the parent.
Moreover, there is no conceivable way you can ever have a child process affect any part of the parent's environment, which includes its variables as well as its current working directory.
So, how does one go about it? You can still have the cd command in an external file, but you can't run it as a script. Instead, you must source it (or "dot it in", using the . command, which is a synonym for source).
echo 'cd /tmp' > $HOME/mycd
source $HOME/mycd
pwd                         # Now, we're in /tmp
The same thing applies to setting variables. source the file that contains the commands; don't try to run it.
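The difference between running and sourcing can be watched side by side. This sketch writes a one-line script into a scratch directory (the path is invented for the demo), runs it as a child process, then sources it, and records the working directory after each step.

```shell
# Scratch directory used only for this demonstration.
scratch=${TMPDIR:-/tmp}/cd-demo.$$
mkdir -p "$scratch/target"
echo "cd '$scratch/target'" > "$scratch/mycd"

start=$PWD

sh "$scratch/mycd"       # runs in a child process
after_run=$PWD           # unchanged: the child's cd is gone with the child

. "$scratch/mycd"        # runs in the current shell
after_source=$PWD        # now inside $scratch/target

echo "after running:  $after_run"
echo "after sourcing: $after_source"

# Go back and clean up.
cd "$start" && rm -rf "$scratch"
```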
61. Is there a list of which features were added to specific releases of Bash?
[http://cnswww.cns.cwru.edu/~chet/bash/NEWS NEWS]: a file tersely listing the notable changes between the current and previous versions
[http://cnswww.cns.cwru.edu/~chet/bash/CHANGES CHANGES]: a complete bash change history
[http://cnswww.cns.cwru.edu/~chet/bash/COMPAT COMPAT]: compatibility issues between bash3 and previous versions
Here's a partial list of the changes, in a more compact format:
Feature | Added in version |
x+=string | 3.1-alpha1 |
{x..y} | 3.0-alpha |
${!array[@]} | 3.0-alpha |
[[ =~ | 3.0-alpha |
<<< | 2.05b-alpha1 |
i++ | 2.04-devel |
for ((;;)) | 2.04-devel |
/dev/fd/N, /dev/tcp/host/port, etc. | 2.04-devel |
a=(*.txt) file expansion | 2.03-alpha |
extglob | 2.02-alpha1 |
[[ | 2.02-alpha1 |
builtin printf | 2.02-alpha1 |
$(< filename) | 2.02-alpha1 |
** (exponentiation) | 2.02-alpha1 |
\xNNN | 2.02-alpha1 |
(( )) | 2.0-beta2 |
62. How do I create a temporary file in a secure manner?
Good question. To be filled in later. (Interim hints: tempfile is not portable. mktemp exists more widely, but it may require a -c switch to create the file in advance; or it may create the file by default and barf if -c is supplied. There does not appear to be any single command that simply works everywhere, without testing various arguments.)
63. My ssh client hangs when I try to run a remote background job!
The following will not do what you expect:
ssh me@remotehost 'sleep 120 &' # Client hangs for 120 seconds
This is a "feature" of [http://www.openssh.org/ OpenSSH]. The client will not close the connection as long as the remote end's terminal is still in use -- and in the case of sleep 120 &, stdout and stderr are still connected to the terminal.
The immediate answer to your question -- "How do I get the client to disconnect so I can get my shell back?" -- is to kill the ssh client. You can do this with the kill or pkill commands, of course; or by sending the INT signal (usually Ctrl-C) for a non-interactive ssh session (as above); or by pressing <Enter><~><.> (Enter, Tilde, Period) in the client's terminal window for an interactive remote shell.
The long-term workaround for this is to ensure that all the file descriptors are redirected to a log file (or /dev/null) on the remote side:
ssh me@remotehost 'sleep 120 >/dev/null 2>&1 &' # Client should return immediately
This also applies to restarting daemons on some legacy Unix systems.
ssh root@hp-ux-box               # Interactive shell
...                              # Discover that the problem is stale NFS handles
/sbin/init.d/nfs.client stop     # autofs is managed by this script and
/sbin/init.d/nfs.client start    # killing it on HP-UX is OK (unlike Linux)
exit                             # Client hangs -- use Enter ~ . to kill it.
The legacy Unix /sbin/init.d/nfs.client script runs daemons in the background but leaves their stdout and stderr attached to the terminal (and they don't fully self-daemonize). The solution is either to fix the Unix vendor's broken init script, or to kill the ssh client process after this happens. The author of this article uses the latter approach.
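The effect of the redirection can be seen without ssh at all. The following is only a local analogy, not what OpenSSH does internally: command substitution, like the ssh client, waits until nothing holds its output descriptor open any more, so a background job that inherits stdout keeps it waiting. The variable names held and released are made up for this sketch.

```shell
# Background job inherits stdout (the command substitution's pipe),
# so the reader has to wait until sleep exits.
start=$(date +%s)
out=$(sleep 2 &)
held=$(( $(date +%s) - start ))

# Same job with stdout and stderr redirected: nothing keeps the pipe
# open, so the command substitution returns right away.
start=$(date +%s)
out=$(sleep 2 >/dev/null 2>&1 &)
released=$(( $(date +%s) - start ))

echo "inherited stdout: waited ${held}s; redirected: waited ${released}s"
```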
64. Why is it so hard to get an answer to the question that I asked in #bash ?
- #bash aphorism #1 "The questioner's first description of the problem/question will be misleading."
- corollary 1.1 "The questioner's second description of the problem/question will also be misleading"
- corollary 1.2 "The questioner is never precise." Example: they will say "print the file" when they mean printing the file's name, rather than printing the file itself.
- #bash aphorism #2, "The questioner will keep changing their original question until it drives the helpers in the channel insane."
- #bash aphorism #3, "The data is never formatted in the way that makes it easiest to manipulate :-)"
- #bash aphorism #4, "30 to 40 percent of the conversations in #bash will be about aphorisms #1 and #2"
65. Is there a "PAUSE" command in bash like there is in MSDOS batch scripts?
No, but you can use these:
echo press enter to continue; read
echo press any key to continue; read -n 1
66. I want to check if [[ $var == foo || $var == bar || $var = more ]] without repeating $var n times.
case $var in foo|bar|more) ... ;; esac
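A filled-in version of the same pattern, for illustration. The variable matched and the sample value are made up; the point is only that one case branch replaces the chain of comparisons.

```shell
var=bar

case $var in
  foo|bar|more) matched=yes ;;   # any one of the listed words
  *)            matched=no  ;;   # everything else
esac

echo "$var: matched=$matched"
```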
67. How can I trim leading/trailing white space from one of my variables?
There are a few ways to do this -- none of them elegant.
First, the most portable way would be to use sed:
x=$(echo "$x" | sed -e 's/^ *//' -e 's/ *$//')
# Note: this only removes spaces. For tabs too:
x=$(echo "$x" | sed -e $'s/^[ \t]*//' -e $'s/[ \t]*$//')
# Or possibly, with some systems:
x=$(echo "$x" | sed -e 's/^[[:space:]]\+//' -e 's/[[:space:]]\+$//')
One can achieve the goal using builtins, although at the moment I'm not sure which shells support the following syntax:
# Remove leading whitespace:
while [[ $x = [$' \t\n']* ]]; do x=${x#[$' \t\n']}; done
# And now trailing:
while [[ $x = *[$' \t\n'] ]]; do x=${x%[$' \t\n']}; done
Of course, the preceding example is pretty slow, because it removes one character at a time, in a loop (although it's good enough in practice for most purposes). If you want something a bit fancier, there's a bash-only solution using extglob:
shopt -s extglob
x=${x##*([$' \t\n'])}; x=${x%%*([$' \t\n'])}
shopt -u extglob
There are many, many other ways to do this. These are not necessarily the most efficient, but they're known to work.
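As one of those other ways, here is a sketch that uses only parameter expansion with no loop and no extglob. It assumes a shell whose patterns understand the [:space:] character class; the function name trim is made up for this example.

```shell
# The body runs in a subshell ( ... ), so the variable s stays private.
trim() (
  s=$1
  # ${s%%[![:space:]]*} is the leading whitespace of s; strip it from the front.
  s=${s#"${s%%[![:space:]]*}"}
  # ${s##*[![:space:]]} is the trailing whitespace of s; strip it from the end.
  s=${s%"${s##*[![:space:]]}"}
  printf '%s\n' "$s"
)

x='   some value   '
x=$(trim "$x")
echo "[$x]"
```

The trick is that each inner expansion computes the whitespace run itself, and the outer expansion then removes that exact string. Whitespace in the middle of the value is left alone.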
68. How do I run a command, and have it abort (timeout) after N seconds?
There are two C programs that can do this: [http://pilcrow.madison.wi.us/ doalarm], and [http://www.porcupine.org/forensics/tct.html timeout]. (Compiling them is beyond the scope of this document; suffice to say, it'll be trivial on GNU/Linux systems, easy on most BSDs, and painful on anything else....)
If you don't have or don't want one of the above two programs, you can use a perl one-liner to set an ALRM and then exec the program you want to run under a time limit. In any case, you must understand what your program does with SIGALRM.
function doalarm () { perl -e 'alarm shift; exec @ARGV' "$@" ; }
doalarm ${NUMBER_OF_SECONDS_BEFORE_ALRMING} program arg arg ...
If you can't or won't install one of these programs (which really should have been included with the basic core Unix utilities 30 years ago!), then the best you can do is an ugly hack like:
command & pid=$!; { sleep 10 && kill $pid; } &
This will, as you will soon discover, produce quite a mess regardless of whether the timeout condition kicked in or not. Cleaning it up is not something worth my time -- just use doalarm or timeout instead. Really.