Line 5: | Line 5: |
[[TableOfContents]] |
This page shows common errors that Bash programmers make. The following examples are all flawed in some way: <<TableOfContents>> <<Anchor(pf1)>> |
Line 9: | Line 12: |
One of the most common mistakes ["BASH"] programmers make is to write a loop like this: {{{ for i in `ls *.mp3`; do # Wrong! some command $i # Wrong! |
One of the most common mistakes [[BASH]] programmers make is to write a loop like this: {{{ for i in `ls *.mp3`; do # Wrong! some command $i # Wrong! |
Line 17: | Line 20: |
This breaks when the user has a file with a space in its name. Why? Because the output of the `ls *.mp3` command substitution undergoes word splitting. Assuming we have a file named {{{01 - Don't Eat the Yellow Snow.mp3}}} in the current directory, the {{{for}}} loop will iterate over each word in the resulting file name (namely: "01", "-", "Don't", "Eat", and so on). | This breaks when the user has a file with a space in its name. Why? Because the output of the `ls *.mp3` CommandSubstitution undergoes WordSplitting. Assuming we have a file named {{{01 - Don't Eat the Yellow Snow.mp3}}} in the current directory, the {{{for}}} loop will iterate over each word in the resulting file name (namely: "01", "-", "Don't", "Eat", and so on). |
Line 22: | Line 25: |
for i in "`ls *.mp3`"; do # Wrong! | for i in "`ls *.mp3`"; do # Wrong! |
Line 28: | Line 31: |
In addition to this, the use of {{{ls}}} is just plain unnecessary. It's an external command, which simply isn't needed to do the job. So, what's the right way to do it? {{{ for i in *.mp3; do # Right! some command "$i" |
In addition to this, the use of {{{ls}}} is just plain unnecessary. It's an external command, which simply isn't needed to do the job. So, what's the right way to do it? {{{ for i in *.mp3; do # Better! and... some command "$i" # ...see Pitfall #2 for more info. |
Line 36: | Line 39: |
Let Bash expand the list of filenames for you. The expansion will ''not'' be subject to word splitting. Each filename that's matched by the {{{*.mp3}}} pattern will be treated as a separate word and, the loop will iterate once per file name. The astute reader will notice the double quotes in the second line. This leads to our second common pitfall. |
Let Bash expand the list of filenames for you. The expansion will ''not'' be subject to word splitting. Each filename that's matched by the {{{*.mp3}}} [[glob]] will be treated as a separate word, and the loop will iterate once per file name. For more details on this question, please see [[BashFAQ/020|Bash FAQ #20]]. The astute reader will notice the double quotes in the second line. This leads to our second common pitfall. <<Anchor(pf2)>> |
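The difference between the two loops can be checked with a small experiment. This is only a sketch: the scratch directory, the second file name and the counter variables are made up for illustration.

```shell
# Work in a scratch directory so the demo does not depend on the current one.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1
: > "01 - Don't Eat the Yellow Snow.mp3"
: > "plain.mp3"

ls_count=0
for i in $(ls *.mp3); do      # Wrong: the output of ls is split into words
    ls_count=$((ls_count + 1))
done

glob_count=0
for i in *.mp3; do            # Better: one iteration per file name
    [ -e "$i" ] || continue   # skip the literal pattern if nothing matched
    glob_count=$((glob_count + 1))
done

echo "ls loop: $ls_count iterations, glob loop: $glob_count iterations"
```

With two files present, the glob loop runs twice, while the `ls` loop runs once per whitespace-separated word.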
Line 42: | Line 48: |
What's wrong with the command shown above? Well, nothing, '''if''' you happen to know in advance that {{{$file}}} and {{{$target}}} have no white space in them. But if you don't know that in advance, or if you're paranoid, or if you're just trying to develop good habits, then you should quote your variable references to ''avoid'' having them undergo word splitting. {{{ mv "$file" "$target" }}} Without the double quotes, you'll get a command like {{{mv 01 - Don't Eat the Yellow Snow.mp3 /mnt/usb}}} and then you'll get errors like {{{mv: cannot stat `01': No such file or directory}}}. With the double quotes, all's well. |
What's wrong with the command shown above? Well, nothing, '''if''' you happen to know in advance that {{{$file}}} and {{{$target}}} have no white space or wildcards in them. But if you don't know that in advance, or if you're paranoid, or if you're just trying to develop good habits, then you should [[Quotes|quote]] your variable references to ''avoid'' having them undergo WordSplitting. {{{ cp "$file" "$target" }}} Without the double quotes, you'll get a command like {{{cp 01 - Don't Eat the Yellow Snow.mp3 /mnt/usb}}} and then you'll get errors like {{{cp: cannot stat `01': No such file or directory}}}. If $file has wildcards in it (* or ? or [...]), they will be expanded (see [[glob]]) if there are files that match them. With the double quotes, all's well, unless "$file" happens to start with a {{{-}}}, in which case {{{cp}}} thinks you're trying to feed it command line options. <<Anchor(pf3)>> == Filenames with leading dashes == Filenames with leading dashes can cause many problems. Globs like "*.mp3" are sorted into an expanded list, and "-" sorts before letters. The list is then passed to some command, which incorrectly interprets the "-filename" as an option. There are two major solutions to this. One solution is to insert {{{--}}} between the command (like {{{cp}}}) and its arguments. That tells it to stop scanning for options, and all is well: {{{ cp -- "$file" "$target" }}} The problem with this approach is that you have to insert this disabling for ''every'' command - which is easy to forget - and that not all commands support "--". For example, "echo" doesn't support "--". Another solution is to ensure that your filenames always begin with a directory (including . for the current directory, if appropriate). For example, if we're in some sort of loop: {{{ for i in ./*.mp3; do cp "$i" /target ... 
}}} In this case, even if we have a file whose name begins with {{{-}}}, the glob will ensure that the variable always contains something like {{{./-foo.mp3}}}, which is perfectly safe as far as {{{cp}}} is concerned. <<Anchor(pf4)>> |
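Both approaches can be tried safely in a throwaway directory. The sketch below assumes nothing beyond `mktemp` and `cp`; the directory, the file names and the `copied` counter are invented for the demonstration.

```shell
# Scratch directory, so that no real files are touched.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
mkdir target
: > ./-foo.mp3                 # a file whose name starts with a dash
: > ./bar.mp3

copied=0
for f in ./*.mp3; do
    # "$f" is ./-foo.mp3 or ./bar.mp3, so cp never sees a leading dash.
    cp "$f" target/ && copied=$((copied + 1))
done

# The -- form also works, for commands that support it (cp does).
cp -- -foo.mp3 target/copy-of-foo.mp3

echo "copied $copied files"
```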
Line 54: | Line 83: |
This is really the same as the previous pitfall, but I repeat it because it's ''so'' important. In the example above, the quotes are in the wrong place. You do ''not'' need to quote a string literal in bash. But you ''should'' quote your variables if you aren't sure whether they could contain white space. {{{ [ "$foo" = bar ] # Right! }}} Another way you could write this in bash involves the {{{[[}}} keyword, which extends and embraces the old {{{test}}} command (also known as {{{[}}}). {{{ [[ $foo = bar ]] # Also right! }}} You don't need to quote variable references within {{{[[ ]]}}} because they don't undergo word splitting in that context. On the other hand, quoting them won't hurt anything either. |
This is very similar to the first part of the previous pitfall, but I repeat it because it's ''so'' important. In the example above, the [[Quotes|quotes]] are in the wrong place. You do ''not'' need to quote a string literal in bash. But you ''should'' quote your variables if you aren't sure whether they could contain white space or wildcards. This breaks for two reasons: * If a variable referenced in {{{[}}} does not exist, or is blank, then the {{{[}}} command would see the line: {{{ [ $foo = "bar" ] }}} ... as: {{{ [ = "bar" ] }}} ... and throw the error {{{unary operator expected}}}. (The {{{=}}} operator is ''binary'', not unary, so the {{{[}}} command is rather shocked to see it there.) * If the variable contains internal whitespace, then it's [[WordSplitting|split into separate words]], before the {{{[}}} command sees it. Thus: {{{ [ multiple words here = "bar" ] }}} While that may look OK to you, it's a syntax error as far as {{{[}}} is concerned. A more correct way to write this would be: {{{ [ "$foo" = bar ] # Pretty close! }}} But this still breaks if {{{$foo}}} begins with a {{{-}}}. In bash, the {{{[[}}} keyword, which embraces and extends the old {{{test}}} command (also known as {{{[}}}), can be used to solve the problem: {{{ [[ $foo = bar ]] # Right! }}} You don't need to quote variable references within {{{[[ ]]}}} because they don't undergo word splitting, and even blank variables will be handled correctly. On the other hand, quoting them won't hurt anything either. You may have seen code like this: {{{ [ x"$foo" = xbar ] # Also right! }}} The {{{x"$foo"}}} hack is required for code that must run on ancient shells which lack {{{[[}}}, because if {{{$foo}}} begins with a {{{-}}}, then the {{{[}}} command may become confused. But you'll get ''really'' tired of having to explain that to everyone else. If one side is a constant, you could just do it this way: {{{ [ bar = "$foo" ] # Also right! 
}}} {{{[}}} doesn't care whether the token on the right hand side of the {{{=}}} begins with a {{{-}}}. It just uses it literally. It's just the left hand side that needs extra caution. <<Anchor(pf5)>> == cd `dirname "$f"` == This is mostly the same issue we've been discussing. As with a variable expansion, the result of a CommandSubstitution undergoes WordSplitting and [[glob|pathname expansion]]. So you should quote it: {{{ cd "`dirname "$f"`" }}} What's not obvious here is how the [[Quotes|quotes]] nest. A C programmer reading this would expect the first and second double-quotes to be grouped together; and then the third and fourth. But that's not the case in Bash. Bash treats the double-quotes ''inside'' the command substitution as one pair; and the double-quotes ''outside'' the substitution as another pair. Another way of writing this: the parser treats the backticks as a "nesting level", and the quotes inside it are separate from the quotes outside it. The same thing works if we use the [[BashFAQ/082|preferred $()]] syntax, too: {{{ cd "$(dirname "$f")" }}} Quotes inside {{{$()}}} are grouped together. <<Anchor(pf6)>> |
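To see what the outer quotes buy you, count the words that result with and without them. This is a sketch only; the path in `f` is hypothetical and does not need to exist, since `dirname` works purely on the string.

```shell
f="/tmp/my music/track 01.mp3"   # hypothetical path

set -- $(dirname "$f")           # Wrong: the unquoted result is split
unquoted_words=$#

set -- "$(dirname "$f")"         # Right: the result stays one word
quoted_words=$#
dir=$1

echo "unquoted: $unquoted_words words, quoted: $quoted_words word ($dir)"
```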
Line 70: | Line 165: |
You can't use {{{&&}}} inside the old {{{test}}} (or {{{[}}}) command. The Bash parser sees {{{&&}}} outside of {{{[[ ]]}}} or {{{(( ))}}} and breaks your command into ''two'' commands, before and after the {{{&&}}}. Use one of these instead: {{{ [ "$foo" = bar -a "$bar" = foo ] # Right! [ "$foo" = bar ] && [ "$bar" = foo ] # Also right! [[ $foo = bar && $bar = foo ]] # Also right! }}} |
You can't use {{{&&}}} inside the old {{{test}}} (or {{{[}}}) command. The Bash parser sees {{{&&}}} outside of {{{[[ ]]}}} or {{{(( ))}}} and breaks your command into ''two'' commands, before and after the {{{&&}}}. Use one of these instead: {{{ [ bar = "$foo" ] && [ foo = "$bar" ] # Right! [[ $foo = bar && $bar = foo ]] # Also right! [ bar = "$foo" -a foo = "$bar" ] # Not portable. }}} (Note that we reversed the constant and the variable inside {{{[}}} for the reasons discussed in the previous pitfall.) The same thing applies to {{{||}}}. Use {{{[[}}}, or use {{{-o}}}, or use two {{{[}}} commands. The problem with `[ A = B -a C = D ]` is that [[http://www.opengroup.org/onlinepubs/9699919799/utilities/test.html|POSIX does not specify]] the results of a `test` or `[` command with more than 4 arguments. It probably works in most shells, but you can't count on it. You should use two `test` or `[` commands with `&&` between them instead, if you have to write for POSIX shells. If you have to write for Bourne, always use `test` instead of `[`. <<Anchor(pf7)>> |
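As a sketch of the portable form, here are two `[` commands joined by `&&`, wrapped in a function. The name `both_match` and the result variables are invented for this example.

```shell
# both_match is a hypothetical helper: true only if $1 is "bar" and $2 is "foo".
both_match() {
    [ bar = "$1" ] && [ foo = "$2" ]
}

if both_match bar foo;   then r_ok=yes;    else r_ok=no;    fi
if both_match "" foo;    then r_empty=yes; else r_empty=no; fi
if both_match "-n" foo;  then r_dash=yes;  else r_dash=no;  fi

echo "match: $r_ok, empty: $r_empty, leading dash: $r_dash"
```

Because the constant is on the left and the variable is quoted, empty values and values with a leading dash are simply compared as strings.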
Line 80: | Line 182: |
The {{{[[ ]]}}} operator is ''not'' used for an ArithmeticExpression. It's used for strings only. If you want to do a numeric comparison against the constant 7, you must use {{{(( ))}}} instead: {{{ ((foo > 7)) # Right! }}} |
The {{{[[ ]]}}} operator should ''not'' be used for an ArithmeticExpression. It should be used for strings only. If you want to do a numeric comparison using {{{>}}} or {{{<}}}, you should use {{{(( ))}}} instead: {{{ ((foo > 7)) # Right! }}} If you use the {{{>}}} operator inside {{{[[ ]]}}}, it's treated as a string comparison, ''not'' an integer comparison. This may work sometimes, but it will fail when you least expect it. If you use {{{>}}} inside {{{[ ]}}}, it's even worse: it's an output redirection. You'll get a file named {{{7}}} in your directory, and the test will succeed as long as {{{$foo}}} is not empty. If you're developing for a BourneShell instead of bash, this is the historically correct version: {{{ test $foo -gt 7 # Also right! }}} Note that the {{{test ... -gt}}} command will fail in interesting ways if {{{$foo}}} is [[BashFAQ/054|not an integer]]. Therefore, there's not much point in quoting it properly -- if it's got white space, or is empty, or is anything ''other than'' an integer, we're probably going to crash anyway. You'll need to sanitize your input aggressively. The double brackets support this syntax too: {{{ [[ $foo -gt 7 ]] # Also right! }}} But why use that when you could use `((...))` instead? <<Anchor(pf8)>> |
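The string-versus-integer difference is easy to demonstrate with a two-digit number. This sketch needs bash (it uses `[[` and `((`); the result variable names are invented.

```shell
foo=10

# String comparison: "10" sorts before "7" because "1" sorts before "7".
if [[ $foo > 7 ]]; then string_cmp=true; else string_cmp=false; fi

# Integer comparisons: 10 is greater than 7.
if (( foo > 7 )); then arith_cmp=true; else arith_cmp=false; fi
if [ "$foo" -gt 7 ]; then test_cmp=true; else test_cmp=false; fi

echo "[[ > ]]: $string_cmp   (( > )): $arith_cmp   [ -gt ]: $test_cmp"
```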
Line 88: | Line 209: |
The code above looks OK at first glance, doesn't it? Sure, it's just a poor implementation of {{{grep -c}}}, but it's intended as a simplistic example. So why doesn't it work? The variable {{{count}}} will be unchanged after the loop terminates, much to the surprise of Bash developers everywhere. The reason this code does not work as expected is because each command in a pipeline is executed in a separate subshell. The changes to the {{{count}}} variable within the loop's subshell aren't reflected within the parent shell (the script in which the code occurs). For solutions to this, please see [wiki:Self:BashFaq#faq24 Bash FAQ #24]. |
The code above looks OK at first glance, doesn't it? Sure, it's just a poor implementation of {{{grep -c}}}, but it's intended as a simplistic example. So why doesn't it work? The variable {{{count}}} will be unchanged after the loop terminates, much to the surprise of Bash developers everywhere. The reason this code does not work as expected is because each command in a pipeline is executed in a separate SubShell. The changes to the {{{count}}} variable within the loop's subshell aren't reflected within the parent shell (the script in which the code occurs). For solutions to this, please see [[BashFAQ/024|Bash FAQ #24]]. <<Anchor(pf9)>> == if [grep foo myfile] == Many people are confused by the common practice of using the {{{[}}} command after an {{{if}}}. They see this and convince themselves that the {{{[}}} is part of the {{{if}}} statement's syntax, just like parentheses are used in C's {{{if}}} statement. However, that is ''not'' the case! {{{[}}} is a command, not a syntax marker for the {{{if}}} statement. It's equivalent to the {{{test}}} command, except for the requirement that the final argument must be a {{{]}}}. The syntax of the {{{if}}} statement is as follows: {{{ if COMMANDS then COMMANDS elif COMMANDS # optional then COMMANDS else # optional COMMANDS fi # required }}} There may be zero or more optional {{{elif}}} sections, and one optional {{{else}}} section. Note: there '''is no [''' in the syntax! Once again, {{{[}}} is a command. It takes arguments, and it produces an exit code. It may produce error messages. It does not, however, produce any standard output. The {{{if}}} statement evaluates the first set of {{{COMMANDS}}} that are given to it (up until {{{then}}}, as the first word of a new command). The exit code of the last command from that set determines whether the {{{if}}} statement will execute the {{{COMMANDS}}} that are in the {{{then}}} section, or move on. 
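A short sketch may make this concrete: `if` accepts any command, including a shell function, and only looks at its exit status. The function `is_even` and the result variables are hypothetical, introduced just for this illustration.

```shell
# is_even is a hypothetical helper used only for this illustration.
is_even() {
    [ $(( $1 % 2 )) -eq 0 ]
}

if is_even 4; then r1=even; else r1=odd; fi
if is_even 7; then r2=even; else r2=odd; fi

# Several commands may precede "then"; only the last one's exit status counts.
if false; true; then r3=then-branch; else r3=else-branch; fi

echo "$r1 $r2 $r3"
```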
If you want to make a decision based on the output of a {{{grep}}} command, you do ''not'' need to enclose it in parentheses, brackets, backticks, or ''any other'' syntax mark-up! Just use `grep` as the {{{COMMANDS}}} after the {{{if}}}, like this: {{{ if grep foo myfile >/dev/null; then ... fi }}} Note that we discard the standard output of the grep (which would normally include the matching line, if any), because we don't want to ''see'' it -- we just want to know whether it's ''there''. If the {{{grep}}} matches a line from {{{myfile}}}, then the exit code will be 0 (true), and the {{{then}}} part will be executed. Otherwise, if there is no matching line, the {{{grep}}} should return a non-zero exit code. In recent versions of `grep` you can use the {{{-q}}} (quiet) option to suppress stdout. <<Anchor(pf10)>> == if [bar="$foo"] == As we explained in the previous example, {{{[}}} is a command. Just like with any other command, Bash expects the command to be followed by a space, then the first argument, then another space, etc. You can't just run things all together without putting the spaces in! Here is the correct way: {{{ if [ bar = "$foo" ] }}} Each of {{{bar}}}, {{{=}}}, {{{"$foo"}}} (after substitution, but without WordSplitting) and {{{]}}} is a separate argument to the {{{[}}} command. There must be whitespace between each pair of arguments, so the shell knows where each argument begins and ends. <<Anchor(pf11)>> == if [ [ a = b ] && [ c = d ] ] == Here we go again. {{{[}}} is a ''command''. It is not a syntactic marker that sits between {{{if}}} and some sort of C-like "condition". Nor is it used for grouping. You cannot take C-like {{{if}}} commands and translate them into Bash commands just by replacing parentheses with square brackets!
If you want to express a compound conditional, do this: {{{ if [ a = b ] && [ c = d ] }}} Note that here we have two ''commands'' after the {{{if}}}, joined by an {{{&&}}} operator (see the documentation if you don't know what that does). It's precisely the same as: {{{ if test a = b && test c = d }}} If the first {{{test}}} command returns false, the body of the {{{if}}} statement is not entered. If it returns true, then the second {{{test}}} command is run; and if that one also returns true, then the body of the {{{if}}} statement ''will'' be entered. <<Anchor(pf12)>> == cat file | sed s/foo/bar/ > file == You '''cannot''' read from a file and write to it in the same pipeline. Depending on what your pipeline does, the file may be clobbered (to 0 bytes, or possibly to a number of bytes equal to the size of your operating system's pipeline buffer), or it may grow until it fills the available disk space, or reaches your operating system's file size limitation, or your quota, etc. If you want to make a change to a file, other than appending to the end of it, there ''must'' be a temporary file created at some point. For example, the following is completely portable: {{{ sed 's/foo/bar/g' file > tmpfile && mv tmpfile file }}} The following will ''only'' work on GNU sed 4.x: {{{ sed -i 's/foo/bar/g' file(s) }}} Note that this also creates a temporary file, and does the same sort of renaming trickery -- it just handles it transparently. And the following equivalent command requires perl 5.x (which is probably more widely available than GNU sed 4.x): {{{ perl -pi -e 's/foo/bar/g' file(s) }}} For more details, please see [[BashFAQ/021|Bash FAQ #21]]. <<Anchor(pf13)>> == echo $foo == This relatively innocent-looking command causes ''massive'' confusion. Because the {{{$foo}}} isn't [[Quotes|quoted]], it will not only be subject to WordSplitting, but also file [[glob|globbing]].
This misleads Bash programmers into thinking their variables ''contain'' the wrong values, when in fact the variables are OK -- it's just the word splitting or filename expansion that's messing up their view of what's happening. {{{ MSG="Please enter a file name of the form *.zip" echo $MSG }}} This message is split into words and any globs are expanded, such as the *.zip. What will your users think when they see this message: {{{ Please enter a file name of the form freenfss.zip lw35nfss.zip }}} To demonstrate: {{{ VAR=*.zip # VAR contains an asterisk, a period, and the word "zip" echo "$VAR" # writes *.zip echo $VAR # writes the list of files which end with .zip }}} In fact, the `echo` command cannot be used with absolute safety here. If the variable contains `-n` for example, `echo` will consider that an option, rather than data to be printed. The only absolutely ''sure'' way to print the value of a variable is using `printf`: {{{ printf "%s\n" "$foo" }}} <<Anchor(pf14)>> == $foo=bar == No, you don't assign a variable by putting a {{{$}}} in front of the variable name. This isn't perl. <<Anchor(pf15)>> == foo = bar == No, you can't put spaces around the {{{=}}} when assigning to a variable. This isn't C. When you write {{{foo = bar}}} the shell splits it into three words. The first word, {{{foo}}}, is taken as the command name. The second and third become the arguments to that command. Likewise, the following are also wrong: {{{ foo= bar # WRONG! foo =bar # WRONG! $foo = bar; # COMPLETELY WRONG! foo=bar # Right. }}} <<Anchor(pf16)>> == echo <<EOF == A here document is a useful tool for embedding large blocks of textual data in a script. It causes a redirection of the lines of text in the script to the standard input of a command. Unfortunately, {{{echo}}} is not a command which reads from stdin. {{{ # This is wrong: echo <<EOF Hello world EOF # This is right: cat <<EOF Hello world EOF # OR by using plain echo. 
This is very efficient in Bash, # because echo is a built-in command echo "\ Hello world " }}} <<Anchor(pf17)>> == su -c 'some command' == This syntax is ''almost'' correct. The problem is, on many platforms, {{{su}}} takes a {{{-c}}} argument, but it's not the one you want. For example, on OpenBSD: {{{ $ su -c 'echo hello' su: only the superuser may specify a login class }}} You want to pass {{{-c 'some command'}}} to a shell, which means you need a username before the {{{-c}}}. {{{ su root -c 'some command' # Now it's right. }}} {{{su}}} assumes a username of root when you omit one, but this falls on its face when you want to pass a command to the shell afterward. You must supply the username in this case. <<Anchor(pf18)>> == cd /foo; bar == If you don't check for errors from the {{{cd}}} command, you might end up executing {{{bar}}} in the wrong place. This could be a major disaster, if for example {{{bar}}} happens to be {{{rm *}}}. You must '''always''' check for errors from a {{{cd}}} command. The simplest way to do that is: {{{ cd /foo && bar }}} If there's more than just one command after the {{{cd}}}, you might prefer this: {{{ cd /foo || exit 1 bar baz bat ... # Lots of commands. }}} {{{cd}}} will report the failure to change directories, with a stderr message such as "bash: cd: /foo: No such file or directory". If you want to add your own message in stdout, however, you could use command grouping: {{{ cd /net || { echo "Can't read /net. Make sure you've logged in to the Samba network, and try again."; exit 1; } do_stuff more_stuff }}} Note there's a required space between `"{"` and `"echo"`, and a required `";"` before the closing `"}"`. Some people also like to enable {{{set -e}}} to make their scripts abort on ''any'' command that returns non-zero, but this can be rather tricky to use correctly (since many common commands may return a non-zero status for a warning condition, which you may not want to treat as fatal).
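Inside a function you would normally `return` rather than `exit`, so that the caller decides what to do next. The following is a minimal sketch of that idea; `enter_dir`, the scratch directory and the result variables are hypothetical names, not something the commands above require.

```shell
# enter_dir is a hypothetical helper: cd, or complain on stderr and fail.
enter_dir() {
    cd "$1" || { echo "Can't change to $1" >&2; return 1; }
}

scratch=$(mktemp -d) || exit 1

if enter_dir "$scratch"; then good=entered; else good=failed; fi
if enter_dir "$scratch/no-such-dir" 2>/dev/null; then bad=entered; else bad=failed; fi

echo "existing directory: $good, missing directory: $bad"
```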
By the way, if you're changing directories a lot in a Bash script, be sure to read the Bash manual page on {{{pushd}}}, {{{popd}}}, and {{{dirs}}}. Perhaps all that code you wrote to manage {{{cd}}}'s and {{{pwd}}}'s is completely unnecessary. Speaking of which, compare this: {{{ find ... -type d | while read subdir; do cd "$subdir" && whatever && ... && cd - done }}} With this: {{{ find ... -type d | while read subdir; do (cd "$subdir" && whatever && ...) done }}} Forcing a SubShell here causes the {{{cd}}} to occur only in the subshell; for the next iteration of the loop, we're back to our normal location, regardless of whether the {{{cd}}} succeeded or failed. We don't have to change back manually. In fact, the penultimate example isn't even valid -- if one of the {{{whatever}}} commands fails, we might not {{{cd}}} back where we need to be. To correct it without using the subshell, we'd have to arrange to execute some sort of {{{cd "$ORIGINAL_DIR"}}} command within each loop iteration. It would be frightfully messy. The subshell version is much simpler and cleaner. <<Anchor(pf19)>> == [ bar == "$foo" ] == The {{{==}}} operator is not valid for the {{{[}}} command. Use {{{=}}} instead, or use the {{{[[}}} keyword instead. {{{ [ bar = "$foo" ] && echo yes [[ bar == $foo ]] && echo yes }}} <<Anchor(pf20)>> == for i in {1..10}; do ./something &; done == You ''cannot'' put a {{{;}}} immediately after an {{{&}}}. Just remove the extraneous {{{;}}} entirely. {{{ for i in {1..10}; do ./something & done }}} Or: {{{ for i in {1..10}; do ./something & done }}} {{{&}}} already functions as a command terminator, just like {{{;}}} does. And you cannot mix the two. In general, a `;` can be replaced by a newline, but not all newlines can be replaced by `;`. <<Anchor(pf21)>> == cmd1 && cmd2 || cmd3 == Some people like to use {{{&&}}} and {{{||}}} as a shortcut syntax for {{{if ... then ... else ... fi}}}. 
In many cases, this is perfectly safe: {{{ [[ -s $errorlog ]] && echo "Uh oh, there were some errors." || echo "Successful." }}} However, this construct is ''not'' completely equivalent to {{{if ... fi}}} in the general case, because the command that comes after the {{{&&}}} also generates an exit status. And if that exit status isn't "true" (0), then the command that comes after the {{{||}}} will also be invoked. For example: {{{ i=0 true && ((i++)) || ((i--)) echo $i # Prints 0 }}} What happened here? It looks like {{{i}}} should be 1, but it ends up 0. Why? Because both the {{{i++}}} ''and'' the {{{i--}}} were executed. The {{{((i++))}}} command has an exit status, and that exit status is derived from a C-like evaluation of the expression inside the parentheses. That expression's value happens to be 0 (the initial value of {{{i}}}), and in C, an expression with an integer value of 0 is considered ''false''. So {{{((i++))}}} (when {{{i}}} is 0) has an exit status of 1 (false), and therefore the {{{((i--))}}} command is executed as well. This does not occur if we use the pre-increment operator, since the exit status from `++i` is true: {{{ i=0 true && (( ++i )) || (( --i )) echo $i # Prints 1 }}} But that's missing the point of the example. It just ''happens'' to work by ''coincidence'', and you cannot rely on `x && y || z` if `y` has '''any''' chance of failure! (This example fails if we initialize `i` to -1 instead of 0.) If you need safety, or if you simply aren't sure how this works, or if ''anything'' in the preceding paragraphs wasn't completely clear, please just use the simple {{{if ... fi}}} syntax in your programs. {{{ i=0 if true; then ((i++)) else ((i--)) fi echo $i # Prints 1 }}} This section also applies to the Bourne shell; here is code that illustrates it: {{{ true && { echo true; false; } || { echo false; true; } }}} The output is two lines, "true" and "false", instead of the single line "true".
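The same effect can be recorded in a variable instead of being printed, which makes it easier to compare the two constructs side by side. This is a sketch only; `step` is a hypothetical helper that notes which branch ran and then returns the exit status it is given.

```shell
ran=""
# step TAG STATUS: remember that TAG ran, then return STATUS (hypothetical helper).
step() { ran="$ran$1 "; return "$2"; }

# Shortcut form: the first step "fails", so the second one runs as well.
true && step first 1 || step second 0
shortcut_ran=$ran

# if/else form: exactly one branch runs, whatever its exit status.
ran=""
if true; then step first 1; else step second 0; fi
if_ran=$ran

echo "shortcut ran: $shortcut_ran"
echo "if/else ran: $if_ran"
```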
<<Anchor(pf22)>> == On UTF-8 and Byte-Order Marks (BOM) == '''In general:''' Unix UTF-8 text does not use BOM. The encoding of plain text is determined by the locale or by mime types or other metadata. While the presence of a BOM would not normally damage a UTF-8 document meant only for reading by humans, it is problematic (often syntactically illegal) in any text file meant to be interpreted by automated processes such as scripts, source code, configuration files, and so on. Files starting with BOM should be considered as foreign as those with MS-DOS line breaks. '''In shell scripting:''' 'Where UTF-8 is used transparently in 8-bit environments, the use of a BOM will interfere with any protocol or file format that expects specific ASCII characters at the beginning, such as the use of "#!" at the beginning of Unix shell scripts.' http://unicode.org/faq/utf_bom.html#bom5 <<Anchor(pf23)>> == echo "Hello World!" == The problem here is that, in an interactive Bash shell, you'll see an error like: {{{ bash: !": event not found }}} This is because, in the default settings for an interactive shell, Bash performs csh-style history expansion using the exclamation point. This is '''not''' a problem in shell scripts; only in interactive shells. Unfortunately, the obvious attempt to "fix" this won't work: {{{ $ echo "hi\!" hi\! }}} The easiest solution is unsetting the {{{histexpand}}} option: this can be done with {{{set +H}}} or {{{set +o histexpand}}}. Question: Why is playing with {{{histexpand}}} more appropriate than single quotes? ''I personally ran into this issue when I was manipulating song files, using commands like'' {{{ mp3info -t "Don't Let It Show" ... mp3info -t "Ah! Leah!" ... }}} ''Using single quotes is extremely inconvenient because of all the songs with apostrophes in their titles. Using double quotes ran into the history expansion issue. (And imagine a file that has both in its name. The quoting would be atrocious.)
Bash Pitfalls
This page shows common errors that Bash programmers make. The following examples are all flawed in some way:
Contents
- for i in `ls *.mp3`
- cp $file $target
- Filenames with leading dashes
- [ $foo = "bar" ]
- cd `dirname "$f"`
- [ "$foo" = bar && "$bar" = foo ]
- [[ $foo > 7 ]]
- grep foo bar | while read line; do ((count++)); done
- if [grep foo myfile]
- if [bar="$foo"]
- if [ [ a = b ] && [ c = d ] ]
- cat file | sed s/foo/bar/ > file
- echo $foo
- $foo=bar
- foo = bar
- echo <<EOF
- su -c 'some command'
- cd /foo; bar
- [ bar == "$foo" ]
- for i in {1..10}; do ./something &; done
- cmd1 && cmd2 || cmd3
- On UTF-8 and Byte-Order Marks (BOM)
- echo "Hello World!"
- for arg in $*
- function foo()
- echo "~"
- local varname=$(command)
- sed 's/$foo/good bye/'
- tr [A-Z] [a-z]
- ps ax | grep gedit
- printf "$foo"
- [[ -e "$broken_symlink" ]] returns 1 even though $broken_symlink exists
- ed file <<<"g/d\{0,3\}/s//e/g" fails
- expr sub-string fails for "match"
1. for i in `ls *.mp3`
One of the most common mistakes BASH programmers make is to write a loop like this:
for i in `ls *.mp3`; do     # Wrong!
    some command $i         # Wrong!
done
This breaks when the user has a file with a space in its name. Why? Because the output of the ls *.mp3 CommandSubstitution undergoes WordSplitting. Assuming we have a file named 01 - Don't Eat the Yellow Snow.mp3 in the current directory, the for loop will iterate over each word in the resulting file name (namely: "01", "-", "Don't", "Eat", and so on).
You can't double-quote the substitution either:
for i in "`ls *.mp3`"; do   # Wrong!
    ...
This causes the entire output of the ls command to be treated as a single word, and instead of iterating over each file name in the output list, the loop will only execute once, with i taking on a value which is the concatenation of all the file names (with spaces between them).
In addition to this, the use of ls is just plain unnecessary. It's an external command, which simply isn't needed to do the job. So, what's the right way to do it?
for i in *.mp3; do          # Better! and...
    some command "$i"       # ...see Pitfall #2 for more info.
done
Let Bash expand the list of filenames for you. The expansion will not be subject to word splitting. Each filename that's matched by the *.mp3 glob will be treated as a separate word, and the loop will iterate once per file name.
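As a rough illustration of the difference, the sketch below counts how many times each style of loop runs. The scratch directory, the two file names, and the variables demo_dir, glob_count and split_count are all made up for this example, and it assumes mktemp -d is available.

```shell
# Hypothetical demo: compare a glob loop with a word-split `ls` loop.
demo_dir=$(mktemp -d) || exit 1
touch "$demo_dir/01 - Don't Eat the Yellow Snow.mp3" "$demo_dir/02 - Plain.mp3"

# Glob: one iteration per file, whatever the name contains.
glob_count=0
for f in "$demo_dir"/*.mp3; do
  [ -e "$f" ] || continue   # an unmatched glob stays literal; skip it
  glob_count=$((glob_count + 1))
done

# Word-split ls output: one iteration per whitespace-separated word.
split_count=0
for word in $(ls "$demo_dir"); do
  split_count=$((split_count + 1))
done

echo "glob loop: $glob_count iterations; ls loop: $split_count iterations"
rm -rf -- "$demo_dir"
```

With these two files the glob loop runs once per file, while the ls loop runs once per word.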
For more details on this question, please see Bash FAQ #20.
The astute reader will notice the double quotes in the second line. This leads to our second common pitfall.
2. cp $file $target
What's wrong with the command shown above? Well, nothing, if you happen to know in advance that $file and $target have no white space or wildcards in them.
But if you don't know that in advance, or if you're paranoid, or if you're just trying to develop good habits, then you should quote your variable references to avoid having them undergo WordSplitting.
cp "$file" "$target"
Without the double quotes, you'll get a command like cp 01 - Don't Eat the Yellow Snow.mp3 /mnt/usb and then you'll get errors like cp: cannot stat `01': No such file or directory. If $file has wildcards in it (* or ? or [...]), they will be expanded (see glob) if there are files that match them. With the double quotes, all's well, unless "$file" happens to start with a -, in which case cp thinks you're trying to feed it command line options.
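To see the splitting without risking any real files, here is a small sketch. count_args is a hypothetical helper that just reports how many arguments it received.

```shell
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { echo "$#"; }

file="01 - Don't Eat the Yellow Snow.mp3"

unquoted=$(count_args $file)     # word splitting: one argument per word
quoted=$(count_args "$file")     # a single argument

echo "unquoted: $unquoted, quoted: $quoted"
```

A command like cp sees exactly the same argument list that count_args does.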
3. Filenames with leading dashes
Filenames with leading dashes can cause many problems. Globs like "*.mp3" are sorted into an expanded list, and "-" sorts before letters. The list is then passed to some command, which incorrectly interprets the "-filename" as an option. There are two major solutions to this.
One solution is to insert -- between the command (like cp) and its arguments. That tells it to stop scanning for options, and all is well:
cp -- "$file" "$target"
The problem with this approach is that you have to insert -- in every command that takes filenames, which is easy to forget, and that not all commands support "--". For example, "echo" doesn't support "--".
Another solution is to ensure that your filenames always begin with a directory (including . for the current directory, if appropriate). For example, if we're in some sort of loop:
for i in ./*.mp3; do
    cp "$i" /target
    ...
In this case, even if we have a file whose name begins with -, the glob will ensure that the variable always contains something like ./-foo.mp3, which is perfectly safe as far as cp is concerned.
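A minimal sketch of the ./ approach, run in a scratch directory so nothing real is touched. The directory, the file name -foo.mp3 and the variable names are assumptions made for the demo, as is the availability of mktemp -d.

```shell
# Hypothetical demo in a scratch directory.
demo_dir=$(mktemp -d) || exit 1
: > "$demo_dir/-foo.mp3"          # a file whose name begins with a dash

listed=$(
  cd "$demo_dir" || exit 1
  for f in ./*.mp3; do
    printf '%s\n' "$f"            # every name now starts with ./
  done
)

echo "$listed"
rm -rf -- "$demo_dir"
```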
4. [ $foo = "bar" ]
This is very similar to the issue in pitfall #2, but I repeat it because it's so important. In the example above, the quotes are in the wrong place. You do not need to quote a string literal in bash. But you should quote your variables if you aren't sure whether they could contain white space or wildcards.
This breaks for two reasons:
If a variable referenced in [ does not exist, or is blank, then the [ command would see the line:
[ $foo = "bar" ]
.. as:
[ = "bar" ]
.. and throw the error unary operator expected. (The = operator is binary, not unary, so the [ command is rather shocked to see it there.)
If the variable contains internal whitespace, then it's split into separate words, before the [ command sees it. Thus:
[ multiple words here = "bar" ]
While that may look OK to you, it's a syntax error as far as [ is concerned.
A more correct way to write this would be:
[ "$foo" = bar ] # Pretty close!
But this still breaks if $foo begins with a -.
In bash, the [[ keyword, which embraces and extends the old test command (also known as [), can be used to solve the problem:
[[ $foo = bar ]] # Right!
You don't need to quote variable references within [[ ]] because they don't undergo word splitting, and even blank variables will be handled correctly. On the other hand, quoting them won't hurt anything either.
You may have seen code like this:
[ x"$foo" = xbar ] # Also right!
The x"$foo" hack is required for code that must run on ancient shells which lack [[, because if $foo begins with a -, then the [ command may become confused. But you'll get really tired of having to explain that to everyone else.
If one side is a constant, you could just do it this way:
[ bar = "$foo" ] # Also right!
[ doesn't care whether the token on the right hand side of the = begins with a -. It just uses it literally. It's just the left hand side that needs extra caution.
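Here is a small sketch of the constant-on-the-left form fed with the awkward values discussed above. is_bar is a hypothetical helper name; the candidate values are only examples.

```shell
# is_bar is a hypothetical helper: constant on the left, quoted variable on the right.
is_bar() {
  [ bar = "$1" ]
}

for candidate in bar "" "multiple words here" "-f"; do
  if is_bar "$candidate"; then
    echo "'$candidate' equals bar"
  else
    echo "'$candidate' does not equal bar"
  fi
done
```

None of the four values produces an error message from [; only the first one compares equal.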
5. cd `dirname "$f"`
This is mostly the same issue we've been discussing. As with a variable expansion, the result of a CommandSubstitution undergoes WordSplitting and pathname expansion. So you should quote it:
cd "`dirname "$f"`"
What's not obvious here is how the quotes nest. A C programmer reading this would expect the first and second double-quotes to be grouped together; and then the third and fourth. But that's not the case in Bash. Bash treats the double-quotes inside the command substitution as one pair; and the double-quotes outside the substitution as another pair.
Another way of writing this: the parser treats the backticks as a "nesting level", and the quotes inside it are separate from the quotes outside it.
The same thing works if we use the preferred $() syntax, too:
cd "$(dirname "$f")"
Quotes inside $() are grouped together.
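The nesting can be tried out with a directory name that contains a space. In this sketch the scratch layout ("my music", song.mp3) and the variable names are invented for the demo, and mktemp -d is assumed to exist.

```shell
# Hypothetical scratch layout with a space in a directory name.
base=$(mktemp -d) || exit 1
mkdir "$base/my music"
f="$base/my music/song.mp3"
: > "$f"

# Inner quotes protect $f; outer quotes protect the output of dirname.
landed_in=$(cd "$(dirname "$f")" && basename "$PWD")

echo "changed into: $landed_in"
rm -rf -- "$base"
```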
6. [ "$foo" = bar && "$bar" = foo ]
You can't use && inside the old test (or [) command. The Bash parser sees && outside of [[ ]] or (( )) and breaks your command into two commands, before and after the &&. Use one of these instead:
[ bar = "$foo" ] && [ foo = "$bar" ]   # Right!
[[ $foo = bar && $bar = foo ]]         # Also right!
[ bar = "$foo" -a foo = "$bar" ]       # Not portable.
(Note that we reversed the constant and the variable inside [ for the reasons discussed in pitfall #4.)
The same thing applies to ||. Use [[, or use -o, or use two [ commands.
The problem with [ A = B -a C = D ] is that POSIX does not specify the results of a test or [ command with more than 4 arguments. It probably works in most shells, but you can't count on it. You should use two test or [ commands with && between them instead, if you have to write for POSIX shells. If you have to write for Bourne, always use test instead of [.
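A sketch of the portable form wrapped in a function. both_match is a hypothetical name; the second call passes -o as data to show that the two-command form is not confused by it.

```shell
# both_match is a hypothetical helper that tests two values portably:
# two separate [ commands joined by the shell's && operator.
both_match() {
  [ bar = "$1" ] && [ foo = "$2" ]
}

if both_match bar foo; then
  echo "both values match"
fi

if ! both_match bar "-o"; then
  echo "second value differs"
fi
```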
7. [[ $foo > 7 ]]
The [[ ]] operator should not be used for an ArithmeticExpression. It should be used for strings only. If you want to do a numeric comparison using > or <, you should use (( )) instead:
((foo > 7)) # Right!
If you use the > operator inside [[ ]], it's treated as a string comparison, not an integer comparison. This may work sometimes, but it will fail when you least expect it. If you use > inside [ ], it's even worse: it's an output redirection. You'll get a file named 7 in your directory, and the test will succeed as long as $foo is not empty.
If you're developing for a BourneShell instead of bash, this is the historically correct version:
test $foo -gt 7 # Also right!
Note that the test ... -gt command will fail in interesting ways if $foo is not an integer. Therefore, there's not much point in quoting it properly -- if it's got white space, or is empty, or is anything other than an integer, we're probably going to crash anyway. You'll need to sanitize your input aggressively.
The double brackets support this syntax too:
[[ $foo -gt 7 ]] # Also right!
But why use that when you could use ((...)) instead?
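The difference between the string and the integer comparison shows up as soon as the numbers have a different number of digits. A minimal Bash sketch (the variable names are made up):

```shell
foo=10

# String comparison: "10" sorts before "7", so this test is false.
if [[ $foo > 7 ]]; then string_result=true; else string_result=false; fi

# Arithmetic comparison: 10 is greater than 7, so this test is true.
if ((foo > 7)); then numeric_result=true; else numeric_result=false; fi

echo "string: $string_result, numeric: $numeric_result"
```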
8. grep foo bar | while read line; do ((count++)); done
The code above looks OK at first glance, doesn't it? Sure, it's just a poor implementation of grep -c, but it's intended as a simplistic example. So why doesn't it work? The variable count will be unchanged after the loop terminates, much to the surprise of Bash developers everywhere.
The reason this code does not work as expected is because each command in a pipeline is executed in a separate SubShell. The changes to the count variable within the loop's subshell aren't reflected within the parent shell (the script in which the code occurs).
For solutions to this, please see Bash FAQ #24.
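To see the effect for yourself, here is a rough sketch with made-up input. It assumes Bash's default settings (no lastpipe). The second loop shows just one possible workaround, feeding the loop from a here-document instead of a pipeline; the FAQ covers the options properly.

```shell
# Loop fed by a pipeline: with Bash's default settings it runs in a subshell.
piped_count=0
printf 'foo 1\nfoo 2\nfoo 3\n' | while read -r line; do
  piped_count=$((piped_count + 1))
done

# Loop fed by a redirection from a here-document: it runs in the current shell.
redirected_count=0
while read -r line; do
  redirected_count=$((redirected_count + 1))
done <<EOF
foo 1
foo 2
foo 3
EOF

echo "piped: $piped_count, redirected: $redirected_count"
```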
9. if [grep foo myfile]
Many people are confused by the common practice of using the [ command after an if. They see this and convince themselves that the [ is part of the if statement's syntax, just like parentheses are used in C's if statement.
However, that is not the case! [ is a command, not a syntax marker for the if statement. It's equivalent to the test command, except for the requirement that the final argument must be a ].
The syntax of the if statement is as follows:
if COMMANDS
then COMMANDS
elif COMMANDS     # optional
then COMMANDS
else              # optional
  COMMANDS
fi                # required
There may be zero or more optional elif sections, and one optional else section. Note: there is no [ in the syntax!
Once again, [ is a command. It takes arguments, and it produces an exit code. It may produce error messages. It does not, however, produce any standard output.
The if statement evaluates the first set of COMMANDS that are given to it (up until then, as the first word of a new command). The exit code of the last command from that set determines whether the if statement will execute the COMMANDS that are in the then section, or move on.
If you want to make a decision based on the output of a grep command, you do not need to enclose it in parentheses, brackets, backticks, or any other syntax mark-up! Just use grep as the COMMANDS after the if, like this:
if grep foo myfile >/dev/null; then
...
fi
Note that we discard the standard output of the grep (which would normally include the matching line, if any), because we don't want to see it -- we just want to know whether it's there. If the grep matches a line from myfile, then the exit code will be 0 (true), and the then part will be executed. Otherwise, if there is no matching line, the grep should return a non-zero exit code.
Recent versions of grep also accept the -q (quiet) option, which suppresses standard output.
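A small sketch of grep used directly as the condition, with -q standing in for the redirection. The file contents and the variable names are invented for the demo, and mktemp is assumed to be available.

```shell
# Hypothetical input file in a scratch location.
myfile=$(mktemp) || exit 1
printf 'nothing here\nfoo is on this line\n' > "$myfile"

# grep's exit status alone decides which branch runs.
if grep -q foo "$myfile"; then
  found=yes
else
  found=no
fi

if grep -q nomatch "$myfile"; then
  missing=yes
else
  missing=no
fi

echo "foo found: $found; nomatch found: $missing"
rm -f -- "$myfile"
```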
10. if [bar="$foo"]
As we explained in the previous example, [ is a command. Just like with any other command, Bash expects the command to be followed by a space, then the first argument, then another space, etc. You can't just run things all together without putting the spaces in! Here is the correct way:
if [ bar = "$foo" ]
Each of bar, =, "$foo" (after substitution, but without WordSplitting) and ] is a separate argument to the [ command. There must be whitespace between each pair of arguments, so the shell knows where each argument begins and ends.
11. if [ [ a = b ] && [ c = d ] ]
Here we go again. [ is a command. It is not a syntactic marker that sits between if and some sort of C-like "condition". Nor is it used for grouping. You cannot take C-like if commands and translate them into Bash commands just by replacing parentheses with square brackets!
If you want to express a compound conditional, do this:
if [ a = b ] && [ c = d ]
Note that here we have two commands after the if, joined by an && operator (see the documentation if you don't know what that does). It's precisely the same as:
if test a = b && test c = d
If the first test command returns false, the body of the if statement is not entered. If it returns true, then the second test command is run; and if that one also returns true, then the body of the if statement will be entered.
12. cat file | sed s/foo/bar/ > file
You cannot read from a file and write to it in the same pipeline. Depending on what your pipeline does, the file may be clobbered (to 0 bytes, or possibly to a number of bytes equal to the size of your operating system's pipeline buffer), or it may grow until it fills the available disk space, or reaches your operating system's file size limitation, or your quota, etc.
If you want to make a change to a file, other than appending to the end of it, there must be a temporary file created at some point. For example, the following is completely portable:
sed 's/foo/bar/g' file > tmpfile && mv tmpfile file
The following will only work on GNU sed 4.x:
sed -i 's/foo/bar/g' file(s)
Note that this also creates a temporary file, and does the same sort of renaming trickery -- it just handles it transparently.
And the following equivalent command requires perl 5.x (which is probably more widely available than GNU sed 4.x):
perl -pi -e 's/foo/bar/g' file(s)
For more details, please see Bash FAQ #21.
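The temporary-file approach can be sketched end to end like this. The scratch directory and file contents are made up, and the sketch assumes a mktemp that accepts a template argument. Creating the temporary file in the same directory as the target keeps the mv on a single filesystem.

```shell
# Hypothetical scratch directory and input file.
workdir=$(mktemp -d) || exit 1
target="$workdir/file"
printf 'foo one\nfoo two\n' > "$target"

# Write the edited copy to a temporary file, then replace the original
# only if sed succeeded.
tmpfile=$(mktemp "$workdir/tmp.XXXXXX") || exit 1
sed 's/foo/bar/g' "$target" > "$tmpfile" && mv -- "$tmpfile" "$target"

result=$(cat "$target")
echo "$result"
rm -rf -- "$workdir"
```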
13. echo $foo
This relatively innocent-looking command causes massive confusion. Because the $foo isn't quoted, it will not only be subject to WordSplitting, but also file globbing. This misleads Bash programmers into thinking their variables contain the wrong values, when in fact the variables are OK -- it's just the word splitting or filename expansion that's messing up their view of what's happening.
MSG="Please enter a file name of the form *.zip"
echo $MSG
This message is split into words and any globs are expanded, such as the *.zip. What will your users think when they see this message:
Please enter a file name of the form freenfss.zip lw35nfss.zip
To demonstrate:
VAR=*.zip       # VAR contains an asterisk, a period, and the word "zip"
echo "$VAR"     # writes *.zip
echo $VAR       # writes the list of files which end with .zip
In fact, the echo command cannot be used with absolute safety here. If the variable contains -n for example, echo will consider that an option, rather than data to be printed. The only absolutely sure way to print the value of a variable is using printf:
printf "%s\n" "$foo"
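A short sketch of printf handling the two troublesome values from this section. The variable names printed_option and printed_msg are made up for the example.

```shell
foo='-n'
msg='Please enter a file name of the form *.zip'

printed_option=$(printf '%s\n' "$foo")   # -n is data here, not an option
printed_msg=$(printf '%s\n' "$msg")      # quoted: no splitting, no globbing

printf '%s\n' "$printed_option" "$printed_msg"
```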
14. $foo=bar
No, you don't assign a variable by putting a $ in front of the variable name. This isn't perl.
15. foo = bar
No, you can't put spaces around the = when assigning to a variable. This isn't C. When you write foo = bar the shell splits it into three words. The first word, foo, is taken as the command name. The second and third become the arguments to that command.
Likewise, the following are also wrong:
foo= bar    # WRONG!
foo =bar    # WRONG!
$foo = bar; # COMPLETELY WRONG!
foo=bar     # Right.
16. echo <<EOF
A here document is a useful tool for embedding large blocks of textual data in a script. It causes a redirection of the lines of text in the script to the standard input of a command. Unfortunately, echo is not a command which reads from stdin.
# This is wrong:
echo <<EOF
Hello world
EOF

# This is right:
cat <<EOF
Hello world
EOF

# OR by using plain echo. This is very efficient in Bash,
# because echo is a built-in command
echo "\
Hello world
"
17. su -c 'some command'
This syntax is almost correct. The problem is, on many platforms, su takes a -c argument, but it's not the one you want. For example, on OpenBSD:
$ su -c 'echo hello'
su: only the superuser may specify a login class
You want to pass -c 'some command' to a shell, which means you need a username before the -c.
su root -c 'some command' # Now it's right.
su assumes a username of root when you omit one, but this falls on its face when you want to pass a command to the shell afterward. You must supply the username in this case.
18. cd /foo; bar
If you don't check for errors from the cd command, you might end up executing bar in the wrong place. This could be a major disaster, if for example bar happens to be rm *.
You must always check for errors from a cd command. The simplest way to do that is:
cd /foo && bar
If there's more than just one command after the cd, you might prefer this:
cd /foo || exit 1
bar
baz
bat ... # Lots of commands.
cd will report the failure to change directories, with a stderr message such as "bash: cd: /foo: No such file or directory". If you want to add your own message in stdout, however, you could use command grouping:
cd /net || { echo "Can't read /net. Make sure you've logged in to the Samba network, and try again."; exit 1; }
do_stuff
more_stuff
Note there's a required space between "{" and "echo", and a required ";" before the closing "}".
Some people also like to enable set -e to make their scripts abort on any command that returns non-zero, but this can be rather tricky to use correctly (since many common commands may return a non-zero for a warning condition, which you may not want to treat as fatal).
By the way, if you're changing directories a lot in a Bash script, be sure to read the Bash manual page on pushd, popd, and dirs. Perhaps all that code you wrote to manage cd's and pwd's is completely unnecessary.
Speaking of which, compare this:
find ... -type d | while read subdir; do
  cd "$subdir" && whatever && ... && cd -
done
With this:
find ... -type d | while read subdir; do
  (cd "$subdir" && whatever && ...)
done
Forcing a SubShell here causes the cd to occur only in the subshell; for the next iteration of the loop, we're back to our normal location, regardless of whether the cd succeeded or failed. We don't have to change back manually. In fact, the penultimate example isn't even valid -- if one of the whatever commands fails, we might not cd back where we need to be. To correct it without using the subshell, we'd have to arrange to execute some sort of cd "$ORIGINAL_DIR" command within each loop iteration. It would be frightfully messy.
The subshell version is much simpler and cleaner.
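Here is a runnable sketch of the subshell form. It uses a glob over a made-up scratch directory instead of find, purely to keep the demo self-contained; the directory names and the marker file are assumptions, as is the availability of mktemp -d.

```shell
start_dir=$PWD
scratch=$(mktemp -d) || exit 1
mkdir "$scratch/sub one" "$scratch/sub two"

markers=0
for subdir in "$scratch"/*/; do
  # The cd happens inside ( ), so the parent shell never moves.
  ( cd "$subdir" && : > marker )
  [ -e "${subdir}marker" ] && markers=$((markers + 1))
done

end_dir=$PWD
echo "created $markers marker files; still in $end_dir"
rm -rf -- "$scratch"
```

The parent shell ends up in the directory it started in, with no cd - or saved path needed.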
19. [ bar == "$foo" ]
The == operator is not valid for the POSIX [ command. Bash's builtin [ happens to accept it, but other shells (dash, for example) do not. Use = instead, or use the [[ keyword instead.
[ bar = "$foo" ] && echo yes
[[ bar == $foo ]] && echo yes
20. for i in {1..10}; do ./something &; done
You cannot put a ; immediately after an &. Just remove the extraneous ; entirely.
for i in {1..10}; do ./something & done
Or:
for i in {1..10}; do
  ./something &
done
& already functions as a command terminator, just like ; does. And you cannot mix the two.
In general, a ; can be replaced by a newline, but not all newlines can be replaced by ;.
21. cmd1 && cmd2 || cmd3
Some people like to use && and || as a shortcut syntax for if ... then ... else ... fi. In many cases, this is perfectly safe:
[[ -s $errorlog ]] && echo "Uh oh, there were some errors." || echo "Successful."
However, this construct is not completely equivalent to if ... fi in the general case, because the command that comes after the && also generates an exit status. And if that exit status isn't "true" (0), then the command that comes after the || will also be invoked. For example:
i=0
true && ((i++)) || ((i--))
echo $i # Prints 0
What happened here? It looks like i should be 1, but it ends up 0. Why? Because both the i++ and the i-- were executed. The ((i++)) command has an exit status, and that exit status is derived from a C-like evaluation of the expression inside the parentheses. That expression's value happens to be 0 (the initial value of i), and in C, an expression with an integer value of 0 is considered false. So ((i++)) (when i is 0) has an exit status of 1 (false), and therefore the ((i--)) command is executed as well.
This does not occur if we use the pre-increment operator, since the exit status from ++i is true:
i=0
true && (( ++i )) || (( --i ))
echo $i # Prints 1
But that's missing the point of the example. It just happens to work by coincidence, and you cannot rely on x && y || z if y has any chance of failure! (This example fails if we initialize i to -1 instead of 0.)
If you need safety, or if you simply aren't sure how this works, or if anything in the preceding paragraphs wasn't completely clear, please just use the simple if ... fi syntax in your programs.
i=0
if true; then
  ((i++))
else
  ((i--))
fi
echo $i # Prints 1
This section also applies to the Bourne shell. Here is code that illustrates it:
true && { echo true; false; } || { echo false; true; }
The output is two lines, "true" and "false", instead of the single line "true".
22. On UTF-8 and Byte-Order Marks (BOM)
In general: Unix UTF-8 text does not use BOM. The encoding of plain text is determined by the locale or by mime types or other metadata. While the presence of a BOM would not normally damage a UTF-8 document meant only for reading by humans, it is problematic (often syntactically illegal) in any text file meant to be interpreted by automated processes such as scripts, source code, configuration files, and so on. Files starting with BOM should be considered equally foreign as those with MS-DOS linebreaks.
In shell scripting: 'Where UTF-8 is used transparently in 8-bit environments, the use of a BOM will interfere with any protocol or file format that expects specific ASCII characters at the beginning, such as the use of "#!" of at the beginning of Unix shell scripts.' http://unicode.org/faq/utf_bom.html#bom5
23. echo "Hello World!"
The problem here is that, in an interactive Bash shell, you'll see an error like:
bash: !": event not found
This is because, in the default settings for an interactive shell, Bash performs csh-style history expansion using the exclamation point. This is not a problem in shell scripts; only in interactive shells.
Unfortunately, the obvious attempt to "fix" this won't work:
$ echo "hi\!"
hi\!
The easiest solution is unsetting the histexpand option: this can be done with set +H or set +o histexpand
Question: Why is playing with histexpand more appropriate than single quotes?
I personally ran into this issue when I was manipulating song files, using commands like
mp3info -t "Don't Let It Show" ...
mp3info -t "Ah! Leah!" ...
Using single quotes is extremely inconvenient because of all the songs with apostrophes in their titles. Using double quotes ran into the history expansion issue. (And imagine a file that has both in its name. The quoting would be atrocious.) Since I never actually use history expansion, my personal preference was to turn it off in ~/.bashrc. -- GreyCat
These solutions will work:
echo 'Hello World!'
or
set +H
echo "Hello World!"
Many people simply choose to put set +H or set +o histexpand in their ~/.bashrc to deactivate history expansion permanently. This is a personal preference, though, and you should choose whatever works best for you.
24. for arg in $*
Bash (like all Bourne shells) has a special syntax for referring to the list of positional parameters one at a time, and $* isn't it. Neither is $@. Both of those expand to the list of words in your script's parameters, not to each parameter as a separate word.
The correct syntax is:
for arg in "$@"
# Or simply:
for arg
Since looping over the positional parameters is such a common thing to do in scripts, for arg defaults to for arg in "$@". The double-quoted "$@" is special magic that causes each parameter to be used as a single word (or a single loop iteration). It's what you should be using at least 99% of the time.
Here's an example:
# Incorrect version
for x in $*; do
  echo "parameter: '$x'"
done

$ ./myscript 'arg 1' arg2 arg3
parameter: 'arg'
parameter: '1'
parameter: 'arg2'
parameter: 'arg3'
It should have been written:
# Correct version
for x in "$@"; do
  echo "parameter: '$x'"
done

$ ./myscript 'arg 1' arg2 arg3
parameter: 'arg 1'
parameter: 'arg2'
parameter: 'arg3'
25. function foo()
This works in some shells, but not in others. You should never combine the keyword function with the parentheses () when defining a function.
Bash (at least some versions) will allow you to mix the two. No other shell will, as far as I know. Some shells will accept function foo, but for maximum portability, you should always use:
foo() {
  ...
}
26. echo "~"
Tilde expansion only applies when '~' is unquoted. In this example echo writes '~' to stdout, rather than the path of the user's home directory.
When a quoted path is expressed relative to a user's home directory, use $HOME rather than '~'. For instance, consider the situation where $HOME is "/home/my photos".
"~/dir with spaces"       # expands to "~/dir with spaces"
~"/dir with spaces"       # expands to "~/dir with spaces"
~/"dir with spaces"       # expands to "/home/my photos/dir with spaces"
"$HOME/dir with spaces"   # expands to "/home/my photos/dir with spaces"
27. local varname=$(command)
When declaring a local variable in a function, the local acts as a command in its own right. This can sometimes interact oddly with the rest of the line -- for example, if you wanted to capture the exit status ($?) of the CommandSubstitution, you can't do it. local's exit status masks it.
It's best to use separate commands for this:
local varname
varname=$(command)
rc=$?
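The masking can be observed with a command that is guaranteed to fail. In this sketch fails, masked and separate are hypothetical function names. The second function captures the status with || rc=$? rather than on a following line, so that it also behaves under set -e; that is a choice made for this demo, not a requirement.

```shell
# fails is a hypothetical command that prints output and returns status 3.
fails() { echo partial; return 3; }

masked() {
  local out=$(fails)
  echo "$?"               # this is the exit status of local itself
}

separate() {
  local out rc=0
  out=$(fails) || rc=$?   # a plain assignment passes on the status of fails
  echo "$rc"
}

masked_rc=$(masked)
separate_rc=$(separate)
echo "masked: $masked_rc, separate: $separate_rc"
```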
28. sed 's/$foo/good bye/'
In single quotes, bash parameter expansions like $foo do not get expanded. That is the purpose of single quotes, to protect characters like $ from the shell.
Change the quotes to double quotes:
foo="hello"; sed "s/$foo/good bye/"
But keep in mind, if you use double quotes you might need to use more escapes. See the Quotes page.
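A side-by-side sketch of the two quoting styles, using a made-up input line. In the single-quoted version sed receives the literal text $foo as its pattern, which does not occur in the input, so the line passes through unchanged.

```shell
foo="hello"
input="hello world"

single=$(printf '%s\n' "$input" | sed 's/$foo/good bye/')   # $foo stays literal
double=$(printf '%s\n' "$input" | sed "s/$foo/good bye/")   # $foo becomes hello

printf 'single quotes: %s\ndouble quotes: %s\n' "$single" "$double"
```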
29. tr [A-Z] [a-z]
There are (at least) three things wrong here. The first problem is that [A-Z] and [a-z] are seen as globs by the shell. If you don't have any single-lettered filenames in your current directory, it'll seem like the command is correct; but if you do, things will go wrong. Probably at 0300 hours on a weekend.
The second problem is that this is not really the correct notation for tr. What this actually does is translate '[' into '['; anything in the range A-Z into a-z; and ']' into ']'. So you don't even need those brackets, and the first problem goes away.
The third problem is that depending on the locale, A-Z or a-z may not give you the 26 ASCII characters you were expecting. In fact, in some locales z is in the middle of the alphabet! The solution to this depends on what you want to happen:
# Use this if you want to change the case of the 26 latin letters
LC_COLLATE=C tr A-Z a-z

# Use this if you want the case conversion to depend upon the locale, which might be more like what a user is expecting
tr '[:upper:]' '[:lower:]'
The quotes are required on the second command, to avoid globbing.
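Both forms can be tried on a plain ASCII string, where they should agree. The input text and variable names in this sketch are made up; the interesting differences only appear with non-ASCII input or unusual locales.

```shell
input='Hello World'

# Locale-independent conversion of the 26 ASCII letters.
ascii_lowered=$(printf '%s\n' "$input" | LC_COLLATE=C tr A-Z a-z)

# Locale-aware conversion using character classes (quoted to avoid globbing).
class_lowered=$(printf '%s\n' "$input" | tr '[:upper:]' '[:lower:]')

printf '%s\n' "$ascii_lowered" "$class_lowered"
```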
30. ps ax | grep gedit
The fundamental problem here is that the name of a running process is inherently unreliable. There could be more than one legitimate gedit process. There could be something else disguising itself as gedit (changing the reported name of an executed command is trivial). For real answers to this, see ProcessManagement.
The following is the quick and dirty stuff.
Searching for the PID of (for example) gedit, many people start with
$ ps ax | grep gedit
10530 ?        S      6:23 gedit
32118 pts/0    R+     0:00 grep gedit
which, depending on a RaceCondition, often yields grep itself as a result. To filter grep out:
ps ax | grep -v grep | grep gedit # will work, but ugly
On GNU/Linux, the -C option can be used instead to filter by command name:
$ ps -C gedit
  PID TTY          TIME CMD
10530 ?        00:06:23 gedit
But why bother when you could just use pgrep instead?
$ pgrep gedit
10530
Now in a second step the PID is often extracted by awk or cut:
$ ps -C gedit | awk '{print $1}' | tail -n1
but even that can be handled by some of the trillions of parameters for ps:
$ ps -C gedit -opid=
10530
If you're stuck in 1992 and aren't using pgrep, you could use the ancient, obsolete, deprecated pidof (GNU/Linux only) instead:
$ pidof gedit
10530
and if you need the PID to kill the process, pkill might be interesting for you. Note however that, for example, pgrep/pkill ssh would also find processes named sshd, and you wouldn't want to kill those.
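One way to reduce the ssh/sshd kind of accident is the -x option of pgrep and pkill, which requires the process name to match the pattern exactly. The sketch below assumes a procps-style pgrep is installed and uses a throwaway `sleep` process as the target; it does nothing about the deeper unreliability of process names:

```shell
# Start a throwaway background process to search for (sleep is only a stand-in).
sleep 300 &
target_pid=$!

# Give the child a moment to exec, so that its name is "sleep" and not the shell's.
sleep 1

# -x: the name must match exactly, so a pattern of "ssh" would not match "sshd".
if pgrep -x sleep | grep -qx "$target_pid"; then
    found=yes
else
    found=no
fi
echo "exact-name match found our process: $found"

# We already know the PID from $!, so there is no need to search for it to clean up.
kill "$target_pid"
```

Note that the script never needed pgrep to find its own child: `$!` already held the PID, which is usually the more reliable route.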
Unfortunately some programs aren't started with their name, for example firefox is often started as firefox-bin, which you would need to find out with - well - ps ax | grep firefox.
Please read ProcessManagement. Seriously.
31. printf "$foo"
This isn't wrong because of quotes, but because of a format string exploit. If $foo is not strictly under your control, then any \ or % characters in the variable may cause undesired behavior.
Always supply your own format string:
printf %s "$foo"
printf '%s\n' "$foo"
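A short sketch of what goes wrong, assuming a value that happens to contain a `%` conversion and a backslash escape:

```shell
foo='50%s off\n'

# Used as the format string: %s consumes a (missing) argument and \n becomes a newline.
unsafe=$(printf "$foo")

# Used as data: every character is printed exactly as stored.
safe=$(printf '%s' "$foo")

echo "unsafe: $unsafe"
printf 'safe:   %s\n' "$safe"
```

In the first call the `%s` silently disappears from the output; in the second the variable's contents survive intact.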
32. [[ -e "$broken_symlink" ]] returns 1 even though $broken_symlink exists
Test follows symlinks. Therefore, if a symlink is broken (i.e. it points to a file that doesn't exist), test -e returns 1 for it even though the symlink itself exists.
In order to work around it (and prepare against it) you should use:
[[ -e "$broken_symlink" || -L "$broken_symlink" ]]
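The behavior can be reproduced with a dangling link in a scratch directory. This sketch uses the `[` form of the test so it also runs in plain sh; the -e and -L operators are the same ones used with `[[` above, and the path names are invented for the example:

```shell
tmpdir=$(mktemp -d)
link="$tmpdir/dangling"
ln -s "$tmpdir/no-such-target" "$link"   # the target is never created

# -e alone follows the link and reports that nothing is there.
if [ -e "$link" ]; then plain=exists; else plain=missing; fi

# Adding -L also accepts the link itself.
if [ -e "$link" ] || [ -L "$link" ]; then combined=exists; else combined=missing; fi

echo "-e only: $plain"        # prints "-e only: missing"
echo "-e or -L: $combined"    # prints "-e or -L: exists"

rm -rf "$tmpdir"
```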
33. ed file <<<"g/d\{0,3\}/s//e/g" fails
The problem is that ed doesn't accept 0 as the minimum in \{0,3\}.
You can check that the following does work:
ed file <<<"g/d\{1,3\}/s//e/g"
Note that this happens even though POSIX states that BRE (which is the Regular Expression flavor used by ed) should accept 0 as the minimum number of occurrences (see section 5).
34. expr sub-string fails for "match"
This works reasonably well, most of the time:
word=abcde
expr "$word" : ".\(.*\)"
bcde
But it WILL fail for the word "match":
word=match
expr "$word" : ".\(.*\)"
The problem is that "match" is an expr keyword. The solution (GNU only) is to prefix the word with a '+':
word=match
expr + "$word" : ".\(.*\)"
atch
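Where the GNU-only '+' is not available, two other approaches may be worth considering. Neither comes from the text above: the first guards the operand with a literal prefix character (here `x`, chosen arbitrarily) so that expr can never mistake it for a keyword, and the second avoids expr entirely by using parameter expansion:

```shell
word=match

# Guard character: "xmatch" is not a keyword, and the leading x in the
# pattern consumes the guard again before the capture group.
via_prefix=$(expr "x$word" : 'x.\(.*\)')

# Parameter expansion: remove the shortest prefix matching a single character.
via_expansion=${word#?}

echo "$via_prefix"      # prints "atch"
echo "$via_expansion"   # prints "atch"
```

The parameter expansion form also avoids forking an external command.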