Differences between revisions 2 and 8 (spanning 6 versions)
Revision 2 as of 2010-03-17 19:25:18
Size: 4051
Editor: GreyCat
Comment: more stuff
Revision 8 as of 2012-03-08 22:02:51
Size: 8861
Editor: ormaaj
Comment:

In many languages, quotes mainly mark the enclosed text as a string datatype. In shell programming almost everything is a string, so quoting does much more: the different types of quotes primarily select different ways of interpreting their contents. Unfortunately, the rules are hard for beginners to learn, and the behavior varies by context, with several special cases and exceptions to remember. Also unfortunately, quoting in shell programming is extremely important, and no one can avoid learning it. Improper shell quoting is one of the most common sources of scripting bugs and security issues. Fortunately, it's possible to get it right most of the time by following a few guidelines. Do not guess about quoting, though: when in doubt, test, and read the man page to see how quotes are interpreted in a given context.

Types of quoting

  • Single quotes: (apostrophes) prevent all special interpretation of the characters between them. Everything between single quotes is literal and expands to a single word. The only thing which cannot be single-quoted is a single quote itself.

  • Double quotes: permit parameter expansions, arithmetic expansions and command substitutions to occur inside them. They do not allow filename expansions, brace expansions, process substitutions, tilde expansion, etc. In short, any substitution that starts with $ is evaluated, and so is the backtick (`) form of command substitution, which is kept for legacy compatibility. The most important function of double quotes is their effect on parameter expansions: they prevent the results from undergoing pathname expansion and word splitting and, depending on the type of parameter, affect the number of resulting words.

  • $'...' and $"...": Bash introduces two additional forms of quoting -- single and double quotes preceded by a $ sign. Of these, $'...' is the most common, and acts just like single quotes except that backslash-escaped combinations are expanded as specified by the ANSI C standard. This allows a convenient way to embed nonprintable characters into strings, or to pass them as arguments. The second form is $"..." and is used for localization support. Neither of these is specified or mentioned by POSIX, but at least the $'...' variety is widely supported by Bash, Ksh93, and relatives.
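
As a rough illustration of the three styles described above, the following bash sketch stores the same text with each kind of quote and prints the results. The variable names (name, single, double, ansi) are made up for this example.

```bash
# bash (the $'...' form is not available in every POSIX shell)
name='world'

single='hello $name\n'    # everything literal: $name and \n are kept as typed
double="hello $name\n"    # $name is expanded; \n stays a backslash and an n
ansi=$'hello $name\n'     # $name is literal; \n becomes a real newline

printf 'single: [%s]\n' "$single"
printf 'double: [%s]\n' "$double"
printf 'ansi:   [%s]\n' "$ansi"
```

Only the double-quoted form expands the parameter, and only the $'...' form turns \n into an actual newline character.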

Standard Quoting

A shell command is parsed by the shell into words, using whitespace (regardless of the IFS variable) and other shell metacharacters. The first function of quoting is to permit words to contain these metacharacters.

echo '&'

Without quotes, the & would put the echo command into the background. With quotes, the & is simply made into a word, and passed as an argument to the echo command instead.

The quotes are not actually passed along to the command. They are removed by the shell. In the example above, the echo command sees only the &, not the quotes.
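
One way to observe quote removal is to print exactly what a command receives. The sketch below assumes a helper named show_args, which is hypothetical and only exists for this illustration.

```bash
# show_args is a hypothetical helper: it prints how many arguments it
# received, then each argument wrapped in angle brackets.
show_args() {
    printf '%d args:' "$#"
    [ "$#" -gt 0 ] && printf ' <%s>' "$@"
    printf '\n'
}

# The quotes let metacharacters and whitespace be part of a word,
# but the quotes themselves never reach the command.
quoted=$(show_args '&' ';' 'a b')
echo "$quoted"
```

The helper sees three arguments, none of which contains a quote character.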

The second purpose of quoting is to prevent WordSplitting and globbing. The result of a double-quoted substitution does not undergo any further processing. (The result of an unquoted substitution does.)

cp -- "$filename" "$destination"

In this example, the double quotes protect the value of each parameter (variable) from undergoing word splitting or globbing should it happen to contain whitespace or wildcard characters (* or ? or [...]). Without the quotes, a filename like hot stuff.mp3 would be split into two words, and each word would be passed to the cp command as a separate argument. Or, a filename that contains * with whitespace around it would produce one word for every file in the current directory. That is not what we want.

With the quotes, every character in the value of the filename parameter is treated literally, and the whole value becomes the second argument to the cp command.

When in doubt, always double-quote your parameter expansions.
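
The difference can be checked by counting arguments. This is a minimal sketch: count_args is a hypothetical helper, and the temporary directory with three files is an assumption made only so that the glob has something to match.

```bash
# count_args is a hypothetical helper that reports how many words it was given.
count_args() { echo "$#"; }

filename='hot stuff.mp3'
split_unquoted=$(count_args $filename)      # word splitting: two words
split_quoted=$(count_args "$filename")      # one word

# Scratch directory (assumed for the demo) containing exactly three files.
tmpdir=$(mktemp -d) || exit 1
touch "$tmpdir/a.txt" "$tmpdir/b.txt" "$tmpdir/c.txt"

pattern='*'
glob_unquoted=$(cd "$tmpdir" && count_args $pattern)    # one word per file
glob_quoted=$(cd "$tmpdir" && count_args "$pattern")    # the literal *

rm -rf "$tmpdir"
echo "split: $split_unquoted vs $split_quoted, glob: $glob_unquoted vs $glob_quoted"
```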

You may concatenate the various types of quoting if you need to. For example, if you have one section of a string that has lots of special characters that you'd like to single-quote, and another section with a parameter expansion in it which must be double-quoted, you may mix them:

echo '!%$*&'"$foo"

Any number of quoted substrings, of any style, may be concatenated in this manner. The result (after appropriate expansions in the double-quoted sections) is a single word.
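
A small sketch of that claim, using set -- to capture the resulting words; the sample text and the variable names are invented for the illustration.

```bash
foo='two words'

# Single-quoted text, a double-quoted expansion and a backslash-escaped
# space, all adjacent: the shell joins them into one word.
set -- 'cost: $5 & '"$foo"\ done

word_count=$#
word=$1
echo "$word_count word: $word"
```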

Double-quoting $@ or ${array[@]} has a special meaning. "$@" expands to a list of words, with each positional parameter's value being one word. Likewise, "${array[@]}" expands to a list of words, one per array element. When dealing with the positional parameters or with the contents of an array as a list of words, always use the double-quoted syntax.

Double-quoting $* or ${array[*]} results in one word which is the concatenation of all the positional parameters (or array elements) with the first character of IFS between them. This is similar to the join function in some other languages.
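
The following sketch compares the two forms by counting the words each one produces. count_args is a hypothetical helper, and the positional parameters are set by hand for the demonstration.

```bash
# count_args is a hypothetical helper that reports how many words it was given.
count_args() { echo "$#"; }

set -- 'first arg' second 'third arg'

at_count=$(count_args "$@")        # one word per positional parameter
star_count=$(count_args "$*")      # everything joined into a single word
unquoted_count=$(count_args $@)    # unquoted: split again on whitespace

# IFS is changed only inside the command substitution's subshell.
joined=$(IFS=,; echo "$*")

echo "$at_count $star_count $unquoted_count $joined"
```

With the default IFS, the unquoted form splits the two parameters that contain spaces, which is why it should be avoided.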

Assorted examples, to show how things should be done. Some of these examples use bash/ksh syntax that won't work in strict POSIX shells.

for file in "$@"

# Never use  for file in $*  or  for file in $@

for element in "${array[@]}"

# bash 3.0 and higher
for index in "${!array[@]}"

# This works with both regular and associative arrays.  (Associative
# arrays require bash 4.0 or higher.)

# bash or ksh93
find_opts=('(' -iname '*.jpg' -o -iname '*.gif' -o -iname '*.png' ')')
find . "${find_opts[@]}" -print

echo 'Don'\''t walk!'

echo "The matching line is: $(grep foo "$filename")"

# Note that the quotes inside the $() command substitution are nested.
# This looks wrong to a C programmer, but it is correct in shells.

echo "The matching line is: $(grep foo "$filename")"
                                       ^---------^    inner layer (quotes)
                            ^^--------------------^   middle layer (command sub)
     ^---------------------------------------------^  outer layer (quotes)

# Without the inner quotes, the value of $filename would be word-split and globbed
# before being handed to grep.

# bash
ip=192.168.1.30
netmask=255.255.254.0
IFS=. read -ra ip_octets <<< "$ip"
IFS=. read -ra netmask_octets <<< "$netmask"
for i in 0 1 2 3; do
  ((ip_octets[i] &= netmask_octets[i]))
done
IFS=.; network="${ip_octets[*]}"; unset IFS

The third type of quote is the backtick (`) or back quote. It's a deprecated markup for command substitutions, used in Bourne shells before the introduction of the $(...) syntax. For a discussion of the difference between `...` and $(...) please see BashFAQ/082.
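
One practical difference shows up when substitutions are nested. This is only a small sketch with made-up variable names; BashFAQ/082 remains the place for the full discussion.

```bash
# With $(...), the inner substitution and its quotes need no escaping.
nested_new=$(echo "outer $(echo "inner")")

# With backticks, the inner backticks must be backslash-escaped.
nested_old=`echo "outer \`echo inner\`"`

echo "$nested_new / $nested_old"
```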

The $'...' quoting style is used to interpret backslash escapes.

IFS=$' \t\n'
# sets the IFS variable to the three-byte string containing
# a space, a tab, and a newline

echo $'It\'s also easier to escape apostrophes this way.'

# The following are all equivalent (each prints "hi" and "there"
# on separate lines):
echo $'hi\nthere'
printf '%s\n' hi there
# discouraged:
echo -e 'hi\nthere'
# ksh93:
print 'hi\nthere'
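
To confirm that $'...' really produces the characters rather than the escape sequences, the string lengths can be compared. A minimal bash sketch with illustrative variable names:

```bash
# bash
expanded=$'\t\n'      # a real tab followed by a real newline
literal='\t\n'        # four ordinary characters: \ t \ n

expanded_length=${#expanded}
literal_length=${#literal}

apostrophe=$'It\'s'   # \' yields an apostrophe inside $'...'

echo "$expanded_length $literal_length $apostrophe"
```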

Exceptions

One important situation where quotes have an additional effect is pattern matching contexts. Neither pathname expansion nor word splitting takes place there, and quote removal still occurs, yet the quotes are not inert: Bash uses them to suppress the "specialness" of patterns and regular expressions. Characters that would be interpreted specially in a pattern matching context are treated literally when quoted, whether or not the pattern is expanded from a parameter.

if [[ $path = foo* ]]; then
# unquoted foo* acts as a glob

if [[ $path = "foo*" ]]; then
# quoted "foo*" is a literal string

if [[ $path =~ $some_re ]]; then
# the contents of $some_re are treated as an ERE

if [[ $path =~ "$some_re" ]]; then
# the contents of $some_re are treated as a literal string
# despite the =~ operator
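
The four tests above can be run against a sample value to see which ones succeed. The value of path and the regular expression are assumptions chosen for this sketch.

```bash
# bash
path='foobar'
some_re='^foo.*r$'

[[ $path = foo* ]]        && glob_match=yes      || glob_match=no
[[ $path = "foo*" ]]      && literal_match=yes   || literal_match=no
[[ $path =~ $some_re ]]   && regex_match=yes     || regex_match=no
[[ $path =~ "$some_re" ]] && quoted_regex=yes    || quoted_regex=no

echo "$glob_match $literal_match $regex_match $quoted_regex"
```

The quoted forms fail here because foobar is neither the literal string foo* nor a string containing the literal text ^foo.*r$.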

# Bash 4
# the "quoted" branch is taken.

g=a*[b]

case $g in
    "$g")
        echo 'quoted pattern'
        ;;&
    $g)
        echo 'unquoted pattern'
        ;;
esac

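See also:

  • http://wiki.bash-hackers.org/syntax/quoting
  • http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_02
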
CategoryShell

Quotes (last edited 2024-03-07 22:57:49 by emanuele6)