Differences between revisions 12 and 13
Revision 12 as of 2009-03-04 14:44:19
Size: 5372
Editor: NeilMoore
Comment: gramer iz gud
Revision 13 as of 2009-03-04 18:54:46
Size: 5930
Editor: GreyCat
Comment: move parsing note to the bottom and expand on it; fix some leftover }}}; link to WordSplitting; link to case in the guide; link to RegularExpression

"Glob" is the common name for a set of Bash features that match or expand specific types of patterns. Some synonyms for globbing (depending on the context in which it appears) are pattern matching, pattern expansion, filename expansion, and so on. A glob may look like *.txt and, when used to match filenames, is sometimes called a "wildcard".

Traditional shell globs use a very simple syntax, which is less expressive than a regular expression. Most characters in a glob are treated literally, but a * matches 0 or more characters, a ? matches precisely one character, and [...] matches any single character in a specified set (see the previous reference for details). All globs are implicitly anchored at both start and end. For example:

*           Matches any string, of any length
foo*        Matches any string beginning with foo
*x*         Matches any string containing an x (beginning, middle or end)
*.tar.gz    Matches any string ending with .tar.gz
*.[ch]      Matches any string ending with .c or .h
foo?        Matches foot or foo$ but not fools
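
One way to check the anchoring rule is to test strings against a pattern with Bash's [[ command. This is only a minimal sketch, assuming Bash; glob_match is a hypothetical helper name, not something defined elsewhere on this page.

```shell
# glob_match STRING PATTERN (hypothetical helper): succeeds when the whole
# of STRING matches the glob PATTERN.
glob_match() {
    # The right-hand side is left unquoted on purpose, so that bash treats
    # the expanded value as a pattern rather than as a literal string.
    [[ $1 = $2 ]]
}

glob_match 'foot'  'foo?' && echo 'foot matches foo?'
glob_match 'fools' 'foo?' || echo 'fools does not match foo?'
glob_match 'box'   '*x*'  && echo 'box matches *x*'
glob_match 'box'   'x'    || echo 'box does not match x (anchored at both ends)'
```

All four messages are printed: a bare x only matches the one-character string x, because the pattern has to cover the string from start to end.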

Bash expands globs that appear unquoted in commands by matching them against filenames relative to the current directory. The expansion of a glob results in 1 or more words (0 or more if certain options, such as nullglob, are set), and those words (filenames) are used in the command. By default, a glob that matches nothing is left in place as a single literal word. For example:

tar xvf *.tar
# Expands to: tar xvf file1.tar file2.tar file42.tar ...
# (which is generally not what one wants)
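
How many words a glob produces depends on whether anything matches. The sketch below assumes Bash and uses its nullglob option as one example of the options mentioned above; count_words and count_words_nullglob are hypothetical helper names, and the directory is a scratch directory created with mktemp.

```shell
# Count the words that *.tar expands to in directory $1, default settings.
# The function body is a subshell, so the cd does not affect the caller.
count_words() (
    cd "$1" || exit 1
    files=(*.tar)       # with no match, the pattern stays as one literal word
    echo "${#files[@]}"
)

# Same count, but with nullglob set: an unmatched glob expands to zero words.
count_words_nullglob() (
    cd "$1" || exit 1
    shopt -s nullglob
    files=(*.tar)
    echo "${#files[@]}"
)

scratch=$(mktemp -d)
echo "empty dir, default:  $(count_words "$scratch")"
echo "empty dir, nullglob: $(count_words_nullglob "$scratch")"
touch "$scratch/a.tar" "$scratch/b c.tar"
echo "two files, default:  $(count_words "$scratch")"
rm -rf "$scratch"
```

The three counts are 1, 0 and 2: in the empty directory the default settings leave the literal word *.tar behind, while nullglob removes it.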

Even if a filename contains internal whitespace, the expansion of a glob that matches that file will still preserve each filename as a single word. For example,

# This is safe even if a filename contains whitespace:
for f in *.tar; do
    tar tvf "$f"
done

# But this one is not:
for f in $(ls | grep '\.tar$'); do
    tar tvf "$f"
done

In the second example above, the output of ls is filtered, and then the result of the whole pipeline is divided into words, to serve as iterative values for the loop. This word-splitting will occur at internal whitespace within each filename, which makes it useless in the general case. The first example has no such problem, because the filenames produced by the glob do not undergo any further word-splitting. For more such examples, see BashPitfalls.
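
The difference can be made visible by counting loop iterations. This is a sketch under stated assumptions: Bash, a default IFS, and a scratch directory from mktemp; count_with_glob and count_with_ls are hypothetical helper names.

```shell
# Number of loop iterations when iterating over a glob in directory $1.
count_with_glob() (
    cd "$1" || exit 1
    n=0
    for f in *.tar; do n=$((n + 1)); done
    echo "$n"
)

# Number of loop iterations when iterating over word-split ls output.
count_with_ls() (
    cd "$1" || exit 1
    n=0
    for f in $(ls | grep '\.tar$'); do n=$((n + 1)); done
    echo "$n"
)

scratch=$(mktemp -d)
touch "$scratch/my backup.tar" "$scratch/plain.tar"
echo "glob loop:    $(count_with_glob "$scratch") iterations"
echo "ls|grep loop: $(count_with_ls "$scratch") iterations"
rm -rf "$scratch"
```

With these two files the glob loop runs twice, but the ls loop runs three times, because "my backup.tar" is split into the words my and backup.tar.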

Globs are also used to match patterns in a few places in Bash. The most traditional is in the case command:

case "$input" in
    [Yy]|'') confirm=1;;
    [Nn]*) confirm=0;;
    *) echo "I don't understand.  Please try again.";;
esac

Patterns (which are separated by | characters) are matched against the word that follows the case keyword. The first pattern that matches "wins", and its corresponding commands are executed.
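
The same patterns can be wrapped in a function to see which branch a given input selects. This is a minimal sketch, assuming Bash; classify_answer is a hypothetical name, and it prints a label instead of setting confirm.

```shell
# classify_answer INPUT (hypothetical helper): print which branch matched.
classify_answer() {
    case "$1" in
        [Yy]|'') echo yes ;;      # exactly one y/Y, or the empty string
        [Nn]*)   echo no ;;       # anything beginning with n or N
        *)       echo unknown ;;  # everything else
    esac
}

for answer in Y '' nope yes; do
    printf '%-6s -> %s\n' "'$answer'" "$(classify_answer "$answer")"
done
```

Y and the empty string give yes, and nope gives no. The input yes falls through to unknown: [Yy] is anchored at both ends, so it only matches a single character.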

Bash also allows globs to appear on the right-hand side of a comparison inside a [[ command:

if [[ $output = *[Ee]rror* ]]; then ...
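
Quoting matters on the right-hand side of [[: an unquoted pattern is treated as a glob, while a quoted one is compared literally. The sketch below assumes Bash; has_error and is_star are hypothetical helper names used only for illustration.

```shell
# Unquoted right-hand side: treated as a glob.
has_error() { [[ $1 = *[Ee]rror* ]]; }

# Quoted right-hand side: compared literally, so only the
# one-character string "*" itself matches.
is_star() { [[ $1 = "*" ]]; }

has_error 'fatal Error: disk full' && echo 'error found'
has_error 'all good'               || echo 'no error'
is_star 'anything'                 || echo 'quoted * is not a wildcard'
is_star '*'                        && echo 'literal * matches'
```

All four messages are printed.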

Finally, globs are used during parameter expansion to indicate patterns which may be stripped out, or replaced, during a substitution. Simple examples (there are many more on the previously referenced page):

filename=${path##*/}    # strip leading pattern that matches */ (be greedy)
dirname=${path%/*}      # strip trailing pattern matching /* (non-greedy)

IFS=$'\n'; echo "${arr[*]}"          # dump an array, one element per line
IFS=$'\n'; echo "${arr[*]/error*/}"  # dump array, removing error* if matched
unset IFS
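
To see what each operator strips, it helps to run them on a concrete value. The sketch below assumes Bash; the path is a made-up example, and join_without_errors is a hypothetical helper that keeps the IFS change local to the function instead of unsetting IFS afterwards.

```shell
path=/usr/local/lib/libfoo.so.1

echo "${path##*/}"    # longest leading match of */ removed
echo "${path#*/}"     # shortest leading match of */ removed
echo "${path%/*}"     # shortest trailing match of /* removed
echo "${path%%/*}"    # longest trailing match of /* removed

# join_without_errors ELEMENT... (hypothetical helper): print the elements
# one per line, with anything from "error" onwards removed from each.
join_without_errors() {
    local IFS=$'\n'          # local, so the caller's IFS is left untouched
    local -a arr=("$@")
    printf '%s\n' "${arr[*]/error*/}"
}

join_without_errors 'ok: 1' 'error: disk full' 'ok: 2'
```

The four echo commands print libfoo.so.1, usr/local/lib/libfoo.so.1, /usr/local/lib and an empty line. The helper prints ok: 1, an empty line and ok: 2: the middle element becomes empty but is still joined into the output.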

In addition to the traditional globs (supported by all Bourne-family shells) that we've seen so far, Bash (and Korn Shell) offers extended globs, which have the expressive power of regular expressions. Korn Shell enables these by default; in Bash, you must run the command

shopt -s extglob

in your shell (or at the start of your script -- see note on parsing below) to use them. The pattern matching reference describes the syntax, which is reproduced here:

?(pattern-list)
Matches zero or one occurrence of the given patterns.
*(pattern-list)
Matches zero or more occurrences of the given patterns.
+(pattern-list)
Matches one or more occurrences of the given patterns.
@(pattern-list)
Matches one of the given patterns.
!(pattern-list)
Matches anything except one of the given patterns.

Patterns in a list are separated by | characters.
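
Each operator can be tried out against a few strings. This is a minimal sketch, assuming Bash with extglob enabled; ext_match is a hypothetical helper name, and the sample strings are made up for illustration.

```shell
shopt -s extglob    # on its own line, before any extended glob is parsed

# ext_match STRING PATTERN (hypothetical helper): succeeds when the whole
# of STRING matches PATTERN. The right-hand side is deliberately unquoted.
ext_match() { [[ $1 = $2 ]]; }

ext_match 'color'  'colo?(u)r'  && echo '?(u): zero occurrences are fine'
ext_match 'colour' 'colo?(u)r'  && echo '?(u): one occurrence is fine'
ext_match ''       '*(ab)'      && echo '*(ab): matches the empty string'
ext_match ''       '+(ab)'      || echo '+(ab): needs at least one ab'
ext_match 'ababab' '+(ab)'      && echo '+(ab): matches repeated ab'
ext_match 'dog'    '@(cat|dog)' && echo '@(cat|dog): exactly one alternative'
ext_match 'bird'   '!(cat|dog)' && echo '!(cat|dog): anything but cat or dog'
ext_match 'cat'    '!(cat|dog)' || echo '!(cat|dog): rejects cat'
```

Every line prints its message.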

Extended globs allow you to solve a number of problems which otherwise require a rather surprising amount of ugly hacking; for example,

# To remove all the files except ones matching *.jpg:
rm !(*.jpg)
# All except *.jpg and *.gif and *.png:
rm !(*.jpg|*.gif|*.png)

# To copy all the MP3 songs except one to your device:
cp !(04*).mp3 /mnt

# To trim leading and trailing whitespace from a variable:
x=${x##+([[:space:]])}; x=${x%%+([[:space:]])}
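
Two of these idioms can be wrapped in functions and tried without deleting anything. The sketch below assumes Bash; trim and list_non_images are hypothetical helper names, and the files live in a scratch directory created with mktemp.

```shell
shopt -s extglob    # must come before the functions below are parsed

# trim STRING (hypothetical helper): print STRING without leading or
# trailing whitespace.
trim() {
    local x=$1
    x=${x##+([[:space:]])}
    x=${x%%+([[:space:]])}
    printf '%s\n' "$x"
}

# list_non_images DIR (hypothetical helper): print every name in DIR that
# does not end in .jpg, .gif or .png. The body is a subshell, so neither
# the cd nor the nullglob setting leaks out to the caller.
list_non_images() (
    cd "$1" || exit 1
    shopt -s nullglob
    files=( !(*.jpg|*.gif|*.png) )
    (( ${#files[@]} )) && printf '%s\n' "${files[@]}"
)

trim $'  \t hello world \n'

scratch=$(mktemp -d)
touch "$scratch/a.jpg" "$scratch/b.gif" "$scratch/notes.txt"
list_non_images "$scratch"
rm -rf "$scratch"
```

This prints hello world, followed by notes.txt. Listing the matches first is a cheap way to preview what an rm with the same pattern would remove.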

Extended glob patterns can be nested, too.

[[ $fruit = @(ba*(na)|a+(p)le) ]] && echo 'Nice fruit'
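
To see which strings the nested pattern accepts, it can be run over a few candidates. This is a minimal sketch, assuming Bash; is_nice_fruit is a hypothetical helper name and the word list is made up.

```shell
shopt -s extglob

# is_nice_fruit WORD (hypothetical helper): succeeds when WORD is "ba"
# followed by any number of "na", or "a", one or more "p", then "le".
is_nice_fruit() { [[ $1 = @(ba*(na)|a+(p)le) ]]; }

for fruit in banana ba apple aple ale bananas; do
    if is_nice_fruit "$fruit"; then
        echo "$fruit: match"
    else
        echo "$fruit: no match"
    fi
done
```

banana, ba, apple and aple match. ale does not, because +(p) needs at least one p, and bananas does not, because the pattern is anchored at the end.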

Because the extglob option changes the way certain characters are parsed, it is necessary to have a newline (not just a semicolon) between the shopt command and any subsequent commands that use extended globs. Likewise, you cannot put shopt -s extglob inside a function that uses extended globs, because the function as a whole must be parsed when it's defined; the shopt command won't take effect until the function is called, at which point it's too late. Therefore, if you use this option in a script, it's best to put it right under the shebang line, or as close as you can get it while still making your boss happy.
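
The parsing issue can be reproduced by handing two versions of the same snippet to a child shell. This is a sketch under stated assumptions: a bash executable on PATH, and extglob not already enabled in the child's environment; run_snippet is a hypothetical helper name. A case pattern is used because it has to be parsed as an extended glob up front.

```shell
# run_snippet CODE (hypothetical helper): run CODE in a fresh bash and
# discard any error messages.
run_snippet() { bash -c "$1" 2>/dev/null; }

# shopt and the extended glob share one line, separated by a semicolon.
same_line='shopt -s extglob; case abc in !(x)) echo ok;; esac'

# shopt sits on its own line, so it runs before the next line is parsed.
separate_lines='shopt -s extglob
case abc in !(x)) echo ok;; esac'

if run_snippet "$same_line"; then
    echo 'same line: accepted'
else
    echo 'same line: rejected'
fi

if run_snippet "$separate_lines"; then
    echo 'separate lines: accepted'
else
    echo 'separate lines: rejected'
fi
```

The two snippets differ only in the separator between shopt and the case command, which isolates the effect of the newline.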


CategoryShell

glob (last edited 2022-02-17 13:06:05 by emanuele6)