Revision 19 as of 2025-04-19 15:51:39
How do I determine whether a variable contains a substring?
There are many choices here: you can perform an exact substring match, a glob-style pattern match, or a regular expression match.
To match exact substrings, POSIX sh uses case:

    # POSIX
    case $bigvar in
      *substr*) ... ;;
    esac

If the substring is in a variable, and if an exact substring match is wanted, then the substring variable should be quoted:

    # POSIX
    case $bigvar in
      *"$substr"*) ... ;;
    esac

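As a small sketch of the quoting rule above, the `case` test can be wrapped in a function. The name `contains` is a hypothetical helper chosen for illustration, not something defined by this FAQ; the code assumes any POSIX sh.

```shell
# contains STRING SUBSTRING
# Hypothetical helper: succeeds if SUBSTRING occurs literally in STRING.
contains() {
  case $1 in
    *"$2"*) return 0 ;;   # quoted: glob characters in $2 are taken literally
    *)      return 1 ;;
  esac
}

if contains 'hello world' 'lo wo'; then echo 'found'; fi

# A literal asterisk is searched for; it is not treated as a wildcard.
if ! contains 'hello world' '*'; then echo 'no literal asterisk'; fi
```

Because the substring is quoted inside the pattern, a search for `*` or `?` only succeeds when that character really appears in the string.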
In Bash, you may also use the [[...]] construct. It follows the same quoting semantics as case:

    # Bash
    if [[ $bigvar = *substr* ]]; then ...      # These are
    if [[ $bigvar == *substr* ]]; then ...     # equivalent.

    if [[ $bigvar = *"$substr"* ]]; then ...   # These are
    if [[ $bigvar == *"$substr"* ]]; then ...  # equivalent.

In both case and [[...]] you may also do glob-style pattern matching. Simply use unquoted glob characters in the pattern. If the pattern is in a variable, omit the double quotes, and it will be interpreted as a pattern instead of an exact substring.

    # POSIX
    case $filename in
      *.txt) ... ;;
    esac

    pattern='*.txt'
    case $filename in
      $pattern) ... ;;
    esac


    # Bash
    if [[ $filename = *.txt ]]; then ...
    if [[ $filename == *.txt ]]; then ...

    pattern='*.txt'
    if [[ $filename = $pattern ]]; then ...
    if [[ $filename == $pattern ]]; then ...

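To make the difference between a quoted and an unquoted pattern variable concrete, here is a minimal sketch in POSIX sh. The function names `matches_glob` and `matches_literal` are hypothetical and exist only for this illustration.

```shell
pattern='*.txt'

# Unquoted: $pattern is interpreted as a glob.
matches_glob() {
  case $1 in
    $pattern) return 0 ;;
    *)        return 1 ;;
  esac
}

# Quoted: "$pattern" must match character for character.
matches_literal() {
  case $1 in
    "$pattern") return 0 ;;
    *)          return 1 ;;
  esac
}

matches_glob    'notes.txt' && echo 'glob match'
matches_literal 'notes.txt' || echo 'no literal match'
matches_literal '*.txt'     && echo 'literal match'
```

The same distinction should hold for `[[ $filename = $pattern ]]` versus `[[ $filename = "$pattern" ]]` in Bash, since `[[...]]` follows the same quoting semantics as `case`.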
Since Bash 4.1, ksh88 extended glob operators are recognised in [[...]] even when the extglob option is not enabled.
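A brief sketch of what this allows, assuming Bash 4.1 or later. The function name `is_image` and the file names are made up for the example, and `extglob` is switched off explicitly to show that `[[...]]` does not depend on it.

```shell
# Assumes Bash 4.1 or later; extglob is deliberately left disabled.
shopt -u extglob

# Hypothetical helper: succeeds for names ending in .jpg or .png.
is_image() {
  # @(a|b) matches exactly one of the listed alternatives.
  [[ $1 = *.@(jpg|png) ]]
}

is_image 'photo.png' && echo 'image'
is_image 'notes.txt' || echo 'not an image'
```

Outside `[[...]]`, for example in `case` or in pathname expansion, these operators would still need `shopt -s extglob`.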
In Bash 3.x or later, [[...]] can also do Extended Regular Expression (ERE) matches using the =~ operator:

    # Bash
    # Matches ac, zabcz, xabbbbcq, etc.
    re='ab*c'
    if [[ $foo =~ $re ]]; then ...

Storing the regular expression in a variable and using =~ $variable (where $variable is not quoted) is strongly recommended, as it avoids many undesirable surprises.
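One of those surprises can be sketched as follows, assuming Bash 3.2 or later with default compatibility settings: quoting the right-hand side of `=~` makes it match as a literal string rather than as a regular expression. The strings used here are arbitrary examples.

```shell
# Assumes Bash 3.2 or later, where a quoted right-hand side of =~ is literal.
re='a.c'

# Unquoted: the dot is a regex metacharacter and matches the "b".
[[ abc =~ $re ]]   && echo 'unquoted: matched as a regular expression'

# Quoted: the text a.c must appear verbatim, and it does not appear in abc.
[[ abc =~ "$re" ]] || echo 'quoted: compared as a literal string, no match'
```

Keeping the expression in an unquoted variable also avoids having to escape characters such as spaces or parentheses inline.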
POSIX sh has no builtin regular expression matching operator, but you can call standard utilities such as awk, expr or grep to do the match. These utilities may or may not be implemented as shell builtins; in Bash they are not.

    # POSIX
    ere_match() { awk -- 'BEGIN{exit !(ARGV[1] ~ ARGV[2])}' "$@"; }
    if ere_match "$foo" "$re"; then ...

    # With expr, leading anchors are implied. An initial .* works around this. We
    # also need to prefix the subject with a character or string not starting with -
    # and that is not found at the start of any expr operator present or future.
    if expr "@$foo" : "@.*$re" > /dev/null; then ...

    # Grep can only be used for single-line strings, but can do case-insensitive
    # matching with -i. Use -x to anchor at both start and end, or use the usual ^ or $.
    if printf '%s\n' "$foo" | grep -q -- "$re"; then ...

expr uses Basic Regular Expressions (BRE). grep defaults to BRE, but uses ERE with the -E option. awk uses an ERE variant that also recognises some C-like escape sequences such as \n (at least with some awk implementations when the regexp is not a literal, as is the case here).
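The practical difference between BRE and ERE can be sketched with grep. The helper names `matches_bre` and `matches_ere` are hypothetical, and the example relies on `+` being an ordinary character in a BRE but a repetition operator in an ERE.

```shell
# matches_bre STRING REGEX / matches_ere STRING REGEX (hypothetical helpers)
matches_bre() { printf '%s\n' "$1" | grep -q  -- "$2"; }
matches_ere() { printf '%s\n' "$1" | grep -Eq -- "$2"; }

# In a BRE, + is an ordinary character; in an ERE it means "one or more".
matches_bre 'a+b' 'a+b' && echo 'BRE: literal plus matched'
matches_ere 'aab' 'a+b' && echo 'ERE: one or more a, then b'
matches_bre 'aab' 'a+b' || echo 'BRE: no literal plus in aab'
```

When the same expression has to work with both `[[ ... =~ ... ]]` and grep, `grep -E` is the closer match, since both use ERE syntax.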
For more hints on string manipulations in Bash, see FAQ #100.