Differences between revisions 1 and 18 (spanning 17 versions)
Revision 1 as of 2007-05-02 23:31:14
Size: 730
Editor: redondos
Comment:
Revision 18 as of 2025-04-19 15:44:32
Size: 2946
Comment:

How do I determine whether a variable contains a substring?

There are several choices here: you can perform an exact substring match, a glob-style pattern match, or a regular expression match.

To match exact substrings, POSIX sh uses case:

# POSIX
case $bigvar in
    *substr*) ... ;;
esac

If the substring is in a variable, and if an exact substring match is wanted, then the substring variable should be quoted:

# POSIX
case $bigvar in
    *"$substr"*) ... ;;
esac
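
As a minimal sketch of why the quotes matter, the quoted test can be wrapped in a small function. The helper name `contains` and the sample values are assumptions made for illustration, not part of this FAQ. Because `$2` is quoted inside the pattern, glob characters in the substring are matched literally:

```shell
# POSIX sh. `contains` is a hypothetical helper name used for illustration.
# Succeeds (exit status 0) if $2 occurs literally anywhere in $1.
contains() {
    case $1 in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

bigvar='price: 5*3 dollars'
contains "$bigvar" '5*3' && echo "literal match"
contains "$bigvar" '5*d' || echo "no match: * is not a wildcard here"
```

The second call fails because the quoted `*` stands only for itself, so the text `5*d` would have to appear verbatim in `$bigvar`.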

In Bash, you may also use the [[ command. It follows the same quoting semantics as case:

# Bash
if [[ $bigvar = *substr* ]]; then ...     # These are
if [[ $bigvar == *substr* ]]; then ...    # equivalent.

if [[ $bigvar = *"$substr"* ]]; then ...  # These are
if [[ $bigvar == *"$substr"* ]]; then ... # equivalent.

In both case and [[ you may also do glob-style pattern matching. Simply use unquoted glob characters in the pattern. If the pattern is in a variable, omit the double quotes, and it will be interpreted as a pattern instead of an exact substring.

# POSIX
case $filename in
    *.txt) ... ;;
esac

pattern='*.txt'
case $filename in
    $pattern) ... ;;
esac

# Bash
if [[ $filename = *.txt ]]; then ...
if [[ $filename == *.txt ]]; then ...

pattern='*.txt'
if [[ $filename = $pattern ]]; then ...
if [[ $filename == $pattern ]]; then ...
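
The effect of the quotes can be seen side by side in the following sketch. The file name and pattern values are made up for illustration:

```shell
# Bash. The sample values are illustrative assumptions.
filename='notes.txt'
pattern='*.txt'

# Unquoted: $pattern is interpreted as a glob, so this branch runs.
if [[ $filename = $pattern ]]; then
    echo "glob match"
fi

# Quoted: "$pattern" is compared as a literal string, so this does not match.
if [[ $filename = "$pattern" ]]; then
    echo "literal match"
else
    echo "no literal match"
fi
```

With the quotes, the test would only succeed for a file whose name is literally `*.txt`.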

In Bash 3.x or later, [[ can also do Extended Regular Expression (ERE) matches using the =~ operator:

# Bash
# Matches ac, zabcz, xabbbbcq, etc.
re='ab*c'
if [[ $foo =~ $re ]]; then ...

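Storing the regular expression in a variable and using =~ $variable is strongly recommended: it sidesteps the parsing and quoting problems that arise when regex metacharacters such as spaces, parentheses or backslashes are written inline.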
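
A sketch of one quoting pitfall that this recommendation avoids, using made-up values: since Bash 3.2, any quoted part of the right-hand side of `=~` is matched literally, so quoting the pattern silently turns the regex test into a plain substring test.

```shell
# Bash 3.2 or later. The sample values are illustrative assumptions.
foo='xabbbbcq'
re='ab*c'

# Unquoted expansion: $re is treated as an ERE, so this matches.
if [[ $foo =~ $re ]]; then
    echo "regex match"
fi

# Quoted expansion: "$re" is matched as the literal text ab*c, which $foo
# does not contain, so this does not match.
if [[ $foo =~ "$re" ]]; then
    echo "literal match"
else
    echo "no literal match"
fi
```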

POSIX sh has no builtin regular expression matching, but you can call external programs such as awk, expr or grep to do it.

# POSIX
ere_match() { awk -- 'BEGIN{exit !(ARGV[1] ~ ARGV[2])}' "$@"; }
if ere_match "$foo" "$re"; then ...

# With expr, the regex is implicitly anchored at the start of the string.
# An initial .* works around this. We also need to prefix the subject with a character
# or string that does not start with - and is not found at the start of any expr operator,
# present or future.
if expr "@$foo" : "@.*$re" > /dev/null; then ...

# grep can only be used for single-line strings, but it can do case-insensitive matching
# with -i. Use -x to anchor the match at both start and end, or use the usual ^ and $.
if printf '%s\n' "$foo" | grep -q -- "$re"; then ...

expr uses Basic Regular Expressions (BRE). grep defaults to BRE, but uses ERE with the -E option. awk uses an ERE variant that also recognises some C-like escape sequences such as \n (at least in some awk implementations, when the regexp is not a literal, as here).
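
The BRE/ERE distinction matters as soon as the pattern uses an operator that exists only in ERE. The following sketch, with made-up sample values, runs the same pattern through `grep` in both modes:

```shell
# POSIX sh. The sample values are illustrative assumptions.
foo='abbbc'
re='ab+c'   # + is a quantifier in ERE but an ordinary character in BRE

# grep -E uses ERE, so ab+c matches "abbbc".
if printf '%s\n' "$foo" | grep -Eq -- "$re"; then
    echo "ERE match"
fi

# Plain grep uses BRE, so ab+c only matches the literal text "ab+c".
if printf '%s\n' "$foo" | grep -q -- "$re"; then
    echo "BRE match"
else
    echo "no BRE match"
fi
```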

For more hints on string manipulations in Bash, see FAQ #100.


CategoryShell

BashFAQ/041 (last edited 2025-04-19 15:51:39 by StephaneChazelas)