Differences between revisions 2 and 15 (spanning 13 versions)
Revision 2 as of 2008-05-08 07:54:33
Size: 3277
Editor: Lhunath
Comment: add a note about lists in strings, add a function to search for elements.
Revision 15 as of 2012-07-24 04:36:07
Size: 6812
Editor: ormaaj
Comment:

I want to check to see whether a word is in a list (or an element is a member of a set).

If your real question was "How do I check whether one of my parameters was -v?" then please see FAQ #35 instead. Otherwise, read on.

First of all, let's get the terminology straight. Bash has no notion of "lists" or "sets" or any such. Bash has strings and arrays. Strings are a "list" of characters, arrays are a "list" of strings.

NOTE: In the general case, a string cannot possibly contain a list of other strings because there is no reliable way to tell where each substring begins and ends.
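
To make the note concrete, here is a small sketch showing how a space-separated string loses element boundaries that an array keeps. The variable names (cities_array, cities_string, words) and the data are made up for illustration, and the default IFS is assumed:

```shell
# Bash
# Hypothetical data: two city names, one of which contains a space.
cities_array=("New York" "Boston")
cities_string="New York Boston"

# The array still knows that it holds two elements.
echo "array elements: ${#cities_array[@]}"

# Word-splitting the string yields three words: the space inside
# "New York" cannot be told apart from the space between elements.
set -f                      # no glob expansion while splitting
words=($cities_string)
set +f
echo "string words: ${#words[@]}"
```

The array reports 2 elements, while splitting the string produces 3 words.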

Given a traditional array, the only proper way to do this is to loop over all elements in your array and check them for the element you are looking for. Say what we are looking for is in bar and our list is in the array foo:

  •    # Bash
       for element in "${foo[@]}"; do
          [[ $element = "$bar" ]] && echo "Found $bar."
       done

If you need to perform this several times in your script, you might want to extract the logic into a function:

  •    # Bash
       isIn() {
           local pattern="$1" element
           shift
    
           for element
           do
               [[ $element = $pattern ]] && return 0
           done
    
           return 1
       }
    
       if isIn "jacob" "${names[@]}"
       then
           echo "Jacob is on the list."
       fi
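
Note that the right-hand side of = inside [[ ]] is unquoted in isIn, so it is treated as a glob pattern rather than a literal string. The following sketch contrasts that with a literal comparison; isInLiteral is a hypothetical name, not part of the FAQ, and the sample names are made up:

```shell
# Bash
# isIn as above: the unquoted $pattern is matched as a glob pattern.
isIn() {
    local pattern="$1" element
    shift
    for element; do
        [[ $element = $pattern ]] && return 0
    done
    return 1
}

# Hypothetical variant: quoting the right-hand side forces a literal comparison.
isInLiteral() {
    local needle="$1" element
    shift
    for element; do
        [[ $element = "$needle" ]] && return 0
    done
    return 1
}

names=("jacob" "j*" "emma")

isIn "j*" "jacob" "emma" && echo 'pattern j* matches "jacob"'
isInLiteral "j*" "jacob" "emma" || echo 'literal j* is not among jacob, emma'
isInLiteral "j*" "${names[@]}" && echo 'literal j* is an element of names'
```

Use the pattern form when you want wildcard matching, and the quoted form when the sought value may contain characters such as * or ? that must be taken literally.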

Or, if you want your function to return the index at which the element was found:

  •    # Bash 3.0 or higher
       indexOf() {
           local pattern=$1
           local index list
           shift
    
           list=("$@")
           for index in "${!list[@]}"
           do
               [[ ${list[index]} = $pattern ]] && {
                   echo "$index"
                   return 0
               }
           done
    
           echo -1
           return 1
       }
    
       if index=$(indexOf "jacob" "${names[@]}")
       then
           echo "Jacob is the ${index}th on the list."
       else
           echo "Jacob is not on the list."
       fi
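
Because indexOf copies its arguments into a new array, the index it reports is the position among the arguments, which differs from the original index if the array is sparse. A possible sketch that reports the original index instead passes the array by name. It assumes Bash 4.3 or higher for namerefs; indexOfIn and _list are hypothetical names, and the caller's array must not itself be called _list:

```shell
# Bash 4.3 or higher (uses a nameref)
# usage: indexOfIn pattern arrayname
indexOfIn() {
    local pattern=$1
    local -n _list=$2        # _list refers to the caller's array by name
    local index
    for index in "${!_list[@]}"; do
        if [[ ${_list[index]} = $pattern ]]; then
            echo "$index"
            return 0
        fi
    done
    echo -1
    return 1
}

names=([3]="emma" [7]="jacob")   # sparse array: only indices 3 and 7 exist

if index=$(indexOfIn "jacob" names); then
    echo "jacob is stored at index $index."
fi
```

With this data, indexOf "jacob" "${names[@]}" would print 1, whereas indexOfIn prints 7, the subscript actually used in names.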

If your "list" is contained in a string, and for some half-witted reason you choose not to heed the warnings above, you can use the following code to search through "words" in a string. (The only real excuse for this would be that you're stuck in Bourne shell, which has no arrays.)

  •    # Bourne
       set -f
       for element in $foo; do
          if test x"$element" = x"$bar"; then
             echo "Found $bar."
          fi
       done
       set +f

Here, a "word" is defined as any substring that is delimited by whitespace (or more specifically, the characters currently in IFS). The set -f prevents glob expansion of the words in the list. Turning glob expansions back on (set +f) is optional.
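
The same loop can be wrapped in a function so that neither set -f nor a changed IFS leaks into the rest of the script. This is only a sketch: word_in_string is a hypothetical name, and the optional third argument (the delimiter set) is an addition for illustration. The function body is a subshell, which is what keeps the settings local:

```shell
# Bourne / POSIX sh
# usage: word_in_string word string [delimiters]
word_in_string() (
    # The body runs in a subshell, so set -f and IFS are restored on return.
    set -f
    [ "$#" -ge 3 ] && IFS=$3
    for element in $2; do
        [ "$element" = "$1" ] && exit 0
    done
    exit 1
)

foo='alpha * gamma'
bar='*'
if word_in_string "$bar" "$foo"; then
    echo "Found $bar."
fi

# With a comma as the delimiter, words may contain spaces.
if word_in_string "New York" "Boston,New York" ","; then
    echo "Found New York."
fi
```

Without set -f, the * in the list would be expanded to the file names in the current directory before the comparison takes place.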

If you're working in bash 4 or ksh93, you have access to associative arrays. These will allow you to restructure the problem -- instead of making a list of words that are allowed, you can make an associative array whose keys are the words you want to allow. Their values could be meaningful, or not -- depending on the nature of the problem.

  •    # Bash 4
       declare -A good
       for word in "goodword1" "goodword2" ...; do
         good["$word"]=1
       done
    
       # Check whether $foo is allowed:
       if ((${good[$foo]})); then ...
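
A complete, runnable version of the idea might look like the following sketch. The words and the is_good helper are made up for illustration. It tests whether the key exists (rather than whether its value is non-zero), and it rejects the empty string first, on the assumption that bash reports an empty associative array subscript as an error:

```shell
# Bash 4 or higher
declare -A good=()
for word in "apple" "banana" "passion fruit"; do
    good["$word"]=1
done

# Succeeds if $1 is a key of good. ${...+set} expands to "set" for any
# existing key, even one whose value is empty or 0.
is_good() {
    [[ -n $1 ]] && [[ ${good[$1]+set} ]]
}

for foo in "banana" "passion" "passion fruit"; do
    if is_good "$foo"; then
        echo "'$foo' is allowed."
    else
        echo "'$foo' is not allowed."
    fi
done
```

Each lookup is a single hash access, so the cost does not grow with the number of allowed words the way the loop-based functions do.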

Here's a hack that you shouldn't use, but which is presented for the sake of completeness:

  •    # Bash
       if [[ " $foo " = *" $bar "* ]]; then
          echo "Found $bar."
       fi

(The problem here is that it assumes a space can be used as a delimiter between words. Your elements might contain spaces, which would break this!)
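
A short sketch of that failure, using made-up data in which one element contains a space (default IFS assumed, so "${array[*]}" joins with spaces):

```shell
# Bash
# Hypothetical data: the first element contains a space.
array=("New York" "Boston")
foo="${array[*]}"      # joined with spaces: "New York Boston"
bar="York"

# "York" is not an element of the array, yet the string test succeeds.
if [[ " $foo " = *" $bar "* ]]; then
    echo "Found $bar (false positive)."
fi
```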

That same hack, for Bourne shells:

  •    # Bourne
       case " $foo " in
          *" $bar "*) echo "Found $bar.";;
       esac

You can also use an extended glob together with printf -v to search for a word in an array. I haven't tested it enough, so it might break in some cases. --sn18

  •    # Bash
       shopt -s extglob
       #convert array to glob
       printf -v glob '%q|' "${array[@]}"
       glob=${glob%|}
       [[ $word = @($glob) ]] && echo "Found $word"
  • It will break when an array element contains a | character. Hence, I moved it down here with the other hacks that work in a similar fashion and have a similar limitation. -- GreyCat

    • printf %q quotes a | character too, so it probably should not break. --sn18

GNU's grep has a \b feature which allegedly matches the edges of words (word "boundaries"). Using that, one may attempt to replicate the shorter approach used above, but it is fraught with peril:

  •    # Is 'foo' one of the positional parameters?
       grep -E '\bfoo\b' <<<"$*" >/dev/null && echo yes
    
       # This is where it fails: is '-v' one of the positional parameters?
       grep -E '\b-v\b' <<<"$*" >/dev/null && echo yes
       # Unfortunately, \b only matches between a word character and a
       # non-word character. "-" is not a word character, so the leading \b
       # needs a word character right before the "-": a standalone -v is
       # missed, while the tail of "a-v" would match.
    
       # Is "someword" in the array 'array'?
       grep -E '\bsomeword\b' <<<"${array[*]}"
       # Obviously, you can't use this if someword is '-v'!

Since this "feature" of GNU grep is both non-portable and poorly defined, we recommend not using it. It is simply mentioned here for the sake of completeness.
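
If grep is going to be used at all, a less fragile sketch is to print one element per line and ask for a fixed-string, whole-line match, which relies only on options specified by POSIX. This is a different technique from \b, not a fix for it; in_lines is a hypothetical name, and the approach still breaks if an element or the sought word contains a newline:

```shell
# Bash
# usage: in_lines word element...
# -F: fixed string, -x: the whole line must match, -q: quiet,
# -e: protects a sought word that starts with "-".
in_lines() {
    local word=$1
    shift
    (( $# )) || return 1                 # empty list: nothing can match
    printf '%s\n' "$@" | grep -Fxq -e "$word"
}

in_lines "-v" "-x" "-v" "file name" && echo "Found -v."
in_lines "name" "-x" "-v" "file name" || echo "name is not an element."
```

The loop-based functions earlier on this page remain the safer choice, since they have no delimiter restrictions at all.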

Bulk comparison

This method tries to compare the desired string to the entire contents of the array. It can potentially be very efficient, but it depends on a delimiter that must not be in the sought value or the array. Here we use $'\a', the BEL character, because it's extremely uncommon.

  •    # usage: if has "element" list of words; then ...; fi
       has() {
         local IFS=$'\a' t="$1"
         shift
         [[ $'\a'"$*"$'\a' == *$'\a'"$t"$'\a'* ]]
       }
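
A usage sketch follows. It repeats a copy of has (with the sought value quoted so that it is compared literally) so that it runs on its own; the cities array is made-up data:

```shell
# Bash
has() {
    local IFS=$'\a' t="$1"
    shift
    # "$*" joins the remaining arguments with BEL, the first character of IFS.
    [[ $'\a'"$*"$'\a' == *$'\a'"$t"$'\a'* ]]
}

cities=("New York" "Boston")

if has "New York" "${cities[@]}"; then
    echo "New York is in the list."
fi

if ! has "York" "${cities[@]}"; then
    echo "York is not in the list."
fi
```

Because the delimiter is BEL rather than a space, elements containing spaces are handled correctly; only a BEL character inside an element or the sought value would confuse the test.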

Enumerated types

In ksh93t or later, one may create enum types/variables/constants using the enum builtin. These work similarly to C enums (and the equivalent feature of other languages). These may be used to restrict which values may be assigned to a variable so as to avoid the need for an expensive test each time an array variable is set or referenced. Like types created using typeset -T, the result of an enum command is a new declaration command that can be used to instantiate objects of that type.

# ksh93
 $ enum colors=(red green blue)
 $ colors foo=green
 $ foo=yellow
ksh: foo:  invalid value yellow

typeset -a can also be used in combination with an enum type to allow enum constants as subscripts.

# ksh93
 $ typeset -a [colors] bar
 $ bar[blue]=test1
 $ typeset -p bar
typeset -a [colors] bar=([blue]=test1)
 $ bar[orange]=test
ksh: colors:  invalid value orange

See src/cmd/ksh93/tests/enum.sh in the AST source for more examples.


CategoryShell

BashFAQ/046 (last edited 2023-04-29 04:33:04 by ormaaj)