6058
Comment: first-line
|
3574
Fix some bugs. ksh has always supported [[. rm'd example. =~ shouldn't be shunned due to bash 3's bad implementation. rm'd (again). =~ (and [[) will quite likely be POSIX at some point.
|
Deletions are marked like this. | Additions are marked like this. |
Line 3: | Line 3: |
First, you have to define what you mean by "number". The most common case when people ask this seems to be "a non-negative integer, with no leading + sign". Or in other words, a string of all digits. This can be checked using standard [[glob|globs]]: | First, you have to define what you mean by "number". The most common case when people ask this seems to be "a non-negative integer, with no leading + sign". Or in other words, a string of all digits. Other times, people want to validate a floating-point input, with optional sign and optional decimal point. === Hand parsing === If you're validating a simple "string of digits", you can do it with a [[glob]]: |
Line 7: | Line 10: |
if [[ $foo = *[^0-9]* ]]; then | if [[ $foo != *[!0-9]* ]]; then echo "'$foo' is strictly numeric" else |
Line 9: | Line 14: |
else echo "'$foo' is strictly numeric" |
|
Line 13: | Line 16: |
Line 17: | Line 19: |
# ksh, POSIX case "$foo" in *[!0-9]*) echo "'$foo' has a non-digit somewhere in it" ;; *) echo "'$foo' is strictly numeric" ;; esac |
# POSIX case $var in *[!0-9]*|'') printf '%s has a non-digit somewhere in it\' "$var" ;; *) printf '%s is strictly numeric\n' "$var" esac >&2 |
Line 23: | Line 28: |
If what you actually mean is "a valid floating-point number" or something else more complex, then there are a few possible ways. Standard globs aren't expressive enough to do this, but we can use [[glob|extended globs]]: |
If you need to allow a leading negative sign, or if want a valid floating-point number or something else more complex, then there are a few possible ways. Standard globs aren't expressive enough to do this, but we can use [[glob|extended globs]]: |
Line 27: | Line 31: |
# Bash -- extended globs must be enabled. | # Bash -- extended globs must be enabled explicitly in versions prior to 4.1. |
Line 36: | Line 40: |
# Bash | # Bash / ksh |
Line 38: | Line 42: |
[[ $foo = *[0-9]* && $foo = ?([+-])*([0-9])?(.*([0-9])) ]] && echo "foo is a floating-point number" |
if [[ $foo = *[0-9]* && $foo = ?([+-])*([0-9])?(.*([0-9])) ]]; then echo 'foo is a floating-point number' fi |
Line 42: | Line 48: |
The leading test of {{{$foo}}} is to ensure that it contains at least one digit. The extended glob, by itself, would match the empty string, or a lone {{{+}}} or {{{-}}}, which may not be desirable behavior. | Optionally, `case..esac` may have been used in shells with extended pattern matching. The leading test of {{{$foo}}} is to ensure that it contains at least one digit. The extended glob, by itself, would match the empty string, or a lone {{{+}}} or {{{-}}}, which may not be desirable behavior. |
Line 44: | Line 50: |
The features enabled with {{{extglob}}} in Bash are also allowed in the Korn shell by default. The difference here is that ksh lacks Bash's {{{[[}}} and must use {{{case}}} instead: | |
Line 46: | Line 51: |
{{{ # ksh - extended globs are on by default case $foo in *[0-9]*) case $foo in ?([+-])*([0-9])?(.*([0-9]))) echo "foo is a number";; esac;; esac }}} Note that this uses the same extended glob as the Bash example before it; the third closing parenthesis at the end of it is actually part of the {{{case}}} syntax. If your definition of "a valid number" is even more complex, or if you need a solution that works in legacy Bourne shells, you might prefer to use an external tool's [[RegularExpression|regular expression]] syntax. Here is a portable version, using {{{egrep}}}: |
If your definition of "a valid number" is even more complex, or if you need a solution that works in legacy Bourne shells, you might prefer to use an external tool's [[RegularExpression|regular expression]] syntax. Here is a portable version (explained in detail [[http://www.wplug.org/wiki/Meeting-20100612#EXERCISE_TWO|here]]), using {{{egrep}}}: |
Line 62: | Line 55: |
if test "$foo" && echo "$foo" | egrep '^[-+]?[0-9]*(\.[0-9]*)?$' >/dev/null then echo "'$foo' might be a number" |
if echo "$foo" | grep -qE '^[-+]?([0-9]+\.?|[0-9]*\.[0-9]+)$'; then echo "'$foo' is a number" |
Line 66: | Line 58: |
echo "'$foo' might not be a number" | echo "'$foo' is not a number" |
Line 69: | Line 61: |
(Like the extended globs, this [[RegularExpression|extended regular expression]] will match a lone {{{+}}} or {{{-}}}. The initial {{{test}}} command only requires a non-empty string. Closing the last "bug" is left as an exercise for the reader, mostly because GreyCat is too damned lazy to learn {{{expr(1)}}}.) Bash version 3 and above have regular expression support in the [[ command. However, due to serious bugs and syntax changes in Bash's [[ regex support, we '''do not recommend''' using it. Nevertheless, if I simply omit all Bash regex answers here, someone will come along and fill them in -- and they probably won't work, or won't contain all the caveats necessary. So, in the interest of preventing disasters, here are the Bash regex answers that you should not use. |
Bash version 3 and above have regular expression support in the [[ command. |
Line 75: | Line 64: |
# Bash 3.1 ONLY if [[ $foo = *[0-9]* && $foo =~ ^[-+]?[0-9]*\(\.[0-9]*\)?$ ]]; then |
# Bash # The regexp must be stored in a var and expanded for backward compatibility with versions < 3.2 regexp='^[-+]?[0-9]*(\.[0-9]*)?$' if [[ $foo = *[0-9]* && $foo =~ $regexp ]]; then |
Line 83: | Line 75: |
Unfortunately, Bash changed the syntax of its regular expression support after version 3.1, so the following ''may'' work in some patched versions of Bash 3.2: | === Using the parsing done by [ and printf (or "using eq") === {{{ # fails with ksh if [ "$foo" -eq "$foo" ] 2>/dev/null; then echo "$foo is an integer" fi }}} `[` parses the variable and interprets it as an integer because of the `-eq`. If the parsing succeeds the test is trivially true; if it fails `[` prints an error message that `2>/dev/null` hides and sets a status different from 0. However this method fails if the shell is ksh, because ksh evaluates the variable as an arithmetic expression. You can use a similar trick with `printf`, but this won't work in all shells either: |
Line 86: | Line 87: |
# Bash 3.2 *PATCHED* only! if [[ $foo = *[0-9]* && $foo =~ ^[-+]?[0-9]*(\.[0-9]*)?$ ]]; then echo "'$foo' looks rather like a number" else echo "'$foo' doesn't look particularly numeric to me" |
# BASH if printf %f "$foo" >/dev/null 2>&1; then echo "$foo is a float" |
Line 94: | Line 93: |
It fails rather spectacularly in bash 3.1 and in bash 3.2 without patches. Note that the parentheses in the {{{egrep}}} regular expression and the bash 3.2.patched regular expression don't require backslashes in front of them, whereas the ones in the bash 3.1 command do. Stuffing the Bash regex into a variable, and then using {{{[[ $foo =~ $bar ]]}}}, may also be an effective workaround in some cases. But this belongs in a separate FAQ.... If you just want to guarantee ahead of time that a variable contains an integer, without actually checking, you can give the variable the "integer" attribute. {{{ declare -i foo foo=-10+1; echo "$foo" # prints -9 foo="hello"; echo "$foo" # the value of the variable "hello" is evaluated; if unset, foo is 0 foo="Some random string" # results in an error. }}} Any value assigned to a variable with the integer attribute set is evaluated as an [[ArithmeticExpression|arithmetic expression]] just like inside `$(( ))`. Bash will raise an error if you try to assign an invalid arithmetic expression. In Bash and ksh93, if a variable which has been declared integer is used in a `read` command, the user's input is treated as an [[ArithmeticExpression|arithmetic expression]], as with assignment. In particular, if the user types an identifier, the variable will be set to the value of the variable with that name, and `read` will give no other indication of a problem. {{{ # Bash (and ksh93, if you replace declare with typeset) $ declare -i foo $ read foo hello $ echo $foo # prints 0; 'hello' is unset, so is treated as 0 for arithmetic purposes $ hello=5 $ read foo # user types hello again hello $ echo $foo # prints 5, the value of 'hello' as an arithmetic expression }}} Pretty useless if you want to read only integers. In the older Korn shell (ksh88), if a variable is declared integer and used in a `read` command, and the user types an invalid integer, the shell complains, the read command returns an error status, and the value of the variable is unchanged. {{{ # ksh88 $ typeset -i foo $ foo=42 $ read foo hello ksh: hello: bad number $ echo $? 1 $ echo $foo 42 }}} |
You can use `%d` to parse an integer. Take care that the parsing might be (is supposed to be?) [[locale]]-dependent. |
How can I tell whether a variable contains a valid number?
First, you have to define what you mean by "number". The most common case when people ask this seems to be "a non-negative integer, with no leading + sign". Or in other words, a string of all digits. Other times, people want to validate a floating-point input, with optional sign and optional decimal point.
Hand parsing
If you're validating a simple "string of digits", you can do it with a glob:
# Bash if [[ $foo != *[!0-9]* ]]; then echo "'$foo' is strictly numeric" else echo "'$foo' has a non-digit somewhere in it" fi
The same thing can be done in Korn and POSIX shells as well, using case:
# POSIX case $var in *[!0-9]*|'') printf '%s has a non-digit somewhere in it\' "$var" ;; *) printf '%s is strictly numeric\n' "$var" esac >&2
If you need to allow a leading negative sign, or if want a valid floating-point number or something else more complex, then there are a few possible ways. Standard globs aren't expressive enough to do this, but we can use extended globs:
# Bash -- extended globs must be enabled explicitly in versions prior to 4.1. # Check whether the variable is all digits. shopt -s extglob [[ $var == +([0-9]) ]]
A more complex case:
# Bash / ksh shopt -s extglob if [[ $foo = *[0-9]* && $foo = ?([+-])*([0-9])?(.*([0-9])) ]]; then echo 'foo is a floating-point number' fi
Optionally, case..esac may have been used in shells with extended pattern matching. The leading test of $foo is to ensure that it contains at least one digit. The extended glob, by itself, would match the empty string, or a lone + or -, which may not be desirable behavior.
If your definition of "a valid number" is even more complex, or if you need a solution that works in legacy Bourne shells, you might prefer to use an external tool's regular expression syntax. Here is a portable version (explained in detail here), using egrep:
# Bourne if echo "$foo" | grep -qE '^[-+]?([0-9]+\.?|[0-9]*\.[0-9]+)$'; then echo "'$foo' is a number" else echo "'$foo' is not a number" fi
Bash version 3 and above have regular expression support in the [[ command.
# Bash # The regexp must be stored in a var and expanded for backward compatibility with versions < 3.2 regexp='^[-+]?[0-9]*(\.[0-9]*)?$' if [[ $foo = *[0-9]* && $foo =~ $regexp ]]; then echo "'$foo' looks rather like a number" else echo "'$foo' doesn't look particularly numeric to me" fi
Using the parsing done by [ and printf (or "using eq")
# fails with ksh if [ "$foo" -eq "$foo" ] 2>/dev/null; then echo "$foo is an integer" fi
[ parses the variable and interprets it as an integer because of the -eq. If the parsing succeeds the test is trivially true; if it fails [ prints an error message that 2>/dev/null hides and sets a status different from 0. However this method fails if the shell is ksh, because ksh evaluates the variable as an arithmetic expression.
You can use a similar trick with printf, but this won't work in all shells either:
# BASH if printf %f "$foo" >/dev/null 2>&1; then echo "$foo is a float" fi
You can use %d to parse an integer. Take care that the parsing might be (is supposed to be?) locale-dependent.