Diff for "BashFAQ/054"

Differences between revisions 17 and 36 (spanning 19 versions)

How can I tell whether a variable contains a valid number?

First, you have to define what you mean by "number". The most common case when people ask this seems to be "a non-negative integer, with no leading + sign", or in other words, a string of all digits. Other times, people want to validate a floating-point input, with optional sign and optional decimal point.

Hand parsing

If you're validating a simple "string of digits", you can do it with a glob:

# Bash
if [[ $foo != *[!0-9]* ]]; then
    echo "'$foo' is strictly numeric"
else
    echo "'$foo' has a non-digit somewhere in it"
fi
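Note that an empty string contains no non-digit characters, so the glob test above accepts it as well. If that is not what you want, add a non-empty check. Here is a minimal sketch; the function name `is_uint` is made up for illustration:

```shell
# Bash
# is_uint (hypothetical helper): succeeds only for a non-empty string of digits.
is_uint() {
    [[ -n $1 && $1 != *[!0-9]* ]]
}

for candidate in 42 0 '' 4a -1; do
    if is_uint "$candidate"; then
        echo "'$candidate' is a non-empty string of digits"
    else
        echo "'$candidate' is empty or has a non-digit in it"
    fi
done
```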

The same thing can be done in Korn and POSIX shells as well, using case:

# POSIX
case $var in
    *[!0-9]*|'')
        printf '%s has a non-digit somewhere in it\n' "$var"
        ;;
    *)
        printf '%s is strictly numeric\n' "$var"
esac >&2

If you need to allow a leading negative sign, or if you want a valid floating-point number or something else more complex, then there are a few possible ways. Standard globs aren't expressive enough to do this, but we can use extended globs:

# Bash -- extended globs must be enabled explicitly in versions prior to 4.1.
# Check whether the variable is all digits.
shopt -s extglob
[[ $var == +([0-9]) ]]

A more complex case:

# Bash / ksh
shopt -s extglob   # Bash only; ksh allows these patterns by default

if [[ $foo = *[0-9]* && $foo = ?([+-])*([0-9])?(.*([0-9])) ]]; then
  echo 'foo is a floating-point number'
fi

Alternatively, case..esac can be used in shells with extended pattern matching. The leading test of $foo ensures that it contains at least one digit. The extended glob, by itself, would match the empty string, a lone + or -, or a lone decimal point, which may not be desirable behavior.
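To see why the leading digit test matters, you can compare the extended glob alone against the combined test. This is only a sketch; the helper names `glob_only` and `looks_like_float` are invented for the comparison:

```shell
# Bash
shopt -s extglob   # needed explicitly in versions prior to 4.1

# glob_only (hypothetical helper): the extended glob without the digit check.
glob_only() {
    [[ $1 == ?([+-])*([0-9])?(.*([0-9])) ]]
}

# looks_like_float (hypothetical helper): the two-part test from the text.
looks_like_float() {
    [[ $1 == *[0-9]* && $1 == ?([+-])*([0-9])?(.*([0-9])) ]]
}

for candidate in 3.14 -.5 +7. '' + - . abc 1.2.3; do
    if glob_only "$candidate"; then g=match; else g=no; fi
    if looks_like_float "$candidate"; then f=match; else f=no; fi
    printf '%-8s glob alone: %-6s with digit test: %s\n' "'$candidate'" "$g" "$f"
done
```

The empty string, the lone signs and the lone dot match the glob alone, but are rejected once the digit test is added.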

If your definition of "a valid number" is even more complex, or if you need a solution that works in legacy Bourne shells, you might prefer to use an external tool's regular expression syntax. Here is a portable version (explained in detail at http://www.wplug.org/wiki/Meeting-20100612#EXERCISE_TWO), using grep -E:

# Bourne
if echo "$foo" | grep -qE '^[-+]?([0-9]+\.?|[0-9]*\.[0-9]+)$'; then
    echo "'$foo' is a number"
else
    echo "'$foo' is not a number"
fi
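Two things to keep in mind with this approach: grep matches line by line, so a value with an embedded newline passes if any one of its lines is numeric, and some echo implementations interpret backslash sequences or options in their argument. A sketch that guards against both, assuming a POSIX shell with printf (the function name `is_number` and the variable `nl` are made up here):

```shell
# POSIX
nl='
'

# is_number (hypothetical helper): same regular expression as above.
is_number() {
    case $1 in
        *"$nl"*) return 1 ;;   # reject embedded newlines before grep sees them
    esac
    printf '%s\n' "$1" | grep -Eq '^[-+]?([0-9]+\.?|[0-9]*\.[0-9]+)$'
}

for candidate in 42 -3.14 +.5 5. . abc 1e5; do
    if is_number "$candidate"; then
        printf "'%s' is a number\n" "$candidate"
    else
        printf "'%s' is not a number\n" "$candidate"
    fi
done
```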

Bash versions 3 and above have regular expression support in the [[ command.

# Bash
# The regexp must be stored in a var and expanded for backward compatibility with versions < 3.2

regexp='^[-+]?[0-9]*(\.[0-9]*)?$'
if [[ $foo = *[0-9]* && $foo =~ $regexp ]]; then
    echo "'$foo' looks rather like a number"
else
    echo "'$foo' doesn't look particularly numeric to me"
fi
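If you also want the pieces of the number, the BASH_REMATCH array holds whatever the parenthesized groups matched. The following sketch uses a variant of the regexp above with extra capture groups; both the added groups and the function name `split_number` are assumptions made for this example:

```shell
# Bash
# Variant of the regexp above, with capture groups added for this sketch.
regexp='^([-+]?)([0-9]*)(\.([0-9]*))?$'

# split_number (hypothetical helper): prints the parts, or fails if $1 is not numeric.
split_number() {
    [[ $1 = *[0-9]* && $1 =~ $regexp ]] || return 1
    printf 'sign=%s int=%s frac=%s\n' \
        "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" "${BASH_REMATCH[4]}"
}

split_number -12.50      # prints: sign=- int=12 frac=50
split_number 7           # prints: sign= int=7 frac=
split_number abc || echo "'abc' is rejected"
```

Keep $regexp unquoted on the right-hand side of =~; in Bash 3.2 and later, quoting it makes the pattern match as a literal string.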

Using the parsing done by [ and printf (or "using eq")

# fails with ksh
if [ "$foo" -eq "$foo" ] 2>/dev/null; then
 echo "$foo is an integer"
fi

[ parses the variable and interprets it as an integer because of the -eq. If the parsing succeeds, the test is trivially true; if it fails, [ prints an error message (which 2>/dev/null hides) and returns a nonzero status. However, this method fails if the shell is ksh, because ksh evaluates the variable as an arithmetic expression.
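Wrapped in a function, the trick looks like this. It is a sketch only, assuming a shell such as Bash whose [ does not evaluate its operands arithmetically; the name `is_int` is hypothetical:

```shell
# Bash (not ksh, see above)
# is_int (hypothetical helper): relies on [ rejecting non-integer operands.
is_int() {
    [ "$1" -eq "$1" ] 2>/dev/null
}

for candidate in 42 -7 4.2 abc ''; do
    if is_int "$candidate"; then
        echo "'$candidate' is an integer"
    else
        echo "'$candidate' is not an integer"
    fi
done
```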

You can use a similar trick with printf, but this won't work in all shells either:

# Bash
if printf %f "$foo" >/dev/null 2>&1; then
  echo "$foo is a float"
fi

You can use %d instead to parse an integer. Be aware that the parsing may be locale-dependent (and arguably is supposed to be).
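If you rely on printf, it may help to pin the locale and to screen out inputs that printf treats specially. POSIX specifies that a numeric argument beginning with a single or double quote is taken as the character code of the following character, and an empty argument may be accepted as zero. A sketch for Bash under those assumptions (the name `is_float_printf` is made up):

```shell
# Bash
# is_float_printf (hypothetical helper): lets printf %f do the parsing.
is_float_printf() {
    case $1 in
        ''|\'*|\"*) return 1 ;;   # empty string, or printf's quote-character syntax
    esac
    # The subshell keeps the LC_ALL change from leaking into the caller.
    ( LC_ALL=C; printf %f "$1" ) >/dev/null 2>&1
}

for candidate in 1.5 -3 abc '' "'a"; do
    if is_float_printf "$candidate"; then
        echo "'$candidate' is a float"
    else
        echo "'$candidate' is not a float"
    fi
done
```

Forcing LC_ALL=C means the decimal point is always "."; drop that assignment if you want the user's locale to decide.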

-  Revision 17 as of 2008-11-22 21:55:28
  Size: 6058
  Editor: GreyCat
  Comment: first-line
+  Revision 36 as of 2014-05-02 14:28:21
  Size: 3574
  Editor: ormaaj
  Comment: Fix some bugs. ksh has always supported [[. rm'd example. =~ shouldn't be shunned due to bash 3's bad implementation. rm'd (again). =~ (and [[) will quite likely be POSIX at some point.
-Deletions are marked like this.
+Additions are marked like this.
 Line 3:
-First, you have to define what you mean by "number".  The most common case when people ask this seems to be "a non-negative integer, with no leading + sign".  Or in other words, a string of all digits.  This can be checked using standard [[glob|globs]]:
+First, you have to define what you mean by "number".  The most common case when people ask this seems to be "a non-negative integer, with no leading + sign".  Or in other words, a string of all digits.  Other times, people want to validate a floating-point input, with optional sign and optional decimal point.

=== Hand parsing ===
If you're validating a simple "string of digits", you can do it with a [[glob]]:
-Line 7:
+Line 10:
-if [[ $foo = *[^0-9]* ]]; then
+if [[ $foo != *[!0-9]* ]]; then
    echo "'$foo' is strictly numeric"
else
-Line 9:
+Line 14:
-else
    echo "'$foo' is strictly numeric"
-Line 13:
+Line 16:
-Line 17:
+Line 19:
-# ksh, POSIX
case "$foo" in
    *[!0-9]*) echo "'$foo' has a non-digit somewhere in it" ;;
    *) echo "'$foo' is strictly numeric" ;;
esac
+# POSIX
case $var in
    *[!0-9]*|'')
        printf '%s has a non-digit somewhere in it\' "$var"
        ;;
    *)
        printf '%s is strictly numeric\n' "$var"
esac >&2
-Line 23:
+Line 28:
-If what you actually mean is "a valid floating-point number" or something else more complex, then there are a few possible ways.  Standard globs aren't expressive enough to do this, but we can use [[glob|extended globs]]:
+If you need to allow a leading negative sign, or if want a valid floating-point number or something else more complex, then there are a few possible ways.  Standard globs aren't expressive enough to do this, but we can use [[glob|extended globs]]:
-Line 27:
+Line 31:
-# Bash -- extended globs must be enabled.
+# Bash -- extended globs must be enabled explicitly in versions prior to 4.1.
-Line 36:
+Line 40:
-# Bash
+# Bash / ksh
-Line 38:
+Line 42:
-[[ $foo = *[0-9]* && $foo = ?([+-])*([0-9])?(.*([0-9])) ]] &&
  echo "foo is a floating-point number"
+if [[ $foo = *[0-9]* && $foo = ?([+-])*([0-9])?(.*([0-9])) ]]; then
  echo 'foo is a floating-point number'
fi
-Line 42:
+Line 48:
-The leading test of {{{$foo}}} is to ensure that it contains at least one digit.  The extended glob, by itself, would match the empty string, or a lone {{{+}}} or {{{-}}}, which may not be desirable behavior.
+Optionally, `case..esac` may have been used in shells with extended pattern matching. The leading test of {{{$foo}}} is to ensure that it contains at least one digit.  The extended glob, by itself, would match the empty string, or a lone {{{+}}} or {{{-}}}, which may not be desirable behavior.
-Line 44:
+Line 50:
-The features enabled with {{{extglob}}} in Bash are also allowed in the Korn shell by default.  The difference here is that ksh lacks Bash's {{{[[}}} and must use {{{case}}} instead:
-Line 46:
+Line 51:
-{{{
# ksh - extended globs are on by default
case $foo in
  *[0-9]*)
    case $foo in
        ?([+-])*([0-9])?(.*([0-9]))) echo "foo is a number";;
    esac;;
esac
}}}

Note that this uses the same extended glob as the Bash example before it; the third closing parenthesis at the end of it is actually part of the {{{case}}} syntax.

If your definition of "a valid number" is even more complex, or if you need a solution that works in legacy Bourne shells, you might prefer to use an external tool's [[RegularExpression|regular expression]] syntax.  Here is a portable version, using {{{egrep}}}:
+If your definition of "a valid number" is even more complex, or if you need a solution that works in legacy Bourne shells, you might prefer to use an external tool's [[RegularExpression|regular expression]] syntax.  Here is a portable version (explained in detail [[http://www.wplug.org/wiki/Meeting-20100612#EXERCISE_TWO|here]]), using {{{egrep}}}:
-Line 62:
+Line 55:
-if test "$foo" && echo "$foo" | egrep '^[-+]?[0-9]*(\.[0-9]*)?$' >/dev/null
then
    echo "'$foo' might be a number"
+if echo "$foo" | grep -qE '^[-+]?([0-9]+\.?|[0-9]*\.[0-9]+)$'; then
    echo "'$foo' is a number"
-Line 66:
+Line 58:
-    echo "'$foo' might not be a number"
+    echo "'$foo' is not a number"
-Line 69:
+Line 61:
-(Like the extended globs, this [[RegularExpression|extended regular expression]] will match a lone {{{+}}} or {{{-}}}.  The initial {{{test}}} command only requires a non-empty string.  Closing the last "bug" is left as an exercise for the reader, mostly because GreyCat is too damned lazy to learn {{{expr(1)}}}.)

Bash version 3 and above have regular expression support in the [[ command.  However, due to serious bugs and syntax changes in Bash's [[ regex support, we '''do not recommend''' using it.  Nevertheless, if I simply omit all Bash regex answers here, someone will come along and fill them in -- and they probably won't work, or won't contain all the caveats necessary.  So, in the interest of preventing disasters, here are the Bash regex answers that you should not use.
+Bash version 3 and above have regular expression support in the [[ command.
-Line 75:
+Line 64:
-# Bash 3.1 ONLY
if [[ $foo = *[0-9]* && $foo =~ ^[-+]?[0-9]*\(\.[0-9]*\)?$ ]]; then
+# Bash
# The regexp must be stored in a var and expanded for backward compatibility with versions < 3.2

regexp='^[-+]?[0-9]*(\.[0-9]*)?$'
if [[ $foo = *[0-9]* && $foo =~ $regexp ]]; then
-Line 83:
+Line 75:
-Unfortunately, Bash changed the syntax of its regular expression support after version 3.1, so the following ''may'' work in some patched versions of Bash 3.2:
+=== Using the parsing done by [ and printf (or "using eq") ===
{{{
# fails with ksh
if [ "$foo" -eq "$foo" ] 2>/dev/null; then
 echo "$foo is an integer"
fi
}}}
`[` parses the variable and interprets it as an integer because of the `-eq`. If the parsing succeeds the test is trivially true; if it fails `[` prints an error message that `2>/dev/null` hides and sets a status different from 0.  However this method fails if the shell is ksh, because ksh evaluates the variable as an arithmetic expression.

You can use a similar trick with `printf`, but this won't work in all shells either:
-Line 86:
+Line 87:
-# Bash 3.2 *PATCHED* only!
if [[ $foo = *[0-9]* && $foo =~ ^[-+]?[0-9]*(\.[0-9]*)?$ ]]; then
    echo "'$foo' looks rather like a number"
else
    echo "'$foo' doesn't look particularly numeric to me"
+# BASH
if printf %f "$foo" >/dev/null 2>&1; then
  echo "$foo is a float"
-Line 94:
+Line 93:
-It fails rather spectacularly in bash 3.1 and in bash 3.2 without patches.

Note that the parentheses in the {{{egrep}}} regular expression and the bash 3.2.patched regular expression don't require backslashes in front of them, whereas the ones in the bash 3.1 command do.

Stuffing the Bash regex into a variable, and then using {{{[[ $foo =~ $bar ]]}}}, may also be an effective workaround in some cases.  But this belongs in a separate FAQ....

If you just want to guarantee ahead of time that a variable contains an integer, without actually checking, you can give the variable the "integer" attribute.

{{{
declare -i foo
foo=-10+1; echo "$foo"    # prints -9

foo="hello"; echo "$foo"
# the value of the variable "hello" is evaluated; if unset, foo is 0

foo="Some random string"  # results in an error.
}}}

Any value assigned to a variable with the integer attribute set is evaluated as an [[ArithmeticExpression|arithmetic expression]] just like inside `$(( ))`.  Bash will raise an error if you try to assign an invalid arithmetic expression.

In Bash and ksh93, if a variable which has been declared integer is used in a `read` command, the user's input is treated as an [[ArithmeticExpression|arithmetic expression]],  as with assignment.  In particular, if the user types an identifier, the variable will be set to the value of the variable with that name, and `read` will give no other indication of a problem.

{{{
# Bash (and ksh93, if you replace declare with typeset)
$ declare -i foo
$ read foo
hello
$ echo $foo    # prints 0; 'hello' is unset, so is treated as 0 for arithmetic purposes
$ hello=5
$ read foo     # user types hello again
hello
$ echo $foo    # prints 5, the value of 'hello' as an arithmetic expression
}}}

Pretty useless if you want to read only integers.

In the older Korn shell (ksh88), if a variable is declared integer and used in a `read` command, and the user types an invalid integer, the shell complains, the read command returns an error status, and the value of the variable is unchanged.

{{{
# ksh88
$ typeset -i foo
$ foo=42
$ read foo
hello
ksh: hello: bad number
$ echo $?
1
$ echo $foo
42
}}}
+You can use `%d` to parse an integer.  Take care that the parsing might be (is supposed to be?) [[locale]]-dependent.