Diff for "BashFAQ/054"

Differences between revisions 11 and 18 (spanning 7 versions)

How can I tell whether a variable contains a valid number?

Hand Parsing

First, you have to define what you mean by "number". The most common case when people ask this seems to be "a non-negative integer, with no leading + sign". Or in other words, a string of all digits. This can be checked using standard globs:

# Bash
if [[ $foo = *[^0-9]* ]]; then
    echo "'$foo' has a non-digit somewhere in it"
else
    echo "'$foo' is strictly numeric"
fi

The same thing can be done in Korn and POSIX shells as well, using case:

# ksh, POSIX
case "$foo" in
    *[!0-9]*) echo "'$foo' has a non-digit somewhere in it" ;;
    *) echo "'$foo' is strictly numeric" ;;
esac
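
Note that both checks above classify the empty string as "strictly numeric", because an empty string contains no non-digit. A minimal sketch of a reusable check that also rejects the empty string follows; the function name is_digits is hypothetical, chosen only for illustration.

```sh
# is_digits is a hypothetical helper name used for this sketch.
# Succeeds only if $1 is non-empty and consists solely of the digits 0-9.
is_digits() {
    case $1 in
        '' | *[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *) return 0 ;;
    esac
}

for v in 42 '' 4x2; do
    if is_digits "$v"; then
        echo "'$v' is strictly numeric"
    else
        echo "'$v' is rejected"
    fi
done
```

Because it uses only case and a standard glob, this should work in Bash, ksh, and other POSIX shells.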

If what you actually mean is "a valid floating-point number" or something else more complex, then there are a few possible ways. Standard globs aren't expressive enough to do this, but we can use extended globs:

# Bash -- extended globs must be enabled.
# Check whether the variable is all digits.
shopt -s extglob
[[ $var == +([0-9]) ]]

A more complex case:

# Bash. For old ksh, use case instead of [[ ]].
shopt -s extglob
[[ $foo = *[0-9]* && $foo = ?([+-])*([0-9])?(.*([0-9])) ]] &&
  echo "foo is a floating-point number"

The leading test of $foo is to ensure that it contains at least one digit. The extended glob, by itself, would match the empty string, or a lone + or -, which may not be desirable behavior.

If your definition of "a valid number" is even more complex, or if you need a solution that works in legacy Bourne shells, you might prefer to use an external tool's regular expression syntax. Here is a portable version, using egrep:

# Bourne
if test "$foo" && echo "$foo" | egrep '^[-+]?[0-9]*(\.[0-9]*)?$' >/dev/null
then
    echo "'$foo' might be a number"
else
    echo "'$foo' might not be a number"
fi

(Like the extended globs, this extended regular expression will match a lone + or -. The initial test command only requires a non-empty string. Closing the last "bug" is left as an exercise for the reader, mostly because GreyCat is too damned lazy to learn expr(1).)
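
One possible way to close that gap is to make the regular expression itself require a digit, either before or after the optional decimal point. The sketch below is only one approach, not the FAQ author's: the name is_float is hypothetical, and it uses grep -E (the POSIX spelling of egrep), which assumes a POSIX system rather than a legacy Bourne environment.

```sh
# is_float is a hypothetical helper name, not part of any standard tool.
is_float() {
    nl='
'
    case $1 in
        *"$nl"*) return 1 ;;   # grep matches line by line, so reject embedded newlines
    esac
    # Require at least one digit before or after the optional decimal point.
    printf '%s\n' "$1" |
        grep -E '^[-+]?([0-9]+(\.[0-9]*)?|\.[0-9]+)$' >/dev/null
}

for v in 3.14 -5 .5 + - . ''; do
    if is_float "$v"; then
        echo "'$v' might be a number"
    else
        echo "'$v' might not be a number"
    fi
done
```

With this pattern, a lone sign, a lone dot, and the empty string are all rejected, so the separate test for a non-empty string is no longer needed.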

Bash versions 3 and above support regular expressions in the [[ command, through the =~ operator.

# Put the regex in a variable for backward compatibility with versions <3.2
regexp='^[-+]?[0-9]*(\.[0-9]*)?$'
if [[ $foo = *[0-9]* && $foo =~ $regexp ]]; then
    echo "'$foo' looks rather like a number"
else
    echo "'$foo' doesn't look particularly numeric to me"
fi

Using the parsing done by [ and printf

# fails with ksh
if [ "$foo" -eq "$foo" ] 2>/dev/null; then
    echo "$foo is an integer"
fi

[ parses the variable and interprets it as an integer because of the -eq. If the parsing succeeds, the test is trivially true; if it fails, [ prints an error message (which 2>/dev/null hides) and sets a nonzero exit status. However, this method fails in ksh, because ksh evaluates the variable as an arithmetic expression; see below to learn why.
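
The trick can be wrapped in a small function so that the error redirection lives in one place. This is a sketch only: the name is_int is hypothetical, and the same ksh caveat applies, so it is not suitable there.

```sh
# is_int is a hypothetical helper name used for this sketch.
# Succeeds if [ can parse $1 as an integer (an optional sign is allowed).
is_int() {
    [ "$1" -eq "$1" ] 2>/dev/null
}

for v in 42 -7 1.5 abc ''; do
    if is_int "$v"; then
        echo "'$v' is an integer"
    else
        echo "'$v' is not an integer"
    fi
done
```

Unlike the glob checks, this accepts a leading sign, so it answers a slightly different question than "is this a string of digits".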

You can use the same trick with printf:

# POSIX
if printf "%f" "$foo" >/dev/null 2>&1; then
    echo "$foo is a float"
fi

You can use %d to parse an integer. Take care: the parsing may be locale-dependent.
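
Be aware that printf accepts more than plain decimal digits for %d. POSIX specifies that its numeric arguments may be C-style constants (hexadecimal with a 0x prefix, octal with a leading 0) and that an argument starting with a quote character stands for the numeric value of the character that follows. The sketch below, using the hypothetical helper name printf_accepts_int, shows which inputs pass such a check.

```sh
# printf_accepts_int is a hypothetical helper name used for this sketch.
# Succeeds if printf can convert $1 with %d.
printf_accepts_int() {
    printf '%d' "$1" >/dev/null 2>&1
}

# 0x1A (hex), 010 (octal) and 'a (character code) are all accepted; abc is not.
for v in 42 0x1A 010 "'a" abc; do
    if printf_accepts_int "$v"; then
        echo "accepted: $v"
    else
        echo "rejected: $v"
    fi
done
```

If only decimal digits should be allowed, a glob or regular expression check is the stricter choice.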

Using the integer type

If you just want to guarantee ahead of time that a variable contains an integer, without actually checking, you can give the variable the "integer" attribute.

declare -i foo
foo=-10+1; echo "$foo"    # prints -9

foo="hello"; echo "$foo"
# the value of the variable "hello" is evaluated; if unset, foo is 0

foo="Some random string"  # results in an error.

Any value assigned to a variable with the integer attribute set is evaluated as an arithmetic expression just like inside $(( )). Bash will raise an error if you try to assign an invalid arithmetic expression.

In Bash and ksh93, if a variable which has been declared integer is used in a read command, the user's input is treated as an arithmetic expression, as with assignment. In particular, if the user types an identifier, the variable will be set to the value of the variable with that name, and read will give no other indication of a problem.

# Bash (and ksh93, if you replace declare with typeset)
$ declare -i foo
$ read foo
hello
$ echo $foo    # prints 0; 'hello' is unset, so is treated as 0 for arithmetic purposes
$ hello=5
$ read foo     # user types hello again
hello
$ echo $foo    # prints 5, the value of 'hello' as an arithmetic expression

This makes the integer attribute pretty useless if you want read to accept only integers.
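
A more dependable pattern is to read the input as a plain string and validate it before treating it as a number. The following is a minimal sketch under that assumption; the name read_uint is hypothetical, and it accepts only non-negative integers written as plain digits.

```sh
# read_uint is a hypothetical helper name used for this sketch.
# Reads one line from standard input. If the line is non-empty and all
# digits, prints it and succeeds; otherwise prints nothing and fails.
read_uint() {
    IFS= read -r _input || return 1
    case $_input in
        '' | *[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    printf '%s\n' "$_input"
}
```

It could be used as: if n=$(read_uint); then echo "got $n"; else echo "not an integer" >&2; fi. Since the input is never evaluated as an arithmetic expression, typing an identifier such as hello is simply rejected.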

In the older Korn shell (ksh88), if a variable is declared integer and used in a read command, and the user types an invalid integer, the shell complains, the read command returns an error status, and the value of the variable is unchanged.

# ksh88
$ typeset -i foo
$ foo=42
$ read foo
hello
ksh: hello: bad number
$ echo $?
1
$ echo $foo
42

- Revision 11 as of 2008-03-06 14:13:49
  Size: 5498
  Editor: GreyCat
  Comment: expand on "declare integer"
+ Revision 18 as of 2010-01-15 20:30:25
  Size: 5228
  Editor: ppp089210037022
  Comment: a bit of clean, add the solutions using [ and printf
-Deletions are marked like this.
+Additions are marked like this.
 Line 1:
-[[Anchor(faq54)]]
+<<Anchor(faq54)>>
 Line 4:
-First, you have to define what you mean by "number".  The most common case seems to be that, when people ask this, they actually mean "a non-negative integer, with no leading + sign".  Or in other words, a string of all digits.
+=== Hand Parsing ===
First, you have to define what you mean by "number".  The most common case when people ask this seems to be "a non-negative integer, with no leading + sign".  Or in other words, a string of all digits.  This can be checked using standard [[glob|globs]]:
-Line 7:
+Line 8:
+# Bash
-Line 14:
+Line 16:
-This can be done in Korn and legacy Bourne shells as well, using {{{case}}}:
+The same thing can be done in Korn and POSIX shells as well, using {{{case}}}:
-Line 17:
+Line 19:
+# ksh, POSIX
-Line 23:
+Line 26:
-If what you actually mean is "a valid floating-point number" or something else more complex, then there are a few possible ways.  One of them is to use Bash's {{{extglob}}} capability:
+If what you actually mean is "a valid floating-point number" or something else more complex, then there are a few possible ways.  Standard globs aren't expressive enough to do this, but we can use [[glob|extended globs]]:
-Line 27:
+Line 29:
-# Bash example; extended globs are disabled by default
+# Bash -- extended globs must be enabled.
# Check whether the variable is all digits.
-Line 29:
+Line 32:
-[[ $foo = *[0-9]* && $foo = ?([+-])*([0-9])?(.*([0-9])) ]] && echo "foo is a number"
+[[ $var == +([0-9]) ]]
}}}

A more complex case:

{{{
# Bash use case instead of [[ ]] for old ksh 
shopt -s extglob
[[ $foo = *[0-9]* && $foo = ?([+-])*([0-9])?(.*([0-9])) ]] &&
  echo "foo is a floating-point number"
-Line 34:
+Line 46:
-The features enabled with {{{extglob}}} in Bash are also allowed in the Korn shell by default.  The difference here is that Ksh lacks Bash's {{{[[}}} and must use {{{case}}} instead:
+If your definition of "a valid number" is even more complex, or if you need a solution that works in legacy Bourne shells, you might prefer to use an external tool's [[RegularExpression|regular expression]] syntax.  Here is a portable version, using {{{egrep}}}:
-Line 37:
+Line 49:
-# Ksh example using extended globs
case $foo in
  *[0-9]*)
    case $foo in
        ?([+-])*([0-9])?(.*([0-9]))) echo "foo is a number";;
    esac;;
esac
}}}

Note that this uses the same extended glob as the Bash example before it; the third closing parenthesis at the end of it is actually part of the {{{case}}} syntax.

If your definition of "a valid number" is even more complex, or if you need a solution that works in legacy Bourne shells, you might prefer to use a regular expression.  Here is a portable version, using {{{egrep}}}:

{{{
if test "$foo" && echo "$foo" | egrep '^[-+]?[0-9]*(\.[0-9]*)?$' >/dev/null; then
+# Bourne
if test "$foo" && echo "$foo" | egrep '^[-+]?[0-9]*(\.[0-9]*)?$' >/dev/null
then
 Line 58:
-(Like the extended globs, this extended regular expression matches a lone {{{+}}} or {{{-}}}, and the code may therefore require adjustment.  The initial {{{test}}} command only requires a non-empty string.  Closing the last "bug" is left as an exercise for the reader, mostly because GreyCat is too damned lazy to learn {{{expr(1)}}}.)
+(Like the extended globs, this [[RegularExpression|extended regular expression]] will match a lone {{{+}}} or {{{-}}}.  The initial {{{test}}} command only requires a non-empty string.  Closing the last "bug" is left as an exercise for the reader, mostly because GreyCat is too damned lazy to learn {{{expr(1)}}}.)
 Line 60:
-Bash version 3 and above have regular expression support in the [[ command.  However, due to serious bugs and syntax changes in Bash's [[ regex support, we '''do not recommend''' using it.  Nevertheless, if I simply omit all Bash regex answers here, someone will come along and fill them in -- and they probably won't work, or won't contain all the caveats necessary.  So, in the interest of preventing disasters, here are the Bash regex answers that you should not use.
+Bash version 3 and above have regular expression support in the [[ command.
 Line 63:
-if [[ $foo = *[0-9]* && $foo =~ ^[-+]?[0-9]*\(\.[0-9]*\)?$ ]]; then  # Bash 3.1 only!
+# put it in a var for backward compatibility with versions <3.2
regexp='^[-+]?[0-9]*(\.[0-9]*)?$' 
if [[ $foo = *[0-9]* && $foo =~ $var ]]; then
-Line 70:
+Line 72:
-Unfortunately, Bash changed the syntax of its regular expression support after version 3.1, so the following ''may'' work in some patched versions of Bash 3.2:
+=== Using the parsing done by [ and printf ===
-Line 73:
+Line 75:
-if [[ $foo = *[0-9]* && $foo =~ ^[-+]?[0-9]*(\.[0-9]*)?$ ]]; then    # **PATCHED** Bash 3.2 only!
    echo "'$foo' looks rather like a number"
else
    echo "'$foo' doesn't look particularly numeric to me"
+# fails with ksh
if [ "$foo" -eq "$foo" ] 2>/dev/null;then
 echo "$foo is an integer"
fi
}}} 

[ parses the variable and interprets it as in integer because of
the -eq. If the parsing succeds the test is trivially true, if it fails [ prints an error message
that 2>/dev/null hides and set a status different from 0.
However this method fails if the shell is ksh, because
ksh evaluates the variable as an arithmetic expression, see below
to learn while.

You can use the same trick with printf:
{{{
#  posix 
if printf "%f" "$foo" >/dev/null 2>&1;then
  echo "$foo is a float"
-Line 80:
+Line 96:
-It fails rather spectacularly in bash 3.1 and in bash 3.2 without patches.
+You can use %d to parse an integer.
Take care that the parsing might (is supposed to be?) be locale dependant.
-Line 82:
+Line 99:
-Note that the parentheses in the {{{egrep}}} regular expression and the bash 3.2.patched regular expression don't require backslashes in front of them, whereas the ones in the bash 3.1 command do.

Stuffing the Bash regex into a variable, and then using {{{[[ $foo =~ $bar ]]}}}, may also be an effective workaround in some cases.  But this belongs in a separate FAQ....
+=== Using the integer type ===
-Line 91:
+Line 105:
-foo=-10+1; echo "$foo" #prints -9
foo="hello"; echo "$foo" # the value of the variable "hello" is evaluated, if unset foo is assigned 0
foo="Some random string" #result in an error.
+foo=-10+1; echo "$foo"    # prints -9

foo="hello"; echo "$foo"
# the value of the variable "hello" is evaluated; if unset, foo is 0

foo="Some random string"  # results in an error.
-Line 96:
+Line 113:
-Any value assigned to a variable with the integer attribute set is evaluated as an [:ArithmeticExpression:arithmetic expression] just like inside `$(( ))`.  Bash will raise an error if you try to assign an invalid arithmetic expression.
+Any value assigned to a variable with the integer attribute set is evaluated as an [[ArithmeticExpression|arithmetic expression]] just like inside `$(( ))`.  Bash will raise an error if you try to assign an invalid arithmetic expression.
-Line 98:
+Line 115:
-In Bash and ksh93, if a variable which has been declared integer is used in a `read` command, and the user types an invalid integer variable, the variable will have a value of 0 instead of whatever the user typed.  `read` will give no other indication of a problem.
+In Bash and ksh93, if a variable which has been declared integer is used in a `read` command, the user's input is treated as an [[ArithmeticExpression|arithmetic expression]],  as with assignment.  In particular, if the user types an identifier, the variable will be set to the value of the variable with that name, and `read` will give no other indication of a problem.
-Line 101:
+Line 118:
-# bash (and ksh93, if you replace declare with typeset)
declare -i foo
read foo     # user types hello; bash doesn't complain, and read returns success
echo $foo    # prints 0
+# Bash (and ksh93, if you replace declare with typeset)
$ declare -i foo
$ read foo
hello
$ echo $foo    # prints 0; 'hello' is unset, so is treated as 0 for arithmetic purposes
$ hello=5
$ read foo     # user types hello again
hello
$ echo $foo    # prints 5, the value of 'hello' as an arithmetic expression
-Line 107:
+Line 129:
-In the Korn shell (ksh88), if a variable is declared integer and used in a `read` command, and the user types an invalid integer, the shell complains, the read command returns an error status, and the value of the variable is unchanged.
+Pretty useless if you want to read only integers.

In the older Korn shell (ksh88), if a variable is declared integer and used in a `read` command, and the user types an invalid integer, the shell complains, the read command returns an error status, and the value of the variable is unchanged.