Diff for "BashFAQ/054"

Differences between revisions 6 and 45 (spanning 39 versions)

How can I tell whether a variable contains a valid number?

First, you have to define what you mean by "number". The most common case when people ask this seems to be "a non-negative integer, with no leading + sign". Or in other words, a string of all digits. Other times, people want to validate a floating-point input, with optional sign and optional decimal point.

Hand parsing

If you're validating a simple "string of digits", you can do it with a glob:

# Bash
if [[ $foo != *[!0-9]* ]]; then
    echo "'$foo' is strictly numeric"
else
    echo "'$foo' has a non-digit somewhere in it"
fi

The same thing can be done in Korn and POSIX shells as well, using case:

# POSIX
case $var in
    '')
        printf 'var is empty\n';;
    *[!0-9]*)
        printf '%s has a non-digit somewhere in it\n' "$var";;
    *)
        printf '%s is strictly numeric\n' "$var";;
esac >&2

Of course, if all you care about is valid vs. invalid, you can combine cases:

# POSIX
case $var in
    '' | *[!0-9]*)
        echo "$0: $var: invalid digit" >&2; exit 1;;
esac
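
The combined test also fits into a small function that reports validity through its exit status, which makes it easy to reuse. This is only a sketch: the name `is_uint` and the sample values are made up for illustration and are not part of the FAQ.

```shell
# is_uint (hypothetical name): succeed only if $1 is a non-empty string of digits.
is_uint() {
    case $1 in
        '' | *[!0-9]*) return 1;;   # empty, or contains a non-digit
        *)             return 0;;
    esac
}

# Try the function on a few sample values.
for candidate in 42 007 '' -5 12a; do
    if is_uint "$candidate"; then
        printf '"%s" is valid\n' "$candidate"
    else
        printf '"%s" is invalid\n' "$candidate"
    fi
done
```

Returning a status instead of printing leaves the error message and the decision to exit up to the caller.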

If you need to allow a leading sign, or if you want a valid floating-point number or something else more complex, then there are a few possible ways. Standard globs aren't expressive enough to do this directly, but you can trim off any sign and then compare:

# POSIX
case ${var#[-+]} in   # notice ${var#prefix} substitution to trim sign
    '')
        printf 'var is empty\n';;
    .)
        printf '%s has no digits in it\n' "$var";;
    *.*.*)
        printf '%s has more than one decimal point in it\n' "$var";;
    *[!0-9.]*)
        printf '%s has a non-digit somewhere in it\n' "$var";;
    *)
        printf '%s looks like a valid float\n' "$var";;
esac >&2
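
The same sign-trimming idea can be packaged as a yes/no test. The following is a minimal sketch, assuming a hypothetical function name `is_float`; it also treats a lone decimal point as invalid, since such a string contains no digits.

```shell
# is_float (hypothetical name): optional sign, digits, at most one decimal
# point, and at least one digit overall.
is_float() {
    case ${1#[-+]} in       # strip one leading sign, if present
        '' | . | *.*.* | *[!0-9.]*)
            return 1;;      # empty, lone dot, two or more dots, or a non-digit
        *)
            return 0;;
    esac
}

if is_float "-3.14"; then
    echo "valid"
fi
```

Only one sign is stripped, so a value such as `+-5` still fails: the remaining `-` is caught by the non-digit pattern.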

Or in Bash, we can use extended globs:

# Bash -- extended globs must be enabled explicitly in versions prior to 4.1.
# Check whether the variable is all digits.
shopt -s extglob
[[ $var = +([0-9]) ]]

A more complex case:

# Bash / ksh
shopt -s extglob

if [[ $foo = *[0-9]* && $foo = ?([+-])*([0-9])?(.*([0-9])) ]]; then
  echo 'foo is a floating-point number'
fi

Alternatively, case..esac can be used in shells with extended pattern matching. The leading test of $foo ensures that it contains at least one digit, which also rules out an empty string and a + or - by itself.

If your definition of "a valid number" is even more complex, or if you need a solution that works in legacy Bourne shells, you might prefer to use an external tool's regular expression syntax. Here is a portable version (explained in detail here), using grep -E (the modern spelling of egrep):

# Bourne
if echo "$foo" | grep -qE '^[-+]?([0-9]+\.?|[0-9]*\.[0-9]+)$'; then
    echo "'$foo' is a number"
else
    echo "'$foo' is not a number"
fi
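
One thing to keep in mind with this approach is that grep matches line by line, so a value containing a newline passes if any single line looks numeric. In addition, some echo implementations interpret backslash sequences in the value. The sketch below works around both, assuming a POSIX printf and grep -E are available; the function name `is_number_re` is hypothetical.

```shell
# is_number_re (hypothetical name): same regular expression as above,
# with a guard against embedded newlines and printf instead of echo.
is_number_re() {
    case $1 in
        *'
'*) return 1;;   # reject values containing a newline: grep tests each line separately
    esac
    printf '%s\n' "$1" | grep -Eq '^[-+]?([0-9]+\.?|[0-9]*\.[0-9]+)$'
}

if is_number_re "-3.5"; then
    echo "'-3.5' is a number"
fi
```

The function's exit status is simply that of grep, so it can be used directly in an if statement.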

Bash versions 3 and above have regular expression support in the [[ command:

# Bash
# The regexp must be stored in a var and expanded for backward compatibility with versions < 3.2

regexp='^[-+]?[0-9]*(\.[0-9]*)?$'
if [[ $foo = *[0-9]* && $foo =~ $regexp ]]; then
    echo "'$foo' looks rather like a number"
else
    echo "'$foo' doesn't look particularly numeric to me"
fi

Using the parsing done by [ and printf (or "using eq")

# fails with ksh
if [ "$foo" -eq "$foo" ] 2>/dev/null; then
 echo "$foo is an integer"
fi

[ parses the variable and interprets it as an integer because of the -eq. If the parsing succeeds, the test is trivially true; if it fails, [ prints an error message, which 2>/dev/null hides, and returns a nonzero status. However, this method fails in ksh, because ksh evaluates the variable as an arithmetic expression.
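
As a rough sketch, the -eq test can be wrapped in a function so that the redirection lives in one place. The name `is_int_eq` is hypothetical, and the ksh caveat described above still applies.

```shell
# is_int_eq (hypothetical name): let [ do the integer parsing.
# Not suitable for ksh, which evaluates the operands arithmetically.
is_int_eq() {
    [ "$1" -eq "$1" ] 2>/dev/null
}

for candidate in 42 -7 abc 1.5; do
    if is_int_eq "$candidate"; then
        echo "$candidate is an integer"
    else
        echo "$candidate is not an integer"
    fi
done
```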

Be careful: the following trick with printf (not supported by all shells)

# Bash
if printf %f "$foo" >/dev/null 2>&1; then
  echo "$foo is a float"
fi

is broken: for the arguments of the a, A, e, E, f, F, g, and G conversion specifiers, POSIX specifies that if the leading character is a single-quote or double-quote, the value shall be the numeric value in the underlying codeset of the character following the single-quote or double-quote. Hence the check gives a false positive when foo expands to a string with a leading single-quote or double-quote: the command above will happily validate such a string as a float.
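
The false positive is easy to reproduce. The snippet below is only a demonstration, using a made-up sample value; it relies on the shell's printf following the POSIX rule quoted above, as Bash's builtin does.

```shell
# A leading quote makes printf use the character code of the next
# character instead of parsing a number.
foo="'a"

if printf %f "$foo" >/dev/null 2>&1; then
    echo "printf accepted <$foo> even though it is not a number"
fi

# %d follows the same rule; in an ASCII-compatible codeset, 'a' is 97.
value=$(printf %d "$foo")
echo "printf %d produced: $value"
```

Rejecting values that begin with a quote before calling printf would avoid this particular case, but the glob-based tests shown earlier are simpler.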

You can use %d in the same way to parse an integer, but the same leading-quote caveat applies. Take care that the parsing might be (is supposed to be?) locale-dependent.

-  ⇤ ← Revision 6 as of 2007-07-28 04:53:56 → 
  Size: 2698
  Editor: ppp020-014
  Comment: add matching with extglob
+   ← Revision 45 as of 2015-03-01 16:03:15 → ⇥
  Size: 4661
  Editor: pdm-l03
  Comment:
 Line 1:
-[[Anchor(faq54)]]
+<<Anchor(faq54)>>
-Line 3:
+Line 4:
+First, you have to define what you mean by "number".  The most common case when people ask this seems to be "a non-negative integer, with no leading + sign".  Or in other words, a string of all digits.  Other times, people want to validate a floating-point input, with optional sign and optional decimal point.
-Line 4:
+Line 6:
-First, you have to define what you mean by "number".  The most common case seems to be that, when people ask this, they actually mean "a non-negative integer, with no leading + sign".
+=== Hand parsing ===
If you're validating a simple "string of digits", you can do it with a [[glob]]:
-Line 7:
+Line 10:
-if [[ $foo = *[^0-9]* ]]; then
   echo "'$foo' has a non-digit somewhere in it"
+# Bash
if [[ $foo != *[!0-9]* ]]; then
    echo "'$foo' is strictly numeric"
-Line 10:
+Line 14:
-   echo "'$foo' is strictly numeric"
+    echo "'$foo' has a non-digit somewhere in it"
-Line 13:
+Line 17:
-This can be done in Korn and legacy Bourne shells as well, using {{{case}}}:
+The same thing can be done in Korn and POSIX shells as well, using {{{case}}}:
-Line 17:
+Line 20:
-case "$foo" in
    *[!0-9]*) echo "'$foo' has a non-digit somewhere in it" ;;
    *) echo "'$foo' is strictly numeric" ;;
+# POSIX
case $var in
    '')
        printf 'var is empty\n';;
    *[!0-9]*)
        printf '%s has a non-digit somewhere in it\n' "$var";;
    *)
        printf '%s is strictly numeric\n' "$var";;
esac >&2
}}}
Of course, if all you care about is vald vs. invalid, you can combine cases:

{{{
# POSIX
case $var in
    '' | *[!0-9]*)
        echo "$0: $var: invalid digit" >&2; exit 1;;
-Line 22:
+Line 39:
-If what you actually mean is "a valid floating-point number" or something else more complex, then you might prefer to use a regular expression.  Here is a portable version, using {{{egrep}}}:
+If you need to allow a leading negative sign, or if want a valid floating-point number or something else more complex, then there are a few possible ways.  Standard globs aren't expressive enough to do this, but you can trim off any sign and then compare:
-Line 26:
+Line 42:
-if test "$foo" && echo "$foo" | egrep '^[-+]?[0-9]*(\.[0-9]+)?$' >/dev/null; then
    echo "'$foo' might be a number"
else
    echo "'$foo' might not be a number"
+# POSIX
case ${var#[-+]} in   # notice ${var#prefix} substitution to trim sign
    '')
        printf 'var is empty\n';;
    *.*.*)
        printf '%s has more than one decimal point in it\n' "$var";;
    *[!0-9.]*)
        printf '%s has a non-digit somewhere in it\n' "$var";;
    *)
        printf '%s looks like a valid float\n' "$var";;
esac >&2
}}}
Or in Bash, we can use [[glob|extended globs]]:

{{{
# Bash -- extended globs must be enabled explicitly in versions prior to 4.1.
# Check whether the variable is all digits.
shopt -s extglob
[[ $var = +([0-9]) ]]
}}}
A more complex case:

{{{
# Bash / ksh
shopt -s extglob

if [[ $foo = @(*[0-9]*|!([+-]|)) && $foo = ?([+-])*([0-9])?(.*([0-9])) ]]; then
  echo 'foo is a floating-point number'
-Line 32:
+Line 72:
+Optionally, `case..esac` may have been used in shells with extended pattern matching. The leading test of {{{$foo}}} is to ensure that it contains at least one digit, isn't empty, and contains more than just + or - by itself.
-Line 33:
+Line 74:
-The leading test of {{{"$foo"}}} is to ensure that it is not an empty string.  (An empty string would satisfy the regex, and changing the regex to avoid that is not worth the effort.)

Bash version 3 and above have regular expression support in the [[ command.  However, due to serious bugs and syntax changes in Bash's [[ regex support, we '''do not recommend''' using it.  Nevertheless, if I simply omit all Bash regex answers here, someone will come along and fill them in -- and they probably won't work, or won't contain all the caveats necessary.  So, in the interest of preventing disasters, here are the Bash regex answers that you should not use.
+If your definition of "a valid number" is even more complex, or if you need a solution that works in legacy Bourne shells, you might prefer to use an external tool's [[RegularExpression|regular expression]] syntax.  Here is a portable version (explained in detail [[http://www.wplug.org/wiki/Meeting-20100612#EXERCISE_TWO|here]]), using {{{egrep}}}:
-Line 38:
+Line 77:
-if [[ $foo && $foo =~ ^[-+]?[0-9]*\(\.[0-9]+\)?$ ]]; then  # Bash 3.1 only!
+# Bourne
if echo "$foo" | grep -qE '^[-+]?([0-9]+\.?|[0-9]*\.[0-9]+)$'; then
    echo "'$foo' is a number"
else
    echo "'$foo' is not a number"
fi
}}}
Bash version 3 and above have regular expression support in the [[ command.

{{{
# Bash
# The regexp must be stored in a var and expanded for backward compatibility with versions < 3.2

regexp='^[-+]?[0-9]*(\.[0-9]*)?$'
if [[ $foo = *[0-9]* && $foo =~ $regexp ]]; then
-Line 44:
+Line 97:
+=== Using the parsing done by [ and printf (or "using eq") ===
{{{
# fails with ksh
if [ "$foo" -eq "$foo" ] 2>/dev/null; then
 echo "$foo is an integer"
fi
}}}
`[` parses the variable and interprets it as an integer because of the `-eq`. If the parsing succeeds the test is trivially true; if it fails `[` prints an error message that `2>/dev/null` hides and sets a status different from 0.  However this method fails if the shell is ksh, because ksh evaluates the variable as an arithmetic expression.
-Line 45:
+Line 106:
-Unfortunately, bash changed the syntax of its regular expression support after version 3.1, so the following ''may'' work in some patched versions of bash 3.2:
+Be careful: the following trick with `printf` (no supported by all shells)
-Line 48:
+Line 109:
-if [[ $foo && $foo =~ ^[-+]?[0-9]*(\.[0-9]+)?$ ]]; then    # **PATCHED** Bash 3.2 only!
    echo "'$foo' looks rather like a number"
else
    echo "'$foo' doesn't look particularly numeric to me"
+# BASH
if printf %f "$foo" >/dev/null 2>&1; then
  echo "$foo is a float"
-Line 54:
+Line 114:
+is broken: about the arguments of the{{{ a}}}, {{{A}}}, {{{e}}}, {{{E}}}, {{{f}}}, {{{F}}}, {{{g}}}, or {{{G}}} format modifiers, POSIX specifies that ''if the leading character is a single-quote or double-quote, the value shall be the numeric value in the underlying codeset of the character following the single-quote or double-quote.'' Hence this fails when {{{foo}}} expands to a string with a leading single-quote or double-quote: the previous command will happily validate the string as a float.
-Line 55:
+Line 116:
-It fails rather spectacularly in bash 3.1 and in bash 3.2 without patches.

Note that the parentheses in the {{{egrep}}} regular expression and the bash 3.2.patched regular expression don't require backslashes in front of them, whereas the ones in the bash 3.1 command do.

Another possibility with bash is perhaps to use extglob:
{{{
shopt -s extglob
[[ +1234.43 = *([+-])+([0-9])*(.+([0-9])) ]] && echo "this is a number"
}}}
+You can use `%d` to parse an integer.  Take care that the parsing might be (is supposed to be?) [[locale]]-dependent.