Differences between revisions 2 and 45 (spanning 43 versions)
Revision 2 as of 2007-05-07 18:02:07
Size: 1556
Editor: GreyCat
Comment: [0-9]* before the decimal, not [0-9]+, so we can have .5 as a number. This requires checking non-blank-ness.
Revision 45 as of 2015-03-01 16:03:15
Size: 4661
Editor: pdm-l03
Comment:
How can I tell whether a variable contains a valid number?

First, you have to define what you mean by "number". The most common case when people ask this seems to be "a non-negative integer, with no leading + sign". Or in other words, a string of all digits. Other times, people want to validate a floating-point input, with optional sign and optional decimal point.

Hand parsing

If you're validating a simple "string of digits", you can do it with a glob:

# Bash
if [[ $foo != *[!0-9]* ]]; then
    echo "'$foo' is strictly numeric"
else
    echo "'$foo' has a non-digit somewhere in it"
fi

The same thing can be done in Korn and POSIX shells as well, using case:

# POSIX
case $var in
    '')
        printf 'var is empty\n';;
    *[!0-9]*)
        printf '%s has a non-digit somewhere in it\n' "$var";;
    *)
        printf '%s is strictly numeric\n' "$var";;
esac >&2

Of course, if all you care about is valid vs. invalid, you can combine cases:

# POSIX
case $var in
    '' | *[!0-9]*)
        echo "$0: $var: invalid digit" >&2; exit 1;;
esac
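
When the check is needed in more than one place, the combined case can be wrapped in a function that only sets an exit status. This is a minimal sketch; the name is_uint is hypothetical and not part of the original page.

```shell
# POSIX
# is_uint (hypothetical name): succeeds only for a non-empty string of digits.
is_uint() {
    case $1 in
        '' | *[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    return 0
}

# Example use: report what the function decides for a few sample values.
for candidate in 42 007 -1 4a; do
    if is_uint "$candidate"; then
        echo "'$candidate' is strictly numeric"
    else
        echo "'$candidate' is not a string of digits"
    fi
done
```

Returning a status instead of printing leaves the choice of error message to the caller.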

If you need to allow a leading negative sign, or if you want a valid floating-point number or something else more complex, then there are a few possible ways. Standard globs aren't expressive enough to do this in a single pattern, but you can trim off any sign and then compare:

# POSIX
case ${var#[-+]} in   # notice ${var#prefix} substitution to trim sign
    '')
        printf 'var is empty\n';;
    *.*.*)
        printf '%s has more than one decimal point in it\n' "$var";;
    *[!0-9.]*)
        printf '%s has a non-digit somewhere in it\n' "$var";;
    *)
        printf '%s looks like a valid float\n' "$var";;
esac >&2
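
Note that the patterns above still accept a lone decimal point (for example "." or "+."), because nothing requires a digit to be present. A stricter variant, sketched here as a function with the hypothetical name is_float, adds one more pattern for that case:

```shell
# POSIX
# is_float (hypothetical name): optional sign, digits, at most one decimal
# point, and at least one digit.
is_float() {
    case ${1#[-+]} in                  # trim one leading sign, as above
        '' | . | *.*.* | *[!0-9.]*)    # empty, lone ".", two dots, or a non-digit
            return 1 ;;
    esac
    return 0
}

# Example use:
for candidate in 3.14 -0.5 .5 5. . + 1.2.3 1e5; do
    if is_float "$candidate"; then
        echo "'$candidate' looks like a valid float"
    else
        echo "'$candidate' is rejected"
    fi
done
```

This sketch deliberately rejects exponent notation such as 1e5; whether that is wanted depends on your definition of "number".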

Or in Bash, we can use extended globs:

# Bash -- extended globs must be enabled explicitly in versions prior to 4.1.
# Check whether the variable is all digits.
shopt -s extglob
[[ $var = +([0-9]) ]]

A more complex case:

# Bash / ksh
shopt -s extglob

if [[ $foo = *[0-9]* && $foo = ?([+-])*([0-9])?(.*([0-9])) ]]; then
  echo 'foo is a floating-point number'
fi

Alternatively, case..esac could be used in shells with extended pattern matching. The leading test of $foo ensures that it contains at least one digit, which also rules out an empty string and a + or - by itself.
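
A quick way to see what an extended-glob check accepts is to run it over a handful of sample inputs. The following is a sketch only: the function name is_float_glob is hypothetical, and it uses a plain *[0-9]* test to require at least one digit.

```shell
# Bash only. is_float_glob is a hypothetical helper name.
shopt -s extglob

is_float_glob() {
    # At least one digit, then: optional sign, digits, optional "." plus digits.
    [[ $1 = *[0-9]* && $1 = ?([+-])*([0-9])?(.*([0-9])) ]]
}

# Example use: print the verdict for each sample value.
for candidate in 1 -1.5 .5 5. . + 1.2.3 abc; do
    if is_float_glob "$candidate"; then
        echo "accepted: '$candidate'"
    else
        echo "rejected: '$candidate'"
    fi
done
```

Forms such as .5 and 5. are accepted by this pattern; tighten the digit groups from *([0-9]) to +([0-9]) if you want to require digits on a particular side of the decimal point.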

If your definition of "a valid number" is even more complex, or if you need a solution that works in legacy Bourne shells, you might prefer to use an external tool's regular expression syntax. Here is a portable version (explained in detail at http://www.wplug.org/wiki/Meeting-20100612#EXERCISE_TWO), using grep -E:

# Bourne
if echo "$foo" | grep -qE '^[-+]?([0-9]+\.?|[0-9]*\.[0-9]+)$'; then
    echo "'$foo' is a number"
else
    echo "'$foo' is not a number"
fi
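
Two details are worth keeping in mind with the grep approach: grep matches line by line, so a multi-line value passes if any one of its lines looks like a number, and some echo implementations interpret backslash escapes in their argument. The sketch below guards against both; the function name is_number_grep and the variable nl are hypothetical, and printf is assumed to be available.

```shell
# POSIX
# is_number_grep (hypothetical name): same regular expression as above,
# fed through printf and guarded against embedded newlines.
nl='
'
is_number_grep() {
    case $1 in
        *"$nl"*) return 1 ;;   # grep would test each line separately
    esac
    printf '%s\n' "$1" | grep -qE '^[-+]?([0-9]+\.?|[0-9]*\.[0-9]+)$'
}

# Example use:
for candidate in 42 -1.5 .5 abc 1.2.3; do
    if is_number_grep "$candidate"; then
        echo "'$candidate' is a number"
    else
        echo "'$candidate' is not a number"
    fi
done
```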

Bash version 3 and above supports regular expressions in the [[ command:

# Bash
# The regexp must be stored in a var and expanded for backward compatibility with versions < 3.2

regexp='^[-+]?[0-9]*(\.[0-9]*)?$'
if [[ $foo = *[0-9]* && $foo =~ $regexp ]]; then
    echo "'$foo' looks rather like a number"
else
    echo "'$foo' doesn't look particularly numeric to me"
fi
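
The same check can be packaged as a function so the regular expression lives in one place. This is a sketch under the assumption that only Bash needs to be supported; looks_numeric is a hypothetical name.

```shell
# Bash only. looks_numeric is a hypothetical helper name.
looks_numeric() {
    # Keep the regexp in a variable, as above, and expand it unquoted.
    local re='^[-+]?[0-9]*(\.[0-9]*)?$'
    # The glob test requires at least one digit; the regexp checks the shape.
    [[ $1 = *[0-9]* && $1 =~ $re ]]
}

# Example use:
for candidate in 10 -3.14 .5 5. + 1.2.3 1e5; do
    if looks_numeric "$candidate"; then
        echo "'$candidate' looks rather like a number"
    else
        echo "'$candidate' doesn't look particularly numeric"
    fi
done
```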

Using the parsing done by [ and printf (or "using eq")

# fails with ksh
if [ "$foo" -eq "$foo" ] 2>/dev/null; then
 echo "$foo is an integer"
fi

[ parses the variable and interprets it as an integer because of the -eq. If the parsing succeeds, the test is trivially true; if it fails, [ prints an error message (hidden here by 2>/dev/null) and returns a nonzero status. However, this method fails if the shell is ksh, because ksh evaluates the variable as an arithmetic expression.
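
As a sketch, the same trick wrapped in a function (is_int_eq is a hypothetical name). It has to use [ and not [[: in Bash, [[ evaluates the operands of -eq as arithmetic expressions, which brings back the ksh problem. The exact set of accepted strings depends on the shell's integer parser; Bash, for instance, may also tolerate surrounding whitespace.

```shell
# Not for ksh. is_int_eq is a hypothetical helper name.
is_int_eq() {
    # Succeeds only if [ can parse "$1" as an integer; errors are discarded.
    [ "$1" -eq "$1" ] 2>/dev/null
}

# Example use:
for candidate in 42 -7 1.5 abc; do
    if is_int_eq "$candidate"; then
        echo "'$candidate' is an integer"
    else
        echo "'$candidate' is not an integer"
    fi
done
```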

Be careful: the following trick with printf (not supported by all shells)

# BASH
if printf %f "$foo" >/dev/null 2>&1; then
  echo "$foo is a float"
fi

is broken: for the arguments of the a, A, e, E, f, F, g, and G conversion specifiers, POSIX specifies that if the leading character is a single-quote or double-quote, the value shall be the numeric value in the underlying codeset of the character following the single-quote or double-quote. Hence the check fails when foo expands to a string with a leading single or double quote: the command above will happily validate such a string as a float.
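
The failure is easy to reproduce, and the quote case can at least be screened out before printf sees the value. The following is a sketch only, with the hypothetical name is_float_printf; it does not make the approach strict, since printf may still accept other forms that the shell's float parser understands.

```shell
# Demonstration: the unguarded check accepts a value that starts with a quote.
if printf %f "'a" >/dev/null 2>&1; then
    echo "unguarded check: 'a was accepted as a float"
fi

# is_float_printf (hypothetical name): reject empty values and values that
# start with a single or double quote, then let printf do the parsing.
is_float_printf() {
    case $1 in
        '' | \'* | \"*) return 1 ;;
    esac
    printf %f "$1" >/dev/null 2>&1
}

# Example use:
for candidate in 3 "'a" '"b' abc; do
    if is_float_printf "$candidate"; then
        echo "$candidate is accepted"
    else
        echo "$candidate is rejected"
    fi
done
```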

You can use %d to parse an integer in the same way. Take care that the parsing might be (is supposed to be?) locale-dependent.

BashFAQ/054 (last edited 2022-08-01 10:02:57 by 89)