Size: 4661
Comment:
|
← Revision 56 as of 2025-01-23 23:55:16 ⇥
Size: 5745
Comment: undid rev 55 (quotes are unnecessary around the word following `case`)
|
Line 2: | Line 2: |
Line 9: | Line 8: |
{{{ # Bash if [[ $foo != *[!0-9]* ]]; then echo "'$foo' is strictly numeric" |
{{{#!highlight bash # Bash / Ksh if [[ -n $foo && $foo != *[!0123456789]* ]]; then printf '"%s" is strictly numeric\n' "$foo" |
Line 14: | Line 13: |
echo "'$foo' has a non-digit somewhere in it" fi |
printf '"%s" has a non-digit somewhere in it or is empty\n' "$foo" fi >&2 |
Line 17: | Line 16: |
The same thing can be done in Korn and POSIX shells as well, using {{{case}}}: | |
Line 19: | Line 17: |
{{{ | Avoid `[0-9]` or `[[:digit:]]` which in some locales and some systems can match characters other than 0123456789. The same thing can be done in POSIX shells as well, using {{{case}}}: {{{#!highlight bash |
Line 24: | Line 26: |
*[!0-9]*) | *[!0123456789]*) |
Line 30: | Line 32: |
Of course, if all you care about is vald vs. invalid, you can combine cases: | Of course, if all you care about is valid vs. invalid, you can combine cases: |
Line 32: | Line 34: |
{{{ | {{{#!highlight bash |
Line 35: | Line 37: |
'' | *[!0-9]*) echo "$0: $var: invalid digit" >&2; exit 1;; |
'' | *[!0123456789]*) printf '%s\n' "$0: $var: invalid digit" >&2; exit 1;; |
Line 41: | Line 43: |
{{{ | {{{#!highlight bash |
Line 46: | Line 48: |
.) printf 'var is just a dot\n';; |
|
Line 47: | Line 51: |
printf '%s has more than one decimal point in it\n' "$var";; *[!0-9.]*) printf '%s has a non-digit somewhere in it\n' "$var";; |
printf '"%s" has more than one decimal point in it\n' "$var";; *[!0123456789.]*) printf '"%s" has a non-digit somewhere in it\n' "$var";; |
Line 51: | Line 55: |
printf '%s looks like a valid float\n' "$var";; | printf '"%s" looks like a valid float\n' "$var";; |
Line 56: | Line 60: |
{{{ | {{{#!highlight bash |
Line 60: | Line 64: |
[[ $var = +([0-9]) ]] | [[ $var = +([0123456789]) ]] |
Line 64: | Line 68: |
{{{ | {{{#!highlight bash |
Line 66: | Line 70: |
shopt -s extglob | shopt -s extglob # not necessary in ksh and bash 4.1 or newer |
Line 68: | Line 72: |
if [[ $foo = @(*[0-9]*|!([+-]|)) && $foo = ?([+-])*([0-9])?(.*([0-9])) ]]; then | if [[ $foo = @(*[0123456789]*|!([+-]|)) && $foo = ?([+-])*([0123456789])?(.*([0123456789])) ]]; then |
Line 74: | Line 78: |
If your definition of "a valid number" is even more complex, or if you need a solution that works in legacy Bourne shells, you might prefer to use an external tool's [[RegularExpression|regular expression]] syntax. Here is a portable version (explained in detail [[http://www.wplug.org/wiki/Meeting-20100612#EXERCISE_TWO|here]]), using {{{egrep}}}: | If your definition of "a valid number" is even more complex, or if you need a solution that works in legacy Bourne shells, you might prefer to use an external tool's [[RegularExpression|regular expression]] syntax. Here is a portable version (explained in detail [[http://www.wplug.org/wiki/Meeting-20100612#EXERCISE_TWO|here]]), using {{{awk}}} (not `egrep` which is line-based so would be tricked by variables that contain newline characters): |
Line 76: | Line 80: |
{{{ | {{{#!highlight bash |
Line 78: | Line 82: |
if echo "$foo" | grep -qE '^[-+]?([0-9]+\.?|[0-9]*\.[0-9]+)$'; then echo "'$foo' is a number" |
if awk -- 'BEGIN {exit !(ARGV[1] ~ /^[-+]?([0123456789]+\.?|[0123456789]*\.[0123456789]+)$/)}' "$foo"; then printf '"%s" is a number\n' "$foo" |
Line 81: | Line 86: |
echo "'$foo' is not a number" | printf '"%s" is not a number\n' "$foo" |
Line 84: | Line 89: |
Bash version 3 and above have regular expression support in the [[ command. | Bash version 3 and above have regular expression support in the `[[...]]` construct. |
Line 86: | Line 91: |
{{{ | {{{#!highlight bash |
Line 90: | Line 95: |
regexp='^[-+]?[0-9]*(\.[0-9]*)?$' if [[ $foo = *[0-9]* && $foo =~ $regexp ]]; then echo "'$foo' looks rather like a number" |
regexp='^[-+]?[0123456789]*(\.[0123456789]*)?$' if [[ $foo = *[0123456789]* && $foo =~ $regexp ]]; then printf '"%s" looks rather like a number\n' "$foo" |
Line 94: | Line 99: |
echo "'$foo' doesn't look particularly numeric to me" | printf '"%s" doesn'\''t look particularly numeric to me.\n' "$foo" |
Line 98: | Line 103: |
{{{ | {{{#!highlight bash |
Line 101: | Line 106: |
echo "$foo is an integer" | printf '"%s" is an integer\n' "$foo" |
Line 104: | Line 109: |
`[` parses the variable and interprets it as an integer because of the `-eq`. If the parsing succeeds the test is trivially true; if it fails `[` prints an error message that `2>/dev/null` hides and sets a status different from 0. However this method fails if the shell is ksh, because ksh evaluates the variable as an arithmetic expression. | `[` parses the variable and interprets it a decimal integer because of the `-eq`. If the parsing succeeds the test is trivially true; if it fails `[` prints an error message that `2>/dev/null` hides and sets a status different from 0. However this method fails if the shell is ksh, because ksh evaluates the variable as an arithmetic expression (and that would constitute an arbitrary command injection vulnerability). |
Line 106: | Line 111: |
Be careful: the following trick with `printf` (no supported by all shells) | Be careful: the following trick with `printf` (not supported by all shells, and the list of supported float representations varies with the shell as well; not to mention the command injection vulnerability in ksh or zsh) |
Line 108: | Line 113: |
{{{ # BASH |
{{{#!highlight bash |
Line 111: | Line 115: |
echo "$foo is a float" | printf '"%s" is a float\n' "$foo" |
Line 114: | Line 118: |
is broken: about the arguments of the{{{ a}}}, {{{A}}}, {{{e}}}, {{{E}}}, {{{f}}}, {{{F}}}, {{{g}}}, or {{{G}}} format modifiers, POSIX specifies that ''if the leading character is a single-quote or double-quote, the value shall be the numeric value in the underlying codeset of the character following the single-quote or double-quote.'' Hence this fails when {{{foo}}} expands to a string with a leading single-quote or double-quote: the previous command will happily validate the string as a float. | is broken: about the arguments of the {{{a}}}, {{{A}}}, {{{e}}}, {{{E}}}, {{{f}}}, {{{F}}}, {{{g}}}, or {{{G}}} format modifiers, POSIX specifies that ''if the leading character is a single-quote or double-quote, the value shall be the numeric value in the underlying codeset of the character following the single-quote or double-quote.'' Hence this fails when {{{foo}}} expands to a string with a leading single-quote or double-quote: the previous command will happily validate the string as a float. It also returns 0 when {{{foo}}} expands to a number with a leading {{{0x}}}, which is a valid number in a shell script but may not work elsewhere. |
How can I tell whether a variable contains a valid number?
First, you have to define what you mean by "number". The most common case when people ask this seems to be "a non-negative integer, with no leading + sign". Or in other words, a string of all digits. Other times, people want to validate a floating-point input, with optional sign and optional decimal point.
Hand parsing
If you're validating a simple "string of digits", you can do it with a glob:
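A minimal sketch of that glob test, following the revised code in the diff above. The wrapper name `is_digits` and the sample inputs are made up for illustration; the `-n` check rejects the empty string, which would otherwise pass the negated pattern.

```shell
# Bash / Ksh
# is_digits is a hypothetical helper name used only for this example.
is_digits() {
  # True when $1 is non-empty and contains no character outside 0-9.
  [[ -n $1 && $1 != *[!0123456789]* ]]
}

for foo in 12345 12a45 ''; do
  if is_digits "$foo"; then
    printf '"%s" is strictly numeric\n' "$foo"
  else
    printf '"%s" has a non-digit somewhere in it or is empty\n' "$foo"
  fi
done
```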
Avoid [0-9] or [[:digit:]], which in some locales and on some systems can match characters other than 0123456789.
The same thing can be done in POSIX shells as well, using case:
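As a sketch, the `case` form might look like the following. Only the `*[!0123456789]*` pattern is visible in the diff above; the function name `classify_digits`, the one-word verdicts, and the sample inputs are assumptions made for illustration.

```shell
# POSIX
# classify_digits is a hypothetical helper; it prints a one-word verdict.
classify_digits() {
  case $1 in
    '')              printf 'empty\n';;
    *[!0123456789]*) printf 'non-digit\n';;
    *)               printf 'numeric\n';;
  esac
}

classify_digits 2024
classify_digits 20x4
classify_digits ''
```

The order of the branches matters: the empty string has to be caught first, because it contains no non-digit and would otherwise fall through to the last branch.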
Of course, if all you care about is valid vs. invalid, you can combine cases:
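A sketch of the combined form, built around the pattern line shown in the diff above. The function `require_digits` is hypothetical and uses `return 1` so it can be tried interactively; a script would more likely `exit 1` at that point, as the original line does.

```shell
# POSIX
# require_digits is a hypothetical helper; a script would typically
# "exit 1" where this function does "return 1".
require_digits() {
  case $1 in
    '' | *[!0123456789]*)
      printf '%s\n' "$0: $1: invalid digit" >&2
      return 1;;
  esac
}

require_digits 8080 && printf 'port accepted\n'
require_digits 80a0 || printf 'port rejected\n'
```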
If you need to allow a leading negative sign, or if you want a valid floating-point number or something else more complex, then there are a few possible ways. Standard globs aren't expressive enough to do this, but you can trim off any sign and then compare:
    # POSIX
    case ${var#[-+]} in   # notice ${var#prefix} substitution to trim sign
      '')
        printf 'var is empty\n';;
      .)
        printf 'var is just a dot\n';;
      *.*.*)
        printf '"%s" has more than one decimal point in it\n' "$var";;
      *[!0123456789.]*)
        printf '"%s" has a non-digit somewhere in it\n' "$var";;
      *)
        printf '"%s" looks like a valid float\n' "$var";;
    esac >&2
Or in Bash, we can use extended globs:
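A minimal sketch using the `+([0123456789])` test from the diff above. The helper name `all_digits` and the sample inputs are assumptions for illustration.

```shell
# Bash
shopt -s extglob   # per the diff above, not needed in ksh or bash 4.1+

# all_digits is a hypothetical helper name.
all_digits() {
  # +(...) matches one or more occurrences of the bracketed set.
  [[ $1 = +([0123456789]) ]]
}

all_digits 007 && printf '007 is all digits\n'
all_digits ''  || printf 'the empty string is rejected\n'
```

Because `+(...)` requires at least one match, no separate emptiness check is needed here.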
A more complex case:
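The diff above preserves the two key lines of this example. A sketch built around them could read as follows; the wrapper `looks_numeric`, the messages, and the sample inputs are hypothetical, and the two patterns are split into two `[[ ]]` commands only to keep the lines short.

```shell
# Bash
shopt -s extglob  # not necessary in ksh and bash 4.1 or newer

# looks_numeric is a hypothetical wrapper around the two-part test.
looks_numeric() {
  local foo=$1
  [[ $foo = @(*[0123456789]*|!([+-]|)) ]] &&
    [[ $foo = ?([+-])*([0123456789])?(.*([0123456789])) ]]
}

for foo in -12.5 42 + abc 1.2.3; do
  if looks_numeric "$foo"; then
    printf '"%s" is a number\n' "$foo"
  else
    printf '"%s" is not a number\n' "$foo"
  fi
done
```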
Alternatively, case..esac could be used instead in shells with extended pattern matching. The leading test of $foo is to ensure that it contains at least one digit, isn't empty, and contains more than just + or - by itself.
If your definition of "a valid number" is even more complex, or if you need a solution that works in legacy Bourne shells, you might prefer to use an external tool's regular expression syntax. Here is a portable version (explained in detail here), using awk (not egrep which is line-based so would be tricked by variables that contain newline characters):
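A sketch of the awk-based check, using the awk program shown in the diff above. The wrapper name `is_number` and the sample value are assumptions for illustration.

```shell
# POSIX sh
# is_number is a hypothetical wrapper name; the awk program is the one
# from the diff above.
is_number() {
  # The value is passed as ARGV[1], never through a pipe, so an embedded
  # newline cannot split it into separately matched lines.
  awk -- 'BEGIN {exit !(ARGV[1] ~ /^[-+]?([0123456789]+\.?|[0123456789]*\.[0123456789]+)$/)}' "$1"
}

foo=-3.14
if is_number "$foo"; then
  printf '"%s" is a number\n' "$foo"
else
  printf '"%s" is not a number\n' "$foo"
fi
```

The program runs entirely in the `BEGIN` block, so awk never tries to open the value as an input file; `exit !(...)` turns a successful match into exit status 0.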
Bash versions 3 and above have regular expression support in the [[...]] construct:
    # Bash
    # The regexp must be stored in a var and expanded for backward compatibility with versions < 3.2

    regexp='^[-+]?[0123456789]*(\.[0123456789]*)?$'
    if [[ $foo = *[0123456789]* && $foo =~ $regexp ]]; then
      printf '"%s" looks rather like a number\n' "$foo"
    else
      printf '"%s" doesn'\''t look particularly numeric to me.\n' "$foo"
    fi
Using the parsing done by [ and printf (or "using eq")
[ parses the variable and interprets it as a decimal integer because of the -eq. If the parsing succeeds the test is trivially true; if it fails [ prints an error message that 2>/dev/null hides and sets a status different from 0. However this method fails if the shell is ksh, because ksh evaluates the variable as an arithmetic expression (and that would constitute an arbitrary command injection vulnerability).
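The test described above can be sketched as follows. The test line is reconstructed from that description (an `-eq` comparison of the variable with itself, with `2>/dev/null`); the helper name `is_integer_eq` and the sample value are assumptions for illustration.

```shell
# Bash (as noted above, do not use this in ksh)
# is_integer_eq is a hypothetical helper name.
is_integer_eq() {
  # "[" must parse both operands as integers before it can compare them;
  # a parse failure yields a non-zero status and a message we discard.
  [ "$1" -eq "$1" ] 2>/dev/null
}

foo=42
if is_integer_eq "$foo"; then
  printf '"%s" is an integer\n' "$foo"
fi
```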
Be careful: the similar trick of testing the exit status of printf with a floating-point conversion is not supported by all shells, the list of supported float representations varies with the shell as well, and it opens a command injection vulnerability in ksh or zsh. It is also broken: for the arguments of the a, A, e, E, f, F, g, or G conversion specifiers, POSIX specifies that "if the leading character is a single-quote or double-quote, the value shall be the numeric value in the underlying codeset of the character following the single-quote or double-quote." Hence this fails when foo expands to a string with a leading single-quote or double-quote: the command will happily validate the string as a float. It also returns 0 when foo expands to a number with a leading 0x, which is a valid number in a shell script but may not work elsewhere.
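To see the leading-quote problem concretely, here is a small sketch. It assumes the printf trick takes the usual form of checking the exit status of `printf %f`; the function name `printf_says_float` and the sample inputs are made up for illustration.

```shell
# Bash
# printf_says_float is a hypothetical name for the broken check; the
# "printf %f" form is an assumption based on the description above.
printf_says_float() {
  printf %f "$1" >/dev/null 2>&1
}

# A leading quote makes printf use the character code of the next
# character, so this obviously non-numeric string is accepted.
for foo in 1.5 "'a" abc; do
  if printf_says_float "$foo"; then
    printf '"%s" is accepted as a float\n' "$foo"
  else
    printf '"%s" is rejected\n' "$foo"
  fi
done
```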
You can use %d to parse an integer. Take care that the parsing might be (is supposed to be?) locale-dependent.