The IFS variable is used in shells (Bourne, POSIX, ksh, bash) as the input field separator (or internal field separator). Essentially, it is a string of special characters which are to be treated as delimiters between words/fields when splitting a line of input.

The default value of IFS is space, tab, newline. (A three-character string.) If IFS is unset, it acts as though it were set to this default value. (This is presumably for simplicity in shells that do not support the $'...' syntax for special characters.) If IFS is set to an empty string (which is very different from unsetting it!) then no splitting will be performed.
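
The difference between an unset IFS and an empty IFS can be checked with a small sketch. The `count_args` helper and the sample string are made up for illustration; the code sticks to POSIX features so it should behave the same in any of the shells mentioned above.

```shell
# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

str='one two  three'

# Unset IFS: splitting behaves as if IFS were space, tab, newline.
unset -v IFS
n_unset=$(count_args $str)

# Empty IFS: the unquoted expansion is not split at all.
IFS=
n_empty=$(count_args $str)

# Return to default splitting behavior (this does not restore any
# custom IFS value that was set before the sketch ran).
unset -v IFS

echo "unset: $n_unset, empty: $n_empty"    # unset: 3, empty: 1
```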

This variable is used in a few different places. The semantics vary slightly:

There are special rules for handling whitespace characters in IFS, in any of the field-splitting situations above (the first three bullet points). Whitespace IFS characters at the beginning and end of a string are removed entirely (except in the special case noted above), and consecutive whitespace IFS characters inside a string are treated as a single delimiter. For example, consider the following:

IFS=: read -r user pwhash uid gid gecos home shell \
   <<< 'statd:x:105:65534::/var/lib/nfs:/bin/false'

IFS=$' \t\n' read -r one two three \
   <<< '   1      2  3'

In the first example, the gecos variable is assigned the empty string, which is the contents of the field between the two adjacent colons. The colons are not consolidated together; they are treated as separate delimiters. In the second example, the one variable gets the value 1, and the two variable gets the value 2. The leading whitespace is trimmed, and the internal whitespace is consolidated.
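
The here-string syntax (<<<) used above is not available in every POSIX shell. A roughly equivalent sketch using here-documents is shown below; it uses the same sample data, except that the second read uses a single space as IFS rather than the full default value.

```shell
# Colon is not whitespace: the two adjacent colons delimit an empty field.
IFS=: read -r user pwhash uid gid gecos home shell <<'EOF'
statd:x:105:65534::/var/lib/nfs:/bin/false
EOF

# Space is IFS whitespace: leading blanks are trimmed, runs are consolidated.
IFS=' ' read -r one two three <<'EOF'
   1      2  3
EOF

echo "gecos=<$gecos> home=<$home>"           # gecos=<> home=</var/lib/nfs>
echo "one=<$one> two=<$two> three=<$three>"  # one=<1> two=<2> three=<3>
```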

If IFS contains a mixture of whitespace and non-whitespace characters, the shell treats the two kinds differently. Any non-whitespace IFS character plus all adjacent IFS whitespace characters acts as a single field delimiter. (In addition, any sequence of one or more whitespace IFS characters on its own also still counts as a delimiter.) For example:

$ IFS=' ,'
$ var='this, that   , the other'
$ printf '<%s> ' $var; echo
<this> <that> <the> <other> 

The comma and space after "this" are treated as a single field delimiter. The comma plus the multiple spaces around it after "that" form the second field delimiter. Finally, the single space after "the" is the final field delimiter, yielding a total of four words (fields).
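
One consequence of these rules is that whitespace around a comma is absorbed into the delimiter, while two adjacent commas still delimit an empty field. The following sketch counts the resulting fields; the `count_args` helper and the sample values are illustrative only.

```shell
# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

IFS=' ,'
set -f                       # no globbing while we split

padded='a , b'               # comma with IFS whitespace on both sides
doubled='a,,b'               # two adjacent commas

n_padded=$(count_args $padded)     # one delimiter  -> 2 fields
n_doubled=$(count_args $doubled)   # two delimiters -> 3 fields (middle one empty)

set +f
unset -v IFS

echo "padded: $n_padded, doubled: $n_doubled"    # padded: 2, doubled: 3
```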

A few more examples:

$ IFS= read -r a b c <<< 'the plain gold ring'
$ echo "=$a="
=the plain gold ring=

$ IFS=$' \t\n' read -r a b c <<< 'the plain gold ring'
$ echo "=$c="
=gold ring=

$ IFS=$' \t\n' read -r a b c <<< 'the    plain gold      ring'
$ echo "=$a= =$b= =$c="
=the= =plain= =gold      ring=

The first example above shows the lack of splitting when IFS is empty. The second shows the last variable name given to a read command absorbing all the remaining words of input. The third shows that splitting and delimiter consolidation are not performed on the remaining part of a line when assigning excess fields to the last variable.

$ IFS=: read -r a b c <<< '1:2:::3::4'
$ echo "=$a= =$b= =$c="
=1= =2= =::3::4=

Here's another look at having more input fields than variables. Note that out of the three consecutive colons which follow field 2, precisely one colon was removed in order to terminate field 2. The remaining two colons, as well as two more colons later on, were all left untouched, and assigned to variable c verbatim.
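
For contrast, word splitting of an unquoted expansion has no "last variable" to absorb the remainder, so every colon in the same input acts as a delimiter. This sketch (with a made-up `count_args` helper) counts the fields and then repeats the read for comparison, using a here-document for portability.

```shell
# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

line='1:2:::3::4'

# Word splitting: six colons, so seven fields (three of them empty).
IFS=:
set -f
n_fields=$(count_args $line)
set +f
unset -v IFS

# read with three names: only the first two fields are split off.
IFS=: read -r a b c <<EOF
$line
EOF

echo "fields: $n_fields, a=<$a> b=<$b>"    # fields: 7, a=<1> b=<2>
```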

IFS=:
set -f
for dir in $PATH$IFS; do
   ...
done
set +f
unset -v IFS

This example iterates through the colon-separated elements of the PATH variable, presumably to look for a command. It is an oversimplified example, but should serve to demonstrate how IFS may be used to control WordSplitting. For a better approach to finding commands, see FAQ #81.
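
To see the loop do something concrete, here is a sketch that runs the same pattern over a made-up colon-separated string instead of the real PATH. The `path_like` and `visited` names are hypothetical, and the loop body simply records each element. An empty element is recorded as "." because an empty PATH entry conventionally means the current directory.

```shell
path_like='/usr/bin::/bin'   # sample value with an empty element in the middle
visited=

IFS=:
set -f                       # keep glob characters in elements literal
for dir in $path_like$IFS; do
    [ -n "$dir" ] || dir=.   # empty element: treat as the current directory
    visited="$visited$dir|"
done
set +f
unset -v IFS

printf '%s\n' "$visited"
```

The empty element survives as its own loop iteration because colon is a non-whitespace IFS character, so adjacent colons are not consolidated.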

Special note

Regarding the behavior of $* vs. "$*" (and analogously, ${array[*]} and ${!prefix*}), IFS is used in both the quoted and unquoted forms, but it is rendered impotent in the unquoted form. Consider:

$ set -- one two three
$ IFS=+
$ echo $*
one two three

What's actually happening here? At first glance, it appears $* is generating a list of words without any delimiter, and relying on the echo to put a space between them.

But that's not the case. $* is generating a single string with our IFS delimiter between elements, but then because it's not quoted, it is split into words before echo gets it.

$ tmp=$*
$ echo "$tmp"
one+two+three

Variable assignment skips word splitting (because there's no meaningful way to handle multiple words when there's only a single variable to put things into), so here we can see the actual value of the unquoted $*, without word splitting getting in the way.

By definition, the delimiter we're putting between the words will be part of IFS, and therefore it will be word-split away. There's no way to avoid that other than using a temporary variable to suppress the word-splitting, as shown above.
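
The quoted form is what makes "$*" useful as a join operation: the elements are joined with the first character of IFS and the result is not split again. A minimal sketch, assuming a hypothetical helper named `join_with` that is called inside a command substitution so its IFS change does not leak into the calling shell:

```shell
# Hypothetical helper: join the remaining arguments using the first
# character of the first argument as the separator.
join_with() {
    IFS=$1                   # only the first character is used by "$*"
    shift
    printf '%s\n' "$*"       # quoted, so the joined string is not re-split
}

# The command substitution runs in a subshell, so IFS is unchanged here.
joined=$(join_with + one two three)

echo "$joined"               # one+two+three
```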

This is yet another reason why you should USE MORE QUOTES!



IFS (last edited 2023-05-22 10:17:31 by emanuele6)