Differences between revisions 4 and 28 (spanning 24 versions)
Revision 4 as of 2008-05-31 06:37:22
Size: 2166
Editor: pgas
Comment: add a note about quoting, still not sure about how to explain concisely and simply word splitting...
Revision 28 as of 2012-02-24 15:03:18
Size: 3562
Editor: GreyCat
Comment: Remove most of the redunant "rename" crap -- that's in FAQ 30. Make indenting and markup consistent.
Line 1: Line 1:
[[Anchor(faq20)]] <<Anchor(faq20)>>
Line 3: Line 3:
The preferred method is still to use [:UsingFind:find(1)]: First and foremost, to understand why you're having trouble, read [[Arguments]] to get a grasp on how the shell understands the statements you give it. It is vital that you grasp this matter well if you're going to be doing anything with the shell.

See also David Wheeler's essay (external link) [[http://www.dwheeler.com/essays/filenames-in-shell.html|Filenames and Pathnames in Shell: How to do it correctly]] for more discussion on this topic, including pointing out weaknesses of some proposed solutions.

The preferred method to deal with arbitrary filenames is still to use [[UsingFind|find(1)]]:
{{{
find ... -exec command {} \;
}}}

or, if you need to handle filenames ''en masse'':
{{{
find ... -exec command {} +
}}}

`xargs` is rarely ever more useful than the above, but if you ''really'' insist, remember to use `-0`:
{{{
# Requires GNU/BSD find and xargs
find ... -print0 | xargs -0 command

# Never use xargs without -0 or similar extensions!
}}}

Use one of these unless you ''really'' can't.

Another way to deal with files with spaces in their names is to use the shell's filename expansion ([[glob|globbing]]). This has the disadvantage of not working recursively (except with zsh's extensions or bash 4's `globstar`), but if you just need to process all the files in a single directory, it works fantastically well.

For example, this code renames all the `*.mp3` files in the current directory to use underscores in their names instead of spaces:
Line 6: Line 32:
    find ... -exec command {} \;
}}}

or, if you need to handle filenames ''en masse'', with GNU and recent BSD tools:

{{{
    find ... -print0 | xargs -0 command
}}}

or with POSIX {{{find}}}:

{{{
    find ... -exec command {} +
}}}

Use that unless you really can't.

Another way to deal with files with spaces in their names is to use the shell's filename expansion (["globbing"]). This has the disadvantage of not working recursively (except with zsh's extensions), but if you just need to process all the files in a single directory, it works fantastically well, but you need to '''quote all your [:BashFAQ/073:Parameter Expansions] using double
quotes''' "$file" "${file%.mp3}" ...If you don't use the quotes the expansion will be split into several words, which means
that the command will think you gave it several arguments instead of just one.

This example changes all the *.mp3 files in the current directory to use underscores in their names instead of spaces. It uses [:BashFAQ/073:Parameter Expansions] that will not work in the original BourneShell, but should be good in Korn and Bash.

{{{
for file in *.mp3; do
    mv "$file" "${file// /_}"
# Bash/ksh
for file in ./*.mp3; do
  mv "$file" "${file// /_}"
Line 35: Line 38:
You could do the same thing for all files (regardless of extension) by using For more examples of renaming files, see [[BashFAQ/030|FAQ #30]].

Remember, you need to '''quote all your [[BashFAQ/073|Parameter Expansions]] using double quotes'''. If you don't, the expansion will undergo WordSplitting (see also [[BashGuide/CommandsAndArguments#Argument_Splitting|argument splitting]] and BashPitfalls). Also, always prefix globs with "./"; otherwise, if there's a file with "-" as the first character, the expansions might be misinterpreted as options.

Another way to handle filenames recursively involves using the `-print0` option of `find` (a GNU/BSD extension), together with bash's `-d` option for read:
Line 38: Line 45:
for file in *\ *; do
}}}

instead of *.mp3.

Another way to handle filenames recursively involes using the {{{-print0}}} option of {{{find}}} (a GNU/BSD extension), together with bash's {{{-d}}} option for read:

{{{
# Bash
Line 47: Line 47:
while read -d $'\0' file; do while IFS= read -r -d $'\0' file; do
Line 52: Line 52:
The preceding example reads all the files under /tmp (recursively) into an array, even if they have newlines or other whitespace in their names, by forcing {{{read}}} to use the NUL byte (\0) as its word delimiter. Since NUL is not a valid byte in Unix filenames, this is the safest approach besides using {{{find -exec}}}. The preceding example reads all the files under `/tmp` (recursively) into an [[BashGuide/Arrays|array]], even if they have newlines or other whitespace in their names, by forcing `read` to use the NUL byte (\0) as its line delimiter. Since NUL is not a valid byte in Unix filenames, this is the safest approach besides using `find -exec`. [[IFS|IFS=]] is required to avoid trimming leading/trailing whitespace, and `-r` is needed to avoid backslash processing. In fact, `$'\0'` is equivalent to `''` so we could also write it like this:

{{{
# Bash
unset a i
while IFS= read -r -d '' file; do
  a[i++]="$file"
done < <(find /tmp -type f -print0)
}}}

So, why doesn't this work?

{{{
# DOES NOT WORK
unset a i
find /tmp -type f -print0 | while IFS= read -r -d '' file; do
  a[i++]="$file"
done
}}}

Because of the pipeline, the entire `while` loop is executed in a SubShell and therefore the array assignments will be lost after the loop terminates. (For more details about this, see [[BashFAQ/024|FAQ #24]].)

----
CategoryShell

How can I find and deal with file names containing newlines, spaces or both?

First and foremost, to understand why you're having trouble, read Arguments to learn how the shell interprets the commands you give it. It is vital that you understand this well if you're going to be doing anything with the shell.

See also David Wheeler's essay Filenames and Pathnames in Shell: How to do it correctly (http://www.dwheeler.com/essays/filenames-in-shell.html) for more discussion of this topic, including the weaknesses of some proposed solutions.

The preferred method to deal with arbitrary filenames is still to use find(1):

find ... -exec command {} \;

or, if you need to handle filenames en masse:

find ... -exec command {} +

xargs is rarely more useful than the above, but if you really insist, remember to use -0:

# Requires GNU/BSD find and xargs
find ... -print0 | xargs -0 command

# Never use xargs without -0 or similar extensions!

Use one of these unless you really can't.
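
As a rough illustration of why these forms are preferred, the sketch below builds a scratch directory with awkward names and counts how many arguments each approach delivers. The directory, the file names and the variable names are made up for the demonstration, and the xargs -0 line assumes GNU or BSD tools.

```shell
# Illustrative sketch; the scratch directory and file names are made up.
dir=$(mktemp -d)
nl='
'
touch "$dir/plain.txt" "$dir/with space.txt" "$dir/with${nl}newline.txt"

# find -exec ... {} + hands each name over as exactly one argument.
# The inline sh script only reports how many arguments it received.
exec_count=$(find "$dir" -type f -exec sh -c 'echo "$#"' sh {} +)

# NUL-delimited names survive the trip through xargs as well (GNU/BSD tools).
xargs_count=$(find "$dir" -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)

# Plain xargs splits on whitespace, so the same three files turn into more words.
naive_count=$(find "$dir" -type f | xargs sh -c 'echo "$#"' sh)

echo "-exec: $exec_count, xargs -0: $xargs_count, plain xargs: $naive_count"
rm -rf "$dir"
```

The first two counts should match the number of files created, while the plain xargs count should be larger because the space and the newline each split a name.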

Another way to deal with files with spaces in their names is to use the shell's filename expansion (globbing). This has the disadvantage of not working recursively (except with zsh's extensions or bash 4's globstar), but if you just need to process all the files in a single directory, it works fantastically well.

For example, this code renames all the *.mp3 files in the current directory to use underscores in their names instead of spaces:

# Bash/ksh
for file in ./*.mp3; do
  mv "$file" "${file// /_}"
done

For more examples of renaming files, see FAQ #30.
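
The recursive case mentioned above can be sketched with globstar. This assumes bash 4 or newer; the directory layout and file names are invented for the example, and nullglob is enabled only so that the loop body is skipped when nothing matches.

```shell
#!/usr/bin/env bash
# Illustrative sketch, assuming bash 4 or newer; the directory layout is made up.
shopt -s globstar nullglob
dir=$(mktemp -d)
mkdir -p "$dir/album one/disc 2"
touch "$dir/top.mp3" "$dir/album one/track 1.mp3" "$dir/album one/disc 2/track 2.mp3"

# With globstar set, **/ matches zero or more directory levels.
n=0
for file in "$dir"/**/*.mp3; do
  n=$((n + 1))
  printf '%s\n' "$file"
done
echo "matched $n files"

rm -rf "$dir"
```

Because the glob here starts with an absolute path, no "./" prefix is needed; a relative glob such as ./**/*.mp3 would still want one.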

Remember, you need to quote all your Parameter Expansions using double quotes. If you don't, the expansion will undergo WordSplitting (see also argument splitting and BashPitfalls), and the command will receive several arguments instead of one. Also, always prefix globs with "./"; otherwise, a file whose name begins with "-" might be misinterpreted by the command as an option.
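
A small sketch of both points, using an invented file name and scratch directory: it counts the arguments produced by an unquoted and a quoted expansion, then shows what a "./" glob expands to when a name starts with "-".

```shell
# Illustrative sketch; the file names and scratch directory are made up.
file="My Favorite Song.mp3"

# Unquoted: the expansion is word-split, so the command sees several arguments.
set -- $file
unquoted=$#

# Quoted: the expansion stays a single argument.
set -- "$file"
quoted=$#

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"

# A name beginning with "-" looks like an option unless the glob is prefixed.
dir=$(mktemp -d)
touch "$dir/-rf.mp3"
prefixed=$(cd "$dir" && printf '%s\n' ./*.mp3)
echo "glob expanded to: $prefixed"
rm -rf "$dir"
```

With the "./" prefix, the expanded word begins with "./" rather than "-", so a command such as mv or rm cannot mistake it for an option.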

Another way to handle filenames recursively involves using the -print0 option of find (a GNU/BSD extension), together with bash's -d option for read:

# Bash
unset a i
while IFS= read -r -d $'\0' file; do
  a[i++]="$file"        # or however you want to process each file
done < <(find /tmp -type f -print0)

The preceding example reads all the files under /tmp (recursively) into an array, even if they have newlines or other whitespace in their names, by forcing read to use the NUL byte (\0) as its line delimiter. Since NUL is not a valid byte in Unix filenames, this is the safest approach besides using find -exec. IFS= is required to avoid trimming leading/trailing whitespace, and -r is needed to avoid backslash processing. Bash strings cannot contain a NUL byte, so $'\0' is in fact equivalent to '' (the empty string), and read -d treats an empty delimiter as NUL. We could therefore also write it like this:

# Bash
unset a i
while IFS= read -r -d '' file; do
  a[i++]="$file"
done < <(find /tmp -type f -print0)
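
To check that the loop keeps awkward names intact, here is a hedged sketch that runs the same loop against a scratch directory instead of /tmp. The directory and file names are invented; one name contains a newline and one ends in a space.

```shell
#!/usr/bin/env bash
# Illustrative sketch; the scratch directory and file names are made up.
dir=$(mktemp -d)
touch "$dir/a b.txt" "$dir/line"$'\n'"break.txt" "$dir/trailing space "

unset a i
while IFS= read -r -d '' file; do
  a[i++]="$file"
done < <(find "$dir" -type f -print0)

echo "read ${#a[@]} names"
# Print the base names between brackets so that whitespace is visible.
printf '<%s>\n' "${a[@]##*/}"

rm -rf "$dir"
```

The array should hold one element per file; the order depends on find and is not guaranteed.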

So, why doesn't this work?

# DOES NOT WORK
unset a i
find /tmp -type f -print0 | while IFS= read -r -d '' file; do
  a[i++]="$file"
done

Because of the pipeline, the entire while loop is executed in a SubShell and therefore the array assignments will be lost after the loop terminates. (For more details about this, see FAQ #24.)
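
The difference can be observed directly. The sketch below runs both forms against a scratch directory and compares the array sizes afterwards; it assumes bash with default options (in particular, lastpipe is not enabled), and the directory, file and array names are made up.

```shell
#!/usr/bin/env bash
# Illustrative sketch; the scratch directory and file names are made up.
dir=$(mktemp -d)
touch "$dir/one" "$dir/two words"

# Pipeline: the loop runs in a subshell, so its assignments vanish afterwards.
unset piped i
find "$dir" -type f -print0 | while IFS= read -r -d '' file; do
  piped[i++]="$file"
done

# Process substitution: the loop runs in the current shell and the array survives.
unset kept i
while IFS= read -r -d '' file; do
  kept[i++]="$file"
done < <(find "$dir" -type f -print0)

echo "pipeline: ${#piped[@]} element(s); process substitution: ${#kept[@]} element(s)"
rm -rf "$dir"
```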


CategoryShell

BashFAQ/020 (last edited 2024-05-06 09:19:34 by StephaneChazelas)