3297
Comment: Removing the David Wheeler link which advices using crap like for file in $(find...)
|
4696
Fix error in loop for filenames, and add link to more details
|
Line 2: | Line 2: |
== How can I find and deal with file names containing newlines, spaces or both? == | == How can I find and safely handle file names containing newlines, spaces or both? == |
Line 15: | Line 15: |
`xargs` is rarely ever more useful than the above, but if you ''really'' insist, remember to use `-0`: | `xargs` is rarely ever more useful than the above, but if you ''really'' insist, remember to use `-0` (`-0` is not in the POSIX standard, but is implemented by GNU and BSD systems): |
Line 18: | Line 18: |
find ... -print0 | xargs -0 command | find ... -print0 | xargs -r0 command |
Line 25: | Line 25: |
Another way to deal with files with spaces in their names is to use the shell's filename expansion ([[glob|globbing]]). This has the disadvantage of not working recursively (except with zsh's extensions or bash 4's `globstar`), but if you just need to process all the files in a single directory, it works fantastically well. | Another way to deal with files with spaces in their names is to use the shell's filename expansion ([[glob|globbing]]). This has the disadvantage of not working recursively (except with zsh's extensions or bash 4's `globstar`), and it normally does not include hidden files (filenames beginning with "."). But if you just need to process all the files in a single directory, and omitting hidden files is okay, it works fantastically well. Be sure that the glob expansion cannot begin with "-"; the easy way to do this is to begin a glob with "./" or "/". |
Line 27: | Line 27: |
For example, this code renames all the `*.mp3` files in the current directory to use underscores in their names instead of spaces: | For example, this code renames all the `*.mp3` files in the current directory to use underscores in their names instead of spaces (this uses the bash/ksh extension allowing "/" in parameter expansion): |
Line 31: | Line 31: |
for file in ./*.mp3; do mv "$file" "${file// /_}" |
for file in ./*\ *.mp3; do if [ -e "$file" ] ; then # Make sure it isn't an empty match mv "$file" "${file// /_}" fi |
Line 35: | Line 37: |
You can omit the "if..." and "fi" lines if you're certain that at least one path will match the glob. The problem is that if the glob doesn't match, instead of looping 0 times (as you might expect), the loop will execute once with the unexpanded pattern (which is usually not what you want). You can also use the bash extension "shopt -s nullglob" to make empty globs expand to nothing, and then again you can omit the if and fi. |
|
Line 40: | Line 44: |
Another way to handle filenames recursively involves using the `-print0` option of `find` (a GNU/BSD extension), together with bash's `-d` option for read: | Another way to handle filenames recursively involves using the `-print0` option of `find` (a GNU/BSD extension), together with bash's `-d` extended option for read: |
Line 50: | Line 54: |
The preceding example reads all the files under `/tmp` (recursively) into an [[BashGuide/Arrays|array]], even if they have newlines or other whitespace in their names, by forcing `read` to use the NUL byte (\0) as its line delimiter. Since NUL is not a valid byte in Unix filenames, this is the safest approach besides using `find -exec`. [[IFS|IFS=]] is required to avoid trimming leading/trailing whitespace, and `-r` is needed to avoid backslash processing. In fact, `$'\0'` is equivalent to `''` so we could also write it like this: | The preceding example reads all the files under `/tmp` (recursively) into an [[BashGuide/Arrays|array]], even if they have newlines or other whitespace in their names, by forcing `read` to use the NUL byte (\0) as its line delimiter. Since NUL is not a valid byte in Unix filenames, this is the safest approach besides using `find -exec`. [[IFS|IFS=]] is required to avoid trimming leading/trailing whitespace, and `-r` is needed to avoid backslash processing. In fact, `$'\0'` is actually the empty string (`bash` doesn't support passing NUL bytes to commands even built-in ones) so we could also write it like this: |
Line 72: | Line 76: |
You may want use the following near the beginning of your shell script. This way, if you forget to quote a variable expansion, embedded spaces (such as those in filenames) won't be split: {{{ IFS="$(printf '\n\t')" }}} For a longer discussion about handling filenames in shell, see [[http://www.dwheeler.com/essays/filenames-in-shell.html|Filenames and Pathnames in Shell: How to do it Correctly]]. |
How can I find and safely handle file names containing newlines, spaces or both?
First and foremost, to understand why you're having trouble, read Arguments to get a grasp on how the shell understands the statements you give it. It is vital that you grasp this matter well if you're going to be doing anything with the shell.
The preferred method to deal with arbitrary filenames is still to use find(1):
find ... -exec command {} \;
or, if you need to handle filenames en masse:
find ... -exec command {} +
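As a rough illustration of the two forms above, the following sketch builds a scratch directory with awkward file names and counts the files each form hands to its command. The directory, the file names, and the inline sh -c scripts are invented for this example; it also runs the unsafe "for word in $(find ...)" pattern for contrast.

```shell
# Scratch directory with awkward names (all names here are hypothetical).
dir=$(mktemp -d) || exit 1
nl='
'
touch "$dir/plain.txt" "$dir/with space.txt" "$dir/with${nl}newline.txt"

# "+" form: one sh process receives many filenames, each as its own argument.
# Printing one "x" per argument and counting bytes counts the files seen.
safe=$(find "$dir" -type f -exec sh -c 'for f do printf x; done' sh {} + | wc -c)
safe=$((safe))   # normalise any padding in wc's output

# "\;" form: one sh process per file; slower, but equally safe.
each=$(find "$dir" -type f -exec sh -c 'printf x' sh {} \; | wc -c)
each=$((each))

# For contrast, the unsafe pattern: the unquoted substitution is split on
# every space and newline, so the loop sees more words than there are files.
naive=0
for word in $(find "$dir" -type f); do
    naive=$((naive + 1))
done

printf 'safe=%s each=%s naive=%s\n' "$safe" "$each" "$naive"
rm -rf "$dir"
```

Both -exec forms see exactly one argument per file, while the naive loop's count is inflated by the whitespace inside the names.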
xargs is rarely more useful than the above, but if you really insist, remember to use -0 (-0 was long absent from the POSIX standard and was only added in POSIX.1-2024, but it is implemented by GNU and BSD systems):
# Requires GNU/BSD find and xargs
find ... -print0 | xargs -r0 command

# Never use xargs without -0 or similar extensions!
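A small sketch of what the -r0 combination buys you, assuming a find and xargs that support -print0, -0 and -r (GNU documents -r as --no-run-if-empty). The scratch directory, the file names and the "echo ran" command are placeholders for illustration only.

```shell
# Scratch directory; the file names are hypothetical.
dir=$(mktemp -d) || exit 1
touch "$dir/a b.txt" "$dir/c.txt"

# NUL-delimited names reach the command as separate, intact arguments.
seen=$(find "$dir" -type f -print0 |
    xargs -r0 sh -c 'for f do printf x; done' sh | wc -c)
seen=$((seen))   # normalise any padding in wc's output

# No file matches here, so xargs gets empty input; with -r the command
# ("echo ran") is not started at all and the output stays empty.
empty=$(find "$dir" -type f -name '*.nomatch' -print0 | xargs -r0 echo ran)

printf 'seen=%s empty=[%s]\n' "$seen" "$empty"
rm -rf "$dir"
```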
Use one of these unless you really can't.
Another way to deal with files with spaces in their names is to use the shell's filename expansion (globbing). This has the disadvantage of not working recursively (except with zsh's extensions or bash 4's globstar), and it normally does not include hidden files (filenames beginning with "."). But if you just need to process all the files in a single directory, and omitting hidden files is okay, it works fantastically well. Be sure that the glob expansion cannot begin with "-"; the easy way to do this is to begin a glob with "./" or "/".
For example, this code renames all the *.mp3 files in the current directory whose names contain spaces, replacing each space with an underscore (it relies on the ${var//pattern/replacement} expansion, which is a bash/ksh extension):
# Bash/ksh
for file in ./*\ *.mp3; do
  if [ -e "$file" ] ; then  # Make sure it isn't an empty match
    mv "$file" "${file// /_}"
  fi
done
You can omit the "if..." and "fi" lines if you're certain that at least one path will match the glob. The problem is that if the glob doesn't match, instead of looping 0 times (as you might expect), the loop will execute once with the unexpanded pattern (which is usually not what you want). You can also use the bash extension "shopt -s nullglob" to make empty globs expand to nothing, and then again you can omit the if and fi.
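The nullglob variant mentioned above might look roughly like this. It is a sketch that assumes bash and uses a throwaway directory with made-up file names; the counters exist only to show how often each loop body runs.

```shell
#!/usr/bin/env bash
# Scratch directory; the file names are hypothetical.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
touch 'one two.mp3' 'three.mp3'

shopt -s nullglob            # unmatched globs now expand to nothing

renamed=0
for file in ./*\ *.mp3; do   # no [ -e ... ] guard needed under nullglob
    mv "$file" "${file// /_}"
    renamed=$((renamed + 1))
done

# Second pass: no name contains a space any more, so the body never runs.
leftover=0
for file in ./*\ *.mp3; do
    leftover=$((leftover + 1))
done

shopt -u nullglob            # restore the default for later code

outcome=unchanged
[ -e ./one_two.mp3 ] && [ ! -e './one two.mp3' ] && outcome=renamed

cd - >/dev/null || exit 1
rm -rf "$dir"
```

Turning nullglob off again afterwards is a design choice: other code in the same script may rely on the default behaviour of unmatched globs.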
For more examples of renaming files, see FAQ #30.
Remember, you need to quote all your Parameter Expansions using double quotes. If you don't, the expansion will undergo WordSplitting (see also argument splitting and BashPitfalls). Also, always prefix globs with "./"; otherwise, if there's a file with "-" as the first character, the expansions might be misinterpreted as options.
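To make the effect of quoting concrete, here is a minimal sketch. count_args is a hypothetical helper written for this example; it only reports how many arguments it received.

```shell
# Hypothetical helper: prints the number of arguments it was given.
count_args() { echo "$#"; }

file='my song.mp3'

# Unquoted: the expansion undergoes word splitting at the space.
unquoted=$(count_args $file)

# Quoted: the expansion is passed through as a single argument.
quoted=$(count_args "$file")

printf 'unquoted=%s quoted=%s\n' "$unquoted" "$quoted"
```

With the default IFS, the unquoted call receives two arguments and the quoted call receives one, which is why a command like mv would misbehave on the unquoted form.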
Another way to handle filenames recursively involves using the -print0 option of find (a GNU/BSD extension), together with bash's -d extended option for read:
# Bash
unset a i
while IFS= read -r -d $'\0' file; do
  a[i++]="$file"        # or however you want to process each file
done < <(find /tmp -type f -print0)
The preceding example reads all the files under /tmp (recursively) into an array, even if they have newlines or other whitespace in their names, by forcing read to use the NUL byte (\0) as its line delimiter. Since NUL is not a valid byte in Unix filenames, this is the safest approach besides using find -exec. IFS= is required to avoid trimming leading/trailing whitespace, and -r is needed to avoid backslash processing. In fact, $'\0' is actually the empty string (bash doesn't support passing NUL bytes to commands, even built-in ones), so we could also write it like this:
# Bash
unset a i
while IFS= read -r -d '' file; do
  a[i++]="$file"
done < <(find /tmp -type f -print0)
So, why doesn't this work?
# DOES NOT WORK
unset a i
find /tmp -type f -print0 | while IFS= read -r -d '' file; do
  a[i++]="$file"
done
Because of the pipeline, the entire while loop is executed in a SubShell and therefore the array assignments will be lost after the loop terminates. (For more details about this, see FAQ #24.)
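The difference can be observed directly. The sketch below, which assumes bash with its default settings (in particular, lastpipe not enabled), fills one array from a pipeline and another from a process substitution. The scratch directory and the array names piped and kept are invented for the example.

```shell
#!/usr/bin/env bash
# Scratch directory; the file names are hypothetical.
dir=$(mktemp -d) || exit 1
touch "$dir/a b" "$dir/c"

# Pipeline: the while loop runs in a subshell, so its assignments vanish.
piped=() i=0
find "$dir" -type f -print0 | while IFS= read -r -d '' file; do
    piped[i++]=$file
done

# Process substitution: the loop runs in the current shell and the
# array survives after the loop ends.
kept=() i=0
while IFS= read -r -d '' file; do
    kept[i++]=$file
done < <(find "$dir" -type f -print0)

printf 'piped=%s kept=%s\n' "${#piped[@]}" "${#kept[@]}"
rm -rf "$dir"
```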
You may want to use the following near the beginning of your shell script. This way, if you forget to quote a variable expansion, embedded spaces (such as those in filenames) won't be split:
IFS="$(printf '\n\t')"
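A short sketch of what this setting changes, assuming the shell starts with the default IFS. count_args is again a hypothetical helper that reports its argument count.

```shell
# Hypothetical helper: prints the number of arguments it was given.
count_args() { echo "$#"; }

file='my song.mp3'
old_ifs=$IFS

# Default IFS: the unquoted expansion is split at the space.
before=$(count_args $file)

IFS="$(printf '\n\t')"   # newline and tab only; space is no longer a separator

# The same unquoted expansion now stays in one piece.
after=$(count_args $file)

# Newlines still split, and unquoted expansions are still subject to
# globbing, so this is a safety net rather than a substitute for quoting.
multi=$(count_args $(printf 'a b\nc d\n'))

IFS=$old_ifs             # restore the previous value
printf 'before=%s after=%s multi=%s\n' "$before" "$after" "$multi"
```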
For a longer discussion about handling filenames in shell, see Filenames and Pathnames in Shell: How to do it Correctly.