3391
Comment: Add a "for" loop that skips filenames with control characters
|
3473
minor clean-up
|
Deletions are marked like this. | Additions are marked like this. |
Line 3: | Line 3: |
The preferred method is still to use [[UsingFind|find(1)]]: | First and foremost, to understand why you're having trouble, read [[Arguments]] to get a grasp on how the shell understands the statements you give it. It is vital that you grasp this matter well if you're going to be doing anything with the shell. |
Line 5: | Line 5: |
The preferred method to deal with arbitrary filenames is still to use [[UsingFind|find(1)]]: | |
Line 9: | Line 10: |
or, if you need to handle filenames ''en masse'', with GNU and recent BSD tools: | or, if you need to handle filenames ''en masse'': {{{ find ... -exec command {} + }}} |
Line 11: | Line 15: |
`xargs` is rarely ever more useful than the above, but if you ''really'' insist, remember to use `-0`: | |
Line 15: | Line 20: |
or with POSIX {{{find}}}: | Use one of these unless you ''really'' can't. |
Line 17: | Line 22: |
{{{ find ... -exec command {} + }}} |
Another way to deal with files with spaces in their names is to use the shell's filename expansion ([[globbing]]). This has the disadvantage of not working recursively (except with zsh's extensions or bash 4's globstar), but if you just need to process all the files in a single directory, it works fantastically well. |
Line 21: | Line 24: |
Use that unless you really can't. Another way to deal with files with spaces in their names is to use the shell's filename expansion ([[globbing]]). This has the disadvantage of not working recursively (except with zsh's extensions), but if you just need to process all the files in a single directory, it works fantastically well. This example changes all the *.mp3 files in the current directory to use underscores in their names instead of spaces. It uses [[BashFAQ/073|Parameter Expansions]] that will not work in the original BourneShell or POSIX shell, but should be good in KornShell and [[BASH]]. |
This example [[BashFAQ/030|renames]] all the *.mp3 files in the current directory to use underscores in their names instead of spaces. It uses [[BashFAQ/073|Parameter Expansions]] that will not work in the original BourneShell or POSIX shell, but should be good in KornShell and [[BASH]]. |
Line 32: | Line 31: |
Remember, you need to '''quote all your [[BashFAQ/073|Parameter Expansions]] using double quotes'''. If you don't, the expansion will undergo WordSplitting (see also BashGuide/TheBasics/ArgumentSplitting and BashPitfalls). Also, always prefix globs with "./"; otherwise, if there's a file with "-" as the first character, the expansions might be misinterpreted as options. |
Remember, you need to '''quote all your [[BashFAQ/073|Parameter Expansions]] using double quotes'''. If you don't, the expansion will undergo WordSplitting (see also [[BashGuide/CommandsAndArguments#Argument_Splitting|argument splitting]] and BashPitfalls). Also, always prefix globs with "./"; otherwise, if there's a file with "-" as the first character, the expansions might be misinterpreted as options. |
Line 38: | Line 36: |
for file in ./*\ *; do | for file in ./*' '*; do |
Line 40: | Line 38: |
instead of *.mp3. |
instead of `*.mp3`. |
Line 52: | Line 49: |
The preceding example reads all the files under `/tmp` (recursively) into an array, even if they have newlines or other whitespace in their names, by forcing {{{read}}} to use the NUL byte (\0) as its line delimiter. Since NUL is not a valid byte in Unix filenames, this is the safest approach besides using {{{find -exec}}}. `IFS=` is required to avoid trimming leading/trailing whitespace, and `-r` is needed to avoid backslash processing. In fact, `$'\0'` is equivalent to `''` so we could also write it like this: |
The preceding example reads all the files under `/tmp` (recursively) into an [[BashGuide/Arrays|array]], even if they have newlines or other whitespace in their names, by forcing {{{read}}} to use the NUL byte (\0) as its line delimiter. Since NUL is not a valid byte in Unix filenames, this is the safest approach besides using {{{find -exec}}}. [[IFS|IFS=]] is required to avoid trimming leading/trailing whitespace, and `-r` is needed to avoid backslash processing. In fact, `$'\0'` is equivalent to `''` so we could also write it like this: |
Line 63: | Line 59: |
Filenames with control characters (including newline, tab, and escape) are a pain to deal with, and can also be somewhat dangerous to display (since the control characters can end up controlling terminal emulators). To skip filenames with control characters, but process correctly other filenames (such as those with embedded spaces), you can use this portable approach (from [[http://www.dwheeler.com/essays/fixing-unix-linux-filenames.html|Fixing Unix/Linux/POSIX Filenames]]): | So, why doesn't this work? |
Line 65: | Line 62: |
IFS=`printf '\n\t'` # Must remove space so spaces-in-filenames still work controlchars=`printf '*[\001-\037\177]*'` for file in `find . ! -name "$controlchars"'` ; do echo "$file" # etc, be sure to quote "$file" or it'll be globbed done |
# DOES NOT WORK unset a i find /tmp -type f -print0 | while IFS= read -r -d '' file; do a[i++]="$file" done |
Line 71: | Line 68: |
Because of the pipeline, the entire `while` loop is executed in a SubShell and therefore the array assignments will be lost after the loop terminates. (For more details about this, see [[BashFAQ/024|FAQ #24]].) ---- CategoryShell |
How can I find and deal with file names containing newlines, spaces or both?
First and foremost, to understand why you're having trouble, read Arguments to get a grasp on how the shell understands the statements you give it. It is vital that you grasp this matter well if you're going to be doing anything with the shell.
The preferred method to deal with arbitrary filenames is still to use find(1):
find ... -exec command {} \;
or, if you need to handle filenames en masse:
find ... -exec command {} +
xargs is rarely more useful than the above, but if you really insist, remember to use -0:
find ... -print0 | xargs -0 command
Use one of these unless you really can't.
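The difference between the two -exec forms can be seen with a small experiment. This is only a sketch: the scratch directory and its three file names are made up for illustration, and `sh -c` stands in for whatever `command` you actually want to run.

```shell
# Scratch directory holding three awkward names (hypothetical test data).
dir=$(mktemp -d)
touch "$dir/plain" "$dir/with space" "$dir/with
newline"

# {} \; runs the command once per file, so "run" is printed three times.
per_file=$(find "$dir" -type f -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')

# {} + hands as many filenames as possible to each invocation.
batched=$(find "$dir" -type f -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

# Each filename still arrives as exactly one argument, newline included.
argc=$(find "$dir" -type f -exec sh -c 'echo "$#"' sh {} +)

echo "per-file runs: $per_file, batched runs: $batched, arguments seen: $argc"
rm -rf "$dir"
```

With only three files the batched form needs a single invocation; with very large file sets find may split the list across several.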
Another way to deal with files with spaces in their names is to use the shell's filename expansion (globbing). This has the disadvantage of not working recursively (except with zsh's extensions or bash 4's globstar), but if you just need to process all the files in a single directory, it works fantastically well.
This example renames all the *.mp3 files in the current directory to use underscores in their names instead of spaces. It uses Parameter Expansions that will not work in the original BourneShell or POSIX shell, but should be good in KornShell and BASH.
for file in ./*.mp3; do
    mv "$file" "${file// /_}"
done
Remember, you need to quote all your Parameter Expansions using double quotes. If you don't, the expansion will undergo WordSplitting (see also argument splitting and BashPitfalls). Also, always prefix globs with "./"; otherwise, if there's a file with "-" as the first character, the expansions might be misinterpreted as options.
You could do the same thing for all files with spaces in their names (regardless of extension) by using
for file in ./*' '*; do
instead of *.mp3.
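Putting the loop together with the quoting and "./" advice, a complete version might look like the sketch below. The function name `rename_spaces` and the demo file names are invented for this example, and the existence check is an addition that covers the case where the glob matches nothing.

```shell
# Replace spaces with underscores in every name in the given directory.
# The body runs in a subshell, so the cd does not affect the caller.
rename_spaces() (
    cd "$1" || exit 1
    for file in ./*' '*; do
        [ -e "$file" ] || continue      # the glob matched nothing
        mv "$file" "${file// /_}"       # bash/ksh expansion, not POSIX sh
    done
)

# Demo on a scratch directory (hypothetical file names).
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
touch "$demo/my favourite song.mp3" "$demo/-b side.mp3" "$demo/untouched.mp3"
rename_spaces "$demo"
ls "$demo"
```

Because every expansion starts with "./", the file whose name begins with "-" is passed to mv as `./-b side.mp3` and cannot be mistaken for an option.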
Another way to handle filenames recursively involves using the -print0 option of find (a GNU/BSD extension), together with bash's -d option for read:
# Bash
unset a i
while IFS= read -r -d $'\0' file; do
    a[i++]="$file"    # or however you want to process each file
done < <(find /tmp -type f -print0)
The preceding example reads all the files under /tmp (recursively) into an array, even if they have newlines or other whitespace in their names, by forcing read to use the NUL byte (\0) as its line delimiter. Since NUL is not a valid byte in Unix filenames, this is the safest approach besides using find -exec. IFS= is required to avoid trimming leading/trailing whitespace, and -r is needed to avoid backslash processing. In fact, $'\0' is equivalent to '' (bash cannot store a NUL byte in a string, so both pass an empty argument, which read -d takes to mean NUL), so we could also write it like this:
# Bash
unset a i
while IFS= read -r -d '' file; do
    a[i++]="$file"
done < <(find /tmp -type f -print0)
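To see what IFS= and -r protect against, compare a careful loop with a sloppy one. This sketch assumes bash with the default IFS; the helper `sample` and its two names are invented, and it uses printf to emit NUL-terminated strings the way find -print0 would.

```shell
# Two sample "filenames", NUL-terminated like find -print0 output.
sample() { printf '%s\0' '  padded  ' 'back\slash'; }

safe=()
while IFS= read -r -d '' name; do
    safe+=("$name")        # names arrive byte for byte
done < <(sample)

sloppy=()
while read -d '' name; do
    sloppy+=("$name")      # whitespace is trimmed, the backslash is eaten
done < <(sample)

printf 'safe:   <%s>\n' "${safe[@]}"
printf 'sloppy: <%s>\n' "${sloppy[@]}"
```

The sloppy loop yields `padded` and `backslash`, neither of which names a file that actually exists.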
So, why doesn't this work?
# DOES NOT WORK
unset a i
find /tmp -type f -print0 | while IFS= read -r -d '' file; do
    a[i++]="$file"
done
Because of the pipeline, the entire while loop is executed in a SubShell and therefore the array assignments will be lost after the loop terminates. (For more details about this, see FAQ #24.)
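The effect is easy to confirm by counting array elements after each kind of loop. This sketch assumes bash with its default settings (in particular, the lastpipe option is off); the scratch directory and file names are invented.

```shell
# Scratch directory with two files (hypothetical test data).
dir=$(mktemp -d)
touch "$dir/one" "$dir/two words"

# Pipeline: the loop runs in a subshell, so the parent's array stays empty.
piped=()
find "$dir" -type f -print0 | while IFS= read -r -d '' f; do
    piped+=("$f")
done

# Process substitution: the loop runs in the current shell.
redirected=()
while IFS= read -r -d '' f; do
    redirected+=("$f")
done < <(find "$dir" -type f -print0)

echo "pipeline: ${#piped[@]} entries, process substitution: ${#redirected[@]} entries"
rm -rf "$dir"
```

Under those assumptions the pipeline version reports 0 entries and the process substitution version reports 2.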