How can I replace a string with another string in a variable, a stream, a file, or in all the files in a directory?
There are a number of techniques for this. Which one to use depends on many factors, the biggest of which is what we're editing. This page also contains contradictory advice from multiple authors. This is a deeply ugly topic, and there are no universally right answers (but plenty of universally wrong ones).
Files
Before you start, be warned that editing files in place is a really bad idea. The preferred way to modify a file is to create a new file within the same file system, write the modified content into it, and then mv it to the original name. This is the only way to prevent data loss in the event of a crash while writing. However, using a temp file and mv means that you break hardlinks to the file (unavoidably), that you would replace a symlink with a regular file, and that you may need to take extra steps to transfer the ownership and permissions (and possibly other metadata) of the original file to the new file. Some people prefer to roll the dice and accept the tiny possibility of data loss versus the greater possibility of hardlink loss and the inconvenience of chown/chmod (and potentially setfattr, setfacl, chattr...).
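The hardlink trade-off is easy to see for yourself. The following is only a sketch in a scratch directory (the names file, link and file.tmp are made up for the demonstration): after the temp-file-and-mv dance, the second name still points at the old inode and keeps the old content.

```shell
# Scratch directory so nothing outside it is touched.
dir=$(mktemp -d) || exit

printf 'old\n' > "$dir/file"
ln "$dir/file" "$dir/link"          # a second name for the same inode

# Write the modified content to a new file on the same file system, then rename it.
sed 's/old/new/' "$dir/file" > "$dir/file.tmp" &&
  mv -- "$dir/file.tmp" "$dir/file"

file_content=$(cat "$dir/file")     # the renamed temp file: new content
link_content=$(cat "$dir/link")     # the other hardlink: still the original content

printf 'file: %s\nlink: %s\n' "$file_content" "$link_content"
rm -rf -- "$dir"
```

This prints "file: new" and "link: old". Whether that is a bug or a feature depends on why the hardlink existed in the first place.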
The other major problem you're going to face is that all of the standard Unix tools for editing files expect some kind of regular expression as the search pattern. If you're passing input you did not create as the search pattern, it may contain syntax that breaks the program's parser, which can lead to failures, or CodeInjection exploits.
Just Tell Me What To Do
If your search string or your replacement string comes from an external source (environment variable, argument, file, user input) and is therefore not under your control, then this is your best choice:
in="$search" out="$replace" perl -pi -e 's/\Q$ENV{"in"}/$ENV{"out"}/g' ./*
That will operate on all of the files in the current directory. If you want to operate on a full hierarchy (recursively), then:
in="$search" out="$replace" find . -type f -exec \
  perl -pi -e 's/\Q$ENV{"in"}/$ENV{"out"}/g' {} +
You may of course supply additional options to find to restrict which files are replaced; see UsingFind for more information.
The critical reader may note that these commands use perl which is not a standard tool. That's because none of the standard tools can do this task safely.
If you're stuck using standard tools due to a restricted execution environment, then you'll have to weigh the options below and choose the one that will do the least amount of damage to your files.
Using a file editor
The only standard tools that actually edit a file are ed and ex (vi is the visual mode for ex).
ed is the standard UNIX command-based editor. ex is another standard command-line editor. Here are some commonly-used syntaxes for replacing the string olddomain.com by the string newdomain.com in a file named file. All of these commands do the same thing, with varying degrees of portability and efficiency:
## Ex
ex -sc '%s/olddomain\.com/newdomain.com/g|x' file

## Ed
# Bash
ed -s file <<< $'g/olddomain\\.com/s//newdomain.com/g\nw\nq'

# Bourne (with printf)
printf '%s\n' 'g/olddomain\.com/s//newdomain.com/g' w q | ed -s file

printf 'g/olddomain\\.com/s//newdomain.com/g\nw\nq' | ed -s file

# Bourne (without printf)
ed -s file <<!
g/olddomain\\.com/s//newdomain.com/g
w
q
!
To replace a string in all files of the current directory, just wrap one of the above in a loop:
for file in ./*; do
  [[ -f $file ]] && ed -s "$file" <<< $'g/old/s//new/g\nw\nq'
done
To do this recursively, the easy way would be to enable globstar in bash 4 (shopt -s globstar, a good idea to put this in your ~/.bashrc) and use:
# Bash 4+ (shopt -s globstar)
for file in ./**; do
  [[ -f $file ]] && ed -s "$file" <<< $'g/old/s//new/g\nw\nq'
done
If you don't have bash 4, you can use find. Unfortunately, it's a bit tedious to feed ed stdin for each file hit:
find . -type f -exec sh -c 'for f do
  ed -s "$f" <<!
g/old/s//new/g
w
q
!
done' sh {} +
Since ex takes its commands from the command-line, it's less painful to invoke from find:
find . -type f -exec ex -sc '%s/old/new/g|x' {} \;
Beware though: if your ex is provided by vim, it may get stuck on files that don't contain old. In that case, add the e flag to the s command so that a failed match is not treated as an error. When vim is your ex, you can also use argdo and find's {} + to minimize the number of ex processes that are run:
# Bash 4+ (shopt -s globstar)
ex -sc 'argdo %s/old/new/ge|x' ./**

# Bourne
find . -type f -exec ex -sc 'argdo %s/old/new/ge|x' {} +
You can also ask for confirmation for every replacement from A to B. You will need to type y or n every time. Please note that the A is used twice in the command. This approach is good when wrong replacements may happen (working with a natural language, for example) and the data set is small enough.
find . -type f -name '*.txt' -exec grep -q 'A' {} \; -exec vim -c '%s/A/B/gc' -c 'wq' {} \;
Using a temporary file
If shell variables are used as the search and/or replace strings, ed is not suitable. Nor is sed, or any tool that uses regular expressions. Consider using the awk code at the bottom of this FAQ with redirections, and mv.
gsub_literal "$search" "$rep" < "$file" > tmp && mv -- tmp "$file"
# Using GNU tools to preserve ownership/group/permissions
gsub_literal "$search" "$rep" < "$file" > tmp &&
  chown --reference="$file" tmp &&
  chmod --reference="$file" tmp &&
  mv -- tmp "$file"
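The fixed name tmp in the examples above lives in the current directory, which may already contain a file of that name and may not be on the same file system as "$file". One possible refinement, sketched here under the assumption that mktemp is available (it is not specified by POSIX), is to create the temp file beside the target. The helper name rewrite_file is made up for this sketch, and tr merely stands in for a real filter such as gsub_literal.

```shell
# Bash (uses local).
# usage: rewrite_file FILE COMMAND [ARG ...]
# Runs COMMAND as a filter (stdin to stdout) over FILE and renames the result
# over FILE only if COMMAND succeeds.
rewrite_file() {
  local file=$1 tmp
  shift
  # Create the temp file next to the target so that mv is a same-file-system rename.
  tmp=$(mktemp -- "$(dirname -- "$file")/.rewrite.XXXXXX") || return
  if "$@" < "$file" > "$tmp"; then
    mv -- "$tmp" "$file"
  else
    rm -f -- "$tmp"
    return 1
  fi
}

# Demonstration in a scratch directory.
dir=$(mktemp -d) || exit
printf 'hello\n' > "$dir/f"
rewrite_file "$dir/f" tr a-z A-Z
result=$(cat "$dir/f")
printf '%s\n' "$result"
rm -rf -- "$dir"
```

Note that this does nothing about ownership or permissions: mktemp creates the new file with restrictive permissions, so the chown/chmod step shown above is still needed if those matter.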
Using nonstandard tools
sed is a Stream EDitor, not a file editor. Nevertheless, people everywhere tend to abuse it for trying to edit files. It doesn't edit files. GNU sed (and some BSD seds) have a -i option that makes a copy and replaces the original file with the copy. An expensive operation, but if you enjoy unportable code, I/O overhead and bad side effects (such as destroying symlinks), and CodeInjection exploits, this would be an option:
sed -i 's/old/new/g' ./*     # GNU, OpenBSD
sed -i '' 's/old/new/g' ./*  # FreeBSD
Those of you who have perl 5 can accomplish the same thing using this code:
perl -pi -e 's/old/new/g' ./*
Recursively using find:
find . -type f -exec perl -pi -e 's/old/new/g' {} \;   # if your find doesn't have + yet
find . -type f -exec perl -pi -e 's/old/new/g' {} +    # if it does
If you want to delete lines instead of making substitutions:
# Deletes any line containing the perl regex foo
perl -ni -e 'print unless /foo/' ./*
To replace, for example, every "unsigned" with "unsigned long", unless it is already followed by "int", "short", "long", or "char":
find . -type f -exec perl -i.bak -pne \
  's/\bunsigned\b(?!\s+(int|short|long|char))/unsigned long/g' {} \;
All of the examples above use regular expressions, which means they have the same issue as the sed code earlier; trying to embed shell variables in them is a terrible idea, and treating an arbitrary value as a literal string is painful at best.
If the inputs are not under your direct control, you can pass them as variables into both search and replace strings with no unquoting or potential for conflict with sigil characters:
in="$search" out="$replace" perl -pi -e 's/\Q$ENV{"in"}/$ENV{"out"}/g' ./*
Or, wrapped in a useful shell function:
# Bash
# usage: replace FROM TO [file ...]
replace() {
  local in="$1" out="$2"; shift 2
  in="$in" out="$out" perl -p ${1+-i} -e 's/\Q$ENV{"in"}/$ENV{"out"}/g' "$@"
}
This wrapper passes perl's -i option if there are any filenames, so that they are "edited in-place" (or at least as far as perl does such a thing -- see the perl documentation for details).
Variables
If you want to replace content within a variable, this can (and should) be done very simply with Bash's parameter expansion:
var='some string'; search=some; rep=another

# Bash
var=${var//"$search"/$rep}
It's a lot harder in POSIX:
# POSIX function

# usage: string_rep SEARCH REPL STRING
# replaces all instances of SEARCH with REPL in STRING
string_rep() {
  # initialize vars
  in=$3
  unset -v out

  # SEARCH must not be empty
  case $1 in '') return; esac

  while
    # break loop if SEARCH is no longer in "$in"
    case "$in" in
      *"$1"*) ;;
      *) break;;
    esac
  do
    # append everything in "$in", up to the first instance of SEARCH, and REP, to "$out"
    out=$out${in%%"$1"*}$2
    # remove everything up to and including the first instance of SEARCH from "$in"
    in=${in#*"$1"}
  done

  # append whatever is left in "$in" after the last instance of SEARCH to out, and print
  printf '%s%s\n' "$out" "$in"
}

var=$(string_rep "$search" "$rep" "$var")

# Note: POSIX does not have a way to localize variables. Most shells (even dash and
# busybox), however, do. Feel free to localize the variables if your shell supports
# it. Even if it does not, if you call the function with var=$(string_rep ...), the
# function will be run in a subshell and any assignments it makes will not persist.
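In the bash example, the quotes around "$search" prevent the contents of the variable from being treated as a shell pattern (also called a glob). Of course, if pattern matching is intended, do not include the quotes. If "$rep" were quoted, however, the quotes would be treated as literal in some bash versions. One caveat: bash 5.2 and later enable the patsub_replacement shell option by default, which makes an unquoted & in the replacement stand for the matched text, so if $rep may contain &, test the expansion on the bash version you are targeting.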
Parameter expansions like this are discussed in more detail in Faq #100.
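The difference the quotes around "$search" make is easiest to see with a search string that happens to be a glob. This is just an illustration with made-up values:

```shell
# Bash
var='2*3=6'
search='*'
rep='x'

literal=${var//"$search"/$rep}   # quoted: "*" is matched as a literal asterisk
pattern=${var//$search/$rep}     # unquoted: "*" is a glob and matches the whole string

printf '%s\n' "$literal" "$pattern"
```

The first expansion gives 2x3=6; the second gives just x, because the unquoted * swallowed the entire value.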
Streams
If you wish to modify a stream, and if your search and replace strings are known in advance, then use the stream editor:
some_command | sed 's/foo/bar/g'
sed uses regular expressions. In our example, foo and bar are literal strings. If they were variables (e.g. user input), they would have to be rigorously escaped in order to prevent errors. This is very impractical, and attempting to do so will make your code extremely prone to bugs. Embedding shell variables in sed commands is never a good idea -- it is a prime source of CodeInjection bugs.
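Here is a small sketch of what goes wrong when variables are embedded in a sed command anyway. The values are made up, but any string containing a dot, an ampersand or a slash will do:

```shell
search='a.b'
rep='x&y'
input='a.b axb'

# Naive embedding: "." matches any character, and "&" in the replacement
# expands to whatever was matched.
naive=$(printf '%s\n' "$input" | sed "s/$search/$rep/g")
printf 'wanted: %s\ngot:    %s\n' 'x&y axb' "$naive"

# A search string containing the delimiter breaks the sed script outright.
search='a/b'
printf '%s\n' "$input" | sed "s/$search/$rep/g" 2>/dev/null
status=$?
printf 'sed exit status with search=%s: %s\n' "$search" "$status"
```

The first command silently produces xa.by xaxby instead of x&y axb, and the second makes sed reject the script with a non-zero exit status. With hostile input rather than accidental input, the same mechanism lets the data supply its own sed commands.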
You could also do it in Bash itself, by combining a parameter expansion with Faq #1:
search=foo
rep=bar

while IFS= read -r line; do
  printf '%s\n' "${line//"$search"/$rep}"
done < <(some_command)

# or

some_command | while IFS= read -r line; do
  printf '%s\n' "${line//"$search"/$rep}"
done
If you want to do more processing than just a simple search/replace, this may be the best option. Note that the last example runs the loop in a SubShell. See Faq #24 for more information on that.
You may notice, however, that the bash loop above is very slow for large data sets. So how do we find something faster, that can replace literal strings? Well, you could use awk. The following function replaces all instances of STR with REP, reading from stdin and writing to stdout.
# usage: gsub_literal STR REP
# replaces all instances of STR with REP. reads from stdin and writes to stdout.
gsub_literal() {
  # STR cannot be empty
  [[ $1 ]] || return

  str=$1 rep=$2 awk '
    # get the length of the search string
    BEGIN {
      str = ENVIRON["str"]
      rep = ENVIRON["rep"]
      len = length(str);
    }

    {
      # empty the output string
      out = "";

      # continue looping while the search string is in the line
      while (i = index($0, str)) {
        # append everything up to the search string, and the replacement string
        out = out substr($0, 1, i-1) rep;

        # remove everything up to and including the first instance of the
        # search string from the line
        $0 = substr($0, i + len);
      }

      # append whatever is left
      out = out $0;

      print out;
    }
  '
}

some_command | gsub_literal "$search" "$rep"

# condensed as a one-liner:
some_command | s=$search r=$rep awk 'BEGIN {s=ENVIRON["s"]; r=ENVIRON["r"]; l=length(s)} {o=""; while (i=index($0, s)) {o=o substr($0,1,i-1) r; $0=substr($0,i+l)} print o $0}'
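As a quick sanity check that the awk approach really is literal, the sketch below wraps the condensed one-liner in a function and feeds it strings full of characters that would upset a regex-based tool. The input line and the search and replace values are invented for the test.

```shell
# Condensed form of gsub_literal, wrapped in a function for this demonstration.
gsub_literal() {
  # STR cannot be empty
  [ -n "$1" ] || return
  s=$1 r=$2 awk 'BEGIN {s=ENVIRON["s"]; r=ENVIRON["r"]; l=length(s)}
    {o=""; while (i=index($0, s)) {o=o substr($0,1,i-1) r; $0=substr($0,i+l)} print o $0}'
}

# The search string is a backslash followed by "n" (two characters, not a newline);
# the replacement contains a backslash and an ampersand.
result=$(printf '%s\n' 'C:\new\table' | gsub_literal '\n' '\t&')
printf '%s\n' "$result"
```

This prints C:\t&ew\table: values read through ENVIRON are not subject to awk's escape processing, and index/substr never interpret anything as a pattern.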