Revision 37 as of 2023-01-26 22:54:33
How can I grep for lines containing foo AND bar, foo OR bar? Or for files containing foo AND bar, possibly on separate lines? Or files containing foo but NOT bar?
This is really four different questions, so we'll break this answer into parts.
foo AND bar on the same line
The easiest way to match lines that contain both foo AND bar is to use two grep commands:
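A minimal sketch of that pipeline. The temporary file and its contents are made up for illustration; `myfile` stands in for your own file name.

```shell
# Sample input in a temporary file; myfile stands in for your own file name.
myfile=$(mktemp)
printf '%s\n' 'foo only' 'bar then foo' 'neither' 'foo and bar' > "$myfile"

# The first grep keeps lines containing foo; the second keeps those that
# also contain bar. The -- guards against file names that begin with a dash.
both=$(grep foo -- "$myfile" | grep bar)
printf '%s\n' "$both"

rm -f -- "$myfile"
```

Only the lines `bar then foo` and `foo and bar` survive both filters, regardless of the order in which the two words appear.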
It can also be done with one grep, although (as you can probably guess) this doesn't really scale well to more than two patterns:
grep -E 'foo.*bar|bar.*foo'
If you prefer, you can achieve this in one sed or awk statement:
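One possible way to write those single statements, shown as a sketch rather than the only spelling. The variable names and sample lines are hypothetical.

```shell
input=$(printf '%s\n' 'foo only' 'bar then foo' 'neither' 'foo and bar')

# sed: delete any line lacking foo, then any line lacking bar; the rest print.
sed_both=$(printf '%s\n' "$input" | sed -e '/foo/!d' -e '/bar/!d')

# awk: a pattern with no action prints every line for which it is true.
awk_both=$(printf '%s\n' "$input" | awk '/foo/ && /bar/')

printf '%s\n' "$sed_both" "$awk_both"
```

Both commands select the same two lines, `bar then foo` and `foo and bar`.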
If you need to scale the awk solution to an arbitrary number of patterns, you can write a function like this:
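A sketch of such a function, assuming it reads standard input and treats each argument as an extended regular expression. The body is one plausible implementation, not necessarily the page's original.

```shell
multimatch() {  # usage: multimatch pattern...   (reads standard input)
  awk '
    BEGIN {
      # Copy the patterns out of ARGV, then set ARGC to 1 so that awk
      # reads standard input rather than opening them as files.
      for (i = 1; i < ARGC; i++) pat[i] = ARGV[i]
      n = ARGC - 1
      ARGC = 1
    }
    {
      # Skip the line as soon as one pattern fails to match.
      for (i = 1; i <= n; i++)
        if ($0 !~ pat[i]) next
      print
    }
  ' "$@"
}

# Example: only lines containing all three words are printed.
printf '%s\n' 'foo bar baz' 'foo baz' 'baz bar foo' | multimatch foo bar baz
```

The example prints `foo bar baz` and `baz bar foo`; the middle line is dropped because it lacks `bar`.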
foo OR bar on the same line
There are lots of ways to match lines containing foo OR bar. grep can be given multiple patterns with -e:
grep -e 'foo' -e 'bar'
Or you can separate the patterns with newlines:
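A short sketch of the newline form: the quoted pattern argument spans two lines, and grep treats each line as a separate pattern. The sample input is hypothetical.

```shell
# The single-quoted argument contains a literal newline between foo and bar.
either=$(printf '%s\n' 'a foo' 'a bar' 'a baz' | grep 'foo
bar')
printf '%s\n' "$either"
```

This prints `a foo` and `a bar`.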
Or you can construct one pattern with grep -E:
grep -E 'foo|bar'
(You can't use the | union operator with plain grep. | is only available in Extended Regular Expressions.)
It can also be done with sed, awk, etc.
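For illustration, one way to spell the sed and awk equivalents. The sample input and variable names are hypothetical.

```shell
input=$(printf '%s\n' 'a foo' 'a bar' 'a baz' 'foo bar')

# sed -n suppresses automatic printing; each expression prints a matching
# line and then starts the next cycle, so a line is never printed twice.
sed_either=$(printf '%s\n' "$input" |
  sed -n -e '/foo/{ p; d; }' -e '/bar/{ p; d; }')

# awk regular expressions are extended, so | works as alternation.
awk_either=$(printf '%s\n' "$input" | awk '/foo|bar/')

printf '%s\n' "$sed_either" "$awk_either"
```

Each command selects `a foo`, `a bar` and `foo bar`, with the last line appearing once even though it matches both patterns.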
The awk approach has the advantage of letting you use awk's other features on the matched lines, such as extracting only certain fields.
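As a sketch of that advantage, the following prints only the first whitespace-separated field of each matching line. The three-column sample data is invented for the example.

```shell
# Match lines containing foo or bar, but print just the first field.
first_fields=$(printf '%s\n' 'alice foo 10' 'bob baz 20' 'carol bar 30' |
  awk '/foo|bar/ { print $1 }')
printf '%s\n' "$first_fields"
```

This prints `alice` and `carol`.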
To match lines that do not contain "foo" AND do not contain "bar":
grep -E -v 'foo|bar'
Or using sed, or awk:
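A sketch of both variants, using hypothetical sample input:

```shell
input=$(printf '%s\n' 'a foo' 'a bar' 'a baz' 'plain')

# sed: delete lines matching either pattern; everything else is printed.
sed_neither=$(printf '%s\n' "$input" | sed -e '/foo/d' -e '/bar/d')

# awk: negate the match, so only lines matching neither pattern print.
awk_neither=$(printf '%s\n' "$input" | awk '!/foo|bar/')

printf '%s\n' "$sed_neither" "$awk_neither"
```

Both commands keep `a baz` and `plain`.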
foo AND bar in the same file, not necessarily on the same line
If you want to match files (rather than lines) that contain both "foo" and "bar", there are several possible approaches. The simplest (although not necessarily the most efficient) is to read the file twice:
if grep -q foo "$myfile" && grep -q bar "$myfile"; then
  printf 'Found both\n'
fi
The double grep -q solution has the advantage of stopping each read whenever it finds a match; so if you have a huge file, but the matched words are both near the top, it will only read the first part of the file. Unfortunately, if the matches are near the bottom (worst case: very last line of the file), you may read the whole file two times.
Another approach is to read the file once, keeping track of what you've seen as you go along. In awk:
if awk '/foo/{a=1} /bar/{b=1} a&&b{exit} END{if(a&&b){exit 0};exit 1}' "$myfile"; then
  printf 'Found both\n'
fi
It reads the file one time, stopping when both patterns have been matched. No matter what happens, the END block is then executed, and the exit status is set accordingly.
If you want to do additional checking of the file's contents, this awk solution can be adapted quite easily.
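As one example of such an adaptation, this sketch records the line number of the first match for each pattern and reports them. The report format, variable names and sample file are assumptions made for illustration.

```shell
myfile=$(mktemp)
printf '%s\n' 'x' 'bar here' 'y' 'foo here' 'z' > "$myfile"

report=$(awk '
  /foo/ && !a { a = NR }    # remember the first line matching foo
  /bar/ && !b { b = NR }    # remember the first line matching bar
  a && b { exit }           # both seen: stop reading, jump to END
  END {
    if (a && b) { printf "foo: line %d, bar: line %d\n", a, b; exit 0 }
    exit 1
  }
' "$myfile")
status=$?

printf '%s (exit status %s)\n' "$report" "$status"
rm -f -- "$myfile"
```

The exit status keeps the same meaning as before (0 when both patterns were found), so the command can still be used as the condition of an `if`.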
A perl one-liner that scales to any number of patterns, while also reading each input file only once:
perl -e '@pat=("foo","bar"); local $/; L: for $f (@ARGV){open(FH,"<",$f); $a=<FH>; for(@pat){next L unless $a =~ $_} print "$f\n"}'
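The same idea can be sketched with a shell function around awk. `has_all` is a hypothetical helper name; the sketch assumes the first argument is the file, the remaining arguments are extended regular expressions, and the file name does not contain an `=` sign (awk would treat such an argument as a variable assignment).

```shell
has_all() {  # usage: has_all file pattern...   (hypothetical helper)
  awk '
    BEGIN {
      # ARGV[1] is the file; everything after it is a pattern.
      for (i = 2; i < ARGC; i++) pat[i - 1] = ARGV[i]
      n = ARGC - 2
      ARGC = 2              # leave only the file for awk to read
    }
    {
      # Count each pattern the first time it matches.
      for (i = 1; i <= n; i++)
        if (!seen[i] && $0 ~ pat[i]) { seen[i] = 1; found++ }
      if (found == n) exit  # all patterns seen: stop reading early
    }
    END { exit !(found == n) }
  ' "$@"
}

# Example: print the names of files that contain both foo and bar.
dir=$(mktemp -d)
printf '%s\n' 'alpha foo' 'beta' 'gamma bar' > "$dir/both.txt"
printf '%s\n' 'alpha foo' 'beta' > "$dir/one.txt"
for f in "$dir"/*.txt; do
  if has_all "$f" foo bar; then printf '%s\n' "${f##*/}"; fi
done
rm -rf -- "$dir"
```

Each file is read at most once, and reading stops as soon as every pattern has been seen. The example prints `both.txt` only.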
foo but NOT bar in the same file, possibly on different lines
This is a variant of the previous case. The advantage here is that if we find "bar", we can stop reading. Here's an awk solution:
awk '/foo/{good=1} /bar/{good=0;exit} END{exit !good}'
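A sketch of how that command might be wrapped for use in a condition. `foo_not_bar` is a hypothetical function name, and the sample files are invented for the example.

```shell
foo_not_bar() {  # usage: foo_not_bar file   (hypothetical helper)
  # Exit status 0 only if foo was seen and bar never was.
  awk '/foo/{good=1} /bar/{good=0;exit} END{exit !good}' "$1"
}

dir=$(mktemp -d)
printf '%s\n' 'has foo' 'nothing else' > "$dir/wanted.txt"
printf '%s\n' 'has foo' 'has bar' > "$dir/unwanted.txt"

for f in "$dir"/*.txt; do
  if foo_not_bar "$f"; then printf '%s\n' "${f##*/}"; fi
done
rm -rf -- "$dir"
```

A file containing neither word also fails the test, because `good` is never set. The example prints `wanted.txt` only.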