Differences between revisions 6 and 7
Revision 6 as of 2012-05-19 01:19:01
Size: 2150
Editor: EricPruitt
Comment: timestamps with date and xargs
Revision 7 as of 2012-05-19 12:23:00
Size: 2093
Editor: GreyCat
Comment:

How do I add a timestamp to every line of a stream?

There are numerous ways to do this, but all of them are either limited by the available tools, or slow. We'll show a few examples.

Let's start with the slow, portable way first and get it over with:

# POSIX
while IFS= read -r line; do
  printf '%s %s\n' "$(date +%Y%m%d-%H:%M:%S)" "$line"
done

And another one that's even slower, and which writes the timestamps to standard error rather than standard output:

awk '{system("printf \"`date +%T ` \">&2")}$0'

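And a third one, which is slightly faster, but which may mangle some of the input lines (xargs interprets quotes and backslashes and strips leading whitespace, and date expands any % sequences in the line):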

xargs -I@ -n1 date "+%T @"

The obvious disadvantage to all of the above examples is that we are executing the external date command for every line of input. If we only get a line every couple of seconds, that may be acceptable. But if we're trying to timestamp a stream that gets dozens of lines per second, we may not even be able to keep up with the writer.

There are various ways to do it without forking for every line, but they all require nonstandard tools or specific shells. Bash 4.2 can do it with printf:

# Bash 4.2
while read -r; do
  printf "%(%Y%m%d-%H:%M:%S)T %s\n" -1 "$REPLY"
done

The %(...)T format specifier is new in bash 4.2. The argument of -1 tells it to use the current time, rather than a time passed as an argument. See the man page for details.
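As a small sketch of how the argument works, the functions below wrap the same printf format. The names format_epoch and stamp_lines are made up for this illustration; only the %(...)T specifier itself comes from bash (4.2 or newer).

```shell
# Requires bash 4.2 or newer; other shells do not understand %(...)T.

# format_epoch (hypothetical helper): format the epoch time given as $1.
# No external command is run.
format_epoch() {
  printf '%(%Y%m%d-%H:%M:%S)T\n' "$1"
}

# stamp_lines (hypothetical helper): prefix each line of stdin with the
# current time. The argument -1 means "now".
stamp_lines() {
  local line
  while IFS= read -r line; do
    printf '%(%Y%m%d-%H:%M:%S)T %s\n' -1 "$line"
  done
}
```

With TZ set to UTC, format_epoch 0 prints 19700101-00:00:00, while something like "some_command | stamp_lines" stamps each line as it arrives.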

Another way is to write a perl one-liner:

perl -p -e '@l=localtime; printf "%04d%02d%02d-%02d:%02d:%02d ", 1900+$l[5], 1+$l[4], $l[3], $l[2], $l[1], $l[0]'

I'm sure someone will come up with a 7-byte alternative that does the same thing using some magic perl syntax I've never seen before and can't understand....

There are other tools available specifically for timestamping logfiles and the like. One of them is multilog from daemontools, but its timestamp format is TAI64N, which is not human-readable. Another is ts from the moreutils package.

BashFAQ/107 (last edited 2020-12-21 19:30:23 by GreyCat)