On Feb 27, 12:56 am, Janis <janis_papanag...@[EMAIL PROTECTED]
> wrote:
> On 27 Feb., 08:29, "DanielC" <dnlc...@[EMAIL PROTECTED]
> wrote:
>
> > "Ed Morton" <mor...@[EMAIL PROTECTED]
> wrote in message
> >news:47C25043.1000302@[EMAIL PROTECTED]
>
> > > On 2/24/2008 5:34 PM, DanielC wrote:
>
> > > [ for the second time - please don't top-post, fixed below]
>
> > What does this mean?
>
> http://www.caliburn.nl/topposting.html
>
> [...]
>
>
>
> > A question: how do we use awk to process a very large file (ie. 500000
> > lines) by reading file from bottom to top?
> > I know we can use array to reverse a small file, and we can also apply it on
> > the big file. However that costs too much, and it is not suitable for a
> > frequent job.
>
> Yes, to efficiently process large files you would take a tool that is
> more appropriate for the task. On Unix you may find 'tac' for that.
>
> Janis
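
For comparison, here is a minimal sketch of the two approaches mentioned above: reversing inside awk with an array, and reversing with tac before awk sees the data. The three-line sample file and the variable names (tmp, rev_awk, rev_tac) are made up for illustration. The array version has to keep every line in memory until END, which is presumably where the cost comes from on a 500000-line file.

```shell
# Hypothetical sample data: three lines stand in for a large log.
tmp=$(mktemp)
printf '%s\n' one two three > "$tmp"

# Array approach: awk stores every line, then prints them last to first.
rev_awk=$(awk '{ line[NR] = $0 } END { for (i = NR; i >= 1; i--) print line[i] }' "$tmp")

# tac approach: the file is reversed outside awk, so awk could stream it.
rev_tac=$(tac "$tmp")

# Both give the same order: three, two, one.
echo "$rev_awk"
rm -f "$tmp"
```
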
# time tac p20080229.log | awk '
BEGIN { now=systime(); tgt=120*60 }   # window of 120 minutes, in seconds
{ then=mktime(gensub(/\[(....)-(..)-(..) (..):(..):(..).*/,"\\1 \\2 \\3 \\4 \\5 \\6","")) }
{ if ((now - then) < tgt) {count[$5]++} else exit }   # stop at the first line outside the window
END { for (pid in count) print pid,count[pid] }' | sort -k1
449 18
547 20
548 13
549 5
550 9
551 10
552 8
553 21
554 6
555 24
556 21
557 11
558 11
559 15
560 19
561 27
562 25
563 11
579 6
581 1
582 17
real 0m0.019s
user 0m0.014s
sys 0m0.005s
# ls -l p20080229.log
-rw-rw-rw- 1 root root 33854251 Mar 1 00:00 p20080229.log
# wc -l p20080229.log
573838 p20080229.log
# date
Sat Mar 1 02:00:30 GMT 2008
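
The short run time is presumably due to the early exit: tac hands awk the newest lines first, and awk stops at the first line that falls outside the window, so most of the file is never processed. Below is a simplified sketch of just that idea. It assumes a made-up log layout (epoch seconds in field 1, PID in field 2) and made-up values for now and tgt, so it runs without the gawk time functions used in the real command.

```shell
# Hypothetical log, oldest record first: "<epoch seconds> <pid>".
log=$(mktemp)
printf '%s\n' '100 449' '200 547' '9000 547' '9500 548' '9900 547' > "$log"

# Pretend "now" is 10000 and the window is 2000 seconds (both invented).
recent=$(tac "$log" | awk -v now=10000 -v tgt=2000 '
  { if ((now - $1) < tgt) count[$2]++; else exit }   # stop at the first old record
  END { for (pid in count) print pid, count[pid] }' | sort -k1,1n)

# Only the three newest records are counted; the two old ones are skipped.
echo "$recent"
rm -f "$log"
```

Here the line for PID 449 is never counted, because awk exits as soon as it meets the record stamped 200.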
It works great! Thanks Ed and Janis.
DanielC

