On 2/7/2008 8:04 AM, di98mase wrote:
> Hi all,
>
> I have a big log file that looks like this:
>
> /I/Start/info/logId(1)
> /I/Start/info/logId(1)
> /W/other stuff/
> /I/Start/info/logId(2)
> /I/Start/info/logId(2)
> :
> :
> /W/more stuff/
> /W/other stuff/
> /I/End/info/logId(1)
> /I/End/info/logId(1)
> /W/more stuff/
> /W/other stuff
> :
> /I/End/info/logId(2)
> /I/End/info/logId(2)
>
> the 4 rows that have Start and End in the beginning are all parts of
> the same log item if they have the same logId(n). In my example above
> there are 2 log pairs (logId(1) and logId(2)).
>
> So I need to gather information from these 4 lines and store it
> in another file. My idea is to create an awk program
> that will:
> search for the regex '/I/Start' and read the logId
> browse the rest of the log file for the End logs with the same logId
> when all log lines are found and the information stored and printed,
> continue on in the log file. Just make sure that the logId has not
> been used before.
>
> My problem is understanding how to do the "browsing" in the input
> file. This is what I mean: if I find the first Start log on line 5,
> use getline to parse the rest of the input file, and find the
> End log on line 34, how do I continue from line 6 for the next
> "search"? As far as I understand, using getline will have the effect
> that the program will look at line 35 for the next Start log, thus
> missing all logs between 6 and 33, or did I misunderstand?
>
> /di98mase
Using getline is rarely the right answer.
Are there always 2 start and end lines for each logId?
Do you want both in the output?
Do you want separate output files for each logId?
Take a look at the output from this:
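awk -F/ '
# A Start line marks its logId (the last /-separated field) as open.
$0 ~ "/I/Start" {ids[$NF]}
# Write every line to the file named after each open logId.
{for (id in ids) print $0 > id}
# An End line closes its logId, after that line has been written.
$0 ~ "/I/End" {delete ids[$NF]}
' file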
to see how to create separate output files for each "logId". It'll only
print one "End" line for each. Easily fixed, but I don't want to do more
without fully understanding your requirements and this may be all the hint
you need. If not, posting the desired output for your sample input would
help.
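
If what you want is the four Start/End lines of each logId gathered
together, here is a rough sketch of one way to do it in a single pass,
with no getline. It assumes every logId has exactly two End lines, as in
your sample; that is a guess at your requirements, not something you
have confirmed. The function name gather_log_items and the array names
are made up for illustration.

```shell
# Hypothetical helper: gather the Start/End lines of each logId in one pass.
# Assumption: every logId has exactly two End lines, as in the sample log.
# Usage: gather_log_items file > items.txt
gather_log_items() {
    awk -F/ '
    # With -F/ a line such as /I/Start/info/logId(1) splits into
    # $2 = "I", $3 = "Start" and $NF = "logId(1)".
    $2 == "I" && ($3 == "Start" || $3 == "End") {
        id = $NF
        item[id] = item[id] $0 "\n"      # remember this line under its logId
        if ($3 == "End" && ++ends[id] == 2) {
            printf "%s\n", item[id]      # item complete: print it plus a blank line
            delete item[id]
            delete ends[id]
        }
    }' "$@"
}
```

Because the lines are stored per logId in an array, awk never has to go
back in the file: interleaved items are collected side by side, and each
one is printed as soon as its second End line arrives.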
Ed.

