On Jan 24, 5:40 pm, Ed Morton <mor...@[EMAIL PROTECTED]> wrote:
> On 1/24/2008 4:35 PM, z.entropic wrote:
>
> > On Jan 24, 3:33 pm, Ed Morton <mor...@[EMAIL PROTECTED]> wrote:
>
> >>On 1/24/2008 2:25 PM, z.entropic wrote:
>
> >>>On Jan 24, 2:41 pm, Ed Morton <mor...@[EMAIL PROTECTED]> wrote:
>
> >>>>On 1/24/2008 1:19 PM, z.entropic wrote:
> >>>><snip>
>
> >>>>>Here is a fragment of my input file:
>
> >>>>>============
> >>>>>100   24479.33   14399.09   1/23/2008 19:55   6   1   0   0             3.293   1.287
> >>>>>101   24480.25   14400.01   1/23/2008 19:55   6   1   0   0             3.296   1.288
> >>>>>102   24480.36   0.11       1/23/2008 19:55   7   1   0   -0.00185954   3.167   1.287
> >>>>>=============
>
> >>>>>Thus, if field $6 in line 100 is equal to 6 AND field $6 in line 101
> >>>>>is equal to 7, store the value 14399.09 in array1[1] and 1.287 in
> >>>>>array2[1]. When the next match is found, store the two values in
> >>>>>array1[2] and array2[2], etc.  Basically, I'm comparing the same
> >>>>>fields in consecutive rows.
>
> >>>>Try this:
>
> >>>>awk '($6==7)&&(p6==6){array1[++n]=p3;array2[n]=p11} {p6=$6;p3=$3;p11=$11}' file
>
> >>>>You should use an array for the "p" (previous) field values if you need to
> >>>>access more of them.
>
> >>>>       Ed.
>
> >>>I think my example is a bit confusing due to poor formatting (copying
> >>>from Excel with the wrong date format didn't help...)  In essence, I
> >>>can't get the script working even after some changes etc., so let me
> >>>explain again as best as I can.  Here is an interesting section from
> >>>one of my data files, this time with proper formatting that awk would
> >>>see (tab-separated fields):
>
> >>>100 24479.32 14399.08 1/23/2008 7:55:39 PM 6 1 0  0          3.293399
> >>>101 24480.25 14400.01 1/23/2008 7:55:40 PM 6 1 0  0          3.293234
> >>>102 24480.36     0.10 1/23/2008 7:55:41 PM 7 1 0 -0.00185954 3.166826
> >>>103 24480.46     0.21 1/23/2008 7:55:41 PM 7 1 0 -0.00185932 3.034836
>
> >>>Simply put, I want to find pairs of lines in which the counter in
> >>>field $7 changes, here from 6 to 7, and then store in array array1[1]
> >>>the value found in field $11 (3.293234, line 101). The next pair of
> >>>found lines would change the array counter to 2 (array[2]).
>
> >>So now we're back to one array? ok, look:
>
> >>$ awk '($7==7)&&(p7==6){array[++n]=p11} {p7=$7;p11=$11} END{for (i in array) print i, array[i]}' file
> >>1 3.293234
>
> >>>Once I figure out with your help how to do that, I'll try to expand
> >>>this script to store more values, including some from line 102 in the
> >>>example above.
>
> >>If the above still isn't what you're looking for either, maybe posting a little
> >>more sample input and some expected output would help.
>
> >>        Ed.
>
> > Your script works--in part, probably because I underspecified the
> > requirements.  I think the problem is a bit more complex; I'll provide
> > a larger example of input and output.
>
> OK, but if it's just that you want to get output every time the 7th field
> changes rather than when it specifically changes from 6 to 7, then all you'd
> need is:
>
> $ awk 'p7&&($7!=p7){array[++n]=p11} {p7=$7;p11=$11} END{for (i in array) print i, array[i]}' file
> 1 3.293234
>
> so also see if that's what you're really looking for....
>
>
>
> > I believe this kind of a problem may be of interest to a wider group
> > of readers and awk users as it concerns data extraction and processing
> > that I, at least, often encounter.
>
> > z.e.
I think this is the closest so far to my goal--the $11 values printed
out are the ones I am after, and the lines I'm interested in are always
those where one of the fields, a loop counter of sorts, changes
value. Now, a few questions on modifying the latest script
to expand its functionality:
1. how to store the value in an additional field, e.g., $10, and print
it out on the same line? I've tried
awk 'p7&&($7!=p7){V[++n]=p11}{c[++m]==p10} {p7=$7;p10=$10;p11=$11} END {for (i in V) print i, V[i],c[i]}'
but obviously this expression doesn't work as intended (the for loop
is incomplete...) Should I use two independent loops and a \n at the
end of the first statement? The n and m indices are always the same,
but I can't use n twice as it increases in both expressions...
2. how could I store and print out $11 from the next line (the one with
an already changed $7)?
3. I'd like to store and print the source FILENAME on each line.
4. I'd like to skip the first 5 or 10 lines (I think I know how to do
that...)
5. I assume that if I wanted more complex conditions, I could combine
them as in
awk '(p7&&($7!=p7))&&(p8&&($8!=p8))...'
but what if I'd like to use $8 on the next line, with a changed value
of $7?
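
For what it's worth, here is a minimal sketch of how points 1 to 4 might fit together. It has not been tried on the real data files. The names sample.txt, skip, seen, src and nxt are made up for illustration, and the fields are assumed to split on whitespace exactly as in the four-line fragment quoted above. A seen flag stands in for the p7&& test so that a counter value of 0 is not mistaken for "no previous line", and a single index n is shared by all the arrays so the values from one match stay together.

```shell
# Hypothetical sample file, copied from the four data lines quoted above.
cat > sample.txt <<'EOF'
100 24479.32 14399.08 1/23/2008 7:55:39 PM 6 1 0 0 3.293399
101 24480.25 14400.01 1/23/2008 7:55:40 PM 6 1 0 0 3.293234
102 24480.36 0.10 1/23/2008 7:55:41 PM 7 1 0 -0.00185954 3.166826
103 24480.46 0.21 1/23/2008 7:55:41 PM 7 1 0 -0.00185932 3.034836
EOF

out=$(awk -v skip=1 '
FNR == 1    { seen = 0 }          # new input file: forget the previous line
FNR <= skip { next }              # point 4: ignore the first "skip" lines
seen && $7 != p7 {                # the counter in $7 has just changed
    n++                           # one shared index for every array
    src[n] = FILENAME             # point 3: source file name
    c[n]   = p10                  # point 1: $10 from the line before the change
    V[n]   = p11                  #          $11 from the line before the change
    nxt[n] = $11                  # point 2: $11 from the line after the change
}
{ p7 = $7; p10 = $10; p11 = $11; seen = 1 }   # remember this line
END {
    # A counting loop keeps the matches in order; "for (i in V)" does not promise that.
    for (i = 1; i <= n; i++) print src[i], i, c[i], V[i], nxt[i]
}' sample.txt)

printf '%s\n' "$out"
# prints: sample.txt 1 0 3.293234 3.166826
```

As for point 5, the fields of the line with the changed $7 are still the current fields inside that same rule, so a test on its $8 could presumably be added to the pattern directly, with p8 saved alongside p7 if the previous value is needed as well.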
Hmmm... this is getting more complex than I initially expected...
z.e.

