On Jan 28, 9:56 am, Ed Morton <mor...@[EMAIL PROTECTED]> wrote:
> On 1/24/2008 8:56 PM, z.entropic wrote:
>
>
>
> > On Jan 24, 5:40 pm, Ed Morton <mor...@[EMAIL PROTECTED]> wrote:
>
> >>On 1/24/2008 4:35 PM, z.entropic wrote:
>
> >>>On Jan 24, 3:33 pm, Ed Morton <mor...@[EMAIL PROTECTED]> wrote:
>
> >>>>On 1/24/2008 2:25 PM, z.entropic wrote:
>
> >>>>>On Jan 24, 2:41 pm, Ed Morton <mor...@[EMAIL PROTECTED]> wrote:
>
> >>>>>>On 1/24/2008 1:19 PM, z.entropic wrote:
> >>>>>><snip>
>
> >>>>>>>Here is a fragment of my input file:
>
> >>>>>>>============
> >>>>>>>100  24479.33  14399.09  1/23/2008 19:55  6  1  0  0            3.293  1.287
> >>>>>>>101  24480.25  14400.01  1/23/2008 19:55  6  1  0  0            3.296  1.288
> >>>>>>>102  24480.36      0.11  1/23/2008 19:55  7  1  0  -0.00185954  3.167  1.287
> >>>>>>>============
>
> >>>>>>>Thus, if field $6 in line 100 is equal to 6 AND field $6 in line 101
> >>>>>>>is equal to 7, store the value 14399.09 in array1[1] and 1.287 in
> >>>>>>>array2[1]. When the next match is found, store the two values in
> >>>>>>>array1[2] and array2[2], etc.  Basically, I'm comparing the same
> >>>>>>>fields in consecutive rows.
>
> >>>>>>Try this:
>
> >>>>>>awk '($6==7)&&(p6==6){array1[++n]=p3;array2[n]=p11} {p6=$6;p3=$3;p11=$11}' file
>
> >>>>>>You should use an array for the "p" (previous) field values if you
> >>>>>>need to access more of them.
>
> >>>>>>      Ed.
>
> >>>>>I think my example is a bit confusing due to poor formatting (copying
> >>>>>from Excel with the wrong date format didn't help...)  In essence, I
> >>>>>can't get the script working even after some changes etc., so let me
> >>>>>explain again as best as I can.  Here is an interesting section from
> >>>>>one of my data files, this time with proper formatting that awk would
> >>>>>see (tab-separated fields):
>
> >>>>>100 24479.32 14399.08 1/23/2008 7:55:39 PM 6 1 0  0          3.293399
> >>>>>101 24480.25 14400.01 1/23/2008 7:55:40 PM 6 1 0  0          3.293234
> >>>>>102 24480.36     0.10 1/23/2008 7:55:41 PM 7 1 0 -0.00185954 3.166826
> >>>>>103 24480.46     0.21 1/23/2008 7:55:41 PM 7 1 0 -0.00185932 3.034836
>
> >>>>>Simply put, I want to find pairs of lines in which the counter in
> >>>>>field $7 changes, here from 6 to 7, and then store in array1[1]
> >>>>>the value found in field $11 (3.293234, line 101). The next pair of
> >>>>>found lines would change the array counter to 2 (array[2]).
>
> >>>>So now we're back to one array? ok, look:
>
> >>>>$ awk '($7==7)&&(p7==6){array[++n]=p11} {p7=$7;p11=$11}
> >>>>END{for (i in array) print i, array[i]}' file
> >>>>1 3.293234
>
> >>>>>Once I figure out with your help how to do that, I'll try to expand
> >>>>>this script to store more values, including some from line 102 in the
> >>>>>example above.
>
> >>>>If the above still isn't what you're looking for either, maybe posting
> >>>>a little more sample input and some expected output would help.
>
> >>>>       Ed.
>
> >>>Your script works--in part, probably because I underspecified the
> >>>requirements.  I think the problem is a bit more complex; I'll provide
> >>>a larger example of input and output.
>
> >>OK, but if it's just that you want to get output every time the 7th field
> >>changes rather than when it specifically changes from 6 to 7, then all you'd
> >>need is:
>
> >>$ awk 'p7&&($7!=p7){array[++n]=p11} {p7=$7;p11=$11}
> >>END{for (i in array) print i, array[i]}' file
> >>1 3.293234
>
> >>so also see if that's what you're really looking for....
>
> >>>I believe this kind of a problem may be of interest to a wider group
> >>>of readers and awk users as it concerns data extraction and processing
> >>>that I, at least, often encounter.
>
> >>>z.e.
>
> > I think this is the closest so far to my goal--the $11 values printed
> > out are those I am after, and the lines I'm interested in always are
> > those where one of the fields, a loop counter of sorts, changes
> > value. Now, three questions on the modification of the latest script
> > to expand its functionality:
>
> > 1. how to store the value of an additional field, e.g., $10, and print
> > it out on the same line?  I've tried
>
> > awk 'p7&&($7!=p7){V[++n]=p11}{c[++m]==p10} {p7=$7;p10=$10;p11=$11} END
>
> ITYM c[++m]=p10 instead of c[++m]==p10.
>
> > {for (i in V) print i, V[i],c[i]}'
>
> > but obviously this expression doesn't work as intended (the for loop
> > is incomplete...)  Should I use two independent loops and a \n at the
> > end of the first statement?  The n and m indices are always the same,
> > but I can't use n twice as it increases in both expressions...
>
> If you don't want n to increase twice, just don't increment it twice:
>
> awk 'p7&&($7!=p7){V[++n]=p11;c[n]=p10} {p7=$7;p10=$10;p11=$11}
> END{for (i in V) print i, V[i],c[i]}'
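
The for (i in V) loop visits indices in an order awk leaves unspecified, so the results may not print in the order they were found. Since the indices here are simply 1..n, a counted loop avoids that. A minimal sketch, assuming whitespace-separated fields as in the sample rows; rows 100-103 are from the thread and row 104 is made up to give a second transition:

```sh
# Rows 100-103 come from the thread; row 104 is hypothetical.
data='100 24479.32 14399.08 1/23/2008 7:55:39 PM 6 1 0 0 3.293399
101 24480.25 14400.01 1/23/2008 7:55:40 PM 6 1 0 0 3.293234
102 24480.36 0.10 1/23/2008 7:55:41 PM 7 1 0 -0.00185954 3.166826
103 24480.46 0.21 1/23/2008 7:55:41 PM 7 1 0 -0.00185932 3.034836
104 24480.57 0.10 1/23/2008 7:55:42 PM 8 1 0 0.00185900 2.900000'

out=$(printf '%s\n' "$data" | awk '
  # $7 differs from the previous line: store $11 and $10 of that previous line.
  p7 && ($7 != p7) { V[++n] = p11; c[n] = p10 }
  # Remember the fields of this line for the next comparison.
  { p7 = $7; p10 = $10; p11 = $11 }
  # Counted loop: indices 1..n print in the order the matches were found.
  END { for (i = 1; i <= n; i++) print i, V[i], c[i] }
')
printf '%s\n' "$out"
```

One thing to keep in mind: the p7 guard is false when the previous counter value is 0, so data whose counter starts at 0 would need a different guard, for example NR > 1.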
>
>
>
> > 2. how could I store and print out $11 from the next line (with an
> > already changed $7?)
>
> awk 'p7&&($7!=p7){V[++n]=p11;c[n]=p10;d[n]=$11} {p7=$7;p10=$10;p11=$11}
> END{for (i in V) print i, V[i],c[i],d[i]}'
>
> > 3. I'd like to store and print the source FILENAME on each line.
>
> awk 'p7&&($7!=p7){V[++n]=p11;c[n]=p10;d[n]=$11;e[n]=FILENAME}
> {p7=$7;p10=$10;p11=$11}
> END{for (i in V) print i, V[i],c[i],d[i],e[i]}'
>
> but you don't need to store it if it's just one input file:
>
> awk 'p7&&($7!=p7){V[++n]=p11;c[n]=p10;d[n]=$11} {p7=$7;p10=$10;p11=$11}
> END{for (i in V) print i, V[i],c[i],d[i],FILENAME}'
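
With more than one input file, p7 still holds the value from the last line of the previous file when the next file starts, so the boundary between two files can be reported as a change. A sketch that clears the remembered value whenever FNR is 1 and records FILENAME for each match; the file names a.txt and b.txt and the rows in b.txt are made up for illustration:

```sh
dir=$(mktemp -d)

# a.txt reuses rows 101-102 from the thread; b.txt is hypothetical data.
printf '%s\n' \
  '101 24480.25 14400.01 1/23/2008 7:55:40 PM 6 1 0 0 3.293234' \
  '102 24480.36 0.10 1/23/2008 7:55:41 PM 7 1 0 -0.00185954 3.166826' \
  > "$dir/a.txt"
printf '%s\n' \
  '1 10.00 10.00 1/24/2008 8:00:00 AM 3 1 0 0 1.500000' \
  '2 11.00 11.00 1/24/2008 8:00:01 AM 3 1 0 0 1.600000' \
  '3 12.00 0.10 1/24/2008 8:00:02 AM 4 1 0 -0.00100000 1.400000' \
  > "$dir/b.txt"

out=$(cd "$dir" && awk '
  # First line of each file: forget what was seen in the previous file.
  FNR == 1 { p7 = "" }
  # Counter changed within the same file: keep both lines and the file name.
  p7 && ($7 != p7) { V[++n] = p11; c[n] = p10; d[n] = $11; e[n] = FILENAME }
  { p7 = $7; p10 = $10; p11 = $11 }
  END { for (i = 1; i <= n; i++) print i, V[i], c[i], d[i], e[i] }
' a.txt b.txt)
printf '%s\n' "$out"

rm -rf "$dir"
```

Without the FNR == 1 rule, the step from counter 7 at the end of a.txt to counter 3 at the start of b.txt would be stored as a third match.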
>
> > 4. I'd like to skip the first 5 or 10 lines (I think I know how to do
> > that...)
>
> awk 'NR<=10{next}
> p7&&($7!=p7){V[++n]=p11;c[n]=p10;d[n]=$11;e[n]=FILENAME} {p7=$7;p10=$10;p11=$11}
> END{for (i in V) print i, V[i],c[i],d[i],e[i]}'
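
NR counts lines across all input, so NR<=10 skips only the first ten lines of the first file. A sketch using FNR, which restarts at 1 for each file, with the number of lines to skip passed in as a variable; the variable name skip and the header text are assumptions, not from the thread:

```sh
# One hypothetical header line followed by rows 101-102 from the thread.
out=$(printf '%s\n' \
  'Rec Total Step Date Clock AMPM Counter Cyc Idx Cur Volt' \
  '101 24480.25 14400.01 1/23/2008 7:55:40 PM 6 1 0 0 3.293234' \
  '102 24480.36 0.10 1/23/2008 7:55:41 PM 7 1 0 -0.00185954 3.166826' |
awk -v skip=1 '
  # Ignore the first skip lines of every input file.
  FNR <= skip { next }
  p7 && ($7 != p7) { V[++n] = p11; c[n] = p10; d[n] = $11 }
  { p7 = $7; p10 = $10; p11 = $11 }
  END { for (i = 1; i <= n; i++) print i, V[i], c[i], d[i] }
')
printf '%s\n' "$out"
```

If the header were not skipped, its seventh field (the text Counter) would be remembered as p7 and the first data row would be counted as a change.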
>
> > 5. I assume that if I wanted more complex conditions, I could combine
> > them as in
>
> > awk '(p7&&($7!=p7))&&(p8&&($8!=p8))...'
>
> > but what if I'd like to use $8 on the next line, with a changed value
> > of $7?
>
> I don't know what you mean by that.
>
> > Hmmm... this is getting more complex than I initially expected...
>
> Just follow the pattern....
>
>         Ed.
Ed,
I have plenty to digest now. Your wonderful advice is greatly
appreciated!
z.entropic

