On Jan 28, 9:56 am, Ed Morton <mor...@[EMAIL PROTECTED]> wrote:
> On 1/24/2008 8:56 PM, z.entropic wrote:
>
>
>
> > On Jan 24, 5:40 pm, Ed Morton <mor...@[EMAIL PROTECTED]> wrote:
>
> >>On 1/24/2008 4:35 PM, z.entropic wrote:
>
> >>>On Jan 24, 3:33 pm, Ed Morton <mor...@[EMAIL PROTECTED]> wrote:
>
> >>>>On 1/24/2008 2:25 PM, z.entropic wrote:
>
> >>>>>On Jan 24, 2:41 pm, Ed Morton <mor...@[EMAIL PROTECTED]> wrote:
>
> >>>>>>On 1/24/2008 1:19 PM, z.entropic wrote:
> >>>>>><snip>
>
> >>>>>>>Here is a fragment of my input file:
>
> >>>>>>>============
> >>>>>>>100    24479.33    14399.09    1/23/2008 19:55    6    1    0    0              3.293    1.287
> >>>>>>>101    24480.25    14400.01    1/23/2008 19:55    6    1    0    0              3.296    1.288
> >>>>>>>102    24480.36    0.11        1/23/2008 19:55    7    1    0    -0.00185954    3.167    1.287
> >>>>>>>=============
>
> >>>>>>>Thus, if field $6 in line 100 is equal to 6 AND field $6 in line 101
> >>>>>>>is equal to 7, store the value 14399.09 in array1[1] and 1.287 in
> >>>>>>>array2[1]. When the next match is found, store the two values in
> >>>>>>>array1[2] and array2[2], etc.  Basically, I'm comparing the same
> >>>>>>>fields in consecutive rows.
>
> >>>>>>Try this:
>
> >>>>>>awk '($6==7)&&(p6==6){array1[++n]=p3;array2[n]=p11} {p6=$6;p3=$3;p11=$11}' file
>
> >>>>>>You should use an array for the "p" (previous) field values if you need to
> >>>>>>access more of them.
>
> >>>>>>      Ed.
>
> >>>>>I think my example is a bit confusing due to poor formatting (copying
> >>>>>from Excel with the wrong date format didn't help...)  In essence, I
> >>>>>can't get the script working even after some changes etc., so let me
> >>>>>explain again as best as I can.  Here is an interesting section from
> >>>>>one of my data files, this time with proper formatting that awk would
> >>>>>see (tab-separated fields):
>
> >>>>>100 24479.32 14399.08 1/23/2008 7:55:39 PM 6 1 0  0          3.293399
> >>>>>101 24480.25 14400.01 1/23/2008 7:55:40 PM 6 1 0  0          3.293234
> >>>>>102 24480.36     0.10 1/23/2008 7:55:41 PM 7 1 0 -0.00185954 3.166826
> >>>>>103 24480.46     0.21 1/23/2008 7:55:41 PM 7 1 0 -0.00185932 3.034836
>
> >>>>>Simply put, I want to find pairs of lines in which the counter in
> >>>>>field $7 changes, here from 6 to 7, and then store in array array1[1]
> >>>>>the value found in field $11 (3.293234, line 101). The next pair of
> >>>>>found lines would change the array counter to 2 (array[2]).
>
> >>>>So now we're back to one array? ok, look:
>
> >>>>$ awk '($7==7)&&(p7==6){array[++n]=p11} {p7=$7;p11=$11} END{for (i in array)
> >>>>print i, array[i]}' file
> >>>>1 3.293234
>
> >>>>>Once I figure out with your help how to do that, I'll try to expand
> >>>>>this script to store more values, including some from line 102 in the
> >>>>>example above.
>
> >>>>If the above still isn't what you're looking for either, maybe posting a little
> >>>>more sample input and some expected output would help.
>
> >>>>       Ed.
>
> >>>Your script works--in part, probably because I underspecified the
> >>>requirements.  I think the problem is a bit more complex; I'll provide
> >>>a larger example of input and output.
>
> >>OK, but if it's just that you want to get output every time the 7th field
> >>changes rather than when it specifically changes from 6 to 7, then all you'd
> >>need is:
>
> >>$ awk 'p7&&($7!=p7){array[++n]=p11} {p7=$7;p11=$11} END{for (i in array) print i, array[i]}' file
> >>1 3.293234
>
> >>so also see if that's what you're really looking for....
>
> >>>I believe this kind of a problem may be of interest to a wider group
> >>>of readers and awk users as it concerns data extraction and processing
> >>>that I, at least, often encounter.
>
> >>>z.e.
>
> > I think this is the closest so far to my goal--the $11 values printed
> > out are those I am after, and the lines I'm interested in are always
> > those where one of the fields, a loop counter of sorts, changes
> > value. Now, five questions on the modification of the latest script
> > to expand its functionality:
>
> > 1. how to store the value in an additional field, e.g., $10, and print
> > it out on the same line?  I've tried
>
> > awk 'p7&&($7!=p7){V[++n]=p11}{c[++m]==p10} {p7=$7;p10=$10;p11=$11} END
>
> ITYM c[++m]=p10 instead of c[++m]==p10.
>
> > {for (i in V) print i, V[i],c[i]}'
>
> > but obviously this expression doesn't work as intended (the for loop
> > is incomplete...)  Should I use two independent loops and a \n at the
> > end of the first statement?  The n and m indices are always the same,
> > but I can't use n twice as it increases in both expressions...
>
> If you don't want n to increase twice, just don't increment it twice:
>
> awk 'p7&&($7!=p7){V[++n]=p11;c[n]=p10} {p7=$7;p10=$10;p11=$11}
> END{for (i in V) print i, V[i],c[i]}'
>
>
>
> > 2. how could I store and print out $11 from the next line (with an
> > already changed $7?)
>
> awk 'p7&&($7!=p7){V[++n]=p11;c[n]=p10;d[n]=$11} {p7=$7;p10=$10;p11=$11}
> END{for (i in V) print i, V[i],c[i],d[i]}'
>
> > 3. I'd like to store and print the source FILENAME on each line.
>
> awk 'p7&&($7!=p7){V[++n]=p11;c[n]=p10;d[n]=$11;e[n]=FILENAME}
> {p7=$7;p10=$10;p11=$11}
> END{for (i in V) print i, V[i],c[i],d[i],e[i]}'
>
> but you don't need to store it if it's just one input file:
>
> awk 'p7&&($7!=p7){V[++n]=p11;c[n]=p10;d[n]=$11} {p7=$7;p10=$10;p11=$11}
> END{for (i in V) print i, V[i],c[i],d[i],FILENAME}'
>
> > 4. I'd like to skip the first 5 or 10 lines (I think I know how to do
> > that...)
>
> awk 'NR<=10{next}
> p7&&($7!=p7){V[++n]=p11;c[n]=p10;d[n]=$11;e[n]=FILENAME} {p7=$7;p10=$10;p11=$11}
> END{for (i in V) print i, V[i],c[i],d[i],e[i]}'
>
> > 5. I assume that if I wanted more complex conditions, I could combine
> > them as in
>
> > awk '(p7&&($7!=p7))&&(p8&&($8!=p8))...'
>
> > but what if I'd like to use $8 on the next line, with a changed value
> > of $7?
>
> I don't know what you mean by that.
>
> > Hmmm... this is getting more complex than I initially expected...
>
> Just follow the pattern....
>
>         Ed.

I took your latest script, cleaned it up a bit for clarity, and changed
some letters (to make them more meaningful for me during the debugging
and learning process), and it almost works the way I want it to:

( NR < 8 ) && ( $7 < 6 ) { next }
s7 && ( $7 != s7 ) { V[++n] = V11; c[n] = c10; U[n] = $11; f[n] = FILENAME }
{ s7 = $7; c10 = $10; V11 = $11 }
END { for (i in V ) print i, f[i], c[i], V[i], U[i] }

However, I still have a few problems:
1. the first two inequalities seem to be disregarded, and unwanted
data are stored in the array and then printed out.
2. the data are printed out in reverse order (from the highest i down
to 1)... how come?
3. how to impose an i>6 condition in the last 'for' printout loop (see
#1).
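
For reference, a minimal sketch of how these three points might be handled, under some assumptions. The skip rule is assumed to mean "drop the line when either test is true", hence `||` rather than `&&` (with `&&` a line is skipped only when both tests hold). The order in which `for (i in V)` visits indices is not specified by awk, so the sketch uses a counted loop from `start` to `n` instead, which also lets the printout begin at any index. The function names `sample` and `extract`, the parameters, and the variables `hdr`, `lo` and `start` are made up for illustration; FILENAME is left out because the demo reads standard input; and row 104 of the sample is invented to give a second counter change.

```shell
# Sample rows in the layout shown earlier ($7 = counter, $10 and $11 = values).
# Rows 100-103 come from the thread; row 104 is invented for the demo.
sample() {
cat <<'EOF'
100 24479.32 14399.08 1/23/2008 7:55:39 PM 6 1 0 0 3.293399
101 24480.25 14400.01 1/23/2008 7:55:40 PM 6 1 0 0 3.293234
102 24480.36 0.10 1/23/2008 7:55:41 PM 7 1 0 -0.00185954 3.166826
103 24480.46 0.21 1/23/2008 7:55:41 PM 7 1 0 -0.00185932 3.034836
104 24480.56 0.10 1/23/2008 7:55:42 PM 8 1 0 -0.00185900 3.001111
EOF
}

# extract HDR LO START (all three parameters are hypothetical):
#   HDR   = number of leading lines to skip
#   LO    = smallest counter value ($7) to consider
#   START = first stored index to print
extract() {
  awk -v hdr="$1" -v lo="$2" -v start="$3" '
    NR <= hdr || $7 < lo { next }          # skip when EITHER test is true
    s7 && ($7 != s7) {                     # counter changed since previous line
      V[++n] = V11; c[n] = c10; U[n] = $11
    }
    { s7 = $7; c10 = $10; V11 = $11 }      # remember this line for the next one
    END {
      # counted loop: prints in the order the entries were stored
      for (i = start; i <= n; i++) print i, c[i], V[i], U[i]
    }
  '
}
```

With the thresholds from the script above, the call would be `extract 7 6 7 < file`: skip the first 7 lines, ignore counters below 6, and print from i=7 onward.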
z.e.

