On 1/24/2008 8:56 PM, z.entropic wrote:
> On Jan 24, 5:40 pm, Ed Morton <mor...@[EMAIL PROTECTED]
> wrote:
>
>>On 1/24/2008 4:35 PM, z.entropic wrote:
>>
>>>On Jan 24, 3:33 pm, Ed Morton <mor...@[EMAIL PROTECTED]
>>>wrote:
>>
>>>>On 1/24/2008 2:25 PM, z.entropic wrote:
>>>
>>>>>On Jan 24, 2:41 pm, Ed Morton <mor...@[EMAIL PROTECTED]
>>>>>wrote:
>>>>
>>>>>>On 1/24/2008 1:19 PM, z.entropic wrote:
>>>>>><snip>
>>>>>
>>>>>>>Here is a fragment of my input file:
>>>>>>
>>>>>>>============
>>>>>>>100 24479.33 14399.09 1/23/2008 19:55 6 1 0 0 3.293 1.287
>>>>>>>101 24480.25 14400.01 1/23/2008 19:55 6 1 0 0 3.296 1.288
>>>>>>>102 24480.36 0.11 1/23/2008 19:55 7 1 0 -0.00185954 3.167 1.287
>>>>>>>=============
>>>>>>
>>>>>>>Thus, if field $6 in line 100 is equal to 6 AND field $6 in line 101
>>>>>>>is equal to 7, store the value 14399.09 in array1[1] and 1.287 in
>>>>>>>array2[1]. When the next match is found, store the two values in
>>>>>>>array1[2] and array2[2], etc. Basically, I'm comparing the same
>>>>>>>fields in consecutive rows.
>>>>>>
>>>>>>Try this:
>>>>>
>>>>>>awk '($6==7)&&(p6==6){array1[++n]=p3;array2[n]=p11} {p6=$6;p3=$3;p11=$11}' file
>>>>>
>>>>>>You should use an array for the "p" (previous) field values if you need to
>>>>>>access more of them.
>>>>>
>>>>>> Ed.
>>>>>
>>>>>I think my example is a bit confusing due to poor formatting (copying
>>>>>from Excel with the wrong date format didn't help...) In essence, I
>>>>>can't get the script working even after some changes etc., so let me
>>>>>explain again as best as I can. Here is an interesting section from
>>>>>one of my data files, this time with proper formatting that awk would
>>>>>see (tab-separated fields):
>>>>
>>>>>100 24479.32 14399.08 1/23/2008 7:55:39 PM 6 1 0 0 3.293399
>>>>>101 24480.25 14400.01 1/23/2008 7:55:40 PM 6 1 0 0 3.293234
>>>>>102 24480.36 0.10 1/23/2008 7:55:41 PM 7 1 0 -0.00185954 3.166826
>>>>>103 24480.46 0.21 1/23/2008 7:55:41 PM 7 1 0 -0.00185932 3.034836
>>>>
>>>>>Simply put, I want to find pairs of lines in which the counter in
>>>>>field $7 changes, here from 6 to 7, and then store in array array1[1]
>>>>>the value found in field $11 (3.293234, line 101). The next pair of
>>>>>found lines would change the array counter to 2 (array[2]).
>>>>
>>>>So now we're back to one array? ok, look:
>>>
>>>>$ awk '($7==7)&&(p7==6){array[++n]=p11} {p7=$7;p11=$11} END{for (i in array) print i, array[i]}' file
>>>>1 3.293234
>>>
>>>>>Once I figure out with your help how to do that, I'll try to expand
>>>>>this script to store more values, including some from line 102 in the
>>>>>example above.
>>>>
>>>>If the above still isn't what you're looking for either, maybe posting a
>>>>little more sample input and some expected output would help.
>>>
>>>> Ed.
>>>
>>>Your script works--in part, probably because I underspecified the
>>>requirements. I think the problem is a bit more complex; I'll provide
>>>a larger example of input and output.
>>
>>OK, but if it's just that you want to get output every time the 7th field
>>changes rather than when it specifically changes from 6 to 7, then all you'd
>>need is:
>>
>>$ awk 'p7&&($7!=p7){array[++n]=p11} {p7=$7;p11=$11} END{for (i in array) print i, array[i]}' file
>>1 3.293234
>>
>>so also see if that's what you're really looking for....
>>
>>
>>
>>
>>>I believe this kind of a problem may be of interest to a wider group
>>>of readers and awk users as it concerns data extraction and processing
>>>that I, at least, often encounter.
>>
>>>z.e.
>
>
> I think this is the closest so far to my goal--the $11 values printed
> out are those I am after, and the lines I'm interested in always are
> those where one of the fields, a loop counter of sorts, changes a
> value. Now, three questions on the modification of the latest script
> to expand its functionality:
>
> 1. how to store the value in an additional field, e.g., $10, and print
> it out on the same line? I've tried
>
> awk 'p7&&($7!=p7){V[++n]=p11}{c[++m]==p10} {p7=$7;p10=$10;p11=$11} END
ITYM c[++m]=p10 instead of c[++m]==p10.
> {for (i in V) print i, V[i],c[i]}'
>
> but obviously this expression doesn't work as intended (the for loop
> is incomplete...) Should I use two independent loops and a \n at the
> end of the first statement? The n and m indices are always the same,
> but I can't use n twice as it increases in both expressions...
If you don't want n to increase twice, just don't increment it twice:
awk 'p7&&($7!=p7){V[++n]=p11;c[n]=p10} {p7=$7;p10=$10;p11=$11}
END{for (i in V) print i, V[i],c[i]}'
>
> 2. how could I store and print out $11 from the next line (with an
> already changed $7?)
awk 'p7&&($7!=p7){V[++n]=p11;c[n]=p10;d[n]=$11} {p7=$7;p10=$10;p11=$11}
END{for (i in V) print i, V[i],c[i],d[i]}'
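
A minimal sketch of that command run against the four sample lines quoted earlier in this thread, in case it helps to see the output. The function name `changes` and the variable `sample` are made up for illustration. The END loop counts from 1 to n rather than using `for (i in V)`, because awk does not guarantee the order in which a for-in loop visits the indices.

```shell
# Sample data copied from the quoted message above (whitespace-separated here,
# so the date, time and AM/PM marker are fields $4, $5 and $6).
sample='100 24479.32 14399.08 1/23/2008 7:55:39 PM 6 1 0 0 3.293399
101 24480.25 14400.01 1/23/2008 7:55:40 PM 6 1 0 0 3.293234
102 24480.36 0.10 1/23/2008 7:55:41 PM 7 1 0 -0.00185954 3.166826
103 24480.46 0.21 1/23/2008 7:55:41 PM 7 1 0 -0.00185932 3.034836'

# Hypothetical helper: reads records on stdin and, whenever $7 differs from
# the previous line's $7, stores the previous $11 and $10 and the current $11.
changes() {
  awk '
    p7 && ($7 != p7) { V[++n] = p11; c[n] = p10; d[n] = $11 }
    { p7 = $7; p10 = $10; p11 = $11 }          # remember this line for the next one
    END { for (i = 1; i <= n; i++) print i, V[i], c[i], d[i] }
  '
}

printf '%s\n' "$sample" | changes
# prints: 1 3.293234 0 3.166826
```

One thing to watch: the `p7 &&` guard is false when the previous counter value is 0, so a change away from 0 would be skipped with this condition.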
> 3. I'd like to store and print the source FILENAME on each line.
awk 'p7&&($7!=p7){V[++n]=p11;c[n]=p10;d[n]=$11;e[n]=FILENAME}
{p7=$7;p10=$10;p11=$11}
END{for (i in V) print i, V[i],c[i],d[i],e[i]}'
but you don't need to store it if it's just one input file:
awk 'p7&&($7!=p7){V[++n]=p11;c[n]=p10;d[n]=$11} {p7=$7;p10=$10;p11=$11}
END{for (i in V) print i, V[i],c[i],d[i],FILENAME}'
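
If several input files are given at once, a sketch along these lines may be closer to what is wanted. It assumes that a change in $7 between the last line of one file and the first line of the next should not count, which the thread does not say either way. The function name `changes_by_file`, the flag `have_prev`, and the file names in the usage comment are invented for illustration.

```shell
# Hypothetical helper: takes one or more file names and prints, for each
# change in $7, the previous $11, the current $11 and the file it came from.
changes_by_file() {
  awk '
    FNR == 1 { have_prev = 0 }                 # first line of a file: forget the previous line
    have_prev && $7 != p7 { V[++n] = p11; d[n] = $11; e[n] = FILENAME }
    { p7 = $7; p11 = $11; have_prev = 1 }      # remember this line for the next one
    END { for (i = 1; i <= n; i++) print i, V[i], d[i], e[i] }
  ' "$@"
}

# Usage (file names are placeholders):
#   changes_by_file run1.txt run2.txt
```

The explicit `have_prev` flag stands in for the `p7 &&` test so that a counter value of 0 on the previous line is still treated as a valid previous value.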
> 4. I'd like to skip the first 5 or 10 lines (I think I know how to do
> that...)
awk 'NR<=10{next}
p7&&($7!=p7){V[++n]=p11;c[n]=p10;d[n]=$11;e[n]=FILENAME}
{p7=$7;p10=$10;p11=$11}
END{for (i in V) print i, V[i],c[i],d[i],e[i]}'
> 5. I assume that if I wanted more complex conditions, I could combine
> them as in
>
> awk '(p7&&($7!=p7))&&(p8&&($8!=p8))...'
>
> but what if I'd like to use $8 on the next line, with a changed value
> of $7?
I don't know what you mean by that.
> Hmmm... this is getting more complex than I initially expected...
Just follow the pattern....
Ed.

