On 3/3/2008 10:25 AM, Radu wrote:
> On Mar 3, 4:37 am, Luuk wrote:
>
>>Ed Morton schreef:
>>
>>>On 3/2/2008 1:57 PM, Radu wrote:
>>>
>>>>On Mar 2, 2:24 pm, Luuk wrote:
>>>
>>>>>Radu schreef:
>>>>
>>>>>>On Mar 2, 12:34 pm, Luuk wrote:
>>>>>
>>>>>>>Radu schreef:
>>>>>>>
>>>>>>>>Hi everybody,
>>>>>>>>Input file:
>>>>>>>>ID        |First name     |Last name           |Address                       |Phone         |
>>>>>>>>----------------------------------------------------------------------------------------------
>>>>>>>><---10--->|<-----15------>|<-------20--------->|<------------30-------------->|<-----14----->|
>>>>>>>>----------------------------------------------------------------------------------------------
>>>>>>>>4568      |Michael        |Moore               |350 Kensington Rd.            |(514) 567-1234|
>>>>>>>>63542     |James          |Joyce               |220 London Blv.               |(450) 234-1456|
>>>>>>>>I need to trim all blanks of each fixed-length field (notice there are
>>>>>>>>blanks within each field that may be seen as field separators by awk
>>>>>>>>if not treated correctly). The output file I'm looking for (a CSV
>>>>>>>>file) would be like:
>>>>>>>>4568,Michael,Moore,350 Kensington Rd.,(514) 567-1234
>>>>>>>>63542,James,Joyce,220 London Blv.,(450) 234-1456
>>>>>>>>Thanks,
>>>>>>>>Radu
>>>>>>>
>>>>>>>awk '{ gsub(/\|/,","); gsub(/ +/," "); gsub(/ ,/,","); print }' inputfile
>>>>>>>--
>>>>>>>Luuk
>>>>>>
>>>>>>Thanks Luuk,
>>>>>>All seems fine, only I made a "small" mistake.
>>>>>>The fields are not separated by "|" but by blanks.
>>>>>>So a record will look something like:
>>>>>>4568      Michael        Moore               350 Kensington Rd.            (514) 567-1234
>>>>>>All I have is the fixed length of each field. I guess I need to trim
>>>>>>all leading and trailing spaces (but not the inside ones, even if they
>>>>>>repeat over several positions) and separate the fields with ",".
>>>>>>Thanks again,
>>>>>>Radu
>>>>>
>>>>>awk 'BEGIN { OFS="," }
>>>>>{
>>>>>    id      = substr($0,  1, 10)
>>>>>    first   = substr($0, 11, 15)
>>>>>    last    = substr($0, 26, 20)
>>>>>    address = substr($0, 46, 30)
>>>>>    phone   = substr($0, 76, 14)
>>>>>    gsub(/ +/, " ", id)
>>>>>    gsub(/ +/, " ", first)
>>>>>    gsub(/ +/, " ", last)
>>>>>    gsub(/ +/, " ", address)
>>>>>    gsub(/ +/, " ", phone)
>>>>>    print id, first, last, address, phone
>>>>>}' inputfile
>>>>
>>>>>Fields may still end with 1 space....
>>>>
>>>>>--
>>>>>Luuk
>>>>
>>>>Thanks a lot Luuk,
>>>
>>>>Excellent idea. It worked
>>>
>>>>Radu
>>>
>>>You might want to take a look at GNU awk's fixed-width field handling:
>>
>>>$ echo "4568      Michael        Moore               350 Kensington Rd.            (514) 567-1234" |
>>>  gawk -v FIELDWIDTHS="10 15 20 30 14" -v OFS="," '{ for (i=1;i<=NF;i++) sub(/ *$/,"",$i) }1'
>>>4568,Michael,Moore,350 Kensington Rd.,(514) 567-1234
>>
>>>Regards,
>>
>>> Ed.
>>
>>Indeed, I might do that, thanks.
>>
>>But I miss the relation FIELDWIDTH <--> NameOfField, which I find
>>clearer in my example. I hate writing docs, so if I look at my
>>example a year from now, I'll understand more quickly what it is
>>supposed to do.
>>
>>--
>>Luuk
>
>
> Hi guys,
>
> Ed, can you please tell me what the }1 at the end of your gawk command is for?
> I used Luuk's method, only my real file has 21 fields, and doing it
> manually is a real pain (i.e., adding up all the field lengths), whereas
> your solution is way quicker.

"}" is just the end of the action section that started with "{".
"1" is an always-true condition, which invokes the default action: printing
the current record. It's a common awk idiom used instead of typing
"{ print $0 }".
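
To see the idiom in isolation, here is a minimal sketch. The sample input and the variable names with_idiom and with_print are made up for illustration; any POSIX awk should behave the same way.

```shell
# A pattern with no action gets the default action, { print $0 }.
# "1" is always true, so these two programs should behave identically.
input='alpha beta
gamma delta'

# Modify the record, then let the bare "1" pattern print it.
with_idiom=$(printf '%s\n' "$input" | awk '{ sub(/ /, ",") } 1')

# The same program with the print written out explicitly.
with_print=$(printf '%s\n' "$input" | awk '{ sub(/ /, ","); print $0 }')

printf '%s\n' "$with_idiom"
```

Both variables end up holding the same two lines, each with its first blank replaced by a comma.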
Ed.
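
P.S. If gawk's FIELDWIDTHS is not an option, the offsets still need not be added up by hand. What follows is only a sketch under assumptions: fixed_to_csv is a hypothetical shell function name, the widths are passed as one space-separated string, and the data is assumed to contain no commas or quotes that would need CSV escaping. Unlike the sub() in the gawk one-liner quoted above, it trims leading blanks as well as trailing ones.

```shell
# Hypothetical helper: fixed_to_csv "W1 W2 ..." < input
# Cuts each line into fixed-width fields and joins them with commas.
fixed_to_csv() {
    awk -v widths="$1" '
    BEGIN { n = split(widths, w, " ") }      # w[1..n] holds the field widths
    {
        pos = 1                              # running start offset of the field
        line = ""
        for (i = 1; i <= n; i++) {
            field = substr($0, pos, w[i])
            pos += w[i]
            gsub(/^ +| +$/, "", field)       # trim leading/trailing blanks only
            line = line (i > 1 ? "," : "") field
        }
        print line
    }'
}

# Build one padded sample record and convert it.
printf '%-10s%-15s%-20s%-30s%-14s\n' \
    4568 Michael Moore '350 Kensington Rd.' '(514) 567-1234' |
    fixed_to_csv "10 15 20 30 14"
```

Blanks inside a field, such as those in the address, are left untouched. For the 21-field file, only the widths string would need to change.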

