Ed Morton wrote:
>
> On 3/3/2008 10:25 AM, Radu wrote:
>> On Mar 3, 4:37 am, Luuk <L...@[EMAIL PROTECTED]> wrote:
>>
>>> Ed Morton wrote:
>>>
>>>> On 3/2/2008 1:57 PM, Radu wrote:
>>>>
>>>>> On Mar 2, 2:24 pm, Luuk <L...@[EMAIL PROTECTED]> wrote:
>>>>>> Radu wrote:
>>>>>>> On Mar 2, 12:34 pm, Luuk <L...@[EMAIL PROTECTED]> wrote:
>>>>>>>> Radu wrote:
>>>>>>>>
>>>>>>>>> Hi everybody,
>>>>>>>>> Input file:
>>>>>>>>> ID      |First name   |Last name    |Address                             |Phone               |
>>>>>>>>> --------------------------------------------------------------------------------------------------------------
>>>>>>>>> <--10-->|<----15----->|<----20----->|<--------30------------------------>|<---14------------->|
>>>>>>>>> --------------------------------------------------------------------------------------------------------------
>>>>>>>>> 4568    |Michael      |Moore        |350 Kensington Rd.                  |(514) 567-1234      |
>>>>>>>>> 63542   |James        |Joyce        |220 London Blv.                     |(450) 234-1456      |
>>>>>>>>> I need to trim all blanks of each fixed-length field (notice there
>>>>>>>>> are blanks within each field that may be seen as field separators
>>>>>>>>> by awk if not treated correctly). The output file I'm looking for
>>>>>>>>> (a CSV file) would be like:
>>>>>>>>> 4568,Michael,Moore,350 Kensington Rd.,(514) 567-1234
>>>>>>>>> 63542,James,Joyce,220 London Blv.,(450) 234-1456
>>>>>>>>> Thanks,
>>>>>>>>> Radu
>>>>>>>> awk '{ gsub(/\|/, ","); gsub(/ +/, " "); gsub(/ ,/, ","); sub(/,$/, ""); print }' inputfile
>>>>>>>> --
>>>>>>>> Luuk
>>>>>>> Thanks Luuk,
>>>>>>> All seems fine, only I made a "small" mistake.
>>>>>>> The fields are not separated by "|" but by blanks.
>>>>>>> So the record will be something like
>>>>>>> 4568      Michael        Moore               350 Kensington Rd.            (514) 567-1234
>>>>>>> All I have is the fixed length of each field. I guess I need to trim
>>>>>>> all leading and trailing spaces (but not the inside ones, even if they
>>>>>>> repeat over several positions) and separate the fields with ",".
>>>>>>> Thanks again,
>>>>>>> Radu
>>>>>> awk 'BEGIN { OFS=","; }
>>>>>> { id=substr($0,1,10);
>>>>>> first=substr($0,11,15);
>>>>>> last=substr($0,26,20);
>>>>>> address=substr($0,46,30);
>>>>>> phone=substr($0,76,14);
>>>>>> gsub(/ +/," ",id);
>>>>>> gsub(/ +/," ",first);
>>>>>> gsub(/ +/," ",last);
>>>>>> gsub(/ +/," ",address);
>>>>>> gsub(/ +/," ",phone);
>>>>>> print id, first, last, address, phone;
>>>>>> }' inputfile
>>>>>> Fields may still end with 1 space....
>>>>>> --
>>>>>> Luuk
>>>>> Thanks a lot Luuk,
>>>>> Excellent idea. It worked.
>>>>> Radu
>>>> You might want to take a look at GNU awk's fixed-width field handling:
>>>> $ echo "4568      Michael        Moore               350 Kensington Rd.            (514) 567-1234" |
>>>>   gawk -v FIELDWIDTHS="10 15 20 30 14" -v OFS="," '{ for (i=1;i<=NF;i++) sub(/ *$/,"",$i) }1'
>>>> 4568,Michael,Moore,350 Kensington Rd.,(514) 567-1234
>>>> Regards,
>>>> Ed.
>>> indeed, i might do that, thanks....
>>>
>>> but I miss the relation field width <--> field name, which I find
>>> clearer in my example. I hate writing docs, so if I look at my
>>> example a year from now, I'll understand more quickly what it's
>>> supposed to do...
>>>
>>> --
>>> Luuk
>>
>> Hi guys,
>>
>> Ed, can you please tell me what is }1 for at the end of your gawk?
>> I used Luuk's method, only my real file has 21 fields, and doing it
>> manually is a real pain (i.e. adding up all the field lengths) whereas
>> your solution is way quicker.
>
> "}" is just the end of the action section that started with "{".
> "1" is a true condition which invokes the default action, which is to
> print the current record. It's a common awk idiom instead of typing
> "{ print $0 }".
>
> Ed.
>
>
It's a common, but undocumented, awk idiom instead of typing "{ print $0 }".
I think it's a side effect of the way awk works, and not an intended way
of doing things...
sorry Ed, ... ;-)
--
Luuk
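
On the "}1" question: as far as I know the POSIX awk specification says a
rule with a pattern and no action behaves as if the action were { print },
so the trailing 1 leans on specified behavior. A minimal sketch comparing
the two spellings (the two-line sample input is made up for illustration):

```shell
# "1" is an always-true pattern with no action, so awk prints the record.
out_idiom=$(printf 'a b\nc d\n' | awk '{ $1 = "x" } 1')

# The same program with the print written out explicitly.
out_explicit=$(printf 'a b\nc d\n' | awk '{ $1 = "x" } { print $0 }')

printf '%s\n' "$out_idiom"
```

Both variables should hold the same two lines, "x b" and "x d".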

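
For a file with many fields, one way to keep the name <--> width relation
without adding up offsets by hand is to pass a "name:width" list and let awk
compute the start positions. This is only a sketch in plain awk (no
FIELDWIDTHS needed): the function name fixed_to_csv, the spec format and the
optional header switch are my own assumptions, not something from the posts
above. The widths are the ones given in the thread.

```shell
# Hypothetical helper: fixed_to_csv "name:width ..." [1 to print a header row]
fixed_to_csv() {
  awk -v spec="$1" -v header="${2:-0}" '
    BEGIN {
      # Split the spec into parallel name[] and width[] arrays.
      n = split(spec, part, " ")
      for (i = 1; i <= n; i++) {
        split(part[i], kv, ":")
        name[i]  = kv[1]
        width[i] = kv[2] + 0
        hdr = hdr (i > 1 ? "," : "") name[i]
      }
      if (header == 1) print hdr
    }
    {
      pos = 1; line = ""
      for (i = 1; i <= n; i++) {
        f = substr($0, pos, width[i])   # cut the field at its running offset
        pos += width[i]
        gsub(/^ +| +$/, "", f)          # trim leading/trailing blanks only
        line = line (i > 1 ? "," : "") f
      }
      print line
    }'
}

# Usage: build one padded record with printf, then convert it.
csv=$(printf '%-10s%-15s%-20s%-30s%-14s\n' \
        4568 Michael Moore '350 Kensington Rd.' '(514) 567-1234' |
      fixed_to_csv "id:10 first:15 last:20 address:30 phone:14")
printf '%s\n' "$csv"
```

Blanks inside a field are left alone, so "350 Kensington Rd." keeps its
inner spaces. Fields that themselves contain commas would need quoting,
which this sketch does not attempt.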
