Luuk wrote:
> Ed Morton wrote:
>
>> On 2/8/2008 6:50 AM, Luuk wrote:
>>
>>> "Ed Morton" <morton@[EMAIL PROTECTED]> wrote in message
>>> news:47AB6DD4.5060009@[EMAIL PROTECTED]
>>>
>>>> On 2/7/2008 2:37 PM, Kurda Yon wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have the following problem. In my text-file each line has the
>>>>> following format:
>>>>>
>>>>> field_1 field_2 ... field_n (tf. field_1a, field_2a ... field_ka)
>>>>>
>>>>> And I need to extract field_1a, field_2a, ... and field_ka. Here I see
>>>>> several subproblems which I cannot solve:
>>>>> 1. Different lines have different number of fields before the
>>>>> (tf. ... ) block.
>>>>> 2. (tf. ... ) blocks also contain different number of fields.
>>>>> 3. There is no space between "field_ka" and ")". And I want to remove
>>>>> ")".
>>>>>
>>>>> Can this problem be easily solved in awk?
>>>>
>>>> Yes:
>>>>
>>>> $ cat file
>>>> field_1 field_2 ... field_n (tf. field_1a, field_2a ... field_ka)
>>>> $ awk 'gsub(/.*\(....|\)$/,"")1' file
>>>> field_1a, field_2a ... field_ka
>>>>
>>>> Regards,
>>>>
>>>> Ed.
>>>>
>>>
>>> could someone explain the '1' in "$ awk 'gsub(/.*\(....|\)$/,"")1'
>>> file" ?
>>
>>
>> It makes sure that even if the input record is empty (in which case gsub()
>> will return 0), the condition awk ends up testing is non-zero/non-null, so
>> the current record is printed even in that case.
>>
>> The operator that combines the result of gsub() with the "1" is string
>> concatenation, so you can put anything after the gsub() to get a non-null
>> resultant string, even zero (to get the string "00") or the null string (to
>> get the string "0" as opposed to the number zero).
>>
>>> awk does not seem to do anything with it...
>>> or is it just a typo?
>>
>>
>> No. Look:
>>
>> $ cat file
>> a
>>
>> c
>> $ awk 'sub(/./,NR)' file
>> 1
>> 3
>> $ awk 'sub(/./,NR)1' file
>> 1
>>
>> 3
>>
>>> but awk also does not complain when i type:
>>> $ awk 'gsub(/.*\(....|\)$/,"")g' file
>>
>>
>> Right. In that case it evaluates the unassigned variable "g" to the null
>> string "", which is string-concatenated with the zero result of sub() to
>> give a non-null "0" string:
>>
>> $ awk 'sub(/./,NR)g' file
>> 1
>>
>> 3
>>
>> Regards,
>>
>> Ed.
>>
>
>
> i must have skipped that part of the man-page....
>
> normally i would do:
> awk '{ sub(/./,NR); print $0 }' file
>
> which is indeed somewhat longer... ;-)
>
You can shorten that a bit without sacrificing the action block:

    awk '{ sub(/./,NR) } 1' file

Personally I consider the concatenation of sub() and 1

    awk 'sub(/./,NR) 1' file

a hack; it's an unnecessary level of obfuscation[*]. A bit more
verbose, but IMO conceptually clearer (no implicit casts), might be

    awk 'sub(/./,NR) || 1' file

OTOH, in the case where you just want to print those lines where you
actually substituted something, the expression

    awk 'sub(/./,NR)' file

seems more natural than introducing a block (and maybe an
unnecessary if statement).
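
For comparison, here is a small sketch that runs the four variants side by side on the three-line sample quoted earlier (a, an empty line, c). The helper function `sample` and the shell variable names are made up for this illustration; only the awk programs come from the thread.

```shell
# Hypothetical helper: emits the sample input from the thread
# (three records, the middle one empty).
sample() { printf 'a\n\nc\n'; }

# Pattern only: a record is printed only when sub() returns non-zero,
# so the empty record is dropped.
only_changed=$(sample | awk 'sub(/./,NR)')

# Concatenation: sub()'s result joined with 1 is a non-empty string,
# which is always true, so every record is printed.
concat=$(sample | awk 'sub(/./,NR) 1')

# Logical OR: stays numeric, always evaluates to 1.
logical_or=$(sample | awk 'sub(/./,NR) || 1')

# Action block plus an always-true pattern.
block=$(sample | awk '{ sub(/./,NR) } 1')

printf 'pattern only:\n%s\n' "$only_changed"
printf 'concatenation:\n%s\n' "$concat"
```

The last three variants should all keep the empty middle line; only the bare sub() pattern drops it.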
Janis
[*] Mixing integral expressions and "invisible" operators, having
implicit type conversions, and just for a boolean condition result.
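
To make the implicit conversion visible, a minimal sketch: the awk variable `r` below is a hypothetical stand-in for a sub() call that returned 0, and `g` is deliberately left unassigned, as in the quoted example. The shell variable names are likewise only for illustration.

```shell
# Numeric zero is false as a condition.
num=$(awk 'BEGIN { r = 0; if (r) print "true"; else print "false" }')

# Zero concatenated with the unassigned variable g becomes the string "0",
# and any non-empty string is true.
str=$(awk 'BEGIN { r = 0; if (r g) print "true"; else print "false" }')

# Zero concatenated with zero becomes the string "00", also true.
dbl=$(awk 'BEGIN { r = 0; if (r 0) print "true"; else print "false" }')

printf 'number 0: %s\nstring "0": %s\nstring "00": %s\n' "$num" "$str" "$dbl"
```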

