Re: Less greedy pattern match

by "Rajan" <svrajan@[EMAIL PROTECTED] > Apr 17, 2008 at 07:20 PM

"Ed Morton" <morton@[EMAIL PROTECTED]> wrote in message
news:4807A618.5020501@[EMAIL PROTECTED]
>
> On 4/17/2008 1:32 PM, pk wrote:
>> On Wednesday 16 April 2008 05:23, Ed Morton wrote:
>>
>>
>>>There's no "less greedy" operator (to be honest, I'm not sure what that
>>>really means or why it'd be more useful than a different RE)
>>
>>
>> Well, perl REs have non-greedy match. It can be useful, for example, when
>> parsing html or xml. The classical example is something like this (please
>> note that I'm not a perl guru - alas -):
>>
>> $ cat file.html
>> <tag><section><t1>foo</t1><t2>blah</t2><t1>bar
>> </t1></section>
>> <section><t1>baz
>> </t1><t2>blah</t2><t1>baz</t1></section></tag>
>>
>> Suppose you want to remove only what's inside <t1> tags (tags included) and
>> keep everything else, without a priori knowledge of how the lines are
>> formatted.
>>
>> The simple perl one-liner:
>>
>> $ perl -p0e 's%<t1>.+?</t1>%%gs' file.html
>> <tag><section><t2>blah</t2></section>
>> <section><t2>blah</t2></section></tag>
>>
>> does what can't be done with sed. The "+?" is the non-greedy notation
>> for "one or more", just like "*?" is the non-greedy operator for "zero or
>> more", etc.
>>
>> Note that I'm not saying that that cannot be done using other methods...that
>> was just an example to demonstrate how non-greedy match can be useful.
>>
>
> Yeah, that is a bit simpler than the awk equivalent:
>
> $ cat file
> <tag><section><t1>foo</t1><t2>blah</t2><t1>bar</t1></section>
> <section><t1>baz</t1><t2>blah</t2><t1>baz</t1></section></tag>
>
> $ awk '{o="</t1>";n=SUBSEP;gsub(o,n);gsub("<t1>[^"n"]*"n,"");gsub(n,o)}1' file
> <tag><section><t2>blah</t2></section>
> <section><t2>blah</t2></section></tag>
>
> or something else to chop up and reassemble the record. It is a pity there
> isn't a "not" operator for RE elements other than characters within [...].
>
> Oh well...
>
> Ed.
>

Wouldn't this do?
gawk -v RS='</*t1>' -v ORS="" 'RT!="</t1>"{print}' file
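
For anyone without gawk to hand, the record-splitting idea above can be sketched in Python. This is only an illustration, not a drop-in equivalent: the function name strip_t1 is made up for the example, the sample text is pk's file.html from the quoted post, and the pattern uses </?t1> (optional slash) where the gawk command uses </*t1>. The idea is the same: cut the input at every <t1> or </t1>, remember which tag ended each piece (gawk's RT), and drop the pieces that were ended by </t1>.

```python
import re

def strip_t1(text):
    # Split on <t1> or </t1>; the capture group keeps each terminator,
    # playing the role of gawk's RT.
    parts = re.split(r"(</?t1>)", text)
    kept = []
    # parts alternates: record, terminator, record, terminator, ..., record
    for i in range(0, len(parts), 2):
        record = parts[i]
        terminator = parts[i + 1] if i + 1 < len(parts) else ""
        # A record terminated by </t1> is the content of a <t1> element.
        if terminator != "</t1>":
            kept.append(record)
    return "".join(kept)

# pk's sample input, with line breaks inside some <t1> elements.
sample = (
    "<tag><section><t1>foo</t1><t2>blah</t2><t1>bar\n"
    "</t1></section>\n"
    "<section><t1>baz\n"
    "</t1><t2>blah</t2><t1>baz</t1></section></tag>\n"
)

print(strip_t1(sample), end="")
```

Because the input is cut at the tags rather than at newlines, it does not matter where the line breaks fall inside a <t1> element.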

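To make the greedy / non-greedy difference in the quoted posts concrete, here is a small Python sketch (Python's re module accepts the same "+?" notation as perl). The last part imitates the sentinel trick from the quoted awk one-liner: replace </t1> with a single character, then use a negated bracket expression. The sentinel "\x1c" is awk's default SUBSEP and is assumed not to occur in the input; the name strip_with_sentinel is invented for this example.

```python
import re

sample = "<t1>foo</t1><t2>blah</t2><t1>bar</t1>"

# Greedy: ".+" runs to the LAST </t1>, so the <t2> element is removed too.
greedy = re.sub(r"<t1>.+</t1>", "", sample, flags=re.DOTALL)

# Non-greedy: ".+?" stops at the FIRST </t1> after each <t1>.
lazy = re.sub(r"<t1>.+?</t1>", "", sample, flags=re.DOTALL)

# Sentinel trick for RE flavours without non-greedy quantifiers.
# Assumption: SENTINEL never appears in the input text.
SENTINEL = "\x1c"

def strip_with_sentinel(text):
    # 1. Collapse the multi-character closing tag into one character.
    marked = text.replace("</t1>", SENTINEL)
    # 2. "Anything but the sentinel" now means "anything up to the
    #    nearest closing tag".
    stripped = re.sub("<t1>[^" + SENTINEL + "]*" + SENTINEL, "", marked)
    # 3. Restore any closing tags that were not part of a removed element.
    return stripped.replace(SENTINEL, "</t1>")

print(repr(greedy))                 # ''
print(lazy)                         # <t2>blah</t2>
print(strip_with_sentinel(sample))  # <t2>blah</t2>
```

One small difference: ".+?" needs at least one character between the tags, while "[^...]*" also removes an empty <t1></t1> pair.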


