Re: parsing a text file

by "Anton Treuenfels" <atreuenfels@[EMAIL PROTECTED] > Apr 7, 2008 at 10:31 PM

"Janis Papanagnou" <Janis_Papanagnou@[EMAIL PROTECTED]
> wrote in message
news:fte63p$nif$1@[EMAIL PROTECTED]
>
>      { gsub (/[,.;:]/, " ", $0) }
>      /<.*>/ { for (i = 1; i <= NF; i++)
>                 if ($i~/<.*>/) { s = substr ($i, 2, length($i)-2)
>                   c = 0
>                   for (j = i-1; j > 0 && c != 3; j--)
>                     if (length($j)>2) { s = $j FS s ; c++ }
>                   c = 0
>                   for (j = i+1; j <= NF && c != 3; j++)
>                     if (length($j)>2) { s = s FS $j ; c++ }
>                 }
>               print s
>      }
>
> Will produce...
>
> for some sea bass
> The bass part the song
> proposed elaborate program public works This
> the program required several hundred

Now that a working solution has been posted I can't help playing with it...

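/<.*>/ {
    gsub(/[,.;:]/, " ", $0)                      # punctuation -> blanks
    gsub(/[ \t]+[^ \t][^ \t]?[ \t]+/, " ", $0)   # drop 1- and 2-character "words"
    for ( i = 1; $i !~ /<.*>/; i++ )             # find the field holding the target
        ;
    t = substr($i, 2, length($i)-2 )             # target without "<" and ">"
    j = i - 3 > 1 ? i - 3 : 1                    # first field of the window
    e = i + 3 < NF ? i + 3 : NF                  # last field of the window
    s = i != j ? $j : t
    while ( ++j <= e )
        s = s FS (i != j ? $j : t)               # parens needed: concatenation binds tighter than ?:
    print s
}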

...the idea being to simplify by getting rid of all the "words" of one or
two characters right at the start (not tested). There's still a whole bunch
of assumptions, including that "<" and ">" are always the first and last
characters of the target word and that there is at most one target word per
line.

- Anton Treuenfels
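
For anyone who wants to try the idea on real input, here is a minimal sketch
of a field-based variant. It is not the code posted in this thread. It drops
the short words while looping over the fields instead of using a second
gsub(), because a single gsub() pass cannot reuse a blank it has already
consumed, so it would leave the second of two adjacent short words (and any
short word at the very start or end of the line) in place. The function name
context_words and the sample line are made up for illustration. The same
assumptions still apply: the target is a whole field wrapped in "<" and ">",
and there is at most one target per line.

```sh
# context_words: for each input line containing a <marked> word, print that
# word with up to three neighbouring words (3+ characters) on each side.
# Hypothetical helper name; reads stdin, writes stdout.
context_words() {
    awk '
    /<[^ \t<>]+>/ {
        gsub(/[,.;:]/, " ")                  # punctuation becomes blanks; $0 is re-split
        n = 0; pos = 0
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^<.*>$/) {             # the marked word: strip "<" and ">"
                w[++n] = substr($i, 2, length($i) - 2)
                pos = n
            } else if (length($i) > 2) {     # keep only words of 3+ characters
                w[++n] = $i
            }
        }
        if (!pos) next                       # marker was not a whole field: skip line
        lo = (pos - 3 > 1) ? pos - 3 : 1     # window start, clamped to first word
        hi = (pos + 3 < n) ? pos + 3 : n     # window end, clamped to last word
        s = w[lo]
        for (k = lo + 1; k <= hi; k++)
            s = s " " w[k]
        print s
    }'
}

# Example with a made-up input line:
printf '%s\n' 'The <bass> part of the song' | context_words
# prints: The bass part the song
```

Collecting the surviving words into an array first means the window of three
words on each side is counted after the short words are gone, which is the
same effect the second gsub() was aiming for.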




