Re: parsing a text file

by Ed Morton <morton@[EMAIL PROTECTED] > Apr 8, 2008 at 07:16 AM

On 4/7/2008 9:07 PM, ric wrote:
> On Apr 7, 5:05 pm, Janis Papanagnou <Janis_Papanag...@[EMAIL PROTECTED]> wrote:
> 
>>ric wrote:
>>
>>>On Apr 7, 4:15 pm, Janis Papanagnou <Janis_Papanag...@[EMAIL PROTECTED]> wrote:
>>
>>>>[ Please don't top-post.]
>>>
>>>>ric wrote:
>>>
>>>>>Hi Kenny,
>>>>
>>>>>Well, I'm an AWK newbie, this is my actual code (a friend tried to help
>>>>>me):
>>>>
>>>>>~ric)cat script2.awk
>>>>>#!/usr/bin/awk -f
>>>>
>>>>>BEGIN { nStackPtr = -1;}
>>>>
>>>>>function push(x) {  stack[++nStackPtr] = x; }
>>>>
>>>>>function pop() {
>>>>> if (nStackPtr == -1) return  nStackPtr;
>>>>
>>>>> return stack[nStackPtr--];
>>>>>              }
>>>>> {
>>>>
>>>>>      for (nField = 1;nField <= NF;nField++)
>>>>>      if (match($nField, "<")) {
>>>>>         nPrevious = 0;
>>>>>         for (nLoop = nField-0;nLoop > 0 && nPrevious < 3;nLoop--)
>>>>>               if (!match($nLoop, "<")) {
>>>>>                 push($nLoop);
>>>>>                 nPrevious++;
>>>>>                                         }
>>>>>         for (nLoop = 0;nLoop < nPrevious;nLoop++)
>>>>>           printf("%s ", pop());
>>>>>           printf("%s", $nField);
>>>>
>>>>>       nFollowing = 0;
>>>>>       for (nLoop = nField+0;nLoop < NF && nFollowing < 3;nLoop++)
>>>>>         if (!match($nLoop, "<")) {
>>>>>               printf(" %s",  $nLoop);
>>>>>               nFollowing++;
>>>>>                                   }
>>>>>         printf("\n");
>>>>
>>>>>}
>>>>
>>>>>} # main
>>>>>~ric)
>>>>
>>>>The main difference between your approach and the program I posted
>>>>elsethread seems to be that you have introduced some unnecessary
>>>>stack while my solution builds the result by concatenating strings.
>>>>(There seem to be a couple of bugs in your program as well.)
>>>
>>>>Janis
>>>
>>>hi Janis,
>>>Ops,sorry for top-post :)
>>
>>> Your solution,it's much better than the first one.
>>
>>>>>Your data format makes a solution somewhat ugly (see missing space
>>>>>after the sentence in " works.This ").
>>>>
>>>I change the data format, have a space:
>>
>>>#cat file
>>>I went fishing for some sea <bass> .
>>>The <bass> part of the song is very moving .
>>>he proposed an elaborate <program> of public works . This information
>>>was taken from the magazine.
>>>the <program> required several hundred lines of code .
>>
>>>The only problem is that it keeps printing after the "."
>>
>>>"proposed elaborate program public works This"
>>
>>>and it should say:
>>>"proposed elaborate program public works"
>>
>>Okay, as you like...
>>
>>     { gsub (/[,;:]/, " ", $0) ; gsub (/[.]/, " . ", $0) }
>>     /<.*>/ { for (i = 1; i <= NF; i++)
>>                if ($i~/<.*>/) { s = substr ($i, 2, length($i)-2)
>>                  c = 0
>>                  for (j = i-1; j > 0 && c != 3 && $j != "." ; j--)
>>                    if (length($j)>2) { s = $j FS s ; c++ }
>>                  c = 0
>>                  for (j = i+1; j <= NF && c != 3 && $j != "." ; j++)
>>                    if (length($j)>2) { s = s FS $j ; c++ }
>>                }
>>              print s
>>     }
>>
>>Janis
>>
>>
>>
>>
>>>Thanks very much,really
>>>   -ric
>>
> 
> Hi Janis,
> It's awesome!,your solution works perfectly.
> 
> Just the last help,
> Based on your code, I tried to use "<head>" and "</head>" instead of "<"
> and ">" in the text file.
> 
> bash# cat file
> I went fishing for some sea <head>bass</head> .
> The <head>bass</head> part of the song is very moving .
> he proposed an elaborate <head>program</head> of public works . This
> information was taken from the magazine.
> the <head>program</head> required several hundred lines of code .
> bash#
> 
> I just try to adapt this in your awk code,but it's lil hard to
> understand, hehe :$
> 
> ...
> /<.*>/
> ...
> if ($i~/<.*>/) { s = substr ($i, 2, length($i)-2)
> ...
> 
> 
> regards,
>    -ric

The posted code so far handles sentences that end in ".". What if they end
in "!" or "?"? Are there any other punctuation characters that have
meaning? The posted code also assumes that newlines have meaning wrt when
to start/stop counting "words". Do they really, or is a period the true
"end of record" character?

	Ed.
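
To make those questions concrete, here is a minimal sketch of one possible
answer. It is not the code Janis posted. It assumes that ".", "!" and "?"
all end a sentence, that newlines carry no meaning, and that each
<head>...</head> pair wraps a single word with no spaces inside. The
function name context_words is made up for illustration. As in the code
quoted above, neighbours of one or two characters are skipped and at most
three are kept on each side.

```shell
# context_words [FILE...]: hypothetical helper, not from the thread.
# For every <head>word</head>, print the word with up to three neighbouring
# words (longer than two characters) on each side, never crossing the end
# of a sentence. Assumption: a sentence ends at ".", "!" or "?".
context_words() {
    awk '
    {
        gsub(/[,;:]/, " ")        # minor punctuation becomes a separator
        text = text " " $0        # join all lines; newlines are ignored
    }
    END {
        # Assumption: the sentence terminator, not the newline, is the
        # true end of record, so split the whole input into sentences.
        ns = split(text, sentence, /[.!?]/)
        for (k = 1; k <= ns; k++) {
            n = split(sentence[k], w, " ")
            for (i = 1; i <= n; i++) {
                if (w[i] !~ /^<head>.*<\/head>$/)
                    continue
                # "<head>" is 6 characters and "</head>" is 7, 13 in total
                s = substr(w[i], 7, length(w[i]) - 13)
                c = 0
                for (j = i - 1; j > 0 && c < 3; j--)
                    if (length(w[j]) > 2) { s = w[j] " " s; c++ }
                c = 0
                for (j = i + 1; j <= n && c < 3; j++)
                    if (length(w[j]) > 2) { s = s " " w[j]; c++ }
                print s
            }
        }
    }' "$@"
}
```

Run as "context_words file" on the <head> sample quoted above, this should
print:

for some sea bass
The bass part the song
proposed elaborate program public works
the program required several hundred

Splitting on every "." is crude: an abbreviation such as "e.g." or a number
such as "3.5" would end a sentence early, so the real data may need a
stricter rule.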



