Re: parsing a text file

by ric <ricardo7@[EMAIL PROTECTED] > Apr 8, 2008 at 08:03 AM

On 8 Apr, 06:58, Ed Morton <mor...@[EMAIL PROTECTED]> wrote:
> On 4/8/2008 7:47 AM, Janis wrote:
>
> > On 8 Apr., 14:16, Ed Morton <mor...@[EMAIL PROTECTED]> wrote:
>
> >>The posted code so far handles sentences that end in ".". What if they end in
> >>"!" or "?"? Are there any other punctuation characters that have meaning? The
> >>posted code also assumes that newlines have meaning wrt when to start/stop
> >>counting "words". Do they really or is a period the true "end of record" character?
>
> > I'd take the pragmatic approach, add ? and ! to the character
> > set if necessary, and wait for new requirements as soon as the
> > OP gets aware of any. Experience shows that it needs a lot of
> > forth-and-back postings to get somewhat complete requirements,
> > but mostly that isn't necessary here and can be fixed on the fly.
>
> Yes, but I'm thinking the approach should probably be changed to use an RS
> that's whatever set of characters really represent the end of a "sentence" which
> would introduce a fair amount of churn in the script and may warrant a
> gawk-specific solution so it's worth poking at the requirements a bit before
> going any further.
>
> > BTW, initially I had used [[:punct:]] in the program I posted
> > but for apparent reasons (< and >) that was not appropriate.
>
> and now I'm thinking that after peeling the requirements onion a bit more we MAY
> end up suggesting xmlawk or some such instead...
>
>         Ed.


Good morning Janis and Ed,

I think you are right. I would prefer to change the file format to an
XML-tagged style to avoid all these little problems; XML looks like
the best fit for this job.

#cat newfile
<instance id="bass.v.bnc.001" docsrc="BNC">
<context>
I went fishing for some sea <head>bass</head> .
</context>
</instance>

<instance id="bass.v.bnc.002" docsrc="BNC">
<context>                    <--- it's ok, can contain a period
The <head>bass</head> part of the song is very moving.
</context>
</instance>

<instance id="program.v.bnc.001" docsrc="BNC">
<context>                    <--- can also end without a period
he proposed an elaborate <head>program</head> of public works . This
information was taken
</context>
</instance>

<instance id="program.v.bnc.002" docsrc="BNC">
<context>
the <head>program</head> required several hundred lines of code .
</context>
</instance>

<instance id="smell.v.bnc.001" docsrc="BNC">
<context>                    <--- in a single line, and ends with "?"
It 's  making me annoyed .I did n't want to stay there and I did n't
want to go to Combe Court , cos I hate it and it <head>smells</head>
and the Captain slobbers in his food and Christmas is horrible with no
good prezzies and Annie not there . Why did n't you visit me ?  Why
not ?
</context>
</instance>



The script should return exactly this:
#----------------------------------
for some sea bass
The bass part the song
proposed elaborate program public works
the program required several hundred
cos hate and smells and the Captain
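
For illustration only, here is a rough sketch of one rule that seems to reproduce the five lines above. The rule itself is an assumption inferred from the sample, not a stated requirement: keep up to 3 words on each side of the <head> word, ignore words shorter than 3 characters, and do not cross a ".", "?" or "!". The function name head_context and the variables WINDOW and MINLEN are made up for this sketch. It uses plain POSIX awk, and it assumes the arrow annotations shown in the sample are not really in the file.

```shell
# head_context is a hypothetical helper name; WINDOW and MINLEN are guesses
# inferred from the sample output, not stated requirements.
head_context() {
    awk -v WINDOW=3 -v MINLEN=3 '
    # Return up to WINDOW words of at least MINLEN characters from s:
    # the last ones when from_end is 1, otherwise the first ones.
    function pick(s, from_end,    w, keep, k, m, i, first, last, out) {
        k = split(s, w, " ")
        m = 0
        for (i = 1; i <= k; i++)
            if (length(w[i]) >= MINLEN)
                keep[++m] = w[i]
        first = from_end ? m - WINDOW + 1 : 1
        last  = from_end ? m : WINDOW
        if (first < 1) first = 1
        if (last > m)  last = m
        out = ""
        for (i = first; i <= last; i++)
            out = out (out == "" ? "" : " ") keep[i]
        return out
    }

    # Print the head word with its left and right context for one <context> block.
    function emit(text,    s, e, before, head, after, out) {
        s = index(text, "<head>")
        e = index(text, "</head>")
        if (s == 0 || e < s) return
        before = substr(text, 1, s - 1)
        head   = substr(text, s + 6, e - s - 6)
        after  = substr(text, e + 7)
        sub(/.*[.?!]/, "", before)   # drop everything up to the last terminator before the head
        sub(/[.?!].*/, "", after)    # drop everything from the first terminator after the head
        out = pick(before, 1)
        out = (out == "" ? "" : out " ") head
        after = pick(after, 0)
        if (after != "") out = out " " after
        print out
    }

    /<context>/   { inside = 1; text = ""; next }   # the rest of this line is ignored
    /<\/context>/ { inside = 0; emit(text); next }
    inside        { text = text " " $0 }            # join wrapped lines with a space
    ' "$@"
}
```

After defining the function in a POSIX shell, "head_context newfile" should print one line per <context> block. Setting RS to a regular expression, as Ed suggests, would be a gawk-specific alternative; this sketch avoids it by collecting each block by hand.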


Best regards from Central America! ;)




