Talk About Network



Register and Login
Nick
Password
Register create new account Sign up is FREE and you can post replies, new topics, bookmark posts and more!
Recover lost password


Programming > Awk > Re (2): Can awk...
Latest [ Topics | Posts ] Archive Post A New Topic Post a Reply
<< Topic < Post Post 4 of 7 Topic 2193 of 2236
Post > Topic >>

Re (2): Can awk easily simulate an AFSM

by problems@[EMAIL PROTECTED] Mar 18, 2008 at 05:46 AM

In article <645afqF29q67nU1@[EMAIL PROTECTED]
>,
=?ISO-8859-1?Q?J=FCrgen_Kahrs?= <Juergen.KahrsDELETETHIS@[EMAIL PROTECTED]
> wrote:


> problems@[EMAIL PROTECTED]
 schrieb:
> 
> >  I've just looked at awk's docos for the first time and it seems
> > suitable for implememting an augumented-finite-state-machine.
> 
> Right.
> 
> > At each/most node there's an action to be performed.
> > Some nodes are not terminals/leaves and just lead to 
> > another nested syntax-diagram via push & return via pop.
> 
> What you describe is a recursive-descent compiler.
> Terminals are consumed by the scanner.
> Non-terminals are handled by a dedicated function.
>  
Well in terms of traversing the AFSM, at terminals the corresponding
actions are performed, which is why I thought awk might be suitable,
and at non-terminals, eg. after the "BEGIN" of a procedures outer
block, the train-on-the-tracks [=prog.counter] jumps to the 
<STATEMENT> sub-diagram, which just consists of a sequence of
lines of: (token, action set); somewhere else on the AFSM 
representation.    Importantly this AFSM representation is just
data, which is interpreted, by traversing the paths and doing the
corresponding action.

> > I've built several p-code compilers like this.
> 
> Me too, with AWK.
> 
> > From what I see from awks docos, it's very suitable for
> > doing the [token,action] task, as the source-code is
> > scanned into tokens and passed to awk.
> 
> Right, but remember that the traditional distinction
> between a low-level "scanner" (with regular expressions)
> and a high-level "parser" (doing recursive-descent parsing)
> should be made.
> 
At the time I did it [70s] I didn't know about regex, and the
p-code compiler was multipass, so 'scanning' meant converting
each token to either a 
* single-char for reserved-words, or
* a-number: in an array of numbers [the desktop HP programmable
calculator did not distinguish between interger & real types, it
just auto converted where appropriate] or
* a-name: store in an array of names.

> > As I remember, at the branch-points a one token look ahead
> > only was needed for the pascal-like syntax. 
> 
> Yes, it was called the LL(1). Wirth always designed
> his languages so that they can be parsed with a
> lookahead of one symbol.
> 
At that time I didn't know about LL(1), nor have I studied 
compliler theory up to now.  What I liked about this recursive
descent simplicity, was that while compiling [but less so with 
my multipass HP-calculator version] the source can/is just 
displayed, and where it stops IS the error-found's location.

> > Can awk handle this ?
> > 
> > How would it handle the branching, eg. the sub-diagram for
> > statement would be like:
> > Case TOKEN
> >    'IF': next state := IF-node;
> >    'WHILE: next state := WHILE-node;
> > ...
> 
> I have implemented a compiler for PL/0 and I
> did it this way:
> 
>   http://en.wikipedia.org/wiki/PL/0
> 
OK, I fetch and examine that.

> function Statement(     m, n) {
>   if (EatSym("ident", -1)) {
>     n =  FindIdent(tokenprev, "variable")
>     EatSym(":=", 14)
>     Expression()
>     Encode(0xb3, n, 2)
>   } else if (EatSym("CALL", -1)) {
>     EatSym("ident", 12)
>     Encode(0xb8, FindIdent(tokenprev, "procedure"), 2)
>   } else if (EatSym("BEGIN", -1)) {
>     do { Statement() } while (EatSym(";", -1))
>     EatSym("END", 9)
>   } else if (EatSym("IF", -1)) {
>     Condition()
>     EatSym("THEN", 10)
>     n = pc
>     Encode(0xa0, 0, 2)
>     Statement()
>     PatchAddr(n, pc)
>   } else if (EatSym("WHILE", -1)) {
>     n = pc
>     Condition()
>     EatSym("DO", 1)
>     m = pc
>     Encode(0xa0, 0, 2)
>     Statement()
>     Encode(0xa7, n-pc, 2)   # goto
>     PatchAddr(m, pc)
>   }
> }
> 
OK, apparently "Encode" is the code generation.
Since I haven't worked with my stuff for many years,
I only now realise that awk would give me a program
driven implementation, rather than a data-driven AFSM,
which is what I want.

So, looking at your version I'd have had:
....
IF: Condition()
THEN: n = pc; Encode(0xa0, 0, 2); Statement(); PatchAddr(n, pc);
ELSE: <set of actions>
  and AFAI-can remember, each node lead to the next-one-down,
  by default, except when the field/column, perhaps left-most/
before the token, was not empty and gave the next-node/line's
number.

What brought me onto this query, is that I need to clean-up
texts which I fetch everyday, by deleting eg. all text between:
"["<1 to 3 numerics> "]" "Reply" <rest of line>
<between 3 an 5 lines>
<start of line> "author"  <rest of line>.

And it seems that awk can easily handle this.
And I don't want to invest in learning awk just to do this task.
So I thought perhaps I could use awk to do other usefull tasks.

Apparently I need first to get familiar with *nix regex to
clean-up my fetched texts ?

Apparently my ideas of an AFSM compiler is closer to the 
table-driven compilers ?

clean-up my fetched texts ?

Thanks for any feedback,

==Chris Glur.




 7 Posts in Topic:
Can awk easily simulate an AFSM
problems@[EMAIL PROTECTED  2008-03-16 11:51:49 
Re: Can awk easily simulate an AFSM
=?ISO-8859-1?Q?J=FCrgen_K  2008-03-16 20:24:41 
Re: Can awk easily simulate an AFSM
Janis Papanagnou <Jani  2008-03-18 04:00:05 
Re (2): Can awk easily simulate an AFSM
problems@[EMAIL PROTECTED  2008-03-18 05:46:17 
Re: Re (2): Can awk easily simulate an AFSM
=?ISO-8859-1?Q?J=FCrgen_K  2008-03-18 19:35:53 
Re (3): Can awk easily simulate an AFSM
problems@[EMAIL PROTECTED  2008-03-20 06:02:39 
Re: Re (3): Can awk easily simulate an AFSM
=?ISO-8859-1?Q?J=FCrgen_K  2008-03-20 20:28:16 

Post A Reply:
  Go here to Signup

AddThis Feed Button


About - Advertising - Contact - Frequently Asked Questions - Privacy Policy - Terms of Use - Signup

Contact
tan12V112 Sat May 17 1:01:04 CDT 2008.