Doug Hoffman <no.spam> wrote:
> Jonah Thomas wrote:
>
> > A local name works only inside the one definition and not in
> > factored parts of it. So each part you factor out has to get its
> > data marshaled onto the stack and converted into fresh locals. That
> > makes it harder to factor.
>
> Perhaps an example is needed. I simply don't follow this.
Well, here's an example although the code itself isn't very good. It's
something I was playing with and I haven't exactly gotten it in the sort
of shape I'd like to publish, but what the hell....
I was doing something that caused me enough stack trouble that I used a
global variable. The main problem was I was using one or two strings of
the form address&count and they required double stack operations and
ROTs and so on, and it was a lot of trouble. Here's the biggest routine
after I factored it some. A lot of the factoring was toward improving
readability.
: check-a-node ( len' c-addr len -- len' false | xt mode )
compare-substrings split-3flag \ len' -1|0 -n|0|n
IF go-left-or-right-node 0 EXIT THEN
DROP DUP split-3flag IF
IF go-middle-node ELSE no-match THEN
ELSE
DROP node @[EMAIL PROTECTED]
get-match-mode
?DUP IF
node @[EMAIL PROTECTED]
get-match-xt SWAP
ELSE
no-match
THEN
THEN ;
This probably doesn't make much sense without a lot of explanation, it's
a sort of tree where each node has up to three daughter nodes, left
middle and right, and each of them has a string associated with it.
Depending on how the string on the stack compares to the reference
string you choose one of the nodes. If you run out of string you're done
and your search has succeeded or failed. If it succeeds it returns two
values.
Here are some of the factored routines:
: compare-substrings ( c-addr len -- -1|0|1 )
node @[EMAIL PROTECTED]
get-reference-string-length MIN
node @[EMAIL PROTECTED]
get-reference-string-addr OVER COMPARE ;
: split-3flag ( -1|0|1 -- -1|0 -1|0|1 )
DUP 0< SWAP ;
\ right node address is 2 cells beyond left node address
: go-left-or-right-node ( flag -- node )
0< 0= 2 CELLS AND \ n|0
node @[EMAIL PROTECTED]
+ @[EMAIL PROTECTED]
node ! ;
: no-match ( -- false )
0 DUP node ! ;
It would be even bigger without factoring. Maybe there's a better way to
do it? The node variable was an obvious choice. I started out trying to
put it on the return stack but it gets used in a lot of the factors, and
that means doing R@[EMAIL PROTECTED]
before each factor and then likely as not another >R
inside the called routine. It was a lot of bother and also I find it
harder to debug routines that use the return stack. Fine if they're all
so simple they're obviously right, but....
I have the idea that locals would not improve this much. I would still
have to distribute copies of all needed values to each factor. It would
be a little easier because I wouldn't have to jockey them in place, I
could just name them in the preferred order. But I expect it would not
be very useful. I could do
\ this local is no worse than the global I used
: compare-substrings' ( c-addr len node -- -1|0|1 ) { node }
node get-reference-string-length MIN
node get-reference-string-addr OVER COMPARE ;
\ this one is a little better
: go-left-or-right-node ( node flag -- node )
0< 0= 2 CELLS AND \ n|0
+ @[EMAIL PROTECTED]
;
: no-match -- can't be factored with locals. Simple enough it doesn't
really need it.
\ not bad.
: check-a-node' ( len' c-addr len node -- len' node' false | xt mode )
{ len' addr len node }
addr len node compare-substrings split-3flag \ -1|0 -n|0|n
IF len' node ROT go-left-or-right-node 0 EXIT THEN
DROP len' split-3flag IF
IF go-middle-node ELSE len' 0 0 THEN
ELSE
DROP node get-match-mode
?DUP IF
node get-match-xt SWAP
ELSE
len' 0 0
THEN
THEN ;
I could still do most things with locals, without much more trouble than
a global variable and it's good practice to put the value on the stack
instead of passing implicitly with a global variable, it's more obvious
what's getting used. The locals version doesn't really look any worse
than the other, maybe a bit better.
But this is still a big definition, with doubly-nested IFs. If I wasn't
focused so much on details I might have done pseudocode like this:
: check-a-node''
substrings-different? IF
choose-left-or-right-node ELSE
lengths-equal? IF
node-is-leaf? IF
item-found ELSE
item-not-found
THEN
THEN
ELSE
end-of-string? IF
item-not-found ELSE
choose-middle-mode
THEN
THEN ;
Then it would be obvious I want to break it into two big chunks. And my
scope for locals would be much reduced.
I tend to think the reason locals work so well for me this time is that
I'm already doing a bad job of writing the code.


|