> >> http://cs-www.cs.yale.edu/homes/dvm/papers/lisp05.pdf
> > I have no way to read PDF files here.
> From: moi <r...@[EMAIL PROTECTED]
>
> I dumped the PDF into ASCII for you:
Hey, thanks. The way you did it is much better than the way Google
does it. Your way, it's actually legible! How did you do it?
> During Lisp software development, it is normal to revise and
> reload programs and data structures continually.
Although that's vaguely similar to what I do, there's a major
difference: I do all my editing on my Macintosh, then copy&paste
across dialup modem to Unix, the only place where CMUCL is
available for me to use, where I do all line-by-line testing, all
full-function testing, and all R&D testing. When I need to restart
Lisp, because I lost my dialup connection or it's another day and
I'm logging in again, I upload (to Unix) all the files that have
changed locally since last uploaded, then start Lisp and run the
initialization sequence to get all active source files loaded into
Lisp. Then I can start re-building whatever data I have in the Lisp
environment. In the past I needed to recompute everything from
original input, so I ran a script for doing that. Now with my
automatic dataflow software, I just call a few functions to request
bringing up-to-date the input for whatever I'm in the middle of
developing, which automatically loads expensively-computed data
from disk instead of recomputing it. Thus I have no need to
automatically keep track of write-date on sourcefiles, and it
wouldn't do any good because there's no way to automatically upload
files from Macintosh to Unix. I only need to watch timestamps on
the data, not the sourcefiles. The system described in this PDF
(now ASCII text) document deals with both sourcecode and data.
A "chunk" per the paper is similar to a "control point" as I used
the term. Perhaps "chunk" is the data itself and "control point" is
the logical record that tells about the "chunk" and determines how
the data will be automatically brought up to date if it isn't
already. Thus from a nitpicky technical point, chunks and control
points are different aspects of the same process, but they match
1-1 so we can talk about either just the same without going astray.
Most of the time a piece of data either is or is not generated or
loaded already, and the timestamp is overkill. The main place where
timestamps would be useful is if I change the definition of how
some data in the chain is computed, such as if I change the
ProxHash algorithm to use a different random number generator. I
would simply delete the backup copy of that one data value from
both Lisp memory and disk, thereby forcing the dataflow system to
re-compute it and re-save it. At that point, since the timestamp is
the date saved, all data dependent on it would show as obsolete and
needing re-computing if I ever ask for them. The timestamps would
save me the burden of trying to manually invalidate each later data
value, and the risk of overlooking one of them and ending up with
inconsistent values.
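The invalidation argument above can be sketched in a few lines. This is only an illustration, not the code described in the post: the names cp, cp-stale-p and cp-invalidate are hypothetical, and small integers stand in for real save dates.

```lisp
;; Hypothetical sketch: none of these names come from the post or the paper.
(defstruct cp
  name
  (inputs '())   ; list of CP structures this one is computed from
  (value nil)
  (stamp nil))   ; integer "date saved", or NIL if there is no saved value

(defun cp-stale-p (cp)
  "True if CP has no saved value, or any input is stale or was saved later."
  (or (null (cp-stamp cp))
      (some (lambda (in)
              (or (cp-stale-p in)
                  (> (cp-stamp in) (cp-stamp cp))))
            (cp-inputs cp))))

(defun cp-invalidate (cp)
  "Forget the saved value, e.g. after changing the algorithm that computes it."
  (setf (cp-value cp) nil
        (cp-stamp cp) nil)
  cp)
```

Invalidating one control point and later re-saving it with a newer stamp makes every control point downstream of it report as stale, with no need to invalidate each of them by hand.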
> The result is that the state of the Lisp process can become
> "incoherent," with updates to "supporting chunks"
> coming after updates to the chunks they support.
Yes, that's the basic problem expressed nicely.
> The word chunk is used here to mean any entity, content, or
> entity association, or anything else modelable as up to date or out
> of date. To maintain coherence requires explicit management of an
> acyclic network of chunks, which can depend on conjunctions and
> disjunctions of other chunks;
Yes. The key is that it's acyclic, else it's impossible to
terminate recursion. This kind of dataflow is very different from
the feedback loops to converge on fixed points of functions during
interval arithmetic calculations.
Conjunctions are easy to understand: One resultant chunk depends
on two supporting chunks. (One control point has two inputs.)
Disjunctions aren't so obvious. Is this like when there might be a
backup copy of the data on disk, so that the value can be either
re-computed or loaded from the backup file, whichever source is more
recent, and if the output is already as recent as both the latest
supporting chunk and the saved copy then neither re-compute nor load
is needed?
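The reading of "disjunction" proposed in that question can be written down as a small decision function. This sketches only that guess, not the paper's definition; the name choose-action and the use of integer stamps are assumptions made for illustration.

```lisp
;; Hypothetical sketch of the "recompute OR load from backup" reading.
;; OUTPUT-STAMP and BACKUP-STAMP may be NIL (nothing in memory / on disk);
;; INPUTS-STAMP is the latest stamp among the supporting chunks.
(defun choose-action (output-stamp inputs-stamp backup-stamp)
  "Return :NOTHING, :LOAD or :RECOMPUTE."
  (let ((backup (or backup-stamp -1)))   ; -1 sorts before any real stamp
    (cond ((and output-stamp
                (>= output-stamp inputs-stamp)
                (>= output-stamp backup))
           :nothing)                      ; in-memory value is already current
          ((>= backup inputs-stamp)
           :load)                         ; disk copy is at least as new as the inputs
          (t
           :recompute))))                 ; inputs changed after both copies
```

For example, (choose-action nil 3 4) picks :load because the disk copy is newer than the inputs, while (choose-action 2 5 4) picks :recompute because the inputs changed after both copies were made.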
> the built-in facilities of Lisp do not address the coherence
> problem in any systematic way.
Agreed.
> For example, although it is easy to reload a file after making
> some bug fixes, it often happens that the reloaded file initialized
> some table, and entries were made in it by files loaded later.
I don't write my code that way. Loading a file doesn't initialize a
table. Instead, loading a file merely makes available the functions
needed to initialize the table, and the functions which determine
under what circumstances the table would need to be initialized. If
the table has already been initialized, the reloaded software won't
have any reason to require it to be initialized again.
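A minimal sketch of that style, assuming a hypothetical table named *word-table* (the post does not name its tables or functions): the file defines only the variable and the functions, and the table is created on demand.

```lisp
;; Hypothetical names throughout. DEFVAR does not reassign a variable that
;; is already bound, so reloading this file leaves an existing table alone.
(defvar *word-table* nil
  "Hash table built on demand, or NIL if it has not been built yet.")

(defun word-table-initialized-p ()
  "True once the table exists."
  (hash-table-p *word-table*))

(defun ensure-word-table ()
  "Create the table only if it does not exist yet, then return it."
  (unless (word-table-initialized-p)
    (setf *word-table* (make-hash-table :test #'equal)))
  *word-table*)
```

Code that adds entries calls (ensure-word-table) first, so entries made by files loaded later survive a reload of this one.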
> There is nothing really wrong with restarting. It often requires
> you to take special measures to get back to the point you were at
> before the restart.
This is why I used to have a script that computed all the values
that were needed for my current R&D work, and why *now* I have
instead the automatic dataflow to compute or reload those same
values in a more optimal way.
Note that several years ago I had a weaker form of automatic
dataflow. It used timestamps to load or recompute data as needed,
but if I wanted to save to disk I needed to call that function
manually. It also used two globals per control point, one of which
was the actual data value, and one of which was the timestamp and
other info, each stored in the value cell of its own global symbol. The new
automatic dataflow is essentially a refactoring of that old code to
have only a single symbol per control point, using properties
rather than value cell to store timestamp and data value (and
eventually other info about the control point).
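The single-symbol layout might look roughly like this. It is a sketch under assumptions: the function names and the property indicators :dataflow-value and :dataflow-stamp are invented here, and only the standard GET and REMPROP operators are relied on.

```lisp
;; Hypothetical sketch: one symbol per control point, with the data value
;; and its timestamp kept on the symbol's property list instead of in
;; the value cells of two separate globals.
(defun point-store (symbol value stamp)
  "Record VALUE and the time it was saved on SYMBOL's property list."
  (setf (get symbol :dataflow-value) value
        (get symbol :dataflow-stamp) stamp)
  symbol)

(defun point-fetch (symbol)
  "Return the stored value and its timestamp as two values."
  (values (get symbol :dataflow-value)
          (get symbol :dataflow-stamp)))

(defun point-forget (symbol)
  "Drop the stored value and timestamp so the point must be rebuilt."
  (remprop symbol :dataflow-value)
  (remprop symbol :dataflow-stamp)
  symbol)
```

Further information about a control point can be added later under new indicators without introducing more globals.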

