Robert Maas, see http://tinyurl.com/uh3t
wrote:
>> >> ... Mathematica 6 addresses this problem by
>> >> rejecting all recursive definitions. ...
>
> OK, I took this to mean that Mathematica 6 rejects all recursive
> definitions, absolutely, not just its compiler (which had the *bug*
> in earlier versions), presumably so that interpreted and compiled
> behaviour will be consistent. On that basis I wrote:
Sorry, I should have been more explicit when I wrote that.
>> > IMO that's a showstopper, a complete language rejector!
>
>> From: Jon Harrop <use...@[EMAIL PROTECTED]
>
>> To be fair, this only affects compiled anonymous functions.
>> Mathematica can still interpret recursive functions but it is
>> very slow (~1,000x slower than a compiled language in general).
>
> Are you talking about the earlier versions that merely had a bug,
> or the new major-version 6 which "rejects all recursive
> definitions" in the words of the other poster?
Previous versions hung, the latest version bails with an error message
stating that Compile does not sup****t recursion (AFAIK).
>> >> My PhD was largely on wavelets so I obtained Wolfram Research's
>> >> own ($595!) WaveletExplorer add-on only to discover that they deal
>> >> solely with discrete wavelets and completely ignore continuous
>> >> wavelets.
>> > Why didn't you ask around on newsgroups for consumer feedback, like
>> > the book reviews on Amazon.com, before making such a significant
>> > purchase for which you'd need a requisition rather than take
>> > (shake?) out of the petty cash jar (piggy bank?)?
>> Probably wouldn't have done much good, as it happens. Computer
>> scientists seem to be blissfully unaware of continuous wavelets and
>> often teach and learn that only discrete wavelets are significant.
>
> Well, with nearly six hundred dollars to spend, you should at least
> have *tried* to get some useful info prior to purchase...
That's exactly what I did. I never actually bought the software. I used
WRI's do***entation to find an obscure file name from its distro and
Googled for a user who had been stupid enough to ex****t their copy of the
software. Then I downloaded it and installed it in order to try it out.
Finding out that it didn't provide any of the wavelet-related features I
required, I deleted it and didn't buy it.
>> I ****ted my ray tracer benchmark to Mathematica for a laugh and
>> it runs 30x slower than OCaml's interpreted bytecode, let alone its
>> optimizing native-code compiler.
>
> I take it your benchmark is recursive? Is there any reason you
> couldn't have re-written it slightly to emulate stacks by linked
> lists, thereby avoided recursion, so that you could have gotten it
> compiled?
The benchmark is recursive but Mathematica actually spends a lot of its
time
evaluating ray-sphere intersections which is a straightforward block of
code (no recursion). Mathematica's Compile does work on that and it does
increase performance a bit but still nothing like as fast as a native-code
executable.
> Note that there are multiple levels of interpreting. Reference to
> Lisp and Java and Basic are useful here:
> - Source-text-level interpreting, as with the original BASIC
> ...
> - Parse-tree interepreting, such as what was done by Lisp EVAL
> ...
> - Byte-code dispatching/coding, such as the JVM (Java Virtual
> ...
Mathematica's generic term rewriting is most closely approximated by what
you just called "Parse-tree interepreting", also known as "term-level
interpreting".
OCaml bytecode is an example of interpreted bytecode using an optimized
interpreter written in C.
> Consequently the "let alone ..." remark may be out of line. If
> byte-code is the fastest available for large software programs
> where cache performance dominates over instruction execution per
> se, then perhaps the benchmark is running only 30x as slow as the
> fastest possible on that architecture? Do you have a direct
> comparison of byte-code and native-code (both with OCaml) to tease
> out the full truth of these comparisons? (Just curious.)
Sure, I've still got the code here. OCaml's native code compiler is 17x
faster than OCaml's bytecode interpreter on this benchmark. So Mathematica
is ~500x slower than native-code compiled OCaml.
>> I would copy the OCaml implementation and use immutable balanced
>> binary trees.
>
> I presume you mean the algorithms where trees are updated by
> creating new structure for all the parts that changed and sharing
> any sub-trees that haven't changed? Thus you can collate two trees
> (for all the usual purposes: all the nontrivial boolean functions:
> and ior xor eqv nand nor andc1 andc2 orc1 orc2; all the nontrivial
> predicates: equal notequal subset notsubset backsubset
> notbacksubset eithersubset neithersubset) making just one pass over
> most/all of the nodes (in some cases you can skip a whole sub-tree
> in one of the given trees if there's a gap between adjacent
> elements in the other tree)?
Exactly, yes.
> I suppose in theory you can re-write those algorithms to emulate
> the stack via linked lists, then you will be able to compile the
> code in Mathematica? That would be a big pain of course, but has
> anybody tried it?
Ugh. Yes, I think you're right. I've never seen that done though.
> It would actually be of general interest to compare true
> machine-stack recursion against emulation of recursion via pu****ng
> and popping on linked lists, where both modes are available for use
> by compiled code, to see whether there's any signficant speed
> advantage to true machine-stack recursion.
I'm actually interested in something similar because I'd like to generate
garbage collected programs using LLVM. They provide a "shadow stack"
algorithm that does exactly that. I still haven't found the time to play
with it but LLVM is an awesome project...
> It would be especially
> interesting to try the comparison on systems that have multiple
> processors, where the application programmer has control over when
> thread-splitting may occor in the algorithm:
> ...
> Threads are a standard part of Java. Maybe I should try writing
> this in Java? First I need to implement thread-split-let of course!
That would be interesting but I'd reach for F# rather than Java (assuming
you have a Windows box). You might also be interested in the SML/NJ
compiler for SML which does something similar (transformation to
continuation passing style) to eliminate all system stack usage.
> For my benchmark, I need an algorithm where traversing the two
> trees, not performing calculations, dominates the process. Thus my
> divide-and-conquer version of factorial would *not* be a good
> example. Maybe something like give two large sets (as trees), or
> other large collection of data stored in a tree, which are almost
> identical, find all the (very small number of) differences. To make
> reasonable data, I'd read in a corpus of text, building a
> left-context tree from it, then read in a little bit more text,
> which includes just a few additinal left contexts, then compare the
> old and new tree to show what new contexts were added? Only that
> final comparison (find all differences) would be timed.
Yes, I think that would be an awesome benchmark for such things. I was
actually talking to Simon Peyton-Jones about this recently, in the context
of benchmarking GHC's parallel performance.
There is a practically-relevant program that uses this technique from my
book "OCaml for Scientists", the "n"th-nearest neighbor code:
http://www.ffconsultancy.com/products/ocaml_for_scientists/complete/?cll
That would be much more compelling that Fibonacci or factorial numbers.
You
can even do a funky interactive real-time 3D visualization of the results
as I do in my new book "F# for Scientists". I'll upload the code to this
page ASAP:
http://www.ffconsultancy.com/products/fsharp_for_scientists/?cll
> In any case, four comparisons on any given machine architecture:
> - Normal recursion
> - Thread-splitting recursion
> - Stack emulation of recursion, single thread
> - Stack emulation of recursion, splitting threads
> Does anybody have access to a quad-core or larger machine running
> the JVM which could be applied to this benchmark, where all cores
> are kept busy all the time whenever enough threads are active, if I
> ever get around to writing it in Java?
The biggest multicore I can buy is on my shopping list... ;-)
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?u


|