michele.simionato@[EMAIL PROTECTED] wrote:
> I am not sure about what to do with introspection in SML.
SML doesn't directly support introspection. In cases where something like
introspection is really needed, it is usually encoded with combinators.
For example, my generic programming library
http://mlton.org/cgi-bin/viewsvn.cgi/mltonlib/trunk/com/ssh/generic/unstable/
can be seen as a form of introspection. Basically, using a set of
combinators, the programmer encodes the shapes (or structure) of types.
Generic functions, such as serialization, are then implemented based on
that encoding.
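To make the idea concrete, here is a minimal sketch of such an encoding.
It is not the API of the library linked above: the names rep, int, bool,
pair, list and show are made up for illustration, and a real library would
derive many generic functions from one encoding rather than just a single
pretty-printer.

```sml
(* Hypothetical type-representation combinators (illustrative names only).
   A value of type 'a rep describes the shape of 'a; in this sketch the
   description is simply "how to show a value of that shape". *)
type 'a rep = 'a -> string

val int  : int rep  = Int.toString
val bool : bool rep = Bool.toString

(* Representation of a pair, built from the representations of its parts. *)
fun pair (ra : 'a rep, rb : 'b rep) : ('a * 'b) rep =
   fn (a, b) => "(" ^ ra a ^ ", " ^ rb b ^ ")"

(* Representation of a list, built from the representation of its elements. *)
fun list (r : 'a rep) : 'a list rep =
   fn xs => "[" ^ String.concatWith ", " (map r xs) ^ "]"

(* The generic function: it works for any type whose shape was encoded. *)
fun show (r : 'a rep) (x : 'a) : string = r x

val s = show (list (pair (int, bool))) [(1, true), (2, false)]
(* s = "[(1, true), (2, false)]" *)
```

The programmer writes "list (pair (int, bool))" by hand where a language
with introspection would discover the shape at run time.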
> It seems that introspection capabilities are there, since the
> interactive prompt is able to print the signature of any object, but
> they are somewhat not exposed to the programmer.
Yes, what you see is a capability of the interactive REPL. Some SML
implementations, e.g. MLton and MLKit, do not offer such a REPL.
> Case in point: I was thinking about writing a test runner (*).
I've gone through the same exercise.
I think that my first idea was to just have tests as side-effecting
computations at the top-level:
  testThat "testOnePlusOne" (fn () => 1+1=2) ;
The side effects are in the testThat function.
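A minimal sketch of what such a testThat could look like. The definition
below, including the failures counter, is an assumption for illustration
and not code from any library:

```sml
(* Hypothetical testThat: runs the test immediately and reports the result. *)
val failures = ref 0

fun testThat (name : string) (body : unit -> bool) : unit =
   let
      (* An exception escaping from the test counts as a failure. *)
      val ok = body () handle _ => false
   in
      if ok
      then print ("ok     " ^ name ^ "\n")
      else ( failures := !failures + 1
           ; print ("FAILED " ^ name ^ "\n") )
   end

val () = testThat "testOnePlusOne" (fn () => 1+1=2)
val () = testThat "testOnePlusTwo" (fn () => 1+2=4)   (* deliberately wrong *)
```

Because each test runs as its declaration is evaluated, there is no
separate list of tests to keep in sync.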
As a generalization of this approach, you can collect tests into functors,
so you can define test cases next to the artifacts you are testing:
  fun foo ... = ...
  fun bar ... = ...
  functor TestEm () = struct
     val () = test "foo" (fn () => ...) ;
     val () = test "bar" (fn () => ...) ;
  end
and then invoke the tests later:
  structure ? = TestEm ()
I think I noticed this idea, using functors to collect tests, later from
MLton's source code, which uses it in some places.
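A runnable sketch of the functor approach, under stated assumptions: the
"test" function, the "ran" counter and the function "double" are
hypothetical and only serve to show that nothing runs until the functor is
applied.

```sml
val ran = ref 0   (* number of tests executed so far *)

(* Hypothetical side-effecting test function. *)
fun test (name : string) (body : unit -> bool) : unit =
   let
      val ok = body () handle _ => false
   in
      ran := !ran + 1
    ; print ((if ok then "ok     " else "FAILED ") ^ name ^ "\n")
   end

(* The artifact under test. *)
fun double x = 2 * x

(* Tests defined right next to it; the functor body is not evaluated yet. *)
functor TestEm () = struct
   val () = test "double 2" (fn () => double 2 = 4)
   val () = test "double 0" (fn () => double 0 = 0)
end

val ranBefore = !ran          (* 0: defining the functor runs nothing *)
structure Tests = TestEm ()   (* applying the functor runs the tests *)
val ranAfter = !ran           (* 2 *)
```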
> My idea was to collect tests in structures; each test would have a name
> starting with "test" and a signature unit->bool, returning true if the
> test passed and false otherwise.
I think that my second idea was also to collect tests into a structure.
However, as you have noticed, just collecting tests into a structure
doesn't get you very far.
> The runner should be able to execute all the functions in the structure
> matching the signature and with a name starting with "test".
I personally find that approach an ugly hack. It is only slightly
better than grepping through the source code for functions starting
with the same prefix. Aside from the fragility of grepping function
names, it makes tests second-class entities and makes it harder to
provide new abstractions for specifying tests.
> One possibility is to parse the source code with a regular expression,
> to find the tests and to generate the source code for the runner
Instead of grepping the source code, one could parse the signatures
reported by the compiler. Parsing only the signature language is much
simpler than parsing the full grammar and you would get more robust
identification of the test functions. This doesn't mean that I would
recommend this approach.
> but it certainly does not look clean
Yes, I think that any approach based on identifying tests by looking
for functions with a particular kind of name, whether by grepping
source code or grepping function names through introspection, is an
ugly hack.
> another solution is to register the test names in a list, but that
> requires to add a registration call for each test and duplicating the
> names, in that case I would be better off just calling the tests directly
> in the runner. This is disturbing, since each time I add a test I must
> change the runner, and if I change the name of a test I have to change
> it twice.
Now you are finally getting to the actual issue, which has little to do
with reflection per se. What you really want is to be able to specify a
test in one place. Adding or removing an individual test should not
require you to make a corresponding change elsewhere. This is a
fundamental principle of good program organization. For example, I've
been paid to maintain a largish code base (not SML code) where one of the
earlier (lead) programmers (I think his title was "Software Architect")
had this programming pattern of adding comments with "ADDNOTE" in places
where you needed to change things when you made a particular kind of
addition (several "ADDNOTE"s per kind of change). This is, of course,
silly. It is better to organize the code so that all the logic related to
a particular kind of thing is given as a unit so you don't have to go
through the source code grepping for other places you might need to
change. So, at the office, we referred to him as 'John "AddNote" Doe'.
> I could go with an association list like the following:
> testList = [
> ("testOnePlusOne", fn () => 1+1=2),
> ("testTwoPlusTwo", fn () => 2+2=4)
> ]
> but it just looks ugly compared to
> fun testOnePlusOne () = 1+1=2 (* I added the ()'s here *)
> fun testTwoPlusTwo () = 2+2=4 (* and indented the code *)
Personally I don't find the list approach that ugly. The amount of extra
verbiage per test is not large. If you don't like the parentheses, you
could use an infix constructor (here "is") as syntactic sugar:
  val () = tests
     [ "testOnePlusOne" is (fn () => 1+1=2)
     , "testTwoPlusTwo" is (fn () => 2+2=4)
     ]
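For completeness, a minimal sketch of definitions that would make the
above work. The fixity chosen for "is" and the helper "failing" are
assumptions made for this illustration:

```sml
infix 1 is   (* hypothetical; any free identifier and precedence would do *)
fun name is body = (name, body)

(* Hypothetical helper: returns the names of the failing tests. *)
fun failing (ts : (string * (unit -> bool)) list) : string list =
   List.mapPartial
      (fn (name, body) =>
          if (body () handle _ => false) then NONE else SOME name)
      ts

(* Runs every test in the list and reports the failures, if any. *)
fun tests ts =
   case failing ts of
      [] => print "all tests passed\n"
    | fs => List.app (fn n => print ("FAILED " ^ n ^ "\n")) fs

val () = tests
   [ "testOnePlusOne" is (fn () => 1+1=2)
   , "testTwoPlusTwo" is (fn () => 2+2=4)
   ]
```

Each test name appears exactly once, and adding a test means adding one
line to the list.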
> The most disturbing thing is that the compiler already knows all the
> names in a structure: why I cannot extract them from the signature? It
> looks like a serious restriction to me (for instance, how do you write
> documentation tools without introspection, especially if you have only
> the compiled form of a library?)
Introspection, alone, isn't sufficient to implement a documentation tool.
You'd also need to have support for doc strings. Note that introspection
is not exactly free. Having introspection means that a lot of metadata
needs to be included in the binary and that many optimizations cannot be
performed (or are hindered). For example, if SML provided run-time
introspection to allow one to iterate over the values in a structure, the
compiler could not safely eliminate unused values from a structure or
inline a definition at all use sites and then eliminate the definition.
It also couldn't perform data representation optimizations, such as
eliminating unused fields from records or unused variants from datatypes.
All of these kinds of optimizations (and more) are performed
aggressively by MLton, for example. So, introspection isn't just a
blessing --- it is also a limitation.
Personally, I don't find introspection crucial. I've used
introspection quite heavily in Java, for example, but mostly as a
workaround for some other deficiency. In particular, when you have
lightweight anonymous functions, infix operators and lightweight
syntax for calling functions, many of the uses of introspection can be
encoded concisely.
> but maybe there is some trick I am not aware of. Please illuminate me!
For specifying tests, I've settled on using the Fold technique
(http://mlton.org/Fold).
Here is the signature of my test framework:
http://mlton.org/cgi-bin/viewsvn.cgi/*checkout*/mltonlib/trunk/com/ssh/unit-test/unstable/public/unit-test.sig
Here is an example of traditional xUnit-style testing:
http://mlton.org/cgi-bin/viewsvn.cgi/*checkout*/mltonlib/trunk/com/ssh/async/unstable/test/async.sml
Here is an example of QuickCheck-style testing:
http://mlton.org/cgi-bin/viewsvn.cgi/*checkout*/mltonlib/trunk/com/ssh/unit-test/unstable/example/qc-test.sml
If you look at the examples (the first one in particular), you can see
that test specification begins by calling "unitTest":
  val () = let
     open UnitTest
  in
     unitTest
then a title (or name) for subsequent tests can be specified with "title":
        (title "The Title")
then individual tests are registered with "test" (or some other test
specifier):
        (test (fn () => (* The test code, which raises an exception on
                           failure *)))
finally test registration is terminated with:
        $
  end
From the above kind of specification, the test framework can produce a
report telling whether each test produced an error or ran correctly.
Multiple tests may have the same title. Individual tests are then
differentiated with index numbers. The first test after a title change
gets the number 1 and the next one is 2 and so on.
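To show how such a specification can be typed and executed, here is a
self-contained sketch built on the Fold combinators from the page
mentioned above. It is not the implementation of the UnitTest module: the
state record, "finish", "lastReport" and the report format are assumptions
made for this illustration.

```sml
(* Fold machinery, as described at http://mlton.org/Fold *)
fun $ (a, f) = f a
fun fold (a, f) g = g (a, f)
fun step0 h (a, f) = fold (h a, f)

(* Hypothetical state threaded through the specification. *)
type state = {title : string, index : int, lines : string list}

val lastReport : string list ref = ref []

(* Runs when the specification is terminated with $. *)
fun finish ({lines, ...} : state) : unit =
   ( lastReport := rev lines
   ; List.app (fn l => print (l ^ "\n")) (!lastReport) )

fun unitTest g = fold ({title = "", index = 0, lines = []}, finish) g

(* A title change restarts the numbering. *)
fun title t =
   step0 (fn ({lines, ...} : state) =>
             {title = t, index = 0, lines = lines})

(* A test passes unless it raises an exception. *)
fun test (body : unit -> unit) =
   step0 (fn ({title = t, index = i, lines = ls} : state) =>
             let
                val n = i + 1
                val outcome =
                   (body () ; "ok") handle e => "FAILED: " ^ exnName e
                val line = t ^ " #" ^ Int.toString n ^ ": " ^ outcome
             in
                {title = t, index = n, lines = line :: ls}
             end)

val () =
   unitTest
      (title "Arithmetic")
      (test (fn () => if 1 + 1 = 2 then () else raise Fail "1+1"))
      (test (fn () => ignore (1 div 0)))   (* raises Div on purpose *)
      (title "Lists")
      (test (fn () => if rev [1, 2] = [2, 1] then () else raise Fail "rev"))
      $
```

In this sketch the report contains "Arithmetic #1: ok", "Arithmetic #2:
FAILED: Div" and "Lists #1: ok". Each specifier is an ordinary function,
so the whole specification is a single well-typed expression.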
Note that this approach is not without advantages over the ad hoc approach
of grepping for functions named as "test"s. In particular, it allows you
to conveniently define new test registration specifiers. See here for an
example:
http://mlton.org/cgi-bin/viewsvn.cgi/*checkout*/mltonlib/trunk/com/ssh/generic/unstable/test/pretty.sml
At the beginning the specifier "tst" is defined as a shorthand for a
particular form of test used throughout the test case. Another example
would be here:
http://mlton.org/cgi-bin/viewsvn.cgi/*checkout*/mltonlib/trunk/com/ssh/generic/unstable/test/pickle.sml
At the beginning the specifiers testSeq, testAllSeq and testTypeMismatch
are defined for specifying tests very concisely.
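As a rough illustration of why new specifiers are cheap to define, the
following sketch builds a specifier on top of "test". Everything here is
hypothetical: the miniature framework only counts failures, and "testEq"
is not one of the specifiers from the files linked above.

```sml
(* Fold machinery, as described at http://mlton.org/Fold *)
fun $ (a, f) = f a
fun fold (a, f) g = g (a, f)
fun step0 h (a, f) = fold (h a, f)

(* Hypothetical miniature framework: the state is the number of failures. *)
val failed = ref 0
fun unitTest g = fold (0, fn n => failed := n) g
fun test (body : unit -> unit) =
   step0 (fn n => (body () ; n) handle _ => n + 1)

(* A new specifier defined purely in terms of "test". *)
fun testEq (actual : unit -> int, expected : int) =
   test (fn () => if actual () = expected then () else raise Fail "testEq")

val () =
   unitTest
      (testEq (fn () => 1 + 1, 2))
      (testEq (fn () => 2 * 3, 6))
      (testEq (fn () => 2 + 2, 5))   (* deliberately wrong *)
      $
```

The new specifier needs no support from the framework: it is just a
function that returns another specifier.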
-Vesa Karvonen

