"Robbert Haarman" <comp.lang.misc@[EMAIL PROTECTED]> wrote in message
news:20071209081451.GD4281@[EMAIL PROTECTED]
> Since I don't have much experience with most of the technologies you
> mentioned, I am just going to provide you with some general opinions.
>
> First and foremost, I think C is an awful language for concurrent
> programming. Almost everything you do alters state, from iterating over
> the elements of an array with a for loop to large portions of the
> standard library. The more altering state there is, the easier it is for
> both developers and the compiler to introduce errors, _especially_ in
> the face of concurrency.
>
well, that one is true, but for some of us, there are certain strong
reasons to stick with C...
> Having said that, I do see some places where concurrency can be
> introduced. Implicit parallelism can be introduced in places where the C
> spec leaves results, and, in particular, evaluation order, unspecified.
> Also, many of the imperative constructs in C (loops, in particular) can
> actually be analyzed and parallelized in many cases. At least, Fortran
> compilers do this and I would be highly surprised if the same
> techniques wouldn't work in C.
>
this is implicit parallelization, which is largely what Intel proposes.
however:
as I noted before, one has to go through a kind of elaborate act to make
the code parallelizable (the compiler, using 'deterministic'
parallelization, can't really do anything drastic);
secondly, said deterministic parallelization doesn't really make the
concurrent semantics available in a directly usable form.
thus, Intel's 'Ct' extensions...
their whole idea is that we can break up the dependencies, and make the
code parallelizable, by introducing various vector types (aka:
pass-by-value arrays), and defining a bunch of parallelizable vector
operations (add, multiply-add, ...).
limitation: for most general-purpose problems, I can't see this approach
being too easily usable.
the result:
likely, for most general-purpose code, the compiler will do a poor job of
parallelizing it.
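a rough sketch of the dependency issue, in plain C99 rather than Ct syntax
(the function names mul_add and prefix_sum are made up for illustration).
the first loop is the kind a compiler can safely split up or vectorize,
since every iteration is independent; the second carries a value from one
iteration to the next, so it cannot simply be run out of order:

```c
#include <assert.h>
#include <stddef.h>

/* Independent iterations: out[i] depends only on a[i] and b[i].
   'restrict' promises the arrays do not overlap, so the iterations
   may be reordered, vectorized, or split across threads. */
void mul_add(float *restrict out, const float *restrict a,
             const float *restrict b, float k, size_t n)
{
    /* cheap check of the no-overlap promise (same-start case only) */
    assert(out != a && out != b);
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] * k + b[i];
}

/* Loop-carried dependency: a[i] needs the a[i-1] that was just
   written, so the iterations cannot simply run in any order. */
void prefix_sum(float *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        a[i] += a[i - 1];
}
```

most general-purpose code looks more like the second loop than the first,
which is the limitation described above.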
> As far as libraries go, obviously there is already support for
> processes, threads, and various forms of message passing. All these are
> platform-dependent, but libraries have been written that provide a
> consistent API on multiple platforms (e.g. GLib provides threads on
> win32 and *nix). Whole operating systems have been written with these,
> and processes and sockets are pretty much the standard way to do
> concurrency on *nix and Plan9.
>
threads are the general purpose and simple option.
as noted though, threads are, in general, awkward and painful.
they will not go away, but I am looking for "alternative" options as well
(potentially ones left largely under the sanity and control of the
compiler).
> If you are thinking about extending or breaking the language, I would
> urge you to consider how much keeping all the ugliness and limitations
> of C is really worth to you, and if you couldn't get that value by
> designing a new, cleaner, better language that can call C code (e.g.
> through embedding C code in your programs, through allowing extensions
> to be written in C, or through allowing your language implementation to
> be embedded in C programs).
>
well, part of the issue is that I also need to be able to compile existing
C code...
I can offer extensions, but I can't really "break" the language.
I am actually specifically avoiding breaking it.
note that many other compilers extend the language as well.
OpenMP, for example, is supported by GCC.
the problem with OpenMP, well, is that it is not very elegant, and though
it implements raw parallelism, it doesn't do a whole lot else.
note that, thus far, my compiler is still far less heavily extended than
GCC...
note that some of what I am imagining would also apply to supporting
OpenMP as well.
this is another nifty thing about this kind of compiler:
what happens at one level is not even all that close to what happens at
another.
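for reference, the OpenMP style mentioned above looks roughly like this.
it is only a small sketch (the function name sum_squares is made up); the
pragma and the reduction clause are standard OpenMP:

```c
#include <assert.h>
#include <stddef.h>

/* Sum of squares using an OpenMP work-sharing loop.
   Build with -fopenmp (GCC) to run the loop in parallel; without it
   the pragma is ignored and the loop runs serially, same result. */
long sum_squares(const int *v, size_t n)
{
    assert(v != NULL || n == 0);
    long total = 0;
    /* each thread keeps a private 'total'; they are added at the end */
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < (long)n; i++)
        total += (long)v[i] * v[i];
    return total;
}
```

the annotation says how to split the loop, but little else about the
program's concurrent structure.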
> If you decide that you do want to embrace and extend C, things I would
> think about are slightly altering the semantics to allow more
> concurrency, introducing some more abstract constructs (e.g. loops that
> do not use visible state), constructs that allow programmers to specify
> that some steps must be done without interruption (perhaps along the
> lines of Promela's d_step or Java's synchronized), and constructs that
> specify that the programmer does _not_ care about the order of
> execution of some statements.
>
the loop ideas (bfor, thfor) would likely in many cases be stateless, and
in fact serve primarily to indicate how to go about branching and joining
(this is analogous to traditional loop unrolling).
as for the non-interruption, I had failed to mention it, but I had already
imagined something like:
    serial <body>
that would do just this.
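one way such a construct could be lowered, purely as an assumption (the
SERIAL macro, the single global mutex, and the demo names are all
hypothetical stand-ins, not the actual extension), is a lock taken on
entry to the body and released on exit:

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t serial_mu = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for `serial <body>`: the for-loop runs its
   body exactly once, locking before and unlocking after. */
#define SERIAL \
    for (int serial_once_ = (pthread_mutex_lock(&serial_mu), 1); \
         serial_once_; \
         serial_once_ = (pthread_mutex_unlock(&serial_mu), 0))

static long counter;

static void *bump_many(void *arg)
{
    int iters = *(int *)arg;
    for (int i = 0; i < iters; i++)
        SERIAL { counter++; }   /* body runs without interleaving */
    return NULL;
}

/* Runs nthreads threads that each bump the counter iters times. */
long run_serial_demo(int nthreads, int iters)
{
    pthread_t tid[16];
    assert(nthreads > 0 && nthreads <= 16);
    counter = 0;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tid[i], NULL, bump_many, &iters);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return counter;
}
```

a macro like this breaks if the body leaves via break, return, or goto
(the unlock is skipped), which is one reason to put the construct in the
compiler instead.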
> Otherwise, general good design principles apply. Try to avoid building
> things into the compiler; instead, provide the building blocks to build
> the things you want in libraries and build them there. For example, many
> common constructs in concurrent programming can be built out of others;
> e.g. you can build monitors from mutexes (or the other way around). This
> means you need to provide only one in your language, the other can be in
> a library.
>
the problem with C:
the C language proper is very difficult to extend in this way.
damn near anything like this ends up going right into the compiler...
the major limitation in all this is that C is very limited when it comes
to "first class abstractions".
effectively, to get these, I would either have to:
severely alter and reinterpret the language (would make the compiler a lot
more complicated);
build damn near an entirely new language on top of it.
what I can do though, and what I have done, is to partly modularize some
parts of the compiler, such that extensions can be added (in general)
without too much affecting the general compiler machinery.
basically, what this would mean is that code can register "interfaces"
with several different parts of the compiler, and thus add new syntax
and/or semantics. to the user/programmer though, this still looks just
like good old compiler extensions (and language features are still not
first-class, aka, the language can't add to itself).
to allow any such features as general library-based extensions would
actually require adding some fairly "hard-core" extensions (examples would
include a far more capable preprocessor, possibly supporting some form of
regex macros, operator overloading, and possibly a turing-complete
lisp-like macro engine).
as such, a whole lot goes into the core of the compiler, and this is
actually simpler.
when one has already written close to 100 kloc for a compiler, adding a
few kloc more hardly asks that much...
I had partly also started an effort to rewrite the upper end of the
compiler (going from S-Expressions back to DOM trees), for reasons of DOM
trees being more flexible. this effort has not yet been completed.
the preprocessor and parser have been largely rewritten, but the upper
compiler stages, such as the reducer (aka: AST-level optimizer) and
compiler loop (AST->RPNIL), have not yet been rewritten.
the type-machinery in the parser has been fairly severely altered, and
the changes to the main 'compile' stage are likely to be notable. the
reducer is likely to need to be almost completely rewritten...
the lower compiler is unlikely to really be affected by much of any of
this though (all it sees is RPNIL, aka: the original C code is transformed
into a kind of RPN-based language, and is almost completely isolated from
the other compiler stages).
as for any concurrency-style features, much of the "heavy lifting" is
likely to end up in the lower compiler (but, by the time it reaches RPNIL,
the code will not be much of anything like how it comes into the parser
anyways, and is likely to have been decomposed into a fairly simple set of
primitives).
as with some of the ideas (especially the 'implicit processes'):
I realized I could somewhat simplify the problem by allowing a much
simpler and more general (though, sadly, likely much slower) approach to
the implementation.
namely, I use a hash table as a means of "locking" pretty much anything
that can be represented by a pointer.
thus, I no longer have to worry about how I represent the locks, ... but
simply, what and where I lock.
a sync function would thus lock and unlock the function pointer, and a
struct would lock and unlock its in-memory pointer.
the major cost:
well, locking in this manner will probably be a little more expensive than
a traditional mutex (since it will be necessary to hash the pointer, among
other things...), but it should be ok.
luckily, very few functions or objects are likely to be synchronizing.
a very similar idea (another, or maybe the same, hash) can be employed in
the implementation of message/work queues (upon completion, a sync
function would check for another call for it in the queue, aka, a
trampoline).
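a minimal sketch of the pointer-hash locking idea, assuming POSIX threads
and a fixed table of mutexes indexed by a hash of the address. the names
(ptr_lock, ptr_unlock, LOCK_BUCKETS), the hash, and the table size are
illustrative guesses, not the actual implementation:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define LOCK_BUCKETS 64               /* power of two, arbitrary */

static pthread_mutex_t lock_table[LOCK_BUCKETS];
static pthread_once_t lock_table_once = PTHREAD_ONCE_INIT;

static void lock_table_init(void)
{
    for (size_t i = 0; i < LOCK_BUCKETS; i++)
        pthread_mutex_init(&lock_table[i], NULL);
}

/* Map an address to a bucket; the low bits are dropped because
   alignment usually makes them zero. */
static size_t ptr_bucket(const void *p)
{
    uintptr_t h = (uintptr_t)p >> 4;
    h ^= h >> 16;
    return (size_t)(h & (LOCK_BUCKETS - 1));
}

void ptr_lock(const void *p)
{
    pthread_once(&lock_table_once, lock_table_init);
    int rc = pthread_mutex_lock(&lock_table[ptr_bucket(p)]);
    assert(rc == 0);
    (void)rc;
}

void ptr_unlock(const void *p)
{
    int rc = pthread_mutex_unlock(&lock_table[ptr_bucket(p)]);
    assert(rc == 0);
    (void)rc;
}

/* Demo: a struct synchronized on its own address. */
struct account { long balance; };
struct job { struct account *acct; int iters; };

static void deposit(struct account *a, long amount)
{
    ptr_lock(a);
    a->balance += amount;
    ptr_unlock(a);
}

static void *depositor(void *arg)
{
    struct job *j = arg;
    for (int i = 0; i < j->iters; i++)
        deposit(j->acct, 1);
    return NULL;
}

long run_deposits(int nthreads, int iters)
{
    struct account acct = { 0 };
    struct job j = { &acct, iters };
    pthread_t tid[16];
    assert(nthreads > 0 && nthreads <= 16);
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tid[i], NULL, depositor, &j);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return acct.balance;
}
```

one caveat of this particular sketch: two different pointers can hash to
the same bucket and so share a mutex, which means holding one lock while
taking another can deadlock on a collision. a real table would need
per-pointer entries or recursive locks to avoid that.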
note:
I like work queues, as in general, they tend to be much cheaper and less
error prone than traditional threads.
part of my concurrency model focuses around being able to effectively
make use of queues (workers may spawn as needed, such as when there are
not enough to complete the request, but will likely tend to be far fewer
than the conceptual number of threads).
each worker will be, in effect, a trampoline (applies args to functions,
and sleeps when work is lacking, and it could be set up to die if it
doesn't see work in too long).
I have generally had fairly good success with this in the past, though
not in C, as typically all this has been done in my non-C languages (C
poses a few challenges, especially given the non-reflective nature of its
data representations...).
this is partly why the last part ran into such difficulty...
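a small sketch of the work-queue/trampoline arrangement described above,
assuming POSIX threads and C11 atomics. all names (work_queue, wq_push,
wq_worker, ...) are hypothetical; on-demand worker spawning and the
idle timeout are left out to keep it short:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef void (*work_fn)(void *arg);
struct work_item { work_fn fn; void *arg; struct work_item *next; };

struct work_queue {
    pthread_mutex_t mu;
    pthread_cond_t nonempty;
    struct work_item *head, *tail;
    int closed;                       /* set once no more work will arrive */
};

void wq_init(struct work_queue *q)
{
    pthread_mutex_init(&q->mu, NULL);
    pthread_cond_init(&q->nonempty, NULL);
    q->head = q->tail = NULL;
    q->closed = 0;
}

void wq_push(struct work_queue *q, work_fn fn, void *arg)
{
    struct work_item *it = malloc(sizeof *it);
    assert(it != NULL);
    it->fn = fn; it->arg = arg; it->next = NULL;
    pthread_mutex_lock(&q->mu);
    if (q->tail) q->tail->next = it; else q->head = it;
    q->tail = it;
    pthread_cond_signal(&q->nonempty);   /* wake one sleeping worker */
    pthread_mutex_unlock(&q->mu);
}

void wq_close(struct work_queue *q)
{
    pthread_mutex_lock(&q->mu);
    q->closed = 1;
    pthread_cond_broadcast(&q->nonempty);
    pthread_mutex_unlock(&q->mu);
}

/* Worker trampoline: pop an item, apply the function to its argument,
   repeat; sleep while the queue is empty, exit once closed and drained. */
void *wq_worker(void *arg)
{
    struct work_queue *q = arg;
    for (;;) {
        pthread_mutex_lock(&q->mu);
        while (!q->head && !q->closed)
            pthread_cond_wait(&q->nonempty, &q->mu);
        struct work_item *it = q->head;
        if (!it) { pthread_mutex_unlock(&q->mu); return NULL; }
        q->head = it->next;
        if (!q->head) q->tail = NULL;
        pthread_mutex_unlock(&q->mu);
        it->fn(it->arg);
        free(it);
    }
}

static void bump(void *arg) { atomic_fetch_add((atomic_long *)arg, 1); }

/* Demo: a few workers service many more queued calls. */
long run_queue_demo(int nworkers, int nitems)
{
    struct work_queue q;
    atomic_long done = 0;
    pthread_t tid[8];
    assert(nworkers > 0 && nworkers <= 8);
    wq_init(&q);
    for (int i = 0; i < nworkers; i++)
        pthread_create(&tid[i], NULL, wq_worker, &q);
    for (int i = 0; i < nitems; i++)
        wq_push(&q, bump, &done);
    wq_close(&q);
    for (int i = 0; i < nworkers; i++)
        pthread_join(tid[i], NULL);
    pthread_cond_destroy(&q.nonempty);
    pthread_mutex_destroy(&q.mu);
    return atomic_load(&done);
}
```

the awkward part in C shows up in the (fn, arg) pair: every queued call
has to be squeezed through a single void pointer, since there is no
reflective way to package arbitrary arguments.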
> Regards,
>
> Bob
>
> --
> On the other hand, you have different fingers.
>
>

