Janne Blomqvist wrote:
> On 2008-05-05, glen herrmannsfeldt <gah@[EMAIL PROTECTED]> wrote:
>>The READ could also be:
>> read(1, end=40) (matrix(i,matrixIndex),i=1000*internalCount+1,1000*(internalCount+1))
>>It might be that one is faster than the other for a particular
>>compiler and library.
I said one might be faster, but not which one.
> At least for gfortran the opposite is true. The implied do loop above
> is handled by the frontend, IIRC converting it to (roughly) the
> equivalent do loop like
> do i = 1000*internalCount+1, 1000*(internalCount+1)
> read (1, end=40) matrix(i,matrixIndex)
> end do
Well, it can't be quite like that because of the record
boundaries, but one call to the I/O routine per array element
I do understand.
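To make the record-boundary point concrete, here is a small sketch. It assumes unformatted sequential I/O; the file name, array size, and variable names are made up for the illustration. One record is written, then read back once with an implied DO and once with a READ statement inside a DO loop.

```fortran
program record_boundaries
  implicit none
  integer, parameter :: n = 4          ! illustrative size, not from the thread
  integer :: i, ios, u
  real :: a(n), b(n)

  a = [(real(i), i = 1, n)]

  ! Write all n values as ONE unformatted sequential record.
  open(newunit=u, file='demo_records.dat', form='unformatted', &
       access='sequential', status='replace')
  write(u) (a(i), i = 1, n)
  close(u)

  open(newunit=u, file='demo_records.dat', form='unformatted', &
       access='sequential', status='old')

  ! Implied DO: a single READ statement, so all n values come from one record.
  b = 0.0
  read(u, iostat=ios) (b(i), i = 1, n)
  print *, 'implied DO:  iostat =', ios, ' b =', b

  ! Explicit DO: every READ statement starts a new record. The first READ
  ! takes one value and skips the rest of the record; the second READ
  ! finds no further record and reports end of file.
  rewind(u)
  b = 0.0
  do i = 1, n
     read(u, iostat=ios) b(i)
     if (ios /= 0) exit
  end do
  print *, 'explicit DO: stopped at i =', i, ' iostat =', ios

  close(u, status='delete')
end program record_boundaries
```

So whatever the frontend generates for the implied DO, it has to stay inside a single READ, with one transfer per element rather than one READ statement per element.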
> So you have the overhead of a read statement, which is considerable,
> for each element in the array. OTOH by reading array slices like the
> original code, the runtime library gets a pointer to the array
> descriptor (a modified descriptor describing the slice, not the
> original one) and has the opportunity to read multiple elements at a
> time.
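For reference, the two forms under discussion transfer the same elements. The sketch below is only my guess at the array-slice form of the original code, with a scratch file and array sizes invented for the demonstration; it reads the same record both ways and checks that the results agree.

```fortran
program slice_vs_implied_do
  implicit none
  integer, parameter :: chunk = 1000, ncols = 2   ! assumed sizes
  integer :: i, u, internalCount, matrixIndex
  real :: src(chunk), matrix(2*chunk, ncols)

  src = [(real(i), i = 1, chunk)]
  internalCount = 1            ! i.e. the second block of 1000 rows
  matrix = 0.0

  ! One record holding 1000 values, in a scratch file.
  open(newunit=u, status='scratch', form='unformatted', access='sequential')
  write(u) src

  ! Array-slice form: the library can be handed the whole section at once.
  rewind(u)
  matrixIndex = 1
  read(u) matrix(chunk*internalCount+1 : chunk*(internalCount+1), matrixIndex)

  ! Implied-DO form: same elements, same order, listed one at a time.
  rewind(u)
  matrixIndex = 2
  read(u) (matrix(i, matrixIndex), i = chunk*internalCount+1, chunk*(internalCount+1))
  close(u)

  if (any(matrix(:,1) /= matrix(:,2))) error stop 'the two forms disagree'
  print *, 'both forms filled rows', chunk*internalCount+1, 'to', &
           chunk*(internalCount+1)
end program slice_vs_implied_do
```

Any speed difference between the two comes from how the compiler and library implement them, not from what they read.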
The usual Fortran I/O routines have one subroutine call to start,
supplying unit number and possibly format information, then one
call for each I/O list element (or possibly array element), then
one to finish the operation.
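As a rough picture of that calling sequence, here is a toy model. The routine names (io_begin, io_transfer, io_transfer_array, io_end) are hypothetical and do not correspond to any real runtime library; the "record" is just an array in memory. It only counts how many calls each style makes.

```fortran
module toy_io
  implicit none
  integer :: ncalls = 0        ! calls made into the toy "library"
  integer :: pos = 0           ! current position in the record buffer
  real    :: buf(1000)         ! stands in for one input record
contains
  subroutine io_begin()        ! hypothetical: start of a READ statement
    ncalls = ncalls + 1
    pos = 0
  end subroutine io_begin

  subroutine io_transfer(x)    ! hypothetical: one scalar list item
    real, intent(out) :: x
    ncalls = ncalls + 1
    pos = pos + 1
    x = buf(pos)
  end subroutine io_transfer

  subroutine io_transfer_array(x)   ! hypothetical: a whole array in one call
    real, intent(out) :: x(:)
    ncalls = ncalls + 1
    x = buf(pos+1 : pos+size(x))
    pos = pos + size(x)
  end subroutine io_transfer_array

  subroutine io_end()          ! hypothetical: end of the READ statement
    ncalls = ncalls + 1
  end subroutine io_end
end module toy_io

program count_calls
  use toy_io
  implicit none
  integer, parameter :: n = 1000
  integer :: i
  real :: a(n)

  buf = [(real(i), i = 1, n)]

  ! One call per element: n + 2 calls in total.
  ncalls = 0
  call io_begin()
  do i = 1, n
     call io_transfer(a(i))
  end do
  call io_end()
  print *, 'per-element calls:', ncalls

  ! One call for the whole array: 3 calls in total.
  ncalls = 0
  call io_begin()
  call io_transfer_array(a)
  call io_end()
  print *, 'whole-array calls:', ncalls
end program count_calls
```

With n = 1000 that is 1002 calls against 3, which is the overhead being discussed.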
I remember when the OS/360 Fortran library was changed to do
implied DO as one call instead of one per element. It was
a big deal because it made newly compiled programs incompatible
with old versions of the library. (The general rule is that
libraries are backward compatible but not forward compatible.
Most of the time, though, older versions will still work.)
> I think there's a PR about this in the gcc bugzilla somewhere. But
> it's not really easy to fix, it would probably require either the
> frontend to convert the implied do loop into the equivalent array
> descriptor, or to create some kind of iterator structure and pass that
> to the library, or to have some runtime interpreter for implied do
> loops. All three approaches would require a fair amount of work, I
> think.
> Ah, here it is:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35339
> Now, of course, other compilers might do it better.
For the simple cases of one array inside an implied DO
it should be pretty easy, and I would expect compilers
to be able to do that. For more complicated ones,
READ(1) ((X(I,J),Y(J,I),I=1,N),Z(J),J=1,M)
I would not be surprised if it was done with
one subroutine call per element. Even so, other than
the subroutine call overhead it shouldn't be so slow.
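To show why that case is harder to hand over as a single array, the sketch below spells out the element order of the nested implied DO. The sizes N=2 and M=3, the scratch file, and the flat helper array are my own choices for the illustration.

```fortran
program nested_implied_do
  implicit none
  integer, parameter :: n = 2, m = 3        ! small illustrative sizes
  integer :: i, j, k, u
  real :: x(n,m), y(m,n), z(m)
  real :: flat(m*(2*n+1))                   ! one value per list item

  ! Write the values 1, 2, 3, ... as a single unformatted record.
  flat = [(real(k), k = 1, size(flat))]
  open(newunit=u, status='scratch', form='unformatted', access='sequential')
  write(u) flat
  rewind(u)

  ! The I/O list from the post: X and Y interleaved, then one Z per J.
  read(u) ((x(i,j), y(j,i), i = 1, n), z(j), j = 1, m)
  close(u)

  ! Walk the same list with explicit loops and check the transfer order.
  k = 0
  do j = 1, m
     do i = 1, n
        k = k + 1
        if (x(i,j) /= flat(k)) error stop 'x out of order'
        k = k + 1
        if (y(j,i) /= flat(k)) error stop 'y out of order'
     end do
     k = k + 1
     if (z(j) /= flat(k)) error stop 'z out of order'
  end do
  print *, 'order matches; items transferred =', k
end program nested_implied_do
```

The items alternate between three arrays with different strides, so no single array section describes them, and a per-element call is the obvious fallback.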
-- glen

