By Paul Hinker, Sun Microsystems, June 2007
Porting dusty deck Fortran source can be an exercise in patience and conditional compilation. An application that needs to run in the ILP-32, LP-64, and ILP-64 models faces the problem of interfacing with external libraries seamlessly.
Here are examples of using the Basic Linear Algebra Subprogram Standard (BLAS) AXPY family of routines (caxpy, daxpy, saxpy, and zaxpy):
ILP-32 interface:
SUBROUTINE saxpy(N, ALPHA, X, INCX, Y, INCY)
INTEGER*4 :: N, INCX, INCY
REAL*4 :: ALPHA, X(*), Y(*)
LP-64 interface:
SUBROUTINE saxpy(N, ALPHA, X, INCX, Y, INCY)
INTEGER*4 :: N, INCX, INCY
REAL*4 :: ALPHA, X(*), Y(*)
ILP-64 interface:
SUBROUTINE saxpy(N, ALPHA, X, INCX, Y, INCY)
INTEGER*8 :: N, INCX, INCY
REAL*4 :: ALPHA, X(*), Y(*)
ILP-64 interface (strict Fortran type adherence):
SUBROUTINE saxpy(N, ALPHA, X, INCX, Y, INCY)
INTEGER*8 :: N, INCX, INCY
REAL*8 :: ALPHA, X(*), Y(*)
Strict Fortran adherence means that INTEGER and REAL data types have identical bit width. This is contrary to the strong implication that single precision routines (i.e., those that are prefixed by 's') expect 32-bit floating point data. (See Further Details: Floating Point Arithmetic in the LAPACK Users' Guide.)
The ILP-64 interfaces to the Sun Performance Library routines are suffixed by _64 to distinguish them from routines by the same name that expect 32-bit integers but run in an LP-64 model. Some older Cray source code strictly adheres to the Fortran Language specification, which requires INTEGER and REAL data types to have the same bit width, and that source code expects the floating point data sent to the s-prefixed routines to be 64-bit.
There are a variety of ways to handle this. First, here's a purely brute-force approach of manually editing the source with conditional compilation code:
#ifdef ILP64
      call saxpy_64(n,alpha,x,incx,y,incy)
#else
      call saxpy(n,alpha,x,incx,y,incy)
#endif
Applying the brute-force approach to an application that consists of thousands of files and millions of lines of source code is a waste of engineering time. The same effect can be accomplished with an awk, sed, or perl script, but there are 1700+ routines in the Performance Library, so even scripting the process would be time-consuming and error-prone.
Another option is to use the Fortran 95 generic interface functionality to allow the source code to remain virtually unchanged and yet facilitate the use of each of the three programming models (ILP-32, LP-64, and ILP-64).
Let's take a very simple example that calls the BLAS saxpy routine:
% cat tst.f
      program tst_saxpy
      implicit none
      integer, parameter :: N = 10, INCX = 1, INCY = 1
      real, parameter :: ALPHA = 1.0
      real, dimension(N) :: X, Y
      X = 1.0
      Y = 2.0
      call saxpy(N, ALPHA, X, INCX, Y, INCY)
      print *, SUM(Y)
      END
% f95 -o tst tst.f -xlic_lib=sunperf -m32 \
-xarch=[sparc|sparcvis|sparcvis2|sse2]
% tst
30.0
When compiled with one of the ILP-32 architectures (-m32), the saxpy call resolves to the one expecting 32-bit integers and real parameters.
% f95 -o tst tst.f -xlic_lib=sunperf -m64 \
-xarch=[sparc|sparcvis|sparcvis2|sse2]
% tst
30.0
When compiled with one of the 64-bit architectures (-m64), the saxpy call resolves to the LP-64 interface, which expects 32-bit integers and real parameters. This behavior is for backward compatibility reasons. However, when linked with one of the 64-bit libraries, additional entry points (the _64-suffixed ILP-64 routines) are available. That is, our source could look like the following:
      program tst_saxpy
      implicit none
      integer, parameter :: N = 10, INCX = 1, INCY = 1
      real, parameter :: ALPHA = 1.0
      real, dimension(N) :: X, Y
      X = 1.0
      Y = 2.0
      call saxpy_64(N, ALPHA, X, INCX, Y, INCY)
      print *, SUM(Y)
      END
% f95 -o tst tst.f -xlic_lib=sunperf -m64 \
-xarch=[sparc|sparcvis|sparcvis2|sse2] -xtypemap=integer:64
% tst
30.0
The integer declarations have been changed to 8-byte integers using the -xtypemap compiler option. Alternatively, explicit declaration changes could be made in the source itself.
      program tst_saxpy
      implicit none
      integer(8), parameter :: N = 10, INCX = 1, INCY = 1
      real, parameter :: ALPHA = 1.0
      real, dimension(N) :: X, Y
      X = 1.0
      Y = 2.0
      call saxpy_64(N, ALPHA, X, INCX, Y, INCY)
      print *, SUM(Y)
      END
% f95 -o tst tst.f -xlic_lib=sunperf -m64 \
-xarch=[sparc|sparcvis|sparcvis2|sse2]
Warning: The -xtypemap option applies only to default-sized types, that is, to declarations without an explicit size. INTEGER :: N is promoted to INTEGER*8 :: N, but INTEGER*4 :: N is not promoted.
In theory, the -xtypemap option is a great way to port to the ILP-64 model. In practice, it's not quite a silver bullet. As long as interfaces are clearly defined and strictly follow typing rules, it works well.
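The difference between default-sized and explicitly sized declarations can be checked with a small free-form test program. This is only a sketch: the program and variable names are made up for illustration, and it assumes that kind 4 denotes a 4-byte integer, as it does with f95.

```fortran
! widths.f90: report the widths of default and explicitly sized types.
! Hypothetical test program, not part of the original example.
program show_widths
  implicit none
  integer    :: n_default    ! default size: eligible for promotion
  integer(4) :: n_explicit   ! explicit size: assumed to stay at 4 bytes
  real       :: r_default    ! default real

  ! bit_size and digits are inquiry functions, so the variables
  ! do not need to be given values first.
  print *, 'default INTEGER bits      :', bit_size(n_default)
  print *, 'INTEGER(4) bits           :', bit_size(n_explicit)
  print *, 'default REAL mantissa bits:', digits(r_default)
end program show_widths
```

Built without any type-mapping option, this should report 32 bits for both integers and a 24-bit mantissa. Built with -xtypemap=integer:64, only the first line would be expected to change to 64. Readers without the Sun compiler could try the analogous gfortran option -fdefault-integer-8, although the two options may not behave identically in every respect.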
In the example above, if we forget the -xtypemap=integer:64 flag on the compilation line and do not explicitly change the integers passed to the saxpy_64 routine to INTEGER*8, the compiler will generate no errors or warnings, since without an explicit interface Fortran does not check argument types across calls to external routines. But when the program is run, the results will probably be wrong or a segmentation fault will occur, because 32-bit integers are passed to a routine that expects 64-bit integers.
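To see why the results go wrong, it may help to simulate what a routine expecting a 64-bit integer reads when it is handed the address of a 32-bit one. The sketch below does not call the Performance Library at all. It uses the TRANSFER intrinsic to reinterpret two adjacent 32-bit values as one 64-bit value, and the second value (1) is an arbitrary stand-in for whatever happens to sit next to N in memory.

```fortran
! mismatch.f90: simulate reading a 32-bit integer as a 64-bit integer.
! Hypothetical illustration; the neighbouring value is an assumption.
program width_mismatch
  implicit none
  integer, parameter :: i4 = selected_int_kind(9)    ! typically 32-bit
  integer, parameter :: i8 = selected_int_kind(18)   ! typically 64-bit
  integer(i4) :: args(2)
  integer(i8) :: seen

  args = (/ 10_i4, 1_i4 /)       ! N = 10, followed by an unrelated value
  seen = transfer(args, 0_i8)    ! same 8 bytes viewed as one 64-bit integer

  print *, 'N as passed (32-bit):', args(1)
  print *, 'N as read   (64-bit):', seen
end program width_mismatch
```

On a little-endian machine the second line shows 4294967306 (10 + 2**32), and on a big-endian machine such as SPARC it shows 42949672961 (10 * 2**32 + 1). Either way the callee would loop far beyond the ten elements that were actually allocated.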
An F95 interface can be used to describe the different programming models.
module sunperf64
  interface saxpy
    !
    ! ILP-32 and LP-64 interface
    !
    subroutine saxpy(n,alpha,x,incx,y,incy)
      integer(4) :: n, incx, incy
      real(4) :: alpha, x(*), y(*)
    end subroutine
    !
    ! ILP-64 interface
    !
    subroutine saxpy_64(n,alpha,x,incx,y,incy)
      integer(8) :: n, incx, incy
      real(4) :: alpha, x(*), y(*)
    end subroutine
    !
    ! ILP-64 interface w/strict Fortran typing
    !
    subroutine daxpy_64(n,alpha,x,incx,y,incy)
      integer(8) :: n, incx, incy
      real(8) :: alpha, x(*), y(*)
    end subroutine
  end interface
end module
If this module is compiled into a .mod file, it will allow a single call to the saxpy routine to be interpreted as any of the ILP-32, LP-64, ILP-64, or ILP-64 (strict) calling conventions.
% f95 -c sunperf64.F95
This will create a .mod file in the current directory by the name of sunperf64.mod. It will also create a sunperf64.o file (which contains nothing of use and can be discarded).
F95 .mod files provide the F95 compiler with information concerning interfaces to subroutines and functions. These files allow the compiler to check subroutine and function call parameter lists for type, number, and shape consistency.
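The way a generic name is resolved from the kinds of the actual arguments can be tried in isolation, without linking the Performance Library. The following free-form sketch uses hypothetical stand-in routines (my_axpy, axpy_i4_r4, axpy_i8_r4, axpy_i8_r8) in place of saxpy, saxpy_64, and daxpy_64. Each stand-in announces itself and then performs the AXPY update, assuming positive increments.

```fortran
! generic_demo.f90: hypothetical stand-ins, not the Performance Library.
module axpy_demo
  implicit none
  integer, parameter :: i4 = selected_int_kind(9)    ! typically 32-bit
  integer, parameter :: i8 = selected_int_kind(18)   ! typically 64-bit
  integer, parameter :: r4 = selected_real_kind(6)   ! typically 32-bit
  integer, parameter :: r8 = selected_real_kind(15)  ! typically 64-bit

  ! One generic name; the compiler picks a specific by argument kinds.
  interface my_axpy
    module procedure axpy_i4_r4, axpy_i8_r4, axpy_i8_r8
  end interface

contains

  subroutine axpy_i4_r4(n, alpha, x, incx, y, incy)   ! plays saxpy
    integer(i4), intent(in)    :: n, incx, incy
    real(r4),    intent(in)    :: alpha, x(*)
    real(r4),    intent(inout) :: y(*)
    integer(i4) :: i
    print *, 'selected: 32-bit integers, 32-bit reals'
    do i = 0, n - 1
      y(1 + i*incy) = y(1 + i*incy) + alpha * x(1 + i*incx)
    end do
  end subroutine axpy_i4_r4

  subroutine axpy_i8_r4(n, alpha, x, incx, y, incy)   ! plays saxpy_64
    integer(i8), intent(in)    :: n, incx, incy
    real(r4),    intent(in)    :: alpha, x(*)
    real(r4),    intent(inout) :: y(*)
    integer(i8) :: i
    print *, 'selected: 64-bit integers, 32-bit reals'
    do i = 0, n - 1
      y(1 + i*incy) = y(1 + i*incy) + alpha * x(1 + i*incx)
    end do
  end subroutine axpy_i8_r4

  subroutine axpy_i8_r8(n, alpha, x, incx, y, incy)   ! plays daxpy_64
    integer(i8), intent(in)    :: n, incx, incy
    real(r8),    intent(in)    :: alpha, x(*)
    real(r8),    intent(inout) :: y(*)
    integer(i8) :: i
    print *, 'selected: 64-bit integers, 64-bit reals'
    do i = 0, n - 1
      y(1 + i*incy) = y(1 + i*incy) + alpha * x(1 + i*incx)
    end do
  end subroutine axpy_i8_r8

end module axpy_demo

program tst_generic
  use axpy_demo
  implicit none
  integer, parameter :: N = 10, INCX = 1, INCY = 1
  real, parameter :: ALPHA = 1.0
  real, dimension(N) :: X, Y
  X = 1.0
  Y = 2.0
  call my_axpy(N, ALPHA, X, INCX, Y, INCY)   ! same call in every model
  print *, SUM(Y)                            ! ten elements of 1*1 + 2
end program tst_generic
```

Built with default options, the call should select the first stand-in and the sum should come out as 30. Promoting the default integer, or both the default integer and real, to 64 bits would be expected to select the second or third stand-in without any change to the calling program, which is the effect the sunperf64 module aims for with the real library routines.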
Given the original example code, the only source-level change that needs to be made is the addition of the USE SUNPERF64 statement at the beginning of the program, subroutine, or function that calls Performance Library routines. Of course, you can call the module file anything you like.
      program tst_saxpy
      use sunperf64
      implicit none
      integer, parameter :: N = 10, INCX = 1, INCY = 1
      real, parameter :: ALPHA = 1.0
      real, dimension(N) :: X, Y
      X = 1.0
      Y = 2.0
      call saxpy(N, ALPHA, X, INCX, Y, INCY)
      print *, SUM(Y)
      END
Then, this line would call the ILP-32 version:
% f95 -o tst tst.f -xlic_lib=sunperf -xarch=[sparc|sparcvis|sparcvis2|sse2]
This line would call the LP-64 version:
% f95 -o tst tst.f -xlic_lib=sunperf -xarch=[sparc|sparcvis|sparcvis2|sse2] -m64
The following line would call the ILP-64 version:
% f95 -o tst tst.f -xlic_lib=sunperf \
-xarch=[sparc|sparcvis|sparcvis2|sse2] -m64 -xtypemap=integer:64
And this line would call the ILP-64 (strict) version:
% f95 -o tst tst.f -xlic_lib=sunperf \
-xarch=[v9|v9b|amd64] -xtypemap=integer:64,real:64 -m64
If the above experiments are performed, the compiler will complain that sunperf64.mod was compiled with a different default integer type, a different architecture than the executable that is being created, or both. Since there is no executable code in the interfaces described by the .mod file, these warnings are harmless. If the warnings are bothersome, the .mod file can be created in several different incarnations:
% f90 -c -m32 sunperf64.F90 ** for ILP-32
% f90 -c -m64 sunperf64.F90 ** for LP-64
% f90 -c -m64 sunperf64.F90 -xtypemap=integer:64 ** for ILP-64
Paul Hinker has worked in the Performance Library Group for 10 years as the Team Lead and Technical Lead. Before coming to Sun as part of the acquisition of Dakota Scientific Software, Paul worked in the Advanced Computing Laboratory at the Los Alamos National Laboratory. The Performance Library Group is based in Broomfield, Colorado and produces the Sun Performance Library (a.k.a. Perflib or Sunperf), which is part of the Sun Studio Compiler and Tools package.
