by Darryl Gove, December 2011
Independent of the execution environment, as you seek to exploit parallelism, you must ensure that your code is correct and provides predictable results. This article describes how to use the Oracle Solaris Studio Thread Analyzer to analyze parallel code for correctness.
Most processors today—SPARC and x86 alike—are equipped with multiple cores and are capable of supporting multiple, simultaneous execution threads. Many systems also employ multiple multicore processors. Taking advantage of these multiple cores and exploiting multiple threads of execution is important if you want to derive as much value and performance as possible from your selected platforms.
The Oracle Solaris operating system provides an efficient and scalable threading model as well as a smart scheduler to deliver resources to applications through a variety of application development and deployment tools:
Much existing code was written without the assumption of parallel threads of execution. Oracle Solaris Studio compilers provide mechanisms to let an application run multiple threads without requiring you to specify how this is done. Loops, in particular, often represent opportunities where a previously repetitive serial operation can be divided into multiple, independent execution threads.
You can use the following compiler flags with Oracle Solaris Studio compilers to govern automatic parallelization behavior:
Use the -xautopar compiler flag to instruct the compiler to look for loops in the code that can be safely parallelized.
Use the -xloopinfo compiler flag to generate information about the loops the compiler has parallelized.
Use the -xreduction compiler flag to find and parallelize reduction operations, which take a range of values and output a single value, such as summing all the values in an array.
Also, set the OMP_NUM_THREADS environment variable at runtime to control the number of threads for code that is parallelized using automatic parallelization or the OpenMP compiler flags.
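As an illustration of the kinds of loops these flags target, the following is a minimal sketch in C, not code taken from the product documentation. The function names scale_array and sum_array are hypothetical, and the compile line in the comment is an assumed invocation that also adds -xO3, on the understanding that automatic parallelization needs optimization enabled.

```c
#include <stddef.h>

/*
 * Assumed build line (flag names from this article, -xO3 is an assumption):
 *   cc -xO3 -xautopar -xreduction -xloopinfo example.c
 */

/* Independent iterations: a[i] depends only on b[i], so the iteration
 * range can in principle be divided among threads. The restrict
 * qualifiers promise the compiler that a and b do not overlap. */
void scale_array(double *restrict a, const double *restrict b,
                 double factor, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] * factor;
    }
}

/* Reduction: every iteration updates the same accumulator, so the loop
 * is only a parallelization candidate when the compiler is allowed to
 * treat it as a reduction. */
double sum_array(const double *a, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += a[i];
    }
    return total;
}
```

Whether either loop is actually parallelized depends on the compiler's analysis, which is what the loop information output is meant to report.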
Support for OpenMP 3.1 in Oracle Solaris Studio means that the compilers can look for directives (pragmas) in the source code in order to build a parallel version of the application. As with automatic parallelization, the compiler does the work so you don't have to manage the threads.
OpenMP represents an incremental approach to parallelization with potentially fine granularity. OpenMP allows you to set directives around specific loops to be optimized through threading while leaving other loops untouched. The other distinct advantage of this approach is that you can derive a serial and a parallel version of the application from the exact same code base, which can be helpful for debugging.
Use the following OpenMP-related compiler flags with Oracle Solaris Studio:
Use the -xopenmp compiler flag. OpenMP directives are recognized only when this flag is used.
Use the -xvpara compiler flag to report potential parallelization issues.
Use the -xloopinfo compiler flag to direct the compiler to provide the details of which loops were parallelized.
Also, set the OMP_NUM_THREADS environment variable at runtime to control the number of desired threads for code that is parallelized using OpenMP compiler flags or automatic parallelization. The default number of threads is two.
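To make the directive-based approach concrete, here is a minimal sketch of a loop marked with an OpenMP directive. The function name dot_product is hypothetical. It also illustrates the point about a single code base: when the code is built without OpenMP support, the pragma is ignored and the same loop runs serially.

```c
#include <stddef.h>

/* Dot product of two vectors of length n.
 * With OpenMP enabled, the iterations are divided among threads and the
 * reduction clause combines each thread's partial sum into sum.
 * Without OpenMP, the pragma is ignored and the loop runs serially. */
double dot_product(const double *x, const double *y, long n)
{
    double sum = 0.0;
    long i;

#pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

Only the marked loop is affected; other loops in the same source file are left untouched unless they carry their own directives.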
By programming to the POSIX threads API, you can have complete control over thread usage in your applications. The POSIX threads (or Pthreads) specification represents a POSIX standard for a thread API that defines a set of C programming language types, functions, and constants. Oracle Solaris Studio compilers support the POSIX threads programming model.
For information about the Pthreads API, see "pthread.h(3HEAD)" in the man pages section 3: Library Interfaces and Headers Oracle Solaris 11 reference manual.
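For comparison with the compiler-managed approaches, the following sketch shows what explicit thread management with the POSIX threads API can look like. The names parallel_sum, sum_slice, struct slice, and NUM_THREADS are hypothetical and chosen only for this example. Each thread sums its own slice of an array and writes to its own result field, so no locking is needed.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4   /* hypothetical fixed thread count for this sketch */

/* Work description for one thread: a half-open range [begin, end). */
struct slice {
    const int *data;
    size_t begin;
    size_t end;
    long partial;       /* written only by the thread that owns the slice */
};

/* Thread start routine: sums one slice into its own partial field. */
static void *sum_slice(void *arg)
{
    struct slice *s = arg;
    long total = 0;
    for (size_t i = s->begin; i < s->end; i++) {
        total += s->data[i];
    }
    s->partial = total;
    return NULL;
}

/* Sums data[0..n) using NUM_THREADS threads.
 * Returns 0 and stores the sum in *result on success, or -1 if a
 * thread could not be created or joined. */
int parallel_sum(const int *data, size_t n, long *result)
{
    pthread_t threads[NUM_THREADS];
    struct slice slices[NUM_THREADS];
    size_t chunk = n / NUM_THREADS;
    int created = 0;
    int status = 0;
    long total = 0;

    for (int t = 0; t < NUM_THREADS; t++) {
        slices[t].data = data;
        slices[t].begin = (size_t)t * chunk;
        /* The last thread also takes any leftover elements. */
        slices[t].end = (t == NUM_THREADS - 1) ? n : (size_t)(t + 1) * chunk;
        slices[t].partial = 0;
        if (pthread_create(&threads[t], NULL, sum_slice, &slices[t]) != 0) {
            status = -1;
            break;
        }
        created++;
    }

    /* Join every thread that was started, even after a failure. */
    for (int t = 0; t < created; t++) {
        if (pthread_join(threads[t], NULL) != 0) {
            status = -1;
        } else {
            total += slices[t].partial;
        }
    }

    if (status == 0) {
        *result = total;
    }
    return status;
}
```

The partial results are read only after pthread_join returns, which is what makes the unsynchronized access to each partial field safe here.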
The Oracle Solaris Studio Thread Analyzer is designed to help ensure multithreaded application correctness. Specifically, the Thread Analyzer can help detect, analyze, and debug the following situations, which can arise in multithreaded applications:
To detect data race and deadlock conditions, do the following to compile the code, execute it under the control of the collect -r all command, and load the resulting experiment into the Thread Analyzer.
Compile the code with the -xinstrument=datarace compiler flag. It is recommended that the -g flag also be set and that no optimization level be used, to help ensure that line-number and call-stack information is reported correctly.
Or, if you have existing binaries that have been compiled with the Oracle Solaris Studio 12.3 compiler, they can be instrumented using the command discover -i datarace -o a.out.instrumented a.out.
Use the collect -r all option to run the resulting application code and create a data race detection and deadlock detection experiment during the execution process. The experiment loaded in the final step of this procedure is named tha.1.er.
Alternatively, use the following command to run the application code and create only a data race detection experiment:
% collect -r race <app> <params>
Or use the following command to run the application code and create only a deadlock detection experiment:
% collect -r deadlock <app> <params>
Use the tha tha.1.er command to load the results of the experiment into the Thread Analyzer to identify the data race and deadlock conditions. Figure 1 shows the Races tab of the Thread Analyzer.
Figure 1. Data race conditions can be identified through the Thread Analyzer.
The Thread Analyzer can also help identify individual lines of source code that are associated with race conditions (Figure 2).
Figure 2. Individual lines of source code associated with data race conditions can be identified using the Thread Analyzer.
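As an illustration of the kind of code this workflow is aimed at, the following sketch contains a deliberate data race alongside a corrected version. It is not the program shown in the figures. All names (counter, counter_lock, increment_unlocked, increment_locked, run_two_threads, INCREMENTS) are hypothetical. In increment_unlocked, two threads update a shared variable with no synchronization, which is the sort of access a data race experiment would be expected to flag at the counter++ line. In increment_locked, the same update is protected by a mutex.

```c
#include <pthread.h>

#define INCREMENTS 100000   /* hypothetical amount of work per thread */

static long counter;        /* shared by both threads */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Racy version: the read-modify-write of counter is unsynchronized,
 * so concurrent updates can be lost. Shown only to illustrate a race. */
void *increment_unlocked(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        counter++;
    }
    return NULL;
}

/* Corrected version: the mutex serializes every update of counter. */
void *increment_locked(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Resets counter, runs two threads that both execute worker, and
 * returns the final value of counter, or -1 if a thread could not
 * be created. */
long run_two_threads(void *(*worker)(void *))
{
    pthread_t a, b;

    counter = 0;
    if (pthread_create(&a, NULL, worker, NULL) != 0) {
        return -1;
    }
    if (pthread_create(&b, NULL, worker, NULL) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;
}
```

Calling run_two_threads(increment_locked) yields exactly 2 * INCREMENTS, while the result with increment_unlocked is not predictable, which is why such defects are hard to find by testing alone.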
For an exhaustive description of compiler flags and options, see the complete Oracle Solaris Studio product documentation at http://oracle.com/technetwork/server-storage/solarisstudio/documentation/oss123-docs-1357739.html.
Revision 1.0, 12/13/2011