64-bit x86 Migration, Debugging, and Tuning, With the Sun Studio 10 Toolset

   
By The Sun Studio Team, January 2005  
This article discusses porting issues that arise when migrating 32-bit applications to 64-bit x86 Solaris 10 platforms, and introduces the dbx command-line debugger and the Performance Analyzer.

Contents
 
Porting to a 64-bit x86 Platform
What Needs to be Recompiled and What Does Not
Compiling 64-bit Code on x86 Platforms
x86 Code Linking Restrictions
Compiling 64-bit Code When Using the -fast Option
Comparing the -fast Option Expansion on x86 Platforms and SPARC Platforms


The dbx Command-Line Debugger
Tools for Tuning 64-bit Code on x86 Solaris 10 Platforms
How to Use the Performance Tools
Limitations of the Performance Tools on 64-bit x86 Platforms
When Code Needs to be Recompiled for the Performance Tools
 

Migrating to 64-bit x86 Solaris 10 Platforms

 
Porting to a 64-bit x86 Platform

If you are moving an application to the Solaris 10 OS on an x86-based system for the first time (in 32-bit mode), use the Sun Studio 9 compilers to compile and develop on systems running the Solaris 8, 9, or 10 OS, and do the final build with the Sun Studio 10 compilers. You might see a substantial performance improvement with the Sun Studio 10 compilers for both 32-bit and 64-bit code. If you are moving from the 64-bit SPARC V9 architecture, the port is a straight recompile, subject to some of the caveats listed here.

Programs that are already LP64 clean can, for the most part, simply be compiled with -xarch=amd64 and should run. Makefiles that contain SPARC-specific compiler options may need to be adjusted.
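As a rough illustration of what "LP64 clean" means in practice, the sketch below shows one common hazard: code that stores a pointer in an int. Under LP64, int stays at 32 bits while long and pointers grow to 64 bits. The function names here are hypothetical and are not taken from any Sun Studio header; intptr_t comes from the standard <stdint.h>.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round-trips a pointer through an integer type.
 * intptr_t is wide enough to hold a pointer under both ILP32 and LP64,
 * so the pointer survives the conversion in either data model. */
void *roundtrip_via_intptr(void *p)
{
    intptr_t bits = (intptr_t)p;
    return (void *)bits;
}

/* Returns 1 if an int can hold a pointer in this build.
 * Expected to be 1 under ILP32 and 0 under LP64. */
int int_holds_pointer(void)
{
    return sizeof(int) >= sizeof(void *);
}

/* Returns 1 if a long can hold a pointer in this build.
 * Expected to be 1 under both ILP32 and LP64. */
int long_holds_pointer(void)
{
    return sizeof(long) >= sizeof(void *);
}
```

Code that relies on int_holds_pointer() being true is the kind that tends to break after a 64-bit recompile; switching such variables to intptr_t or long is the usual remedy.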

Why passing an int where a long was expected works on SPARC V9 systems but not on AMD64 architectures.

Prototypes should match the function signature:

    With wrong prototype             With correct prototype
    --------------------             ----------------------
    void insert_stc(int);            void insert_stc(long);

    void string_append() {           void string_append() {
        insert_stc(-1);                  insert_stc(-1);
    }                                }

On SPARC V9, the call to insert_stc appears to sign-extend the argument from int to long even where the wrong prototype has been used, so the incorrect program functions as if a correct prototype were in scope. On AMD64, a 4-byte -1 is passed, as specified by the prototype, which results in zero extension and incorrect or undefined execution of the program.
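The difference between the two behaviors can be sketched with explicit casts. This is only a model of the two widening rules described above, not a reproduction of the mismatched-prototype bug itself (which is undefined behavior); the function names are hypothetical.

```c
#include <stdint.h>

/* Models the SPARC V9 case: the 32-bit argument is sign-extended,
 * so a callee expecting a 64-bit long still sees the intended value. */
int64_t widen_sign_extended(int32_t arg)
{
    return (int64_t)arg;
}

/* Models the AMD64 case: the upper 32 bits are zero, so a callee
 * expecting a 64-bit long sees a large positive value instead. */
int64_t widen_zero_extended(int32_t arg)
{
    return (int64_t)(uint32_t)arg;
}
```

For an argument of -1, widen_sign_extended returns -1, while widen_zero_extended returns 4294967295 (0xFFFFFFFF), which is the value the callee would act on in the zero-extension case.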

Read Chapter 8, "Converting Applications for a 64-Bit Environment," in the Sun Studio 10 C User's Guide.

Also, see the Solaris 10 64-bit Developer's Guide.

What Needs to be Recompiled and What Does Not

Although 32-bit applications compiled on x86 systems do not need to be recompiled to run on a 64-bit x86 Solaris 10 platform, recompilation often leads to performance improvements.

Code compiled to run on SPARC platforms does need to be recompiled to run on a 64-bit x86 Solaris 10 platform; the AMD Opteron 64-bit x86 instruction set is very different from the SPARC instruction set.

Compiling 64-bit Code on x86 Platforms

Use -xarch=amd64. You can also use -xarch=generic64, which is available on SPARC as well, so the same option can be used in makefiles for compiling code on both 64-bit x86 and 64-bit SPARC V9 systems.
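One simple way to confirm that a build really produced 64-bit code is to have the program report its own pointer width. The following is a minimal sketch; the function names are hypothetical, and it assumes only that a 64-bit (LP64) build uses 64-bit pointers.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: width of a data pointer, in bits, for this build. */
size_t pointer_width_bits(void)
{
    return sizeof(void *) * CHAR_BIT;
}

/* Returns 1 when the code was built with 64-bit pointers, 0 otherwise. */
int built_as_64bit(void)
{
    return pointer_width_bits() == 64;
}
```

Calling built_as_64bit() from a small test program, for example in a makefile sanity check, shows whether the 64-bit option actually took effect.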

For the latest Sun Studio 10 compiler option information, see the compiler man pages.

x86 Code Linking Restrictions

32-bit code compiled on Linux platforms cannot be linked with 32-bit code compiled on Solaris x86 platforms. However, 64-bit code compiled on Linux systems can be linked with 64-bit code compiled on x86 Solaris 10 platforms.

Note, however, that binary compatibility has limitations when files appear in different places within the file system. Furthermore, the Solaris OS is POSIX compliant and Linux is not. So, binary compatibility will only be effective if programmers code to the common subset of Linux and Solaris OS.

Compiling 64-bit Code When Using the -fast Option

Compiling with -fast on a 64-bit x86 (AMD64) platform is not sufficient to generate 64-bit code. You must also specify -xarch=amd64. Here's why:

The -xarch option is evaluated from left to right on the command line, so the last specification of -xarch appearing on the command line determines which value of -xarch will be used.

-fast is a macro option whose expansion includes -xtarget=native. However, even on an AMD64 platform, -xtarget=native will expand to -xarch=sse2, which is a 32-bit architecture. You also need to explicitly follow -fast on the command line with -xarch=amd64 to signal 64-bit code generation.

Be aware that the order of these two options is important. Specifying -xarch=amd64 -fast would expand to -xarch=amd64 -xarch=sse2 which still would result in 32-bit code generation. Specifying -fast -xarch=amd64 would expand to -xarch=sse2 -xarch=amd64 and would correctly signal 64-bit code generation.
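The ordering rule can be modeled in a few lines of C. This is a toy model of the left-to-right evaluation described above, not the compiler driver's actual logic: the function name is hypothetical, and -fast is treated as if it expanded to nothing but -xarch=sse2, which is the x86 behavior this article describes.

```c
#include <stddef.h>
#include <string.h>

/* Toy model: returns the -xarch value in effect after scanning the
 * options left to right, or NULL if no architecture was selected.
 * "-fast" is treated as if it expanded to "-xarch=sse2". */
const char *effective_xarch(const char *const opts[], size_t count)
{
    static const char prefix[] = "-xarch=";
    const size_t prefix_len = sizeof(prefix) - 1;
    const char *arch = NULL;
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp(opts[i], "-fast") == 0) {
            /* Comes from -xtarget=native inside the -fast expansion. */
            arch = "sse2";
        } else if (strncmp(opts[i], prefix, prefix_len) == 0) {
            /* A later setting overrides any earlier one. */
            arch = opts[i] + prefix_len;
        }
    }
    return arch;
}
```

With the options { "-xarch=amd64", "-fast" } the model returns "sse2" (32-bit), and with { "-fast", "-xarch=amd64" } it returns "amd64", matching the two cases in the text.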

Comparing the -fast Option Expansion on x86 Platforms and SPARC Platforms

The -fast option is a macro that serves as an effective starting point for tuning an executable for maximum runtime performance. Its expansion can change from one compiler release to the next and includes options that are specific to the target platform. Compile with the -# option or -xdryrun to examine the expansion of -fast, and incorporate the appropriate options into the ongoing process of tuning the executable.

Note that to compile a 64-bit x86 object with -fast you need to follow the -fast option with -xarch=amd64 on the command line.

    cc on x86                        cc on SPARC
    --------------------------       --------------------------
    -D__MATHERR_ERRNO_DONTCARE       -D__MATHERR_ERRNO_DONTCARE
    -dalign                          -fns
    -fns                             -fsimple=2
    -nofstore                        -fsingle
    -fsimple=2                       -xalias_level=basic
    -fsingle                         -xarch=v8plusa
    -xarch=sse2                      -xbuiltin=%all
    -xbuiltin=%all                   -xcache=16/32/1:4096/64/1
    -xcache=64/64/2:1024/64/8        -xchip=ultra2
    -xchip=opteron                   -xdepend
    -xlibmil                         -xlibmil
    -xlibmopt                        -xlibmopt
    -xO5                             -xmemalign=8s
                                     -xO5
                                     -xprefetch=auto,explicit

    CC on x86                        CC on SPARC
    --------------------------       --------------------------
    -xO5                             -xO5
    -xarch=sse2                      -xarch=v8plusa
    -xcache=64/64/2:1024/64/8        -xcache=16/32/1:4096/64/1
    -xchip=opteron                   -xchip=ultra2
    -fsimple=2                       -xmemalign=8s
    -fns=yes                         -fsimple=2
    -ftrap=%none                     -fns=yes
    -xlibmil                         -ftrap=%none
    -xlibmopt                        -xlibmil
    -xbuiltin=%all                   -xlibmopt
    -nofstore                        -xbuiltin=%all

    f95 on x86                       f95 on SPARC
    --------------------------       --------------------------
    -xO5                             -xO5
    -xarch=sse2                      -xarch=v8plusa
    -xcache=64/64/2:1024/64/8        -xcache=16/32/1:4096/64/1
    -xchip=opteron                   -xchip=ultra2
    -dalign                          -xdepend=yes
    -fsimple=2                       -xpad=local
    -fns=yes                         -xvector=yes
    -ftrap=common                    -xprefetch=auto,explicit
    -xlibmil                         -dalign
    -xlibmopt                        -fsimple=2
    -nofstore                        -fns=yes
                                     -ftrap=common
                                     -xlibmil
                                     -xlibmopt
                                     -fround=nearest

Debugging and Tuning on the AMD64 Platform

 
The dbx Command-Line Debugger

Sun Studio software on SPARC based platforms includes two dbx binaries: a 32-bit dbx that can debug 32-bit programs only, and a 64-bit dbx that can debug both 32-bit and 64-bit binaries. When dbx starts, it determines which of its binaries to execute. On an x86 Solaris system running the 64-bit kernel, the 64-bit dbx is the default.

For details about the limitations of dbx, see the dbx readme.

Tools for Tuning 64-bit Code on x86 Solaris 10 Platforms

The Sun Studio Performance Tools can help find bottlenecks in C, C++, Fortran, and Java applications. In many ways, these tools are more flexible and detailed than prof and gprof. They can help answer the following kinds of questions:

  • Which source lines and instructions are consuming the most resources?
  • How did the program arrive at this point in the execution?

For more information about the performance tools in Sun Studio, see the Developer Portal.

How to Use the Performance Tools

First, record an application's run with the Collector, then view and analyze the results with Analyzer. More details can be found at the Performance Analyzer resources page.

Limitations of the Performance Tools on 64-bit x86 Platforms

  • When profiling Java, call stack walking will be disabled when frame pointers are not supplied by the Java compiler.
  • The dataspace profiling option is not supported on x86.

For more details, see the performance analyzer readme.

When Code Needs to be Recompiled for the Performance Tools

In general, you do not need to recompile your application to use the debugging and performance tools. However, the ability to show full call stacks depends on the use of frame pointers. On i386 and AMD64 processors, frame pointers are used in C++, but they are disabled for C code compiled with -fast. -fast is a macro option whose expansion includes -xregs=frameptr, which allows the compiler to reuse the frame-pointer register for other purposes as an optimization. Following -fast with -xregs=no%frameptr on the command line disables this optimization and ensures that frame pointers are used. See the explanation of -xregs in the cc(1) man page for more information.
