Optimized Code Debugging With Sun Studio dbx

   
By Nasser Nouri, November 2008  

Note: This article describes features in Sun Studio Express 11/08.

Debugging optimized code is inherently difficult for software developers, especially when applications are compiled with high-level compiler optimization options such as -O3, -O4, and -O5. The higher the optimization level, the less likely it is that the lines of source code correlate with the instructions that are generated by the compiler.

In order to be debugged, an application needs to be compiled with the -g option. The -g option is needed for source-level debugging. The debugging information that is generated by the compiler defines which sequence of instructions is associated with each line of source code in the application.

Usually, the compiler generates the debugging information for a compilation unit before it performs optimization. Hence, the correlation between a source line and its sequence of instructions might be distorted after the optimizing transformations. For instance, the program counter (PC) might bounce back and forth when you issue the next or step commands several times during a debugging session. The bouncing PC might result from any of the following optimizations: dead code elimination, inlining, tail call optimization, common subexpression elimination, invariant code motion, and instruction scheduling.

Despite all these optimization steps, the compiler can still generate useful debugging information for function parameters and local variables using location lists. Location lists are used whenever an object such as a function parameter or a local variable whose location is being described can change location during its lifetime. dbx uses location lists to display the values of function parameters and local variables.

This article describes how effectively dbx can be used to debug optimized code when the code generated by the compiler has the location list debugging information.

The dbx debugger is included in the Sun Studio Tools software. dbx runs on Solaris, OpenSolaris, and Linux platforms.

Contents
 
Under the Hood
The Debugging Session
Summary
 
Under the Hood

The location list information is defined in the DWARF Debugging Information Format, Version 3 standard. DWARF is a debugging file format used by many compilers and debuggers to support source-level debugging. It addresses the requirements of a number of procedural languages, such as C, C++, and Fortran. The DWARF document defines the format for the information generated by compilers, assemblers, and linkage editors that is necessary for symbolic, source-level debugging.

This article focuses only on the location list aspects of the DWARF document. However, you are encouraged to glance through the entire DWARF document if you are interested in learning more about symbolic source-level debugging.

Location lists are used whenever the object whose location is being described can change location during its lifetime. For example, a local variable can reside in different locations during its lifetime, which could be different registers or on the stack. Location lists are contained in a separate object file section called .debug_loc.

A location list entry consists of a beginning address offset, an ending address offset, and a description of the object's location over that range. Both the beginning and ending address offsets are relative to the applicable base address of the compilation unit referencing this location list. The Sun Studio compilers for SPARC platforms use the start of the function as the base address. The gcc compilers use the start of the compilation unit (object file or module) as the base address.

The lifetime of an object (a local variable or a function parameter) is described as a collection of ranges, each paired with the location where the object resides while the PC is within that range. For the compilers for SPARC platforms, each range consists of a beginning offset and an end offset from the start of the function. For gcc compilers, each range consists of a beginning offset and an end offset from the start of the module.

For each range, the object can reside in a register or a stack location. For registers, the register number is used, while for stack locations, the offset from the frame base is used, to describe the location.
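The structure described above can be sketched in C. This is only an illustration, not the dbx or DWARF reader implementation: the type names, the loc_lookup() function, and the sample_list data are all hypothetical. The one convention taken from DWARF is that the ending offset is the first address past the range, so each range is half-open.

```c
#include <stddef.h>
#include <stdint.h>

/* How the location is expressed within one range (hypothetical names). */
enum loc_kind { LOC_REGISTER, LOC_FRAME_OFFSET };

struct loc_entry {
    uint64_t begin;      /* first offset covered, relative to the base address */
    uint64_t end;        /* first offset NOT covered (half-open range) */
    enum loc_kind kind;
    int64_t value;       /* register number, or offset from the frame base */
};

/* Made-up object: in register 5 for offsets [0x10, 0x18),
   then at frame base - 8 for offsets [0x18, 0x30). */
const struct loc_entry sample_list[] = {
    { 0x10, 0x18, LOC_REGISTER,      5 },
    { 0x18, 0x30, LOC_FRAME_OFFSET, -8 },
};

/* Return the entry whose range covers pc, or NULL when the object
   has no described location at that address. */
const struct loc_entry *
loc_lookup(const struct loc_entry *list, size_t n, uint64_t base, uint64_t pc)
{
    for (size_t i = 0; i < n; i++) {
        if (pc >= base + list[i].begin && pc < base + list[i].end)
            return &list[i];
    }
    return NULL;
}
```

With a base address of 0x1000, a lookup at 0x1010 in this sketch yields the register entry, a lookup at 0x1018 yields the frame-offset entry, and a lookup at 0x1030 yields NULL because no range covers that address.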

Let's use the following example to understand in detail the location list information that is generated by compilers.

The a.c file is listed below:

#include <stdio.h>

f3 (int k) {
    printf("f3\n");
    printf("k = %d\n", k);
}

f2 (int j) {
    int k = 10;
    printf("f2\n");
    printf("j = %d\n", j);
    k += j;
    printf("k = %d\n", k);
    f3(k);
}

f1 (int i) {
    int k = 100;
    printf ("f1\n");
    printf("i = %d\n", i);
    k += i;
    printf("k = %d\n", k);
    f3(k);
    f2(i);
}

main (int argc, char *argv[])
{
    printf ("main\n");
    f1(123);
    return 0;
}
 

The gcc compiler is used to compile the a.c file with the -O2 optimization option.

% gcc -g -O2 a.c
 


The gcc compiler generates an a.out executable file.

You can also use the Sun Studio C compiler to generate the a.out file:

% cc -g -O2 a.c
 

It is assumed that you are already familiar with dbx instruction-level debugging commands. Otherwise, the following article is recommended reading before proceeding with the rest of this section: AMD64 Instruction-Level Debugging with Sun Studio dbx.

The dis command in dbx is used to disassemble the f2() function. We will refer to the f2() function assembly code later in the article.

(dbx) dis f2
0x0000000000400540: f2       :  pushq    %rbx
0x0000000000400541: f2+0x0001:  movl     %edi,%ebx
0x0000000000400543: f2+0x0003:  movl     $_IO_stdin_used+0xf,%edi
0x0000000000400548: f2+0x0008:  call     puts[PLT]      [ 0x400448, .-0x100 ]
0x000000000040054d: f2+0x000d:  movl     %ebx,%esi
0x000000000040054f: f2+0x000f:  movl     $_IO_stdin_used+0x12,%edi
0x0000000000400554: f2+0x0014:  addl     $0x000000000000000a,%ebx
0x0000000000400557: f2+0x0017:  xorl     %eax,%eax
0x0000000000400559: f2+0x0019:  call     printf [PLT]   [ 0x400438, .-0x121 ]
0x000000000040055e: f2+0x001e:  movl     %ebx,%esi
0x0000000000400560: f2+0x0020:  movl     $_IO_stdin_used+0x7,%edi
0x0000000000400565: f2+0x0025:  xorl     %eax,%eax
0x0000000000400567: f2+0x0027:  call     printf [PLT]   [ 0x400438, .-0x12f ]
0x000000000040056c: f2+0x002c:  movl     %ebx,%edi
0x000000000040056e: f2+0x002e:  popq     %rbx
0x000000000040056f: f2+0x002f:  jmp      f3     [ 0x400520, .-0x4f ]
0x0000000000400574: f2+0x0034:  nop
0x0000000000400578: puts+0x0130:        nop
0x000000000040057c: puts+0x0134:        nop
 

The dwarfdump utility is used to print the DWARF information:

     % dwarfdump a.out > dwarf.log
 

The dwarf.log file contains all sorts of DWARF information that dbx needs for source-level debugging. However, we are only interested in the location list information that is generated for the f2() function. First we need to make sure the .debug_loc section is generated by the compiler:

.debug_loc format <o b e l> means section-offset begin-addr end-addr length-of-block-entry
         <obel> 0x00000000 0x000000000 0x00000001        2
         <obel> 0x00000014 0x000000001 0x0000001c        2
         <obel> 0x00000028 0x000000000 0x00000000        0
         <obel> 0x00000038 0x000000000 0x00000008        1
         <obel> 0x0000004b 0x000000008 0x00000017        1
         <obel> 0x0000005e 0x000000017 0x0000001c        1
         <obel> 0x00000071 0x000000000 0x00000000        0
         <obel> 0x00000081 0x000000020 0x00000021        2
         <obel> 0x00000095 0x000000021 0x00000054        2
         ......
         ......

 

The .debug_loc section contains location lists for the entire a.c program.

Since the a.c program is compiled with the gcc compiler, the module low_pc is used for the base address. The low_pc of the module is defined with the DW_AT_low_pc  0x400520 statement.

<0><   11>     DW_TAG_compile_unit
               DW_AT_producer               GNU C 4.2.0
               DW_AT_language               DW_LANG_C89
               DW_AT_name                   a.c
               DW_AT_comp_dir               /home/nassern/test/dbx/loc_list_demo2
               DW_AT_low_pc                 0x400520
               DW_AT_high_pc                0x4005ff
               DW_AT_stmt_list              323
               ....
               ....
 

The following is DWARF information for the f2() function:

<1><  715>      DW_TAG_subprogram
                DW_AT_external              yes(1)
                DW_AT_name                  f2
                DW_AT_decl_file             1 /home/nassern/test/dbx/loc_list_demo2/a.c
                DW_AT_decl_line             8
                DW_AT_prototyped            yes(1)
                DW_AT_type                  <87>
                DW_AT_low_pc                0x400540
                DW_AT_high_pc               0x400574
                DW_AT_frame_base            <loclist with 2 entries follows>
                        [ 0]<lowpc=0x20><highpc=0x21>DW_OP_breg7+8
                        [ 1]<lowpc=0x21><highpc=0x54>DW_OP_breg7+16
                DW_AT_sibling               <778>

<2><   751>     DW_TAG_formal_parameter
                DW_AT_name                  j
                DW_AT_decl_file             1 /home/nassern/test/dbx/loc_list_demo2/a.c
                DW_AT_decl_line             8
                DW_AT_type                  <87>
                DW_AT_location              <loclist with 3 entries follows>
                        [ 0]<lowpc=0x20><highpc=0x28>DW_OP_reg5
                        [ 1]<lowpc=0x28><highpc=0x37>DW_OP_reg3
                        [ 2]<lowpc=0x37><highpc=0x3e>DW_OP_reg4

<2><   764>     DW_TAG_variable
                DW_AT_name                  k
                DW_AT_decl_file             1 /home/nassern/test/dbx/loc_list_demo2/a.c
                DW_AT_decl_line             9
                DW_AT_type                  <87>
                DW_AT_location              <loclist with 2 entries follows>
                        [ 0]<lowpc=0x37><highpc=0x4f>DW_OP_reg3
                        [ 1]<lowpc=0x4f><highpc=0x54>DW_OP_reg5

 

The DW_AT_frame_base attribute describes the frame-base location list. Optimizing compilers for the x86 and x64 architectures often use the frame pointer register (%rbp) to store temporary values. The frame_base construct is therefore used to calculate the start of each frame in the absence of a frame pointer.

                DW_AT_frame_base            <loclist with 2 entries follows>
                        [ 0]<lowpc=0x20><highpc=0x21>DW_OP_breg7+8
                        [ 1]<lowpc=0x21><highpc=0x54>DW_OP_breg7+16

 

As shown above, there are two entries for frame-base location lists. Each entry specifies how to calculate the start of a frame if the program counter (PC) is pointing to any instruction within the specified range.

For each entry, dbx first determines the sequence of instructions that is associated with the location list entry by adding the base address to the beginning offset and to the ending offset. As you might recall, the base address is 0x400520. Adding the 0x20 offset to the base address (0x400520) gives the address of the first instruction that is covered by the first entry (0x400540).

0x0000000000400540: f2      :  pushq    %rbx
 

The above instruction is the only instruction that is covered by the first range, because the ending offset is exclusive: the 0x21 offset belongs to the second location list entry.
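The address arithmetic for the two frame-base entries can be checked with a short C sketch. The addr_range type and the resolve_range() and range_covers() functions are hypothetical names introduced for illustration; only the base address and the offsets come from the dwarfdump output.

```c
#include <stdint.h>

/* Absolute, half-open address range [first, limit) for one entry. */
struct addr_range {
    uint64_t first;   /* address of the first covered instruction */
    uint64_t limit;   /* first address past the covered range */
};

/* Add the base address to both offsets of a location list entry. */
struct addr_range
resolve_range(uint64_t base, uint64_t begin_off, uint64_t end_off)
{
    struct addr_range r = { base + begin_off, base + end_off };
    return r;
}

/* Nonzero when pc falls inside the range. */
int
range_covers(struct addr_range r, uint64_t pc)
{
    return pc >= r.first && pc < r.limit;
}
```

With the base address 0x400520, resolve_range(0x400520, 0x20, 0x21) gives the range 0x400540 up to but not including 0x400541, which contains only the pushq instruction. The second entry, resolve_range(0x400520, 0x21, 0x54), starts at 0x400541 and ends just before 0x400574.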

Next, dbx needs to map DW_OP_breg7 to an actual hardware register of the system on which it is running. According to the DWARF Register Number Mapping section of the x64 ABI, DWARF register 7 is the %rsp stack pointer register. Hence, for the first location list entry, the frame base address is calculated by adding eight to the contents of the %rsp register.

Similarly, for the second entry, the frame base address is calculated by adding sixteen to the contents of the %rsp register. This calculation applies to all of the following instructions:

0x0000000000400541: f2+0x0001:  movl     %edi,%ebx
0x0000000000400543: f2+0x0003:  movl     $_IO_stdin_used+0xf,%edi
0x0000000000400548: f2+0x0008:  call     puts[PLT]     [ 0x400448, .-0x100 ]
0x000000000040054d: f2+0x000d:  movl     %ebx,%esi
0x000000000040054f: f2+0x000f:  movl     $_IO_stdin_used+0x12,%edi
0x0000000000400554: f2+0x0014:  addl     $0x000000000000000a,%ebx
0x0000000000400557: f2+0x0017:  xorl     %eax,%eax
0x0000000000400559: f2+0x0019:  call     printf [PLT]   [ 0x400438, .-0x121 ]
0x000000000040055e: f2+0x001e:  movl     %ebx,%esi
0x0000000000400560: f2+0x0020:  movl     $_IO_stdin_used+0x7,%edi
0x0000000000400565: f2+0x0025:  xorl     %eax,%eax
0x0000000000400567: f2+0x0027:  call     printf [PLT]   [ 0x400438, .-0x12f ]
0x000000000040056c: f2+0x002c:  movl     %ebx,%edi
0x000000000040056e: f2+0x002e:  popq     %rbx
0x000000000040056f: f2+0x002f:  jmp      f3     [ 0x400520, .-0x4f ]
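The two frame-base entries can be read as a small function of the PC and the current %rsp value. The following C sketch hard-codes the entries for f2() from the dwarfdump output. The function name f2_frame_base() and the choice to return 0 when no entry covers the PC are assumptions made for this illustration, not a description of how dbx is implemented.

```c
#include <stdint.h>

#define CU_BASE 0x400520ULL   /* DW_AT_low_pc of the compile unit */

/* Frame base of f2() for a given PC and %rsp value. Returns 0 when the
   PC lies outside both location list entries (a sketch-only convention). */
uint64_t
f2_frame_base(uint64_t pc, uint64_t rsp)
{
    uint64_t off = pc - CU_BASE;      /* offset from the base address */

    if (off >= 0x20 && off < 0x21)
        return rsp + 8;               /* [0] DW_OP_breg7+8  */
    if (off >= 0x21 && off < 0x54)
        return rsp + 16;              /* [1] DW_OP_breg7+16 */
    return 0;
}
```

The pushq %rbx instruction at 0x400540 lowers %rsp by eight bytes, so the offset grows from 8 to 16 in the second entry and both entries yield the same frame base. For example, if %rsp were 0x7fff0000 at 0x400540 and 0x7ffefff8 at 0x400541, the sketch returns 0x7fff0008 in both cases.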
                    
 

The formal parameter j and the local variable k reside in registers and do not need the frame-base location lists.

<2><  751>     DW_TAG_formal_parameter
               DW_AT_name                  j
               DW_AT_decl_file             1 /home/nassern/test/dbx/loc_list_demo2/a.c
               DW_AT_decl_line             8
               DW_AT_type                  <87>
               DW_AT_location              <loclist with 3 entries follows>
                       [ 0]<lowpc=0x20><highpc=0x28>DW_OP_reg5
                       [ 1]<lowpc=0x28><highpc=0x37>DW_OP_reg3
                       [ 2]<lowpc=0x37><highpc=0x3e>DW_OP_reg4

 

Based on the location list entries shown above, the formal parameter j resides in the DW_OP_reg5 (%rdi) register if the PC points to the 0x400540-0x400543 instructions. Similarly, the j parameter resides in the DW_OP_reg3 (%rbx) register if the PC points to the 0x400548-0x400554 instructions. And finally, the j parameter resides in the DW_OP_reg4 (%rsi) register if the PC points to the 0x400557-0x400559 instructions.

<2><  764>     DW_TAG_variable
               DW_AT_name                  k
               DW_AT_decl_file             1 /home/nassern/test/dbx/loc_list_demo2/a.c
               DW_AT_decl_line             9
               DW_AT_type                  <87>
               DW_AT_location              <loclist with 2 entries follows>
                       [ 0]<lowpc=0x37><highpc=0x4f>DW_OP_reg3
                       [ 1]<lowpc=0x4f><highpc=0x54>DW_OP_reg5
 

The location list entries for the local variable k are shown above. The local variable k resides in the DW_OP_reg3 (%rbx) register if the PC points to the 0x400557-0x40056e instructions, and it resides in the DW_OP_reg5 (%rdi) register if the PC points to the 0x40056f instruction.
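The register lookups for j and k can be summarized in a short C sketch. The location list data is copied from the dwarfdump output, and the register names follow the x64 ABI numbering for DWARF registers 0 through 7. The struct and the functions reg_of_j(), reg_of_k(), and dwarf_reg_name() are hypothetical names chosen for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define CU_BASE 0x400520ULL   /* DW_AT_low_pc of the compile unit */

struct reg_range {
    uint64_t begin;   /* offsets from CU_BASE, half-open [begin, end) */
    uint64_t end;
    int dwarf_reg;    /* DWARF register number */
};

/* Location lists for j and k in f2(), from the dwarfdump output. */
static const struct reg_range j_locs[] = {
    { 0x20, 0x28, 5 }, { 0x28, 0x37, 3 }, { 0x37, 0x3e, 4 },
};
static const struct reg_range k_locs[] = {
    { 0x37, 0x4f, 3 }, { 0x4f, 0x54, 5 },
};

/* x64 ABI names for DWARF registers 0 through 7. */
const char *
dwarf_reg_name(int reg)
{
    static const char *const names[] = {
        "%rax", "%rdx", "%rcx", "%rbx", "%rsi", "%rdi", "%rbp", "%rsp"
    };
    return (reg >= 0 && reg < 8) ? names[reg] : "?";
}

/* DWARF register holding the object at pc, or -1 if no entry covers pc. */
static int
find_reg(const struct reg_range *locs, size_t n, uint64_t pc)
{
    for (size_t i = 0; i < n; i++) {
        if (pc >= CU_BASE + locs[i].begin && pc < CU_BASE + locs[i].end)
            return locs[i].dwarf_reg;
    }
    return -1;
}

int
reg_of_j(uint64_t pc)
{
    return find_reg(j_locs, sizeof j_locs / sizeof j_locs[0], pc);
}

int
reg_of_k(uint64_t pc)
{
    return find_reg(k_locs, sizeof k_locs / sizeof k_locs[0], pc);
}
```

In this sketch, reg_of_j(0x400540) returns 5 (%rdi) while reg_of_k(0x400540) returns -1, because no entry for k covers the first instruction of f2(). At 0x40055e the situation is reversed: reg_of_k() returns 3 (%rbx) and reg_of_j() returns -1.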

The Debugging Session

This section demonstrates how dbx uses location lists to print the value of formal parameters and local variables.

dbx is used to debug the a.c program that is compiled with the -O2 option.

% dbx a.out
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.7' in your .dbxrc
Reading a.out
Reading ld-linux-x86-64.so.2
Reading libc.so.6
(dbx) stop in f2
(2) stop in f2
(dbx) run
 

We instruct dbx to stop at the beginning of the function f2() by setting a breakpoint as shown above.

Running: a.out
(process id 3821)
main
f1
i = 123
k = 223
f3
k = 223
stopped in f2 at line 8 in file "a.c"
    8   f2 (int j) {
(dbx) where
=>[1] f2(j = 123), line 8 in "a.c"
  [2] main(argc = <value of 'argc' not available>, argv = <value of 'argv' not available>), line 30 in "a.c"
 

The value of the formal parameter j can be printed since the PC is pointing to the 0x400540 instruction. The value of the local variable k cannot be printed since the PC is not within any range that is specified in the location list for the local variable k. Hence, the <value of 'k' not available> message is printed.

(dbx) print $pc
$pc = 0x400540
(dbx) print j
j = 123
(dbx) print k
dbx: <value of 'k' not available>
(dbx) next
stopped in f2 at line 8 in file "a.c"
    8   f2 (int j) {
 

The j parameter resides in the %rdi register as stated in the location list entry.

(dbx) print $pc
$pc = 0x400541
(dbx) print -flx $rdi
$rdi = 0x7b
(dbx) next
stopped in f2 at line 10 in file "a.c"
   10       printf("f2\n");
(dbx) print $pc
$pc = 0x400543
(dbx) print j
j = 123
(dbx) print k
dbx: <value of 'k' not available>
(dbx) next
stopped in f2 at line 11 in file "a.c"
   11       printf("j = %d\n", j);
 

The j parameter resides in the %rbx register since the PC is pointing to the 0x40054d instruction. The %rdi register is now used by the compiler to store some other value.

(dbx) print $pc
$pc = 0x40054d
(dbx) print j
j = 123
(dbx) print $rbx
$rbx = 123ULL
(dbx) print -flx $rbx
$rbx = 0x7b
(dbx) print -flx $rdi
$rdi = 0x1
(dbx) next
stopped in f2 at line 12 in file "a.c"
   12       k += j;
 

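The PC is now pointing to the 0x400554 instruction, so the location list entry still places the j parameter in the %rbx register. The movl %ebx,%esi instruction at 0x40054d has already copied the same value into the %rsi register, which becomes the location of j starting at the 0x400557 instruction.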

(dbx) print $pc
$pc = 0x400554
(dbx) print k
dbx: <value of 'k' not available>
(dbx) print j
j = 123
(dbx) print $rsi
$rsi = 123ULL
(dbx) print -flx $rsi
$rsi = 0x7b
(dbx) next
stopped in f2 at line 11 in file "a.c"
   11       printf("j = %d\n", j);
 

The value of the local variable k can be printed since the PC is pointing to the 0x400557 instruction. The PC is within the range that is specified for the local variable k.

(dbx) print $pc
$pc = 0x400557
(dbx) print k
k = 133
(dbx) next
j = 123
stopped in f2 at line 13 in file "a.c"
   13       printf("k = %d\n", k);
 

The value of j cannot be printed since the PC is out of range for the formal parameter j.

(dbx) print $pc
$pc = 0x40055e
(dbx) print j
dbx-g: <value of 'j' not available>
(dbx) print k
k = 133
(dbx) next
k = 133
stopped in f2 at line 14 in file "a.c"
   14       f3(k);
 

The local variable k resides in the %rbx register when the PC is pointing to the 0x40056c instruction.

(dbx) print $pc
$pc = 0x40056c
(dbx) print -flx $rbx
$rbx = 0x85
(dbx) print $rbx
$rbx = 133ULL
(dbx) print j
dbx: <value of 'j' not available>
(dbx) next
stopped in f2 at line 15 in file "a.c"
   15   }
(dbx) print $pc
$pc = 0x40056e
(dbx) print k
k = 133
(dbx) next
stopped in f2 at line 14 in file "a.c"
   14       f3(k);
 

The local variable k resides in the %rdi register when the PC is pointing to the 0x40056f instruction. The %rbx register no longer holds k, because the popq %rbx instruction at 0x40056e has restored the value that was saved on entry to f2().

(dbx) print $pc
$pc = 0x40056f
(dbx) print k
k = 133
(dbx) print $rdi
$rdi = 133ULL
(dbx) print $rbx
$rbx = 182895161792ULL
(dbx) print -flx $rbx
$rbx = 0x2a9566b1c0
(dbx)
 
Summary

dbx lets you debug applications that are compiled with optimization options. dbx uses the location list information that is generated by the Sun Studio and gcc compilers to print the values of formal parameters and local variables.

Location lists are used whenever the object whose location is being described can change location during its lifetime. The lifetime is described as a collection of ranges, each paired with the location (a register or a stack location) where the object (a local variable or a function parameter) resides while the PC is within that range.
