Debugging Optimized Code

HP/DDE supports debugging of code compiled at optimization levels 2 and below. The following is a brief description of the compiler optimization options that are compatible with debugging:

+O0: Minimal optimization. This is the default.
+O1: Basic block level optimization.
+O2: Full optimization within each procedure in a file. (Can also be invoked with the compiler option -O.)

For more information about optimization levels, consult your compiler documentation.

Ordinarily, you first compile and debug your program without optimization. All or nearly all of the bugs in your program will show up in the unoptimized version.

After eliminating all the bugs that you can find, turn on optimization (compile with -O). If the program behaves incorrectly, scan the source code for the most common kinds of bugs that appear for the first time in optimized code:

Uninitialized variables
Out-of-bounds array references
Variable references based on the assumption that two variables are adjacent in memory
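As a rough sketch in C, the three patterns above might look like the following. The function and type names are hypothetical and are not taken from any HP example. Each buggy form appears only in a comment, because its behavior is undefined and can differ between +O0 and +O2; the code shows one well-defined alternative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustrations; none of these names come from HP/DDE. */

/* Uninitialized variable.
 * Buggy form:  int total;   (never set before "total += v[i]")
 * Unoptimized, the stack slot may happen to hold 0; optimized, the
 * variable may live in a register holding leftover data. */
int sum_values(const int *v, size_t n)
{
    int total = 0;                 /* fix: initialize explicitly */
    size_t i;
    for (i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Out-of-bounds array reference.
 * Buggy form:  return v[n];   (one element past the end)
 * What sits past the end of the array can change with optimization. */
int element_at(const int *v, size_t n, size_t i)
{
    assert(i < n);                 /* fix: check the index against the bound */
    return v[i];
}

/* Assumed adjacency of two variables.
 * Buggy form:  int lo, hi;  int *p = &lo;  p[1] = 9;   (expects to set hi)
 * The optimizer may reorder lo and hi or keep them in registers. */
struct range {
    int bounds[2];                 /* fix: an array guarantees adjacency */
};

void set_bounds(struct range *r, int lo, int hi)
{
    r->bounds[0] = lo;
    r->bounds[1] = hi;
}
```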

These kinds of problems, however, are often very difficult to find by examining the source code. If you cannot determine the reason for the program's misbehavior, you need to debug the optimized code.

This section provides background information on the differences between optimized code and unoptimized code.

For tutorial and task-oriented information on how to debug optimized code using the debugger, see the online help.

Optimized Code and Unoptimized Code

Source-level debugging of unoptimized code is relatively easy because there is a simple correspondence between source-code statements and the assembly-code instructions into which they are translated. Each statement is translated into a contiguous series of instructions, which are executed in sequence. The source and object code are isomorphic: they have essentially the same form. Also, program variables are stored in memory and are therefore easy to access.

Optimization destroys isomorphism. Optimization is a series of transformations performed on the object code in order to make the program run faster. An optimized program performs the same tasks and produces the same results that the source code specifies. However, the order in which these tasks are performed and the way in which they are performed can change drastically.

In effect, optimization transforms a program into a different program.

The executable program you are debugging is actually not the same program as the source program.
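To make this concrete, the sketch below pairs a small C function with a hand-written approximation of what an optimizer might effectively execute in its place. The names scale_source and scale_transformed are hypothetical, and the transformed version is an assumption for illustration (loop-invariant code motion plus replacing a multiply with a running add), not the output of any HP compiler.

```c
#include <assert.h>
#include <stddef.h>

/* As written in the source: the product a * b is recomputed on every
 * iteration, and each element is computed with a multiply. */
void scale_source(int *out, size_t n, int a, int b)
{
    size_t i;
    assert(out != NULL || n == 0);
    for (i = 0; i < n; i++) {
        int factor = a * b;
        out[i] = (int)i * factor;
    }
}

/* Hand-written approximation of a transformed version (an assumption,
 * valid for inputs that do not overflow int): the invariant product is
 * computed once before the loop, and the multiply by i becomes a
 * running addition. */
void scale_transformed(int *out, size_t n, int a, int b)
{
    int factor = a * b;            /* hoisted out of the loop */
    int value = 0;                 /* stands in for i * factor */
    size_t i;
    assert(out != NULL || n == 0);
    for (i = 0; i < n; i++) {
        out[i] = value;
        value += factor;
    }
}
```

Both functions fill the array with the same values, but in the second one the assignment to factor no longer happens where the source says it does, and there is no multiply by i to step through.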

In addition, program variables are often kept in registers instead of memory and are therefore more difficult to access.

The following sections describe these problems in detail.

What Optimization Does to Program Logic

Figure 8-4 “Unoptimized Code: Statement-to-Instruction Mapping” shows how source-code statements map to object-code instructions in unoptimized code. Every instruction in the object code corresponds to a single statement in the source code. And, in general, every statement in the source corresponds to a sequential group of instructions in the object code. (There are a few exceptions; a loop, for example, may be broken up into two groups of instructions, one at the beginning and one at the end of the loop.)

Figure 8-4 Unoptimized Code: Statement-to-Instruction Mapping

This means that even though it is the object code that is being executed and not the source code, a view of the source code in the debugger can still give you an accurate view of what the object code is doing. For example, when you step from statement 1 to statement 2, instructions 1 through 5 are executed in order, and the current location is now instruction 6, corresponding to statement 2.

Figure 8-5 “Optimized Code: Statement-to-Instruction Mapping” shows several things that happen to source-code statements in optimized code:

As before, a source-code statement corresponds to several instructions. But these instructions are no longer contiguous; instead, there may be several groups of instructions, called fragments because they represent only part of a statement. (A fragment is formally defined as a maximal set of contiguous instructions corresponding to the same source statement.) In the example, the statement on line 11 corresponds to three fragments: instruction 4, instructions 9 and 10, and instruction 12.
Instructions in the object code can now correspond to more than one source statement. In the example, instruction 4 is associated with the statements on lines 11, 12, 13, 16, 17, and 18.
The order of statement execution can change. In the example, the instructions from line 12 both begin and end execution before the instructions from line 10, though some of the instructions from those lines are interleaved.

Figure 8-5 Optimized Code: Statement-to-Instruction Mapping

For all of these reasons, the order in which instructions are executed no longer corresponds to the ordering of the source-code statements. In the example, instructions 9 through 14 come from lines 11, 12, 10, 13, 10, 14, and 13.
Some source-code statements, such as line 19 in the example, have no corresponding object code at all. This can happen for several reasons, such as the elimination of code that is never executed (dead code).
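The fragment definition given above (a maximal set of contiguous instructions corresponding to the same source statement) can be sketched in C as a count of runs in an instruction-to-line table. This is only an illustration under simplifying assumptions: the table below is made up rather than copied from Figure 8-5, each instruction is attributed to a single source line, and count_fragments is a hypothetical helper, not part of the debugger.

```c
#include <assert.h>
#include <stddef.h>

/* Made-up table: sample_line_of[i] is the source line that instruction i
 * came from. Simplification: one source line per instruction. */
static const int sample_line_of[] = { 10, 11, 11, 12, 10, 13, 11 };
static const size_t sample_count =
    sizeof sample_line_of / sizeof sample_line_of[0];

/* Count the fragments of a source line: each maximal run of contiguous
 * instructions attributed to that line is one fragment. */
size_t count_fragments(const int *line_of, size_t n, int line)
{
    size_t count = 0;
    size_t i;
    int in_fragment = 0;

    assert(line_of != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (line_of[i] == line) {
            if (!in_fragment) {
                count++;           /* a new run starts here */
                in_fragment = 1;
            }
        } else {
            in_fragment = 0;       /* the current run, if any, has ended */
        }
    }
    return count;
}
```

In this table, line 11 has two fragments (instructions 1 and 2, then instruction 6), and a line with no entries at all, such as an eliminated statement, has zero.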

When you debug unoptimized code:

The source display gives you a good idea of where you are in the program. An arrow points to your current location; you know that the statements (and instructions) before the arrow have executed, and that the statements (and instructions) after the arrow have not yet executed.
You can examine the current state of program data, because variables are stored in memory and their values are always accessible.

This works because, in unoptimized code, stepping from one source statement to the next actually steps the debugger from one group of assembly instructions to the next. Because the instruction groups correspond exactly to the source statements, it looks as if the source code itself is executing.

When you debug optimized code, however, this correspondence breaks down:

An arrow cannot accurately represent your current location in the source code. When an arrow points to a given statement in the source code, it is likely that not all the statements before that statement have finished executing, and that some of the statements after that statement have at least partly executed.
It is more difficult to determine the current state of program data, because variables may be stored in registers and access to them may be unreliable. Also, the order of assignments may be changed from the order in the source code.

In fact, with optimized code it is not really meaningful to talk of the current location in the source program, only of the instructions that are actually executing.

What Optimization Does to Data

In unoptimized code, the values of all program variables are kept in memory. Every time the value of a variable changes, the new value is stored in memory. This means that at any time, the debugger can determine the current value of a variable.

In optimized code, the values of variables are kept in registers instead of memory as much as possible, because register access is much faster than memory access. In addition, some variables may be eliminated and replaced by constants. Therefore, it is much more difficult for the debugger to track the current value associated with a variable.
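A small C sketch of variables being replaced by constants follows. The function names are hypothetical, and area_transformed is a hand-written guess at the effective result of constant propagation and folding, not actual compiler output.

```c
#include <assert.h>

/* As written in the source: three named variables. */
int area_source(void)
{
    int width = 6;
    int height = 7;
    int area = width * height;
    return area;
}

/* Hand-written approximation of the transformed code (an assumption):
 * the product is folded into a constant, so width, height, and area
 * need no storage at all and there is nothing left to inspect. */
int area_transformed(void)
{
    return 42;
}

/* The observable result is unchanged; only the variables are gone. */
void check_area(void)
{
    assert(area_source() == area_transformed());
}
```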

Sometimes, too, there may be multiple copies of a variable—in loop optimizations, for instance. In such cases it is often impossible to obtain a meaningful value for the variable.
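One way multiple copies can arise is loop unrolling with split accumulators. The C sketch below is a hand-written approximation under that assumption; sum_source and sum_transformed are hypothetical names, and a real optimizer may transform the loop differently.

```c
#include <assert.h>
#include <stddef.h>

/* As written in the source: a single variable, sum. */
int sum_source(const int *v, size_t n)
{
    int sum = 0;
    size_t i;
    assert(v != NULL || n == 0);
    for (i = 0; i < n; i++)
        sum += v[i];
    return sum;
}

/* Hand-written approximation of an unrolled loop (an assumption):
 * sum is split into two partial copies that are combined only at the
 * end. In the middle of the loop, neither copy holds the value that
 * the source program calls sum. */
int sum_transformed(const int *v, size_t n)
{
    int sum_even = 0;              /* partial copy for even indexes */
    int sum_odd = 0;               /* partial copy for odd indexes */
    size_t i = 0;
    assert(v != NULL || n == 0);
    for (; i + 1 < n; i += 2) {
        sum_even += v[i];
        sum_odd += v[i + 1];
    }
    if (i < n)                     /* leftover element when n is odd */
        sum_even += v[i];
    return sum_even + sum_odd;
}
```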
