Controlling Specific Optimizer Features

Most of the time, specifying optimization level 1, 2, 3, or 4 should provide you with the control over the optimizer that you need. Additional parameters are provided when you require a finer level of control.

At each level, you can turn on and off specific optimizations using the +O[no]optimization option. The optimization parameter is the name of a specific optimization technique. The optional prefix [no] disables the specified optimization.

Below is a list of advanced optimizer options, followed by detailed information on each option:

  • +Olevel=name1[,name2,...nameN]

  • +O[no]autopar

  • +O[no]dataprefetch

  • +O[no]dynsel

  • +O[no]entrysched

  • +O[no]extern[=name1,name2,...nameN]

  • +O[no]fail_safe

  • +O[no]fastaccess

  • +O[no]fltacc

  • +O[no]global_ptrs_unique[=name1,name2,...nameN]

  • +O[no]initcheck

  • +O[no]inline[=name1,name2,...nameN]

  • +Oinline_budget[=n]

  • +O[no]libcalls

  • +O[no]loop_block

  • +O[no]loop_transform

  • +O[no]loop_unroll[=unrollfactor]

  • +O[no]loop_unroll_jam

  • +O[no]moveflops

  • +O[no]multiprocessor

  • +O[no]parallel

  • +O[no]parallel_env

  • +O[no]parmsoverlap

  • +O[no]pipeline

  • +O[no]procelim

  • +O[no]promote_indirect_calls

  • +O[no]ptrs_ansi

  • +O[no]ptrs_strongly_typed

  • +O[no]ptrs_to_globals

  • +O[no]regionsched

  • +Oreusedir=directory

  • +O[no]report[=report_type]

  • +O[no]regreassoc

  • +O[no]sharedgra

  • +O[no]sideeffects

  • +O[no]signedpointers

  • +O[no]static_prediction

  • +O[no]vectorize

  • +O[no]volatile

  • +O[no]whole_program_mode

+Olevel=name1[,name2,...nameN]

Optimization levels: 1, 2, 3, 4

Default: All functions are optimized at the level specified by the ordinary +Olevel option.

This option lowers optimization to the specified level for one or more named functions. The level can be 0, 1, 2, 3, or 4. The name parameters are names of functions in the module being compiled. Use this option when one or more functions do not optimize well or properly. It must be used with an ordinary +Olevel option.

This option works the same as the OPT_LEVEL pragma described under “Optimizer Control Pragmas”. This option overrides the OPT_LEVEL pragma for the specified functions. As with the pragma, you can only lower the level of optimization; you cannot raise it above the level specified in the ordinary +Olevel option. To avoid confusion, it is best to use either this option or the OPT_LEVEL pragma rather than both.

Examples

The following command optimizes all functions at level 3, except for the functions myfunc1 and myfunc2, which it optimizes at level 1.

$ cc +O3 +O1=myfunc1,myfunc2 funcs.c main.c

The following command optimizes all functions at level 2, except for the functions myfunc1 and myfunc2, which it optimizes at level 0.

$ cc -O +O0=myfunc1,myfunc2 funcs.c main.c

+O[no]autopar

See “+O[no]autopar”.

+O[no]dataprefetch

Default: +Onodataprefetch

When +Odataprefetch is enabled, the optimizer inserts instructions within innermost loops to explicitly prefetch data from memory into the data cache. Data prefetch instructions will be inserted only for data structures referenced within innermost loops using simple loop varying addresses (that is, in a simple arithmetic progression). It is only available for PA-RISC 2.0 targets.

The math library contains special prefetching versions of vector routines. If you have a PA-RISC 2.0 application that contains operations on arrays larger than 1 megabyte in size, using +Ovectorize in conjunction with +Odataprefetch may improve performance substantially.

Use this option for applications that have high data cache miss overhead.
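As an illustration (hypothetical code, not from the HP documentation), the kind of access that qualifies is an innermost loop whose array addresses advance in a simple arithmetic progression:

double a[2000000], b[2000000];
double sum;

void accumulate(void)
{
    int i;

    /* Innermost loop with unit-stride references to a[i] and b[i]; with
       +Odataprefetch on a PA-RISC 2.0 target, the optimizer may insert
       prefetch instructions ahead of these accesses. */
    for (i = 0; i < 2000000; i++)
        sum += a[i] * b[i];
}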

+O[no]dynsel

See “+O[no]dynsel”.

+O[no]entrysched

Optimization levels: 1, 2, 3, 4

Default: +Onoentrysched

The +Oentrysched option optimizes instruction scheduling on a procedure's entry and exit sequences. Enabling this option can speed up an application. The option has undefined behavior for applications that handle asynchronous interrupts by examining the sigcontext values of caller stack operands. The option affects unwinding in the entry and exit regions.

At optimization level +O2 and higher (using data flow information), save and restore operations become more efficient.

This option can change the behavior of programs that perform stack unwind-based exception handling or asynchronous interrupt handling. The behavior of setjmp() and longjmp() is not affected.

+O[no]extern[=name1,name2,...nameN]

Optimization levels: 0, 1, 2, 3, 4

Default: +Oextern

This option is available in the LP64 data model only.

The +O[no]extern option allows you to specify which accesses to symbols in an executable or shared library (a load module) can be optimized. Use of +Onoextern creates code that cannot be included in a shared library; use +Onoextern only to build executables. Only internal symbols (those defined in the load module) can be optimized.

If +Onoextern is specified without a name list, the compiler assumes that no symbols are external to the load module being compiled, and any symbol can be optimized. If +Oextern is specified without a name list, the compiler assumes that all symbols are external to the load module being compiled and thus cannot be optimized; this is the default.

If +Oextern is specified with a name list, the compiler treats the specified symbols as external even if +Onoextern without a name list is in effect. The following example indicates that foo and bar are to eventually be imported from another load module (for example, a shared library); all other functions and data items will not be external, since +Onoextern is specified.

+Oextern=foo,bar +Onoextern

When +Onoextern is specified with a name list, the compiler treats the specified symbols as internal even if +Oextern without a name list is in effect. The following example indicates that references to baz and x may be optimized for access in the local load module. All other symbols will be subject to resolution to another load module since +Oextern is the default.

+Onoextern=baz,x

Use this option to precisely control which symbols' accesses may be optimized. Knowledge of the shared libraries used by an application, or of the exported interface of a shared library, is required. See also the HP_DEFINED_EXTERNAL pragma. The default is +Oextern with no name list.

+O[no]fail_safe

Optimization levels: 1, 2, 3

Default: +Ofail_safe

The +Ofail_safe option allows compilations with internal optimization errors to continue by issuing a warning message and restarting the compilation at +O0.

You can use +Onofail_safe at optimization levels 1, 2, 3, or 4 when you want the internal optimization errors to abort your build.

This option is disabled when compiling for parallelization.

+O[no]fastaccess

Optimization levels: 0, 1, 2, 3, 4

Default: +Onofastaccess at optimization levels 0, 1, 2 and 3, +Ofastaccess at optimization level 4

The +Ofastaccess option optimizes for fast access to global data items.

Use +Ofastaccess to improve execution speed at the expense of longer compile times.

+O[no]fltacc

Optimization levels: 2, 3, 4

The +Onofltacc option allows the compiler to perform floating-point optimizations that are algebraically correct but that may result in numerical differences. For example, this option may change the order of expression evaluation: if a, b, and c are floating-point variables, the expressions (a + b) + c and a + (b + c) may give slightly different results due to rounding. In general, these differences are insignificant.

The +Onofltacc option also enables the optimizer to generate fused multiply-add (FMA) instructions, the FMPYFADD and FMPYNFADD. These instructions improve performance but occasionally produce results that may differ from results produced by code without FMA instructions. In general, the differences are slight. FMA instructions are only available on PA-RISC 2.0 systems.

Specifying +Ofltacc disables the generation of FMA instructions as well as some other floating-point optimizations. Use +Ofltacc if it is important that the compiler evaluate floating-point expressions as it does in unoptimized code. The +Ofltacc option does not allow any optimizations that change the order of expression evaluation and therefore may affect the result.

If you are optimizing code at level 2 or higher and do not specify +Onofltacc or +Ofltacc, the optimizer will use FMA instructions, but will not perform floating-point optimizations that involve expression reordering or other optimizations that potentially impact numerical stability.

The list below identifies the different actions taken by the optimizer according to whether you specify +Ofltacc, +Onofltacc, or neither option.

Optimization Options      Expression Reordering?    FMA?

+O2                       No                        Yes
+O2 +Ofltacc              No                        No
+O2 +Onofltacc            Yes                       Yes
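The following sketch (illustrative values, not from the HP documentation) shows why expression reordering can change a result: each addition rounds to double precision, so the two groupings of the same sum need not be equal.

#include <stdio.h>

int main(void)
{
    double a = 1.0e16, b = -1.0e16, c = 1.0;

    printf("%g\n", (a + b) + c);   /* prints 1                             */
    printf("%g\n", a + (b + c));   /* prints 0: b + c rounds back to -1e16 */
    return 0;
}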

+O[no]global_ptrs_unique[=name1,name2,...nameN]

Optimization levels: 2, 3, 4

Default: +Onoglobal_ptrs_unique

Use this option to identify unique global pointers, so that the optimizer can generate more efficient code in the presence of unique pointers, for example by using copy propagation and common sub-expression elimination. A global pointer is unique if it does not alias with any variable in the entire program.

This option supports a comma-separated list of unique global pointer variable names.
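As a hedged illustration (the names buf, x, y, and prog.c are hypothetical), a command of the following form asserts that the global pointer buf never aliases any other variable, so the optimizer can keep the value of *buf in a register across stores to other globals:

$ cc +O2 +Oglobal_ptrs_unique=buf prog.c

float *buf;            /* asserted unique: *buf aliases no other variable */
float x, y;

void scale(void)
{
    x = *buf * 2.0f;   /* with the assertion, the value of *buf loaded    */
    y = *buf * 4.0f;   /* here need not be reloaded for the second use    */
}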

+O[no]initcheck

Optimization levels: 2, 3, 4

Default: unspecified

The initialization checking feature of the optimizer has three possible states: on, off, or unspecified. When on (+Oinitcheck), the optimizer initializes to zero any local, scalar, non-static variables that are uninitialized with respect to at least one path leading to a use of the variable.

When off (+Onoinitcheck), the optimizer issues warning messages when it discovers definitely uninitialized variables, but does not initialize them.

When unspecified, the optimizer initializes to zero any local, scalar, non-static variables that are definitely uninitialized with respect to all paths leading to a use of the variable.

Use +Oinitcheck to look for variables in a program that may not be initialized.
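For example (a hypothetical function), the local variable below is uninitialized along at least one path to its use, so +Oinitcheck initializes it to zero; it is not definitely uninitialized, so +Onoinitcheck would not warn about it:

int lookup(int key)
{
    int result;               /* no initializer                            */

    if (key > 0)
        result = key * 2;     /* initialized only on this path             */

    return result;            /* uninitialized here when key <= 0;
                                 +Oinitcheck initializes result to zero    */
}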

+O[no]inline[=name1, name2,...nameN]

Optimization levels: 3, 4

Default: +Oinline

When +Oinline is specified without a name list, any function can be inlined. For inlining to be successful, follow prototype definitions for function calls in the appropriate header file.

When specified with a name list, the named functions are strong candidates for inlining. For example, saying

+Oinline=foo,bar +Onoinline

indicates that inlining should be strongly considered for foo and bar; all other routines will not be considered for inlining, since +Onoinline is given.

When this option is disabled with a name list, the compiler will not consider the specified routines as candidates for inlining. For example, saying

+Onoinline=baz,x

indicates that inlining should not be considered for baz and x; all other routines will be considered for inlining, since +Oinline is the default.

The +Onoinline option disables inlining, either for all functions or for a specified list of functions.

Use this option when you need to precisely control which subprograms are inlined.

+Oinline_budget=n

Optimization levels: 3, 4

Default: +Oinline_budget=100

The value n is an integer in the range 1 to 1000000 that specifies the level of aggressiveness, as follows:

  • n = 100 Default level of inlining.

  • n > 100 More aggressive inlining. The optimizer is less restricted by compilation time and code size when searching for eligible routines to inline.

  • n = 1 Only inline if it reduces code size.

The +Onolimit and +Osize options also affect inlining. Specifying the +Onolimit option has the same effect as specifying +Oinline_budget=200. The +Osize option has the same effect as +Oinline_budget=1.

Note, however, that the +Oinline_budget=n option takes precedence over both of these options. This means that you can override the effect of +Onolimit or +Osize option on inlining by specifying the +Oinline_budget=n option on the same compile line.
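For example (hypothetical file names), the following compile line requests +Onolimit but pins inlining at the default level, because +Oinline_budget takes precedence:

$ cc +O3 +Onolimit +Oinline_budget=100 funcs.c main.c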

+O[no]libcalls

Optimization levels: 0, 1, 2, 3, 4

Default: +Onolibcalls

Use the +Olibcalls option to increase the runtime performance of code which calls standard library routines in simple contexts. The +Olibcalls option expands the following library calls inline:

  • strcpy()

  • sqrt()

  • fabs()

  • alloca()

  • memset()

  • memcpy()

Inlining will take place only if the function call follows the prototype definition in the appropriate header file. Fast subprogram linkage is also emitted to tuned millicode versions of the math library functions sin, cos, tan, atan2, log, pow, asin, acos, atan, exp, and log10. (See the HP-UX Floating-Point Guide for the most up-to-date listing of the math library functions.) The calling code must not expect to access errno after the function's return.

A single call to printf() may be replaced by a series of calls to putchar(). Calls to sprintf() and strlen() may be optimized more effectively, including elimination of some calls producing unused results. Calls to setjmp() and longjmp() may be replaced by their equivalents _setjmp() and _longjmp(), which do not manipulate the process's signal mask.

Use +Olibcalls to improve the performance of selected library routines only when you are not performing error checking for these routines.

Using +Olibcalls with +Ofltacc will give different floating point calculation results than those given using +Ofltacc without +Olibcalls.

The +Olibcalls option replaces the obsolete -J option.
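A minimal sketch of a qualifying simple context, assuming the prototype is taken from <math.h> and errno is never examined after the call:

#include <math.h>

/* With +Olibcalls, this call to sqrt() may be replaced by fast millicode
   linkage or inline code, since the prototype is visible and the caller
   does not check errno afterward. */
double hypotenuse(double x, double y)
{
    return sqrt(x * x + y * y);
}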

+O[no]loop_block

See “+O[no]loop_block”.

+O[no]loop_transform

Optimization levels: 3, 4

Default: +Oloop_transform

The +O[no]loop_transform option enables [disables] transformation of eligible loops for improved cache performance. The most important transformation is the reordering of nested loops to make the inner loop unit stride, resulting in fewer cache misses.

+Onoloop_transform may be a helpful option if you experience any problem while using +Oparallel.

+O[no]loop_unroll[=unrollfactor]

Optimization levels: 2, 3, 4

Default: +Oloop_unroll

The +Oloop_unroll option turns on loop unrolling. When you use +Oloop_unroll, you can also use the unroll factor to control the code expansion. The default unroll factor is 4, that is, four copies of the loop body. By experimenting with different factors, you may improve the performance of your program.
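For example (hypothetical file name), the following requests eight copies of each unrolled loop body instead of the default four:

$ cc +O2 +Oloop_unroll=8 matrix.c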

+O[no]loop_unroll_jam

See “+O[no]loop_unroll_jam”.

+O[no]moveflops

Optimization levels: 2, 3, 4

Default: +Omoveflops

Allows [or disallows] moving conditional floating point instructions out of loops. The +Onomoveflops option replaces the obsolete +OE option. The behavior of floating-point exception handling may be altered by this option.

Use +Onomoveflops if floating-point traps are enabled and you do not want the behavior of floating-point exceptions to be altered by the relocation of floating-point instructions.
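A hedged sketch of the concern (hypothetical code): if the guarded, loop-invariant division below is moved out of the loop, a floating-point trap could be raised even on executions where the guard would have suppressed the divide.

void scale(double *a, int n, double x, double y)
{
    int i;
    double r = 0.0;

    for (i = 0; i < n; i++) {
        if (x != 0.0)
            r = y / x;     /* loop-invariant conditional divide: with
                              +Omoveflops it may be evaluated outside the
                              loop, changing when a divide trap can occur */
        a[i] = a[i] + r;
    }
}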

+O[no]multiprocessor

Optimization levels: 2, 3, 4

Default: +Onomultiprocessor

If +Omultiprocessor is specified, the compiler performs optimizations appropriate for executables or shared libraries that run in several different processes on multiprocessor machines.

If you enable this option inappropriately (for example, for an executable that runs only on a uniprocessor system), performance may be degraded.

+O[no]parallel

See “+O[no]parallel”.

+O[no]parmsoverlap

Optimization levels: 2, 3, 4

Default: +Oparmsoverlap

The +Oparmsoverlap option optimizes with the assumption that the actual arguments of function calls overlap in memory.

The +Onoparmsoverlap option replaces the obsolete +Om1 option.

Use +Onoparmsoverlap if C programs have been literally translated from FORTRAN programs.
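As an illustration (hypothetical code) of the overlapping arguments that the default assumption protects against:

int buf[100];

void shift(int *dst, int *src, int n)
{
    int i;
    for (i = 0; i < n; i++)
        dst[i] = src[i] + 1;     /* reads and writes interleave when the
                                    two argument regions overlap           */
}

void caller(void)
{
    /* dst and src overlap; the default +Oparmsoverlap assumption keeps
       this safe, while +Onoparmsoverlap lets the optimizer assume the
       regions are disjoint. */
    shift(&buf[0], &buf[1], 99);
}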

+O[no]pipeline

Optimization levels: 2, 3, 4

Default: +Opipeline

Enables [or disables] software pipelining. The +Onopipeline option replaces the obsolete +Os option.

Use +Onopipeline to conserve code space.

+O[no]procelim

Optimization levels: 0, 1, 2, 3, 4

Default: +Onoprocelim at levels 0-3, +Oprocelim at level 4

When +Oprocelim is specified, procedures that are not referenced by the application are eliminated from the output executable file. The +Oprocelim option reduces the size of the executable file, especially when optimizing at levels 3 and 4, at which inlining may have removed all of the calls to some routines.

When you specify +Onoprocelim, procedures that are not referenced by the application are not eliminated from the output executable file.

The default is +Onoprocelim at levels 0-3, and +Oprocelim at level 4.

If the +Oall option is enabled, the +Oprocelim option is enabled.

+O[no]promote_indirect_calls

Optimization levels: 3, 4 and profile-based optimization

Default: +Onopromote_indirect_calls

This option uses profile data from profile-based optimization, along with other information, to determine the most likely target of each indirect call and promotes such calls to direct calls. In all cases the optimized code tests whether the direct call is the one being taken and, if not, executes the original indirect call. If +Oinline is in effect, the optimizer may also inline the promoted calls. This option can only be used with profile-based optimization, described in “Profile-Based Optimization”.

The optimizer tries to determine the most likely target of indirect calls. If the profile data is incomplete or ambiguous, the optimizer may not select the best target. If this happens, your code's performance may decrease.

At +O3, this option is only effective if indirect calls from functions within a file are mostly to target functions within the same file. This is because +O3 optimizes only within a file whereas +O4 optimizes across files.
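Conceptually (a hand-written sketch, not actual compiler output), a promoted indirect call behaves like the guarded direct call below:

extern void likely_target(void);   /* most frequent target per profile data */

void dispatch(void (*fp)(void))
{
    /* Original code:  (*fp)();
       After promotion the generated code is equivalent to:                 */
    if (fp == likely_target)
        likely_target();           /* direct call; may also be inlined      */
    else
        (*fp)();                   /* fall back to the indirect call        */
}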

+O[no]ptrs_ansi

Optimization levels: 2, 3, 4

Default: +Onoptrs_ansi

Use +Optrs_ansi to make the following two assumptions, which the more aggressive +Optrs_strongly_typed does not make:

  • An int *p is assumed to point to an int field of a struct or union.

  • char * is assumed to point to any type of object.

When both are specified, +Optrs_ansi takes precedence over +Optrs_strongly_typed.

For more information about type aliasing see “Aliasing Options”.

+O[no]ptrs_strongly_typed

Optimization levels: 2, 3, 4

Default: +Onoptrs_strongly_typed

Use +Optrs_strongly_typed when pointers are type-safe. The optimizer can use this information to generate more efficient code.

Type-safe (that is, strongly-typed) pointers are pointers to a specific type that only point to objects of that type, and not to objects of any other type. For example, a pointer declared as a pointer to an int is considered type-safe if that pointer points to an object only of type int, but not to objects of any other type.

Based on this type-safe concept, a set of groups is built from the object types; a given group includes all objects of the same type.

Type-inferred aliasing means that a pointer whose type belongs to a given group (of objects of the same type) can point only to objects in that same group; it cannot point to a typed object from any other group.

For more information about type aliasing see “Aliasing Options”.

Type casting to a different type violates the type-inferred aliasing rules. See Example 2 below.

Dynamic casting is allowed. See Example 3 below.

For more details, see “Aliasing Options”.

Example 1: How Data Types Interact

The optimizer generally spills all global data from registers to memory before any modification to global variables or any loads through pointers. However, you can instruct the optimizer on how data types interact so it can generate more efficient code.

If you have the following:

1  int *p;
2  float *q;
3  int a,b,c;
4  float d,e,f;
5  foo()
6  {  int i;
7     for (i=1;i<10;i++) {
8        d=e;
9        *p=b;
10       e=d+f;
11       f=*q;
12    }
13 }

With +Optrs_strongly_typed turned on, the pointers p and q are assumed to be disjoint because the types they point to are different. Without type-inferred aliasing, *p is assumed to invalidate all of the definitions, so the values of d and f used on line 10 must be loaded from memory. With type-inferred aliasing, the optimizer can propagate the copies of d and f and thus avoid two loads and two stores.

This option can be used for any application involving the use of pointers, where those pointers are type-safe. To specify that only a subset of types is type-safe, use the [NO]PTRS_STRONGLY_TYPED pragma. The compiler issues warnings for any incompatible pointer assignments that may violate the type-inferred aliasing rules discussed in “Aliasing Options”.

Example 2: Unsafe Type Cast

Any type cast to a different type violates type-inferred aliasing rules. Do not use +Optrs_strongly_typed with code that has these unsafe type casts. Use the [NO]PTRS_STRONGLY_TYPED pragma to prevent the application of type-inferred aliasing to the unsafe type casts.

struct foo{
int a;
int b;
} *P;

struct bar {
float a;
int b;
float c;
} *q;

P = (struct foo *) q;
/* Incompatible pointer assignment
through type cast */

Example 3: Generally Applying Type Aliasing

Dynamic casting is allowed with +Optrs_strongly_typed or +Optrs_ansi. A pointer dereference is called a dynamic cast if a cast to a different type is applied to the pointer.

In the example below, type-inferred aliasing is applied to P generally, not just to the particular dereference. Type aliasing will be applied to any other dereferences of P.

struct s {
short int a;
short int b;
int c;
} *P;
* (int *)P = 0;

For more information about type aliasing, see “Aliasing Options”.

+O[no]ptrs_to_globals[=name1, name2, ...nameN]

Optimization levels: 2, 3, 4

Default: +Optrs_to_globals

By default global variables are conservatively assumed to be modified anywhere in the program. Use this option to specify which global variables are not modified through pointers, so that the optimizer can make your program run more efficiently by incorporating copy propagation and common sub-expression elimination.

This option can be used to specify all global variables as not modified via pointers, or to specify a comma-separated list of global variables as not modified via pointers.

Note that the on state for this option disables some optimizations, such as aggressive optimizations on the program's global symbols.

For example, use the command-line option +Onoptrs_to_globals=a,b,c to specify that the global variables a, b, and c are not accessed through pointers; no pointer can access these global variables. In the code below, the optimizer can perform copy propagation and constant folding because the store to *p does not modify a or b.

int a, b, c;
float *p;
foo()
{
a = 10;
b = 20;
*p = 1.0;
c = a + b;
}

If all global variables are unique, use the following option without listing the global variables:

+Onoptrs_to_globals

In the example below, the address of b is taken, so b can be accessed indirectly through the pointer. You can still use +Onoptrs_to_globals as: +Onoptrs_to_globals +Optrs_to_globals=b.

long b, c;
long *p;

foo()
{
    p = &b;    /* the address of b is taken */
}

For more information about type aliasing see “Aliasing Options”.

+O[no]regionsched

Optimization levels: 2, 3, 4

Default: +Onoregionsched

Applies aggressive scheduling techniques to move instructions across branches. This option is incompatible with the linker -z option. If used with -z, it may cause a SIGSEGV error at run-time.

Use +Oregionsched to improve application run-time speed. Compilation time may increase.

+Oreusedir=directory

Optimization levels: 4 or with profile-based optimization

Default: no reuse of object files

This option specifies a directory in which the linker can save object files created from intermediate object files when using +O4 or profile-based optimization. It reduces link time by avoiding recompilation of intermediate object files that do not need to be recompiled.

When you compile with +I, +P, or +O4, the compiler generates intermediate code in the object file. Otherwise, the compiler generates regular object code in the object file. When you link, the linker first compiles the intermediate object code to regular object code, then links the object code. With this option you can reduce link time on subsequent links by avoiding recompiling intermediate object files that have already been compiled to regular object code and have not changed.

Note that when you do change a source file or command line options and recompile, a new intermediate object file will be created and compiled to regular object code in the specified directory. The previous object file in the directory will not be removed. You should periodically remove this directory since old object files cannot be reused and will not be automatically removed.
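For example (hypothetical directory and file names):

$ cc +O4 +Oreusedir=./reuse prog1.c prog2.c -o prog

On a later build with the same sources and options, the linker can reuse the regular object code saved in ./reuse instead of recompiling the unchanged intermediate object files; remember to remove the directory periodically, since stale object files are not cleaned up automatically.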

+O[no]regreassoc

Optimization levels: 2, 3, 4

Default: +Oregreassoc

When disabled (+Onoregreassoc), this option turns off register reassociation.

Use +Onoregreassoc to disable register reassociation if this optimization hinders the performance of the optimized application.

+O[no]report[=report_type]

See “+O[no]report[=report_type]”.

+O[no]sharedgra

See “+O[no]sharedgra”.

+O[no]sideeffects[=name1, name2, ...nameN]

Optimization levels: 2, 3, 4

Default: assume all subprograms have side effects

Assume that the subprograms specified in the name list might modify global variables. Therefore, when +Osideeffects is enabled, the optimizer limits global variable optimization across calls to those subprograms.

The default is to assume that all subprograms have side effects unless the optimizer can determine that there are none.

Use +Onosideeffects if you know that the named functions do not modify global variables and you wish to achieve the best possible performance.
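For example (hypothetical function and file names), the following asserts that get_ratio and clamp do not modify global variables, so the optimizer may keep globals in registers across calls to them:

$ cc +O2 +Onosideeffects=get_ratio,clamp calc.c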

+O[no]signedpointers

Optimization levels: 0, 1, 2, 3, 4

Default: +Onosignedpointers

Perform [or do not perform] optimizations related to treating pointers as signed quantities. Applications that allocate shared memory and that compare a pointer to shared memory with a pointer to private memory may run incorrectly if this optimization is enabled.

Use +Osignedpointers to improve application run-time speed.

+O[no]static_prediction

Optimization levels: 0, 1, 2, 3, 4

Default: +Onostatic_prediction

+Ostatic_prediction turns on static branch prediction for PA-RISC 2.0 targets.

PA-RISC 2.0 has two means of predicting which way conditional branches will go: dynamic branch prediction and static branch prediction. Dynamic branch prediction uses a hardware history mechanism to predict future executions of a branch from its last three executions. It is transparent and quite effective unless the hardware buffers involved are overwhelmed by a large program with poor locality.

With static branch prediction on, each branch is predicted based on implicit hints encoded in the branch instruction itself; the dynamic branch prediction is not used.

Static branch prediction's role is to handle large codes with poor locality for which the small dynamic hardware facility proves inadequate.

Use +Ostatic_prediction to better optimize large programs with poor instruction locality, such as operating system and database code.

Use this option only when using profile-based optimization (PBO), as an amplifier to +P. It is allowed but silently ignored with +I, so makefiles need not change between the +I and +P phases.

+O[no]vectorize

Optimization levels: 0, 1, 2, 3, 4

Default: +Onovectorize

+Ovectorize allows the compiler to replace certain loops with calls to vector routines.

Use +Ovectorize to increase the execution speed of loops.

When +Onovectorize is specified, loops are not replaced with calls to vector routines.

Because the +Ovectorize option may change the order of operations in an application, it may also change the results of those operations slightly. See the HP-UX Floating-Point Guide for details.

The math library contains special prefetching versions of vector routines. If you have a PA2.0 application that contains operations on very large arrays (larger than 1 megabyte in size), using +Ovectorize in conjunction with +Odataprefetch may improve performance substantially.

You may use +Ovectorize at levels 3 and 4. +Ovectorize is also included as part of +Oaggressive and +Oall.

This option is only valid for PA-RISC 1.1 and 2.0 systems.
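For example (hypothetical file name), combining vectorization with data prefetching for a PA-RISC 2.0 application that operates on very large arrays:

$ cc +O3 +Ovectorize +Odataprefetch bigarray.c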

+O[no]volatile

Optimization levels: 1, 2, 3, 4

Default: +Onovolatile

The +Ovolatile option implies that memory references to global variables cannot be removed during optimization.

The +Onovolatile option implies that no global variables are of the volatile class. This means that references to global variables can be removed during optimization.

The +Ovolatile option replaces the obsolete +OV option.

Use this option to control the volatile semantics for all global variables.

+O[no]whole_program_mode

Optimization level: 4

Default: +Onowhole_program_mode

The +Owhole_program_mode option enables the assertion that only the files that are compiled with this option directly reference any global variables and procedures that are defined in these files. In other words, this option asserts that there are no unseen accesses to the globals.

When this assertion is in effect, the optimizer can hold global variables in registers longer and delete inlined or cloned global procedures.

All files compiled with +Owhole_program_mode must also be compiled with +O4. If any of the files were compiled with +O4 but were not compiled with +Owhole_program_mode, the linker disables the assertion for all files in the program.

The default, +Onowhole_program_mode, disables the assertion.

Use this option to increase performance speed, but only when you are certain that only the files compiled with +Owhole_program_mode directly access any globals that are defined in these files.
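For example (hypothetical file names), every file in the program is compiled with both options so that the assertion holds at link time:

$ cc +O4 +Owhole_program_mode main.c util.c io.c -o app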

© Hewlett-Packard Development Company, L.P.