HP C/HP-UX Programmer's Guide: HP 9000 Computers > Chapter 8 Threads and Parallel Processing

Parallel Processing Options
HP C provides the following optimization options for parallelizing C programs:

+O[no]autopar

Optimization level(s): 3, 4
Default: +Oautopar if +Oparallel is enabled

When used with +Oparallel, the +Onoautopar option causes the compiler to parallelize only those loops marked by the loop_parallel or prefer_parallel pragmas. Because the compiler does not automatically find parallel tasks or regions, user-specified task and region parallelization is not affected by this option.

A loop is safe to parallelize if its iteration count can be determined at runtime before the loop is invoked and it contains no loop-carried dependences, procedure calls, or I/O operations. A loop-carried dependence exists when one iteration of a loop assigns a value to an address that is referenced or assigned on another iteration.

+O[no]dynsel

Optimization level(s): 3, 4
Default: +Odynsel if +Oparallel is enabled

When specified with +Oparallel, +Odynsel (the default) enables workload-based dynamic selection. For parallelizable loops whose iteration counts are known at compile time, +Odynsel causes the compiler to generate either a parallel or a serial version of the loop, depending on which is more profitable. For parallelizable loops whose iteration counts are unknown at compile time, the compiler generates both a parallel and a serial version. At runtime, the loop workload is compared to the parallelization overhead, and the parallel version is run only if it is profitable to do so.

The +Onodynsel option disables dynamic selection and tells the compiler that it is profitable to parallelize all parallelizable loops. The dynsel pragma can be used to enable dynamic selection for specific loops when +Onodynsel is in effect.

See Also: “dynsel[(trip_count=n)]”

+O[no]loop_block

Optimization level(s): 3, 4
Default: +Onoloop_block

The +O[no]loop_block option enables [disables] blocking of eligible loops for improved cache performance. The +Onoloop_block option disables both automatic and directive-specified loop blocking.
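To make the idea of loop blocking concrete, the following is a hand-written sketch of the general technique (often called tiling), not the code the compiler actually generates. The function names, the row-major layout, and the block parameter are all assumptions chosen for illustration. Both functions compute the same transpose; the blocked one walks the matrix in small tiles so that the data of each tile is reused while it is still likely to be in cache.

```c
#include <assert.h>
#include <stddef.h>

/* Plain transpose of an n x n row-major matrix (hypothetical helper).
   dst is written with a stride of n elements, which is hard on the cache
   when n is large. */
void transpose_plain(const double *src, double *dst, size_t n)
{
    size_t i, j;

    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            dst[j * n + i] = src[i * n + j];
}

/* Blocked transpose: same result, but the iteration space is split into
   block x block tiles. The two outer loops step from tile to tile; the two
   inner loops stay inside one tile. Partial tiles at the edges are handled
   by clamping the inner loop bounds to n. */
void transpose_blocked(const double *src, double *dst, size_t n, size_t block)
{
    size_t ii, jj, i, j, i_end, j_end;

    assert(block > 0); /* a zero block size would never advance */

    for (ii = 0; ii < n; ii += block) {
        i_end = (ii + block < n) ? ii + block : n;
        for (jj = 0; jj < n; jj += block) {
            j_end = (jj + block < n) ? jj + block : n;
            for (i = ii; i < i_end; i++)
                for (j = jj; j < j_end; j++)
                    dst[j * n + i] = src[i * n + j];
        }
    }
}
```

A suitable block size depends on the cache of the target machine; with +Oloop_block the compiler makes that choice, which is the main reason to prefer the option over blocking by hand.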
For more information on loop blocking, see the Parallel Programming Guide for HP-UX Systems.

+O[no]loop_unroll_jam

Optimization level(s): 3, 4
Default: +Onoloop_unroll_jam

The +O[no]loop_unroll_jam option enables [disables] loop unrolling and jamming. The +Onoloop_unroll_jam option disables both automatic and directive-specified unroll and jam. Loop unrolling and jamming increases register exploitation.

For more information on the unroll and jam optimization, see the Parallel Programming Guide for HP-UX Systems.

+O[no]parallel

Optimization level(s): 3, 4
Default: +Onoparallel

The +Oparallel option optimizes the time it takes to execute a single process running on a multiprocessor system.
The +Oparallel option causes the compiler to:
The following methods can be used to specify the number of processors used in executing your parallel programs:
The +Oparallel option disables +Ofailsafe.

See Also: “Transforming Loops for Parallel Execution (+Oparallel)”.

+O[no]report[=report_type]

Optimization level(s): 3, 4
Default: +Onoreport

This option causes the compiler to display various optimization reports. The value of report_type determines which report is displayed, as described below.

+Oreport=loop produces the Loop Report. This report gives information on optimizations performed on loops and calls. Using +Oreport (without =report_type) also produces the Loop Report.

+Oreport=private produces the Loop Report and the Privatization Table, which provides information on loop variables that are privatized by the compiler.

+Oreport=all produces all reports.

The +Oreport[=report_type] option is active only at +O3 and above. The +Onoreport option does not accept any of the report_type values.

See the Parallel Programming Guide for HP-UX Systems for more information on the optimization reports.

+O[no]sharedgra

Optimization level(s): 2, 3, 4
Default: +Osharedgra

The +Onosharedgra option disables global register allocation for shared-memory variables that are visible to multiple threads. This option can help if a variable shared among parallel threads is causing wrong answers.

See the Parallel Programming Guide for HP-UX Systems for more information.
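The +Oautopar criteria earlier in this section turn on whether a loop has a loop-carried dependence. As a rough illustration of that definition, the sketch below contrasts a loop whose iterations are independent with one whose iterations are not. The function names are hypothetical, and the code says nothing about how the compiler itself analyzes loops.

```c
#include <assert.h>
#include <stddef.h>

/* Independent iterations: iteration i touches only a[i], b[i], and c[i],
   so no iteration reads or writes an address assigned by another one.
   This sketch assumes a does not alias b or c. */
void add_arrays(double *a, const double *b, const double *c, size_t n)
{
    size_t i;

    assert(a != b && a != c); /* aliasing would reintroduce a dependence */

    for (i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

/* Loop-carried dependence: iteration i reads a[i - 1], which iteration
   i - 1 has just assigned. The iterations must run in order. */
void prefix_sum(double *a, size_t n)
{
    size_t i;

    for (i = 1; i < n; i++)
        a[i] = a[i] + a[i - 1];
}

/* The same loop body visited in reverse order. Because of the dependence,
   the result differs from prefix_sum, which shows why such a loop cannot
   simply be reordered or split across threads. */
void prefix_sum_reversed(double *a, size_t n)
{
    size_t i;

    for (i = n; i-- > 1; )
        a[i] = a[i] + a[i - 1];
}
```

For the input {1, 2, 3, 4}, prefix_sum yields {1, 3, 6, 10} while prefix_sum_reversed yields {1, 3, 5, 7}. Reordering the iterations of add_arrays, by contrast, cannot change its result.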