HP C/HP-UX Programmer's Guide: HP 9000 Computers > Chapter 8 Threads and Parallel Processing

Parallel Processing Pragmas

The syntax of a parallel processing pragma is:

#pragma [_CNX] pragma-list

where:

pragma-list

is a comma-separated list of pragmas described in this section.

See “Specifying Task Parallelism” for an example of using these pragmas.

In the sections that follow, namelist represents a comma-separated list of variables or arrays. A lowercase n or m indicates an integer constant. gate_var denotes a variable that has been, or is being, defined as a gate.

begin_tasks[(attribute_list)]

This pragma defines the beginning of sections of code (see next_task) that are to be executed as independent, parallel tasks. Each task is executed by a separate thread. begin_tasks must have an accompanying end_tasks in the same program unit.

The optional attribute_list can be any of the following legal combinations (m is an integer constant):

  • threads (default)

  • dist

  • ordered

  • max_threads=m

  • threads, ordered

  • dist, ordered

  • threads, max_threads=m

  • dist, max_threads=m

  • ordered, max_threads=m

  • threads, ordered, max_threads=m

  • dist, ordered, max_threads=m

Attributes may be listed in any order. The compiler flags any attribute combinations other than those listed above with a warning and ignores the pragma.

Refer to the Parallel Programming Guide for HP-UX Systems for a complete discussion of parallel tasking.

block_loop[(block_factor=n)]

This pragma indicates a specific loop to block and, optionally, the block factor n to use in the compiler's internal computation of loop-nest-based data reuse; n must be an integer constant greater than or equal to 2. If no block_factor is specified, the compiler uses a heuristic to determine one. Refer to the Parallel Programming Guide for HP-UX Systems for more information on blocking.
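As a rough illustration, block_loop might be applied to the inner loop of a nest that walks one array column-wise. The function name add_transpose, the array size N, and the block factor of 8 are all hypothetical choices for this sketch. Whether blocking this particular loop helps is an assumption that depends on the target machine's cache behavior.

```c
#define N 64  /* hypothetical array dimension */

/* Adds the transpose of b into a. Because b is read column-wise,
   blocking may improve reuse of cached data. */
void add_transpose(double a[N][N], double b[N][N])
{
    int i, j;

    for (i = 0; i < N; i++) {
        /* Request blocking of the j loop; 8 is an arbitrary
           illustrative block factor (it must be a constant >= 2). */
#pragma _CNX block_loop(block_factor=8)
        for (j = 0; j < N; j++)
            a[i][j] += b[j][i];
    }
}
```

Omitting (block_factor=8) would leave the choice of block factor to the compiler's heuristic.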

critical_section[(gate_var)]

This pragma defines the beginning of a code block in which only one thread may be executing at a time. The end of the code block must be indicated by an end_critical_section pragma, which must appear in the same flow of control within the same program unit. The optional gate_var can be used to implement a critical section that is not contiguous at the source level. Refer to the Parallel Programming Guide for HP-UX Systems for more information.
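The following is a minimal sketch of a critical section guarding a shared accumulator inside an explicitly parallelized loop. The function sum_array is hypothetical, and the optional gate_var is omitted. Serializing every update like this is for illustration only and is unlikely to perform well.

```c
/* Sums the n elements of v. The loop is explicitly parallelized, so
   the update of the shared variable total is placed in a critical
   section to keep two threads from modifying it at the same time. */
double sum_array(const double *v, int n)
{
    double total = 0.0;
    int i;

#pragma _CNX loop_parallel(ivar=i)
    for (i = 0; i < n; i++) {
#pragma _CNX critical_section
        total += v[i];
#pragma _CNX end_critical_section
    }
    return total;
}
```

Compilers that do not recognize these pragmas typically ignore them, in which case the loop simply runs serially.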

dynsel[(trip_count=n)]

This pragma enables workload-based dynamic selection for the immediately following loop. trip_count represents either the thread_trip_count or node_trip_count attribute, and n is an integer constant.

When thread_trip_count=n is specified, the serial version of the loop is run if the iteration count is less than n; otherwise, the thread-parallel version is run. When node_trip_count=n is specified, the serial version of the loop is run if the iteration count is less than n; otherwise, the node-parallel version is run, assuming +Onodepar is specified.
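A small sketch of dynsel with thread_trip_count follows. The function scale and the threshold of 1000 are hypothetical; a suitable threshold depends on how much work each iteration performs.

```c
/* Multiplies each of the n elements of x by factor. With the pragma
   below, the serial version of the loop is expected to run when n is
   less than 1000, and the thread-parallel version otherwise. */
void scale(double *x, int n, double factor)
{
    int i;

#pragma _CNX dynsel(thread_trip_count=1000)
    for (i = 0; i < n; i++)
        x[i] *= factor;
}
```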

end_critical_section

This pragma defines the end of the critical section that was begun with the critical_section pragma. critical_section and end_critical_section must appear as a pair. Refer to the Parallel Programming Guide for HP-UX Systems for more information.

end_ordered_section

This pragma defines the end of the ordered section that was begun with the ordered_section pragma. ordered_section and end_ordered_section must appear as a pair. Refer to the Parallel Programming Guide for HP-UX Systems for more information on ordered sections.

end_parallel

This pragma signifies the end of a parallel region. The parallel pragma signifies the beginning of a parallel region. Refer to the Parallel Programming Guide for HP-UX Systems for more information.

end_tasks

This pragma terminates the specification of parallel tasks indicated by begin_tasks and next_task. It must appear at the end of the last section of parallel code defined by these pragmas. All of these must appear in the same program unit. Refer to the Parallel Programming Guide for HP-UX Systems for more information.

loop_parallel[(attribute_list)]

This pragma is an explicit instruction to the compiler to parallelize the immediately following loop. The loop iterations are run in an indeterminate order unless the optional ordered attribute appears. You are responsible for any required data privatization and loop synchronization, as described in the Parallel Programming Guide for HP-UX Systems. The optional attribute_list can be any of the following combinations (n and m are integer constants):

  • threads (default)

  • dist

  • ordered

  • max_threads=m

  • chunk_size=n

  • threads, ordered

  • dist, ordered

  • threads, max_threads=m

  • dist, max_threads=m

  • ordered, max_threads=m

  • threads, chunk_size=n

  • dist, chunk_size=n

  • threads, ordered, max_threads=m

  • dist, ordered, max_threads=m

  • chunk_size=n, max_threads=m

  • threads, chunk_size=n, max_threads=m

  • dist, chunk_size=n, max_threads=m

  • ivar=indvar

The ivar=indvar attribute is:

  • Required for all loops in C

  • Compatible with any other attribute

Attributes may be listed in any order. The compiler flags any attribute combinations other than those listed above with a warning and ignores the pragma.

Refer to the Parallel Programming Guide for HP-UX Systems for more information.
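As a sketch, loop_parallel might be applied to a loop whose iterations are independent, so that no privatization or synchronization is needed. The function vector_add and the chunk size of 16 are hypothetical. The ivar attribute names the loop's induction variable, which the text above states is required in C.

```c
/* Stores a[i] + b[i] into dst[i] for each of the n elements. Each
   iteration touches only its own elements, so running the iterations
   in an indeterminate order is safe. */
void vector_add(double *dst, const double *a, const double *b, int n)
{
    int i;

#pragma _CNX loop_parallel(threads, chunk_size=16, ivar=i)
    for (i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

The sketch assumes that dst does not overlap a or b; if it did, the caller would be responsible for the resulting dependences.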

loop_private(namelist)

This pragma declares a list of variables and/or arrays private to the immediately following loop. No values may be carried into the loop by loop_private variables. To be loop private, the variables and/or arrays must be assigned before they are used on each iteration of the immediately following loop. These private data items should be treated as distinct objects from the shared items of the same name that exist outside the loop. Values assigned to loop_private variables on the final iteration (that is, the nth iteration of a loop with n iterations) may be saved into the shared variables of the same name if the save_last pragma also appears on this loop. If save_last is not used, then the value of any shared variable declared to be loop_private is undefined at loop termination. Refer to the Parallel Programming Guide for HP-UX Systems for more information.

next_task

This pragma starts a block of code following a begin_tasks block that will be executed as a parallel task. The end of the code block is marked by another next_task or by an end_tasks pragma.

This pragma must appear within a begin_tasks and end_tasks pair. There is no limit on the number of next_task pragmas that can appear. Refer to the Parallel Programming Guide for HP-UX Systems for more information.

no_block_loop

This pragma disables loop blocking on the immediately following loop. Refer to the Parallel Programming Guide for HP-UX Systems for more information on loop blocking.

no_distribute

This pragma disables loop distribution for the immediately following loop. Refer to the Parallel Programming Guide for HP-UX Systems for more information on loop distribution.

no_dynsel

This pragma disables workload-based dynamic selection for the immediately following loop. Refer to the Parallel Programming Guide for HP-UX Systems for more information on dynamic selection.

no_loop_dependence(namelist)

This pragma informs the compiler that the arrays in namelist do not have any dependencies for iterations of the immediately following loop. Use no_loop_dependence for arrays only; use loop_private to indicate dependence-free scalar variables.

This pragma causes the compiler to ignore any dependences that it perceives to exist. This can enhance the compiler's ability to optimize the loop, including the possibility of parallelization.

Refer to the Parallel Programming Guide for HP-UX Systems for more information.
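The sketch below shows a typical case in which a compiler cannot prove independence on its own: an array updated through an index array. The function scatter_add is hypothetical, and the pragma is only correct under the assumption that idx contains no repeated values; if it does, the assertion is false and the results may be wrong.

```c
/* Adds b[i] into a[idx[i]] for each of the n entries.
   Assumption: all values in idx are distinct, so no two iterations
   update the same element of a. */
void scatter_add(double *a, const int *idx, const double *b, int n)
{
    int i;

#pragma _CNX no_loop_dependence(a)
    for (i = 0; i < n; i++)
        a[idx[i]] += b[i];
}
```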

no_loop_transform

This pragma prevents the compiler from performing reordering transformations on the following loop. The compiler does not distribute, fuse, block, interchange, unroll, unroll and jam, or parallelize a loop on which this pragma appears. Refer to the Parallel Programming Guide for HP-UX Systems for more information.

no_parallel

This pragma prevents the compiler from generating parallel code for the immediately following loop. Refer to the Parallel Programming Guide for HP-UX Systems for more information.

no_side_effects(funclist)

This pragma (#pragma _CNX no_side_effects) informs the compiler that the functions appearing in funclist have no side effects wherever they appear lexically following the pragma. Side effects include modifying a function argument, performing I/O, or calling another routine that does any of the above. The compiler can sometimes eliminate calls to procedures that have no side effects; also, the compiler may be able to parallelize loops with calls when informed that the called routines do not have side effects.
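A minimal sketch of no_side_effects follows, using hypothetical functions square and square_all. The pragma is placed before the loop that calls square, since it applies to uses that lexically follow it.

```c
/* A function with no side effects: it modifies no arguments,
   performs no I/O, and calls no other routines. */
static double square(double x)
{
    return x * x;
}

#pragma _CNX no_side_effects(square)

/* Stores the square of in[i] into out[i] for each of the n elements.
   Knowing that square has no side effects may allow the compiler to
   parallelize this loop despite the call. */
void square_all(double *out, const double *in, int n)
{
    int i;

    for (i = 0; i < n; i++)
        out[i] = square(in[i]);
}
```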

ordered_section(gate_var)

This pragma defines the beginning of an ordered section. An ordered section is the same as a critical section (a code block in which only one thread may be executing at a time) with the additional restriction that the threads must pass through the ordered section in iteration order of the most recently initiated parallelized loop. The end of the code block must be indicated by an end_ordered_section pragma. Ordered sections must appear within the control flow of a loop_parallel (ordered) pragma. Refer to the Parallel Programming Guide for HP-UX Systems for more information.

parallel[(attribute_list)]

This pragma signifies the beginning of a parallel region of code. All code up to the following end_parallel pragma will be run on all available threads. No loop transformations, data privatization, or parallelization analysis will be performed by the compiler on the region.

The optional attribute_list can be any of the following legal combinations (m is an integer constant):

  • threads (default)

  • max_threads=m

  • threads, max_threads=m

Attributes may be listed in any order. The compiler flags any attribute combinations other than those listed above with a warning and ignores the pragma.

Refer to the Parallel Programming Guide for HP-UX Systems for more information.

parallel_private(namelist)

This pragma declares a list of variables or arrays private to the immediately following parallel region. It serves the same purpose for parallel regions that task_private serves for tasks. The privatized variables and arrays will not carry their values beyond the end_parallel pragma. Refer to the Parallel Programming Guide for HP-UX Systems for more information.

prefer_parallel[(attribute_list)]

This pragma instructs the compiler to parallelize the following loop, but only if it is safe to do so. A loop is safe to parallelize if it has an iteration count that can be determined at runtime before loop invocation and contains no loop-carried dependences, procedure calls, or I/O operations. (A loop-carried dependence exists when one iteration of a loop assigns a value to an address that is referenced or assigned on another iteration.) Refer to the Parallel Programming Guide for HP-UX Systems for more information.

The optional attribute_list can be any of the following combinations (n and m are integer constants):

  • threads (default)

  • dist

  • max_threads=m

  • chunk_size=n

  • threads, max_threads=m

  • dist, max_threads=m

  • threads, chunk_size=n

  • dist, chunk_size=n

  • chunk_size=n, max_threads=m

  • threads, chunk_size=n, max_threads=m

  • dist, chunk_size=n, max_threads=m

Attributes may be listed in any order. The compiler flags any attribute combinations other than those listed above with a warning and ignores the pragma.
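For comparison with loop_parallel, here is a sketch of prefer_parallel on a loop that meets the safety conditions described above: the iteration count is known before the loop starts, and the body contains no loop-carried dependences, calls, or I/O. The function saxpy and the limit of 8 threads are hypothetical.

```c
/* Computes y[i] = a * x[i] + y[i] for each of the n elements. The
   compiler remains free to leave the loop serial if it cannot
   establish that parallelizing it is safe. */
void saxpy(float *y, const float *x, float a, int n)
{
    int i;

#pragma _CNX prefer_parallel(max_threads=8)
    for (i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```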

save_last[(list)]

This pragma specifies that the variables in the comma-separated list that are also named in an associated loop_private(namelist) pragma must have their last values saved into the "shared" variable of the same name at loop termination. (A variable's last value in a loop of n iterations is the value it is assigned in the nth iteration.)

If the optional list is not used, save_last specifies that all variables named in an associated loop_private(namelist) pragma must have their last values saved into the "shared" variable of the same name at loop termination.

If save_last is not specified then the values in any privatized variables or arrays are indeterminate at loop termination. Refer to the Parallel Programming Guide for HP-UX Systems for more information.
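The sketch below combines loop_private and save_last in one pragma list, following the comma-separated form given in the syntax description. The function last_scaled and its variables are hypothetical. Note that tmp is assigned before it is used on every iteration, as loop_private requires.

```c
/* Writes 2 * src[i] + 1 into dst[i] for each of the n elements and
   returns the value of tmp from the final iteration. */
double last_scaled(const double *src, double *dst, int n)
{
    double tmp = 0.0;
    int i;

#pragma _CNX loop_private(tmp), save_last(tmp)
    for (i = 0; i < n; i++) {
        tmp = 2.0 * src[i];   /* assigned before use in each iteration */
        dst[i] = tmp + 1.0;
    }

    /* With save_last, tmp holds the value assigned on the nth
       iteration; without it, tmp would be indeterminate here. */
    return tmp;
}
```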

scalar

This pragma prevents the compiler from performing reordering transformations on the following loop. The compiler does not distribute, fuse, block, interchange, unroll, unroll and jam, or parallelize a loop on which this pragma appears.

The no_loop_transform pragma provides the same functionality as the scalar pragma and is recommended in place of the scalar pragma.

task_private(namelist)

This pragma privatizes the variables and arrays specified in namelist for each task specified in the immediately following begin_tasks/end_tasks block. If a task_private data object is referenced within a task, it must have been assigned a value previously in that task. The privatized variables and arrays do not carry their values beyond the end_tasks pragma. Refer to the Parallel Programming Guide for HP-UX Systems for more information.

Specifying Task Parallelism

The following example uses the begin_tasks, task_private, next_task, and end_tasks pragmas to specify simple task-parallelism:

/* one thread executes the for loop */
#pragma begin_tasks, task_private(i)

for (i = 0; i < n - 1; i++)
    a[i] = a[i + 1] + b[i];

/* another thread executes the function call */
#pragma next_task

tsub(x, y);

/* a third thread assigns elements of array d to every
   other element of c */
#pragma next_task

for (i = 0; i < 500; i++)
    c[i * 2] = d[i];

#pragma end_tasks

The loop induction variable i is manually privatized because it is used to control loops in two different tasks. If i were not private, both tasks would modify it, producing incorrect results. The task_private pragma is described in “task_private(namelist)”.

© Hewlett-Packard Development Company, L.P.