OPTIMIZE Directive [ HP FORTRAN 77/iX Reference ] MPE/iX 5.0 Documentation


OPTIMIZE Directive 

The OPTIMIZE directive sets up optimizer options that can improve
performance.

Syntax 

          [LEVEL1                                 ]
          [LEVEL2                                 ]
          [LEVEL2_MIN                             ]
          [LEVEL2_MAX                             ]
          [ASSUME_NO_EXTERNAL_PARMS               ]
$OPTIMIZE [ASSUME_NO_FLOATING_INVARIANT           ] [ON ]
          [ASSUME_NO_PARAMETER_OVERLAPS           ] [OFF]
          [ASSUME_NO_SHARED_COMMON_PARMS          ]
          [ASSUME_NO_SIDE_EFFECTS                 ]
          [ASSUME_PARM_TYPES_MATCHED              ]
          [LOOP_UNROLL[COPIES=n SIZE=n STATISTICS]]

ON                            Alone, specifies level 2 optimization.

                              With a preceding option, sets that option
                              on.

OFF                           Alone, specifies level 0 optimization.
                              This is the default.

                              With a preceding option, sets that option
                              off.

LEVEL1                        Specifies level 1 optimization.

LEVEL2                        Specifies level 2 optimization, with the
                              following ASSUME settings:

                                     ASSUME_NO_EXTERNAL_PARMS       ON
                                     ASSUME_NO_FLOATING_INVARIANT   ON
                                     ASSUME_NO_PARAMETER_OVERLAPS   ON
                                     ASSUME_NO_SHARED_COMMON_PARMS  ON
                                     ASSUME_NO_SIDE_EFFECTS         OFF
                                     ASSUME_PARM_TYPES_MATCHED      ON
                                     LOOP_UNROLL                    ON

LEVEL2_MIN                    Specifies level 2 optimization with all the
                              ASSUME settings OFF.

LEVEL2_MAX                    Specifies level 2 optimization with all the
                              ASSUME settings ON.

ASSUME_NO_EXTERNAL_PARMS      Assumes that none of the parameters passed
                              to the current procedure are from an
                              external space, that is, different from the
                              user's own data space.  Parameters can come
                              from another space if they come from
                              operating system space or if they are in a
                              space shared by other users.

ASSUME_NO_FLOATING_INVARIANT  Assumes that no floating invariant
                              operations are executed conditionally within
                              loops.

ASSUME_NO_PARAMETER_OVERLAPS  Assumes that no actual parameters passed to
                              a procedure overlap each other.

ASSUME_NO_SHARED_COMMON_PARMS This directive should be ON when all of the
                              following are true:
                                 *   The parameter passed to the current
                                     procedure is part of a common block
                                     used by that procedure.
                                 *   The parameter is named differently
                                     than the variable name it has in the
                                     common block.
                                 *   The parameter is reassigned with the
                                     same value within the procedure.

ASSUME_NO_SIDE_EFFECTS        Assumes that the current procedure changes
                              only local variables.  It does not change
                              any variables in COMMON, nor does it change
                              parameters.

ASSUME_PARM_TYPES_MATCHED     Assumes that all of the actual parameters
                              passed were the type expected by this
                              subroutine.

LOOP_UNROLL                   Unrolls DO loops having 60 or fewer
                              operations four times.  For further
                              details, see "Loop Unrolling" in this
                              chapter.  The default is ON.
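
As an illustration of the situation that ASSUME_NO_PARAMETER_OVERLAPS
rules out, the following sketch passes two overlapping sections of one
array to a subroutine.  The names OVRLAP and SHIFT are hypothetical and
do not come from this manual; the code is plain FORTRAN 77 with the
directive shown only in a comment.

```fortran
C     Hypothetical sketch: actual parameters that overlap.
C     On HP FORTRAN 77/iX, a unit such as SHIFT would presumably
C     need $OPTIMIZE ASSUME_NO_PARAMETER_OVERLAPS OFF before it.
      PROGRAM OVRLAP
      INTEGER A(5), I
      DO 10 I = 1, 5
         A(I) = I
   10 CONTINUE
C     SRC is A(1)..A(4) and DST is A(2)..A(5); they share A(2)..A(4).
      CALL SHIFT(A(1), A(2), 4)
      WRITE(6,*) (A(I), I = 1, 5)
      END

      SUBROUTINE SHIFT(SRC, DST, N)
      INTEGER N, SRC(N), DST(N), I
C     Each store into DST(I) also changes SRC(I+1), read next.
      DO 20 I = 1, N
         DST(I) = SRC(I)
   20 CONTINUE
      RETURN
      END
```

Executed strictly in source order, the loop copies A(1) into every
element.  An optimizer that assumes SRC and DST do not overlap is free to
reorder the loads and stores, so the result may differ.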

There are five levels of optimization:

Level 0          Does no optimizing.  This is obtained by specifying
                 $OPTIMIZE OFF.

Level 1          Optimizes only within each basic block.  This is
                 obtained by specifying $OPTIMIZE LEVEL1 ON.

Level 2 minimum  Optimizes within each procedure with no assumptions on
                 interactions of procedures.  That is, the compiler
                 assumes nothing, making this the most conservative level
                 2 optimization.  This level is obtained by specifying
                 $OPTIMIZE LEVEL2_MIN ON within each procedure.

Level 2 normal   Optimizes within each procedure with normal assumptions
                 on interactions of procedures set as described earlier.
                 In general, these settings are appropriate for most
                 FORTRAN programs.  This level is obtained by specifying
                 $OPTIMIZE LEVEL2 ON, $OPTIMIZE ON or just $OPTIMIZE
                 within each procedure.

Level 2 maximum  Optimizes within each procedure with all assumptions on
                 interactions of procedures set to ON.  This is obtained
                 by specifying $OPTIMIZE LEVEL2_MAX ON within each
                 procedure.
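
A minimal sketch of combining a level with a single changed assumption
follows.  It assumes that an ASSUME option given after LEVEL2 overrides
the setting that LEVEL2 established; the subroutine COPYN is
hypothetical.  The directives are specific to this compiler, so the
fragment is not expected to compile elsewhere.

```fortran
C     Hypothetical unit: normal level 2, except that callers may
C     pass overlapping arguments, so that assumption is set OFF.
C     Both directives precede the first nondirective statement.
$OPTIMIZE LEVEL2
$OPTIMIZE ASSUME_NO_PARAMETER_OVERLAPS OFF
      SUBROUTINE COPYN(SRC, DST, N)
      INTEGER N, SRC(N), DST(N), I
      DO 10 I = 1, N
         DST(I) = SRC(I)
   10 CONTINUE
      RETURN
      END
```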

A basic block is a set of instructions to be executed in sequence, with
one entrance, the first instruction, and one exit, the last; the block
contains no branches.
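
To make the definition concrete, the following hypothetical fragment is
annotated with one reasonable division into basic blocks.  The program
name BLOCKS and the block numbering are illustrative only; the compiler's
own division is not documented here.

```fortran
C     Hypothetical fragment; comments mark each basic block.
      PROGRAM BLOCKS
      INTEGER I, J, K
C     Block 1: entered at the top, left at the IF.
      I = 3
      J = I * 2
      IF (J .GT. 5) GOTO 10
C     Block 2: entered only by falling through the IF.
      K = J + 1
      GOTO 20
C     Block 3: entered only through label 10.
   10 K = J - 1
C     Block 4: label 20 is a branch target, so a new block starts.
   20 WRITE(6,*) K
      END
```

In this run J is 6, so control passes to label 10 and K is 5.  Level 1
optimization works inside each of these blocks separately and does not
carry information from one block to another.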

Parameters can come from another space if they come from the operating
system or if they are in a space shared by other users.

The following options are meaningful only when the compiler is performing
level 2 optimization, that is, only if the option ON, LEVEL2, LEVEL2_MIN,
or LEVEL2_MAX has been specified:

     ASSUME_NO_PARAMETER_OVERLAPS
     ASSUME_NO_SIDE_EFFECTS
     ASSUME_PARM_TYPES_MATCHED
     ASSUME_NO_EXTERNAL_PARMS
     ASSUME_NO_SHARED_COMMON_PARMS
     ASSUME_NO_FLOATING_INVARIANT
     LOOP_UNROLL

Default               Off.

Location              The following OPTIMIZE options must appear before
                      any nondirective statements in the program unit:

                           OFF
                           ON
                           LEVEL1
                           LEVEL2
                           LEVEL2_MIN
                           LEVEL2_MAX
                           ASSUME_NO_PARAMETER_OVERLAPS
                           ASSUME_NO_EXTERNAL_PARMS
                           ASSUME_NO_SHARED_COMMON_PARMS
                           ASSUME_NO_FLOATING_INVARIANT

                      These options can appear anywhere within a program
                      unit:

                           ASSUME_NO_SIDE_EFFECTS
                           ASSUME_PARM_TYPES_MATCHED
                           LOOP_UNROLL

Toggling/ Duration    The optimize options remain in effect until they
                      are changed by another OPTIMIZE directive.

Impact on             This directive can improve performance.  Loop
Performance           unrolling, which usually improves performance, can
                      occasionally degrade performance because of large
                      loops (register spilling) and code expansion
                      (crossing a page boundary, which causes cache
                      misses and TLB misses).

Flagging Uninitialized Variables 

When the compiler is performing level 2 optimization, it will detect any
uninitialized non-static simple local variables.  However, it will not
detect uninitialized common variables, static variables, or variables of
character and complex type.  For example:

     $OPTIMIZE
         FUNCTION func(type)
         COMMON /a/comvar
         SAVE statvar
         REAL foo,type
         type = 10.2
         foo = comvar
         foo = statvar
         foo = typo
         RETURN
         END

The variable typo is flagged as an uninitialized variable because it was
typed incorrectly and, therefore, not initialized.  However, statvar
and comvar are not flagged because of their global and static
characteristics.  A warning message will be issued when an uninitialized
variable is detected.

Example 

     C     Start with minimum level 2 optimization.
     $OPTIMIZE LEVEL2_MIN

           PROGRAM FEQ7
           INTEGER num(10), ans, calculate
           CHARACTER*2 option(10)
     C
     C     For the next two calls, the parameter type declarations are the same in
     C     the main program and the subroutine or function.  Therefore, we can
     C     further optimize the program by setting the following optimizer option.
     $OPTIMIZE ASSUME_PARM_TYPES_MATCHED ON
           call getnum_option(num,option)
     C
     C     For the next call, the function will not change the parameter value or
     C     any global variables in COMMON blocks.  Therefore, we can further
     C     optimize the program by setting the following optimizer option.
     $OPTIMIZE ASSUME_NO_SIDE_EFFECTS ON
           ans= calculate(num,option)
     $OPTIMIZE ASSUME_NO_SIDE_EFFECTS OFF
           WRITE(6,*) 'Result = ',ans
           END
     C
     C     For the next subroutine, we know that the actual parameters passed to
     C     this subroutine do not overlap each other, are not from a shared
     C     common block, and are not from a space other than the user's own
     C     data space.  Therefore, we can further optimize the program by
     C     setting the following optimizer options.
     $OPTIMIZE ASSUME_NO_PARAMETER_OVERLAPS ON
     $OPTIMIZE ASSUME_NO_EXTERNAL_PARMS ON
     $OPTIMIZE ASSUME_NO_SHARED_COMMON_PARMS ON
           SUBROUTINE getnum_option(value,operation)
           INTEGER value(10)
           CHARACTER*2 operation(10)

           DO 10  i = 1,10
     20    WRITE(6,*) 'Please input operation type and integer value :'
           READ(5,*) operation(i),value(i)

           IF (operation(i).EQ.' ') GOTO 30

           IF ((operation(i).NE.'**').AND.
          /    (operation(i).NE.'*' ).AND.
          /    (operation(i).NE.'/' ).AND.
          /    (operation(i).NE.'-' ).AND.
          /    (operation(i).NE.'+' )) GOTO 20
     10    CONTINUE
     30    RETURN
           END
     C
     C     For the next function, we know that the actual parameters passed to
     C     it do not overlap each other, are not from external space, and are
     C     not from a shared common block.  We can thus leave the
     C     ASSUME_NO_PARAMETER_OVERLAPS, ASSUME_NO_EXTERNAL_PARMS, and
     C     ASSUME_NO_SHARED_COMMON_PARMS settings ON.
     C
           FUNCTION calculate(value,operation)
           INTEGER value(10),calculate,ans
           CHARACTER*2 operation(10)

           ans = 0
           DO 10  i = 1,10

           IF (operation(i).EQ.' ') GOTO 30

           IF (operation(i).EQ.'**') THEN
                ans = ans ** value(i)
           ELSE IF (operation(i).EQ.'*' ) THEN
                ans = ans * value(i)
           ELSE IF (operation(i).EQ.'/' ) THEN
                ans = ans / value(i)
           ELSE IF (operation(i).EQ.'-' ) THEN
                ans = ans - value(i)
           ELSE IF (operation(i).EQ.'+' ) THEN
                ans = ans + value(i)
           ENDIF
     10    CONTINUE
     30    calculate = ans
           RETURN
           END

Loop Unrolling 

                      [ON        ]
                      [OFF       ]
$OPTIMIZE LOOP_UNROLL [COPIES = n]
                      [,SIZE = n ]
                      [STATISTICS]

ON                    Turns on loop unrolling.  ON is the default at
                      level 2.

OFF                   Turns off loop unrolling.

COPIES = n            Tells the compiler to unroll the loop n times.  The
                      default is four times.

SIZE = n              Tells the compiler to unroll the loops that have
                      fewer than n operations.  The default is 60
                      operations.

STATISTICS            Tells the compiler to give statistics about the
                      unrolled loops.

Limits on Use.   

DO loops at level 2 are unrolled four times by default.  If the loop
limit is either not known at compile time or is less than four times, an
extra copy of the DO loop body is generated.  This is called unrolling
the loop four or more times.
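
The following hand-written sketch shows one common way a loop unrolled
four times can be arranged, with an extra copy of the body for leftover
iterations.  It is an assumption about the general technique, not a
listing of the code this compiler generates; the program name UNRL4 and
the value N = 10 are hypothetical.

```fortran
C     Hand-written sketch of a loop unrolled four times.
C     The compiler's generated code may differ; N = 10 is arbitrary.
      PROGRAM UNRL4
      INTEGER N, I, M
      PARAMETER (N = 10)
      REAL A(N), B(N)
      DO 5 I = 1, N
         A(I) = 0.0
         B(I) = REAL(I)
    5 CONTINUE
C     M is the largest multiple of 4 that does not exceed N.
      M = (N / 4) * 4
C     Unrolled body: four copies, one loop branch per four elements.
      DO 10 I = 1, M, 4
         A(I)   = A(I)   + B(I)
         A(I+1) = A(I+1) + B(I+1)
         A(I+2) = A(I+2) + B(I+2)
         A(I+3) = A(I+3) + B(I+3)
   10 CONTINUE
C     Extra copy of the original body for the leftover iterations.
      DO 20 I = M + 1, N
         A(I) = A(I) + B(I)
   20 CONTINUE
      WRITE(6,*) (A(I), I = 1, N)
      END
```

Each A(I) ends up equal to B(I), the same result the original rolled loop
would give.  The unrolled form executes fewer loop branches but occupies
more code space.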

Although loop unrolling usually improves performance, it can
occasionally degrade performance because of large loops (register
spilling) and code expansion (crossing a page boundary, which causes
cache misses and TLB misses).  When you encounter these circumstances,
you can turn off loop unrolling locally by using the compiler directive.
Use the $OPTIMIZE compiler directive to specify the optimization level
in the source and to change the assumptions made by the compiler.  You
can use the LOOP_UNROLL suboption to control some constraints:

     $OPTIMIZE LOOP_UNROLL

You can also use the LOOP_UNROLL suboption on the $OPTIMIZE directive to
change the DO LOOP constraints for unrolling dynamically:

   *   You can unroll a DO loop more than four times.

   *   You can force a DO loop to unroll despite its large size.

   *   You can find the reason why a DO loop is not unrolled.

The highest level of optimization must be on for LOOP_UNROLL to work.
Otherwise, LOOP_UNROLL is ignored.  If LOOP_UNROLL is ignored, but
STATISTICS has been specified, you will still get the DO loop statistics.
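
A minimal sketch of requesting statistics for one loop follows.  It
assumes that LEVEL2 satisfies the optimization-level requirement and that
STATISTICS may be given as the only LOOP_UNROLL suboption; the program
name STATEX is hypothetical.  The format of the statistics report is not
shown in this section, so none is reproduced here.

```fortran
C     Hypothetical fragment: request unrolling statistics.
$OPTIMIZE LEVEL2
      PROGRAM STATEX
      REAL A(100), B(100)
      INTEGER I
      DO 5 I = 1, 100
         A(I) = 0.0
         B(I) = 1.0
    5 CONTINUE
C     LOOP_UNROLL options can appear anywhere in the program unit.
$OPTIMIZE LOOP_UNROLL STATISTICS
      DO 10 I = 1, 100
         A(I) = A(I) + 2.0 * B(I)
   10 CONTINUE
      WRITE(6,*) A(1)
      END
```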


NOTE  The number of operations reported by STATISTICS is approximate.
      Each assignment, arithmetic operation, and logical operation counts
      as an operation.  Each subscript of a subscripted variable counts
      as a separate operation.

To unroll the loop two times instead of four times (which is the
default), use

     $OPTIMIZE LOOP_UNROLL COPIES=2

To unroll a DO loop that is larger than the default, use

     $OPTIMIZE LOOP_UNROLL COPIES=2, SIZE=500

substituting an appropriate size for the digit 500.

Example.

     C     Example to illustrate the use of LOOP_UNROLL
     $OPTIMIZE ON
           PROGRAM UNROLL_EXAMPLE
           DIMENSION A(10), B(10,10)
           DIMENSION X(10,10,10), Y(10,10,10), Z(10,10,10)
           . . . .
     C     The inner loop has only one statement.  The loop can be unrolled
     C     10 times avoiding a branch and an extra copy of the loop.  A straight
     C     line code is generated for the inner loop.
     $OPTIMIZE LOOP_UNROLL COPIES=10
           DO 20 J=1,10
              DO 10 I=1,10
                 A(I) = A(I) + B(I,J)
     10       CONTINUE
     20    CONTINUE
     C     Change COPIES back to default.
     $OPTIMIZE LOOP_UNROLL COPIES=4
           . . . .
     C     This DO loop has more than 60 operations.
     C     This does not get unrolled by default.  The LOOP_UNROLL option is used
     C     to unroll it two times by increasing the SIZE to a large value.
     $OPTIMIZE LOOP_UNROLL COPIES=2, SIZE=200
           DO 40 I=1,10
              DO 30 J=1,20
                 V1 = X(I,J+1,K) - X(I,J-1,K)
                 V2 = Y(I,J+1,K) - Y(I,J-1,K)
                 V3 = Z(I,J+1,K) - Z(I,J-1,K)
                 X(I,J,K) = X(I,J,K) + A11*V1 + A2*V2 +
          *             A3*V3 + S*(Y(I+1,J,K)-2.0*X(I,J,K)+X(I-1,J,K))
                 Y(I,J,K) = Y(I,J,K) + A1*V1 + A2*V2 +
          *             A3*V3 + S*(Y(I+1,J,K)-2.0*Y(I,J,K)+Y(I-1,J,K))
                 Z(I,J,K) = Z(I,J,K) + A1*V1 + A2*V2 +
          *             A3*V3 + S*(Z(I+1,J,K)-2.0*Z(I,J,K)+Z(I-1,J,K))
     30       CONTINUE
     40    CONTINUE
     C     Change the options back to the default values.
     $OPTIMIZE LOOP_UNROLL COPIES=4, SIZE=60
           . . . .
           STOP
           END

