OPTIMIZE Directive [ HP FORTRAN 77/iX Reference ] MPE/iX 5.0 Documentation
OPTIMIZE Directive
The OPTIMIZE directive sets up optimizer options that can improve
performance.
Syntax
$OPTIMIZE [LEVEL1                       ] [ON ]
          [LEVEL2                       ] [OFF]
          [LEVEL2_MIN                   ]
          [LEVEL2_MAX                   ]
          [ASSUME_NO_EXTERNAL_PARMS     ]
          [ASSUME_NO_FLOATING_INVARIANT ]
          [ASSUME_NO_PARAMETER_OVERLAPS ]
          [ASSUME_NO_SHARED_COMMON_PARMS]
          [ASSUME_NO_SIDE_EFFECTS       ]
          [ASSUME_PARM_TYPES_MATCHED    ]
          [LOOP_UNROLL[COPIES=n SIZE=n STATISTICS]]
ON Alone, specifies level 2 optimization.
With a preceding option, sets that option
on.
OFF Alone, specifies level 0 optimization.
This is the default.
With a preceding option, sets that option
off.
LEVEL1 Specifies level 1 optimization.
LEVEL2 Specifies level 2 optimization, with the
following ASSUME settings:
ASSUME_NO_EXTERNAL_PARMS ON
ASSUME_NO_FLOATING_INVARIANT ON
ASSUME_NO_PARAMETER_OVERLAPS ON
ASSUME_NO_SHARED_COMMON_PARMS ON
ASSUME_NO_SIDE_EFFECTS OFF
ASSUME_PARM_TYPES_MATCHED ON
LOOP_UNROLL ON
LEVEL2_MIN Specifies level 2 optimization with all the
ASSUME settings OFF.
LEVEL2_MAX Specifies level 2 optimization with all the
ASSUME settings ON.
ASSUME_NO_EXTERNAL_PARMS Assumes that none of the parameters passed
to the current procedure are from an
external space, that is, different from the
user's own data space. Parameters can come
from another space if they come from
operating system space or if they are in a
space shared by other users.
ASSUME_NO_FLOATING_INVARIANT Assumes that no floating invariant
operations are executed conditionally within
loops.
ASSUME_NO_PARAMETER_OVERLAPS Assumes that no actual parameters passed to
a procedure overlap each other.
ASSUME_NO_SHARED_COMMON_PARMS This directive should be ON when all of the
following are true:
* The parameter passed to the current
procedure is part of a common block
used by that procedure.
* The parameter is named differently
than the variable name it has in the
common block.
* The parameter is reassigned with the
same value within the procedure.
ASSUME_NO_SIDE_EFFECTS Assumes that the current procedure changes
only local variables. It does not change
any variables in COMMON, nor does it change
parameters.
ASSUME_PARM_TYPES_MATCHED Assumes that all of the actual parameters
passed are of the type expected by this
subroutine.
LOOP_UNROLL Unrolls four times any DO loop that has 60
or fewer operations. For further
details, see "Loop Unrolling" in this
chapter. The default is ON.
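As an illustration of ASSUME_NO_PARAMETER_OVERLAPS described above, the following sketch shows a call whose two actual parameters overlap. The names OVRLAP and SHIFT are hypothetical. Standard FORTRAN 77 restricts assignments through overlapping dummy arguments, so code like this is better avoided; the sketch only shows what an overlap looks like and where the option would be turned off.

```fortran
$OPTIMIZE LEVEL2
      PROGRAM OVRLAP
      INTEGER A(10), I
      DO 10 I = 1, 10
         A(I) = I
   10 CONTINUE
C     The two actual parameters overlap: SRC starts at A(1) and
C     DST starts at A(2), so they share elements A(2)..A(9).
      CALL SHIFT(A(1), A(2), 9)
      WRITE(6,*) (A(I), I = 1, 10)
      END
C
C     SHIFT receives overlapping arrays, so the assumption is
C     turned off before this program unit begins.
$OPTIMIZE ASSUME_NO_PARAMETER_OVERLAPS OFF
      SUBROUTINE SHIFT(SRC, DST, N)
      INTEGER N
      INTEGER SRC(N), DST(N), I
C     Copy backward so each element is read before it is
C     overwritten.
      DO 20 I = N, 1, -1
         DST(I) = SRC(I)
   20 CONTINUE
      RETURN
      END
```

The directive is placed ahead of the SUBROUTINE statement because this option must precede any nondirective statement in the program unit.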
There are five levels of optimization:
Level 0 Does no optimizing. This is obtained by specifying
$OPTIMIZE OFF.
Level 1 Optimizes only within each basic block. This is
obtained by specifying $OPTIMIZE LEVEL1 ON.
Level 2 minimum Optimizes within each procedure with no assumptions on
interactions of procedures. That is, the compiler
assumes nothing, making this the most conservative level
2 optimization. This level is obtained by specifying
$OPTIMIZE LEVEL2_MIN ON within each procedure.
Level 2 normal Optimizes within each procedure with normal assumptions
on interactions of procedures set as described earlier.
In general, these settings are appropriate for most
FORTRAN programs. This level is obtained by specifying
$OPTIMIZE LEVEL2 ON, $OPTIMIZE ON or just $OPTIMIZE
within each procedure.
Level 2 maximum Optimizes within each procedure with all assumptions on
interactions of procedures set to ON. This is obtained
by specifying $OPTIMIZE LEVEL2_MAX ON within each
procedure.
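As a minimal sketch of selecting levels, the source below requests level 1 for the main program and the most conservative level 2 for a function. The names LEVELS and AREA are hypothetical, and the choice of levels is arbitrary; the directive spellings follow the syntax given above.

```fortran
C     Level 1 (basic-block optimization only) for the main program.
$OPTIMIZE LEVEL1
      PROGRAM LEVELS
      REAL R, AREA
      R = AREA(2.0)
      WRITE(6,*) R
      END
C     Level 2 with every ASSUME setting off for the function AREA.
$OPTIMIZE LEVEL2_MIN
      REAL FUNCTION AREA(RADIUS)
      REAL RADIUS
      AREA = 3.14159 * RADIUS * RADIUS
      RETURN
      END
```

Each level directive appears before the first nondirective statement of the program unit it applies to.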
A basic block is a set of instructions to be executed in sequence, with
one entrance, the first instruction, and one exit, the last; the block
contains no branches.
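To make the definition concrete, the comments in this small program mark where one basic block ends and the next begins. The program BLOCKS and its variables are hypothetical, and the exact boundaries chosen by the compiler may differ from this hand division.

```fortran
      PROGRAM BLOCKS
      INTEGER N, K
C     Block 1: straight-line code that ends at the branch (IF).
      N = 5
      K = N * 2
      IF (K .GT. 8) GOTO 10
C     Block 2: reached only by falling through the IF.
      K = K + 1
C     Block 3: begins at label 10, which is a branch target.
   10 N = N + K
      WRITE(6,*) 'N = ', N
      END
```

At level 1 the optimizer works inside each of these blocks separately; it does not carry information from one block to another.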
Parameters can come from another space if they come from the operating
system or if they are in a space shared by other users.
The following options are meaningful only when the compiler is performing
level 2 optimization, that is, only if the option ON, LEVEL2, LEVEL2_MIN,
or LEVEL2_MAX has been specified:
ASSUME_NO_PARAMETER_OVERLAPS
ASSUME_NO_SIDE_EFFECTS
ASSUME_PARM_TYPES_MATCHED
ASSUME_NO_EXTERNAL_PARMS
ASSUME_NO_SHARED_COMMON_PARMS
ASSUME_NO_FLOATING_INVARIANT
LOOP_UNROLL
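Since these options take effect only at level 2, a program unit that uses one of them would normally select a level 2 setting first. The sketch below does this for ASSUME_PARM_TYPES_MATCHED. The names TYPMAT and TWICE are hypothetical, and the mismatched call is non-portable; it is shown only to mark where the assumption would not hold.

```fortran
$OPTIMIZE LEVEL2
      PROGRAM TYPMAT
      INTEGER K
      REAL X
      K = 3
      X = 3.0
C     Matched: the actual parameter K is INTEGER, as TWICE expects.
$OPTIMIZE ASSUME_PARM_TYPES_MATCHED ON
      CALL TWICE(K)
C     Mismatched: X is REAL but TWICE expects INTEGER, so the
C     assumption is turned off before this call.
$OPTIMIZE ASSUME_PARM_TYPES_MATCHED OFF
      CALL TWICE(X)
      WRITE(6,*) K
      END
C
      SUBROUTINE TWICE(N)
      INTEGER N
      N = N * 2
      RETURN
      END
```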
Default Off.
Location The following OPTIMIZE options must appear before
any nondirective statements in the program unit:
OFF
ON
LEVEL1
LEVEL2
LEVEL2_MIN
LEVEL2_MAX
ASSUME_NO_PARAMETER_OVERLAPS
ASSUME_NO_EXTERNAL_PARMS
ASSUME_NO_SHARED_COMMON_PARMS
ASSUME_NO_FLOATING_INVARIANT
These options can appear anywhere within a program
unit:
ASSUME_NO_SIDE_EFFECTS
ASSUME_PARM_TYPES_MATCHED
LOOP_UNROLL
Toggling/ Duration The optimize options remain in effect until they
are changed by another OPTIMIZE directive.
Impact on Performance This directive can improve performance. Loop
unrolling, which usually improves performance, can
occasionally degrade it because of large
loops (register spilling) and code expansion
(crossing a page boundary, causing cache misses
and TLB misses).
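The location and duration rules above can be seen together in a small sketch. ASSUME_NO_SIDE_EFFECTS may appear anywhere in a program unit and stays in effect until another OPTIMIZE directive changes it, so here it is switched on for one call and off again before the next. The program SIDEFX and the functions SQUARE and NEXTID are hypothetical; placing the directive around the call follows the Example later in this section.

```fortran
$OPTIMIZE LEVEL2
      PROGRAM SIDEFX
      INTEGER SQUARE, NEXTID, M, ID
      INTEGER COUNT
      COMMON /STATE/ COUNT
      COUNT = 0
C     SQUARE changes only its own result, so the assumption
C     holds around this call.
$OPTIMIZE ASSUME_NO_SIDE_EFFECTS ON
      M = SQUARE(7)
$OPTIMIZE ASSUME_NO_SIDE_EFFECTS OFF
C     NEXTID updates COUNT in COMMON, so the assumption stays
C     off around this call.
      ID = NEXTID()
      WRITE(6,*) M, ID, COUNT
      END
C
      INTEGER FUNCTION SQUARE(N)
      INTEGER N
      SQUARE = N * N
      RETURN
      END
C
      INTEGER FUNCTION NEXTID()
      INTEGER COUNT
      COMMON /STATE/ COUNT
      COUNT = COUNT + 1
      NEXTID = COUNT
      RETURN
      END
```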
Flagging Uninitialized Variables
When the compiler is performing level 2 optimization, it will detect any
uninitialized non-static simple local variables. However, it will not
detect uninitialized common variables, static variables, or variables of
character and complex type. For example:
$OPTIMIZE
FUNCTION func(type)
COMMON /a/comvar
SAVE statvar
REAL foo,type
type = 10.2
foo = comvar
foo = statvar
foo = typo
RETURN
END
The variable typo is flagged as an uninitialized variable because it was
typed incorrectly and, therefore, not initialized. However, statvar
and comvar are not flagged because of their global and static
characteristics. A warning message will be issued when an uninitialized
variable is detected.
Example
C Start with minimum level 2 optimization.
$OPTIMIZE LEVEL2_MIN
PROGRAM FEQ7
INTEGER num(10), ans, calculate
CHARACTER*2 option(10)
C
C For the next two calls, the parameter type declarations are the same in
C the main program and the subroutine or function. Therefore, we can
C further optimize the program by setting the following optimizer option.
$OPTIMIZE ASSUME_PARM_TYPES_MATCHED ON
call getnum_option(num,option)
C
C For the next call, the function will not change the parameter value or
C any global variables in COMMON blocks. Therefore, we can further
C optimize the program by setting the following optimizer option.
$OPTIMIZE ASSUME_NO_SIDE_EFFECTS ON
ans= calculate(num,option)
$OPTIMIZE ASSUME_NO_SIDE_EFFECTS OFF
WRITE(6,*) 'Result = ',ans
END
C
C For the next subroutine, we know that the actual parameters passed to
C this subroutine do not overlap each other, are not from a shared
C common block, and are not from a space different from the user's own
C data space. We can therefore optimize further by setting the
C following optimizer options.
$OPTIMIZE ASSUME_NO_PARAMETER_OVERLAPS ON
$OPTIMIZE ASSUME_NO_EXTERNAL_PARMS ON
$OPTIMIZE ASSUME_NO_SHARED_COMMON_PARMS ON
SUBROUTINE getnum_option(value,operation)
INTEGER value(10)
CHARACTER*2 operation(10)
DO 10 i = 1,10
20 WRITE(6,*) 'Please input operation type and integer value :'
READ(5,*) operation(i),value(i)
IF (operation(i).EQ.' ') GOTO 30
IF ((operation(i).NE.'**').AND.
/ (operation(i).NE.'*' ).AND.
/ (operation(i).NE.'/' ).AND.
/ (operation(i).NE.'-' ).AND.
/ (operation(i).NE.'+' )) GOTO 20
10 CONTINUE
30 RETURN
END
C
C For the next function, we know that the actual parameters passed to
C it do not overlap each other, are not from an external space, and
C are not from a shared common block. We can thus leave the
C ASSUME_NO_PARAMETER_OVERLAPS, ASSUME_NO_EXTERNAL_PARMS, and
C ASSUME_NO_SHARED_COMMON_PARMS settings ON.
C
FUNCTION calculate(value,operation)
INTEGER value(10),calculate,ans
CHARACTER*2 operation(10)
ans = 0
DO 10 i = 1,10
IF (operation(i).EQ.' ') GOTO 30
IF (operation(i).EQ.'**') THEN
ans = ans ** value(i)
ELSE IF (operation(i).EQ.'*' ) THEN
ans = ans * value(i)
ELSE IF (operation(i).EQ.'/' ) THEN
ans = ans / value(i)
ELSE IF (operation(i).EQ.'-' ) THEN
ans = ans - value(i)
ELSE IF (operation(i).EQ.'+' ) THEN
ans = ans + value(i)
ENDIF
10 CONTINUE
30 calculate = ans
RETURN
END
Loop Unrolling
$OPTIMIZE LOOP_UNROLL [ON        ]
                      [OFF       ]
                      [COPIES = n]
                      [,SIZE = n ]
                      [STATISTICS]
ON Turns on loop unrolling. ON is the default at
level 2.
OFF Turns off loop unrolling.
COPIES = n Tells the compiler to unroll the loop n times. The
default is four times.
SIZE = n Tells the compiler to unroll loops that have
fewer than n operations. The default is 60
operations.
STATISTICS Tells the compiler to give statistics about the
unrolled loops.
Limits on Use.
DO loops at level 2 are unrolled four times by default. If the loop
limit is either not known at compile time or is less than four, an
extra copy of the DO loop body is generated. This is called unrolling
the loop four or more times.
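The idea of unrolling by four can be sketched by hand in standard FORTRAN 77. This is only a conceptual picture, not the code the compiler generates: the program UNRL4, its variables, and the way the leftover iterations are handled in a separate loop are all assumptions made for illustration.

```fortran
      PROGRAM UNRL4
      INTEGER N, I, M
      PARAMETER (N = 10)
      REAL A(N), B(N)
      DO 5 I = 1, N
         A(I) = 0.0
         B(I) = REAL(I)
    5 CONTINUE
C     M is the largest multiple of 4 that does not exceed N.
      M = (N / 4) * 4
C     Unrolled part: four copies of the body per iteration.
      DO 10 I = 1, M, 4
         A(I)   = A(I)   + B(I)
         A(I+1) = A(I+1) + B(I+1)
         A(I+2) = A(I+2) + B(I+2)
         A(I+3) = A(I+3) + B(I+3)
   10 CONTINUE
C     Extra copy of the original body for leftover iterations.
      DO 20 I = M + 1, N
         A(I) = A(I) + B(I)
   20 CONTINUE
      WRITE(6,*) (A(I), I = 1, N)
      END
```

The unrolled loop executes fewer branch and loop-control instructions per element, at the cost of a larger body and more registers in use at once.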
Although loop unrolling usually improves performance, it
can occasionally degrade performance because of large loops (register
spilling) and code expansion (crossing a page boundary, causing cache
misses and TLB misses). When you encounter these circumstances, you can
turn off loop unrolling locally by using the compiler directive. Use the
compiler directive $OPTIMIZE to specify the optimization level in the
source and to change the assumptions made by the compiler. You can use
the suboption LOOP_UNROLL to control some constraints:
$OPTIMIZE LOOP_UNROLL
You can also use the LOOP_UNROLL suboption on the $OPTIMIZE directive to
change the DO LOOP constraints for unrolling dynamically:
* You can unroll a DO loop more than four times.
* You can force a DO loop to unroll despite its large size.
* You can find the reason why a DO loop is not unrolled.
Level 2 optimization, the highest level, must be on for LOOP_UNROLL to
work. Otherwise, LOOP_UNROLL is ignored. If LOOP_UNROLL is ignored, but
STATISTICS has been specified, you will still get the DO loop statistics.
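A minimal sketch of turning loop unrolling off locally, as mentioned above, might look like the following. The program LOCALU and the premise that unrolling hurts the second loop are hypothetical; the directive spellings follow the LOOP_UNROLL syntax shown at the start of this section.

```fortran
$OPTIMIZE ON
      PROGRAM LOCALU
      REAL A(100), B(100)
      INTEGER I
C     This loop is unrolled under the level 2 defaults.
      DO 10 I = 1, 100
         B(I) = REAL(I)
   10 CONTINUE
C     Suppose unrolling degrades this loop: turn it off here.
$OPTIMIZE LOOP_UNROLL OFF
      DO 20 I = 1, 100
         A(I) = 2.0 * B(I)
   20 CONTINUE
C     Restore the level 2 default for the rest of the unit.
$OPTIMIZE LOOP_UNROLL ON
      WRITE(6,*) A(100)
      END
```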
NOTE The number of operations reported by STATISTICS is approximate.
Each assignment, arithmetic operation, and logical operation counts
as an operation. Each subscript of a subscripted variable counts
as a separate operation.
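Applying the counting rules in the note by hand to a single statement gives a feel for the SIZE limit. The tally in the comments below is an estimate made under those rules only; the count the compiler reports may differ, and the program OPCNT is hypothetical.

```fortran
      PROGRAM OPCNT
      INTEGER I, J
      REAL A(10), B(10,10)
      DO 20 J = 1, 10
         DO 10 I = 1, 10
            A(I) = 0.0
            B(I,J) = 1.0
   10    CONTINUE
   20 CONTINUE
      J = 1
      DO 30 I = 1, 10
C        Hand count for the statement below:
C          assignment (=)               1
C          arithmetic operation (+)     1
C          subscripts of A(I), twice    2
C          subscripts of B(I,J)         2
C          approximate total            6
         A(I) = A(I) + B(I,J)
   30 CONTINUE
      WRITE(6,*) A(1)
      END
```

By this estimate the body of loop 30 is far below the default SIZE of 60, so it would qualify for unrolling at level 2.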
To unroll the loop two times instead of four times (which is the
default), use
$OPTIMIZE LOOP_UNROLL COPIES=2
To unroll a DO loop that is larger than the default, use
$OPTIMIZE LOOP_UNROLL COPIES=2, SIZE=500
substituting an appropriate size for the value 500.
Example.
C Example to illustrate the use of LOOP_UNROLL
$OPTIMIZE ON
PROGRAM UNROLL_EXAMPLE
DIMENSION A(10), B(10,10)
DIMENSION X(10,10,10), Y(10,10,10), Z(10,10,10)
. .
. .
C The inner loop has only one statement. The loop can be unrolled
C 10 times, avoiding a branch and an extra copy of the loop.
C Straight-line code is generated for the inner loop.
$OPTIMIZE LOOP_UNROLL COPIES=10
DO 20 J=1,10
DO 10 I=1,10
A(I) = A(I) + B(I,J)
10 CONTINUE
20 CONTINUE
C Change COPIES back to default.
$OPTIMIZE LOOP_UNROLL COPIES=4
. .
. .
C This DO loop has more than 60 operations.
C This does not get unrolled by default. The LOOP_UNROLL option is used
C to unroll it two times by increasing the SIZE to a large value.
$OPTIMIZE LOOP_UNROLL COPIES=2, SIZE=200
DO 40 I=2,9
DO 30 J=2,9
V1 = X(I,J+1,K) - X(I,J-1,K)
V2 = Y(I,J+1,K) - Y(I,J-1,K)
V3 = Z(I,J+1,K) - Z(I,J-1,K)
X(I,J,K) = X(I,J,K) + A1*V1 + A2*V2 +
* A3*V3 + S*(X(I+1,J,K)-2.0*X(I,J,K)+X(I-1,J,K))
Y(I,J,K) = Y(I,J,K) + A1*V1 + A2*V2 +
* A3*V3 + S*(Y(I+1,J,K)-2.0*Y(I,J,K)+Y(I-1,J,K))
Z(I,J,K) = Z(I,J,K) + A1*V1 + A2*V2 +
* A3*V3 + S*(Z(I+1,J,K)-2.0*Z(I,J,K)+Z(I-1,J,K))
30 CONTINUE
40 CONTINUE
C Change the options back to the default values.
$OPTIMIZE LOOP_UNROLL COPIES=4, SIZE=60
. .
. .
STOP
END