

HP COBOL II/XL Programmer's Guide

Run-Time Efficiency 

You can improve your program's run-time efficiency with the following:

   *   An improved algorithm.

       This is the most important way you can improve your program's
       efficiency.  Neither control options nor the optimizer can make up
       for a slow algorithm.  A program that uses a binary search
       (without control options or optimization) is still faster than a
       program that uses a linear search (with control options and
       optimization).  (A short sketch contrasting the two kinds of
       search appears below.)

   *   Coding heuristics.

   *   Control options.

   *   The optimizer.

This section discusses the last three ways of improving run-time
efficiency.
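
To make the first point concrete:  in COBOL, the SEARCH statement does a
serial (linear) search of a table, while SEARCH ALL does a binary search of
a table that is declared with an ASCENDING (or DESCENDING) KEY and kept in
key order.  The sketch below uses invented data and paragraph names
(PART-TABLE, WANTED-PART, PART-FOUND, and so on); only the form of the two
statements matters here.

           01 PART-TABLE.
              05 PART-ENTRY OCCURS 500 TIMES
                    ASCENDING KEY IS PART-NUMBER
                    INDEXED BY PART-INDEX.
                 10 PART-NUMBER PIC X(8).
                 10 PART-PRICE  PIC S9(5)V99 PACKED-DECIMAL.
           01 WANTED-PART PIC X(8).

   Serial (linear) search:

           SET PART-INDEX TO 1.
           SEARCH PART-ENTRY
               AT END PERFORM PART-NOT-FOUND
               WHEN PART-NUMBER (PART-INDEX) = WANTED-PART
                   PERFORM PART-FOUND
           END-SEARCH.

   Binary search (the table must be in PART-NUMBER order):

           SEARCH ALL PART-ENTRY
               AT END PERFORM PART-NOT-FOUND
               WHEN PART-NUMBER (PART-INDEX) = WANTED-PART
                   PERFORM PART-FOUND
           END-SEARCH.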


NOTE   Coding heuristics and the optimizer do not significantly improve the
       performance of I-O-bound programs.

Coding Heuristics

The following coding heuristics for run-time efficiency are guidelines, not
rules, for writing programs that run faster.  They do not always work, but
programmers have learned from experience that they usually do.

   *   Put the variables that are referenced most often (such as array
       subscripts and counters) at the beginning of the WORKING-STORAGE
       SECTION in the main program and in nondynamic subprograms.  Put them
       at the end of the WORKING-STORAGE SECTION in each dynamic
       subprogram.

   *   Avoid conversion of data to different types.  If two or more
       variables are operands in the same operation, declare them to be of
       the same type.  It is more efficient to have operands of the same
       type than to make one of several operands "faster."  See the
       examples below.

   *   If fields are often used together as operands in arithmetic
       statements, it is more efficient to define them with the same
       PICTURE clause.

   *   The following examples illustrate the above two points.

   *   The first of the following COMPUTE statements is faster than the
       other two because no conversion is necessary and the intermediate
       result is the same as the receiving operand, DISPLAY-4.  The second
       COMPUTE requires a conversion from BINARY to DISPLAY because the
       receiving item is DISPLAY.  The third statement requires a
       conversion from BINARY to PACKED-DECIMAL because the receiving item
       is PACKED-DECIMAL.  These conversions take many machine
       instructions.

           01 DISPLAY-4 PIC S9(4).                    4 bytes long.
           01 BINARY-4  PIC S9(4) BINARY.             2 bytes long.
           01 DECIMAL-9 PIC S9(9) PACKED-DECIMAL.     5 bytes long.
           :
           COMPUTE DISPLAY-4 = DISPLAY-4 + DISPLAY-4.
           COMPUTE DISPLAY-4 = BINARY-4 + BINARY-4.
           COMPUTE DECIMAL-9 = BINARY-4 + BINARY-4.

   *   Calculations involving multiplication, division, and exponentiation
       can require conversions for intermediate results.  When the
       intermediate results of BINARY operands exceed 18 digits, the
       operands are converted to PACKED-DECIMAL.  This takes many extra
       instructions.  The first of the following COMPUTE statements is
       faster than the second because its intermediate result is 18
       digits.  The second requires conversion to PACKED-DECIMAL because
       its intermediate result is 20 digits.  The result of the
       multiplication is then converted back to BINARY.

           01 BINARY-9  PIC S9(9) BINARY.             4 bytes long.
           01 BINARY-10 PIC S9(10) BINARY.            8 bytes long.
           :
           COMPUTE BINARY-9 = BINARY-9 * BINARY-9.
           COMPUTE BINARY-9 = BINARY-10 * BINARY-10.

   *   The first of the following COMPUTE statements is faster because the
       intermediate result is 16 bits.  The intermediate result of the
       second COMPUTE is 32 bits, so the operands must be converted to
       32-bit values.

           01 BINARY-2 PIC S9(2) BINARY.              16 bits long.
           01 BINARY-3 PIC S9(3) BINARY.              16 bits long.
           01 BINARY-4 PIC S9(4) BINARY.              16 bits long.
           :
           COMPUTE BINARY-4 = BINARY-2 * BINARY-2.
           COMPUTE BINARY-4 = BINARY-2 * BINARY-3.

   *   When you MOVE BINARY data items from a larger field to a smaller
       field, many instructions are required to truncate the data.  The
       first of the following MOVE statements is faster than the second
       because the data must be truncated in the second MOVE but not in
       the first.

           01 BINARY-3 PIC S9(3) BINARY.              16 bits long.
           01 BINARY-4 PIC S9(4) BINARY.              16 bits long.
           :
           MOVE BINARY-4 TO BINARY-4.
           MOVE BINARY-4 TO BINARY-3.

   *   If a variable is a subscript or a varying identifier in a PERFORM
       loop, declare it to be of the type PIC S9(9) BINARY SYNC.

   *   When coding the UNTIL condition in a loop, keep this in mind:  the
       comparisons equal and not equal are faster than the comparisons
       less than and greater than.  (The previous two points are
       illustrated in the example following this list.)

   *   Compile subprograms with the SUBPROGRAM or ANSISUB control option.
       Calls to such subprograms execute faster than calls to subprograms
       that specify PROGRAM-ID identifier IS INITIAL or that are compiled
       with the DYNAMIC control option, both of which require the
       subprogram data to be reinitialized whenever the subprogram is
       called.  For less code, use the control option SUBPROGRAM instead
       of ANSISUB.  Performance is the same, but an ANSISUB subprogram
       contains extra code that reinitializes its data if the main program
       executes a CANCEL statement.

   *   If paragraphs are performed close together in time, put them close
       together physically in your source code.

   *   If a paragraph is performed from only one place, use an in-line
       PERFORM statement for it.  This applies especially to loops.

   *   Use NOT phrases to minimize checking.  NOT AT END is especially
       useful.  See the example in "NOT Phrases."

   *   Do not pass parameters BY CONTENT.

   *   Do not specify the ON EXCEPTION or ON OVERFLOW phrase in a CALL
       statement when you use a literal to specify the program name.  If
       you do, the program is not bound to the subprogram until run time.
       This slows the program by approximately 0.01 second per CALL, and
       the loader cannot detect missing subprograms.

   *   Do not use NEXT SENTENCE as the ELSE clause in an IF statement.
       Use CONTINUE or END-IF, or reverse the sense of the condition.

   *   Computation is fastest with the following types of operands, listed
       from the fastest to the slowest:

       -   PIC S9(9) BINARY, SYNCHRONIZED.

       -   PIC S9(4) BINARY, SYNCHRONIZED.

       -   PACKED-DECIMAL (the fewer the digits, the faster the
           computation).

       -   Numeric DISPLAY (the fewer the digits, the faster the
           computation).

       Computation is faster with signed numbers than with unsigned
       numbers.
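
The following sketch illustrates the two loop-related heuristics in the
list above (a PIC S9(9) BINARY SYNC subscript and an equal comparison in
the UNTIL condition).  The data names are invented for illustration.
Because the subscript is stepped by exactly 1, UNTIL IDX = 101 is
equivalent to UNTIL IDX > 100 but uses the faster equal comparison:

           01 IDX       PIC S9(9) BINARY SYNC.
           01 TOTAL-AMT PIC S9(9) BINARY SYNC.
           01 AMT-TABLE.
              05 ITEM-AMT OCCURS 100 TIMES PIC S9(9) BINARY SYNC.
           :
           MOVE 0 TO TOTAL-AMT.
           PERFORM VARYING IDX FROM 1 BY 1 UNTIL IDX = 101
               ADD ITEM-AMT (IDX) TO TOTAL-AMT
           END-PERFORM.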

Coding Heuristics when Calling COBOL Functions.

The following are guidelines to use when your program calls COBOL
functions.  For more information on the COBOL functions, see Chapter 10,
"COBOL Functions," in the HP COBOL II/XL Reference Manual.

   *   Some of the functions are implemented as calls to run-time
       libraries; the rest are implemented as inline code.  Inline
       functions are generally faster than functions in the run-time
       library.  You might find that coding your own routine for some
       library functions is faster than calling the COBOL function (a
       sketch follows this list).

   *   The following functions convert the parameter values to floating
       point values to calculate the function result.  These functions
       execute faster on systems that have a floating point coprocessor.
       Use the ROUNDED phrase when the precision of these function values
       is important.

       -   ACOS, ASIN, ATAN, COS, SIN, TAN.

       -   LOG, LOG10, RANDOM, SQRT, SUM.

       -   MAX, MIN (on numeric operands).

       -   NUMVAL, NUMVAL-C.

       -   ORD-MAX, ORD-MIN.

       -   ANNUITY, MEAN, MEDIAN, MIDRANGE, PRESENT-VALUE, RANGE,
           STANDARD-DEVIATION, VARIANCE.

   *   The precision of functions that convert the parameter values to
       floating point values is limited to 15 significant digits.  Also,
       fractional values may have rounding errors even if the total size
       of the argument is less than or equal to 15 digits.  Equality
       comparisons, as shown below, are not recommended.

       Not recommended:

           IF FUNCTION COS(ANGLE-RADIANS) = 0.1 THEN
               PERFORM P1
           END-IF

       Recommended alternative:

           COMPUTE COS-NUM ROUNDED = FUNCTION COS(ANGLE-RADIANS)
           IF COS-NUM > 0.1 THEN PERFORM P1.

       where COS-NUM is defined as follows:

           01 COS-NUM PIC S9V9 COMP.

       Another alternative:

           IF FUNCTION COS(ANGLE-RADIANS) >= 0.0999 AND <= 0.1001 THEN
               PERFORM P1
           END-IF
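
As a sketch of the first guideline above, a two-operand maximum coded with
ordinary IF and MOVE statements avoids the floating point conversion that
FUNCTION MAX performs on numeric operands, and may therefore be faster.
The data names are invented; measure both forms in your own program before
choosing one.

           01 AMT-A   PIC S9(9) BINARY SYNC.
           01 AMT-B   PIC S9(9) BINARY SYNC.
           01 BIG-AMT PIC S9(9) BINARY SYNC.

   Using the COBOL function:

           COMPUTE BIG-AMT = FUNCTION MAX(AMT-A AMT-B)

   Hand-coded alternative for two operands:

           IF AMT-A > AMT-B
               MOVE AMT-A TO BIG-AMT
           ELSE
               MOVE AMT-B TO BIG-AMT
           END-IF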

Control Options

These control options make a program run faster:

   *   OPTIMIZE=1, which invokes the optimizer.  See the section "The
       Optimizer" for details.

   *   OPTFEATURES=LINKALIGNED[16], which generates code that accesses
       variables in the LINKAGE SECTION more efficiently.  (If the called
       program specifies OPTFEATURES=LINKALIGNED[16], have the calling
       program specify OPTFEATURES=CALLALIGNED[16].)

   *   SYNC32, which allows the compiler to align variables along the
       optimum boundaries.

These control options make a program run more slowly:

   *   VALIDATE, which takes extra time to check that the digits and signs
       of numeric items are valid.

   *   BOUNDS, which generates and executes extra code to check ranges.

   *   SYNC16, which specifies an alignment that is not optimum for
       Series 900 systems.

   *   ANSISORT, which prevents SORT from reading or writing files
       directly.

   *   SYMDEBUG, which generates Symbolic Debug information in the program
       file.  (If a source program was compiled with SYMDEBUG, you can
       link the object module with the NODEBUG option, causing the program
       to ignore the Symbolic Debug information and improving its
       execution speed.)

This control option makes a program compile more slowly:

   *   CALLINTRINSIC, which causes the compiler to check the file SYSINTR
       for each call literal.

The Optimizer

The optimizer is an optional part of the compiler that modifies your code
so that it uses machine resources more efficiently, using less space and
running faster.  Note that the optimizer improves your code, not your
algorithm.  Optimization is no substitute for improving your algorithm.  A
good, unoptimized algorithm is always better than a poor, optimized one.

You can compile your program with level one optimization by compiling it
with the control option OPTIMIZE=1.  The advantages of level one
optimization are:

   *   The program is approximately 3.1% smaller.

   *   The program runs 2.8% to 4.4% faster.  (Programs that use the
       central processing unit intensively and those that do relatively
       little I-O benefit most.)

The disadvantages of level one optimization are:

   *   The program compiles approximately 10% more slowly.

   *   The symbolic debugger cannot be used with the program.

   *   The statement numbers are not visible when using DEBUG or from trap
       messages (see the example in Chapter 7 for details).

Level two optimization is not available for HP COBOL II/XL programs for
the following reasons:

   *   The most common COBOL data type is the multibyte string, which is
       difficult to optimize.  (The easiest types to optimize are level 01
       or 77, 16- or 32-bit, binary data types.)

   *   HP COBOL II/XL programs call millicode routines far more often than
       non-COBOL programs do, and the optimizer does not optimize across
       routine calls.

Level two optimization for COBOL would not have improved performance
enough to be worth the effort, which was spent instead on improving
millicode routines and code generation.
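
As noted above, you request level one optimization with the control option
OPTIMIZE=1.  Control options are normally given on $CONTROL records at the
front of the source file; see the HP COBOL II/XL Reference Manual for the
full $CONTROL syntax.  As a minimal sketch (the program name is invented),
a program compiled with level one optimization might begin:

           $CONTROL OPTIMIZE=1
            IDENTIFICATION DIVISION.
            PROGRAM-ID. INVRPT.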

Millicode Routines.

Millicode routines are assembly language routines that deliver high
performance for common COBOL operations such as move and compare.  They
support COBOL operations on MPE XL the way microcode supports them on
MPE V.  Millicode routines are very specialized, tuned to provide optimal
performance for specific data types of specific lengths.  Based on the
operation and the data types and lengths, the COBOL compiler calls the
appropriate millicode routine.

The calling convention for a millicode routine differs from that of an
ordinary routine in the following ways:

   *   A millicode call requires fewer registers to be saved across the
       call, making it faster than an ordinary call.

   *   A millicode call uses general register 31 as the return register,
       rather than general register 2.

When to Use the Optimizer.

Compile your program with optimization only after you have debugged it.
The optimizer can transform only legal programs.

Once you have compiled your program with optimization, you cannot use the
symbolic debugger to debug it, because the debug information is missing:
the compiler does not generate debug information and perform optimizations
at the same time.  You can still use DEBUG on your program after you have
compiled it with optimization; however, the statement numbers will not
appear in the code.

Transformations.

The five level one optimizer transformations are:

   1.  Basic block optimization.

       The optimizer reads the machine code (specifically, the machine
       instruction list).  When it reads a branch instruction, it creates
       a basic block of code.  It joins two basic blocks into one extended
       block in the following two cases:

       a.  If it can remove the branch instruction and append the
           "branched to" block to the "branched from" block.

       b.  If the basic blocks are logically related and can be optimized
           as a single unit.

       Basic block optimization has these three components:

       a.  Removal of common expressions in basic blocks.  If more than
           one expression assigns the same value to the same field within
           the block, the optimizer removes all but the first expression.

       b.  Removal and optimization of load and copy instructions in basic
           blocks.

       c.  Optimization of elementary branches in basic and extended
           blocks (removal of unnecessary branches and more efficient
           arrangement of necessary branches).

   2.  Instruction scheduling.

       Instruction scheduling reorders the instructions within a basic
       block to accomplish the following:

       a.  Minimize load and store instructions.

       b.  Optimize the scheduling of instructions that follow branches.

   3.  Dead code elimination.

       Dead code elimination is the removal of all code that will never be
       executed.

   4.  Branch optimization.

       Branch optimization traces the flow of IF-THEN-ELSE-IF ...
       statements and reorganizes them more efficiently.

   5.  Peephole optimization.

       Peephole optimization substitutes simpler, equivalent instruction
       patterns for more complex ones.

