Operation [ COBOL/HP-UX Operating Guide for the Series 700 and 800 ] MPE/iX 5.0 Documentation
COBOL/HP-UX Operating Guide for the Series 700 and 800
Operation
The following sections describe programming methods that will allow you
to make the most efficient use of your native code programs.
Performance Programming
This section gives some guidelines which, if followed, allow your COBOL
system to optimize fully the native code produced for your programs.
This results in smaller and faster applications.
Do remember that these are only guidelines; programs that do not conform
to these guidelines still run correctly, just less efficiently.
Environment Division Considerations.
If you use the DECIMAL POINT IS COMMA clause, you must ensure that any
commas separating two numeric literals are followed by a space. Any
commas which are not followed by a space are treated as decimal points by
the Micro Focus COBOL system.
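
As a minimal sketch of this rule (the program name and data item are
illustrative and do not come from this guide), the two ADD statements
below differ only in the space after the comma:

       identification division.
       program-id. commademo.
       environment division.
       configuration section.
       special-names.
           decimal-point is comma.
       data division.
       working-storage section.
       01  total           pic 9(3)v9 value zero.
       procedure division.
       main-para.
      * Comma followed by a space: two literals, 1 and 2.
           add 1, 2 to total
      * Comma not followed by a space: one literal, 1,2.
           add 1,2 to total
           stop run.

The first statement adds the two literals 1 and 2. The second adds the
single literal 1,2 (one and two tenths).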
Data Division Considerations.
This section identifies items in the Data Division which affect the size
and performance of your program and suggests the most efficient ways of
using them.
Data Types
Using the correct data type is important for getting the greatest
efficiency from operations, particularly arithmetic operations.
The following items show how different data types affect performance.
* Use unsigned COMP-5 or COMP-X numeric data items, preferably COMP-5.
For operations on integer data items containing up to nine digits
(or up to four bytes for binary items), the following list
applies. It begins with the fastest numeric data types and ends
with the slowest.
COMP-5    Operations on COMP-5 data items are performed as true
          binary operations. This results in the fastest
          performance. COMP-5 "2" is assumed. See Appendix G,
          Directives for Compiler/Generator, for details of the
          COMP-5 directive.
COMP-X    Operations on COMP-X data items are performed as binary
          operations. However, the data items are stored in COBOL
          order, which may be the opposite of the byte ordering on
          the CPU being used to execute the program. This means
          that arithmetic on items longer than one byte may involve
          operations to change the byte order before the arithmetic
          can be carried out, resulting in slower arithmetic than
          on COMP-5 items.
COMP      Operations on COMP data items, by default, operate as
          defined in the ANSI standard. This results in truncation
          of the result of operations before it is stored in a COMP
          item. This generally results in slower arithmetic than on
          COMP-X data items. However, if the directives COMP and
          NOTRUNC are used when the program is compiled, operations
          on COMP items behave exactly like operations on COMP-X
          items.
DISPLAY   Arithmetic on DISPLAY items is faster than arithmetic on
          COMP-3 items in intermediate (.int) code. If a data item
          takes up more than four bytes of storage or is
          non-integer, then arithmetic is generally much slower.
COMP-3    Arithmetic on COMP-3 data items is performed in packed
          decimal and is much slower than arithmetic on COMP items.
          It should be avoided.
* Mixing different usage types in the same statement is normally
less efficient than using the same usage throughout. The main
exception to this is mixing COMP-5 and COMP-X items, where there
is very little impact on performance.
* The fastest and smallest code is produced for operations on items
that contain up to nine digits, or four bytes for binary items
(such as COMP, COMP-5 or COMP-X).
* Use only numeric items that occupy one, two or four bytes of
storage.
Operations on numeric items containing more than nine digits, or
more than four bytes for binary items, produce the slowest and
largest code.
* Do not redefine COMP-5 items to access individual bytes; if access
to individual bytes is required use COMP-X.
* Use edited items only when necessary and use only simple ones such
as ZZ9. If possible, use them in a subroutine so the total number
of edited moves in your program is kept as small as possible.
* Align two-byte items on two-byte boundaries and longer items on
four-byte boundaries. Ensure that the stride of a table is a multiple
of the size of its largest element and, if possible, a power of two
(pad the table as necessary). The stride of a table is the size of
one element; for example, the stride of the following table is two
bytes:
01 a occurs 10.
03 b pic x.
03 c pic x.
Data items that are specified as 01 level items or are the
appropriate number of bytes from the start of an 01 level item
will normally be correctly aligned (this assumes that the
ALIGN directive is set to a multiple of four).
The REF directive can be used to find out how data items are aligned.
* The SYNC clause is documentary, except when you specify the
IBMCOMP compiler directive.
* Items in the Working-Storage Section are accessed more quickly
than those in the Linkage Section.
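
The data definition guidelines above can be sketched as follows. This
is an illustration only: the program name, data names and table size
are assumptions, and the byte sizes in the comments rely on the usual
Micro Focus storage rules for COMP-5 (two bytes for up to four digits,
four bytes for up to nine digits).

       identification division.
       program-id. datademo.
       data division.
       working-storage section.
      * Unsigned binary items of no more than nine digits.
       01  order-count     pic 9(9) comp-5 value zero.
       01  line-count      pic 9(4) comp-5 value zero.
      * Each entry is padded from six to eight bytes so that the
      * stride is a power of two and item-price starts on a
      * four-byte boundary.
       01  price-table.
           03  price-entry occurs 100.
               05  item-code   pic 9(4) comp-5.
               05  filler      pic x(2).
               05  item-price  pic 9(9) comp-5.
       procedure division.
       main-para.
           move 1 to line-count
           move 1250 to item-price (line-count)
           add 1 to order-count
           stop run.

Without the two-byte FILLER the stride would be six bytes, and
item-price would fall on a four-byte boundary only in every other
entry.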
Procedure Division Considerations.
This section identifies items in the Procedure Division which affect the
size and performance of your program and suggests the most efficient ways
of using them.
As a general rule, the simpler the operation, the faster it executes and
the smaller the compiled code. To get the best performance it is often
better to use a number of simple operations rather than one complex
operation. The following are general guidelines that result in the
fastest and smallest possible code.
Arithmetic Statements
To get the best performance from arithmetic statements always use the
simplest forms.
Operations
* The following operations are optimized for COMP-5 and COMP-X data
items up to four bytes long.
move a to b
add a to b
subtract a from b
multiply a by b
divide a into b
if a condition b
where:
a is a numeric literal or data item up to four bytes long
b is a numeric data item up to four bytes long
On other data items, these simple operations result in faster code
than more complex instructions, but the benefits are not so great
as with COMP-5 or COMP-X items.
More complex forms of these instructions, involving more than two
operands, may not produce code as efficient as the simple form.
* Do not use the GIVING form of these verbs. If necessary, create
temporary variables and code several simple statements to achieve
the same result. For example, write:
move a to c
add b to c
rather than:
add a to b giving c
* Do not use the REMAINDER, ROUNDED, ON SIZE ERROR or CORRESPONDING
phrases.
* No optimization is done on arithmetic statements if the
ON SIZE ERROR phrase is used. For this reason, it is recommended
that this phrase is not used if high performance is required.
* The ROUNDED phrase impacts performance, but it is generally faster
to use ROUNDED than try to round the result using your own routine.
The only exception to this is when using the simple operations
described above on COMP-5 or COMP-X items.
* Do not mix items of different sizes in an arithmetic statement
(for example, try to use all two-byte items or all four-byte
items).
COMPUTE
* Do not use the COMPUTE verb except for performing calculations
involving floating point data. In this case, COMPUTE is the most
efficient statement.
COMPUTE is defined to evaluate its result in a temporary field of
the largest possible significance to hold the result. This
temporary field may be of a larger significance than is actually
required, causing the operations to be slower. The temporary
field is stored in the target field with any necessary truncation.
This also applies to IF statements involving expressions. For example:
if a + b < c
It is often better to define a temporary variable of the required
size and use that to hold the result of the arithmetic, as
follows:
move a to temp
add b to temp
if temp < c
Decimal Alignment
* Operations are fastest if they need no decimal point alignment.
For example:
* ADD and SUBTRACT operations are fastest if the source and
target have the same decimal alignment.
* MULTIPLY operations are fastest if the decimal alignment of
the target is the sum of the alignments of the sources.
* DIVIDE operations are fastest if the divisor has no
fractional part and if the dividend and the target have the
same decimal alignment.
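
A minimal sketch of these alignment rules follows. The program name,
data names and values are illustrative assumptions; DISPLAY usage is
chosen only because the items are non-integer.

       identification division.
       program-id. aligndemo.
       data division.
       working-storage section.
       01  quantity        pic 9(4) value 3.
       01  unit-price      pic 9(5)v99 value 2.50.
       01  tax             pic 9(5)v99 value 0.40.
       01  line-total      pic 9(5)v99 value zero.
       procedure division.
       main-para.
      * Source and target both have two decimal places.
           move unit-price to line-total
           add tax to line-total
      * Integer multiplier: target alignment (2) is 0 + 2.
           multiply quantity by line-total
      * Integer divisor; dividend and target are the same item.
           divide quantity into line-total
           stop run.

None of these statements requires the operands to be scaled before the
arithmetic is carried out.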
Exponentiation
* Exponentiation operations are relatively slow. No optimization is
done for them. MULTIPLY and DIVIDE operations should be used
instead where integer powers are involved.
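
For example, a cube can be formed with two MULTIPLY statements instead
of an exponentiation. This is a sketch only; the program name and data
names are illustrative.

       identification division.
       program-id. cubedemo.
       data division.
       working-storage section.
       01  x               pic 9(9) comp-5 value 7.
       01  y               pic 9(9) comp-5 value zero.
       procedure division.
       main-para.
      * Equivalent to: compute y = x ** 3
           move x to y
           multiply x by y
           multiply x by y
           stop run.

Both items are the same size and usage, so each statement is one of
the simple forms described under Arithmetic Statements.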
Initialization
* By default, COBOL initializes all data items in the
Working-Storage Section to spaces if no VALUE clause is specified.
This includes all numeric items. The effect of doing arithmetic
on such an item depends on the usage of the item:
Usage DISPLAY:
Intermediate code reports run-time error 163 "Illegal
character in numeric field". In generated code, the results
are unpredictable.
Any other usage:
Results are unpredictable.
To avoid these problems, all numeric items should be initialized
to numeric values before use.
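
A minimal sketch of both ways of doing this, using a VALUE clause or
an explicit MOVE before the first use (the program name and data names
are illustrative):

       identification division.
       program-id. initdemo.
       data division.
       working-storage section.
      * Initialized by a VALUE clause.
       01  record-count    pic 9(9) comp-5 value zero.
      * No VALUE clause: must be set before any arithmetic.
       01  work-total      pic 9(9) comp-5.
       procedure division.
       main-para.
           move zero to work-total
           add 1 to record-count
           add record-count to work-total
           stop run.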
Alphanumeric Data Manipulation
* Reference modified fields are optimized if coded in one of the
following forms:
item (literal:)
item (literal:literal)
item (literal:variable)
item (variable:variable)
item (variable + literal:literal)
item (variable - literal:literal)
item (variable + literal:variable)
item (variable - literal:variable)
Other forms of reference modification are inefficient.
* If the offset or length of the reference modification is a data
item, use a COMP-5 item of the smallest optimum size (one, two or
four bytes) that will accommodate the range of values involved.
Define it in Working-Storage.
* In a MOVE statement, have the source item the same size as or
larger than the target. This prevents space-padding code being
generated.
* Do not use the INITIALIZE verb.
* Do not use the CORRESPONDING option of the MOVE verb.
* Do not use the STRING or UNSTRING verbs - they create a lot of
code. For manipulating file-names use the COBOL System Library
Routines CBL_SPLIT_FILENAME and CBL_JOIN_FILENAME. For other
purposes, create your own loops; they are almost always more
efficient.
* If you attempt a MOVE between two numeric edited items the result
will be undefined although no error status is returned.
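
The reference modification guidelines above can be sketched as
follows. The program name, data names and record layout are
illustrative assumptions; the offset and length are one-byte COMP-5
items in Working-Storage because their values never exceed 80.

       identification division.
       program-id. refmoddemo.
       data division.
       working-storage section.
       01  input-record    pic x(80) value "CUSTOMER  SMITH".
       01  field-start     pic 9(2) comp-5.
       01  field-length    pic 9(2) comp-5.
       01  customer-name   pic x(10).
       procedure division.
       main-para.
           move 11 to field-start
           move 10 to field-length
      * item (variable:variable) is one of the optimized forms.
      * The source is the same size as the target, so no
      * space padding is needed.
           move input-record (field-start:field-length)
             to customer-name
           stop run.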
Table Handling
* A subscript should be a COMP-5 item of the smallest optimum size
(one, two or four bytes) that will accommodate the range of values
involved.
* The optimal definition for a subscript is a two-byte COMP-5 item.
* Subscripts for items that have the same bound and stride and are
used for consecutive operands are optimized so that they are only
evaluated once. For example:
01 a pic xx comp-5 occurs 10.
01 b pic xx comp-5 occurs 10.
01 c pic xx comp-5 occurs 10.
01 d pic xx comp-5 occurs 10.
. . .
compute a(i) = b(i) + c(i) - d(i)
would result in the subscript i being evaluated only once,
although it is used four times in the statement. The stride of
each of these tables is the same - two.
* When compiling your program for use in production, use the NOBOUND
directive. Use BOUND only when debugging. It causes code to be
generated every time a subscript or index is used to check that it
is within the defined bounds of the table.
* If you are using USAGE DISPLAY or COMP-3 subscripts, use of the
BOUNDOPT directive can give performance improvements. For
example,
. . .
01 array pic x occurs 20.
01 array-index pic 9(5) value 2.
. . .
move "a" to array(array-index).
. . .
If the program is compiled without BOUNDOPT, all five digits of
array-index are used to evaluate the subscript. If BOUNDOPT is
specified, only the last two digits of array-index are used, as
only two digits are needed to access all elements of a 20 element
table.
* Access to tables defined with OCCURS ... DEPENDING is less
efficient than access to tables of fixed size, and so should be
avoided where high performance is needed.
* Bound checking on a variable length table checks only if the
subscript or index points outside the maximum length of the table;
it does not take account of the table's current length (that is,
the value of the item specified in the DEPENDING phrase).
Conditional Statements
* In IF statements, conditions are evaluated in the order that they
occur. Therefore, you should put the conditions that are most
likely to produce a false result before others. Similarly, you
should put those conditions that can be evaluated fastest before
slower conditions.
* Comparisons using EQUALS (=) or NOT EQUAL are faster than
comparisons using GREATER (>) or LESS (<). Some CPUs can compare
against binary zero more efficiently than against other literals.
* In both alphanumeric and numeric comparisons, have the source and
target items the same size.
* Do not use large EVALUATE statements. They are compiled into a
series of IF ... ELSE IF ... statements where the value of the
expression is derived separately for each WHEN clause.
* Order an EVALUATE statement so that the most commonly satisfied
condition is tried first. Do not use complex expressions or
Linkage items as conditions in an EVALUATE statement; instead,
calculate the value yourself in a Working-Storage item and use
that item.
* Use a GO TO ... DEPENDING statement if the possible values are
fairly close together. Although this construct has the
disadvantage of not being particularly suited to structured
programming, it is efficient.
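
As a sketch of the GO TO ... DEPENDING form (the program name, data
names and paragraph names are illustrative), a code in the range 1 to
3 selects one of three paragraphs:

       identification division.
       program-id. gotodemo.
       data division.
       working-storage section.
       01  action-code     pic 9(4) comp-5 value 2.
       procedure division.
       main-para.
           go to add-rec change-rec delete-rec
               depending on action-code
      * Control arrives here if action-code is not 1, 2 or 3.
           display "unknown action"
           stop run.
       add-rec.
           display "add"
           stop run.
       change-rec.
           display "change"
           stop run.
       delete-rec.
           display "delete"
           stop run.

With action-code set to 2, control passes to change-rec.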
Logic Operators
A number of COBOL system library routines (call-by-name) are available to
perform bitwise logic operations on data items. These are described in
Chapter 8, Library Routines (Call-by-Name). They perform operations
such as bitwise AND, OR and XOR.
The generator recognizes calls to these routines and, if possible,
optimizes them to produce in-line code rather than calls to the run-time
system. The calls are optimized if the length is specified as a literal.
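
A minimal sketch of such a call follows. The parameter order shown
(source, target, then the length passed BY VALUE) is an assumption
here and should be confirmed against Chapter 8, Library Routines
(Call-by-Name). The program name and data names are illustrative.

       identification division.
       program-id. anddemo.
       data division.
       working-storage section.
       01  mask-byte       pic x value x"0F".
       01  flag-byte       pic x value x"FF".
       procedure division.
       main-para.
      * The length is a literal, so the call can be in-lined.
      * Assumed: the result is left in the second operand.
           call "CBL_AND" using mask-byte flag-byte
                          by value 1
           stop run.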
The PERFORM Statement
Using PERFORM is generally very efficient, and is a very good way of
keeping the size of your program down as well as giving it an
easy-to-maintain structure. The following rules enable you to use it in
the most efficient ways.
* Put commonly used pieces of code in sections and PERFORM them.
This saves space for any statement used more than once that
produces more than eight bytes of generated code (in a NOTRICKLE
program). It is often beneficial even on single statements, for
example edited moves, subprogram calls or file I/O.
* Use PERFORM section-name n TIMES rather than the equivalent in-line
PERFORM.
* When incrementing or decrementing a counter, test for termination
against a literal value rather than a value held in a data item. For
example, to execute a loop n times, set the counter to n and then
decrement the counter until it becomes zero, rather than
incrementing the counter from 0 to n.
* Do not use PERFORM a THRU b. For example, the following does not
produce very efficient code for the first PERFORM because the end
of the range of the second PERFORM lies within the range of the
first PERFORM.
perform a thru e
perform b thru d
stop run
a.
. . .
b.
. . .
c.
. . .
d.
. . .
e.
. . .
* The range of an out-of-line PERFORM statement should not contain
the end of another perform range. If it does, the program is said
to trickle; that is, execution is allowed to trickle past the end
of a perform range.
One way to ensure that your program does not trickle is to perform
sections only, not paragraphs. This coding style generally gives
you a more easily maintained program, too.
For example, do not use code in the form:
procedure division.
perform a thru c
perform b thru d
stop run.
a.
display "a".
b.
display "b".
c.
display "c".
d.
display "d".
e.
* Another way to ensure that your program does not trickle is to
put a STOP RUN after the main section (that is, before the first
PERFORMed section/paragraph).
* The presence of an ALTER statement in a program prevents nearly all
PERFORM optimization. As a general rule, the ALTER statement should
be avoided.
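
Several of the PERFORM guidelines above can be combined in one sketch.
Only sections are performed, the main section ends with STOP RUN, and
the loop counts down to the literal zero. The program name, data names
and section names are illustrative.

       identification division.
       program-id. perfdemo.
       data division.
       working-storage section.
       01  loop-count      pic 9(4) comp-5.
       01  item-total      pic 9(9) comp-5 value zero.
       procedure division.
       main section.
       main-start.
           move 10 to loop-count
           perform add-item until loop-count = 0
           stop run.
       add-item section.
       add-item-start.
           add 1 to item-total
           subtract 1 from loop-count.

Because no perform range contains the end of another, this program
does not trickle.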
CALL Statements
* If you are not using nested programs, ensure that the NONESTCALL
compiler directive is specified.
* Some operating systems will share the code portion of statically
linked applications, resulting in greater system efficiency
between multiple processes.
* Try to limit the number of CALL statements a program makes, if
necessary by avoiding splitting it into too many subprograms.
* CALL statements that do not alter the RETURN-CODE special register,
or whose effect on RETURN-CODE is of no interest, should use a
calling convention of 4. The compiler directive DEFAULTCALLS can
be used to set this globally.
* Calls to the COBOL System Library Routines that carry out logical
operations, such as CBL_AND, are optimized by the generator to
actual machine logic operators, provided the parameters are eight
bytes long or less. These too should use a calling convention of 4.
* Ensure that parameters appear in the same order in the CALL
statement as they do in the procedure division header.
* Ensure that the order of parameters is the same as the order of
their description in the called program's Linkage Section.
* Ensure that any Linkage Section items which are not referenced in
the procedure division header appear after those that are.
* Ensure that all parameters are 01 or 77 level items.
Parameters
Avoid making many references to linkage items. These include items
defined in the Linkage Section, items whose addresses are set to memory
allocated by CBL_ALLOC_MEM, and items defined as EXTERNAL.
Accessing Linkage Section items is always slower than accessing Working-Storage
Section items. If a Linkage Section item is used frequently, it is
faster to move it into a Working-Storage Section item when the program is
entered and move it back to the Linkage Section if necessary before
exiting to the calling program. The Working-Storage Section item should
then be accessed throughout the program rather than the item in the
Linkage Section.
If a linkage parameter is optional, you can detect its presence, provided
the program was called from another COBOL program, using a statement of
the form:
if address of linkage-item not = null
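
The technique of copying a frequently used Linkage Section item into
Working-Storage can be sketched as a subprogram like the following.
The program name, data names and the repeat count are illustrative
assumptions.

       identification division.
       program-id. sumsub.
       data division.
       working-storage section.
       01  ws-total        pic 9(9) comp-5.
       linkage section.
       01  lk-total        pic 9(9) comp-5.
       procedure division using lk-total.
       main section.
       main-start.
      * Copy the parameter once on entry.
           move lk-total to ws-total
           perform add-one 1000 times
      * Copy the result back once before returning.
           move ws-total to lk-total
           exit program.
       add-one section.
       add-one-start.
           add 1 to ws-total.

The Linkage Section item is referenced twice rather than once per
iteration.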
Sorting
If you use input and output procedures with a sort in your program, it is
important that you write them efficiently as they will be executed once
for each record you are sorting. Inefficient input and output procedures
can make the sort process appear to be very slow.
Compiler Directives.
A number of compiler directives can be used to make the native code for a
program better optimized. Some of these directives must be used with
care; ensure that the behavior you get with them is acceptable.
In general, always use the following directives when compiling your
programs to native code:
NOALTER
ALIGN(4)
COMP
NOANIM
NOBOUND
NOCHECKDIV
NONESTCALL
NOODOSLIDE
NOQUAL
NOSEG
NOTRUNC
Other suggestions (to help prevent inefficient coding):
REMOVE "UNSTRING"
REMOVE "STRING"
REMOVE "GIVING"
REMOVE "ROUNDED"
REMOVE "COMPUTE"
REMOVE "ERROR"
REMOVE "ALTER"
REMOVE "INITIALIZE"
REMOVE "CORRESPONDING"
REMOVE "TALLYING"
REMOVE "THRU"
REMOVE "THROUGH"
Using Directives to Optimize for Speed
There are many directives you can use to optimize for speed. In some
cases, the
defaults for compiler directives are the ones that provide speed
optimization. This section points out directives that need to be changed
from their default values to provide for speed optimization.
ALTER Directive
For efficiency reasons, you should not use ALTER statements in programs.
It is recommended that you avoid them altogether, and compile with
NOALTER, to prevent the compiler from having to produce code to look for
them.
BOUND Directive
The BOUND directive does boundary checking on table items.
Your applications can be made faster (and smaller) by compiling with
NOBOUND. Otherwise, the compiler inserts extra code to do boundary
checking on all references to table items.
During testing, you should use the BOUND option until you are satisfied
that your program is not referencing data beyond your table limits. For
production, NOBOUND gives you the desired efficiency.
BOUNDOPT Directive
The BOUNDOPT directive can be used to optimize your code if:
* you are using USAGE DISPLAY or COMP-3 subscripts
* you are using NOBOUND (see the BOUND directive above)
When BOUNDOPT is specified, digits in a USAGE DISPLAY subscript above the
size of the table are ignored. For example, a PIC 9(3) subscript would
be treated as PIC 9(2) for a table with fewer than 100 entries.
We recommend that you do not use USAGE DISPLAY subscripts.
COMP Directive
The COMP directive stops the compiler from generating code that checks
for numeric overflow. This produces highly compact and efficient code.
COMP changes the behavior of arithmetic on data items defined as USAGE
COMP. It produces more efficient code, but the behavior does not conform
to the ANSI standard.
If used with the proper care, COMP can improve the speed of your
programs.
IBMCOMP Directive
The IBMCOMP directive causes data items with USAGE COMP to be compiled in
IBM synchronized format. It also affects the storage of words and bytes
(see your Language Reference for details).
NESTCALL Directive
The NESTCALL directive allows nested programs to appear in your program.
If you know you have no nested calls in your program, specifying
NONESTCALL allows the Compiler to generate slightly more efficient code.
TRUNC Directive
The TRUNC directive causes the compiler to create code to determine
whether variables need to be truncated or not.
If you are certain that you do not need truncation of variables, then
NOTRUNC causes the creation of more efficient code.
Using Directives to Optimize for Size
There are many directives you can use to optimize for size. In some
cases, the defaults for compiler directives are the choices that provide
size optimization. This section points out directives that need to be
changed from their default values to provide for size optimization.
ALIGN Directive
This directive specifies the boundary on which 01 and 77 level items are
stored. ALIGN(4) is the default. If you use ALIGN(2), you force
alignment on two-byte boundaries and thus reduce wasted space.
ANIM Directive
The ANIM directive is used frequently during testing. However, when you
are compiling for production, use NOANIM. This reduces data size slightly.
Using Dynamically Linked Executable Files
If your system supports dynamically linked executable files, you should
use these in preference to statically linked executable files. The
advantages of using dynamic linking over static linking are:
* drastically reduced size of executable file
* faster linking (that is, faster creation of executable files)
* greater shareability of executable files (that is, more executable
files can be run at the same time)
Dynamically linked executable files can be created by setting the -B cob
flag to "d" for dynamic, which is the default if the flag is not
explicitly set. However, if your system does not support dynamic
linking, static linking is the default and use of the -B cob flag is
unsupported. For full details on -B, see Appendix D, Descriptions of
cob Flags.
Implementation of Floating-Point
Micro Focus COBOL provides IEEE floating-point support. The following
sections describe the range of values and the accuracy available using
floating-point support.
Range.
The range of values available with each of the two COBOL binary
floating-point data types is as follows:
COMP-1   from  8.43E-37    through  3.37E38
              -8.43E-37    through -3.37E38
COMP-2   from  4.19E-307   through  1.67E308
              -4.19E-307   through -1.67E308
Accuracy.
The following table shows the relationship between storage size and
significance.
Type     Size      Significant digits
COMP-1   4 bytes   6-7
COMP-2   8 bytes   15-16
Handling Large Programs
The Micro Focus COBOL system enables you to execute statically linked or
dynamically loadable code. Statically linked code is an executable
object module which has all of its overlays permanently linked into
memory. Dynamically loadable code is either a COBOL intermediate code
(.int) file or a COBOL native code (.gnt) file. The dynamic loader loads
COBOL program overlays as needed.
When designing a COBOL application program that is to be dynamically
loadable, you want it to make efficient use of the available memory.
It is possible, using the COBOL system, to create and run programs that
use more memory than is physically present in your computer. There are
two ways you can do this:
* Carefully segment one large COBOL program so that it holds on disk
the code that is not being used, or
* Separate the program into smaller programs and use the COBOL call
mechanism.
The following sections look at these topics.
Note: Segmentation is rarely required for UNIX as memory management is
not usually a problem. However, the facility is available
should it be required and so is described here.
Segmentation (overlaying)
This section describes the segmentation mechanism, which enables you to
divide a COBOL program with a large Procedure Division into a COBOL
program with a small Procedure Division and a number of overlays
containing the remainder of the Procedure Division. Segmentation enables
all of the Procedure Division to be loaded into the available memory.
However, because it cannot be loaded all at once, it is loaded one
segment at a time to achieve the same effect in the reduced memory space.
To segment a program, you must divide it into sections, giving each
SECTION header a segment number. Each group of sections with a common
segment number then forms a single segment in the Procedure Division.
For example:
. . .
. . .
section 52.
move a to b.
. . .
. . .
section 62.
move x to y.
. . .
. . .
You can use segmentation only on the Procedure Division. The
Identification, Environment and Data Divisions are common to all
segments. In addition there may be a common Procedure Division segment
called the root segment. All of this common code is known as the
permanent segment. The compiler allows space for the permanent segment
and for the largest independent segment in a segmented program. See your
Language Reference for a complete description of control flow between
permanent and independent segments.
You can suppress segmentation at compile time using the NOSEG compiler
directive, even if you specify segment numbers. See Appendix G,
Directives for Compiler/Generator, for details on this directive.
Interprogram Communication (CALL)
This section looks at the interprogram communication (call) mechanism for
creating large applications but avoiding large programs.
The COBOL system enables you to design applications as separate programs
at source level. Each program is then called dynamically from the main
application program, without first having to be linked with the other
programs. The format of the call statement is described in your Language
Reference. An intermediate code program can call a native code program
and vice versa.
Figure 10-1 shows an example of an application divided up in this
way.
Figure 10-1. Application Divided Using CALL
The main program A, which is permanently resident in memory, calls B, C,
or H, which are programs within the same suite. These programs call
other specific functions as follows:
B calls D, E and F
C calls X, Y, Z, L and K
H calls K
K calls M, N and O.
Because the functions B, C and H are stand-alone they do not need to be
permanently resident in memory together. You can call them as they are
needed, using the same physical memory when they are called. The same
applies to the lower functions at their level in the tree structure.
In the figure, you would have to plan the use of CALL and CANCEL
operations so that a frequently called subprogram such as K would be
kept in memory to avoid load time. On the other hand, because it is
called by C or H, it cannot be initially called without C or H in memory;
that is, the larger of C or H should call K initially so as to allow
space. It is important also to avoid overflow of programs. At the level
of X, Y and Z, the order in which they are loaded does not matter because
they do not make calls at a lower level.
It is advantageous to leave called programs in memory if they open files,
to avoid having to reopen them on every call. The EXIT statement leaves
files open in memory. The CANCEL statement closes the files and releases
the memory that the canceled program occupies, provided all other
programs in the executable file have been canceled or never called.
If you use a tree structure of called, independent programs as shown
earlier, each program can call the next dynamically by using the
technique shown in the following sample coding:
working-storage section.
01 next-prog pic x(20) value spaces.
01 current-prog pic x(20) value "rstprg".
procedure division.
loop.
call current-prog using next-prog
cancel current-prog
if next-prog = spaces
stop run
end-if
move next-prog to current-prog
move spaces to next-prog
go to loop.
The actual programs to be run can then specify their successors as
follows:
. . .
. . .
linkage section.
01 next-prog pic x(20).
. . .
. . .
procedure division using next-prog.
. . .
. . .
move "follow" to next-prog
exit program.
In this way, each independent segment or subprogram cancels itself and
changes the name in the CALL statement to call the next one with the
USING phrase.