Performance Programming [ Micro Focus COBOL for UNIX COBOL User Guide ] MPE/iX 5.0 Documentation
Micro Focus COBOL for UNIX COBOL User Guide
Performance Programming
This section gives some guidelines which, if followed, enable your COBOL
system to optimize fully the native code produced for your programs.
This results in smaller and faster applications.
This COBOL system, or an equivalent, is available from Micro Focus for
many environments, including DOS, Windows, OS/2, Windows/NT and most
variants of UNIX. Programs written on one system can be moved to other
systems provided certain portability rules are followed. See the chapter
Portability Issues for more details.
The run-time system and native code generators are tailored separately to
each environment. However, they are all based on one of two
technologies:
* For operating systems that use 16-bit segmented addressing -
referred to as 16-bit.
* For operating systems that use 32-bit flat addressing - referred
to as 32-bit.
The 16-bit operating systems supported are: DOS, Windows and OS/2 v1.x
operating systems, running on the Intel 80x86 family of processors. The
32-bit operating systems supported include all supported versions of
UNIX, Windows/NT and OS/2 v2.x. The 16-bit technology runs on Windows/NT
and OS/2 v2.x but gives poorer performance than the 32-bit technology.
This section contains information for making your program efficient with
both of these technologies. Where the guidance differs between the two
technologies, the difference is noted.
Do remember that these are only guidelines; programs that do not conform
to these guidelines still run correctly, just less efficiently.
Data Division Considerations
This section identifies items in the Data Division which affect the size
and performance of your program, and suggests the most efficient ways of
using them.
Data Types.
Using the correct data-type is important to get the greatest efficiency
from operations, particularly arithmetic operations. For details of the
storage of each data type please refer to your Language Reference.
* Use signed integer COMP-5 or COMP-X for all numeric data items.
COMP-5 usage is defined for binary storage with the value stored in the
native byte sequence of the processor. This makes it ideal for
arithmetic efficiency. However, it is not suitable for data stored in
files or passed to other machines over a network as the data is not
accessible by programs running in an environment in which COMP-5 is
stored in a different sequence. In these cases use COMP-X.
* In native code programs, it is more efficient to move integer data
items that are not COMP-X or COMP-5 to COMP-5 data items before
doing arithmetic operations on them.
* The following shows the order of speed of processing data of
different types. This applies to any size of numeric data item.
On 16-bit systems it applies to non-integer as well as integer,
provided the ON SIZE ERROR clause is not specified. On 32-bit
systems this applies to integer data items. The list is ordered
fastest to slowest.
COMP-5
    Operations on COMP-5 data items are performed as true native
    binary operations. Fastest performance of COMP-5 is given using
    the Compiler directive COMP-5"2", the default. See your COBOL
    System Reference for more details.
COMP-X
    Operations on COMP-X data items are performed as binary
    operations. The data items are stored in COBOL order, which
    might not be the same as the native byte order. For example, on
    Intel 80x86 systems (DOS, Windows, OS/2 and some versions of
    UNIX), byte ordering is different. In this case, arithmetic on
    items longer than one byte involves operations to change the
    byte order before the arithmetic can be carried out, resulting
    in slightly slower arithmetic than on COMP-5 items.
COMP
    Operations on COMP data items, by default, operate as defined in
    the ANSI standard. This results in truncation of the result of
    operations before it is stored in a COMP item. This generally
    results in slower arithmetic than on COMP-X data items. However,
    if the directives COMP and NOTRUNC are used when the program is
    compiled, operations on COMP items behave exactly like
    operations on COMP-X items.
COMP-3
    Arithmetic on COMP-3 data items is performed in packed decimal
    and is much slower than arithmetic on COMP items. It should be
    avoided.
DISPLAY
    Arithmetic on DISPLAY items is generally much slower than
    arithmetic on COMP items, and should be avoided. In 32-bit
    systems, arithmetic on DISPLAY items in intermediate code is
    faster than on COMP-3 items.
If the ON SIZE ERROR clause is used on 16-bit systems, or if the data
item is non-integer, the above ordering does not apply. For example, in
16-bit COBOL systems arithmetic on such items uses packed decimal
registers inside the run-time system. In this non-optimized mode the
order is:
COMP-3
DISPLAY
COMP-5
COMP
COMP-X
However, individual cases might behave differently.
In intermediate code, the order is:
DISPLAY
COMP-5
COMP
COMP-X
COMP-3
* Addition and subtraction on non-integer data items are fastest if
the items are COMP-5 and the decimal alignment is the same in
both.
* Mixing different usage types in the same statement is less
efficient than using the same usage throughout. The main
exception to this is mixing COMP-5 and COMP-X items, where there
is very little impact on performance.
* Use only numeric items that occupy one, two, four or eight bytes
of storage.
In 16-bit systems, the fastest and smallest code is produced for
operations on items that contain up to four digits, or two bytes for
binary items. The next fastest code is produced for operations on items
that contain between five and nine digits, or up to four bytes for binary
items.
In 32-bit systems, the fastest and smallest code is produced for
operations on items that contain up to nine digits, or four bytes for
binary items.
Operations on numeric items containing more than nine digits, or more
than four bytes for binary items, produce the slowest and largest code.
* Align items on even byte boundaries. Align numeric items greater
than two bytes on four-byte boundaries. Use the ALIGN directive
to ensure 01-level items are always aligned on such boundaries.
Use ALIGN"4", or ALIGN"8" for compatibility with 64-bit systems.
To ensure all items in a table are correctly aligned, check that the size
of an occurrence in the table (the stride of the table) is a multiple of
two or four bytes as required. Pad the table as necessary. For example:
    01 a occurs 10.
        03 b pic x(4) comp-5.
        03 c pic x.
        03 filler pic x(3).
A three-byte filler has been added to each occurrence in the table to
ensure that the numeric data item is always aligned on a four-byte
boundary.
The REF directive can be used to find out how data items are aligned.
* Do not redefine COMP-5 items to access individual bytes; if access
to individual bytes is required use COMP-X.
* Use edited items only when necessary and use only simple ones such
as ZZ9. If possible use them in a subroutine so the total number
of edited moves in your program is kept as small as possible.
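The data-type guidelines above can be sketched as follows. This is a
minimal illustration, not code from the guide: all data names are
hypothetical, and the COMP-3 field stands in for any item whose usage is
fixed by an existing record layout. The arithmetic is done in a four-byte
COMP-5 work item, and byte access goes through a COMP-X item rather than
a redefined COMP-5 item.

```cobol
       identification division.
       program-id. datatype.
       data division.
       working-storage section.
      * Hypothetical record field held as COMP-3 (packed decimal).
       01  rec-quantity      pic s9(7) comp-3 value 100.
      * Four-byte COMP-5 work item used for the arithmetic.
       01  ws-quantity       pic s9(9) comp-5.
      * COMP-X item redefined to reach its individual bytes.
       01  ws-word           pic x(2) comp-x.
       01  ws-bytes redefines ws-word.
           03  ws-high-byte  pic x comp-x.
           03  ws-low-byte   pic x comp-x.
       procedure division.
       mainline section.
      * Copy once, do the arithmetic in COMP-5, copy back once.
           move rec-quantity to ws-quantity
           add 25 to ws-quantity
           subtract 5 from ws-quantity
           move ws-quantity to rec-quantity
      * 258 is X"0102": high byte 1, low byte 2 in COBOL order.
           move 258 to ws-word
           display ws-high-byte " " ws-low-byte
           stop run.
```

Because COMP-X is stored in COBOL order on every platform, the byte
layout seen through the redefinition does not depend on the processor.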
Procedure Division Considerations
This section identifies items in the Procedure Division which affect the
size and performance of your program, and suggests the most efficient
ways of using them.
As a general rule, the simpler the operation, the faster it executes and
the smaller the compiled code. To get the best performance it is often
better to use a number of simple operations rather than one complex
operation. The following are general guidelines that result in the
fastest and smallest possible code.
Arithmetic Statements
To get the best performance from arithmetic statements always use the
simplest forms.
Operations.
* Use simple two-operand arithmetic statements wherever possible.
The following operations are optimized for COMP-5 and COMP-X data items
up to four bytes long.
    move a to b
    add a to b
    subtract a from b
    multiply a by b
    divide a into b
    if a condition b
where:
    a    is a numeric literal or data item up to four bytes long
    b    is a numeric data item up to four bytes long
On other data items, these simple operations result in faster code than
more complex instructions, but the benefits are not so great as with
COMP-5 or COMP-X items.
More complex forms of these instructions, involving more than two
operands, might not produce code as efficient as the simple form.
* Do not use the GIVING form of these verbs. If necessary, create
temporary data items and code several simple statements to achieve
the same result. For example, write:
    move a to c
    add b to c
rather than:
    add a to b giving c
Statements containing the GIVING phrase are optimized on 32-bit systems
provided the operands are all of the same type.
* Do not use the REMAINDER, ROUNDED, ON SIZE ERROR or CORRESPONDING
phrases.
* No optimization is done on arithmetic statements if the
ON SIZE ERROR phrase is used. For this reason, we recommend that you
do not use this phrase if high performance is required.
* The ROUNDED phrase impacts performance, but it is generally faster
to use ROUNDED than try to round the result using your own routine.
The only exception to this is when using the simple operations
described above on COMP-5 or COMP-X items.
* Do not mix items of different sizes in an arithmetic statement
(for example, try to use all two-byte items or all four-byte
items).
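As a sketch of the guidelines in this list, a multi-operand ADD with
GIVING can be broken into two-operand statements on items of one size
and usage. The data names and values below are hypothetical.

```cobol
       identification division.
       program-id. simplmath.
       data division.
       working-storage section.
      * All operands are four-byte COMP-5 so sizes are not mixed.
       01  ws-net            pic s9(9) comp-5 value 1200.
       01  ws-tax            pic s9(9) comp-5 value 240.
       01  ws-shipping       pic s9(9) comp-5 value 60.
       01  ws-total          pic s9(9) comp-5 value 0.
       procedure division.
       mainline section.
      * Instead of: add ws-net ws-tax ws-shipping giving ws-total
           move ws-net to ws-total
           add ws-tax to ws-total
           add ws-shipping to ws-total
           display ws-total
           stop run.
```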
COMPUTE.
* Do not use COMPUTE statements except for performing calculations
involving floating-point data. In this case, COMPUTE is the most
efficient statement.
COMPUTE is defined to evaluate its result in a temporary field of the
largest possible significance to hold the result. This temporary field
might be of a larger significance than is actually required, causing the
operations to be slower. The temporary field is stored in the target
field with any necessary truncation. This also applies to
IF statements involving expressions. For example:
    if a + b < c
It is often better to define a temporary data item of the required size
and use that to hold the result of the arithmetic, as follows:
    move a to temp
    add b to temp
    if temp < c
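The same approach applies to a COMPUTE statement on integer items. The
following sketch, using hypothetical names and values, replaces one
COMPUTE with three simple statements that work directly in the target
item, so no oversized temporary field is involved.

```cobol
       identification division.
       program-id. nocompute.
       data division.
       working-storage section.
       01  ws-price          pic s9(9) comp-5 value 250.
       01  ws-qty            pic s9(9) comp-5 value 4.
       01  ws-fee            pic s9(9) comp-5 value 15.
       01  ws-cost           pic s9(9) comp-5.
       procedure division.
       mainline section.
      * Instead of: compute ws-cost = ws-price * ws-qty + ws-fee
           move ws-price to ws-cost
           multiply ws-qty by ws-cost
           add ws-fee to ws-cost
           display ws-cost
           stop run.
```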
Decimal Alignment.
* Operations are fastest on integers. Operations on non-integer
numbers are most efficient if they need no decimal point alignment.
For example:
    * ADD and SUBTRACT operations are fastest if the source and
      target have the same decimal alignment.
    * MULTIPLY operations are fastest if the decimal alignment of
      the target is the sum of the alignments of the sources.
    * DIVIDE operations are fastest if the divisor has no
      fractional part and if the dividend and the target have the
      same decimal alignment.
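A minimal sketch of data definitions that satisfy these three rules is
shown below. The names and values are hypothetical. GIVING appears only
because the MULTIPLY rule is stated in terms of a separate target item.

```cobol
       identification division.
       program-id. decalign.
       data division.
       working-storage section.
      * Two decimal places plus one decimal place ...
       01  ws-rate           pic s9(3)v99  comp-5 value 12.50.
       01  ws-hours          pic s9(3)v9   comp-5 value 37.5.
      * ... gives a three-decimal target for MULTIPLY.
       01  ws-pay            pic s9(5)v999 comp-5.
      * Same alignment as ws-pay, so ADD needs no rescaling.
       01  ws-bonus          pic s9(5)v999 comp-5 value 20.000.
      * Integer divisor for DIVIDE.
       01  ws-weeks          pic s9(9)     comp-5 value 3.
       procedure division.
       mainline section.
           multiply ws-rate by ws-hours giving ws-pay
           add ws-bonus to ws-pay
           divide ws-weeks into ws-pay
           display ws-pay
           stop run.
```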
Exponentiation.
* Most exponentiation operations are relatively slow. In 16-bit
systems, exponents of 2 or 0.5 specified as a literal are
optimized. In all other cases no optimization is done for them.
MULTIPLY and DIVIDE operations should be used instead where
integer powers are involved.
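For an integer power, the replacement can be sketched as repeated
MULTIPLY statements. The names and the value below are hypothetical.

```cobol
       identification division.
       program-id. cubedemo.
       data division.
       working-storage section.
       01  ws-side           pic s9(9) comp-5 value 12.
       01  ws-volume         pic s9(9) comp-5.
       procedure division.
       mainline section.
      * Instead of: compute ws-volume = ws-side ** 3
           move ws-side to ws-volume
           multiply ws-side by ws-volume
           multiply ws-side by ws-volume
           display ws-volume
           stop run.
```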
Initialization.
* By default, COBOL initializes all data items in the
Working-Storage Section to spaces if no VALUE clause is specified.
This includes all numeric items. The effect of doing arithmetic
on such an item depends on the usage of the item:
Usage DISPLAY:
    If run-time checking for compatible data is enabled, a run-time
    error 163 "Illegal character in numeric field" occurs. The results
    are unpredictable for code run with run-time checking disabled,
    unless the SPZERO directive has been used. (Run-time checking is
    enabled using the +F run-time switch, and the CHECKNUM directive
    for generated code. See your COBOL System Reference for more
    information.)
Any other usage:
    Results are unpredictable.
To avoid these problems, all numeric items should be initialized to
numeric values before use.
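One way to follow this advice, sketched with hypothetical names, is to
give numeric items a VALUE clause where possible and to set any
remaining items explicitly before their first use.

```cobol
       identification division.
       program-id. initdemo.
       data division.
       working-storage section.
      * VALUE clauses give each numeric item a valid starting value.
       01  ws-counter        pic s9(9) comp-5 value 0.
       01  ws-line-count     pic 9(4) value 0.
      * No VALUE clause here, so set the item before first use.
       01  ws-total          pic s9(9) comp-5.
       procedure division.
       mainline section.
           move 0 to ws-total
           add 1 to ws-counter
           add ws-counter to ws-total
           add 1 to ws-line-count
           display ws-total " " ws-line-count
           stop run.
```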
Alphanumeric Data Manipulation.
* Reference modified fields are optimized if coded in one of the
following forms:
    item (literal:)
    item (literal:literal)
    item (literal:variable)
    item (variable:variable)
    item (variable + literal:literal)
    item (variable - literal:literal)
    item (variable + literal:variable)
    item (variable - literal:variable)
Other forms of reference modification are inefficient.
* If the offset or length of the reference modification is a data
item, use a COMP-5 item of the smallest optimum size (one, two or
four bytes) that accommodates the range of values involved.
Define it in the Working-Storage Section. For the 16-bit COBOL
system a two-byte COMP-5 item is the best; for 32-bit use a four
byte item.
* In a MOVE statement, the source item should be the same size as,
or larger than, the target. This prevents space-padding code
being generated.
* Do not use the INITIALIZE verb.
* Do not use the CORRESPONDING option of the MOVE verb.
* Do not use the STRING or UNSTRING verbs - they create a lot of
code. For manipulating file-names use the COBOL System Library
Routines CBL_SPLIT_FILENAME and CBL_JOIN_FILENAME. For other
purposes, create your own loops; they are almost always more
efficient.
* If you attempt a MOVE between two numeric-edited items the result
is undefined although no error status is returned.
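The following sketch shows a hand-written loop of the kind suggested
above as a replacement for UNSTRING. It is an illustration under stated
assumptions: the names, the sample text and the comma delimiter are
hypothetical. Only the (variable:variable) and (literal:variable) forms
of reference modification are used, with four-byte COMP-5 items for the
offset and length.

```cobol
       identification division.
       program-id. refmod.
       data division.
       working-storage section.
       01  ws-text           pic x(20) value "smith,john".
       01  ws-surname        pic x(20) value spaces.
      * Four-byte COMP-5 offset and length, as suggested for 32-bit.
       01  ws-pos            pic s9(9) comp-5.
       01  ws-len            pic s9(9) comp-5.
       procedure division.
       mainline section.
      * Scan for the comma by hand rather than using UNSTRING.
           move 1 to ws-pos
           move 1 to ws-len
           perform until ws-pos > 20
                      or ws-text (ws-pos:ws-len) = ","
               add 1 to ws-pos
           end-perform
      * Copy the text before the comma, if there is any.
           if ws-pos > 1
               move ws-pos to ws-len
               subtract 1 from ws-len
               move ws-text (1:ws-len) to ws-surname
           end-if
           display ws-surname
           stop run.
```

The range test is placed first in the combined condition so that the
reference-modified comparison is not evaluated once the offset has
passed the end of the field.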
Table Handling.
* A subscript should be a COMP-5 item of the smallest optimum size
(two or four bytes) that accommodates the range of values
involved. On 16-bit systems, two bytes is the optimum; on 32-bit
use a four byte item.
* Subscripts for items that have the same stride and are used in
consecutive statements are optimized so that they are only evaluated
once. For example:
    01 a pic xx occurs 10.
    01 b pic xx occurs 10.
    01 c pic xx occurs 10.
    01 d pic xx occurs 10.
    . . .
    move a(i) to b(i)
    if c(i) = d(i)
        display "pass"
    end-if
would result in the subscript i being evaluated only once in 16-bit COBOL
systems, and twice in 32-bit COBOL systems, although it is used four
times in two statements. The stride of each of these tables is the same:
two.
* When generating your program to native code for use in production,
use the NOBOUND directive (or cob switch -O). Use BOUND only when
debugging. BOUND causes code to be generated, every time a subscript
or index is used, to check that it is within the defined bounds of
the table.
If you are using USAGE DISPLAY subscripts, the BOUNDOPT directive
(switched on by NOBOUND) can give performance improvements. For example:
    . . .
    01 array pic x occurs 20.
    01 array-index pic 9(5) value 2.
    . . .
    move "a" to array(array-index).
    . . .
If the program is compiled without BOUNDOPT, all five digits of
array-index are used to evaluate the subscript. If BOUNDOPT is
specified, only the last two digits of array-index are used, as only two
digits are needed to access all elements of a 20 element table.
* Access to tables defined with OCCURS...DEPENDING is less efficient
than access to tables of fixed size, and so should be avoided where
high performance is needed.
* Bound checking on a variable length table checks only whether the
subscript or index points outside the maximum length of the table;
it does not take account of the table's current length (that is,
the value of the item specified in the DEPENDING phrase).
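The subscript guidelines in this list can be sketched as follows, using
hypothetical names. Both tables are fixed size with a stride of four
bytes, and the subscript is a four-byte COMP-5 item that counts down to
the literal zero.

```cobol
       identification division.
       program-id. tabledemo.
       data division.
       working-storage section.
      * Both tables have a stride of four bytes.
       01  ws-prices.
           03  ws-price      pic s9(9) comp-5 occurs 10.
       01  ws-doubled.
           03  ws-double     pic s9(9) comp-5 occurs 10.
      * Four-byte COMP-5 subscript, as suggested for 32-bit systems.
       01  ws-sub            pic s9(9) comp-5.
       procedure division.
       mainline section.
           move 10 to ws-sub
           perform fill-entry until ws-sub = 0
           display ws-double (10)
           stop run.
       fill-entry section.
           move ws-sub to ws-price (ws-sub)
           move ws-price (ws-sub) to ws-double (ws-sub)
           add ws-price (ws-sub) to ws-double (ws-sub)
           subtract 1 from ws-sub.
```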
Conditional Statements.
* In IF statements, conditions within combined conditions are
evaluated in the order that they occur. Therefore, you should put the
conditions that are most likely to produce a false result before
others. Similarly, you should put those conditions that can be
evaluated fastest before slower conditions.
* Comparisons using EQUALS (=) or NOT EQUAL are faster than
comparisons using GREATER (>) or LESS (<). In some systems,
comparisons against binary zero are more efficient than against other
literals.
* In both alphanumeric and numeric comparisons, have the source and
target items the same size.
* Do not use large EVALUATE statements. They are compiled into a
series of IF...ELSE IF...statements where the value of the
expression is derived separately for each WHEN clause.
In 16-bit systems, certain simple EVALUATE statements are compiled into a
GO TO...DEPENDING ON statement, which is very efficient.
* Order an EVALUATE statement so that the most commonly satisfied
condition is tried first. Do not use complex expressions or items
defined in the Linkage Section as conditions in an EVALUATE
statement; instead, include statements to calculate the value in
an item defined in the Working-Storage Section and use that item
in the EVALUATE statement.
* Use a GO TO...DEPENDING statement if the possible values form a
fairly dense range (for example, consecutive integers starting at
one). Although this construct has the disadvantage of not being
particularly suited to structured programming, it is efficient.
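A GO TO...DEPENDING dispatch might be sketched as below. The transaction
codes and paragraph names are hypothetical. All the target paragraphs
are kept in the same section as the GO TO statements.

```cobol
       identification division.
       program-id. dispatch.
       data division.
       working-storage section.
      * Hypothetical transaction code in the dense range 1 to 3.
       01  ws-code           pic s9(9) comp-5 value 2.
       procedure division.
       mainline section.
       select-action.
           go to do-add do-change do-delete depending on ws-code.
      * Control falls through to here when ws-code is not 1, 2 or 3.
           display "unknown code"
           go to all-done.
       do-add.
           display "add"
           go to all-done.
       do-change.
           display "change"
           go to all-done.
       do-delete.
           display "delete".
       all-done.
           stop run.
```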
Logical Operations.
A number of COBOL System Library Routines (call-by-name) are available to
perform bitwise logical operations on data items. These are described in
your COBOL System Reference. They perform operations such as bitwise
AND, OR and XOR.
The Generator recognizes calls to these routines and, if possible,
optimizes them to produce in-line code rather than calls to the run-time
system. In-line code is native code which performs the function directly
without making any calls. The alternative is a call to a generic
run-time routine which must allow for many cases.
On 16-bit systems, the calls are optimized if the length parameter is a
literal and no more than eight. On 32-bit systems, the calls are
optimized if the length is specified as a literal.
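As an illustration, the sketch below masks one byte with the CBL_AND
routine. It assumes the parameter order source, target, then length
passed by value, with the result left in the target; check your COBOL
System Reference for the exact interface on your system. The data names
are hypothetical. The length is a literal, as required for the call to
be a candidate for in-line code.

```cobol
       identification division.
       program-id. bitdemo.
       data division.
       working-storage section.
      * 15 is X"0F" and 60 is X"3C"; 60 AND 15 is 12 (X"0C").
       01  ws-mask           pic x comp-x value 15.
       01  ws-flags          pic x comp-x value 60.
       procedure division.
       mainline section.
      * Literal length, so the call is a candidate for in-line code.
           call "CBL_AND" using ws-mask ws-flags
                          by value 1
           display ws-flags
           stop run.
```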
The PERFORM Statement.
Using PERFORM is generally very efficient, and is a very good way of
keeping the size of your program down as well as giving it an
easy-to-maintain structure. The following rules enable you to use it in
the most efficient ways.
* Put commonly used pieces of code in sections and perform them.
Apart from being good coding practice, this saves space. It is often
beneficial even on single statements; for example, edited moves,
subprogram calls or file I/O.
* Use:
PERFORM section n TIMES
but not the equivalent in-line perform.
* When incrementing or decrementing a counter, terminate it with a
literal value rather than a value held in a data item. For
example, to execute a loop n times, set the counter to n and then
decrement the counter until it becomes zero, rather than
incrementing the counter from zero to n.
* Perform sections, not paragraphs. Put EXIT PROGRAM and STOP RUN
statements at the end of the first (main) section.
The range of an out-of-line PERFORM statement should not contain the end
of another perform range. If it does, the program is said to trickle;
that is, execution is allowed to go past the end of a perform range.
Applying the rule above ensures that this does not occur.
* On 16-bit systems use the NOTRICKLE directive when compiling to
native code.
If your program does not trickle you can compile it with the directive
NOTRICKLE which causes the Compiler to produce more efficient code,
PERFORM being implemented as a machine code CALL instruction with a
corresponding RET at the end of the perform range. If NOTRICKLE is used
where two perform ranges overlap, results are unpredictable. Refer to
your COBOL System Reference for details of the NOTRICKLE directive.
For example, do not use code in the form:
    procedure division.
        perform a thru c
        perform b thru d
        stop run.
    a.
        display "a".
    b.
        display "b".
    c.
        display "c".
    d.
        display "d".
    e.
Use the TRICKLECHECK directive when you compile your program to determine
if your program trickles. When you execute a program compiled with this
directive, a run-time error is produced whenever it attempts to trickle.
On 32-bit systems, all optimization is done automatically.
* Only use GO TO to paragraphs within the same section.
* Do not use PERFORM...THRU statements.
For example, the following produces very inefficient code for the first
PERFORM because the end of the range of the second PERFORM lies within
the range of the first PERFORM.
        perform a thru e
        perform b thru d
        stop run
    a.
        . . .
    b.
        . . .
    c.
        . . .
    d.
        . . .
    e.
        . . .
* Do not use ALTER statements.
The presence of an ALTER statement in a program prevents optimization of
PERFORM statements.
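A program laid out according to the PERFORM rules above might look like
the following sketch. The names are hypothetical. The loop body is a
section, there is no THRU or ALTER, the counter runs down to the literal
zero, and STOP RUN ends the first section so that execution cannot
trickle into the performed section.

```cobol
       identification division.
       program-id. perfdemo.
       data division.
       working-storage section.
       01  ws-remaining      pic s9(9) comp-5.
       01  ws-total          pic s9(9) comp-5 value 0.
       procedure division.
       mainline section.
      * Count down to the literal zero instead of up to a data item.
           move 5 to ws-remaining
           perform accumulate until ws-remaining = 0
           display ws-total
           stop run.
       accumulate section.
           add ws-remaining to ws-total
           subtract 1 from ws-remaining.
```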
CALL Statements.
* If you are not using nested programs, ensure that the NONESTCALL
Compiler directive is specified.
* The following format of CALL generally produces faster code for
dynamically and statically linked programs:
call "literal" using ...
* When calling other programs, the CALL statement executes faster if
the calling and called programs are statically linked to form one
executable file. For further information, see the chapter Developing
COBOL Applications.
* Some operating systems share the code portion of statically linked
applications, resulting in greater system efficiency between
multiple processes.
* Try to limit the number of CALL statements a program makes, if
necessary by avoiding splitting it into too many subprograms.
* CALL statements that do not alter the RETURN-CODE special register,
or whose effect on RETURN-CODE is of no interest, should use a
calling convention of 4. The Compiler directive DEFAULTCALLS can be
used to set this globally.
* Ensure that parameters appear in the same order in the CALL
statement as they do in the Procedure Division header.
* Ensure that the order of parameters is the same as the order of
their description in the called program's Linkage Section. Ensure
that any Linkage Section items which are not referenced in the
Procedure Division header appear after those that are.
* Ensure that all parameters are 01- or 77-level items.
* In 16-bit systems, use the Generator directive NOPARAMCOUNTCHECK
if your program is always called with the correct number of
parameters, or if it does not reference unsupplied parameters.
Most programs fall into this category.
Parameters.
Avoid making many references to linkage items. These include items
defined in the Linkage Section, items whose address is set to memory
allocated by CBL_ALLOC_MEM, and items defined as EXTERNAL.
Accessing linkage items is always slower than accessing Working-Storage
Section items. If a Linkage Section item is used frequently, it is faster
to move it into a Working-Storage Section item when the program is
entered and move it back to the Linkage Section if necessary before
exiting to the calling program. The Working-Storage Section item should
then be accessed throughout the program rather than the item in the
Linkage Section.
If a linkage parameter is optional you can detect its presence using a
statement of the form:
if address of linkage-item not = null
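The copy-in, copy-out technique described above can be sketched as a
small subprogram. The program name, data names and the bonus
calculation are hypothetical. The parameters are 01-level items
declared in the Linkage Section in the same order as in the Procedure
Division header.

```cobol
       identification division.
       program-id. addbonus.
       data division.
       working-storage section.
       01  ws-balance        pic s9(9) comp-5.
       01  ws-remaining      pic s9(9) comp-5.
       linkage section.
      * 01-level parameters, in the order used in the header.
       01  lk-balance        pic s9(9) comp-5.
       01  lk-months         pic s9(9) comp-5.
       procedure division using lk-balance lk-months.
       mainline section.
      * Copy the parameters in once on entry.
           move lk-balance to ws-balance
           move lk-months to ws-remaining
           perform add-bonus until ws-remaining not > 0
      * Copy the result back once before returning.
           move ws-balance to lk-balance
           exit program.
       add-bonus section.
           add 10 to ws-balance
           subtract 1 from ws-remaining.
```

A caller would pass two matching four-byte COMP-5 items, in the same
order, using a literal program name in the CALL statement.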
Sorting.
DOS, Windows and OS/2: On 16-bit systems, the run-time system defaults to
a fast sort algorithm. The size of files that can be sorted under this
method is limited, but is enough for most files. The alternative method,
which can handle the sorting of very large files but is slower, can be
enabled by the run-time switch B2. (See your COBOL System Reference for
details on this switch.)
If you use input and output procedures with a sort in your program it is
important that you write them efficiently as they are executed once for
each record you are sorting. Inefficient input and output procedures can
make the sort process appear to be very slow.
Compiler Directives
A number of Compiler directives can be used to make the native code for a
program better optimized. Some of these directives must be used with
care; ensure that the behavior you get with them is acceptable.
In general, always use the following directives when compiling your
programs to native code:
UNIX:
NOALTER
ALIGN"4" or ALIGN"8"
COMP
NOANIM
NOBOUND
NOCHECKDIV
NONESTCALL
NOODOSLIDE
NOQUAL
NOSEG
NOTRUNC
Other suggestions (to help prevent inefficient coding):
REMOVE "UNSTRING"
REMOVE "STRING"
REMOVE "GIVING"
REMOVE "ROUNDED"
REMOVE "COMPUTE"
REMOVE "ERROR"
REMOVE "ALTER"
REMOVE "INITIALIZE"
REMOVE "CORRESPONDING"
REMOVE "TALLYING"
REMOVE "THRU"
REMOVE "THROUGH"
Using Directives to Optimize for Speed.
There are many directives you can use to optimize for speed. In some
cases, the defaults for Compiler directives are the ones that provide
speed optimization. This section points out directives that need to be
changed from their default values to provide for speed optimization.
ALTER Directive.
For efficiency reasons, you should not use ALTER statements in programs.
It is recommended that you avoid them altogether, and compile with
NOALTER, to prevent the Compiler from having to produce code to look for
them.
BOUND Directive.
The BOUND directive does boundary checking on table items.
Your applications can be made faster (and smaller) by compiling with
NOBOUND. Otherwise, the Compiler inserts extra code to do boundary
checking on all references to table items.
During testing, you should use the BOUND option until you are satisfied
that your program is not referencing data beyond your table limits. For
production, NOBOUND gives you the desired efficiency.
BOUNDOPT Directive.
The BOUNDOPT directive can be used to optimize your code if:
* you are using USAGE DISPLAY or COMP-3 subscripts
* you are using NOBOUND (see the BOUND directive above)
When BOUNDOPT is specified, digits in a USAGE DISPLAY subscript above the
size of the table are ignored. For example, a PIC 9(3) subscript would
be treated as PIC 9(2) for a table with fewer than 100 entries.
We recommend that you do not use USAGE DISPLAY subscripts.
COMP Directive.
The COMP directive suppresses the generation of code that checks for
numeric overflow. This produces highly compact and efficient code.
COMP changes the behavior of arithmetic on data items defined as USAGE
COMP. It produces more efficient code, but the behavior does not conform
to the ANSI standard.
If used with the proper care, COMP can improve the speed of your
programs.
IBMCOMP Directive.
The IBMCOMP directive causes data items with USAGE COMP to be compiled in
IBM synchronized format. It also affects the storage of words and bytes
(see your Language Reference for details).
NESTCALL Directive.
The NESTCALL directive enables nested programs to appear in your program.
If you know you have no nested calls in your program, specifying
NONESTCALL enables the Compiler to generate slightly more efficient code.
TRUNC Directive.
The TRUNC directive causes the Compiler to create code to determine
whether data items need to be truncated or not.
If you are certain that you do not need truncation of data items, then
NOTRUNC causes the creation of more efficient code.
Using Directives to Optimize for Size.
There are many directives you can use to optimize for size. In some
cases, the defaults for Compiler directives are the choices that provide
size optimization. This section points out directives that need to be
changed from their default values to provide for size optimization.
ALIGN Directive.
This directive specifies the boundary on which 01- and 77-level items
start. This boundary should always be a power of two, such as two, four
or eight. For 32-bit systems use a minimum of ALIGN"4". For 16-bit
systems use a minimum of ALIGN"2". Use ALIGN"8" if you are using a
64-bit operating system. Higher powers of two retain efficiency, but
increase the amount of unused space between data records.
ANIM Directive.
The ANIM directive is used frequently during testing. However, when you
are compiling for production, use NOANIM. This reduces data size
slightly.
Using Dynamically Linked Executable Files.
UNIX: This section is specific to the 32-bit environment on UNIX.
If your system supports dynamically linked executable files, you should
use these in preference to statically linked executable files. The
advantages of using dynamic linking over static linking are:
* drastically reduced size of executable file
* faster linking (that is, faster creation of executable files)
* greater shareability of executable files (that is, more executable
files can be run at the same time)
Dynamically linked executable files can be created by setting the -B cob
flag to dynamic, which is the default if the flag is not explicitly set.
However, if your system does not support dynamic linking, static linking
is the default and use of the -B cob flag is unsupported. For full
details on -B, see your COBOL System Reference.