Writing a Preprocessor [ COBOL/HP-UX Operating Guide for the Series 700 and 800 ] MPE/iX 5.0 Documentation
COBOL/HP-UX Operating Guide for the Series 700 and 800
Writing a Preprocessor
This section explains how to write an integrated preprocessor
and describes the interface used to pass information between a
preprocessor and the Compiler.
Although a preprocessor could be written in a language other than COBOL,
the following description assumes that it is written in COBOL.
The integrated preprocessor interface works on the simple concept that
preprocessing is a form of editing. The preprocessor marks each line of
the source code as old (an original line that is being replaced or
commented out), new (a line inserted by the preprocessor) or unchanged.
When compiling a COBOL program, the Compiler calls the preprocessor
instead of directly reading the source file and receives the code line by
line from the preprocessor.
The operation of Animator depends upon a mapping of each line of object
code on to each line of source code. The marking of source lines
described above allows this mapping to be valid even though the object
code does not match the source code.
Definition of the Interface
This section describes the interface
between the Compiler and a preprocessor.
Preprocessor Parameters.
Three parameters are passed across the interface: mode-flag, buffer and
response. mode-flag is used to pass control information, buffer is used
for text information (source lines and file-names) and response is used
to indicate the type of source line in the buffer.
They are defined as follows:
     01  mode-flag               pic 9(2) comp-x.
     01  buffer                  pic x(80).
     01  response.
         03  response-status     pic 9(2) comp-x.
         03  response-code-1     pic 9(4) comp-x.
         03  filler redefines response-code-1.
             05  filler          pic x.
             05  resp-main       pic 9(2) comp-x.
         03  response-code-2     pic 9(4) comp-x.
         03  filler redefines response-code-2.
             05  filler          pic x.
             05  resp-more       pic 9(2) comp-x.
Note: Details on the data type COMP-X are contained in your Language
Reference.
The Initial Call.
The initial call is made to the preprocessor at the point where the
Compiler would normally open the source file. The mode-flag parameter is
set to 0 and the name of the source file (including path information) is
placed in buffer. The preprocessor should open the file and return zero
in response-status to indicate success; any other value indicates a
failure.
The remainder of the response parameter is not used.
The operating system command line contains any directives to the
preprocessor, terminated by the character x"7F". These directives are
specified in the Compiler command line, directives files or $SET
statements and follow the PREPROCESS directive itself. See the section
"Invoking a Preprocessor" earlier in this chapter for further
information. For details on how to read the operating system command
line, please see COMMAND-LINE in your Language Reference. The directives
are used to pass information from the user to the integrated preprocessor
and are defined by the designer of the integrated preprocessor.
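The handling of the initial call described above can be sketched as
follows. This is a minimal illustration, not a definitive implementation.
The program name initdemo, the working-storage names and the 128-byte
command-line area are assumptions made for the example, and the sketch
assumes a dialect that supports ASSIGN TO DYNAMIC, line sequential files
and ACCEPT ... FROM COMMAND-LINE.

```cobol
       identification division.
       program-id. initdemo.
       environment division.
       input-output section.
       file-control.
           select source-file assign to dynamic source-name
               organization is line sequential
               file status is source-stat.
       data division.
       file section.
       fd  source-file.
       01  source-rec              pic x(80).
       working-storage section.
      *    All names below are hypothetical, chosen for this sketch.
       01  source-name             pic x(80).
       01  source-stat             pic xx.
       01  command-text            pic x(128).
       01  directive-text          pic x(128).
       linkage section.
       01  mode-flag               pic 9(2) comp-x.
       01  buffer                  pic x(80).
       01  response.
           03  response-status     pic 9(2) comp-x.
           03  response-code-1     pic 9(4) comp-x.
           03  filler redefines response-code-1.
               05  filler          pic x.
               05  resp-main       pic 9(2) comp-x.
           03  response-code-2     pic 9(4) comp-x.
           03  filler redefines response-code-2.
               05  filler          pic x.
               05  resp-more       pic 9(2) comp-x.
       procedure division using mode-flag buffer response.
       main-para.
           if mode-flag = 0
               perform initial-call
           end-if
           exit program.
       initial-call.
      *    buffer holds the source file-name, including any path.
           move buffer to source-name
      *    Directives for this preprocessor end at the x"7F" byte.
           accept command-text from command-line
           move spaces to directive-text
           unstring command-text delimited by x"7F"
               into directive-text
      *    Zero reports success; any other value reports failure.
           open input source-file
           if source-stat = "00"
               move 0 to response-status
           else
               move 1 to response-status
           end-if.
```

The sketch only stores the directive text. How that text is interpreted
is for the designer of the preprocessor to decide.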
Subsequent Calls.
Subsequent calls request a line of source code until the preprocessor
indicates that the last line has been reached.
In these calls, the Compiler sets the mode-flag parameter to 1.
The preprocessor returns a line of source code in buffer, and
information about it in resp-main and resp-more. If there is an error,
response-status should be set to a nonzero value (the remaining fields
may be left undefined).
The first byte of each of response-code-1 and response-code-2 is reserved
for future use and must always be set to zero on return. The simplest way to
achieve this is to set response-code-1 and response-code-2 to zero before
setting resp-main and resp-more.
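The zeroing order can be illustrated with a small stand-alone program.
This is a sketch only: the program name respdemo is hypothetical, and it
assumes that COMP-X items are stored with the most significant byte
first, as the layout of the response parameter implies.

```cobol
       identification division.
       program-id. respdemo.
       data division.
       working-storage section.
       01  response.
           03  response-status     pic 9(2) comp-x.
           03  response-code-1     pic 9(4) comp-x.
           03  filler redefines response-code-1.
               05  filler          pic x.
               05  resp-main       pic 9(2) comp-x.
           03  response-code-2     pic 9(4) comp-x.
           03  filler redefines response-code-2.
               05  filler          pic x.
               05  resp-more       pic 9(2) comp-x.
       procedure division.
       main-para.
           move 0 to response-status
      *    Clearing the two-byte items also clears the reserved
      *    first byte of each.
           move 0 to response-code-1
           move 0 to response-code-2
      *    Only the second byte of each item is then set.
           move 32 to resp-main
           move 0 to resp-more
      *    resp-main shares the low-order byte of response-code-1,
      *    so both items now hold the same numeric value.
           display "response-code-1 = " response-code-1
           display "resp-main       = " resp-main
           stop run.
```

Setting resp-main and resp-more first and the two-byte items afterwards
would overwrite the values just stored, which is why the order matters.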
When the preprocessor modifies source code, it must always pass back the
original source code lines before their replacement line(s).
Source code lines that are continued across more than one line are
treated as a single line, so the entire source line - not just a portion
of it - must be passed for modification. If you wish to modify such a
source line, you must first mark the entire line to be changed, as
described in the following section, before sending the replacement lines.
Marking Source Lines.
Source lines are marked as unchanged, old (to be treated as commented
out) or new. Special information is also required for source lines
containing COPY (or equivalent verb) statements. The information to mark
these lines is placed in resp-main as follows.
   0     The source file has been completely processed; this is the end
         of input. buffer is undefined.
   1     buffer contains a new line added by the preprocessor which was
         not in the original source code.
   2     buffer contains a line in the original source code which is
         either being commented out or will be replaced by the
         preprocessor.
   3     buffer contains a line in the original source code which
         contains the start of a COPY statement that is about to be
         expanded by the preprocessor.
   4     buffer contains a line in the original source code which
         contains the continuation of a COPY statement.
   5     buffer contains a warning message inserted by the preprocessor.
         This must have the format of a comment line (that is, the value
         "*" in the seventh byte).
   6     An unrecoverable error has occurred; this forces the Compiler to
         abort and, if in Toolbox, to enter the COBOL Editor. A message
         of up to 70 characters may be written to the command line and
         this will be displayed on the bottom line of the editor.
   7     An error has occurred; this forces the Compiler to increment its
         error count. All error classes may be specified by using
         resp-more (see the section "Generating Error Messages" later
         in this chapter). The contents of buffer are ignored.
   11    buffer contains a new line added by the preprocessor which
         contains the start of a COPY statement that is about to be
         expanded by the preprocessor. It is used when the COPY statement
         is not unique on a line. See below for an example of when this
         might be used.
   12    buffer contains a new line added by the preprocessor which
         contains the continuation of a COPY statement.
   32    buffer contains a line from the original source code which has
         not been modified by the preprocessor.
   128   The end of a COPY-file has been reached. buffer must be empty.
resp-more is used to specify additional information when resp-main has a
value of 1, 3, 6, 7 or 11.
When resp-main contains the value 1, resp-more is used to indicate the
position in the original source of the replaced non-COBOL verb as
follows:

   0     No verb replacement is taking place.

   nn    The number of the column containing the first character of the
         non-COBOL verb being replaced by the current line. The line(s)
         containing the non-COBOL verb would have previously been marked
         by returning the value of 2 in resp-main; if there were more
         than one line, the verb is assumed to be on the first of them.
         For example, if the original source contains:

             exec abc
                 do something useful
             end-exec

         and these three lines are replaced by:

             call abc_something_useful

         then the value of nn gives the position of the EXEC statement.

When resp-main contains the value 3 or 11, resp-more is used to indicate
where the word "COPY" (or equivalent verb) begins, and contains nn, the
column number containing the first letter of the verb.

When resp-main contains the value 6 or 7, resp-more is used to indicate
how the error should be handled. See the section "Generating Error
Messages" later in this chapter for more information.
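The marking sequence for the EXEC example can be sketched as a
preprocessor that returns one line per call. This is an illustration
only: the program name execdemo and the counter call-count are
hypothetical, the source lines are held as literals instead of being read
from a file, and the EXEC verb is assumed to start in column 12.

```cobol
       identification division.
       program-id. execdemo.
       data division.
       working-storage section.
      *    Hypothetical counter recording which line to return next.
       01  call-count              pic 9(2) comp-x value 0.
       linkage section.
       01  mode-flag               pic 9(2) comp-x.
       01  buffer                  pic x(80).
       01  response.
           03  response-status     pic 9(2) comp-x.
           03  response-code-1     pic 9(4) comp-x.
           03  filler redefines response-code-1.
               05  filler          pic x.
               05  resp-main       pic 9(2) comp-x.
           03  response-code-2     pic 9(4) comp-x.
           03  filler redefines response-code-2.
               05  filler          pic x.
               05  resp-more       pic 9(2) comp-x.
       procedure division using mode-flag buffer response.
       main-para.
           move 0 to response-status
           move 0 to response-code-1
           move 0 to response-code-2
           if mode-flag = 0
               move 0 to call-count
           else
               add 1 to call-count
               perform return-next-line
           end-if
           exit program.
       return-next-line.
           evaluate call-count
      *        The three original lines are passed back first,
      *        each marked as old (2).
               when 1
                   move "           exec abc" to buffer
                   move 2 to resp-main
               when 2
                   move "               do something useful"
                       to buffer
                   move 2 to resp-main
               when 3
                   move "           end-exec" to buffer
                   move 2 to resp-main
      *        The replacement is marked as new (1); resp-more
      *        gives the column where EXEC started.
               when 4
                   move "           call abc_something_useful"
                       to buffer
                   move 1 to resp-main
                   move 12 to resp-more
      *        Nothing is left, so end of input (0) is reported.
               when other
                   move spaces to buffer
                   move 0 to resp-main
           end-evaluate.
```

A real preprocessor would obtain the original lines from the source file
and would return all other lines with resp-main set to 32.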
Source modification.
A number of COBOL commands exist to amend a source file. The following
statements are not supported by the Compiler when a preprocessor is
active and must be handled by the preprocessor:
* REPLACE (ANSI 85 verb)
* BASIS mechanism
COPY-files
COPY-files can be expanded by the preprocessor or the Compiler. The
preprocessor needs to expand COPY-files only if they contain code that
needs preprocessing. In all other cases, the Compiler should be used to
expand COPY-files.
The Compiler will expand COPY statements if they are passed back as
either unchanged or amended lines by the preprocessor. The following are
supported:
* Simple COPY statement
* COPY ... REPLACING statements
* COPY copy-file-name OF/IN library-name
* COPY statements spread over several lines
* Nested COPY statements
* ++INCLUDE
* -INC
resp-main and resp-more must be used as described above if the
preprocessor is handling COPY expansion itself.
A value of 11 would be required in resp-main in the following type of
situation where the COPY statement is not unique on a line:
The source contains:
     01 ITEM-A. COPY "COPY-FIL".
This is first returned with resp-main set to 2 to indicate that this is a
line that is about to be replaced. On the next call, the preprocessor
returns:
     01 ITEM-A.
with resp-main set to 1 to indicate that this is a replacement line. On
the next call, the preprocessor returns:
     COPY "COPY-FIL".
This time, resp-main is set to 11 to indicate this is a replacement line
containing the COPY statement alone. resp-more is set to 20, the
position of the word COPY on the original source line.
Generating Error Messages
If the preprocessor encounters an error
when processing the source code, it can communicate this to the Compiler
so that the error is treated as a syntax error. There are two ways to do
this.
* Set resp-main to the value 5 and place a comment line in buffer;
the comment will be inserted in the list file.
* Set resp-main to the value 6; the Compiler will terminate.
The value in resp-more specifies the column number the error was
found in. It is used when positioning the cursor on return to the
Editor.
It is also possible to force the Compiler to increment its internal error
counts in conjunction with one of the two operations described above.
This is done by setting resp-main to the value 7 and specifying which
error count is to be increased in resp-more. Possible values for
resp-more are:
1 Unrecoverable error
2 Severe error
3 Error
4 Warning
5 Informational
6 Flag count
Increasing the Unrecoverable error count will cause the Compiler to abort
immediately. The contents of buffer are ignored.
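The combination of a list-file comment and an error count increment can
be sketched as follows. This is a minimal illustration: the program name
errdemo, the counter step-count and the message text are hypothetical,
and the sketch reports a single error and then signals end of input
instead of processing a real source file.

```cobol
       identification division.
       program-id. errdemo.
       data division.
       working-storage section.
      *    Hypothetical counter tracking the reporting steps.
       01  step-count              pic 9(2) comp-x value 0.
       linkage section.
       01  mode-flag               pic 9(2) comp-x.
       01  buffer                  pic x(80).
       01  response.
           03  response-status     pic 9(2) comp-x.
           03  response-code-1     pic 9(4) comp-x.
           03  filler redefines response-code-1.
               05  filler          pic x.
               05  resp-main       pic 9(2) comp-x.
           03  response-code-2     pic 9(4) comp-x.
           03  filler redefines response-code-2.
               05  filler          pic x.
               05  resp-more       pic 9(2) comp-x.
       procedure division using mode-flag buffer response.
       main-para.
           move 0 to response-status
           move 0 to response-code-1
           move 0 to response-code-2
           if mode-flag = 0
               move 0 to step-count
           else
               add 1 to step-count
               perform report-error
           end-if
           exit program.
       report-error.
           evaluate step-count
      *        Step 1: tell the user, then return a comment line
      *        ("*" in column 7) for the list file.
               when 1
                   display "MYPREP: statement not recognised"
                   move "      * MYPREP: statement not recognised"
                       to buffer
                   move 5 to resp-main
      *        Step 2: increment the Severe error count (2).
      *        The contents of buffer are ignored.
               when 2
                   move spaces to buffer
                   move 7 to resp-main
                   move 2 to resp-more
      *        Afterwards: report end of input.
               when other
                   move spaces to buffer
                   move 0 to resp-main
           end-evaluate.
```

The DISPLAY statement is what informs the user of the error. The comment
line only annotates the list file.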
It is the responsibility of the preprocessor to output error messages to
the user before forcing the Compiler to either abort or increment the
error counts. This should not be confused with the message that can be
inserted in the list file which is for informational purposes only. The
sort of message which might be displayed by the preprocessor explains to
the user what is happening or requests a choice, for example, the message
"Continue Compiling Y/N ?".
Multiple Preprocessors
Several preprocessors may be active simultaneously on the same source
program. They are arranged in a stack: the Compiler calls the top
preprocessor in the stack, which calls the next preprocessor, and so on
down to the preprocessor at the bottom of the stack, which actually
reads the source code. Each line of source is then passed up through
every preprocessor in turn until it reaches the top of the stack and is
passed to the Compiler. For this to work, each preprocessor must obey
some additional rules.
Writing a Stackable Preprocessor
A stackable preprocessor is identical to one which is not stackable
except that it should contain extra code to invoke another preprocessor
if this is necessary.
As described above, the Compiler writes preprocessor directives to the
command line when it makes the initial call to a preprocessor. The
preprocessor reads this command line and, if it finds a PREPROCESS
directive, should invoke the preprocessor named in the PREPROCESS
directive. It should also pass any parameters following the directive to
the invoked preprocessor by writing these in turn to the command line.
The interface between two preprocessors is identical to that between the
Compiler and preprocessor specified above, such that it is possible to
use a stackable preprocessor in both stacked and unstacked situations.
It should also be noted that a preprocessor which has not been designed
to be stackable may be stacked at the end of the stack where it is
directly reading the source code.
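The delegation performed by a stackable preprocessor can be sketched as
follows. This illustration rests on several assumptions: the directive is
assumed to appear on the command line in upper case in the form
PREPROCESS(name), which should be checked against the section "Invoking a
Preprocessor"; the program name stackpre, the data names and the field
sizes are hypothetical; and the sketch covers only the forwarding path,
not the reading of the source file at the bottom of the stack.

```cobol
       identification division.
       program-id. stackpre.
       data division.
       working-storage section.
      *    All names and sizes below are hypothetical.
       01  command-text            pic x(128).
       01  own-directives          pic x(128).
       01  next-name               pic x(32) value spaces.
       01  next-directives         pic x(128).
       01  scan-pos                pic 9(4) comp-x.
       linkage section.
       01  mode-flag               pic 9(2) comp-x.
       01  buffer                  pic x(80).
       01  response.
           03  response-status     pic 9(2) comp-x.
           03  response-code-1     pic 9(4) comp-x.
           03  filler redefines response-code-1.
               05  filler          pic x.
               05  resp-main       pic 9(2) comp-x.
           03  response-code-2     pic 9(4) comp-x.
           03  filler redefines response-code-2.
               05  filler          pic x.
               05  resp-more       pic 9(2) comp-x.
       procedure division using mode-flag buffer response.
       main-para.
           if mode-flag = 0
               perform find-next-preprocessor
           end-if
           if next-name = spaces
      *        Bottom of the stack: the source file would be opened
      *        and read here. That is not shown, so this sketch
      *        reports failure.
               move 1 to response-status
           else
      *        Forward the request down the stack. The line returned
      *        in buffer could be edited here before it goes upward.
               call next-name using mode-flag buffer response
           end-if
           exit program.
       find-next-preprocessor.
           accept command-text from command-line
           move spaces to own-directives
           move spaces to next-name
           move spaces to next-directives
           move 1 to scan-pos
      *    Text before the directive belongs to this preprocessor.
           unstring command-text delimited by "PREPROCESS("
               into own-directives
               with pointer scan-pos
           if scan-pos <= 128
               unstring command-text delimited by ")"
                   into next-name
                   with pointer scan-pos
               if scan-pos <= 128
                   move command-text(scan-pos:) to next-directives
               end-if
      *        Pass the remaining directives to the next
      *        preprocessor by writing them to the command line.
               display next-directives upon command-line
           end-if.
```

Because the same three parameters are passed on unchanged, the next
preprocessor cannot tell whether it was called by the Compiler or by
another preprocessor.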
In most cases, preprocessors will be acting on discrete sets of syntax
and will all be producing valid COBOL syntax, so it is unlikely that more
than one preprocessor will want to process a particular source line.
However, it is possible that preprocessors in a stack represent
language levels so that a source line is edited several times by the
different preprocessors on its route through the stack to the Compiler.
In this case, care must be taken to read the information in the response
field so that the relationship between the source code and the code that
finally reaches the Compiler is maintained. The order in which the
stacking takes place should also be chosen carefully if code altered by
one preprocessor is also to be successfully modified by a second.