Operation [ Micro Focus COBOL System Reference, Volume 1 ] MPE/iX 5.0 Documentation
Operation
This section contains the following topics:
* Data File Organization - This COBOL system handles files of four
different organizations and three different file assignments.
* File-naming Conventions - File-naming for this COBOL system is
based on standard operating system file-naming.
* Data File Assignment - Gives the three methods of file assignment.
* Multiple Reel Files - These are supported for sequential files,
which means that a file can be held on more than one volume of
the same type of media.
* File Buffering - A method of speeding up your program by writing
records to a buffer in memory.
* File Usage - A section which covers file sizes and points to note
when creating files.
* File Compression - Data and key compression are supported, as are
duplicate keys.
* Btrieve Call Conversion Modules - Two call conversion modules are
provided for DOS, Windows and OS/2 systems to enable systems using
Btrieve to use the calls used by Extfh, the External File Handler.
One call conversion module is provided for UNIX systems.
These features are explained in the following sections.
Data File Organizations
COBOL programs can create, update and read files of four different
organizations:
* record sequential
* line sequential
* indexed sequential
* relative
Record Sequential.
Sequential files are the simplest form of COBOL file. Records are placed
in the file in the order they are written, and can only be read back in
the same order.
Line Sequential.
Line Sequential files are a special type of sequential file. They
correspond directly to text files as produced by standard editors.
Indexed Sequential.
Indexed files are the most complex form of COBOL file handled directly by
the syntax of this COBOL system. Records in an indexed file are
identified by a unique user-defined key when written. Each record can
contain any number of user-defined keys which can be used to read the
record, either directly or in key sequence.
Relative.
Every record in a relative file can be accessed directly without having
to read through any other records. Each record is identified by a unique
ordinal number both when it is written and when it is read back.
File-naming Conventions on DOS, Windows and OS/2
This COBOL system, and applications created with this system under the
DOS, Windows and OS/2 operating systems, all use the standard operating
system file-name convention:
[device:][[path-name\]file-name[.[ext]]]
where:
device is a system device. It can be a disk drive, the display, a
printer, or an RS232 port. If you include file-name, device must
be a disk drive. This COBOL system recognizes the following
symbolic device names:
A, B, etc.           Disk drive
CON                  Console keyboard or screen
AUX, COM1, PUN       Alternative names for first asynchronous
                     communications adapter port
COM2                 Second asynchronous communications port
LPT1, LPT, LST, PRN  Alternative names for first parallel printer
LPT2, LPT3, etc.     Second or third parallel printer
ERR                  Standard Error Output
NUL                  Dummy device - do not create
                     (not available with indexed files).
Note: The availability of these devices can vary according to
your environment.
path-name is the operating system path containing or to contain the
file.
If you do not specify a device and/or path-name, the
current drive and/or path is assumed.
file-name is the name of the disk file. file-name can be a maximum
of eight characters in length.
.ext is a period (.) followed by up to three characters giving
the file-name extension. You can specify an extension of
spaces by putting just the period.
NOTE For portability to UNIX systems, the "\" in this format can also
be written as "/". Use "/" with relative paths, and omit the
device.
On OS/2 1.2 or later with the High Performance File System (HPFS), this
COBOL system and applications created with it can use the HPFS naming
convention, with the exception of embedded spaces. The full definition
of this is in your OS/2 documentation. In brief, the name is as above,
except that there is no separate .ext component. Instead, each file or
directory name can consist of up to 254 characters and contain any
number of periods (.). This COBOL system regards any text following the
final period as an extension (or a trailing period as a space extension)
and removes it when creating its own names. A complete file
specification can be up to 255 characters long. (OS/2)
This COBOL system accepts any COBOL file for input. The system assigns
the default extension .CBL to your source file. If you have named your
file with an extension, your assigned extension overrides .CBL.
The compiler creates a number of files with different extensions added to
the name you supply. The extensions are: .OBJ, .LNK, .IDY, .LST and
.Inn, where nn is a two-digit number. The linking process creates .EXE
or .DLL files.
Query data list files created by Animator are given the extension .ils.
For further details of file-names and extensions, refer to your operating
system manuals. However, certain file-names (LPT, CON, and PRN), which
refer to devices rather than to disk files, should not be used within the
COBOL system to access disk files. DOS, Windows and OS/2 support
parallel printers and serial printers. Instructions on configuring DOS
and OS/2 to redirect parallel printer output to the Asynchronous
Communications Adapter are given under the MODE command in the
operating system manuals.
Windows does not have a command-line. It is not, therefore, possible to
configure Windows to redirect parallel printer output to the Asynchronous
Communications Adapter.
File-naming Conventions on UNIX
This COBOL system, and applications created with this system under
UNIX and similar operating systems, all use the standard operating system
file-naming convention:
[path-name/]file-name[.ext]
where:
path-name is the operating system path containing or to contain the
file.
If you do not specify a path-name, the current directory is
assumed.
file-name is the name of the disk file.
.ext is a period (.) followed by the file-name
extension.
This COBOL system also recognizes the following symbolic names, which
must appear without any path name or extension:
:CI:, stdin     Standard input
:CO:, stdout    Standard output
LPT1            Printer output
:CE:, stderr    Standard error
To handle the symbolic names used by other operating systems, use
file-name mapping to map the symbolic names onto UNIX device names.
The maximum sizes of path-name, file-name and .ext are consistent with the
UNIX system limits. Typically, the total size of the combination of all
three name components is 100 characters. The maximum size of the
combination of file-name and .ext is 14 characters. See your Release
Notes for details of the maximum number of characters permitted in a
component name and a path-name.
This COBOL system detects any path names which exceed the permitted
maximum number of characters for a path when you attempt to open,
split/join or map such files. It similarly detects any component names
which exceed the permitted maximum number of characters for a component
name.
For example:
* An OPEN...OUTPUT operation with FILE STATUS defined, on a file
whose component name or path-name exceeds the permitted maximum
number of characters, causes a file status error.
* An OPEN...OUTPUT operation without FILE STATUS defined, on a file
whose component name or path-name exceeds the permitted maximum
number of characters, causes a fatal run-time error.
* A CALL...ON EXCEPTION operation using a file whose path-name
exceeds the permitted maximum number of characters for a path-name
causes an exception error.
* A CALL operation with no EXCEPTION clause that uses a file whose
path-name exceeds the permitted maximum number of characters for a
path-name causes a fatal run-time error.
The fatal run-time error is:
188 Filename too large
except for indexed files for which you receive the fatal run-time system
error:
75 Indexed data file name too long
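The difference between the first two cases can be seen in a minimal sketch. It assumes a line sequential file and uses the hypothetical names out-file, out-name and out-status, none of which come from this manual. Because FILE STATUS is declared, a failed OPEN (for example, one caused by a name that is too long) is reported through out-status rather than ending the run. The particular status value returned is not assumed here.

```cobol
       identification division.
       program-id. openchk.
       environment division.
       input-output section.
       file-control.
           select out-file assign to dynamic out-name
               organization is line sequential
               file status is out-status.
       data division.
       file section.
       fd  out-file.
       01  out-rec            pic x(80).
       working-storage section.
      * out-name and out-status are illustrative names only.
       01  out-name           pic x(65).
       01  out-status         pic xx.
       procedure division.
       main-para.
           move "file1.dat" to out-name
           open output out-file
      * With FILE STATUS declared, a failed OPEN returns here
      * with a non-zero status instead of stopping the run.
           if out-status not = "00"
               display "open failed, file status " out-status
           else
               close out-file
           end-if
           stop run.
```

If the FILE STATUS clause were removed from the SELECT, the same failure would be expected to end the run with one of the fatal errors listed above.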
In the interests of program portability, we advise you not to build the
absolute path-names of files into your programs. Use relative
path-names instead. For example, we recommend that you use:
mylib/file1
rather than:
/usr/mylib/file1
Data File Assignment
Data file assignment enables you to tie the logical files of a program to
the physical files.
The COBOL system provides three types of file assignment facilities:
* Fixed file assignment, in which you assign the logical (internal)
file-name to a literal (external) file-name at the time you write
your program, which gives a physical file-name.
* Dynamic file assignment, in which you assign the logical
(internal) file-name to a data item defined in your program. You
store the name of the physical file in this variable at run-time
before opening the file.
* External file assignment, in which you assign the logical
(internal) file-name to an environment variable in your program to
specify the external name of the file being referenced.
These are defined in the following sections.
Fixed File Assignment.
In fixed file assignment, you assign the logical (internal) file-name to
a literal (external) file-name in the File-Control paragraph of your
program. Note that you cannot change the literal without recompiling
your program.
Environment Division.
In the File-Control paragraph, you specify the SELECT...ASSIGN TO clause
in the following format:
select file-name
assign to [disk] literal
[,literal].
where:
file-name is the logical (internal) user file-name and can be any
COBOL word you define.
disk is optional and gives you the choice of specifying the
literal either in the SELECT clause, or in the FD entry for
the file in the File section.
literal is the operating system path-name of the corresponding
file. It can be the name of a file on disk, optionally
including a drive and/or path identifier and file-name
extension, or it can be a device-name. Standard
device-names are listed in the section File Naming
Conventions earlier in this chapter.
If you do not specify literal with disk in the SELECT clause and do not
specify the VALUE OF FILE-ID clause in the file's FD entry, the effect is
to use a physical file-name that is the same as the logical (internal)
file-name.
Note that the logical (internal) file-name is folded to uppercase by the
Compiler.
Refer to your operating system manuals for more information on file names
and to your Language Reference for the full syntax of the ASSIGN TO
clause.
Examples.
These examples use the DOS, Windows and OS/2 file-naming format; that
is, a drive identifier and backslashes ( \ ) as separators. For UNIX,
replace the drive identifier with a device name and the backslashes
with forward slashes ( / ).
select stockfile assign to "b:warehs.buy".
select printfile assign to "prn:".
select input-file assign to "data\prog1in".
The file-name you specify is then also used in the OPEN statement when
the file is needed for use in the program. In the first example above
you would specify open input stockfile which causes the file warehs.buy
to be opened on drive b:.
Dynamic File Assignment.
In dynamic assignment, you assign the logical (internal) file-name used
by your program to a data item. You can choose to define the data item
name explicitly in the Data Division. If you do, the contents of the
data item are used when the file is being opened to specify the file-name
to the operating system. If you decide not to define the data item
explicitly, the COBOL system supplies an implicit definition of pic
x(65). Then you must remember to MOVE a value into the data item before
using it in any OPEN statement.
Environment Division.
In the File-Control paragraph, you specify the SELECT...ASSIGN TO clause
in the following format:
select file-name
assign to dynamic disk data-item
where:
file-name is the logical (internal) file-name and can be any COBOL
word you define.
disk enables you to specify the name of the data item that holds
the file-name in the VALUE OF FILE-ID clause in the file's
FD entry.
data-item is a COBOL data-name.
If you use the ASSIGN"DYNAMIC" directive, you can omit the word DYNAMIC
from this clause.
Examples.
select stockfile assign dynamic stock-name.
select output-file assign to dynamic real-file-name.
Procedure Division.
Before the file is opened, you must move the operating system file-name
to the data-item specified in the ASSIGN phrase of the SELECT clause.
Example.
This example includes file-naming in the DOS and OS/2 file-naming format;
that is, using a drive identifier and backslashes ( \ ) as separators.
For UNIX, replace the drive identifier with a device name and the
backslashes with forward slashes ( / ).
select stock-file assign to dynamic stock-name.
. . .
data division.
. . .
working-storage section.
01 stock-name pic x(25).
procedure division.
. . .
move "warehs.buy" to stock-name.
open input stock-file.
. . .
. . .
. . .
close stock-file.
. . .
. . .
move "data\warehs.sel" to stock-name.
open input stock-file.
. . .
. . .
close stock-file.
You can enter the operating system file-name either by entering the name
at the keyboard through an ACCEPT statement or by storing it as variable
data. In this way, different external files can be processed by one
data-item during the run of the program. You must move the name of the
file to data-item before the OPEN statement is executed.
Once the OPEN statement has been executed, you can use the data-item data
area for any purpose you require. In the above example, between the two
OPEN statements you could use stock-name to store any data string you
need.
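As an illustration of taking the file-name from the keyboard, the following sketch accepts a name into the data item before the OPEN. It is an assumption-laden example rather than text from this manual: the file is taken to be line sequential, and stock-status and stock-rec are hypothetical names added so that a failed OPEN can be reported.

```cobol
       identification division.
       program-id. askname.
       environment division.
       input-output section.
       file-control.
           select stock-file assign to dynamic stock-name
               organization is line sequential
               file status is stock-status.
       data division.
       file section.
       fd  stock-file.
       01  stock-rec          pic x(80).
       working-storage section.
       01  stock-name         pic x(65).
      * stock-status is an illustrative name only.
       01  stock-status       pic xx.
       procedure division.
       main-para.
      * The name must be in stock-name before the OPEN executes.
           display "Stock file name: " with no advancing
           accept stock-name
           open input stock-file
           if stock-status = "00"
               read stock-file
                   at end display "file is empty"
                   not at end display stock-rec
               end-read
               close stock-file
           else
               display "cannot open file, status " stock-status
           end-if
           stop run.
```

Running the program again with a different name at the prompt processes a different physical file without recompiling.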
External File Assignment.
With external file assignment, the ASSIGN clause contains the name of an
environment variable. You must either set and export, or define the
environment variable in the operating system environment, prior to
running the program to specify the name of the literal (external) file
being referenced. Using this method, you can easily change the literal
(external) file-name without having to change the program.
Environment Division.
In the File-Control paragraph, you specify the SELECT...ASSIGN TO clause
in the following format:
select file-name
assign to external literal
where:
file-name is the logical (internal) file-name and can be any COBOL
word you define.
literal identifies an operating system environment variable whose
value is used as the file's path name.
Under DOS, Windows and OS/2, you set the environment
variable using the SET command. Under UNIX, you set the
environment variable to the appropriate value and export
the value to the shell.
If you use the ASSIGN"EXTERNAL" directive, you can omit the word EXTERNAL
from this clause.
See the Language Reference for the full syntax of the SELECT...ASSIGN TO
clause.
Example.
select file-1 assign external masterfile
When an OPEN statement is executed, the run-time system searches for the
specified environment variable and loads the file specified by that
environment variable. In the example above, if masterfile is set to
myfile.cbl, then myfile.cbl is opened when an OPEN statement for file-1
is executed.
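The following sketch puts the example into a complete program. It assumes a line sequential file; file-1-rec and file-1-status are hypothetical names. The commands in the comments follow the SET and export conventions described above, but the exact case in which the run-time system looks up the variable name is not stated here and should be checked for your system.

```cobol
      * Before running, define the environment variable, e.g.
      *   DOS, Windows and OS/2:  set masterfile=myfile.cbl
      *   UNIX:                   masterfile=myfile.cbl
      *                           export masterfile
       identification division.
       program-id. extasgn.
       environment division.
       input-output section.
       file-control.
           select file-1 assign to external masterfile
               organization is line sequential
               file status is file-1-status.
       data division.
       file section.
       fd  file-1.
       01  file-1-rec         pic x(80).
       working-storage section.
      * file-1-status is an illustrative name only.
       01  file-1-status      pic xx.
       procedure division.
       main-para.
           open input file-1
           if file-1-status = "00"
               display "opened the file named by masterfile"
               close file-1
           else
               display "open failed, status " file-1-status
           end-if
           stop run.
```

Changing the value of the environment variable before the next run selects a different physical file with no change to the program.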
On DOS, Windows and OS/2 systems, if literal contains any hyphen ( - )
characters, the variable to be defined in the SET command is all the text
after the last hyphen. (DOS, Windows and OS/2)
For example:
select file-2 assign to external a-b-c-defg
The environment variable required and searched for by the run-time system
is defg.
Multiple Reel Files
You can specify sequential files as multiple reel (or multiple unit)
files. This means that a sequential file can be held on more than one:
* removable disk
* cartridge tape (UNIX)
* tape reel (UNIX)
* disk partition (UNIX)
You must specify the file as a multiple reel file in the SELECT clause of
the Environment Division.
You cannot specify a sequential file as multiple reel if it has
variable-length records, since the file header record (see below) stores
only one record length.
Whenever you specify a sequential file as a multiple reel file, you are
prompted to load the appropriate reel of the file. This applies also to
the first reel of the file, even though it may already be loaded. The
prompt is:
PLEASE LOAD VOLUME nnnn OF FILE file-name FOR access
ENTER NEW DEVICE (IF REQUIRED) AND <CR> WHEN READY
where:
nnnn is the four-digit reel number of the reel expected to be
loaded. nnnn is in the range 0001 to 9999.
file-name is the file-name as specified in the SELECT clause in the
source program.
access is INPUT, OUTPUT or I/O as specified in the source program.
At this prompt you must ensure that the relevant disk or reel is loaded
(media for output must already be formatted), and enter:
* the drive identifier as a single character on DOS, Windows and
OS/2 systems. If you do not specify a drive, the COBOL system
assumes the default drive or the device identifier in the SELECT
clause of your program. (DOS, Windows and OS/2)
* the file-name on UNIX systems. (UNIX)
The system accepts input only in response to this prompt. The system
clears the input buffer each time this prompt is displayed so you cannot
type ahead. If you load the wrong volume of a file, or if the header
information is in some way corrupt, an error is returned.
On UNIX systems, when you have entered the relevant parameters for the
first prompt, another prompt is displayed as follows: (UNIX)
PLEASE ENTER CAPACITY OF DEVICE IN 1024 BYTE BLOCKS
At this prompt, enter the capacity of the device that the file is going
to, in 1024 byte blocks.
On all operating systems, if you decide not to continue, you must enter
at the prompt the key sequence that you have configured to terminate a
run. When this key sequence is executed, it ensures that all the files
are closed and that all the information is saved, as though you had
executed a STOP RUN statement.
The prompt to load a reel is displayed whenever:
* a multiple reel file is opened
* a CLOSE REEL statement is executed
* the reel becomes full while writing to a multiple reel file (this
is a forced reel-swap on WRITE)
* "end of reel" is true for a multiple reel file that is opened for
INPUT (or I/O on READ). This is true provided that a continuation
reel was created when the file was written.
NOTE Although you can specify REWRITE operations on a file opened for
INPUT-OUTPUT, we do not recommend that you do so. If the record
you are rewriting is at the end of a reel, the preceding READ
statement will have forced a reel-swap, so the rewrite will fail.
Multiple Reel File Header Record.
Multiple reel files have a block of header information that is 256 bytes
long. This header occupies the first 256 bytes of each reel and contains
information that describes the reel. The header includes a 44-byte area
that is reserved. Under DOS, Windows and OS/2, you can use it for your own
internal identifier. Under UNIX, this area is static and cannot be changed.
On DOS, Windows and OS/2 systems, the following routine moves a string to
the reserved area: (DOS, Windows and OS/2)
call x"A8" using your-label
where:
your-label is a PIC X(44) field containing the information to be
put into the header.
Only ASCII printable characters are allowed in this area. Once this
routine has been used, each subsequent OPEN OUTPUT on a multiple reel
file has your string in its header.
On DOS, Windows and OS/2 systems, when running programs to read multiple
reel files created by early versions of Micro Focus COBOL, you must use
the -v run-time switch. This disables checking of the header title.
(DOS, Windows and OS/2)
A multiple reel file header has the following structure:
Bytes      Content
-------------------------------------------------------
0-49       Multiple reel header start identification.
50-69      File-name. This is the name of the file
           as specified in the SELECT clause.
70-75      Date of file creation in the form yymmdd
           (year, month, day in ASCII digits). If
           your system does not return the date, this
           part of the header contains ASCII zeros.
76-83      Time of file creation in the form hhmmsscc
           (hours, minutes, seconds, hundredths of a
           second in ASCII digits). If your system
           does not return the time, this part of the
           header contains ASCII zeros.
84-127     Reserved area for your own use (see
           above).
128-131    Reel number. This is a four-digit ASCII
           value showing the reel number in the range
           0001 to 9999.
132        Continuation flags. A one-byte value that
           shows how the reel ends. The value is
           ASCII "Y", "A" or "N" as follows:
           Y  This reel is followed by a continuation
              reel. A CLOSE REEL statement was used
              to change the reels.
           A  This reel is followed by a continuation
              reel. The reels were changed
              automatically when this reel became
              full.
           N  This reel has no continuation. It is
              the last reel of the file.
133        Reserved. Currently contains the ASCII
           value "N".
134-145    Reel length. A 12-digit ASCII value which
           indicates how many bytes of information
           are on this reel. This part of the header
           contains zeros if your system cannot
           determine the reel size.
146-151    Record size. This is a six-digit ASCII
           value that shows the record length of
           records in this file.
152-157    Block size. This is a six-digit ASCII
           value that has the same value as the
           record size area of the header.
158-239    Reserved area containing ASCII spaces.
240-255    Multiple reel header end identification.
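The header layout above can be expressed as a COBOL record description, which is one way to check that the byte ranges add up to 256. This is a sketch only: the data-names (reel-header, hdr-start-id and so on) are hypothetical and are not defined by the COBOL system, and the choice of PIC 9 for the ASCII digit fields is an assumption.

```cobol
       identification division.
       program-id. reelhdr.
       data division.
       working-storage section.
      * All data-names below are illustrative only.
       01  reel-header.
           05  hdr-start-id       pic x(50).
           05  hdr-file-name      pic x(20).
      *    yymmdd
           05  hdr-create-date    pic 9(6).
      *    hhmmsscc
           05  hdr-create-time    pic 9(8).
           05  hdr-user-area      pic x(44).
           05  hdr-reel-number    pic 9(4).
           05  hdr-continuation   pic x.
               88  hdr-cont-close-reel  value "Y".
               88  hdr-cont-automatic   value "A".
               88  hdr-last-reel        value "N".
           05  hdr-reserved-flag  pic x.
           05  hdr-reel-length    pic 9(12).
           05  hdr-record-size    pic 9(6).
           05  hdr-block-size     pic 9(6).
           05  filler             pic x(82).
           05  hdr-end-id         pic x(16).
       01  hdr-len                pic 9(4).
       procedure division.
       main-para.
      * The group length should equal the 256-byte header size.
           move length of reel-header to hdr-len
           display "header length in bytes: " hdr-len
           stop run.
```

The field sizes follow directly from the byte ranges in the table (for example, bytes 84-127 give the 44-byte user area), so the group reports a length of 256 bytes.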
File Buffering
With file buffering, records are written to a buffer (block) in memory
until the block is full, at which time the block is written to disk.
This reduces the number of disk accesses and so speeds up the
program.
Similarly when reading records from disk, a block is read from the file
into a memory buffer and the next record extracted from the buffer.
When the file is closed, any data that has not already been written to
disk is written. The COBOL system then requests that the operating
system closes the file.
File Buffering on DOS, Windows and OS/2.
All sequentially accessed data files written by the COBOL system are
buffered. The indexed file handler buffers index records and data
records using pre-defined buffers. The size of the buffers can be
changed using the environment variables EXTFHBUF and IDXDATBUF.
EXTFHBUF controls the size of the buffer to be used for records of the
.idx files of indexed files. It has a minimum size of 4096 bytes, a
default size of 16384 and a maximum size of 65535 bytes. It can be
changed using the operating system command:
set extfhbuf=buffer-size
where:
buffer-size is the size of the buffer to use, in bytes. It
should be a multiple of 4096 bytes.
The environment variable IDXDATBUF controls the size of the buffer to be
used for data records of an indexed file. The default is zero, in which
case records are not buffered. Data buffering will generally improve the
speed of sequential accesses to an indexed file only if the data was
written in key order. You can get a file into key order using the
reorganization facility of Rebuild. See the chapter Rebuild later in
this manual for further details. Data buffering is enabled by the
operating system command:
set idxdatbuf=buffer-size
where:
buffer-size is the size of the buffer to use, in bytes. It
should be a multiple of 4096 bytes.
File Buffering on UNIX.
Variable length sequential files are buffered by default. Fixed length
sequential files are buffered when the environment variable COBEXTFHBUF is
set.
COBEXTFHBUF controls the size of the buffer to be used and is set as
follows:
COBEXTFHBUF=buffer-size
export COBEXTFHBUF
where:
buffer-size is the buffer size in bytes.
It is not possible to change the buffering of indexed files.
File Usage
Under this COBOL system with DOS, Windows and OS/2, the memory required
for each open file is calculated as the buffer size plus 96 bytes for the
header record. The buffer size is, by default, 4096 bytes for sequential
access files; otherwise it is 512 bytes. For example, a sequential access
file with the default buffer requires 4096 + 96 = 4192 bytes. If the G
run-time switch is turned on, then all files use 512-byte buffers. Screen
input and output and direct output to a printer are excluded from this
calculation. (DOS, Windows and OS/2)
Under this COBOL system with UNIX, the maximum size of any file you can
create is limited by the environment variable ulimit. You may find that
the default limit, as supplied with your UNIX system, is rather small.
However, this limit can be increased by a superuser. (UNIX)
The maximum number of files that can be opened at any one time, excluding
the standard input, output and error files, is dependent on the
configuration of your UNIX system. See your Release Notes for details of
configuration for your system.
NOTE
* Each open indexed sequential file counts as two files.
* Whenever the COBOL system executes a GO TO, PERFORM or CALL
statement that causes an overlay to be loaded, another file
is used while the overlay or subprogram is being loaded.
* On DOS, Windows and OS/2, when you are using Animator, a
minimum of two extra files are used. If the RAM starts to
fill up, then the COBOL system will open any number of files
for areas that are temporarily swapped out of RAM. (DOS,
Windows and OS/2)
* On DOS and Windows, the maximum number of open files is
given by FILES in your CONFIG.SYS file. Note that five of
these files are reserved for DOS. On OS/2, you can use the
/F switch to raise the limit on the number of open files.
See the appendix Descriptions of Run-time Switches later in
this manual for further information. (DOS, Windows and
OS/2)
* On UNIX, by default, a maximum of 3 Mbytes of memory can be
used by a SORT or MERGE, although a minimum of 55% of the
original calculated requirement will generally be
sufficient. (UNIX)
If this value is considered too high for your environment,
the environment variable COBSW=-s should be used to restrict
the memory used. If, however, you have more than 3 Mbytes
of memory to deal with larger amounts of data, you can set
COBSW=-s to enable SORT/MERGE to use this extra memory.
In general, if the memory that SORT/MERGE has at its
disposal exceeds the size of the data to be sorted or
merged, this will give SORT/MERGE peak performance.
See the appendix Descriptions of Run-time Switches later in
this manual for further information on the -s setting of
COBSW.
* On UNIX, the open mode of a file can influence the maximum
number of files opened at any one time. Files opened for
INPUT are limited to the maximum number of files per process
which has been configured for your UNIX system. Files with
EXCLUSIVE access (for example, files opened for OUTPUT)
acquire file locks and are thus limited to the maximum
number of file locks per process which has been configured
for your UNIX system. These two limits are not necessarily
the same. (UNIX)
* If a power outage or a system reboot occurs while an
application is executing, the integrity of the file cannot
be guaranteed.
File Compression
This COBOL system enables you to store files in compressed form in order
to save disk space. This can be accomplished either through data
compression or key compression.
Data Compression.
Data compression is a process that enables you to compress the data in a
sequential or indexed file. The compression mechanism provided with this
COBOL system is run-length encoding (type 1).
When a file is defined with run-length encoding, any string of repeating
characters is stored as a single character with a repetition count.
You enable data compression by including the DATACOMPRESS directive in a
$SET statement with an integer value of 1 (run-length encoding). For
details on DATACOMPRESS, see the appendix Directives for Compiler.
On DOS, Windows and OS/2 systems, when compressing data in a sequential
file, you must specify the CALLFH"EXTFH" directive in a $SET statement in
your program. (DOS, Windows and OS/2)
Specifying data compression for a fixed structure sequential file changes
it into a variable structure sequential file. See the appendix File
Formats for further information. This will not affect your program.
Data compression routines are callable from your program using the
Callable File Handler, available with an add-on product.
The compression used by a file is determined by the last processed
DATACOMPRESS directive when the SELECT statement for the file is
processed. Consequently, the compression type can be set for an
individual file by using a line of the form:
$SET DATACOMPRESS
immediately before its SELECT statement. You must not forget to turn it
off with a $SET NODATACOMPRESS before any other files are processed.
Key Compression.
Key compression is a technique that can be applied to the keys of an
indexed file. There are three types of compression available:
* compression of trailing spaces
* compression of identical leading characters
* compression of duplicate alternate key values.
Any combination of these can be used on any key, though the compression
of duplicates is only appropriate to alternate keys with duplicates
enabled.
Key compression is not specified by COBOL syntax. You enable it by
including the KEYCOMPRESS directive in a $SET statement using integers to
indicate which types of compression you want.
The compression used by a file is determined by the last processed
KEYCOMPRESS directive when the SELECT statement for the file is
processed. Consequently, the compression type can be set for an
individual file by using a line of the form:
$SET KEYCOMPRESS
immediately before its SELECT statement. You must not forget to turn it
off with a $SET NOKEYCOMPRESS before any other files are processed.
For details on KEYCOMPRESS, see the appendix Directives for Compiler
later in this manual.
Compression of Trailing Spaces.
When a key is defined with compression of trailing spaces, trailing
spaces in a key value are not stored in the file. However, information
is stored so that the key can be correctly located.
For example, assume you have a prime or alternate key that is 30
characters long, and that you write a record in which only the first 10
characters of the key are used, the rest being spaces. Without
compression, all 30 characters of the key are stored. With compression
of trailing spaces, the key only occupies 11 bytes in the index file (10
bytes for the characters of the key and 1 byte as a count of the trailing
spaces).
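The arithmetic in this example can be reproduced with a short sketch. It only counts bytes; it does not show the actual index file format, which is not described here. The names key-value, trailing-count and stored-bytes are hypothetical, and the sketch assumes a compiler that supports the REVERSE intrinsic function and the LENGTH OF phrase.

```cobol
       identification division.
       program-id. trailsp.
       data division.
       working-storage section.
      * A 30-character key with only the first 10 characters used.
       01  key-value          pic x(30) value "ABCDEFGHIJ".
       01  trailing-count     pic 9(4) value 0.
       01  stored-bytes       pic 9(4).
       procedure division.
       main-para.
      * Leading spaces of the reversed key are the trailing
      * spaces of the original key.
           inspect function reverse(key-value)
               tallying trailing-count for leading spaces
      * Significant characters plus one count byte.
           compute stored-bytes =
               length of key-value - trailing-count + 1
           display "trailing spaces: " trailing-count
           display "bytes stored:    " stored-bytes
           stop run.
```

For the key shown, the sketch counts 20 trailing spaces and 11 stored bytes, matching the figures above.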
Compression of Leading Characters.
When a key is defined with compression of leading characters, all leading
characters that match leading characters in the preceding key are not
stored in the index file. However, information is stored to allow the
key to be correctly reconstructed.
For example, assume that records are written with the following key
values in a key defined with compression of leading characters:
AXYZBBB BBCDEFG BBCXYZA BBCXYEF BEFGHIJ CABCDEF
The keys actually stored in the index file are:
AXYZBBB BBCDEFG XYZA EF EFGHIJ CABCDEF
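The following sketch reproduces this example by comparing each key with the one before it and printing the number of matching leading characters followed by the remaining characters. It illustrates the idea only; the layout used in the index file is not described here. All data-names are hypothetical, and the rule of always keeping at least the last character is an assumption made to keep the sketch simple.

```cobol
       identification division.
       program-id. leadcmp.
       data division.
       working-storage section.
      * The six key values from the example, in the order written.
       01  key-list.
           05  filler         pic x(7) value "AXYZBBB".
           05  filler         pic x(7) value "BBCDEFG".
           05  filler         pic x(7) value "BBCXYZA".
           05  filler         pic x(7) value "BBCXYEF".
           05  filler         pic x(7) value "BEFGHIJ".
           05  filler         pic x(7) value "CABCDEF".
       01  key-table redefines key-list.
           05  key-entry      pic x(7) occurs 6 times.
       01  prev-key           pic x(7) value spaces.
       01  i                  pic 9(2).
       01  j                  pic 9(2).
       01  dropped            pic 9(2).
       procedure division.
       main-para.
           perform varying i from 1 by 1 until i > 6
      * Find the first position where this key differs from the
      * previous key; stop at position 7 so one character is kept.
               move 1 to j
               perform until j = 7
                   or key-entry(i)(j:1) not = prev-key(j:1)
                   add 1 to j
               end-perform
               compute dropped = j - 1
               display dropped " " key-entry(i)(j:)
               move key-entry(i) to prev-key
           end-perform
           stop run.
```

The suffixes displayed are AXYZBBB, BBCDEFG, XYZA, EF, EFGHIJ and CABCDEF, the same values as in the list of stored keys above.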
Compression of Duplicate Keys.
When an alternate key is defined with compression of duplicates, only the
first duplicate key is contained in the file. The rest are not stored,
but information is stored to allow correct recreation of the keys.
For example, suppose you write a record with an alternate key value
"ABC". If you have enabled compression of duplicate keys, and you write
another record with the same key value, the file handler does not
physically store the duplicate key value in the index file. However, the
record is still available along the alternate key path.
Example of Setting Compression.
In the following program, data compression is specified for transfile but
not for masterfile. For key compression, suppression of trailing spaces
and of leading characters that are the same as in the previous key is
specified for keys t-rec-key and m-rec-key. Suppression of repetition of
duplicate keys is also turned on for m-alt-key-1 and m-alt-key-2.
$set callfh"extfh"
$set datacompress"1"
$set keycompress"6"
    select transfile
        assign to ...
        key is t-rec-key.
$set nokeycompress
$set nodatacompress
    select masterfile
        assign to ...
        organization is indexed
$set keycompress"6"
        record key is m-rec-key
$set keycompress"7"
        alternate key is m-alt-key-1 with duplicates
        alternate key is m-alt-key-2.
$set nokeycompress
Duplicate Keys
All alternate keys in indexed files can be duplicate keys. However, we
do not recommend that you allow duplicates on primary keys. This is
because, with duplicates allowed, it is not possible to uniquely identify
records in a file. See the READ, REWRITE and DELETE statements in your
Language Reference.
To enable duplicate keys, you specify the phrase WITH DUPLICATES in the
ALTERNATE RECORD KEY clause of the SELECT statement.
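A minimal sketch of an indexed file with a duplicate alternate key is shown below. The file-name master.dat, the record layout and the name m-status are assumptions made for illustration. Two records are written with different prime keys but the same alternate key value.

```cobol
       identification division.
       program-id. dupkeys.
       environment division.
       input-output section.
       file-control.
           select masterfile assign to "master.dat"
               organization is indexed
               access mode is dynamic
               record key is m-rec-key
               alternate record key is m-alt-key-1
                   with duplicates
               file status is m-status.
       data division.
       file section.
       fd  masterfile.
       01  m-record.
           05  m-rec-key      pic x(6).
           05  m-alt-key-1    pic x(10).
       working-storage section.
      * m-status is an illustrative name only.
       01  m-status           pic xx.
       procedure division.
       main-para.
           open output masterfile
           move "000001" to m-rec-key
           move "ABC"    to m-alt-key-1
           write m-record
      * Same alternate key value, different prime key.
           move "000002" to m-rec-key
           write m-record
           display "status after second write: " m-status
           close masterfile
           stop run.
```

Without the WITH DUPLICATES phrase, the second WRITE would be expected to fail because the alternate key value is already present.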
When you use duplicate keys, there are two limitations of which you
should be aware:
* A maximum number of 65535 duplicate keys is allowed for every
individual key in a standard file. Each time you specify a
duplicate key, an increment of one is added to its occurrence
number. However, because the occurrence number is used to ensure
that duplicate key records are read in the order in which they
were created, and any occurrence number whose record you have
deleted cannot be reused, the duplicate key maximum may be
reached.
To overcome this, a different file format is available:
IDXFORMAT"4". You invoke this file format when you specify the
IDXFORMAT"4" compiler directive within the SELECT statement for
individual files or for all files in the program. IDXFORMAT"4"
format files allow a maximum of 4,294,967,297 duplicate keys. The
additional information is stored in the data record so that the
handling of such a number of keys is quicker. This causes the data
records of such files to be larger than those of the default files.