by George Stachnik
Operating systems such as MPE/iX are tools that people
use to manage computer systems. The word manage can mean a lot of
things. But in the context of commercial computing, it comes down to
doing just two things: You need to tell the
computer which programs to run, and you must tell it where to find or store
the data that those programs will operate upon.
This month, we're going to explore some of the tools application programmers
can use to access data stored in files--beginning with a brief review of
formal file designators, and continuing with the MPE/iX Intrinsic interfaces.
An application, once running, typically reads data from one or more input
files. It also writes new data into one or more output files. In earlier
articles in this series, we saw that the linkages between programs and their
input and output files are typically defined using special names called
formal file designators. We'll begin this month's installment with
a brief review of how these things work.
Formal File Designators
When a programmer designs an application program, he or she must decide
how many files the program will access, and how. For each file, the program's
source code will contain a name called a formal file designator. For example,
suppose you were designing a very simple application program with one input
file and one output file. You might decide to identify them using the formal
file designators INFILE and OUTFILE. There's nothing special about these
names; you can use any names you like. The only restriction MPE makes is
that the formal file designators must follow MPE's file naming conventions.
That is, they can be no more than eight alphanumeric characters long, and
the first character must be alphabetic.
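The naming rule above is simple enough to check mechanically. The following is a minimal Python sketch of that check, written only for illustration. The function name is_valid_designator is our own invention, and the sketch covers only the simple rule stated here (one to eight alphanumeric characters, the first alphabetic); it does not attempt group or account qualification.

```python
import re

# Rule as stated in the text: 1 to 8 alphanumeric characters, the first alphabetic.
_DESIGNATOR = re.compile(r"[A-Za-z][A-Za-z0-9]{0,7}")

def is_valid_designator(name: str) -> bool:
    """Return True if name satisfies the naming rule described above."""
    return _DESIGNATOR.fullmatch(name) is not None

print(is_valid_designator("INFILE"))      # True
print(is_valid_designator("1INFILE"))     # False: starts with a digit
print(is_valid_designator("OUTPUTFILE"))  # False: ten characters
```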
When you run an HP 3000 application program, it will try (by default
at least) to read and write files that have names that match the formal
file designators. For example, suppose you have a program that uses the
formal file designators INFILE and OUTFILE. When you run such a program,
MPE will attempt to read and write files in your logon group that bear those
file names.
Input files are handled slightly differently from output files. For example,
when an application program opens an input file named INFILE, MPE will try
to find an existing file (either temporary or permanent) named INFILE. If
such a file doesn't exist, then the open operation will fail with an error.
If the program doesn't trap this error and handle it, the program will fail
the first time it tries to read the file. Similarly, when an application
program opens an output file named OUTFILE, MPE will attempt to open an
existing file (either temporary or permanent) by that name. If no file named
OUTFILE exists, MPE will create a new file by that name and use it.
In an earlier article in this series, we saw that formal file designators
can be linked with file names other than the ones specified in the program
itself. This is done using a form of the :FILE command called a file equation.
For example, if you want your application program to read its input data
from a file called INPUT01 instead of from INFILE, you simply issue the
following command prior to running the program:
:FILE INFILE=INPUT01
File equations are very versatile tools. You can use them to redirect
your output to other groups and accounts. For example, suppose you wanted
to redirect the output to a file called DATA01 in the PUB group of the MYACCT
account. Once again, referencing the formal file designator (OUTFILE), you
might use a file equation like this one:
:FILE OUTFILE=DATA01.PUB.MYACCT
File equations also can be used to redirect the input or output of a
program to special devices. By default, HP 3000 applications will read and
write files on disk drives (;DEV=DISC). But suppose you wanted to redirect
the records being written to the file designated OUTFILE to a tape drive
instead. Once again, a file equation, issued prior to executing the program,
will do the trick. This time the syntax is:
:FILE OUTFILE;DEV=TAPE
As you can see, file equations can be used to make programs work with
files on any device, using any valid file name. The application programmer
needs to be concerned only with
the names INFILE and OUTFILE (or whatever formal file designators he or
she chooses). People who run the program are free to associate those designators
with whatever files they please, on whatever devices they please, using file
equations like those shown above.
The application programmer does not need to be aware of those equations at all.
File names and device characteristics are hard-coded into each application
program. But they represent only the program's defaults. They can be overridden,
as we have shown, using file equations. This characteristic of MPE/iX is
called "device independence," and it gives MPE/iX system managers
a great deal of flexibility in how they manage their applications.
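One way to picture how file equations override a program's defaults is as a lookup table that is consulted when a file is opened. The Python sketch below is a toy model under that assumption, not a description of how MPE/iX implements the :FILE command. The names file_equations, file_command, and resolve are hypothetical, and only the file name and device are modeled.

```python
# Hypothetical model: a table of active file equations, keyed by formal designator.
file_equations = {}

def file_command(designator, actual=None, dev=None):
    """Record an equation, loosely mirroring :FILE designator=actual;DEV=dev."""
    file_equations[designator.upper()] = {"name": actual, "dev": dev}

def resolve(designator, default_dev="DISC"):
    """Return the (file name, device) that an open of this designator would target."""
    eq = file_equations.get(designator.upper(), {})
    name = eq.get("name") or designator.upper()   # fall back to the program's default
    dev = eq.get("dev") or default_dev
    return name, dev

file_command("INFILE", actual="INPUT01")   # :FILE INFILE=INPUT01
file_command("OUTFILE", dev="TAPE")        # :FILE OUTFILE;DEV=TAPE

print(resolve("INFILE"))    # ('INPUT01', 'DISC')
print(resolve("OUTFILE"))   # ('OUTFILE', 'TAPE')
print(resolve("LOGFILE"))   # ('LOGFILE', 'DISC'), no equation so defaults apply
```

The program itself only ever mentions INFILE and OUTFILE; everything else comes from the table, which is the essence of device independence.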
In spite of this flexibility, there are some things that programmers
do need to be aware of and plan for. For example, application programs must
contain the program logic that defines exactly how they will access their
input and output files. One tool that is used to create this logic is a
set of special MPE routines called intrinsics.
Intrinsics
Suppose we want our application program to access an input file that
we'll designate as INFILE and an output file designated as OUTFILE. We must
make sure our application program does the following three things:
- The program must OPEN the files. For example, in COBOL, it might use
COBOL's OPEN verb. In COBOL, each file can be opened either for READ access
or for WRITE access. This is part of ANSI standard COBOL. HP COBOL is
an implementation of ANSI standard COBOL, with some extensions.
- After the files are OPENed, the application program will then READ
or WRITE the files. In general, each READ operation copies the contents
of one record from the input file into a buffer in the program itself.
Similarly, each WRITE operation copies the contents of a buffer into a
record in the output file. Programs written in HP COBOL typically achieve
this using the ANSI standard READ and WRITE COBOL verbs. There are similar
ANSI standard features for C, FORTRAN, and most other languages.
- When the program has finished executing, one of the last things it
should do before terminating is to CLOSE its files. Once again, this is
typically achieved using ANSI standard program statements such as COBOL's
CLOSE verb. If an HP 3000 application program terminates without closing
its files (as would happen in the case of a program abort), MPE will close
those files automatically.
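The same three steps show up in almost every language. As an illustration only, here is the OPEN, READ/WRITE, CLOSE pattern sketched in Python rather than COBOL. The function name copy_records and the temporary files named INFILE and OUTFILE are our own choices for the sketch; nothing here is specific to the HP 3000.

```python
import os
import tempfile

def copy_records(in_path, out_path):
    """Open both files, copy line-oriented records, and close them explicitly."""
    infile = open(in_path, "r")       # OPEN the input file for read access
    outfile = open(out_path, "w")     # OPEN the output file for write access
    count = 0
    try:
        for record in infile:         # READ one record into a buffer
            outfile.write(record)     # WRITE the buffer to the output file
            count += 1
    finally:
        infile.close()                # CLOSE both files before finishing
        outfile.close()
    return count

# Build a small input file in a scratch directory so the sketch is self-contained.
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "INFILE")
out_path = os.path.join(workdir, "OUTFILE")
with open(in_path, "w") as f:
    f.write("record 1\nrecord 2\nrecord 3\n")

print(copy_records(in_path, out_path))  # 3
```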
Generally, the precise techniques that are used to OPEN, READ, WRITE,
and CLOSE files are language dependent. That is, the instructions that you'd
code in COBOL are different from those that you'd use in BASIC or FORTRAN
or Java. Even within a language, there may be different implementations,
depending on which ANSI standard the compiler complies with. On the HP
3000, there have been COBOL compilers that complied with the 1968 ANSI
standard, the 1974 ANSI standard, and the 1985 ANSI standard for COBOL.
In spite of this, there is a common denominator across all versions of all
HP 3000 languages. That common denominator is made up of the MPE intrinsics.
MPE intrinsics are specialized routines that were designed to handle
tasks such as opening, closing, and accessing files. The MPE concept of
an "intrinsic interface" (or "intrinsic" for short)
is quite similar to the UNIX concept of a "system call" or the
MS Windows concept of an "entry point." Each intrinsic is a piece
of operating system code that can be invoked (or "called") by
an application program. Each intrinsic is associated with a specific task.
There are MPE intrinsics for opening files, closing them, reading them,
and writing them. There are also intrinsics that create processes, terminate
them, and communicate between processes. There's an intrinsic to handle
just about any system task you can think of.
Let's begin our exploration of the MPE intrinsics by taking a look at
an intrinsic called FOPEN. FOPEN can be used to open a file (see the sidebar).
There are at least two ways that FOPEN can be invoked from an application
program:
- First of all, an application program can call FOPEN explicitly.
For example, a program written in COBOL can use COBOL's CALL verb and invoke
FOPEN in much the same way you'd invoke a subroutine. Unlike a user-written
subroutine, the intrinsics are not part of the application program. They
are part of the operating system. FOPEN shows up in your compiler listing
as an "unresolved external reference."
- Alternatively, FOPEN might be invoked implicitly. For example,
suppose a COBOL program contains an ANSI standard COBOL OPEN verb. In
that case, a call to the intrinsic will be generated by the compiler. You
won't actually see it in the source listing, because you didn't code it.
In this case, the intrinsic is being called implicitly. But once
again, the reference to FOPEN will show up as an "unresolved external
reference."
The idea of unresolved external references was covered in part 12 of
this series, during our discussion of the linkage editor. Before an HP 3000
application program can be successfully executed, it must be processed by
the linkage editor, which will identify any unresolved references to MPE's
intrinsics that the program might contain. These references are not resolved
until run time, when the program is actually executed. At that time,
the references made by the application program will be resolved using the
system libraries, such as SL.PUB.SYS and NL.PUB.SYS.
Some Details About FOPEN
The FOPEN intrinsic was originally designed to open files on MPE/V systems.
It is supported today on MPE/iX systems as well, although it is little more
than a shell. On MPE/iX, FOPEN calls HPFOPEN, which actually does the work
of opening the file.
The precise syntax of FOPEN is defined in the HP 3000 intrinsics manual.
If you're new to intrinsics, learning about FOPEN is a good place to begin.
It is in some ways typical of MPE intrinsics.
The first thing to understand about intrinsics is that when you call
them, you typically pass them a list of parameters. Table 1
shows some of the parameters that can be passed to FOPEN. The
first parameter passed to FOPEN is the formal designator. This is a byte
array (MPE-speak for a character string) containing the formal file designator
of the file to be opened. The formal designator points FOPEN to the file
that you want to open.
Table 1: FOPEN
Parameter | Description |
Formal designator | A byte array that contains the formal file designator of the file to be opened. |
Foptions | An array of 16 bits defining the kind of file to be opened. (See Table 2.) |
Aoptions | An array of 16 bits defining the way this file is to be accessed. |
Recsize | An integer value defining the record size of the file to be opened. |
Device | A byte array defining the name of the device that the file resides upon (DISC, TAPE, etc.). |
Filesize | A long integer value that defines the number of records this file will contain. |
Filenumber | An integer value that is passed back from FOPEN. This integer value is used to identify the opened file to other intrinsics, such as FREAD and FWRITE. |
The formal designator is generally followed by two 16-bit binary words.
These words are referred to as the FOPTIONS array and the AOPTIONS array.
The FOPTIONS array is a string of 16 bits in which each bit has a special
meaning. Table 2 describes some of the combinations of bits that are
typically used in the FOPTIONS word. The notation used in this table is
a little cryptic, but it's worth understanding because it is used throughout
the MPE documentation. The bit strings that appear in the first column of
Table 2 are described using two numbers. The
first number is the number of the starting bit. In a 16-bit word, the bits
are numbered starting with 0 (0,1,2,....15). The second number is the number
of bits in the string.
Table 2: FOPTIONS
Bits | Value | Meaning | Comment |
2:3 | 000 | Standard file | Used to specify that the file to be opened is an ordinary (standard) file, suitable for access sequentially or directly. |
2:3 | 001 | CM KSAM file | Used to specify that the file to be opened is a CM KSAM file. Note that NM KSAM files cannot be opened with FOPEN. HPFOPEN must be used. |
5:1 | 0 | Allow file equations | File equations can be used to redirect file input or output. |
5:1 | 1 | Disallow file equations | File equations cannot be used to redirect the file. |
8:2 | 00 | Fixed length records | File will have fixed length records. |
8:2 | 01 | Variable length records | File will have variable length records. |
10:3 | 000 | Filename | This file will use the file name specified in the formal file designator. |
10:3 | 001 | $STDLIST | This file will use the system file $STDLIST. |
10:3 | 100 | $STDIN | This file will use the system file $STDIN. |
13:1 | 0 | Binary | This is a binary file. |
13:1 | 1 | ASCII | This is an ASCII file. |
14:2 | 00 | New file | This is a new file. When FOPEN executes, it will create the file using the specifications passed to FOPEN. |
14:2 | 01 | Old permanent file | This is an old permanent file. When FOPEN executes, it will look for an existing permanent file. File specifications such as the record size, which are passed to FOPEN, will be ignored. The record size of the existing permanent file will be used instead. |
14:2 | 10 | Old temp file | This is an old temporary file. When FOPEN executes, it will look for an existing temp file. File specifications such as the record size, which are passed to FOPEN, will be ignored. The record size of the existing temp file will be used instead. |
14:2 | 11 | Old permanent or temporary file | This is an existing file. Both the temporary and permanent file domains will be searched when the FOPEN intrinsic executes. |
Let's look at a couple of examples. In a 16-bit word, the leftmost 4
bits would be designated using the expression 0:4. This expression literally
means: 4 bits starting with bit 0. The next 8 bits would be designated as
4:8--which is to say 8 bits, beginning with bit 4. Keep in mind that bit
4 is actually the 5th bit--counting left to right--because we start counting
at 0. So the expression 4:8 references the 5th through 12th bits of the
word, which are numbered 4 through 11.
Here's one more example. The string 15:1 refers to the rightmost bit in
a 16-bit word.
For example, the first row of Table 2 is labelled 2:3.
This refers to bits 2, 3, and 4 of the 16-bit word (counting left to right,
starting with 0). The table shows that if these 3 bits are all 0, then the
file to be opened is a standard file. But if these 3 bits are 001, the file
to be opened is a CM KSAM file.
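If the start:length notation still feels slippery, it may help to see it as arithmetic. The Python sketch below extracts a field from a 16-bit word using the numbering described above, where bit 0 is the leftmost bit. The helper name get_field is ours, chosen for this illustration; it is not an MPE routine.

```python
WORD_BITS = 16

def get_field(word, start, length):
    """Extract the field start:length from a 16-bit word, bit 0 being leftmost."""
    shift = WORD_BITS - (start + length)   # how far the field sits from the right edge
    mask = (1 << length) - 1               # 'length' one-bits
    return (word >> shift) & mask

word = 0b0000_0000_0000_0101
print(get_field(word, 15, 1))  # 1, the rightmost bit
print(get_field(word, 14, 2))  # 1, the last two bits are 01
print(get_field(word, 13, 1))  # 1
print(get_field(word, 2, 3))   # 0, bits 2, 3, and 4 are all zero
```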
Take a look at the row of Table 2 that's
labelled 14:2. The last 2 bits of the FOPTIONS word tell FOPEN whether
it's going to open an existing file (01, 10, or 11), or create a new file
(00). If you are opening an existing file, you don't need to set the bits
that tell FOPEN things like what kind of file it is or what its record size
is. For example, if you're opening an existing KSAM file, it will figure
that out and handle it appropriately. But if you are opening a new file,
(bits 14:2=00) then FOPEN will be creating the file for you. In that case,
you'll have to pay attention to the other FOPTIONS bits, because they tell
FOPEN what kind of file to create.
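To make this concrete, here is a small Python sketch that assembles an FOPTIONS word from the field values in Table 2. It is an illustration of the bit layout only. The helper set_field is a name we made up, and a real program would pass the resulting integer to FOPEN through whatever calling convention its language provides.

```python
def set_field(word, start, length, value):
    """Return word with the field start:length (bit 0 = leftmost) set to value."""
    if not 0 <= value < (1 << length):
        raise ValueError("value does not fit in the field")
    shift = 16 - (start + length)
    mask = ((1 << length) - 1) << shift
    return (word & ~mask & 0xFFFF) | (value << shift)

# A new ASCII file with fixed-length records, using the values from Table 2.
foptions = 0
foptions = set_field(foptions, 14, 2, 0b00)  # new file
foptions = set_field(foptions, 13, 1, 1)     # ASCII
foptions = set_field(foptions, 8, 2, 0b00)   # fixed-length records
print(format(foptions, "016b"))  # 0000000000000100

# An old permanent file: only the domain bits (14:2) need to be set.
old_perm = set_field(0, 14, 2, 0b01)
print(old_perm)  # 1
```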
Returning to Table 1, the third parameter
that is passed to FOPEN is another 16-bit word called the AOPTIONS word.
Once again, this is a binary array in which each bit specifies something
about how the file is to be accessed. Table 3
contains some of the values found in the AOPTIONS array.
Bits 12:4 determine whether the file is to be opened for READ access
(0000) or WRITE access (0001). There are other combinations that are used
to support direct access with FREADDIR and FWRITEDIR (0100) or FUPDATE (0101).
Bits 8:2 determine whether and how the file can be shared among other processes
on the system.
Table 3: AOPTIONS
Bits | Value | Meaning | Comment |
8:2 | 01 | Exclusive | The process opening the file demands exclusive access to the file. Other processes will not be able to access it while you have it open. |
8:2 | 10 | Exclusive allow read | The process opening the file is the only one that can write to the file. Others can read the file while you have it open. |
8:2 | 11 | Share | The file is shared. Others can read or write the file. The FLOCK intrinsic should be used to avoid corruption. |
10:1 | 0 | No FLOCK allowed | |
10:1 | 1 | FLOCK is allowed | |
12:4 | 0000 | Read only | This file is opened for read access (FREAD is used to read the file). |
12:4 | 0001 | Write only | This file is opened for write access (FWRITE is used to write new records to the file). |
12:4 | 0011 | Append only | This file is opened for append access (like write access, but records are appended to the end of the file instead of replacing them). |
12:4 | 0100 | Read/Write | This file is opened for read and write access (FREADDIR and FWRITEDIR will be used to access the file). |
12:4 | 0101 | Update | This file is opened for update access (FUPDATE is used). |
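Going the other way, an AOPTIONS word can be decoded back into the settings listed in Table 3. The Python sketch below does this for the three fields shown in the table. The function name describe_aoptions is hypothetical, and any bit combination that Table 3 does not list is simply reported as unlisted rather than guessed at.

```python
# Values taken from Table 3; combinations not shown there are reported as unlisted.
ACCESS_TYPES = {
    0b0000: "read only",
    0b0001: "write only",
    0b0011: "append only",
    0b0100: "read/write",
    0b0101: "update",
}
SHARING = {0b01: "exclusive", 0b10: "exclusive allow read", 0b11: "share"}

def describe_aoptions(word):
    """Decode the Table 3 fields of a 16-bit AOPTIONS word (bit 0 = leftmost)."""
    access = word & 0b1111        # bits 12:4 are the rightmost four bits
    flock = (word >> 5) & 0b1     # bit 10:1
    sharing = (word >> 6) & 0b11  # bits 8:2
    return {
        "access": ACCESS_TYPES.get(access, "not listed in Table 3"),
        "flock_allowed": bool(flock),
        "sharing": SHARING.get(sharing, "not listed in Table 3"),
    }

# Shared file, FLOCK allowed, opened for update.
shared_update = (0b11 << 6) | (1 << 5) | 0b0101
print(describe_aoptions(shared_update))
print(describe_aoptions(0b0000))
```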
FOPEN and HPFOPEN
It's worth noting that the intrinsics have evolved over time as the HP
3000 has evolved. For example, there are two different MPE/iX intrinsics
that open files: FOPEN and HPFOPEN.
The FOPEN intrinsic dates back to the original models of the HP 3000--the
16-bit so-called "classic" systems. When HP introduced the newer
32-bit PA-RISC systems, support for FOPEN continued as part of the strategy
to maintain compatibility with the older models. FOPEN is a 16-bit compatibility
mode routine. As such, it is typically used by 16-bit compatibility mode
application programs that were ported from the classic environment.
HPFOPEN is a part of the PA-RISC version of MPE. It does not appear on
the older classic systems. The PA-RISC version of MPE was originally called
MPE XL, and later renamed MPE/iX. MPE/iX includes both intrinsics: HPFOPEN
and FOPEN.
Both intrinsics fundamentally serve the same purpose: They open files.
But the functionality provided by HPFOPEN is a superset of the functionality
supported by the older FOPEN intrinsic. The compatibility mode FOPEN intrinsic
provides basically the same functionality that was available on the classic 16-bit
models. To use many of the new features of the file system that have been
implemented on MPE/iX, you must use the native mode HPFOPEN intrinsic.
We've seen that when you compile an ANSI standard COBOL program, the
compiler will generate intrinsic calls for you. If you compile a program
on an old 16-bit classic system, the compiler will only generate calls to
16-bit compatibility mode intrinsics such as FOPEN. On newer PA-RISC models
of the HP 3000, the situation is more complex. For one thing, depending
on the language you are using, you may have your choice of at least two
different compilers.
MPE/iX supports compatibility mode compilers such as the COBOLII compiler.
These compilers generate 16-bit machine code suitable for execution either
on classic HP 3000s or on PA-RISC models. MPE/iX also supports native mode
compilers such as COB85XL. These compilers generate 32-bit machine code
suitable for execution only on the PA-RISC machines.
If you compile a program that opens files, the compatibility mode COBOL
compiler will generate calls to FOPEN, but the native mode compiler will
generate calls to HPFOPEN. Virtually all HP 3000 applications use intrinsics.
Even if a program doesn't call an intrinsic explicitly, it's a pretty good
bet that it will call a number of them implicitly. For that reason,
it's a lot easier to troubleshoot applications
if you have a working knowledge of the MPE/iX intrinsics.
Table 4 contains a summary of the most
commonly used file system intrinsics on the HP 3000. We've seen how FOPEN
and HPFOPEN are used to open files. Next we're going to explore some of
the other intrinsics.
Table 4: File System Intrinsics
Some Basic File System Intrinsics | Description |
FOPEN |
This intrinsic is found on all models of the HP 3000, including the older
16-bit MPE/V models and the newer 32-bit PA-RISC machines based on MPE/iX.
Its purpose is to open files. This intrinsic originated on the older 16-bit
systems, and supports all the functionality of the MPE/V file system. The
MPE/iX file system is a superset of the MPE/V file system. Therefore, if
an application program uses FOPEN to open files on an MPE/iX machine, it
may not be able to use the full functionality of the MPE/iX file system.
When a file is opened using FOPEN, the application program calling FOPEN
must specify whether the file is being opened for INPUT or for OUTPUT.
If the intrinsic executes successfully, it returns a unique file number
to the calling program. This number is used in subsequent calls to other
file system intrinsics, such as FREAD and FWRITE. |
HPFOPEN |
This intrinsic is implemented in MPE/iX on all PA-RISC models of the HP
3000. Like FOPEN, it is used to open files. Unlike FOPEN, it supports the
full functionality of the MPE/iX file system. HPFOPEN can be thought of
as the "native mode" intrinsic for opening files on an MPE/iX
system. The older FOPEN intrinsic is there primarily to provide compatibility
for applications from older MPE/V systems. (If an MPE/iX application program
calls the older FOPEN intrinsic on an MPE/iX system, FOPEN will do little
more than invoke HPFOPEN, which is what actually opens the file.)
When a file is opened using HPFOPEN, the application program calling HPFOPEN
must specify whether the file is being opened for INPUT or for OUTPUT.
If the intrinsic executes successfully, it returns a unique file number
to the calling program. This number is used in subsequent calls to other
file system intrinsics, such as FREAD and FWRITE. |
FREAD |
This intrinsic reads the next record in sequential order from an input
file that has been opened using FOPEN or HPFOPEN. FREAD is typically used
for sequential access. |
FWRITE |
This intrinsic writes the next record in sequential order to an output
file that has been opened using FOPEN or HPFOPEN. FWRITE is typically used
for sequential access. |
FREADDIR |
This intrinsic reads a specific record from an input file that has been
opened using FOPEN or HPFOPEN. FREADDIR is used for direct access. For
example, FREADDIR could be used to read the 100th record from a file without
first having to read the 99 records that precede it. |
FWRITEDIR |
This intrinsic can be used to write a specific record to an output file
that has been opened using FOPEN or HPFOPEN. Like FREADDIR, the FWRITEDIR
intrinsic is typically used for direct access to records in an output file.
For example, FWRITEDIR could be used to update the 100th record in a file
without first having to access the 99 records that precede it. |
FCLOSE |
This intrinsic closes files that have previously been opened using FOPEN
or HPFOPEN. |
Files and Databases
When designing an application for the HP 3000, you must decide whether
to store the application's data in files or in databases. These days, most
commercial applications use databases to store critical user data. The advantages
of databases are well known, and we'll be discussing them in future articles
in this series, when we explore HP 3000 databases (particularly IMAGE/SQL)
in detail. But for the present, we're going to focus on what can be done
with ordinary files. In spite of the superior recoverability, security,
and versatility offered by databases, ordinary files still have their place
and are still used by many HP 3000 applications.
We've seen how FOPEN and HPFOPEN can be used to open files for access.
The FCLOSE intrinsic is used to close files when an application has finished
accessing them. There's only one FCLOSE intrinsic; it is used regardless
of whether the files were opened with FOPEN or with HPFOPEN. The actual
reading and writing of files is handled with intrinsics called FREAD, FWRITE,
FREADDIR, and FWRITEDIR. Next we will see when each of these is used.
The most common way to access files on the HP 3000 is sequentially. To
access a file sequentially, open the file for input access using either
FOPEN or HPFOPEN. Then READ the file, one record at a time, using repeated
calls to FREAD. The first read operation retrieves the first record from
the file. Subsequent read operations retrieve the second record, the third,
the fourth, and so on until the end of file is reached. At that point, another
call to FREAD will return an "end of file" condition. This is
a signal to the application program that all the records in the file have
been accessed and the file should now be closed.
Sequential access to an output file works in much the same way, but with
one important difference. Opening a file for sequential output access effectively
erases any data the file contains. After opening the file for sequential
output access, a program's first call to FWRITE creates the first (and at
that point, the only) record in the file. Subsequent calls to FWRITE will
append additional records after the first one. When the file is closed,
the file will contain the records that were placed there by the calls to
FWRITE, in the order in which they were written.
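The behavior described in the last two paragraphs can be modeled in a few lines. The Python class below is a toy, in-memory stand-in for a sequentially accessed file, written only to illustrate the pattern. SequentialFile and its methods are hypothetical names, not MPE intrinsics, and returning None is simply this sketch's way of signaling the "end of file" condition.

```python
class SequentialFile:
    """Toy in-memory model of a sequentially accessed file (not an MPE API)."""

    def __init__(self, records=None):
        self.records = list(records or [])
        self.position = 0

    def open_for_input(self):
        self.position = 0              # reading starts at the first record

    def open_for_output(self):
        self.records = []              # existing data is effectively erased
        self.position = 0

    def read(self):
        """Return the next record, or None once the end of file is reached."""
        if self.position >= len(self.records):
            return None
        record = self.records[self.position]
        self.position += 1
        return record

    def write(self, record):
        self.records.append(record)    # each write lands after the previous one

f = SequentialFile(["old 1", "old 2"])
f.open_for_output()
for text in ("new 1", "new 2", "new 3"):
    f.write(text)

f.open_for_input()
seen = []
while True:
    record = f.read()
    if record is None:                 # end of file: stop reading
        break
    seen.append(record)
print(seen)  # ['new 1', 'new 2', 'new 3']
```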
Sequentially accessed files are widely used on the HP 3000. They are
most often found in batch environments and in large sorts.
Flat Files: Direct Access
Ordinary MPE files also provide you with another useful capability: direct
access. The intrinsics FREADDIR and FWRITEDIR can be used to access the
records in a file directly, using a relative record number. The best way
to explain the power of direct access is with an example.
Imagine a large table of 10,000 rows. Suppose that the whole table is
stored in a file on the HP 3000 so that each row is represented by one record
of the file. The file could be accessed sequentially as we've seen earlier.
For batch applications, sequential access would be appropriate, because
batch applications typically act on all the rows of the table. But what
about online applications? Users of online applications usually need to
select one or more rows from the table and then act on them. Suppose a user
wants to access the 9,999th row of a table. Sequential access means that
in order to access the 9,999th record in the table, you'd have to read the
9,998 entries that precede it. From a performance perspective alone, this
is totally unacceptable.
But with direct access, the application program can simply specify the
number of the row (record) that it's interested in. For direct-read access,
the FREADDIR intrinsic will retrieve the contents of a specified record.
Similarly, for direct-write access, the FWRITEDIR intrinsic will update
the contents of the specified record (without affecting other records in
the file).
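Direct access is fast because, with fixed-length records, the location of a record can be computed from its number. The Python sketch below illustrates that idea using an in-memory byte stream in place of a disc file. The 16-byte record size, the helper names read_record and write_record, and the choice to count record numbers from zero are all assumptions made for this example.

```python
import io

RECSIZE = 16  # bytes per record; an arbitrary size chosen for this sketch

def read_record(f, recnum):
    """Read one fixed-length record by its relative record number."""
    f.seek(recnum * RECSIZE)           # jump straight to the record
    return f.read(RECSIZE)

def write_record(f, recnum, data):
    """Overwrite one record in place without touching its neighbors."""
    if len(data) > RECSIZE:
        raise ValueError("record too long")
    f.seek(recnum * RECSIZE)
    f.write(data.ljust(RECSIZE))       # pad with spaces to the full record size

# Build a 10,000-row table, one row per record.
f = io.BytesIO()
for n in range(10000):
    f.write(f"row {n}".encode().ljust(RECSIZE))

print(read_record(f, 9998).rstrip())   # b'row 9998'
write_record(f, 9998, b"updated")
print(read_record(f, 9998).rstrip())   # b'updated'
print(read_record(f, 9997).rstrip())   # b'row 9997'
```

No matter which record is requested, the work is one seek and one read; none of the preceding records are touched.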
Direct access provides a very fast means of accessing data directly,
although there is one very important (and fairly obvious) limitation. Records
must be accessed by their record number. In other words, if you want to
access the 975th record in a file, you have to know that the one you want
is the 975th one in the file. You cannot tell FREADDIR to find the record
containing the name "John Smith." Direct access does not provide
you with any kind of key beyond the record number. Keyed access is provided
by using another kind of file called a KSAM file, or by using a database.
KSAM Files
KSAM is an acronym that stands for "Keyed Sequential Access Method."
The original HP 3000 implementation of KSAM is similar in many respects
to the keyed access methods found on UNIX systems and on older IBM mainframes
(ISAM and VSAM). KSAM files can be accessed sequentially, just like ordinary
files. But they can also be read or written using keyed access.
Keyed access allows an application to select a particular record from
a file and read it directly. Unlike direct access, which requires that the
application program select the desired record by a relative record number,
KSAM files allow you to use a key value. For example, instead of selecting
the 9,999th record, you'd be able to select the "John Smith" record
without having to know the number of the record that contains John's data.
KSAM was originally implemented on MPE/V systems. This implementation
of KSAM is also supported on MPE/iX systems--where it is known as compatibility
mode KSAM, or CM KSAM. A CM KSAM file is actually two files: a key file
and a data file. The data file contains the data. The key file contains
key values that can be used to access records in the data file. CM KSAM
files are created using a utility program called KSAMUTIL. This utility
program is also used to synchronize the key and data files, which can become
corrupted by system aborts.
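The split between a key file and a data file can be pictured with a very small model. In the Python sketch below, a list stands in for the data file and a dictionary stands in for the key file, mapping each key value to a record number. This is an illustration of the idea only. The class KeyedFile and its methods are hypothetical, and real KSAM key files are far more elaborate than a dictionary (this toy, for instance, keeps only the last record written for a given key).

```python
class KeyedFile:
    """Toy model of keyed access: a data list plus a key-to-record-number index."""

    def __init__(self, key_field):
        self.key_field = key_field
        self.data = []     # stands in for the data file
        self.keys = {}     # stands in for the key file

    def write(self, record):
        self.keys[record[self.key_field]] = len(self.data)
        self.data.append(record)

    def read_by_key(self, key):
        """Look up the record number in the key file, then fetch the data."""
        recnum = self.keys.get(key)
        return None if recnum is None else self.data[recnum]

    def rebuild_keys(self):
        """Recreate the key file from the data file, as a repair utility might."""
        self.keys = {rec[self.key_field]: n for n, rec in enumerate(self.data)}

customers = KeyedFile("name")
customers.write({"name": "Mary Jones", "city": "Boise"})
customers.write({"name": "John Smith", "city": "Cupertino"})

print(customers.read_by_key("John Smith"))  # {'name': 'John Smith', 'city': 'Cupertino'}

customers.keys = {}                          # simulate a damaged key file
customers.rebuild_keys()
print(customers.read_by_key("Mary Jones"))  # {'name': 'Mary Jones', 'city': 'Boise'}
```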
In the early 1990s, HP brought a native mode version of KSAM (called
NM KSAM) to MPE/iX. The native mode version boasts better recoverability
than the older CM version. Currently, both versions of KSAM are supported
on MPE/iX.
We've seen three different ways to access files in this article: sequential,
direct, and keyed. Next month we're going to move beyond files, and begin
to explore HP 3000 databases.
George Stachnik works in technical training in HP's Network Server Division.