COMMUNICATOR 3000 MPE/iX Release 5.0 (Core Software Release X.50.20)
Spool File Recovery At Boot Time
by Larry Byler
Commercial Systems Division
Understanding the Process
The input SPool File DIRectory (SPFDIR) and the output SPFDIR are two
Known System Objects (KSOs) created by Progen near the end of the system
startup (boot) process, and filled with information from spool files in
IN.HPSPOOL and OUT.HPSPOOL. This provides a run-time "cache" for spool
file management.
The time necessary to create and fill these structures depends on the
number of files in each HPSPOOL group, because each file needs to be
examined. After examination, a few files are not recovered, but this
article does not discuss those files. The required time is also a
function of the distribution of (output) spool files among various device
queues (LP, CIPER, PP, and so on), with the worst case being all files on
a single queue (for example, LP).
Currently, the Native Mode Spooler has a limit of 10000 files in either
SPFDIR. Internal overhead reduces this capacity to slightly more than
9900 spool files. In lab tests using 9000 output spool files directed to
a single device queue (LP), recovering the output SPFDIR took 53 minutes
of system boot time. This article describes an enhancement that improves
this recovery time.
NOTE Input spool files usually consist entirely of job $STDIN files.
The only other input spool file is the :DATA file, rarely used
anymore. The number of input spool files is typically so small
that the time spent recovering them to the input SPFDIR is not
significant. Therefore, the process of recovering the input SPFDIR
was not changed, and the remaining discussion confines itself to
output spool file recovery.
Improvements to the Process
General System Performance.
The original 53 minutes was distributed as follows:
* Two minutes to assemble the list of 9000 spool files in
OUT.HPSPOOL (an internal LISTF).
* 51 minutes to recover the output SPFDIR from that list.
The time spent in the spooler subsystem portion of the boot process is
now just those first two minutes, but the other 51 minutes have not
totally disappeared. A local enhancement, which takes advantage of
internal SPFDIR organization, reduces the 51 minutes to no more than
eighteen minutes (often less, depending on hardware configuration, and
definitely less when there are fewer SPFDIR entries to recover).
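The figures above can be checked with a little arithmetic. The following is a minimal sketch in Python; the constant and function names are illustrative only, and it assumes the eighteen-minute figure is the worst case for the recovery portion alone, as measured in the lab.

```python
# Figures quoted in this article (minutes, worst-case lab system,
# 9000 output spool files on a single LP queue).
LISTF_MINUTES = 2           # assemble the list of files in OUT.HPSPOOL
OLD_RECOVERY_MINUTES = 51   # original recovery of the output SPFDIR
NEW_RECOVERY_MINUTES = 18   # upper bound observed after the enhancement


def boot_delay_before() -> int:
    """Spooler share of boot time when recovery ran inline."""
    return LISTF_MINUTES + OLD_RECOVERY_MINUTES


def boot_delay_after() -> int:
    """Spooler share of boot time once recovery is no longer inline."""
    return LISTF_MINUTES


def time_until_spfdir_complete() -> int:
    """Listing plus recovery, assuming the lab worst case."""
    return LISTF_MINUTES + NEW_RECOVERY_MINUTES
```

Under these assumptions the boot path waits 2 minutes instead of 53, and the output SPFDIR is complete roughly 20 minutes after the listing starts.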
All recovery, other than the first two minutes, now takes place in a
separate system process created at CBASE priority under Progen. This
process is a privileged NL procedure running in a process environment.
It continues until all output spool files have been recovered into the
output SPFDIR (or discarded, if they cannot or should not be recovered),
and then terminates. The boot process completes in parallel with the
SPFDIR recovery process, and the system then becomes available to users.
The SPFDIR recovery process then competes with other system processes and
user processes for access to the CPU and the output SPFDIR.
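The division of work described above can be pictured with a small model. This is only a sketch in Python, using a thread in place of the MPE/iX system process; every name in it (list_spool_files, recover_entries, boot) is hypothetical and does not correspond to any real spooler routine.

```python
import threading


def list_spool_files(count):
    """Stand-in for the internal LISTF of OUT.HPSPOOL (hypothetical)."""
    return [f"O{n}" for n in range(1, count + 1)]


def recover_entries(names, spfdir, lock, done):
    """Background worker: add one SPFDIR entry per file, then signal."""
    for name in names:
        with lock:                      # the SPFDIR is shared with users
            spfdir[name] = {"state": "READY"}
    done.set()


def boot(count):
    """Do the quick listing inline, start the worker, return at once."""
    spfdir = {}
    lock = threading.Lock()
    done = threading.Event()
    names = list_spool_files(count)
    worker = threading.Thread(
        target=recover_entries, args=(names, spfdir, lock, done)
    )
    worker.start()
    # Control returns here while the worker may still be running, which
    # is the analogue of the system becoming available to users.
    return spfdir, lock, done, worker
```

A caller that needs the complete directory can wait on the event (or join the worker); everything else proceeds against whatever entries exist so far.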
Although we observed eighteen minutes of recovery time in a lab
environment, you should be aware of the following considerations that may
result in different recovery times in your environment:
* The eighteen minutes occurred in a worst-case lab environment,
namely:
* Minimum memory (24 Mbytes),
* Low-performance CPU (series 930) and discs (7935),
* Almost maximum number of spool files (9000),
* One device queue (LP).
Improving any of these factors should improve overall recovery
time. For example, more main memory decreases the effects of poor
locality (swapping).
* Although the recovery process is a child of Progen, and therefore
is a system process, it is created in the CS queue. This means
that it competes with user processes once the system is available.
Because it is in the CS queue, it is subject to being time-sliced;
however, as a system process its priority does not decay. If the
recovery process is favored most of the time, user response time
may be degraded. If not, spool file recovery time will be
prolonged.
In early factory tests, again on a worst-case system, user
processes competed reasonably against the recovery process until
the SPFDIR file count approached 6000. From then on, user process
response time degraded badly.
To deal with this, the recovery process now pauses one second for
every 200 SPFDIR entries it recovers. This forces the Dispatcher
to give more consideration to user processes, thus improving
response time. The tradeoff is that the recovery process takes
longer to complete.
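The cost of the pause can be estimated directly. The sketch below, in Python, models a loop that pauses one second after every 200 recovered entries; the function names and the injectable sleep parameter are assumptions made for illustration, not part of the spooler.

```python
import time

PAUSE_EVERY = 200     # entries recovered between pauses (from the article)
PAUSE_SECONDS = 1.0   # length of each pause (from the article)


def recover_with_pauses(names, spfdir, sleep=time.sleep):
    """Recover entries, pausing after every PAUSE_EVERY of them.

    Returns the number of pauses taken.
    """
    pauses = 0
    for recovered, name in enumerate(names, start=1):
        spfdir[name] = True             # placeholder for the real work
        if recovered % PAUSE_EVERY == 0:
            sleep(PAUSE_SECONDS)        # let user processes run
            pauses += 1
    return pauses


def added_pause_seconds(entry_count):
    """Total pause time the throttle adds for entry_count entries."""
    return (entry_count // PAUSE_EVERY) * PAUSE_SECONDS
```

In this model, 9000 entries produce 45 pauses, or about 45 seconds of added elapsed time, which is small next to a recovery measured in minutes.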
Specific User Considerations.
Depending on the numbers involved, there may be a period after the system
is available to users when some spool files in OUT.HPSPOOL do not have an
entry in the output SPFDIR. This section describes the features of the
spooling subsystem that are affected during this period, and how they are
affected or restricted.
All restrictions described below are with respect to SPFDIR recovery by
the recovery process. Existing capability, resource limits, and security
restrictions have not changed. Once the recovery process has terminated,
all existing features of the spooling subsystem are fully available, as
in the past.
While the SPFDIR recovery process is running:
* Users can stream jobs. No restrictions.
* Jobs can log on (that is, $STDLISTs can be created). No
restrictions.
* Spooler processes can open, print, and delete spool files,
provided the SPFDIR entry exists. Spool files that are in
OUT.HPSPOOL but whose SPFDIR entries have not yet been recovered
are not selected for printing.
* Recovery of an SPFDIR entry by the recovery process will not wake
an idle spooler process, even if the entry's priority is above the
outfence. There are several methods for dealing with such a
situation:
* Do nothing. Eventually a user creates a new spool file
destined for the device managed by the idle spooler. When
that spool file enters the READY state, the spooler is
notified. It then prints all available files above the
outfence.
* Wake the process with a command such as SPOOLER 6; SUSPEND
followed by SPOOLER 6; RESUME. To use this command, you must:
1. be at the system console,
2. have been ALLOWed the SPOOLER command, or
3. have associated a class that includes LDEV 6.
* Any user with access to a newly-recovered #O1369, for
example, can wake the spooler process for LDEV 6 with
SPOOLF #O1369;DEV=6 (assuming that #O1369's priority is
above the system (or device) outfence). See below for
restrictions on the use of the SPOOLF command while the
recovery process is recovering SPFDIR entries.
* All output spool file management commands (ALTSPOOLFILE,
DELETESPOOLFILE, LISTSPF, SHOWOUT, and SPOOLF) are available with
restrictions. These are described next.
Behavior varies depending on whether the command argument is a
single spoolid or list of specific spoolids (for example, LISTSPF
#O8072 or LISTSPF (#O8072, #O7963, #O8010)), or an argument that
resolves to a wildcarded fileset (such as LISTSPF O@ or SHOWOUT
SP;JOB=@). Note that ALTSPOOLFILE and DELETESPOOLFILE only accept
a single spoolid argument.
* For the single spool file argument, a file in OUT.HPSPOOL
whose SPFDIR entry is not yet recovered is treated the
same as a nonexistent spool file. CIWARN 4563 is
returned:
Spoolfile "!" either does not exist on the system, or you have
insufficient capabilities to access it. (CIWARN 4563)
Note that a LISTF of this <Onnnn>.OUT.HPSPOOL displays the
filename. The spool file does exist; only its SPFDIR entry
does not, as yet.
* The two AIFs AIFSPFGET and AIFSPFPUT only deal with one
spool file at a time, so they behave similarly to the above
commands with a single argument. If either AIF finds a
spool file with no corresponding SPFDIR entry, it returns
error -8039 (Cannot find the spool file).
* LISTSPF O@..., or a similar SHOWOUT command that results in
a multiple spool file argument, snapshots only those SPFDIR
entries that exist at the time the command is entered.
* The SPOOLF O@; ALTER and SPOOLF O@; DELETE forms of the
SPOOLF command are disallowed, and a new message:
'SPOOLF ;ALTER' or 'SPOOLF ;DELETE' of a wildcarded fileset is
disabled until the output spoolfile directory has been rebuilt
following a system startup. (CIWARN 4652)
is displayed to any user attempting either of these
commands. When the output SPFDIR is fully recovered, the
following message:
The system has finished rebuilding the output spoolfile directory.
is displayed on the system console. It is a Green (no
change) event for OpenView console users. The above forms
of the SPOOLF command are then re-enabled.
* The SPOOLF...; PRINT command is not affected by the SPFDIR
recovery process.
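The rules above for single-spoolid and wildcarded arguments can be summarized in a toy model. This Python sketch is not the spooler's implementation; the class and method names are hypothetical, and only the CIWARN 4652 number is taken from the article.

```python
WILDCARD_DISABLED = "CIWARN 4652"   # warning number quoted in the article


class OutputSpfdirModel:
    """Toy model of the output SPFDIR during and after recovery."""

    def __init__(self, files_in_out_hpspool):
        self.on_disc = set(files_in_out_hpspool)   # files that exist
        self.entries = set()                       # recovered entries
        self.recovery_running = True

    def recover(self, spoolid):
        """The recovery process adds one entry."""
        if spoolid in self.on_disc:
            self.entries.add(spoolid)

    def finish(self):
        """The recovery process terminates."""
        self.recovery_running = False

    def lookup(self, spoolid):
        """Single spoolid: an unrecovered file looks nonexistent."""
        return spoolid in self.entries

    def list_wildcard(self):
        """Wildcard listing: snapshot of the entries present right now."""
        return sorted(self.entries)

    def delete_wildcard(self):
        """Wildcarded ;DELETE: refused while recovery is running."""
        if self.recovery_running:
            return WILDCARD_DISABLED
        count = len(self.entries)
        self.on_disc -= self.entries
        self.entries.clear()
        return count
```

The model makes the key distinction visible: a file can be present on disc yet invisible to every command that goes through the directory.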
Other Considerations.
Work file HPDISU00.PUB.SYS.
Progen creates a file in the permanent domain, HPDISU00.PUB.SYS, for use
by the recovery process. The recovery process purges this file before it
terminates.
IMPORTANT If HPDISU00.PUB.SYS exists at system startup, Progen purges it
in order to create the new one needed by the recovery process.
Do not create a permanent file with this name.
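The life cycle of the work file, and the reason a user file with the same name is lost, can be illustrated as follows. This is a sketch in Python against an ordinary temporary directory, not MPE/iX file system code; the function names are hypothetical.

```python
import tempfile
from pathlib import Path


def start_recovery(work_file: Path) -> None:
    """Startup side: purge any leftover file, then create a fresh one."""
    if work_file.exists():
        work_file.unlink()        # contents of a pre-existing file are lost
    work_file.write_bytes(b"")


def finish_recovery(work_file: Path) -> None:
    """Recovery side: purge the work file before terminating."""
    work_file.unlink(missing_ok=True)


def demo():
    """Show what happens to a user file that uses the reserved name."""
    with tempfile.TemporaryDirectory() as directory:
        work = Path(directory) / "HPDISU00"
        work.write_text("user data")          # user file, reserved name
        start_recovery(work)
        user_data_survived = work.read_bytes() != b""
        finish_recovery(work)
        return user_data_survived, work.exists()
```

Here demo() returns (False, False): the user's data does not survive startup, and no file remains once recovery has finished.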