Primary System Failure [ Silhouette Reference Manual ] MPE/iX 5.0 Documentation
Silhouette Reference Manual
Primary System Failure
When the primary system fails, the Secondary Communication Process (SP)
on the secondary system is usually aborted by MPE and the Recovery
Process (RP) learns of the problem through missed heartbeats. The RP
informs the Silhouette Operator the first two times it misses a heartbeat
and periodically thereafter until corrective action is taken. Figure 3-5
shows what occurs as the systems prepare for a SWITCH.
Figure 3-5. Preparing for a >SWITCH Command
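The notification rule described above (report the first two missed
heartbeats, then report periodically) can be sketched as follows. This is
an illustration only, not Silhouette code: the function name and the
repeat_every interval are assumptions, since the manual says only
"periodically".

```python
def should_notify(missed_count: int, repeat_every: int = 10) -> bool:
    """Decide whether to tell the operator about a missed heartbeat.

    missed_count is the number of consecutive heartbeats missed so far,
    counting from 1. repeat_every is a hypothetical interval; the manual
    does not state how often the periodic reminders occur.
    """
    if missed_count < 1:
        return False
    if missed_count <= 2:
        # The first two misses are always reported.
        return True
    # After that, report once every repeat_every further misses.
    return (missed_count - 2) % repeat_every == 0


# Which of the first 24 consecutive misses would produce a message.
notified = [n for n in range(1, 25) if should_notify(n)]
```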
The Silhouette Operator on the secondary system has the option of
switching systems so that the existing secondary system becomes the new
primary system.
To do this, the Silhouette Operator on the secondary system issues a
>SWITCH command to the Silhouette Manager Program (MP) on that system.
The RP takes the tape written on the primary before it failed (a tape
exists only if the primary system exceeded the high water mark) and uses
it to bring the secondary database and logfile up to date with the
primary as of the time of the failure. Since Silhouette
uses record numbers, the RP is able to avoid duplicating information it
has already processed. Once the RP completes the update task, the
Silhouette Operator can start a new logging cycle with or without a
database backup (depending on time constraints) and all primary system
activity can be switched to the secondary system where processing can
continue as if nothing had occurred.
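The way record numbers let the RP skip information it has already
processed can be sketched as below. This is a simplified model, not the
actual RP logic: the SecondaryState class, the apply_tape function, and
the assumption that record numbers increase monotonically are all
hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SecondaryState:
    """Hypothetical model of the secondary system's logfile position."""
    last_applied: int = 0                 # highest record number applied
    logfile: list = field(default_factory=list)


def apply_tape(state: SecondaryState, tape_records) -> int:
    """Apply (record_number, payload) pairs, skipping duplicates.

    Returns the number of records actually applied. Assumes record
    numbers only ever increase, which the manual does not spell out.
    """
    applied = 0
    for number, payload in tape_records:
        if number <= state.last_applied:
            # Already received over the line before the failure.
            continue
        state.logfile.append((number, payload))
        state.last_applied = number
        applied += 1
    return applied


# The secondary already holds records 1-3; the tape overlaps at 2 and 3.
state = SecondaryState(last_applied=3,
                       logfile=[(1, "a"), (2, "b"), (3, "c")])
count = apply_tape(state, [(2, "b"), (3, "c"), (4, "d"), (5, "e")])
```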
The old secondary now functions as the new primary. The Silhouette
Operator issues the >START command to the MP, which creates a CP on the
new primary. The new CP soon discovers that there are no communication
lines, or that there is no reply to heartbeats, and then creates a
Silhouette Tape Process (TP) on the new primary. This new TP
waits until the high water mark is exceeded and then writes records
directly from the logfile to tape. Figure 3-6 shows what occurs after
the >SWITCH command is issued.
Figure 3-6. After the >SWITCH
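The TP behavior described above (wait until the high water mark is
exceeded, then write logfile records to tape) can be sketched as below.
This is a rough model under stated assumptions: the manual does not say
what the high water mark measures, so the sketch treats it as a count of
logfile records not yet sent to the other system, and it assumes the
whole backlog is written at once. The drain_to_tape name is hypothetical.

```python
def drain_to_tape(pending: list, tape: list, high_water_mark: int) -> int:
    """Write the backlog to tape once it exceeds the high water mark.

    pending holds logfile records not yet delivered to the other system
    (an assumed interpretation of the high water mark). Returns the
    number of records written to tape on this call.
    """
    if len(pending) <= high_water_mark:
        # Mark not exceeded yet: keep waiting, write nothing.
        return 0
    written = len(pending)
    tape.extend(pending)   # records go to tape in logfile order
    pending.clear()
    return written


pending, tape = [1, 2, 3], []
first = drain_to_tape(pending, tape, high_water_mark=3)   # at the mark
pending.append(4)
second = drain_to_tape(pending, tape, high_water_mark=3)  # above the mark
```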
The old secondary system is now the new Silhouette primary system. If a
system failure occurs, all users need to do is switch their terminals to
the secondary system and wait for the Silhouette Operator to tell them
that processing may begin again.
Figure 3-7 shows that when the original primary system is restored to
operation, the configuration can be left as is, with the original primary
functioning as the new secondary. The Silhouette software automatically
brings the logfile and database of the original primary into line with
the new primary.
Figure 3-7. Reverse Duplication
If it is necessary to return to the original primary/secondary
relationship, the Silhouette Operator on the new primary issues the >STOP
command. The roles can be reversed once all transactions have been sent
from the new primary to the new secondary and no new transactions are
being applied to the database. The Operator on the new secondary then
issues a >START command and the systems revert to their original
functions. The user terminals are returned to the original primary system
and processing continues as it did before the failure.
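The condition for reversing the roles can be expressed as a simple check.
This is a sketch only: the function and parameter names are hypothetical,
and comparing record numbers is an assumed way of confirming that every
transaction has been sent.

```python
def can_reverse_roles(last_logged: int, last_sent: int,
                      accepting_transactions: bool) -> bool:
    """Return True when the two systems may swap roles again.

    last_logged is the highest record number in the new primary's
    logfile; last_sent is the highest record number delivered to the new
    secondary. Both names are illustrative, not Silhouette terms.
    """
    all_sent = last_sent == last_logged
    return all_sent and not accepting_transactions


ready = can_reverse_roles(last_logged=500, last_sent=500,
                          accepting_transactions=False)
not_ready = can_reverse_roles(last_logged=500, last_sent=498,
                              accepting_transactions=False)
```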
To allow this feature of Silhouette to work smoothly, the application
software must be duplicated on the secondary system and there must be a
way of easily transferring terminals from one system to another.