HP Part Number: 5958-5824-2627
Published: July 1, 1986
Notice
A high volume of calls placed to the Response Center involve terminals that do
not respond. Oftentimes it is difficult to determine if the problem is a
hardware problem with the terminal or communications equipment, a software
problem with the program running on the terminal causing it to "hang", or an
error detected by the software in the system terminal controller. On the HP 3000
(series 30-70), the Advanced Terminal Processor (ATP) and the Asynchronous Data
Communications Controller (ADCC) have the ability to mark a terminal port as
"broken" if they encounter errors from which they cannot recover. This is
sometimes referred to as a "port failure".
This Application Note discusses how to isolate such terminal problems, how to
handle ATP and ADCC port failures, how to use the Online Diagnostic/Support
Monitor (TERMDSM), and what information you should collect before calling the
Response Center for a port-related problem. Although a WARMSTART will, in most
cases, clear hung or broken ports, it is often possible to do so without
resorting to a system restart. This guide is intended to minimize port and
system downtime and to maximize the effectiveness of PICS calls.
The first step is to determine whether your terminal port is hung or if it is
considered "broken" by the software. If you receive a console message such as
the following:
ATP FAILURE 6810 ON LDEV nn. RUN TERMDSM TO ANALYZE FAILURE.
or
ADCC FAILURE 9518 ON LDEV nn. RUN TERMDSM TO ANALYZE FAILURE.
then you have a broken port -- refer to Section I, "Handling Port Failures".
If you do not receive a console message then you need to follow the steps in
Section II, "Handling Port Hangs", to isolate the problem.
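If console output is captured to a file or printer log, the check above can be automated. The following Python sketch is illustrative only: the function name parse_port_failure is hypothetical, and the pattern assumes every failure message follows the layout of the two examples quoted above, with a numeric failure code and a numeric LDEV.

```python
import re

# Pattern modelled on the two console messages quoted in this note.
# Assumption: the failure code and the LDEV are always decimal digits.
FAILURE_RE = re.compile(r"\b(ATP|ADCC) FAILURE (\d+) ON LDEV (\d+)\.")


def parse_port_failure(line):
    """Return (port_type, failure_code, ldev), or None if the line
    does not report a port failure."""
    match = FAILURE_RE.search(line)
    if match is None:
        return None
    port_type, code, ldev = match.groups()
    return port_type, int(code), int(ldev)
```

A line that yields a result points to Section I; a hung terminal with no matching console line points to Section II.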
I. HANDLING PORT FAILURES
An ATP or ADCC failure occurs when the ATP/ADCC software (which handles I/O to
and from the terminal) encounters an unexpected event it cannot recover from.
This could be the result of a hardware failure or possibly a problem in the
ATP/ADCC code. Isolated failures do not usually indicate a serious problem, and
often can be corrected without replacing software or hardware. Multiple
failures, where a number of ports fail within a short time, or one or two ports
fail a number of times, are more cause for concern. Multiple failures generally
call for Response Center involvement, where we can analyze the failures and
initiate appropriate action.
An occasional single failure usually doesn't warrant a call to the Response
Center unless you wish to pursue the reason for the failure or need help in
correcting it. We do, however, recommend keeping a log sheet or log book of
system problems and including port failures in this log.
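If the log is kept in machine-readable form, a small script can flag when isolated failures have turned into multiple failures. The sketch below is one possible approach, not an HP tool. The thresholds are assumptions, since this note does not define "a number of ports" or "a short time", and the entry layout (timestamp, LDEV, failure code) is hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

# Illustrative thresholds only; pick values that suit your site.
WINDOW = timedelta(hours=24)   # what counts as "a short time"
MAX_PORTS_IN_WINDOW = 3        # distinct ports failing within the window
MAX_REPEATS_PER_PORT = 3       # failures logged against one port


def needs_response_center(log, now):
    """log is a list of (timestamp, ldev, failure_code) entries.

    Returns True when several different ports failed within WINDOW,
    or when any single port has failed repeatedly."""
    recent_ports = {
        ldev for ts, ldev, _code in log
        if timedelta(0) <= now - ts <= WINDOW
    }
    failures_per_port = Counter(ldev for _ts, ldev, _code in log)
    many_ports = len(recent_ports) >= MAX_PORTS_IN_WINDOW
    repeated = any(
        count >= MAX_REPEATS_PER_PORT
        for count in failures_per_port.values()
    )
    return many_ports or repeated
```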
Returning the Port to Service
If you see a console message that says you had a failure, log the time, date,
LDEV number and failure code in the log book. Next run TERMDSM to DUMP and RESET
the port as described in Section III.
Calling the Response Center
If a port fails repeatedly, or if a number of different ports fail, collect
information as described below and call the Response Center.
When you call the Response Center to report port failures, have the following
information ready:
- LDEV numbers of failed ports - this is critical!
- Port type, ATP or ADCC - if you're not sure which, check the port unit
numbers on the system I/O configuration listing. If the unit numbers are
zero, the ports are ADCCs. If the unit numbers are non-zero, the ports are
ATPs.
- Type, subtype and termtype of failed ports - these also appear on the
system I/O configuration listing.
- Unit number of an ATP or DRT number of an ADCC - from the system I/O
configuration listing.
- Port failure number(s) - these appear on the system console when the port
failure occurs. If they have already left the screen, the RC engineer can
obtain the failure number from the port dumps.
- ATP/ADCC software versions - these appear when you run TERMDSM.
- Failure dates and times - keeping a record of these can help in tracking
down problems.
- MPE version - obtain this by typing the :SHOWME command.
- Current ATP/ADCC patches - if any patches have been applied for ATP/ADCC
problems, record the patch number in your system log book. This will help
us in identifying problems.
- What is attached to the port - terminal, modem, printer, plotter,
multiplexer, data switch, etc.
- What was happening? - what port activity, if any, was taking place at the
time of the failure.
- TeleSupport phone numbers and passwords - we may want to dial in and look
at the port dumps or system configuration.
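The unit-number rule and the list above lend themselves to a simple pre-call check. The sketch below is hypothetical: the function names and dictionary keys are invented for illustration and do not correspond to any MPE or TERMDSM facility. It encodes the rule that a unit number of zero means ADCC and a non-zero unit number means ATP, then reports which items have not yet been collected.

```python
def port_type_from_unit(unit_number):
    """Unit number zero means ADCC; any non-zero unit number means ATP."""
    return "ADCC" if unit_number == 0 else "ATP"


# One key per item in the list above. The key names are hypothetical.
CHECKLIST = (
    "ldev_numbers", "port_type", "type_subtype_termtype",
    "unit_or_drt_number", "failure_numbers", "software_versions",
    "failure_times", "mpe_version", "patches", "attached_device",
    "activity_at_failure", "telesupport_access",
)


def missing_items(report):
    """Return the checklist keys that have no value in the report dict.

    An empty list is accepted as a value, for example when no patches
    have been applied."""
    return [key for key in CHECKLIST if report.get(key) is None]
```

An empty result from missing_items suggests the record is complete enough to make the call productive.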
II. HANDLING PORT HANGS
Sometimes a terminal will "hang" without the ATP/ADCC software indicating a
failure has occurred. This can result in a terminal that doesn't respond (if no
session was associated with that terminal) or a "stuck" session (a session that
cannot be aborted). In this section, we describe steps you can take in an
attempt to bring the port back to service before resorting to a WARMSTART.
- Determine if the terminal hang is an isolated problem or if other
terminals are also affected. If they are, then you may have a system hang or
failure. Are there any console messages? Is there any system activity? If
it appears that this is not a system-wide problem, continue with the next
checklist item.
- Is the hung terminal running a program? Hit BREAK
to suspend execution of the running process and see if you get the colon
prompt.
- Reset the terminal to make sure the terminal isn't hung. For 264X
terminals, hit the RESET button two or three times in succession. For all
other terminals, hold down
and press RESET. You can also
try turning the power off and back on again on the terminal.
- Check the terminal configuration. Verify that the baud rate and parity are
correct, and the terminal is in REMOTE MODE and not BLOCK MODE. The AUTO
LF should NOT be set.
- Is the correct cable being used and is it securely connected?
- Verify that the terminal is not physically broken. Take the terminal out
of REMOTE MODE and run a self-test on it.
- Perform a CTRL-A RECALL on the system
console to see if any console requests are pending. Sometimes a session
appears to be hung when it's actually waiting for a console reply.
- Run TERMDSM.PUB.SYS and use "BROKEN" to see if the port is flagged as a
broken port. In any case, use RESET or DUMP on the port as described in
Section III. A reset or dump will abort the session if it's successful,
so be prepared to lose the session when trying this option.
- Use FCOPY to try to write to the hung port. Many times this will clear up
the port. For example:
:FILE TERM;DEV=nnn where nnn is the LDEV of the hung terminal
:FCOPY FROM;TO=*TERM
HELLO??? TESTING 1,2,3..
:EOF
- Perform an :ABORTIO on the LDEV that appears to be hung.
- Perform an :ABORTJOB on the session (if there is one) attached to the
port. To find out what session is currently on the port, use the SHOWDEV
command. If the :ABORTJOB command does not cause the session to log off,
the session may have "ownership" of some system resource, so you will
need to look at other devices.
- If a session is writing to a printer, perform an :ABORTIO on the printer.
To find out if I/O is pending on a printer, perform a :SHOWOUT SP or
:SHOWDEV n, where n is the LDEV number of the printer. This will often
free up a hung session.
- If a session is writing to tape, perform an :ABORTIO on the tape drive.
To find out if I/O is pending on a tape drive, perform a :SHOWDEV n, where
n is the LDEV number of the tape drive. This may also free up the session.
- If the session is coming across a DS line, try an :ABORTIO on the INP
(or LANIC). This will probably abort the datacomm subsystem running on the
INP or LANIC.
- Isolate the problem to either the port or the terminal by swapping the
terminal with one known to be operating correctly, swapping the
connectors on the CPU side to see if the problem follows the port, or
swapping the cable(s) used.
If all else fails and it's critical to get the port back into operation as soon
as possible, you may need to restart the system. If you do this, you may want to
call the Response Center for advice on how to avoid restarting the system in the
future. Many WARMSTARTS are not necessary, and can be time consuming and
inconvenient.
III. TERMDSM
TERMDSM is a tool for use in debugging and repairing terminal ports. It allows
you to run diagnostics on one or more ports, abort jobs or I/O, reset ports and
associated tables, display tables, dump (to a disc file) tables for later
analysis, format failure information dumped by the ATP/ADCC software, or
identify broken ports.
TERMDSM runs on MPE V/E (Version G.00.00 or later) operating systems. For a
complete description of the TERMDSM utility please refer to the reference manual
(Part No. 30144-90013).
For pre-MPE V/E systems (Version E.00.00, F.00.00, or earlier), use TERMDSM's
predecessor, ATPDSM, which operates on ATP ports but not ADCC ports. ADCC ports
can be tested, however, with the ADCC offline diagnostic, ADCCDIAG. Information
about ATPDSM can be found in the Advanced Terminal Processor (DSN/ATP) On-Line
Diagnostics Manual (Part No. 30144-90004).
To invoke this utility, simply type "RUN TERMDSM.PUB.SYS" at the MPE colon ":"
prompt.
Version numbers of ATP and/or ADCC software appear on the screen immediately
below the TERMDSM banner when the program is run. ATP/ADCC version numbers also
appear in all port dumps.
TERMDSM Requirements and Considerations
You must have OP capability to run TERMDSM. To save TERMDSM port dumps, you must
have SF capability and Write access to the logon group and account. DI
capability is required to run ATP diagnostics.
TERMDSM will NOT run when the system is down; MPE must be executing for TERMDSM
to run.
Several TERMDSM commands, if successfully executed, will abort a session. When
using the ABORTJOB, DUMP or RESET commands, be prepared to lose a session if
one is still logged onto the affected port.
Example of Dumping and Resetting a Broken Port
When you encounter a broken port, run TERMDSM.PUB.SYS by performing the
following steps:
- Log on to the 3000 as MANAGER.SYS or OPERATOR.SYS or some other logon with
OP capability, so you can run TERMDSM.
- Run TERMDSM.PUB.SYS. When the arrow prompt "->" appears, type BROKEN (or
just B) and press RETURN to display a list
of broken ports. If an asterisk "*" appears in the "Unfixable" column,
or if several ports show up as broken, call the Response Center.
- At the arrow prompt again, type DUMP (or DU) and press
RETURN. When TERMDSM prompts you with
"Enter ldev number:", enter the number of the broken port and press
RETURN. A dump will cause an automatic
reset of the port.
- If you are dumping an ATP port, TERMDSM will ask the question, "Do you
want to dump the PCC memory?" Always reply "YES".
- When TERMDSM asks you "Do you want to include a message", answer YES (or
Y), and include a message with the time and date of the port failure. If
an application was running, you may want to include the application name.
Press RETURN at the arrow prompt (->) to
conclude the message and start the dump.
TERMDSM creates a file with the dump information, and names it TERMxxx, where
xxx is the LDEV number of the dumped port. This file is stored in your current
logon group and account. Add the group, account, and filename of the dump file
to your log entry.
This completes the dump process. The DUMP function automatically resets the
port, so the port should now be available for use. If you wish to inspect the
dump, just FCOPY the TERMxxx file to a printer or use a text editor.
If several broken ports failed with the same failure number, you can take dumps
from two or three ports and use the RESET command of TERMDSM to reset the rest
of the ports. You do not need to RESET a port after performing a DUMP, because
a dump also resets the port.
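When many ports are broken at once, it can help to decide in advance which ports to DUMP and which to RESET. The following sketch groups broken ports by failure number and selects the first few in each group for dumping. It is a planning aid only: the function name is hypothetical, it does not drive TERMDSM, and the default of two dumps per failure number is an assumption based on the "two or three" guideline above.

```python
from collections import defaultdict


def plan_recovery(broken_ports, dumps_per_failure=2):
    """broken_ports is a list of (ldev, failure_number) pairs.

    Returns (to_dump, to_reset): LDEVs to DUMP (which also resets
    them) and LDEVs that only need a RESET."""
    by_failure = defaultdict(list)
    for ldev, failure_number in broken_ports:
        by_failure[failure_number].append(ldev)

    to_dump, to_reset = [], []
    for ldevs in by_failure.values():
        to_dump.extend(ldevs[:dumps_per_failure])
        to_reset.extend(ldevs[dumps_per_failure:])
    return to_dump, to_reset
```

A port with a failure number that no other port shares always lands in the dump list, so no distinct failure goes unrecorded.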
TERMDSM Commands
- ABORTIO and ABORTJOB
These commands work just like their MPE equivalents.
- BROKEN
This command lists all ports on a system that are currently considered
broken. The port may also be flagged as "UNFIXABLE" if:
- a port is configured on a missing AIB, i.e. there is no hardware for a
configured logical device
- an ATP data segment cannot be built
- self test fails on the Port Controller Chip (PCC)
The list of broken ports is not conclusive. Certain errors may go
undetected by ATP software.
- DIAGNOSTICS
The DIAG command initiates ATP diagnostics, of which there are three
flavors. The first one tests the 3000 connection out to the PCC. The
second tests out to the junction panel and requires loopback
connectors. You should have received these connectors when you
installed your ATP/ADCC subsystems. If you can't find them contact
your Account CE. The third test is a Read/Write test to a powered-on
HP terminal.
- DISPLAY
The DIS command can be used to display the values of various ATP and
ADCC tables.
- DUMP
The DU command initiates dialog for dumping the current state of the ATP
or ADCC tables, terminal buffers, and ATP PCC memory contents to a disc
file.
The DUMP command resets the port and aborts any session. Use this
command with caution.
A user-generated message up to 20 lines long may be included with the
dump.
- EXIT
The E command terminates execution of TERMDSM and returns you to the
MPE colon prompt.
- HELP
You can type HELP at any prompt to get more help on a particular
operation.
- RESET
The RESET command initiates dialog for resetting one or more ATP/ADCC
ports. Sessions logged on will be aborted, ATP/ADCC tables will be
reset, and the port prepared for speed sensing.
RESET, if successful, will abort a session on the port. Use this
command with caution.