HP3000 Application Notes

HP3000 Application Note #9

PORT FAILURES, TERMINAL HANGS, and TERMDSM

HP Part Number: 5958-5824-2627

Published: July 1 1986

Notice
A high volume of calls placed to the Response Center involve terminals that do not respond. It is often difficult to determine whether the problem is a hardware problem with the terminal or communications equipment, a software problem with the program running on the terminal causing it to "hang", or an error detected by the software in the system terminal controller. On the HP 3000 (series 30-70), the Advanced Terminal Processor (ATP) and the Asynchronous Data Communications Controller (ADCC) have the ability to mark a terminal port as "broken" if they encounter errors from which they cannot recover. This is sometimes referred to as a "port failure".

This Application Note discusses how to isolate such terminal problems, how to handle ATP and ADCC port failures, how to use the Online Diagnostic/Support Monitor (TERMDSM), and what information you should collect before calling the Response Center for a port-related problem. Although a WARMSTART will, in most cases, clear hung or broken ports, it is often possible to do so without resorting to a system restart. This guide is intended to minimize port and system downtime and to maximize the effectiveness of PICS calls.

The first step is to determine whether your terminal port is hung or is considered "broken" by the software. If you receive a console message such as the following:

    ATP FAILURE 6810 ON LDEV nn. RUN TERMDSM TO ANALYZE FAILURE.
or
    ADCC FAILURE 9518 ON LDEV nn. RUN TERMDSM TO ANALYZE FAILURE.

then you have a broken port -- refer to Section I, "Handling Port Failures". If you do not receive a console message then you need to follow the steps in Section II, "Handling Port Hangs", to isolate the problem.
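
If console output is captured to a file, the two message formats above can be recognized mechanically. The following is a minimal Python sketch, not part of MPE or TERMDSM. The function name `parse_port_failure` is hypothetical, and the pattern assumes the LDEV is printed as a decimal number where the note shows `nn`.

```python
import re

# Pattern built from the two console messages quoted above.
# Assumption: failure code and LDEV are both printed as plain digits.
FAILURE_RE = re.compile(r"\b(ATP|ADCC) FAILURE (\d+) ON LDEV (\d+)\.")


def parse_port_failure(line):
    """Return (port_type, failure_code, ldev), or None if the line
    is not a port failure message."""
    match = FAILURE_RE.search(line)
    if match is None:
        return None
    port_type, failure_code, ldev = match.groups()
    # The failure code is kept as text so it is logged exactly as shown.
    return port_type, failure_code, int(ldev)
```

A line that does not match returns `None`, which corresponds to the "no console message" case handled in Section II.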

I. HANDLING PORT FAILURES

An ATP or ADCC failure occurs when the ATP/ADCC software (which handles I/O to and from the terminal) encounters an unexpected event it cannot recover from. This could be the result of a hardware failure or possibly a problem in the ATP/ADCC code. Isolated failures do not usually indicate a serious problem, and often can be corrected without replacing software or hardware. Multiple failures, where a number of ports fail within a short time, or one or two ports fail a number of times, are more cause for concern. Multiple failures generally call for Response Center involvement, where we can analyze the failures and initiate appropriate action.

An occasional single failure usually doesn't warrant a call to the Response Center unless you wish to pursue the reason for the failure or need help in correcting it. We do, however, recommend keeping a log sheet or log book of system problems and including port failures in this log.

Returning the Port to Service

If you see a console message that says you had a failure, log the time, date, LDEV number and failure code in the log book. Next, run TERMDSM to DUMP and RESET the port as described in Section III.

Calling the Response Center

If a port fails repeatedly, or if a number of different ports fail, collect information as described below and call the Response Center.
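
If the failure log is kept in machine-readable form, the "repeated or widespread" test above can be sketched as follows. This is an illustration only: the note does not define how short "a short time" is or how many failures are too many, so the 24-hour window and both limits below are assumptions to be adjusted for your site, and the function name is hypothetical.

```python
from collections import Counter
from datetime import timedelta


def needs_response_center(failures, window=timedelta(hours=24),
                          repeat_limit=2, port_limit=3):
    """failures is a list of (datetime, ldev) log entries.

    Returns True if, within `window` of the most recent failure, one port
    failed at least `repeat_limit` times or at least `port_limit`
    different ports failed. All three thresholds are assumed values.
    """
    if not failures:
        return False
    latest = max(when for when, _ in failures)
    recent = [ldev for when, ldev in failures if latest - when <= window]
    counts = Counter(recent)
    repeated = any(n >= repeat_limit for n in counts.values())
    widespread = len(counts) >= port_limit
    return repeated or widespread
```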

When you call the Response Center to report port failures, it would be helpful to know the following information:
  1. LDEV numbers of failed ports - this is critical!
  2. Port type, ATP or ADCC - if you're not sure which, check the port unit numbers on the system I/O configuration listing. If the unit numbers are zero, the ports are ADCCs. If the unit numbers are non-zero, the ports are ATPs.
  3. Type, subtype and termtype of failed ports - these also appear on the system I/O configuration listing.
  4. Unit number of an ATP or DRT number of an ADCC - these appear on the system I/O configuration listing.
  5. Port failure number(s) - these appear on the system console when the port failure occurs. If they have already left the screen, the RC engineer can obtain the failure number from the port dumps.
  6. ATP/ADCC software versions - these appear when you run TERMDSM.
  7. Failure dates and times - keeping a record of these can help in tracking down problems.
  8. MPE version - obtain this by typing the :SHOWME command.
  9. Current ATP/ADCC patches - if any patches have been applied for ATP/ADCC problems, record the patch number in your system log book. This will help us in identifying problems.
  10. What is attached to the port - terminal, modem, printer, plotter, multiplexer, data switch, etc.
  11. What was happening? - what port activity, if any, was taking place at the time of the failure.
  12. TeleSupport phone numbers and passwords - we may want to dial in and look at the port dumps or system configuration.
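
A log entry covering several of the items above might be modeled as shown below. This is a minimal Python sketch; the class and field names are hypothetical. The only rule taken directly from the note is the one in item 2: a unit number of zero means ADCC, and a non-zero unit number means ATP.

```python
from dataclasses import dataclass


def port_type_from_unit(unit_number):
    """Item 2: unit number zero means ADCC, non-zero means ATP."""
    return "ADCC" if unit_number == 0 else "ATP"


@dataclass
class PortFailureEntry:
    # Illustrative fields mirroring items 1, 2, 5, 7, 10 and 11.
    ldev: int
    unit_number: int
    failure_code: str
    failed_at: str            # date and time as written in the log book
    attached_device: str = ""
    activity: str = ""

    @property
    def port_type(self):
        return port_type_from_unit(self.unit_number)


entry = PortFailureEntry(ldev=42, unit_number=3, failure_code="6810",
                         failed_at="1986-07-01 09:15",
                         attached_device="terminal")
```

Here `entry.port_type` evaluates to `"ATP"` because the unit number is non-zero.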

II. HANDLING PORT HANGS

Sometimes a terminal will "hang" without the ATP/ADCC software indicating a failure has occurred. This can result in a terminal that doesn't respond (if no session was associated with that terminal) or a "stuck" session (a session that cannot be aborted). In this section, we describe steps you can take in an attempt to bring the port back to service before resorting to a WARMSTART.
  1. Determine whether the terminal hang is an isolated problem or whether other terminals are also affected. If they are, then you may have a system hang or failure. Are there any console messages? Is there any system activity? If it appears that this is not a system-wide problem, continue with checklist item number 2.
  2. Is the hung terminal running a program? Hit (BREAK) to suspend execution of the running process and see if you get the colon prompt.
  3. Reset the terminal to make sure the terminal isn't hung. For 264X terminals, hit the RESET button two or three times in succession. For all other terminals, hold down (CONTROL) (SHIFT) and press RESET. You can also try turning the power off and back on again on the terminal.
  4. Check the terminal configuration. Verify that the baud rate and parity are correct, and the terminal is in REMOTE MODE and not BLOCK MODE. The AUTO LF should NOT be set.
  5. Is the correct cable being used and is it securely connected?
  6. Verify that the terminal is not physically broken. Take the terminal out of REMOTE MODE and run a self-test on it.
  7. Perform a (CONTROL)-A RECALL on the system console to see if any console requests are pending. Sometimes a session appears to be hung when it's actually waiting for a console reply.
  8. Run TERMDSM.PUB.SYS and use "BROKEN" to see if the port is flagged as a broken port. In any case, use RESET or DUMP on the port as described in Section III. A reset or dump will abort the session if it's successful, so be prepared to lose the session when trying this option.
  9. Use FCOPY to try to write to the hung port. Many times this will clear up the port. For example:
    
            :FILE TERM;DEV=nnn    where nnn is the LDEV of the hung terminal
    
            :FCOPY FROM;TO=*TERM
            
            HELLO??? TESTING 1,2,3..
    
            :EOF
    
  10. Perform an :ABORTIO on the LDEV that appears to be hung.
  11. Perform an :ABORTJOB on the session (if there is one) attached to the port. To find out what session is currently on the port, use the SHOWDEV command. If the :ABORTJOB command does not cause the session to log off, the session may have "ownership" of some system resource, so you will need to look at other devices.
  12. If a session is writing to a printer, perform an :ABORTIO on the printer. To find out if I/O is pending on a printer, perform a :SHOWOUT SP or :SHOWDEV n, where n is the LDEV number of the printer. This will often free up a hung session.
  13. If a session is writing to tape, perform an :ABORTIO on the tape drive. To find out if I/O is pending on a tape drive, perform a :SHOWDEV n, where n is the LDEV number of the tape drive. This may also free up the session.
  14. If the session is coming across a DS line, try an :ABORTIO on the INP (or LANIC). This will probably abort the datacomm subsystem running on the INP or LANIC.
  15. Isolate the problem to either the port or the terminal. Swap the terminal with one known to be operating correctly, swap the connectors on the CPU side to see if the problem follows the port, or swap the cable(s) used.
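
The order of operator commands in items 10 through 13 can be summarized in a short Python sketch. It only builds the command text for an operator to review; it does not issue anything to MPE. The function and parameter names are hypothetical, and the `#S` prefix on the session number is an assumption about how the session is identified on your system.

```python
def escalation_commands(ldev, session_number=None,
                        printer_ldev=None, tape_ldev=None):
    """List the MPE commands from items 10-13 in the order given above."""
    commands = [f":ABORTIO {ldev}"]        # item 10: abort I/O on the port
    commands.append(f":SHOWDEV {ldev}")    # item 11: find the session
    if session_number is not None:
        commands.append(f":ABORTJOB #S{session_number}")
    # Items 12 and 13: other devices the session may be waiting on.
    for other in (printer_ldev, tape_ldev):
        if other is not None:
            commands.append(f":SHOWDEV {other}")
            commands.append(f":ABORTIO {other}")
    return commands
```

Keeping the least disruptive command first follows the intent of the checklist, which is to free the port without losing more work than necessary.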

If all else fails and it's critical to get the port back into operation as soon as possible, you may need to restart the system. If you do this, you may want to call the Response Center for advice on how to avoid restarting the system in the future. Many WARMSTARTS are not necessary, and can be time consuming and inconvenient.

III. TERMDSM

TERMDSM is a tool for use in debugging and repairing terminal ports. It allows you to run diagnostics on one or more ports, abort jobs or I/O, reset ports and associated tables, display tables, dump (to a disc file) tables for later analysis, format failure information dumped by the ATP/ADCC software, or identify broken ports.

TERMDSM runs on MPE V/E (Version G.00.00 or later) operating systems. For a complete description of the TERMDSM utility please refer to the reference manual (Part No. 30144-90013).

On pre-MPE V/E systems (Version E.00.00, F.00.00, or earlier), use TERMDSM's predecessor, ATPDSM, which operates on ATP ports but not ADCC ports. ADCC ports can, however, be tested with the ADCC offline diagnostic, ADCCDIAG. Information about ATPDSM can be found in the Advanced Terminal Processor (DSN/ATP) On-Line Diagnostics Manual (Part No. 30144-90004).

To invoke this utility, simply type "RUN TERMDSM.PUB.SYS" at the MPE colon ":" prompt.

Version numbers of ATP and/or ADCC software appear on the screen immediately below the TERMDSM banner when the program is run. ATP/ADCC version numbers also appear in all port dumps.

TERMDSM Requirements and Considerations

You must have OP capability to run TERMDSM. To save TERMDSM port dumps, you must have SF capability and Write access to the logon group and account. DI capability is required to run ATP diagnostics.
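
The capability requirements above can be expressed as a small lookup. This is a hypothetical Python sketch for planning purposes only; it does not query MPE. The task names are invented for illustration, and Write access to the logon group and account is a file access permission rather than a capability, so it must still be checked separately.

```python
# Capabilities named in the paragraph above, keyed by illustrative task names.
REQUIRED_CAPABILITIES = {
    "run": {"OP"},
    "save_dump": {"OP", "SF"},     # also needs Write access (not modeled here)
    "diagnostics": {"OP", "DI"},
}


def missing_capabilities(task, held):
    """Return the capabilities still needed for `task`, sorted by name."""
    return sorted(REQUIRED_CAPABILITIES[task] - set(held))
```

For example, a user holding only OP and SF can run TERMDSM and save dumps, and `missing_capabilities("diagnostics", ["OP", "SF"])` reports that DI is still needed.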

TERMDSM will NOT run when the system is down; MPE must be executing for TERMDSM to run.

Several TERMDSM commands, if successfully executed, will abort a session. When using the ABORTJOB, DUMP or RESET commands, be prepared to lose a session if one is still logged onto the affected port.

Example of Dumping and Resetting a Broken Port

When you encounter a broken port, run TERMDSM.PUB.SYS by performing the following steps:
  1. Log on to the 3000 as MANAGER.SYS or OPERATOR.SYS or some other logon with OP capability, so you can run TERMDSM.
  2. Run TERMDSM.PUB.SYS. When the arrow prompt "->" appears, type BROKEN (or just B) and press (RETURN) to display a list of broken ports. If an asterisk "*" appears in the "Unfixable" column, or if several ports show up as broken, call the Response Center.
  3. At the arrow prompt again, type DUMP (or DU) and press (RETURN). When TERMDSM prompts you with "Enter ldev number:", enter the number of the broken port and press (RETURN). A dump will cause an automatic reset of the port.
  4. If you are dumping an ATP port, TERMDSM will ask the question, "Do you want to dump the PCC memory?" Always reply "YES".
  5. When TERMDSM asks you "Do you want to include a message", answer YES (or Y), and include a message with the time and date of the port failure. If an application was running, you may want to include the application name. Press (RETURN) at the arrow prompt (->) to conclude the message and start the dump.

TERMDSM creates a file with the dump information, and names it TERMxxx, where xxx is the LDEV number of the dumped port. This file is stored in your current logon group and account. Add the group, account, and filename of the dump file to your log entry.

This completes the dump process. The DUMP function automatically resets the port, so the port should now be available for use. If you wish to inspect the dump, just FCOPY the TERMxxx file to a printer or use a text editor.

If several broken ports failed with the same failure number, you can take dumps from two or three ports and use the RESET command of TERMDSM to reset the rest of the ports. You do not need to RESET a port after performing a DUMP, because a dump also resets the port.
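
The rule in the preceding paragraph (dump a few ports per failure number, reset the rest) can be sketched in Python as follows. The function name is hypothetical, and the limit of three dumps per failure number is taken from the "two or three" guideline above.

```python
from collections import defaultdict


def plan_dump_and_reset(broken, max_dumps=3):
    """broken maps LDEV number -> failure number.

    Returns (ldevs_to_dump, ldevs_to_reset). At most `max_dumps` ports
    are dumped per failure number; the remaining ports are only reset.
    """
    by_code = defaultdict(list)
    for ldev in sorted(broken):
        by_code[broken[ldev]].append(ldev)
    dumps, resets = [], []
    for ports in by_code.values():
        dumps.extend(ports[:max_dumps])
        resets.extend(ports[max_dumps:])
    return sorted(dumps), sorted(resets)
```

Ports in the first list need no separate RESET, since a dump also resets the port.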

TERMDSM Commands

  • ABORTIO and ABORTJOB

      These commands work just like their MPE equivalents.
  • BROKEN

      This command lists all ports on a system that are currently considered broken. A port may also be flagged as "UNFIXABLE" if:

      - a port is configured on a missing AIB, i.e. there is no hardware for a configured logical device

      - an ATP data segment cannot be built

      - self test fails on the Port Controller Chip (PCC)

      The list of broken ports is not conclusive. Certain errors may go undetected by ATP software.
  • DIAGNOSTICS

      The DIAG command initiates ATP diagnostics, of which there are three flavors. The first one tests the 3000 connection out to the PCC. The second tests out to the junction panel and requires loopback connectors. You should have received these connectors when you installed your ATP/ADCC subsystems. If you can't find them contact your Account CE. The third test is a Read/Write test to a powered-on HP terminal.

  • DISPLAY

      The DIS command can be used to display the values of various ATP and ADCC tables.
  • DUMP

      The DU command initiates dialog for dumping the current state of the ATP or ADCC tables, terminal buffers, and ATP PCC memory contents to a disc file.

      The DUMP command resets the port and aborts any session. Use this command with caution.

      A user-generated message up to 20 lines long may be included with the dump.
  • EXIT

      The E command terminates execution of TERMDSM and returns you to the MPE colon prompt.
  • HELP

      You can type HELP at any prompt to get more help on a particular operation.
  • RESET

      The RESET command initiates dialog for resetting one or more ATP/ADCC ports. Sessions logged on will be aborted, ATP/ADCC tables will be reset, and the port prepared for speed sensing.

      RESET, if successful, will abort a session on the port. Use this command with caution.


