Using the HP 3000 Workload Manager: HP 3000 MPE/iX Computer Systems > Appendix A Troubleshooting

Troubleshooting Workgroup Problems

Each of the following sections describes a potential problem and lists possible solutions.

Uncaptured process

Problem: You have created user-defined workgroups but a process does not fall into the expected workgroup.

Solutions:

system process

The process might be a system process, which cannot be captured by user-defined workgroups. System processes are always members of one of the system-defined default workgroups. The SHOWPROC command displays an asterisk ("*") next to the pin number of system processes.

artificial member

The process might be an artificial member of its workgroup. An artificial member remains in that workgroup until the workgroup is deleted, the process is moved to another workgroup, or the process is returned to its natural workgroup. The DETAIL format of the SHOWPROC command places a percent sign ("%") next to the workgroup name if the process is an artificial member. A percent sign ("%") also appears before the pin number in the DETAIL format of the SHOWPROC command and in the PROCS format of the SHOWWG command. To return a process to its natural workgroup, issue the command ALTPROC pin;WG=Natural_wg.

membership criteria

The process might not meet the membership criteria for the specified workgroup. The DETAIL format of the SHOWPROC command displays the process attributes on which membership can be based (logon, program, queue) and the workgroup of the process. The DETAIL format of the SHOWWG command displays the membership criteria of the workgroup. To be a member, the process must match one value of each of the specified categories. If the process doesn't match, it will not be a member. If it does match, it might be a member of the workgroup, depending on its position in the ordered list of workgroups.
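The matching rule above can be sketched in a few lines of Python. This is only an illustrative model, not the Workload Manager's implementation: the dictionary layout, the function name matches, and the sample logon, program, and queue values are all hypothetical, and values are compared for exact equality, which may be simpler than the system's real matching rules.

```python
# Hypothetical model: criteria map a category name to the set of accepted
# values; a category that is absent (or empty) is treated as "not specified".
CATEGORIES = ("logon", "program", "queue")

def matches(process, criteria):
    """Return True if the process matches one value in every specified category."""
    for category in CATEGORIES:
        accepted = criteria.get(category)
        if not accepted:              # category not specified: no constraint
            continue
        if process[category] not in accepted:
            return False
    return True

# Sample values are invented for illustration only.
process = {"logon": "JOE.SALES", "program": "ORDERS.PUB.SALES", "queue": "CS"}
criteria = {"logon": {"JOE.SALES", "ANN.SALES"}, "queue": {"CS", "DS"}}

print(matches(process, criteria))                      # True
print(matches(process, {"program": {"CI.PUB.SYS"}}))   # False
```

A True result means only that the process is eligible for the workgroup. Whether it actually lands there still depends on the workgroup's position in the ordered list.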

workgroup order

Workgroup membership is determined by scanning the workgroup membership criteria of all workgroups in order. The three process attributes on which membership can be based (logon, program, queue) are compared to the workgroup membership criteria, and the process is placed in the first matching workgroup. For this reason, you should place workgroups with the most specific membership criteria at the beginning of the list and workgroups with more general membership criteria near the end. The workgroups with the most general membership criteria are the five system-defined workgroups, which always appear last.

To see the current order, which is the order in which workgroups appear in the workgroup configuration file, issue the SHOWWG command. (You can control this order by introducing a new workgroup configuration, or by specifying the POSITION parameter on the NEWWG command line when you create a new workgroup.)
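The effect of workgroup order on the first-match scan can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: the workgroup names, the sample attribute values, and the function first_match are hypothetical, and criteria are compared for exact equality.

```python
def first_match(process, workgroups):
    """Return the name of the first workgroup whose criteria the process meets."""
    for name, criteria in workgroups:          # list order = scan order
        if all(process[cat] in accepted for cat, accepted in criteria.items()):
            return name
    # In this toy model nothing matched; on a real system the
    # system-defined workgroups at the end of the list catch the process.
    return None

# Sample values are invented for illustration only.
process = {"logon": "JOE.SALES", "program": "ORDERS.PUB.SALES", "queue": "CS"}

specific_first = [
    ("ORDERS_WG", {"program": {"ORDERS.PUB.SALES"}, "queue": {"CS"}}),
    ("ALL_CS_WG", {"queue": {"CS"}}),
]
general_first = list(reversed(specific_first))

print(first_match(process, specific_first))   # ORDERS_WG
print(first_match(process, general_first))    # ALL_CS_WG
```

With the general workgroup listed first, the process never reaches the more specific workgroup, which is the usual cause of a process landing somewhere unexpected.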

Starving workgroup

Problem: The processes within a workgroup are not receiving sufficient CPU time.

Solutions:

low priority

The priority range assigned to the workgroup might be low when compared to the priority ranges of other workgroups. Use the SHOWWG command to display the base and limit priorities of the various workgroups. If the starving workgroup is at low priority when compared to the other workgroups, moving it to a higher priority range would help it get more CPU. Alternatively, you can determine which workgroups are taking CPU from the starving workgroup and adjust their priority ranges instead.

minimum(s) too low

Minimum CPU values provide a guarantee that the workgroup will receive the specified amount of CPU, if the workgroup requires it. There are two situations in which CPU minimums can starve a workgroup:

  • A minimum CPU value has been assigned to the starving workgroup that is not sufficient to meet the demand of processes within that workgroup. In this situation, raise the CPU minimum for that workgroup, making sure that the sum of the minimums for all workgroups remains less than 100%.

  • The other workgroups have been assigned (and use) CPU minimum values that do not leave sufficient CPU for the starving workgroup. In this case, lower the CPU minimums of one or more workgroups to allow sufficient CPU for the starving workgroup.
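Before raising or lowering minimums, it can help to total them. The following Python sketch checks the rule stated above, that the minimums for all workgroups must sum to less than 100%, and reports what is left unreserved. The workgroup names, the percentages, and the function check_minimums are hypothetical; the real values would come from your own SHOWWG output.

```python
def check_minimums(minimums):
    """Return the CPU percentage left unreserved after all minimums.

    Raises ValueError if the minimums do not sum to less than 100%.
    """
    total = sum(minimums.values())
    if total >= 100:
        raise ValueError(
            f"CPU minimums total {total}%; they must sum to less than 100%"
        )
    return 100 - total

# Hypothetical configuration, for illustration only.
minimums = {"ONLINE_WG": 40, "REPORTS_WG": 30, "BATCH_WG": 20}
print(check_minimums(minimums))   # 10
```

In this example only 10% of the CPU is not covered by a guarantee, so any workgroup without a minimum of its own could starve whenever the other three use their full minimums.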

maximum too low

The workgroup may also be starving because it has been assigned a maximum amount of CPU that is insufficient. For example, the workgroup may be assigned a maximum CPU value of 20% and be using that 20%, but requires 30% for adequate response time. Alternatively, other workgroups that should be constrained by their maximums might have those values set too high. A workgroup with a maximum of 80% would be allowed to consume 80% of the system, provided it did not violate any minimum guarantees for other workgroups.
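The interaction between one workgroup's maximum and the other workgroups' minimums can be approximated with simple arithmetic. The sketch below is a simplified model based only on the description above, not the Dispatcher's actual algorithm: it assumes a minimum is honored only up to what the workgroup actually demands, and the function name effective_ceiling and all figures are hypothetical.

```python
def effective_ceiling(maximum, others):
    """Upper bound on a workgroup's CPU share, in percent.

    others: list of (minimum, demand) pairs for the other workgroups.
    Each guarantee is honored only up to what that workgroup demands.
    """
    reserved = sum(min(minimum, demand) for minimum, demand in others)
    return min(maximum, 100 - reserved)

# Other workgroups reserve little: the 80% maximum is the binding limit.
print(effective_ceiling(80, [(10, 50), (5, 0)]))    # 80

# Other workgroups claim 30% + 5% of guaranteed CPU: only 65% remains.
print(effective_ceiling(80, [(30, 50), (10, 5)]))   # 65
```

If a workgroup with a high maximum regularly reaches the first case, lowering its maximum is what frees CPU for a starving workgroup.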

Starving process

Problem: A workgroup might be receiving its share of the CPU, but a process within that workgroup is not getting sufficient CPU time.

Solutions:

check priority

The MPE/iX Dispatcher remains priority-driven, allocating the CPU to the processes within the workgroup based on their priority. Use the PROCS format of the SHOWWG command to display the priorities of member processes of the workgroup(s) you specify. If the process is of lower priority than other member processes, it will only receive the CPU that the other processes do not require.
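The consequence of priority-driven allocation within a workgroup can be sketched as follows. This is a steady-state simplification with hypothetical names and numbers: the rank field is an invented ordering (a lower rank is served first), not an MPE/iX priority value, and the real Dispatcher makes its decisions dynamically rather than by fixed percentages.

```python
def allocate(share, processes):
    """Split a workgroup's CPU share among its processes in priority order.

    processes: list of (name, rank, demand); a lower rank is served first.
    Returns a dict mapping each process name to the CPU percentage granted.
    """
    remaining = share
    result = {}
    for name, _rank, demand in sorted(processes, key=lambda p: p[1]):
        granted = min(demand, remaining)
        result[name] = granted
        remaining -= granted
    return result

# The workgroup receives 50% of the CPU; ORDERS outranks REPORT.
print(allocate(50, [("REPORT", 2, 30), ("ORDERS", 1, 35)]))
# {'ORDERS': 35, 'REPORT': 15}
```

REPORT wants 30% but receives only the 15% that ORDERS leaves over, even though the workgroup as a whole is getting its full share.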

change workgroup

The process might belong in a workgroup with higher priority values. For example, a batch job might be particularly important and deserve to run at higher priority than most batch jobs. Use the ALTPROC command to move the process to another workgroup.

Note: In the interests of being proactive, you might want to define a workgroup with membership criteria that would naturally capture that process, placing it at an appropriate priority.

enable oscillation

The process may be in the proper workgroup, but is having trouble competing with other processes in the workgroup since its transaction time is greater than the average. Recall that the priority of a process will decay based on its CPU consumption. The process may have decayed to the limit of the queue and is unable to compete for the CPU with other processes of higher priority in that workgroup. Use the ALTWG command to enable oscillation, which will boost the priority of any process that decays to the limit of the workgroup.

adjust quantum bounds

If the process is at lower priority than other processes in the workgroup, but has not yet decayed to the limit priority (the point at which oscillation takes place), you can change the rate of priority decay. Use the ALTWG command to change the quantum bounds to reduce the quantum. A smaller quantum ensures faster priority decay, so that processes decay to the limit more quickly and can be oscillated.
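The relationship between quantum size, decay, and oscillation can be pictured with a deliberately crude toy model. It is not the MPE/iX decay algorithm: it assumes that each full quantum of CPU consumed lowers the priority by one abstract step, and that oscillation resets a process to the base priority when it reaches the limit. The function simulate and every parameter and number are hypothetical.

```python
def simulate(cpu_ms, quantum_ms, steps_to_limit, oscillate):
    """Return how many steps below the base priority the process ends up.

    Toy rule: every full quantum of CPU consumed lowers the priority by one
    step until the limit; with oscillation, reaching the limit resets the
    process to the base priority.
    """
    depth = 0
    for _ in range(cpu_ms // quantum_ms):
        depth += 1
        if depth >= steps_to_limit:
            depth = 0 if oscillate else steps_to_limit
    return depth

print(simulate(1000, 100, 5, oscillate=False))  # 5: stuck at the limit
print(simulate(1000, 100, 5, oscillate=True))   # 0: boosted back to the base
print(simulate(400, 100, 5, oscillate=True))    # 4: low, but not yet at the limit
print(simulate(400, 50, 5, oscillate=True))     # 3: smaller quantum, already boosted once
```

The third and fourth lines show the case this solution addresses: with the larger quantum the process sits just above the limit and is never boosted, while with the smaller quantum it reaches the limit sooner and is oscillated.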

CPU minimum not met

Problem: The observed CPU allocation to a workgroup is less than the minimum CPU percentage.

Solutions:

insufficient demand

If the processes within the workgroup do not require the amount of CPU they have been guaranteed, then the observed CPU allocation will be lower than the set value. If the processes can consume only 20% of the CPU, but have been given a minimum of 30%, they will not be able to consume the minimum amount.

too few processes

If you are using a system with multiple processors and only a small number of processes are running, the CPU demand may not reach the minimum assigned to the workgroup. For example, suppose your system has four CPUs and you have assigned a 40% minimum CPU percentage to a workgroup. If there is only one process running in that workgroup, the workgroup can consume at most one processor, or 25% of the total CPU capacity of the system.
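The arithmetic in this example generalizes easily. The sketch below assumes, as the example does, that a single process can occupy at most one processor at a time; the function name max_workgroup_share is hypothetical.

```python
def max_workgroup_share(num_cpus, num_processes):
    """Largest share of total CPU capacity (percent) the workgroup can use,
    assuming each process can occupy at most one processor at a time."""
    busy = min(num_cpus, num_processes)
    return 100 * busy / num_cpus

print(max_workgroup_share(4, 1))   # 25.0
print(max_workgroup_share(4, 2))   # 50.0
```

On the four-CPU system, the workgroup therefore needs at least two runnable processes before it can consume a 40% minimum.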

System or process hang

Problem: You have set the maximum CPU percentage of a workgroup to zero and it has starved. Or, a workgroup that captures processes running CI.PUB.SYS has no CPU access (either because its CPU maximum is set to zero or because its processes run at low priority on a busy system). Or, the CPU minimum percentage guarantees do not allow sufficient CPU for processes in the default workgroups.

Solutions:

if you can enter commands

Adjust the scheduling characteristics of the problem workgroups or delete the entire workgroup configuration. If you are able to identify the problem workgroup(s) (e.g., a workgroup with a maximum CPU % of zero), use the ALTWG command to alter the scheduling characteristics. If you are uncertain of the problem workgroup(s) and wish to remove all user-defined workgroups, leaving only the five system-defined default workgroups, issue the command PURGEWG @.

if you can't enter commands

Reboot the system and, at the ISL prompt, enter the command START SINGLE-USER. This initiates single-user mode, in which only the five default workgroups are available; all user-defined workgroups are purged. Next, use the NEWWG command to invoke a more appropriate workgroup configuration (or you can choose to stay with just the five default workgroups). Finally, enter the command START or START MULTI-USER to bring the system up in multi-user mode.
