HP-UX Process Management: White Paper > Chapter 1 Process Management

Process Scheduling

To understand how threads of a process run, we have to understand how they are scheduled. Although processes appear to the user to run simultaneously, in fact a single processor is executing only one thread of execution at any given moment.

Several factors contribute to process scheduling:

  • Kind of scheduling policy required -- timeshare or real-time. Scheduling policy governs how the process (or thread of execution) interacts with other processes (or threads of execution) at the same priority.

  • Choice of scheduler. Four schedulers are available: HP-UX timeshare scheduler (SCHED_HPUX), HP Process Resource Manager (a timeshare scheduler), HP-UX real-time scheduler (HPUX_RTPRIO), and the POSIX-compliant real-time scheduler.

  • Priority of the process. Priority denotes the relative importance of the process or thread of execution.

  • Run queues from which the process is scheduled.

  • Kernel routines that schedule the process.

Scheduling Policies

HP-UX scheduling is governed by a policy, either timeshare or real-time, that reflects how urgently the CPU is needed. The following table compares the two policies in very general terms.

Table 1-20 Comparison of Timeshare vs Real-time scheduling

  • Timeshare: Typically implemented round-robin.

    Real-time: Implemented as either round-robin or first-in-first-out (FIFO), depending on scheduler.

  • Timeshare: Kernel lowers priority when process is running; that is, timeshare priorities decay. As you use CPU, your priority becomes weaker. As you become starved for CPU, your priority becomes stronger. Scheduler tends to regress toward the mean.

    Real-time: Priority not adjusted by kernel; that is, real-time priorities are non-decaying. If a real-time priority is set at 50 and another real-time priority is set at 40 (where 40 is stronger than 50), the process or thread of priority 40 will always be more important than the process or thread of priority 50.

  • Timeshare: Runs in timeslices that can be preempted by process running at higher priority.

    Real-time: Runs until exits or is blocked. Always runs at higher priority than timeshare.

The principle behind the distribution of CPU time is called a timeslice. A timeslice is the amount of time a process can run before the kernel checks to see if there is an equal or stronger priority process ready to run.

  • If a timeshare policy is implemented, a process might begin to run and then relinquish the CPU to a process with a stronger priority.

  • Real-time processes running round-robin typically run until they are blocked or relinquish CPU after a certain timeslice has occurred.

    Real-time processes running FIFO run until completion, without being preempted.
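The check made at the end of a timeslice can be sketched in C. This is only an illustration of the rule stated above, not kernel code: the function names are hypothetical, and the sketch assumes the HP-UX internal convention in which a smaller number is a stronger priority.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed convention: a smaller number is a stronger priority. */
static bool stronger_or_equal(int a, int b)
{
    return a <= b;
}

/* Hypothetical helper: when a timeslice expires, the running thread gives
 * up the CPU only if some ready thread has an equal or stronger priority. */
static bool yield_at_timeslice_end(int running_pri, bool any_ready,
                                   int best_ready_pri)
{
    assert(running_pri >= 0 && best_ready_pri >= 0);
    if (!any_ready)
        return false;   /* nothing else is runnable, keep running */
    return stronger_or_equal(best_ready_pri, running_pri);
}
```

For example, a thread running at priority 200 yields to a ready thread at 200 or 190, but keeps the CPU if the best ready thread is at 210.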

Scheduling policies act upon sets of thread lists, one thread list for each priority. Any runnable thread may be in any thread list. Multiple scheduling policies are provided. Each nonempty list is ordered, and contains a head (th_link) as one end of its order and a tail (th_rlink) as the other. The purpose of a scheduling policy is to define the allowable operations on this set of lists (for example, moving threads between and within lists).
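The set of thread lists can be pictured with a small C sketch. It is a simplified model, not the HP-UX implementation: only the link names th_link and th_rlink come from the text, while the struct layout, the NPRI constant, and the function names are assumptions made for the example. Index 0 is treated as the strongest priority.

```c
#include <assert.h>
#include <stddef.h>

#define NPRI 8  /* hypothetical number of priority levels */

/* Hypothetical thread record; only th_link/th_rlink are named in the text. */
struct thread {
    int id;
    struct thread *th_link;   /* next thread, toward the tail */
    struct thread *th_rlink;  /* previous thread, toward the head */
};

/* One circular list per priority. The sentinel's th_link is the head of
 * the list and its th_rlink is the tail. */
struct thread_list {
    struct thread head;
};

static void lists_init(struct thread_list q[NPRI])
{
    for (int i = 0; i < NPRI; i++) {
        q[i].head.id = -1;
        q[i].head.th_link = &q[i].head;
        q[i].head.th_rlink = &q[i].head;
    }
}

static int list_empty(const struct thread_list *l)
{
    return l->head.th_link == &l->head;
}

/* Append a runnable thread at the tail of the list for priority pri. */
static void enqueue_tail(struct thread_list q[NPRI], int pri, struct thread *t)
{
    assert(pri >= 0 && pri < NPRI);
    struct thread *sentinel = &q[pri].head;
    t->th_link = sentinel;
    t->th_rlink = sentinel->th_rlink;
    sentinel->th_rlink->th_link = t;
    sentinel->th_rlink = t;
}

/* Remove and return the head of the strongest nonempty list, or NULL. */
static struct thread *pick_next(struct thread_list q[NPRI])
{
    for (int pri = 0; pri < NPRI; pri++) {
        if (!list_empty(&q[pri])) {
            struct thread *t = q[pri].head.th_link;
            q[pri].head.th_link = t->th_link;
            t->th_link->th_rlink = &q[pri].head;
            t->th_link = t->th_rlink = NULL;
            return t;
        }
    }
    return NULL;
}
```

If thread a is queued at priority 5 and threads b and c are then queued at priority 2, successive calls to pick_next() return b, c, and finally a.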

Each thread is controlled by an associated scheduling policy and priority. Applications can specify these parameters by explicitly executing the sched_setscheduler() or sched_setparam() functions.
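As a rough sketch of how an application might request a policy and priority, the following C fragment uses only the standard POSIX calls sched_get_priority_min(), sched_get_priority_max(), and sched_setscheduler(). The helper names priority_in_range() and request_policy() are hypothetical. Changing to a real-time policy normally requires appropriate privileges, and which policies are accepted depends on the platform.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/* Return 1 if prio lies within the range the system reports for policy. */
static int priority_in_range(int policy, int prio)
{
    errno = 0;
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);
    if (errno != 0)
        return 0;   /* the policy is not supported on this system */
    return prio >= lo && prio <= hi;
}

/* Validate the priority, then ask the kernel to apply the policy and
 * priority to process pid (0 means the calling process).
 * Returns 0 on success, or -1 with errno set on failure. */
static int request_policy(pid_t pid, int policy, int prio)
{
    struct sched_param param = { 0 };

    if (!priority_in_range(policy, prio)) {
        errno = EINVAL;
        return -1;
    }
    param.sched_priority = prio;
    return sched_setscheduler(pid, policy, &param) == -1 ? -1 : 0;
}
```

A caller would typically invoke request_policy(0, SCHED_FIFO, prio) and check errno for EPERM when the call fails for lack of privilege.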

Hierarchy of Priorities (overview)

All POSIX real-time priority threads have greater scheduling importance than threads with HP-UX real-time or HP-UX timeshare priority. By comparison, all HP-UX real-time priority threads are of greater scheduling importance than HP-UX timeshare priority threads, but are of lesser importance than POSIX real-time threads. Neither POSIX nor HP-UX real-time threads are subject to degradation.

This will be demonstrated in detail shortly.

Schedulers

As of release 10.0, HP-UX implements four schedulers, two time-share and two real-time.

To choose a scheduler, use the user command rtsched(1), which executes a process with your choice of scheduler and lets you change the real-time priority of a currently executing process, specified by its process ID.

rtsched -s scheduler -p priority command [arguments]
rtsched [ -s scheduler ] -p priority -P pid

Likewise, the system call rtsched(2) provides programmatic access to POSIX real-time scheduling operations.

RTSCHED (POSIX) Scheduler

The RTSCHED POSIX-compliant real-time deterministic scheduler provides three scheduling policies, whose characteristics are compared in the following table.

Table 1-21 RTSCHED policies

Each entry gives the scheduling policy, followed by how it works.

SCHED_FIFO

Strict first in-first out (FIFO) scheduling policy. This policy contains a range of at least 32 priorities. Threads scheduled under this policy are chosen from a thread list ordered according to the time its threads have been in the list without being executed. The head of the list is the thread that has been in the list the longest time; the tail is the thread that has been in the list the shortest time.

SCHED_RR

Round-robin scheduling policy with a per-system time slice (time quantum). This policy contains a range of at least 32 priorities and is identical to the SCHED_FIFO policy with an additional condition: when the implementation detects that a running process has been executing as a running thread for a time period of length returned by the function sched_rr_get_interval(), or longer, the thread becomes the tail of its thread list, and the head of that thread list is removed and made a running thread.

SCHED_RR2

Round-robin scheduling policy, with a per-priority time slice (time quantum). The priority range for this policy contains at least 32 priorities. This policy is identical to the SCHED_RR policy except that the round-robin time slice interval returned by sched_rr_get_interval() depends upon the priority of the specified thread.
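The difference between SCHED_FIFO and SCHED_RR can be modeled with a toy clock-tick routine in C. This is a sketch for a single priority level only: the struct, the enum, and the tick-based quantum are assumptions made for the example, and the real interval would come from sched_rr_get_interval().

```c
#include <assert.h>
#include <string.h>

#define MAX_THREADS 8

enum policy { POLICY_FIFO, POLICY_RR };  /* illustrative stand-ins */

struct rr_list {
    int ids[MAX_THREADS];  /* ids[0] is the head, ids[count-1] the tail */
    int count;             /* number of threads waiting in the list */
    int running;           /* id of the running thread */
    int ran_ticks;         /* ticks the running thread has executed */
    int quantum_ticks;     /* time slice, in ticks */
};

/* Account for one clock tick. Under RR, a thread that has used its whole
 * quantum becomes the tail of the list and the head starts running.
 * Under FIFO the running thread is never rotated.
 * Returns the id of the thread that runs after this tick. */
static int clock_tick(struct rr_list *l, enum policy p)
{
    assert(l->count >= 0 && l->count <= MAX_THREADS);
    l->ran_ticks++;
    if (p == POLICY_RR && l->ran_ticks >= l->quantum_ticks && l->count > 0) {
        int next = l->ids[0];
        memmove(&l->ids[0], &l->ids[1],
                (size_t)(l->count - 1) * sizeof l->ids[0]);
        l->ids[l->count - 1] = l->running;  /* old runner becomes the tail */
        l->running = next;
        l->ran_ticks = 0;
    }
    return l->running;
}
```

With thread 1 running, threads 2 and 3 waiting, and a quantum of two ticks, the RR policy hands the CPU to thread 2 on the second tick and to thread 3 on the fourth, while the FIFO policy leaves thread 1 running.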

 

SCHED_RTPRIO Scheduler

SCHED_RTPRIO is a real-time scheduling policy with nondecaying priorities (like SCHED_FIFO and SCHED_RR). Its priority range lies between those of the POSIX real-time policies and the HP-UX timeshare policy.

For threads executing under this policy, the implementation must use only priorities within the range returned by the functions sched_get_priority_max() and sched_get_priority_min() when SCHED_RTPRIO is provided as the parameter.

NOTE: In the SCHED_RTPRIO scheduling policy, smaller numbers represent higher (stronger) priorities, which is the opposite of the POSIX scheduling policies. This is done to provide continuing support for existing applications that depend on this priority ordering.

The strongest priority in the priority range for SCHED_RTPRIO is weaker than the weakest priority in the priority ranges for any of the POSIX policies, SCHED_FIFO, SCHED_RR, and SCHED_RR2.

SCHED_HPUX Scheduler

The SCHED_OTHER policy, also known as SCHED_HPUX and SCHED_TIMESHARE, provides a way for applications to indicate, in a portable way, that they no longer need a real-time scheduling policy.

For threads executing under this policy, the implementation can use only priorities within the range returned by the functions sched_get_priority_max() and sched_get_priority_min() when SCHED_OTHER is provided as the parameter. Note that for the SCHED_OTHER scheduling policy, like SCHED_RTPRIO, smaller numbers represent higher (stronger) priorities, which is the opposite of the POSIX scheduling policies. This is done to provide continuing support for existing applications that depend on this priority ordering. However, it is guaranteed that the priority range for the SCHED_OTHER scheduling policy is properly disjoint from the priority ranges of all of the other scheduling policies described and the strongest priority in the priority range for SCHED_OTHER is weaker than the weakest priority in the priority ranges for any of the other policies, SCHED_FIFO, SCHED_RR, and SCHED_RR2.

Process Resource Manager

The Process Resource Manager (PRM) is an optional HP-UX product coded into the kernel as fss, or Fair Share Scheduler. This time-share scheduler operates on timeshare processes. Real-time processes (RTPRIO and POSIX real-time) are not affected by the fss, but they do affect it, because the fss allocates portions of the CPU to different groups of processes. Unlike the default SCHED_HPUX scheduler, the Process Resource Manager allows the system administrator to budget CPU time to groups of processes with a high degree of specificity.

The remainder of this section will not be dealing with the ramifications of the PRM.

Scheduling Priorities

All processes have a priority, set when the process is invoked and based on factors such as whether the process is running on behalf of a user or the system and whether the process is created in a time-share or real-time environment.

Associated with each policy is a priority range. The priority ranges for each policy can (but need not) overlap the priority ranges of other policies.

Two separate ranges of priorities exist: a range of POSIX standard priorities and a range of other HP-UX priorities. The POSIX standard priorities are always higher than all other HP-UX priorities.

Processes are chosen by the scheduler to execute a time-slice based on priority. Priorities range from highest priority to lowest priority and are classified by need. The thread selected to run is at the head of the highest priority nonempty thread list.

Internal vs. External Priority Values

With the implementation of the POSIX rtsched, HP-UX priorities are enumerated from two perspectives -- internal and external priority values.

  • The internal value represents the kernel's view of the priority.

  • The external value represents the user's view of the priority, as is visible using the ps(1) command.

In addition, legacy HP-UX priority values are ranked in opposite sequence from POSIX priority values:

  • In the POSIX standard, the higher the priority number, the stronger the priority.

  • In legacy HP-UX implementation, the lower the priority number, the stronger the priority.

The following macros are defined in pm_rtsched.h to enable a program to convert between POSIX and HP-UX priorities and internal to external values:

  • PRI_ExtPOSIXPri_To_IntHpuxPri

    Derives the HP-UX kernel (internal) value from the POSIX priority value that a user passes to the rtsched command.

  • PRI_IntHpuxPri_To_ExtPOSIXPri

    Converts an HP-UX kernel (internal) priority value to a POSIX priority value.

  • PRI_IntHpuxPri_To_ExtHpuxPri

    Converts an HP-UX internal priority value to an HP-UX external priority value.

rtsched_numpri Parameter

A configurable parameter, rtsched_numpri, controls the number of scheduling priorities supported by the POSIX rtsched scheduler. Valid values range from 32 to 512; the default is 32.

Increasing rtsched_numpri provides more scheduling priorities at the cost of increased context switch time, and to a minor degree, increased memory consumption.

Schedulers and Priority Values

There are now four sets of thread priorities: (Internal to External View)

Table 1-22 Scheduler priority values

Type of Scheduler     Internal Values   External Values
POSIX Standard        512 to 480        0 to 31
Real-time             512 to 640        0 to 127
System, timeshare     640 to 689        128 to 177
User, timeshare       690 to 767        178 to 255

NOTE: For the POSIX standard scheduler, the higher the number, the stronger the priority. For the RTPRIO scheduler, the lower the number, the stronger the priority.

The following figure demonstrates the relationship of the three schedulers ranked by priority and strength.

Figure 1-26 Schedulers and Priority values

The following lists categories of priority, from highest to lowest:

  • RTSCHED (POSIX standard) ranks as highest priority range, and is separate from other HP-UX priorities.

    The number of RTSCHED priorities ranges from 32 to 512 (default 32) and can be set by the tunable parameter rtsched_numpri.

  • SCHED_RTPRIO (real-time priority) ranges from 0-127 and is reserved for processes started with rtprio() system calls.

  • Two priority ranges are used in a timeshare environment:

    • User priority (178-255), assigned to user processes in a time-share environment.

    • System priority (128-177), used by system processes in a time-share environment.

The kernel can alter time-share priorities (128-255) but not real-time priorities (0-127).
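The ranges listed above can be expressed as a small C classifier. The enum and function names are hypothetical; the numeric boundaries are the ones given in the list (0-127, 128-177, 178-255).

```c
#include <assert.h>
#include <stdbool.h>

enum hpux_class {
    HPUX_RTPRIO,            /* 0-127: real-time, set with rtprio() */
    HPUX_SYSTEM_TIMESHARE,  /* 128-177: system processes */
    HPUX_USER_TIMESHARE,    /* 178-255: user processes */
    HPUX_INVALID
};

/* Classify an HP-UX priority value in the 0-255 numbering used above. */
static enum hpux_class classify_hpux_priority(int pri)
{
    if (pri < 0 || pri > 255)
        return HPUX_INVALID;
    if (pri <= 127)
        return HPUX_RTPRIO;
    if (pri <= 177)
        return HPUX_SYSTEM_TIMESHARE;
    return HPUX_USER_TIMESHARE;
}

/* Only time-share priorities (128-255) are adjusted by the kernel. */
static bool kernel_may_adjust(int pri)
{
    enum hpux_class c = classify_hpux_priority(pri);
    assert(c != HPUX_INVALID);
    return c == HPUX_SYSTEM_TIMESHARE || c == HPUX_USER_TIMESHARE;
}
```

For instance, kernel_may_adjust(127) is false because 127 is a real-time priority, while kernel_may_adjust(128) is true.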

The following priority values, internal to the kernel, are defined in param.h:

PRTSCHEDBASE

Smallest (strongest) RTSCHED priority

MAX_RTSCHED_PRI

Maximum number of RTSCHED priorities

PRTBASE

Smallest (strongest) RTPRIO priority. Defined as PRTSCHEDBASE + MAX_RTSCHED_PRI.

PTIMESHARE

Smallest (strongest) timeshare priority. Defined as PRTBASE + 128.

PMAX_TIMESHARE

Largest (weakest) timeshare priority. Defined as 127 + PTIMESHARE.

Priorities stronger (smaller number) than or equal to PZERO cannot be signaled. Priorities weaker (bigger number) than PZERO can be signaled.
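The relationships among the param.h values can be checked with a short C sketch. The numeric values of PRTSCHEDBASE and MAX_RTSCHED_PRI are assumptions chosen to agree with the 512 to 767 values in Table 1-22 and with the worked arithmetic at the end of this section; the real header may differ. The helper int_hpux_to_ext_hpux() is hypothetical and assumes that the values from 512 upward are the kernel's internal ones.

```c
#include <assert.h>

#define PRTSCHEDBASE    0                                 /* assumed value */
#define MAX_RTSCHED_PRI 512                               /* assumed value */
#define PRTBASE         (PRTSCHEDBASE + MAX_RTSCHED_PRI)  /* strongest RTPRIO */
#define PTIMESHARE      (PRTBASE + 128)                   /* strongest timeshare */
#define PMAX_TIMESHARE  (127 + PTIMESHARE)                /* weakest timeshare */

/* Compile-time checks of the values the definitions imply. */
static_assert(PRTBASE == 512, "RTPRIO range starts at 512");
static_assert(PTIMESHARE == 640, "timeshare range starts at 640");
static_assert(PMAX_TIMESHARE == 767, "timeshare range ends at 767");

/* Hypothetical helper: map an RTPRIO or timeshare priority in the
 * 512..767 numbering to the 0..255 numbering. */
static int int_hpux_to_ext_hpux(int internal)
{
    assert(internal >= PRTBASE && internal <= PMAX_TIMESHARE);
    return internal - PRTBASE;
}
```

Under these assumptions, 512 maps to 0, 640 maps to 128, and 767 maps to 255.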

RTSCHED Priorities

The following discussion illustrates the HP-UX internal view, based on how the user specifies a priority to the rtsched command. Each available real-time scheduler policy has a range of priorities (default values shown below).

Scheduler Policy    highest priority    lowest priority
SCHED_FIFO          31                  0
SCHED_RR            31                  0
SCHED_RR2           31                  0
SCHED_RTPRIO        0                   127

The user may invoke the rtsched(1) command to assign a scheduler policy and priority. For example,

rtsched -s SCHED_RR -p 31 ls

Within kernel mode sched_setparam() is called to set the scheduling parameters of a process. It (along with sched_setscheduler()) is the mechanism by which a process changes its (or another process') scheduling parameters. Presently the only scheduling parameter is priority, sched_priority.

The sched_setparam() and sched_setscheduler() system calls look up the process associated with the user argument pid, and call the internal routine sched_setcommon() to complete the execution.

sched_setcommon() is the common code for sched_setparam() and sched_setscheduler(). It modifies the thread's scheduling priority and policy. The scheduler information for a thread is kept in its thread structure. It is used by the scheduling code, particularly setrq(), to decide when the thread runs, with respect to the other threads in the system. sched_setcommon() is called with the sched_lock held.

sched_setcommon() calls the macro PRI_ExtPOSIXPri_To_IntHpuxPri, defined in pm_rtsched.h. The priority requested is then converted. Since priorities in HP-UX are stronger for smaller values, and the POSIX specification requires the opposite behavior, we merge the two by running the rtsched priorities from ((MAX_RTSCHED_PRI-1) - rtsched_info.rts_numpri) (strongest) to (MAX_RTSCHED_PRI-1) (weakest).

Based on the macro definition using the value passed by the user, the internal value seen by the kernel is calculated as follows:

((MAX_RTSCHED_PRI - 1) - (ExtP_pri))

With MAX_RTSCHED_PRI equal to 512 and a user-specified priority of 31:

(512 - 1) - 31 = 480

The kernel priority of the user's process is 480. The value of 480 is the strongest priority available to the user.
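The calculation can be reproduced with a few lines of C. These functions are illustrative stand-ins, not the actual macro bodies from pm_rtsched.h: the forward conversion follows the formula shown above, the reverse conversion is assumed to be the same subtraction, and the validity check assumes that external POSIX priorities run from 0 to rtsched_numpri - 1.

```c
#include <assert.h>

#define MAX_RTSCHED_PRI 512  /* value used in the arithmetic above */

/* External POSIX priority (larger is stronger) to internal HP-UX
 * priority (smaller is stronger). */
static int ext_posix_to_int_hpux(int ext_pri)
{
    return (MAX_RTSCHED_PRI - 1) - ext_pri;
}

/* Assumed inverse: internal HP-UX priority back to external POSIX. */
static int int_hpux_to_ext_posix(int int_pri)
{
    return (MAX_RTSCHED_PRI - 1) - int_pri;
}

/* Hypothetical range check for a given rtsched_numpri setting. */
static int ext_posix_valid(int ext_pri, int numpri)
{
    assert(numpri >= 32 && numpri <= MAX_RTSCHED_PRI);
    return ext_pri >= 0 && ext_pri < numpri;
}
```

Here ext_posix_to_int_hpux(31) yields 480 and ext_posix_to_int_hpux(0) yields 511, so with the default of 32 priorities the internal RTSCHED values span 480 (strongest) to 511 (weakest).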