STREAMS/UX for the HP 9000 Reference Manual > Chapter 4 STREAMS/UX Multiprocessor Support

Writing MP Scalable Modules and Drivers

Overview of STREAMS/UX MP Support

HP-UX STREAMS supports MP scalable drivers and modules. You can configure the amount of parallelism for modules and drivers. Pick a level which is consistent with a module's or driver's use of shared data structures. STREAMS/UX provides five levels of parallelism which are called queue, queue pair, module, elsewhere, and global. They are described below. Also, STREAMS provides extra synchronization for module and driver open and close functions. This synchronization is also described below. The term module is used in this discussion to mean both modules and drivers, unless otherwise stated.

Figure 4-1 “Understanding STREAMS/UX MP Support” is useful for understanding STREAMS/UX MP support. The diagram shows four streams, ECHO-A, ECHO-B, DLPI-A and SAD-A. ECHO-A and ECHO-B both contain the echo driver. DLPI-A contains dlpi, and SAD-A has sad. Each driver contains a read and a write queue. echo_rput and echo_rsrv operate on an echo driver's read queue. echo_wput and echo_wsrv access the write queue. The dlpi and sad driver functions are similar to the echo driver functions. STREAMS/UX executes echo, dlpi, and sad driver functions differently depending on the MP synchronization level configured for the drivers.

Figure 4-1 Understanding STREAMS/UX MP Support

The queue synchronization level provides the most concurrency. It serializes access to a queue so that only one function at a time can manipulate the queue. Applications can take advantage of multiple processors because functions that operate on different queues run in parallel. For example, assume that the echo driver in Figure 4-1 “Understanding STREAMS/UX MP Support” uses queue synchronization. STREAMS/UX does not run ECHO-A's echo_rput and echo_rsrv in parallel. Also, STREAMS/UX does not execute ECHO-A's echo_wput and echo_wsrv concurrently. However, STREAMS/UX can run ECHO-A's echo_rput at the same time as ECHO-A's echo_wput. STREAMS/UX allows ECHO-A's read queue functions to run in parallel with ECHO-A's write queue routines. Also, any of ECHO-A's procedures can run at the same time as ECHO-B, DLPI-A or SAD-A routines. If a module uses queue synchronization, a queue's put and service routines can easily share data with each other because STREAMS/UX does not execute the routines concurrently.

The queue pair synchronization level serializes access to a read and write queue pair so that only one of the queue pair's functions can run at a time. Queue pair synchronization still allows concurrency because functions for different queue pairs run in parallel. (A queue pair is also known as a module instance.) For example, assume that the echo driver in Figure 4-1 “Understanding STREAMS/UX MP Support” is configured for queue pair synchronization. STREAMS/UX runs ECHO-A's echo_rput, echo_rsrv, echo_wput, and echo_wsrv one at a time. In other words, STREAMS/UX does not execute any of ECHO-A's echo driver functions concurrently with one another, and the same is true of ECHO-B's echo driver functions. However, STREAMS/UX can run an ECHO-A function at the same time as an ECHO-B function. Also, any of ECHO-A's functions can run in parallel with DLPI-A or SAD-A routines. If a module uses queue pair synchronization, a queue pair's functions run one at a time and can share data.

The module synchronization level serializes access to all of a module's queue pairs or instances. STREAMS/UX runs only one function at a time for all of a module's queue pairs. However, STREAMS/UX runs functions for different modules in parallel. Modules are different if they have different master file entries. For example, timod and tirdwr are different modules. Assume that the echo driver in Figure 4-1 “Understanding STREAMS/UX MP Support” is configured for module synchronization. STREAMS/UX does not run echo driver functions in ECHO-A and ECHO-B in parallel. However, STREAMS/UX can run an echo driver function at the same time as a dlpi or sad driver function. Because STREAMS/UX allows only one function for all of a module's queue pairs to run at a time, the module's queue pairs can share data.

The elsewhere synchronization level serializes a group of different modules. STREAMS/UX runs only one function at a time for the group of modules. STREAMS/UX runs functions in different groups concurrently. Suppose the echo and dlpi drivers in Figure 4-1 “Understanding STREAMS/UX MP Support” are configured to be members of an elsewhere synchronization group. Also, assume the sad driver is configured to be in a different elsewhere group. Only one driver function in ECHO-A, ECHO-B and DLPI-A can run at a time. However, a function in ECHO-A, ECHO-B or DLPI-A can run in parallel with a function in SAD-A. Also, a function in ECHO-A, ECHO-B or DLPI-A can run at the same time as a function in a module which uses a different synchronization level than elsewhere. The modules in a group can share data.

The global synchronization level does not provide parallelism within STREAMS/UX. Only one module out of those configured for global synchronization can run at a time. Suppose that in Figure 4-1 “Understanding STREAMS/UX MP Support”, the echo, dlpi and sad drivers use global synchronization. Only one driver function in ECHO-A, ECHO-B, DLPI-A and SAD-A can run at a time. However, one of these drivers could run in parallel with a module configured for a synchronization level other than global. All modules configured with global synchronization can share data.
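
The rules for the five levels can be condensed into a small model. The following user-space C sketch is not STREAMS/UX code. The type and function names (struct routine, may_run_in_parallel) are invented for illustration, and each routine is described only by its module name, configured level, elsewhere sync name, stream, and queue side. Under those assumptions it answers one question: may two put or service routines run at the same time?

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* The five synchronization levels, from most to least concurrent. */
enum sync_level {
    LVL_QUEUE, LVL_QPAIR, LVL_MODULE, LVL_ELSEWHERE, LVL_GLOBAL
};

/* One put or service routine running on one queue (illustrative model). */
struct routine {
    const char *module;     /* master file name, for example "echo" */
    enum sync_level level;  /* level configured for that module */
    const char *sync_name;  /* elsewhere group name, "" otherwise */
    int stream;             /* identifies the queue pair (module instance) */
    bool read_side;         /* true: read queue, false: write queue */
};

/* Return true if the model says a and b may run at the same time. */
bool may_run_in_parallel(const struct routine *a, const struct routine *b)
{
    bool same_module = strcmp(a->module, b->module) == 0;

    /* A module has exactly one configured level. */
    assert(!same_module || a->level == b->level);

    /* Global: all modules configured global exclude each other. */
    if (a->level == LVL_GLOBAL && b->level == LVL_GLOBAL)
        return false;

    /* Elsewhere: members of the same named group exclude each other. */
    if (a->level == LVL_ELSEWHERE && b->level == LVL_ELSEWHERE &&
        strcmp(a->sync_name, b->sync_name) == 0)
        return false;

    /* Different modules are not serialized against each other otherwise. */
    if (!same_module)
        return true;

    switch (a->level) {
    case LVL_MODULE:        /* one function at a time for the whole module */
        return false;
    case LVL_QPAIR:         /* one function at a time per queue pair */
        return a->stream != b->stream;
    case LVL_QUEUE:         /* one function at a time per queue */
        return a->stream != b->stream || a->read_side != b->read_side;
    default:                /* inconsistent description: be conservative */
        return false;
    }
}
```

With the echo driver described at queue level, the model reports that ECHO-A's echo_rput and echo_wput may run in parallel but echo_rput and echo_rsrv may not, which matches the queue synchronization example above. Open and close routines are outside this model.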

The STREAMS/UX synchronization levels also apply to open and close. For example, if a module is configured for queue pair synchronization, none of the put or service routines for the queue pair can run at the same time as the queue pair's open or close. Also, open cannot run at the same time as close. The least amount of protection that STREAMS/UX provides for opens and closes is queue pair. Even if a module is configured with queue synchronization, it will run as if it were configured with queue pair synchronization during opens and closes.

STREAMS/UX provides additional protection for opens and closes. STREAMS/UX executes only one open or close across all streams at a time. For example, in Figure 4-1 “Understanding STREAMS/UX MP Support”, if STREAMS/UX is executing the ECHO-A echo driver's open routine, the DLPI-A dlpi open cannot run, nor can any other module's or driver's open or close. An exception occurs if an open or close sleeps. When this happens, other opens and closes can run. An open or close function that sleeps may need to use a spinlock together with the get_sleep_lock, SV_WAIT or SV_WAIT_SIG utilities to avoid missed wakeups. These utilities are described in the "HP-UX Modifications to STREAMS/UX Utilities" section in Chapter 3. Also, SV_WAIT and SV_WAIT_SIG are discussed in the SVR4.2 Driver manual.

STREAMS does not synchronize the running of timeout and bufcall callback functions with modules and drivers. This chapter lists some restrictions on what these callback functions can do.

Suggestions for Designing MP Scalable Modules and Drivers

This section contains recommendations for designing MP scalable modules and drivers:

  • Modules and drivers that run over UP emulation hardware drivers must run under UP emulation. Before changing STREAMS/UX modules and drivers to be MP scalable, modify hardware drivers to be MP scalable.

  • You can improve the performance of modules and drivers by using the elsewhere synchronization level. Configure all modules and drivers in a subsystem to be in the same group. They can all share data. However, STREAMS/UX will not synchronize bufcall and timeout callback functions or any non-STREAMS/UX code with the modules or drivers. You may be able to use the streams_put utility described in Chapter 3. In general, UP emulation provides more protection for bufcall, timeout, and non-STREAMS functions.

  • To change modules and drivers to be MP scalable, analyze how the code shares data structures. Determine which structures are shared and which module and driver entry points read and write to the structures. Using this information, choose synchronization levels for modules and drivers that correctly serialize access to shared data.

  • If all modules and drivers of a product share the same structure, consider changing the module and driver data structures and algorithms to allow for more parallelism. Alternatively, consider using spinlocks to protect shared structures that are accessed infrequently or for short amounts of time. Using spinlocks is a good way to protect structures which are not accessed on the main read and write paths. You can either use the native HP-UX spinlock primitives or the SVR4 MP LOCK, TRYLOCK, UNLOCK, LOCK_ALLOC and LOCK_DEALLOC utilities. The SVR4 MP utilities are discussed under "HP-UX Modifications to STREAMS/UX Utilities" in Chapter 3 and in the SVR4.2 Driver manual.

  • Use service routines only for flow control, recovering from resource shortages or executing interrupt completions in a process context. Service routines degrade performance.

  • Be careful when writing timeout and bufcall callback functions, as well as non-STREAMS code that calls STREAMS/UX utilities or shares data with modules and drivers. See the "Guidelines for MP Scalable Modules and Drivers" section.
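
The spinlock suggestion above can be pictured with a user-space analogy. The sketch below is not kernel code: it uses a C11 atomic_flag as a stand-in for a kernel spinlock, where a module would instead use the native HP-UX primitives or the SVR4 MP LOCK and UNLOCK utilities. The structure and function names (shared_stats, stats_note_drop, stats_read) are hypothetical. It shows a structure that lies off the main read and write paths and is held only for a few instructions at a time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Rarely updated data shared by several modules (hypothetical example).
 * Initialize with: struct shared_stats s = { ATOMIC_FLAG_INIT, 0 }; */
struct shared_stats {
    atomic_flag lock;       /* stand-in for a kernel spinlock */
    long msgs_dropped;      /* protected by lock */
};

/* Spin until the flag is acquired. */
static void stats_lock(struct shared_stats *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;   /* busy-wait; the critical sections below are very short */
}

/* Release the flag so another processor can enter. */
static void stats_unlock(struct shared_stats *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Record one dropped message; the lock is held only for the increment. */
void stats_note_drop(struct shared_stats *s)
{
    assert(s != NULL);
    stats_lock(s);
    s->msgs_dropped++;
    stats_unlock(s);
}

/* Read the counter under the same lock. */
long stats_read(struct shared_stats *s)
{
    long value;

    assert(s != NULL);
    stats_lock(s);
    value = s->msgs_dropped;
    stats_unlock(s);
    return value;
}
```

Because the lock covers only the counter, the modules that use it could still run at a more concurrent synchronization level for their main data paths.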

Configuring MP Scalable Modules and Drivers

This section describes how to configure MP scalable modules and drivers.

MP Scalable Module and Driver Configuration

If you want a module or driver to be MP scalable, you must specify additional configuration parameters. You need to:

  • Add a flag indicating that the module or driver is MP scalable

  • Add a keyword which specifies the synchronization level the module or driver uses

  • Add a sync name if the module or driver requires elsewhere synchronization

    The sync name indicates which modules and drivers belong to a group. Choose a sync name of eight characters or fewer, and configure the same name for each member of the group. See Chapter 5 for more information about configuring STREAMS/UX modules and drivers.

Master File $DEVICE Table Configuration

To configure an MP scalable module or driver using a master file $DEVICE table entry, add the 0x10000 (MGR_IS_MP) flag to the mask value. Also add an entry to the master file $STREAMS_DVR_SYNC table. This entry contains the module or driver's name, a keyword specifying the synchronization level, and a sync name if the module or driver requires elsewhere synchronization. There are five synchronization level keywords: sync_global, sync_elsewhere, sync_module, sync_qpair, and sync_queue. The STREAMS/UX master file contains a list of the valid keywords in the $STREAMS_SYNC_LEVEL table. The examples below show $DEVICE and $STREAMS_DVR_SYNC table entries.

* name    handle       type    mask    block   char
*
$DEVICE
strlog loginfo 21 120FC -1 73 /* Added 0x10000 to mask */
dlpi dlpiinfo 21 120FC -1 119 /* Added 0x10000 to mask */
tirdwr tirdwrinfo 40 12000 -1 -1 /* Added 0x10000 to mask */
A Ainfo 40 12000 -1 -1 /* Added 0x10000 to mask */
B Binfo 40 12000 -1 -1 /* Added 0x10000 to mask */
C Cinfo 40 12000 -1 -1 /* Added 0x10000 to mask */
D Dinfo 21 120FC -1 116 /* Added 0x10000 to mask */
$$$
* name sync level sync name
*
$STREAMS_DVR_SYNC
strlog sync_module /* Added sync level */
dlpi sync_qpair /* Added sync level */
tirdwr sync_queue /* Added sync level */
A sync_elsewhere ABsync /* Added sync level & name */
B sync_elsewhere ABsync /* Added sync level & name */
C sync_elsewhere netsync /* Added sync level & name */
D sync_elsewhere netsync /* Added sync level & name */
$$$
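
The arithmetic and naming rules behind these table entries can be checked with a short user-space C sketch. It is only an illustration: the function names (mask_with_mp, sync_entry_ok) and the macro names are invented here, the flag value 0x10000 and the eight-character limit are taken from the text, and the keyword list is the one given above. The sketch does not judge a sync name supplied with a level other than elsewhere, because the text does not say how such an entry is treated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MGR_IS_MP_FLAG 0x10000u /* value the text gives for MGR_IS_MP */
#define SYNC_NAME_MAX  8        /* sync names: eight characters or fewer */

/* Add the MP scalable flag to a $DEVICE mask value. */
unsigned int mask_with_mp(unsigned int mask)
{
    return mask | MGR_IS_MP_FLAG;
}

/* Check one $STREAMS_DVR_SYNC entry: a keyword plus an optional sync name. */
bool sync_entry_ok(const char *keyword, const char *sync_name)
{
    static const char *const keywords[] = {
        "sync_global", "sync_elsewhere", "sync_module",
        "sync_qpair", "sync_queue"
    };
    bool known = false;
    bool has_name;
    size_t i;

    assert(keyword != NULL);

    /* The keyword must be one of the five listed levels. */
    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(keyword, keywords[i]) == 0)
            known = true;
    if (!known)
        return false;

    /* Elsewhere needs a non-empty sync name within the length limit. */
    has_name = sync_name != NULL && sync_name[0] != '\0';
    if (strcmp(keyword, "sync_elsewhere") == 0)
        return has_name && strlen(sync_name) <= SYNC_NAME_MAX;

    /* Other levels do not need a sync name. */
    return true;
}
```

For example, mask_with_mp(0x20FC) yields 0x120FC, the mask shown for strlog, dlpi, and D in the $DEVICE table above.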

Module and Driver Install Function Configuration

If a module or driver is configured using an install function, add the MGR_IS_MP flag to the inst_flags field in the streams_info_t structure. Also, if you are configuring a driver, set the DRV_MP_SAFE flag in the drv_info_t structure. Specify a synchronization level in the inst_sync_level field. The possible values are SQLVL_GLOBAL, SQLVL_ELSEWHERE, SQLVL_MODULE, SQLVL_QUEUEPAIR and SQLVL_QUEUE. If the module or driver is using the elsewhere synchronization level, add a sync name to the inst_sync_info field. Note that a module or driver which uses an install function for configuration needs an entry in the master file $DRIVER_INSTALL table. (Do not put an entry in the $DEVICE table if an install function is used.) The examples below show MP scalable module and driver install functions.

STRLOG DRIVER

static drv_info_t strlog_drv_info = {   /* driver information */
    "strlog",                   /* name */
    "pseudo",                   /* class */
    DRV_CHAR | DRV_PSEUDO |     /*NOTE* DRV_MP_SAFE flag specified */
    DRV_MP_SAFE,
    -1,                         /* block major number */
    73,                         /* character major number */
    NULL, NULL, NULL,           /* cdio, gio_private, and cdio_private structures */
};

static drv_ops_t strlog_drv_ops = {     /* driver entry points */
    NULL,                       /* open */
    NULL,                       /* close */
    NULL,                       /* strategy */
    NULL,                       /* dump */
    NULL,                       /* psize */
    NULL,                       /* mount */
    NULL,                       /* read */
    NULL,                       /* write */
    NULL,                       /* ioctl */
    NULL,                       /* select */
    NULL,                       /* option1 */
    NULL, NULL, NULL, NULL,     /* reserved entry points */
    0,                          /* device flags */
};

static streams_info_t strlog_str_info = {   /* streams information */
    "strlog",                   /* name */
    73,                         /* major number */
    {&logrinit, &logwinit, NULL, NULL},     /* streamtab */
    STR_IS_DEVICE | STR_SYSV4_OPEN |        /*NOTE* MGR_IS_MP flag specified */
    MGR_IS_MP,
    SQLVL_MODULE,               /*NOTE* synch level specified */
    "",                         /* elsewhere sync name */
};

int
strlog_install()
{
    int retval;

    if ((retval = install_driver(&strlog_drv_info, &strlog_drv_ops)) != 0)
        return(retval);

    if ((retval = str_install(&strlog_str_info)) != 0) {
        uninstall_driver(&strlog_drv_info);
        return(retval);
    }

    /* success */
    return 0;
}

TIRDWR MODULE

static streams_info_t tirdwr_str_info = {   /* streams information */
    "tirdwr",                   /* name */
    -1,                         /* major number */
    { &rinit, &winit, NULL, NULL },         /* streamtab */
    STR_IS_MODULE | STR_SYSV4_OPEN |        /*NOTE* MGR_IS_MP flag specified */
    MGR_IS_MP,
    SQLVL_QUEUE,                /*NOTE* synch level specified */
    "",                         /* elsewhere sync name */
};

int
tirdwr_install()
{
    return(str_install(&tirdwr_str_info));
}

C MODULE

static streams_info_t c_str_info = {    /* streams information */
    "C",                        /* name */
    -1,                         /* major number */
    { &crinit, &cwinit, NULL, NULL },       /* streamtab */
    STR_IS_MODULE | STR_SYSV4_OPEN |        /*NOTE* MGR_IS_MP flag specified */
    MGR_IS_MP,
    SQLVL_ELSEWHERE,            /*NOTE* synch level specified */
    "netsync",                  /*NOTE* sync name specified */
};

int
C_install()
{
    return(str_install(&c_str_info));
}

D DRIVER

static drv_info_t d_drv_info = {        /* driver information */
    "D",                        /* name */
    "pseudo",                   /* class */
    DRV_CHAR | DRV_PSEUDO |     /*NOTE* DRV_MP_SAFE flag specified */
    DRV_MP_SAFE,
    -1,                         /* block major number */
    -1,                         /* dynamically assigned character major number */
    NULL, NULL, NULL,           /* cdio, gio_private, and cdio_private structures */
};

static drv_ops_t d_drv_ops = {          /* driver entry points */
    NULL,                       /* open */
    NULL,                       /* close */
    NULL,                       /* strategy */
    NULL,                       /* dump */
    NULL,                       /* psize */
    NULL,                       /* mount */
    NULL,                       /* read */
    NULL,                       /* write */
    NULL,                       /* ioctl */
    NULL,                       /* select */
    NULL,                       /* option1 */
    NULL, NULL, NULL, NULL,     /* reserved entry points */
    0,                          /* device flags */
};

static streams_info_t d_str_info = {    /* streams information */
    "D",                        /* name */
    -1,                         /* dynamically assigned major number */
    { &drinit, &dwinit, NULL, NULL },       /* streamtab */
    STR_IS_DEVICE | STR_SYSV4_OPEN |        /*NOTE* MGR_IS_MP flag specified */
    MGR_IS_MP,
    SQLVL_ELSEWHERE,            /*NOTE* synch level specified */
    "netsync",                  /*NOTE* sync name specified */
};

int
D_install()
{
    int retval;

    /* Configure driver and obtain dynamically assigned major number. */
    if ((retval = install_driver(&d_drv_info, &d_drv_ops)) != 0)
        return(retval);

    /* Configure streams specific parameters. */
    if ((retval = str_install(&d_str_info)) != 0) {
        uninstall_driver(&d_drv_info);
        return(retval);
    }

    /* Success */
    return 0;
}

Configuring the NSTRSCHED Tunable

STREAMS/UX provides a new tunable, NSTRSCHED, which allows you to set the number of STREAMS/UX scheduler daemons running on a multiprocessor system. The default value is 0, which indicates that STREAMS/UX will determine the number of daemons based on the number of processors in the system. The minimum value is 0 and the maximum is 32.

You should leave NSTRSCHED set to the default value. STREAMS/UX will set the number of daemons based on the number of processors in the system. STREAMS/UX will create fewer daemons than there are processors. There is no benefit to creating more daemons than processors. You might want to increase the value of NSTRSCHED if the system does a lot of STREAMS/UX processing or decrease it if the system does very little STREAMS/UX work. You can determine the number of scheduler daemons running on the system by executing the ps -ef command, and counting the number of smpsched processes.
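
The counting step can also be done by a program that has captured the ps -ef output as text. The following C sketch is a minimal illustration with an invented function name (count_daemon_lines); it assumes that each scheduler daemon appears on its own line containing the string smpsched, and it counts any line that contains that string, so a line for an unrelated command that merely mentions smpsched would be counted too.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the lines of captured ps output that mention the given name. */
int count_daemon_lines(const char *ps_output, const char *name)
{
    size_t name_len;
    int count = 0;
    const char *line = ps_output;

    assert(ps_output != NULL && name != NULL);
    name_len = strlen(name);
    assert(name_len > 0);

    while (*line != '\0') {
        /* Find the end of the current line (newline or end of text). */
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) : strlen(line);
        size_t i;

        /* Look for the name anywhere within this line. */
        for (i = 0; i + name_len <= len; i++) {
            if (memcmp(line + i, name, name_len) == 0) {
                count++;
                break;      /* count each line at most once */
            }
        }

        /* Advance to the start of the next line. */
        line += len;
        if (*line == '\n')
            line++;
    }
    return count;
}
```

Calling count_daemon_lines(text, "smpsched") on the captured output gives the number of matching lines, which can then be compared with the number of processors.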

Guidelines for MP Scalable Modules and Drivers

  • It is easier to develop STREAMS/UX-based software that runs completely MP scalable or completely under UP emulation. Try to avoid mixing MP scalable and UP emulation modules and drivers in the same stream or multiplexor.

  • MP scalable STREAMS/UX modules and drivers cannot call UP emulation software. A put or service routine cannot acquire the I/O semaphore because put and service routines cannot block. This means, for example, that modules and drivers which run over a UP emulation hardware driver must run under UP emulation.

  • Modules and drivers which can run both MP scalable and under UP emulation must use queue or queue pair synchronization. An example of an MP scalable module which can run in UP emulation mode is timod. Although timod is configured to be MP scalable, it is pushed onto many streams, some of which run in UP emulation mode.

  • The MP scheduler runs differently from the uniprocessor scheduler. This may affect STREAMS/UX application programs. On multiprocessor systems, the scheduler may not run a service routine before the process which scheduled the routine returns to user level.

  • A module or driver's synchronization level determines the entities with which it can share data. It also determines the entities with which it can share its STREAMS/UX queues. For example, if a module uses queue pair synchronization, the write put routine can call insq to insert a message onto the module's read queue. But, if the module uses queue synchronization, the write put routine can only call insq to insert messages onto the write queue. The synchronization level determines which queues a module or driver can pass to STREAMS/UX utilities.

    In general, a put or service routine can only pass its own queue or queues belonging to entities with which it can share data. The restricted utilities are backq, bcanputnext, canputnext, flushband, flushq, freezestr, getq, insq, putbq, putnext, putnextctl, putnextctl1, putnextctl2, putq, qreply, qsize, rmvq, SAMESTR, strqget, strqset and unfreezestr. The putq utility is not restricted when it is passed a driver's read queue or a lower mux's write queue. Any put or service routine can call putq if it passes a driver's read queue or a lower mux's write queue. However, putq's caller must guarantee that the queue passed in is still allocated.

    Some STREAMS/UX utilities, such as canput, are commonly passed a parameter of the form q->q_next. These routines are restricted in a different way from those listed above. A put or service routine can only pass its own queue's q_next field or the q_next field of queues belonging to entities with which it can share data. These requirements apply to bcanput, canput, put, putctl, putctl1, putctl2, and streams_put. These utilities are not restricted when they are passed a parameter of the form q, except that the queue must still be allocated.

  • Some restrictions exist for timeout and bufcall callback routines as well as non-STREAMS/UX code in the kernel. This software cannot share data structures with STREAMS/UX modules and drivers, unless spinlocks are used to protect critical sections. Also, the code cannot call the following utilities: backq, bcanputnext, canputnext, flushband, flushq, freezestr, getq, insq, putbq, putnext, putnextctl, putnextctl1, putnextctl2, qreply, qsize, rmvq, SAMESTR, strqget, strqset, and unfreezestr.

    Callback routines and non-STREAMS code cannot call bcanput, canput, put, putctl, putctl1, putctl2 or streams_put if they pass the utility a parameter of the form q->q_next. They can call these utilities if they pass a parameter of the form q (q must be a valid, allocated queue). Callback and non-STREAMS code can call putq only if they pass it a driver's read queue or a lower mux's write queue. Callback and non-STREAMS code can use the new streams_put utility documented in the section "HP-UX Modifications to STREAMS/UX Utilities" in Chapter 3.

  • Some restrictions exist on free routines passed to esballoc. A free routine can call STREAMS/UX utilities in the same way as the put or service routine that calls freeb. A free routine can access the same data structures as the put or service routine that calls freeb.

  • A protect_q parameter can be passed to the weldq utility. The protect_q parameter specifies which queue the func parameter can access safely. The func function can use the same STREAMS/UX utilities as the protect_q put and service routines. Also, the function can access the same data structures as the protect_q put and service routines.

  • put and service routines cannot be called directly. They must be executed by calling STREAMS/UX utilities such as putnext, put, putq, or qenable. They cannot be called using the function pointer stored in the q_qinfo structure.

  • STREAMS/UX applications in which multiple processes access the same stream need to know how STREAMS/UX will synchronize operations on the stream. See "Multiple Processes Accessing the Same Stream" in Chapter 3.

  • Modules and drivers can allocate their own spinlocks to protect data structures. If they do, they should use the lock orders reserved for them in /usr/include/sys/semglobal.h or /usr/conf/h/semglobal.h: STREAMS_USR1_LOCK_ORDER, STREAMS_USR2_LOCK_ORDER, and STREAMS_USR3_LOCK_ORDER.

    The lock order is passed in the order parameter of the native HP-UX alloc_spinlock primitive and the hierarchy parameter of the SVR4 MP LOCK_ALLOC utility. The HP-UX kernel uses this information to check for deadlocks when the kernel is compiled with SEMAPHORE_DEBUG. When a module acquires a spinlock, the spinlock's order must be higher than the order of any spinlocks the module already holds. Modules and drivers cannot hold spinlocks when calling some STREAMS/UX utilities. See Table 4-1 “Holding Module or Driver Defined Spinlocks While Calling Utilities” at the end of this chapter for more information. See the SVR4.2 Driver manual for more information about SVR4 MP hierarchies.

  • To reduce contention and improve performance, you should minimize the amount of time that modules and drivers hold spinlocks.

  • To improve performance, modules and drivers should verify that they are actually running on a multiprocessor system before calling the HP-UX native spinlock primitives. The SVR4 MP LOCK and UNLOCK routines described in Chapter 3 do this for the caller. If a spinlock is being used only to protect against software running on other processors, but not interrupts, modules or drivers can call the MP_SPINLOCK and MP_SPINUNLOCK macros in /usr/include/sys/spinlock.h (or /usr/conf/h/spinlock.h). These macros obtain the requested spinlock only if they are executing on a multiprocessor system. If a spinlock is being used to protect against both software running on other processors and interrupts, modules and drivers should check the uniprocessor flag and raise the spl level if they are running on a uniprocessor system. Example code is shown below.

    if (uniprocessor)
        x = splstr();
    else
        spinlock(mylock);

  • Be careful when choosing a multiplexor's synchronization level. When a driver is linked under a mux, STREAMS/UX changes the driver's Stream head to be the lower mux. STREAMS/UX uses the upper mux's synchronization level for the lower mux. So if the upper mux uses global, elsewhere, or module synchronization, the lower and upper muxes can share data. If the upper mux uses queue or queue pair synchronization, the lower and upper muxes cannot share data.

    The synchronization level also influences how messages can be passed across the mux. If the upper mux uses global, elsewhere, or module synchronization, it can pass messages downward by passing the lower mux's write queue to putq, put, or putnext. Likewise, the lower mux can pass messages upward by passing the upper mux's read queue to putq, put, or putnext. If the upper mux uses queue or queue pair synchronization, it can only use putq and put to pass messages to the lower mux. To use putnext, the upper mux must ensure that the driver stays linked under the mux until after the putnext completes. Also, the lower mux can only use putq and put to pass messages to the upper mux. To use putnext, the lower mux must guarantee that the driver stays linked under the mux, that the mux stays open, and that modules are not pushed or popped until after the putnext completes.

    No matter which utility is used to pass messages across the mux, you must make sure that the queues passed to the utilities are still allocated. You may also want to check that the driver is still linked under the mux.

  • Follow the design guidelines in the SVR4.2 STREAMS manual. The guidelines are located at the end of these chapters: Overview of STREAMS Modules and Drivers, STREAMS Modules, STREAMS Drivers, and STREAMS Multiplexing. For STREAMS/UX, you do not need to follow some of these guidelines. However, if you ignore them, your software will not be portable to SVR4 STREAMS. For HP-UX STREAMS, you do not need to call qprocson or qprocsoff as you do for SVR4 MP STREAMS. Also, you can use synchronization levels to protect module and driver private structures instead of SVR4 MP locks and synchronization primitives. Lastly, you do not need to use SVR4 MP canputnext and bcanputnext instead of canput and bcanput on STREAMS/UX.
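
The spinlock ordering guideline above (a newly acquired spinlock must have a higher order than any spinlock already held) can be modeled in a few lines of user-space C. This is a sketch of the rule only, not of the kernel's SEMAPHORE_DEBUG checking: the names held_locks, order_allows, note_acquire, and note_release are invented, and the numeric orders used with it are arbitrary placeholders rather than the values of STREAMS_USR1_LOCK_ORDER and its companions in semglobal.h.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_HELD 8  /* arbitrary limit for this model */

/* Orders of the spinlocks one thread of control currently holds. */
struct held_locks {
    int order[MAX_HELD];
    int count;
};

/* True if a lock of new_order is higher than every order already held. */
bool order_allows(const struct held_locks *h, int new_order)
{
    int i;

    for (i = 0; i < h->count; i++)
        if (new_order <= h->order[i])
            return false;
    return true;
}

/* Record an acquisition; return false and record nothing on a violation. */
bool note_acquire(struct held_locks *h, int new_order)
{
    assert(h->count >= 0 && h->count <= MAX_HELD);
    if (h->count == MAX_HELD || !order_allows(h, new_order))
        return false;
    h->order[h->count++] = new_order;
    return true;
}

/* Forget the most recently acquired lock. */
void note_release(struct held_locks *h)
{
    assert(h->count > 0);
    h->count--;
}
```

In this model, acquiring order 10 and then order 20 is accepted, while attempting order 15 with 20 still held is rejected. Releasing the order 20 lock first makes the order 15 acquisition acceptable.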

© 1995 Hewlett-Packard Development Company, L.P.