HP-UX MultiProcessing: White Paper > Chapter 1 MultiProcessing

Spinlocks


Spinlocks are at the heart of controlling concurrency within an MP system. Their chief purpose is to protect global data structures by controlling access to critical data. When entering an area of code that modifies a global data structure, the kernel acquires an associated spinlock and then releases it when leaving the affected area of code.

Figure 1-5 Conceptual view of a spinlock


Spinlocks are a more fundamental means of protecting critical sections than semaphores: they are used in the construction of the semaphore services, and the semaphore implementations are themselves critical sections.

spinlock (lock);
 [critical section]
spinunlock (lock);

The spinlock routines operate on a binary flag of type lock_t to guarantee mutual exclusion between threads of control. In addition, spinlock and spinunlock raise and restore the spl priority to mask out external interrupts and prevent preemption:

/* spinlock */
old_priority = raise_priority (HIGHEST_PRIORITY);
while (test_and_clear (lock) == 0)
    ;                                /* spin until the lock is acquired */

/* spinunlock */
lock = 1;                            /* mark the lock free */
restore_priority (old_priority);
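The spin-until-free loop above can be sketched in portable C11 atomics. This is only a sketch: HP-UX itself uses the PA-RISC LDCW instruction and spl priority manipulation, and all names here are illustrative.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal spinlock sketch in C11 atomics (illustrative names). */
typedef struct { atomic_flag busy; } sketch_lock_t;

static bool sketch_trylock(sketch_lock_t *l) {
    /* test_and_set returns the previous state: false means it was free
     * and we now own the lock. */
    return !atomic_flag_test_and_set_explicit(&l->busy, memory_order_acquire);
}

static void sketch_spinlock(sketch_lock_t *l) {
    while (!sketch_trylock(l))
        ;                            /* busy-wait until the lock frees */
}

static void sketch_spinunlock(sketch_lock_t *l) {
    atomic_flag_clear_explicit(&l->busy, memory_order_release);
}
```

The acquire/release memory orderings play the role that interrupt masking and the LDCW semantics play in the kernel: no memory access from the critical section can appear to move outside the lock/unlock pair.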

To avert deadlocks, spinlock acquisition enforces a simple ordering constraint: never attempt to acquire a spinlock whose order is lower than or equal to that of a spinlock already held.
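The ordering constraint can be sketched by giving each lock a fixed order number and refusing any acquisition that does not strictly increase the highest order held. The names and the check below are illustrative, not the HP-UX implementation:

```c
#include <assert.h>

/* Sketch of the lock-ordering rule (illustrative names). */
typedef struct { int order; } ordered_lock_t;

/* Highest order currently held by this thread (0 = none held). */
static int highest_held = 0;

static int acquire_ok(const ordered_lock_t *l) {
    /* Refuse a lower- or equal-order lock while a higher one is held:
     * two threads taking the same locks in opposite orders could
     * otherwise deadlock. */
    if (l->order <= highest_held)
        return 0;
    highest_held = l->order;         /* record the acquisition */
    return 1;
}
```

Because every thread acquires locks in strictly increasing order, no cycle of threads waiting on each other's locks can form.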

Spinlock Rules

The following rules govern use of spinlocks:

  • Spinlocks must be held for as short a time as possible (preferably less than the time it takes to make one context switch).

  • Spinlocks are a non-blocking primitive. Code protected by spinlocks must not generate traps that can block. (Thus, you may not hold a spinlock across an operation that might take a page fault.)

  • Code protected by spinlocks must not cause a context switch. Resources that are never held longer than the time it takes to perform a context switch should be protected with spinlocks. This prevents useless preemption.

  • Spinlocks are used to guarantee access to global data structures by a single thread of execution. Thus, they must be acquired prior to the section of code that accesses the global data structures.

  • When a lock is unavailable, the requesting processor spins until the busy lock is freed.

  • Resources manipulated by an interrupt service routine (ISR) should be protected with a spinlock. ISRs may not block. This applies also to kernel routines that might potentially be called from an ISR.

  • Under spinlocks, interrupts are disabled and the thread of control is not allowed to sleep.

  • Spinlocks can be acquired on the ICS. It is necessary to prevent interrupts when the top half acquires a spinlock, so that an interrupt does not occur and spin for the same spinlock, thus causing a deadlock.

NOTE: MP_SPINLOCK is a macro that checks whether the code is executing on an MP system and calls spinlock() if it is. On a uniprocessor system there is no need to take the spinlock, since only one thread can execute at a time and it will not sleep until it leaves kernel mode.
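The MP_SPINLOCK idea can be sketched as follows, assuming a boot-time CPU count and C11 atomics in place of LDCW. All names here are illustrative, not the HP-UX macro:

```c
#include <assert.h>
#include <stdatomic.h>

/* Sketch: lock only when more than one processor is present. */
static int ncpus = 1;                /* set at boot in the real kernel */

typedef struct { atomic_flag busy; } sl_t;

static void mp_spinlock(sl_t *l) {
    if (ncpus > 1)                   /* uniprocessor: nothing to do */
        while (atomic_flag_test_and_set(&l->busy))
            ;                        /* spin until the lock frees */
}

static void mp_spinunlock(sl_t *l) {
    if (ncpus > 1)
        atomic_flag_clear(&l->busy);
}
```

On a uniprocessor the lock operations collapse to a branch, which is why performance-sensitive paths go through the macro rather than calling spinlock() unconditionally.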

Numerous spinlocks are created at the time of kernel initialization with a call to alloc_spinlock(), which primarily allocates memory for the spinlock data structure and initializes its fields. The kernel creates these spinlocks from init_spinlocks() and by calls to vm_initlock() for the VM spinlocks. The table below lists some of the spinlocks allocated at the time of kernel initialization. Other spinlocks are created and destroyed during runtime.

Table 1-3 Spinlocks allocated when kernel is initialized

Type of Spinlock: Names

Process Management: sched_lock, activeproc_lock, activethread_lock, rpregs_lock, callout_lock, cred_lock

File System: file_table_lock, devvp_lock, dnlc_lock, biodone_lock, bbusy_lock, v_count_lock, unrm_table_lock, inode_lock, inode_move_lock, rootvfs_lock, kmio_lock, sysV_msgque_lock, sysV_msghdr_lock, sysV_msgmap_lock, reboot_lock, devices_lock, audit_spinlock

Networking: netisr_lock, ntimo_lock, bsdskts_lock, nm_lock

Virtual Memory Management (VM): msem_list_lock, buf_hlist_lock, swap_buf_list_lock, vaslst_lock, text_hash, lost_page_lock, rlistlock, rmap_lock, kmemlock, pswap_lock, rswap_lock, pfdat_lock, pfdat_hash, eq_lock, bcvirt_lock, bcphys_lock, alias_lock, psl_random_lock, mprot_list_lock

General: semaphore_log_lock, ioserv_lock, swtrig_lock, time_lock, vmsys_lock, lpmc_log_lock, itmr_sync_lock, itmr_state_lock, pdce_proc_lock, pfail_cntr_lock, printf_lock, io_tree_lock, dma_buflet_lock, space_id_lock, lofs_lo_lock, lofs_lfs_lock, lofs_li_lock

 

Spinlock Inlining

To improve the performance of the spinlock code, HP-UX implements a technique called "dynamic inlining."

For selected performance-sensitive spinlocks, a macro reserves space at compile time for inlining the spinlock code instead of simply calling the spinlock function. At execution time, if the system has more than one processor, the reserved space is replaced with inline spinlock code.

For systems with more than one processor, the mutual exclusion algorithm now uses an LDCW instruction, which reduces the pathlength of the spinlock routines.

Spinlock Data Structures

HP-UX uses two types of data structures for its spinlock implementation:

  • The lock_t structure represents a single spinlock.

  • Hashed spinlocks are used for locating a spinlock within a pool. (Hashed spinlocks will be explained shortly.)

Spinlock Data Structure (lock_t)

There is one lock_t data structure for every spinlock. The table that follows describes the elements in the structure.

Table 1-4 Elements of the spinlock data structure lock_t

Element: Purpose

sl_lock: Used in the LDCW instruction to acquire the lock. A nonzero value indicates the lock is free.

sl_owner: Pointer to the per-processor data area (&mpinfo[cpunum]) of the processor owning the lock. If the lock is not owned, the value is 0.

sl_flag: A flag that indicates another CPU might want this lock.

sl_next_cpu: The CPU number of the last CPU that acquired the lock under arbitration.

sl_pad: Padding to bring lock_t to a reasonable cache line size.
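Table 1-4 can be rendered as an illustrative C declaration. The field types, the 64-byte line size, and the use of a union for padding are assumptions for the sketch; the actual HP-UX declaration may differ.

```c
#include <assert.h>

/* Illustrative lock_t layout based on Table 1-4 (assumed types). */
#define CACHE_LINE 64                /* assumed cache line size */

struct mpinfo;                       /* per-processor data area */

typedef union lock {
    struct {
        volatile unsigned sl_lock;   /* LDCW word: nonzero => lock free */
        struct mpinfo *sl_owner;     /* owning CPU's &mpinfo[cpunum], or 0 */
        unsigned sl_flag;            /* set when another CPU wants the lock */
        unsigned sl_next_cpu;        /* last CPU granted under arbitration */
    };                               /* C11 anonymous struct */
    char sl_pad[CACHE_LINE];         /* pad to the assumed line size */
} lock_t;
```

Padding each lock to a full cache line keeps two hot locks from sharing a line, so one CPU spinning on a lock does not cause coherency traffic on an unrelated lock.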

 

Hashed Spinlocks

A single spinlock works well for a single instance of a global data structure, or for one that is accessed synchronously. However, contention occurs when a single spinlock protects a data structure with multiple instances (such as a vnode structure). Conversely, dedicating a separate spinlock to every vnode would be overcompensating.

As a compromise, HP-UX can allocate a pool of "hashed spinlocks," accessed through a hash function, to protect data structures that have multiple instances of individual entries or groups of entries.

Figure 1-6 Hashed spinlocks point to singular spinlock data structures


When developing code, a programmer can allocate a separate hash pool for any particular requirement (for example, one for vnodes, one for inodes, and so on).

The routine alloc_h_spinlock_pool() allocates a pool of hashed spinlocks; the kernel calls it from init_hashed_spinlocks(). A spinlock for a particular instance is then located by hashing on the address or some other unique attribute of that instance. Spinlocks obtained through a hash pool can be seen in calls to MP_H_SPINLOCK() in the kernel. Some of the hashed pools currently allocated by the kernel are:

  • vnl_h_sl_pool

  • bio_h_sl_pool

  • sysv_h_sl_pool

  • reg_h_sl_pool

  • io_ports_h_sl_pool

  • ft_h_sl_pool

Table 1-5 Key elements in hash_sl_pool structure

Hash pool element: Purpose

n_hash_spinlocks: The number of spinlocks in the pool.

**hash_sl_table: Points to an array of size n_hash_spinlocks containing pointers to lock_t structures.

 

The hash functions return an index into this array of pointers.
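A hash function of this kind can be sketched as below, hashing an instance's address down to an index into the pool's array of lock_t pointers. The pool size and the shift are assumptions; the actual HP-UX hash functions may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch: locate a hashed spinlock by its instance address. */
#define N_HASH_SPINLOCKS 64          /* assumed pool size (power of two) */

static unsigned h_sl_index(const void *instance) {
    uintptr_t a = (uintptr_t)instance;
    /* Discard low bits (structures are aligned, so they carry little
     * information), then mask down to the pool size. */
    return (unsigned)((a >> 4) & (N_HASH_SPINLOCKS - 1));
}
```

Every access to the same instance hashes to the same index, so all CPUs agree on which spinlock protects it; distinct instances usually spread across the pool, which is what reduces contention relative to one global lock.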

Regardless of whether the spinlock data structure is accessed directly or through a hash table, the acquisition details to be discussed next are the same.

Spinlock Arbitration

To ensure that no processor is kept waiting indefinitely for a spinlock, round-robin arbitration using two modules takes place.

Table 1-6 Modules for spinlock arbitration

Module: Purpose

wait_for_lock(): Waits until the spinlock is acquired or a timeout occurs. Puts the lock address into a table indexed by CPU number, and sets a flag to indicate that CPUs are waiting for the lock.

su_waiters(): Called from spinunlock when sl_flag is set. Either releases the lock or passes it to another processor.
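The round-robin selection that the release path performs can be sketched as follows: scan forward from the last CPU granted the lock (sl_next_cpu) and hand the lock to the first CPU still waiting. All names are illustrative, not the HP-UX su_waiters() implementation.

```c
#include <assert.h>

/* Sketch of round-robin waiter selection (illustrative names). */
#define NCPU 4

/* waiting[c] is nonzero if CPU c has registered for this lock
 * (wait_for_lock() records the lock address in a per-CPU table). */
static int pick_next_cpu(const int waiting[NCPU], int sl_next_cpu) {
    for (int i = 1; i <= NCPU; i++) {
        int c = (sl_next_cpu + i) % NCPU;   /* rotate past last grantee */
        if (waiting[c])
            return c;                       /* pass the lock to CPU c */
    }
    return -1;                              /* no waiters: release the lock */
}
```

Because the scan always starts just past the previous grantee, every waiting CPU is reached within one full rotation, which is what bounds the wait and keeps any single processor from being starved.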