vxtunefs(1M)

NAME

vxtunefs — tune a VxFS File System

SYNOPSIS

vxtunefs [-ps] [-f tunefstab] [-o parameter=value] [{mount_point|block_special}]...

DESCRIPTION

vxtunefs sets or prints tuneable I/O parameters of mounted file systems. vxtunefs can set parameters describing the I/O properties of the underlying device, parameters to indicate when to treat an I/O as direct I/O, or parameters to control the extent allocation policy for the specified file system.

With no options specified, vxtunefs prints the existing VxFS parameters for the specified file systems.

vxtunefs works on a list of mount points specified on the command line, or all the mounted file systems listed in the tunefstab file. The default tunefstab file is /etc/vx/tunefstab. You can change the default using the -f option.

vxtunefs can be run at any time on a mounted file system, and all parameter changes take immediate effect. Parameters specified on the command line override parameters listed in the tunefstab file.

If /etc/vx/tunefstab exists, the VxFS-specific mount command invokes vxtunefs to set device parameters from /etc/vx/tunefstab.

If the file system is built on a VERITAS Volume Manager (VxVM) volume, the VxFS-specific mount_vxfs command interacts with VxVM to obtain default values for the tunables, so you need to specify tunables for VxVM devices only to change the defaults.

Only a privileged user can run vxtunefs.

Options

vxtunefs recognizes the following options:

-f filename: Use filename instead of /etc/vx/tunefstab as the file containing tuning parameters.
-o parameter=value: Specify parameters for the file systems listed on the command line. See the "VxFS Tuning Parameters and Guidelines" topic in this section.
-p: Print the tuning parameters for all the file systems specified on the command line.
-s: Set the new tuning parameters for the VxFS file systems specified on the command line or in the tunefstab file.

Operands

vxtunefs recognizes the following operands:

mount_point: Name of directory for a mounted VxFS file system.
block_special: Name of the block special device that contains the VxFS file system.

Notes

vxtunefs works with Storage Checkpoints; however, VxFS tunables apply to an entire file system. Therefore tunables affect not only the primary fileset, but also any Storage Checkpoint filesets within that file system.

The tunables buf_breakup_sz, qio_cache_enable, pref_strength, read_unit_io, and write_unit_io are not supported on HP-UX.

VxFS Tuning Parameters and Guidelines

The values for all the following parameters except read_nstream and write_nstream can be specified in bytes, kilobytes, megabytes, or sectors (1024 bytes) by appending k, K, m, M, s, or S. A value in bytes does not need a suffix.
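The suffix rules above can be sketched as a small conversion helper. This is only an illustration, not part of vxtunefs: the function name parse_tunable_size is hypothetical, and it assumes that a kilobyte is 1024 bytes and a megabyte is 1024 kilobytes.

```python
# Multipliers taken from the paragraph above: k/K = kilobytes, m/M = megabytes,
# s/S = sectors of 1024 bytes. Binary kilobytes and megabytes are an assumption.
_SUFFIX_BYTES = {"k": 1024, "m": 1024 * 1024, "s": 1024}


def parse_tunable_size(text):
    """Convert a value such as '64k', '1M', '16s', or '8192' to bytes."""
    text = text.strip()
    if not text:
        raise ValueError("empty value")
    suffix = text[-1].lower()
    if suffix in _SUFFIX_BYTES:
        return int(text[:-1]) * _SUFFIX_BYTES[suffix]
    # No suffix: the value is already in bytes.
    return int(text)


print(parse_tunable_size("64k"))  # 65536
```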

If the file system is being used with a hardware disk array or another volume manager (such as VxVM), align the parameters to match the geometry of the logical disk. For disk striping and RAID-5 configurations, set read_pref_io to the stripe unit size or interleave factor and set read_nstream to the number of columns. For disk striping configurations, set write_pref_io and write_nstream to the same values as read_pref_io and read_nstream. For RAID-5 configurations, set write_pref_io to the full stripe size and set write_nstream to 1.
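As a rough sketch, the striping and RAID-5 guideline above can be written as a lookup that returns suggested values. The function name suggest_io_tunables and the layout labels "stripe" and "raid5" are hypothetical. The full stripe size is taken as an input rather than computed, because how it relates to the column count depends on the volume configuration.

```python
def suggest_io_tunables(layout, stripe_unit, columns, full_stripe_size=None):
    """Return suggested tunable values for a striped or RAID-5 volume.

    layout is "stripe" or "raid5" (labels chosen for this sketch only).
    Sizes are in bytes.
    """
    # Reads are tuned the same way for both layouts.
    tunables = {"read_pref_io": stripe_unit, "read_nstream": columns}
    if layout == "stripe":
        # Striping: writes mirror the read settings.
        tunables["write_pref_io"] = stripe_unit
        tunables["write_nstream"] = columns
    elif layout == "raid5":
        # RAID-5: one full-stripe write at a time.
        if full_stripe_size is None:
            raise ValueError("RAID-5 needs the full stripe size")
        tunables["write_pref_io"] = full_stripe_size
        tunables["write_nstream"] = 1
    else:
        raise ValueError("unknown layout: " + layout)
    return tunables


print(suggest_io_tunables("stripe", 64 * 1024, 4))
```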

For an application to do efficient direct I/O or discovered direct I/O, it should issue read requests that are equal to the product of read_nstream and read_pref_io. In general, any multiple or factor of read_nstream multiplied by read_pref_io is a good size for performance. For writing, the same general rule applies to the write_pref_io and write_nstream parameters. When tuning a file system, the best approach is to try out the tuning parameters under a real workload.
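The "multiple or factor" rule above can be checked with simple arithmetic. The helper below is a hypothetical illustration (the name is_efficient_request_size is not a VxFS interface), and it is not a substitute for measuring a real workload.

```python
def is_efficient_request_size(request_size, nstream, pref_io):
    """True if request_size is a multiple or a factor of nstream * pref_io."""
    if request_size <= 0:
        return False
    target = nstream * pref_io
    # Multiple: request is N times the target. Factor: target is N times the request.
    return request_size % target == 0 or target % request_size == 0


# With read_nstream=4 and read_pref_io=64K the target size is 256K.
print(is_efficient_request_size(128 * 1024, 4, 64 * 1024))  # True (a factor of 256K)
print(is_efficient_request_size(100000, 4, 64 * 1024))      # False
```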

If an application is doing sequential I/O to large files, it should issue requests larger than the discovered_direct_iosz. This performs the I/O requests as discovered direct I/O requests which are unbuffered like direct I/O, but which do not require synchronous inode updates when extending the file. If the file is too large to fit in the cache, using unbuffered I/O avoids losing useful data out of the cache, and lowers CPU overhead.

The VxFS tuneable parameters are:

default_indir_size

On VxFS, files can have up to 10 variable sized extents stored in the inode. After these extents are used, the file must use indirect extents which are a fixed size that is set when the file first uses indirect extents. These indirect extents are 8K by default. The file system does not use larger indirect extents because it must fail a write and return ENOSPC if there are no extents available that are the indirect extent size. For file systems with many large files, the 8K indirect extent size is too small. The files that get into indirect extents use a lot of smaller extents instead of a few larger ones. By using this parameter, the default indirect extent size can be increased so that large files in indirects use fewer larger extents.

Be careful using this tuneable. If it is too large, then writes fail when they are unable to allocate extents of the indirect extent size to a file. In general, the fewer and the larger the files on a file system, the larger default_indir_size can be. The value of this parameter is generally a multiple of the read_pref_io parameter.

This tuneable does not apply to disk layout Version 4.

discovered_direct_iosz

Any file I/O requests larger than the discovered_direct_iosz are handled as discovered direct I/O. A discovered direct I/O is unbuffered like direct I/O, but it does not require a synchronous commit of the inode when the file is extended or blocks are allocated. For larger I/O requests, the CPU time for copying the data into the buffer cache and the cost of using memory to buffer the I/O becomes more expensive than the cost of doing the disk I/O. For these I/O requests, using discovered direct I/O is more efficient than regular I/O. The default value of this parameter is 256K.
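The threshold test described above amounts to a single comparison. The sketch below is a simplified model with hypothetical names. It ignores explicit direct I/O and other caching advisories and only shows where the cutoff falls.

```python
DEFAULT_DISCOVERED_DIRECT_IOSZ = 256 * 1024  # default stated above (256K)


def io_path(request_size, discovered_direct_iosz=DEFAULT_DISCOVERED_DIRECT_IOSZ):
    """Classify a request by size alone (simplified model)."""
    # Only requests strictly larger than the threshold are treated as
    # discovered direct I/O; everything else stays buffered in this model.
    if request_size > discovered_direct_iosz:
        return "discovered direct"
    return "buffered"


print(io_path(256 * 1024))      # buffered
print(io_path(256 * 1024 + 1))  # discovered direct
```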

hsm_write_prealloc

For a file managed by a hierarchical storage management (HSM) application, hsm_write_prealloc preallocates disk blocks before data is migrated back into the file system. An HSM application usually migrates the data back through a series of writes to the file, each of which allocates a few blocks. By setting hsm_write_prealloc (hsm_write_prealloc=1), a sufficient number of disk blocks will be allocated on the first write to the empty file so that no disk block allocation is required for subsequent writes, which improves the write performance during migration.

The hsm_write_prealloc parameter is implemented outside of the DMAPI specification, and its usage has limitations depending on how the space within an HSM controlled file is managed. It is advisable to use hsm_write_prealloc only when recommended by the HSM application controlling the file system.

initial_extent_size

Changes the default size of the initial extent.

VxFS determines, based on the first write to a new file, the size of the first extent to allocate to the file. Typically the first extent is the smallest power of 2 that is larger than the size of the first write. If that power of 2 is less than 8K, the first extent allocated is 8K. After the initial extent, the file system increases the size of subsequent extents (see max_seqio_extent_size) with each allocation.
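The typical first-extent rule above can be sketched as follows. This reads "larger than" strictly, so a write of exactly 8K would get a 16K extent. That reading is an assumption, and the function name first_extent_bytes is hypothetical.

```python
MIN_FIRST_EXTENT = 8 * 1024  # 8K floor stated above


def first_extent_bytes(first_write_bytes):
    """Smallest power of 2 strictly larger than the first write, at least 8K."""
    if first_write_bytes < 0:
        raise ValueError("write size cannot be negative")
    # For n >= 1, 1 << n.bit_length() is the smallest power of 2 greater than n.
    power_of_two = 1 << first_write_bytes.bit_length()
    return max(power_of_two, MIN_FIRST_EXTENT)


print(first_extent_bytes(100))    # 8192
print(first_extent_bytes(10000))  # 16384
```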

Because most applications write to files using a buffer size of 8K or less, the increasing extents start doubling from a small initial extent. initial_extent_size changes the default initial extent size to a larger value, so the doubling policy starts from a much larger initial size, and the file system won't allocate a set of small extents at the start of file.

Use this parameter only on file systems that have a very large average file size. On such file systems, there are fewer extents per file and less fragmentation.

initial_extent_size is measured in file system blocks.

max_buf_data_size

Determines the maximum buffer size allocated for file data. The two accepted values are 8K bytes and 64K bytes. The larger value can be beneficial for workloads where large reads/writes are performed sequentially. The smaller value is preferable on workloads where the I/O is random or is done in small chunks. The default value is 8K bytes.

max_direct_iosz

Maximum size of a direct I/O request issued by the file system. If there is a larger I/O request, it is broken up into max_direct_iosz chunks. This parameter defines how much memory an I/O request can lock at once; do not set it to more than 20% of memory.
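The splitting behavior described above can be illustrated with a short helper. The name split_direct_io is hypothetical, and the sketch only models the chunk sizes, not how the file system issues the I/O.

```python
def split_direct_io(request_size, max_direct_iosz):
    """Return the chunk sizes a request is broken into (simplified model)."""
    if max_direct_iosz <= 0:
        raise ValueError("max_direct_iosz must be positive")
    chunks = []
    remaining = request_size
    while remaining > 0:
        # Each chunk is at most max_direct_iosz; the last one may be smaller.
        chunk = min(remaining, max_direct_iosz)
        chunks.append(chunk)
        remaining -= chunk
    return chunks


# A 10 MB request with a 4 MB limit becomes chunks of 4 MB, 4 MB, and 2 MB.
mb = 1024 * 1024
print([c // mb for c in split_direct_io(10 * mb, 4 * mb)])  # [4, 4, 2]
```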

max_diskq

Limits the maximum disk queue generated by a single file. When the file system is flushing data for a file and the number of pages being flushed exceeds max_diskq, processes block until the amount of data being flushed decreases. Although it does not limit the actual disk queue, max_diskq prevents processes that flush data to disk, such as fsync, from making the system unresponsive. The default value is 1 megabyte.

See the write_throttle description for more information on pages and system memory.

max_seqio_extent_size

Increases or decreases the maximum size of an extent. When the file system is following its default allocation policy for sequential writes to a file, it allocates an initial extent that is large enough for the first write to the file. When additional extents are allocated, they are progressively larger (the algorithm tries to double the size of the file with each new extent), so each extent can hold several writes worth of data. This reduces the total number of extents in anticipation of continued sequential writes. When there are no more writes to the file, unused space is freed for other files to use.

In general, this allocation stops increasing the size of extents at 2048 blocks, which prevents one file from holding too much unused space.

max_seqio_extent_size is measured in file system blocks.
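The growth pattern described above can be modeled in a few lines. This is an idealized sketch, not the actual allocator: it assumes every new extent equals the current file size (so the file doubles), caps each extent at max_seqio_extent_size, and ignores free-space availability and the release of unused space. The name extent_sequence is hypothetical.

```python
def extent_sequence(initial_blocks, count, max_seqio_extent_size=2048):
    """Return the sizes, in file system blocks, of the first `count` extents."""
    extents = []
    file_blocks = 0
    for _ in range(count):
        # First extent: the initial size. Later extents: try to double the file.
        size = initial_blocks if file_blocks == 0 else file_blocks
        # No extent grows beyond the cap.
        size = min(size, max_seqio_extent_size)
        extents.append(size)
        file_blocks += size
    return extents


print(extent_sequence(8, 5))     # [8, 8, 16, 32, 64]
print(extent_sequence(1024, 4))  # [1024, 1024, 2048, 2048]
```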

read_ahead

In the absence of a specific caching advisory, the default for all VxFS read operations is to perform sequential read ahead. The enhanced read ahead functionality implements an algorithm that allows read aheads to detect more elaborate patterns (such as increasing or decreasing read offsets, or multithreaded file accesses) in addition to simple sequential reads. You can specify the following values for read_ahead:

0: Disables read ahead functionality
1: Retains traditional sequential read ahead behavior
2: Enables enhanced read ahead for all reads

By default, read_ahead is set to 1, that is, VxFS detects only sequential patterns.

read_ahead detects patterns on a per-thread basis, up to a maximum of vx_era_nthreads. The default number of threads is 5; however, you can change the default value by setting the vx_era_nthreads parameter in the system configuration file, /etc/system.

read_nstream

The number of parallel read requests of size read_pref_io to have outstanding at one time. The file system uses the product of read_nstream and read_pref_io to determine its read ahead size. The default value for read_nstream is 1.

read_pref_io

The preferred read request size. The file system uses this in conjunction with the read_nstream value to determine how much data to read ahead. The default value is 64K.

write_nstream

The number of parallel write requests of size write_pref_io to have outstanding at one time. The file system uses the product of write_nstream and write_pref_io to determine when to do flush behind on writes. The default value for write_nstream is 1.

write_pref_io

The preferred write request size. The file system uses this in conjunction with the write_nstream value to determine how to do flush behind on writes. The default value is 64K.

write_throttle

When data is written to a file through buffered writes, the file system updates only the in-memory image of the file, creating what are referred to as dirty buffers. Dirty buffers are cleaned when the file system later writes the data in these buffers to disk. (Note that data can be lost if the system crashes before dirty buffers are written to disk.)

Newer model computer systems typically have more memory. The more physical memory a system has, the more dirty buffers the file system can generate before having to write the buffers to disk to free up memory. So more dirty buffers can potentially lead to longer return times for operations that write dirty buffers to disk, such as sync and fsync. If your system has a combination of a slow storage device and a large amount of memory, sync operations may take long enough to complete that the system appears to be hung.

If your system is exhibiting this behavior, you can change the value of write_throttle. write_throttle lets you lower the number of dirty buffers per file that the file system will generate before writing them to disk. After the number of dirty buffers for a file reaches the write_throttle threshold, the file system starts flushing buffers to disk even if free memory is still available. Depending on the speed of the storage device, user write performance may suffer, but the number of dirty buffers is limited, so sync operations will complete much faster.

The default value of write_throttle is zero. The default value places no limit on the number of dirty buffers per file. This typically generates a large number of dirty buffers, but maintains fast writes. If write_throttle is non-zero, VxFS limits the number of dirty buffers per file to write_throttle buffers. In some cases, write_throttle may delay write requests. For example, lowering the value of write_throttle may increase the file disk queue to the max_diskq value, delaying user writes until the disk queue decreases. So unless the system has a combination of large physical memory and slow storage devices, it is advisable not to change the value of write_throttle.
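The threshold behavior of write_throttle can be summarized as a simple predicate. This is a simplified model with a hypothetical function name. It shows only the zero-means-unlimited rule and the per-file threshold, not how VxFS schedules the flush.

```python
def must_start_flushing(dirty_buffers, write_throttle=0):
    """True if a file's dirty buffer count has reached the write_throttle limit."""
    # Zero is the default and means no per-file limit is enforced.
    if write_throttle == 0:
        return False
    # Flushing starts once the count reaches the threshold.
    return dirty_buffers >= write_throttle


print(must_start_flushing(100000))               # False (no limit by default)
print(must_start_flushing(16384, 16384))         # True
print(must_start_flushing(16383, 16384))         # False
```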

FILES

/etc/vx/tunefstab: VxFS file system tuning parameters table.
