HP.com home KSAM/3000 Reference Manual: HP 3000 MPE/iX Computer Systems > Appendix B KSAM/3000 INTERNAL STRUCTURE AND TECHNIQUES

KSAM FILE SIZE

» 

Technical documentation

Complete book in PDF
» Feedback

 » Table of Contents

The size of the data file is calculated from the maximum number of data records times the size of each record (for fixed length records). For variable length records, it is calculated from the maximum number of data blocks times the size of each block. By default, a KSAM data file contains 1024 records (or blocks) in which each record (or block) is 1024 words long. This default size fits each block into eight disc sectors (each sector is 128 words long), and results in a data file of 8192 sectors.

Calculation of key file size is more complex. It is based on the total number of keys in the file (primary and alternate), the size of each key entry (including overhead), the expected number of data records specified when the file is built, plus space to allow for block splitting when the number of key entries increases.

The number of key entries per key is usually exactly the same as the number of data records expected. By default, KSAM uses the maximum number of data records specified, or the default value of 1024 records. This number is multiplied by each key in addition to the primary key to arrive at the total number of key entries in the file.

The size of each key entry and the number of key entries per block (the blocking factor) is used to determine key block size. Since all blocks in the key file must be the same size, KSAM adjusts the blocking factor so that all keys, regardless of entry size, use the same block size. Also, this blocking factor may be adjusted so that disc sector space is not wasted. (A block always starts on a sector boundary.) By default, the blocking factor is adjusted so that a block size of 1024 words is used for all key blocks for all keys in the file.

Because of the nature of the B-tree structure, enough room must be left in the key file so that the file can be increased in a balanced manner. When block splitting occurs as a result of adding new key values, up to half of each key block may have empty slots. To allow for such expansion, the key file size is calculated, and then doubled.

The following discussion describes exactly how KSAM calculates the key block size, and then the total key file size. These calculations are useful primarily if you do not use default values for the key blocking factor and for the number of key entries. In such a case, they may help you determine the most effective block size and file size for your application.

KEY BLOCK SIZE

Key block size can affect the complexity of the tree structure and this complexity can affect access time. In order to understand the relation between block size and access time, consider the following general rules:

  • The larger the block, the less often it has to split and the fewer the splits, the fewer levels to the tree.

  • The more levels to the tree, the more mass storage retrieval time is needed to locate a particular y value.

From this it would follow that in order to reduce access time, you should define large blocks. This is true only up to a point. Depending on the total number of key values expected in the file, a large block size may result in a great deal of unused space in each block. Also, the blocking factor may result in unused disc space since all blocks must start on sector boundaries.

KSAM provides a default blocking factor that produces a block with 1024(1K) words. This size has proved to be efficient for many files. You may, however, override this default blocking by specifying a value in the keyblocking parameter of the ;KEY= option in the BUILD command, or in word 19 of the FOPEN ksamparam parameter. Note that any blocking factor you specify is a minimum value since KSAM may increase the blocking factor so that the least amount of disc space is wasted.

After creating a KSAM file, you can use the VERIFY command of KSAMUTIL to determine the number of levels needed by the KSAM file. The VERIFY listing will also tell you the actual blocking factor used in creating the file so you can find out whether your specified blocking factor has been increased.

CALCULATING KEY BLOCK SIZE.

Key block size is based on a number of factors:

  • The key size is bytes (KS)

  • The key entry size in words (ES).

  • The number of key entries per block, the blocking factor (BF).

Once the block size is determined, the number of sectors needed to hold one block is calculated. If this value (NB) wastes sector space, KSAM adjusts the blocking factor to produce a block size that uses the least number of sectors by filling each sector as completely as possible. Note that when KSAM uses the default block size of 1024, it calculates a blocking factor by the same method.

The following steps show how KSAM determines block size based on a specified minimum blocking factor.

NOTE: The notation |_ _| means round down the result of the enclosed algorithm to the next whole number; | | means round it up.
  1. Calculate the entry size (ES) in words from the key size (KS) in bytes, and then add two words for each pointer in the key entry (see Figure B-6 “Key Entry Block Structure”). KSAM uses the following algorithm to perform this calculation:

     
    
              ES = [urcrop](KS+I)/2[ulcrop] + 4 
    
    
    
  2. If the blocking factor (BF) was specified as an odd number, KSAM issues an error message. Otherwise, it uses the specified blocking factor to continue the calculation of block size.

  3. Determine blocksize (BS) by multiplying the key entry size by the blocking factor and adding 5 words. (The five words are for the three words of control information at the beginning of each block, plus two words for the final pointer in the block. See Figure B-6 “Key Entry Block Structure”). KSAM uses the algorithm:

     
    
              BS = (ES x BF) + 5 
    

    Since blocks always start on sector boundaries, this calculated block size may leave a lot of unused sector space. The following steps show how KSAM determines the most efficient block size and, if this size differs from the size calculated from the specified blocking factor, how KSAM adjusts this blocking factor upwards to produce the optimum block size.

  4. Determine the number of sectors required to hold the block at its calculated size. If the result is not a whole number, round it up to the nearest whole sector. KSAM determines the number of sectors per block (NB) as follows:

               NB = [ulcrop] BS/128 [urcrop]
    
  5. Multiply the number of sectors per block by 128 to determine the optimum block size:

     
    
              BS = NB x 128 
    
  6. If the optimum block size calculated in step 5 differs from the block size calculated in step 3, or if the default block size 1024 is used, KSAM adjusts the blocking factor to one that produces the optimum block size. It uses the following algorithm to determine the number of key entries that fit in the block and, if this is an odd number, reduces it by one. (Blocking factor must be an even number.)

              BF = F [ulcrop] (|_(BS-5)/ES _| -1)/2 [urcrop] x 2 
    

KEY FILE SIZE

KSAM uses the blocking factor and the number of sectors per block to detenrmine the maximum file size in sectors to allocate to the key file. In addition, KSAM needs to know the maximum number of key entries to expect, and the number of keys (primary and alternate) defined for the key file at creation.

The maximum number of key entries for each key may be specified in the numentries parameter of the KEYENTRIES= option of the BUILD command, or in the ksamparam parameter of FOPEN. However, this file limit is usually based on the maximum number of data records. This value is specited in the numrec parameter of the DISC= option of the BUlLD command, or in the filesize parameter of FOPEN. If not specified in either of these places, KSAM assumes a default file limit of 1024 key entries.

Since the number of records in the data file can be used to calculate the maximum number of keys for only one key, each additional key defined for the file causes the file sizes to be summed.

When key file size has been calculated, KSAM uses this value to allocate that number of sectors on disc to the key file whenever the file is opened.

Key file size is based on the following factors:

  • The number of key entries per block, or the blocking factor (BF).

  • The number of blocks per sector (NB).

  • The maximum number of key entries for one key (FL).

  • The number of keys defined for the key file.

KSAM uses the following formula to determine key file size in sectors for a file with one key:

     FS = ([ulcrop] FL/BF [urcrop]x 2) x NB 

This formula is derived through the following steps:

  1. The maximum number of key entries (FL) is divided by the number of key entries per block (BF) to find the number of blocks in the file. If not a whole number, it is rounded up to include all blocks.

  2. The resulting number of blocks is multiplied by 2. This doubling of the number of key blocks is done to allow room for expansion when the number of levels in a key file expands due to block splitting.

  3. Finally, the number of blocks is multiplied by the number of blocks per sector (NB) to find the total number of sectors needed to contain all key entries.

NOTE: The file size (FS) calculated by the above algorithm does not include the two sectors required for control information.

If the file has only one key, add 2 to the file size (FS+2) to get the total file size. The value 2 represents the two sectors at the beginning of each key file containing control and key descriptor information.

If the key file has multiple keys, then the optimum block size of each key must be determined. The largest block size is then selected as the standard block size for all keys in the file (KSAM does not allow variable-length key blocks). Since the block size of some keys has changed, the blocking factor (BF) must be recalculated for these keys using the algorithm:

            BF = [ulcrop] (|_(BS-5)/ES_|-1)/2 [urcrop]x 2 

A separate file size for each key is then calculated based on their various blocking factors. The total key file size is equal to the sum of all these file size (FS) values plus 2 for the two control sectors.

Figure B-8 “Formula to Determine File Space per Key” summarizes the steps KSAM uses to calculate file size for one key. Figure B-9 “Calculation of Total Key File Size with Two Keys” shows how KSAM calculates key file size for a file with two keys. Each key file size (FS) is calculated separately, and then the blocking factor and file size of the key with the smaller block size is recalculated.

For a file with one key, KSAM simply adds 2 sectors to the file size (FS) calculated for the single key.

Figure B-8 Formula to Determine File Space per Key

[Formula to Determine File Space per Key]

Figure B-9 Calculation of Total Key File Size with Two Keys

[Calculation of Total Key File Size with Two Keys]