ALLBASE/SQL Performance and Monitoring Guidelines
DBA Guidelines
Several aspects of database administration affect performance. To
improve performance in these areas, you need DBA authority.
Validating Your Applications Before Run Time
When you run an application, ALLBASE/SQL first checks each section for
validity before executing it. Then, ALLBASE/SQL revalidates invalid
sections (if possible) before executing them.
You may notice a slight delay as the sections are being revalidated.
Revalidating sections can reduce concurrency, because it involves
updating system catalog tables. For each update, ALLBASE/SQL must obtain
an exclusive page lock on the system catalog tables. The locks are
retained until the transaction is committed.
To improve performance, you can revalidate your applications before run
time by either:
* Preprocessing the program again.
* Using the VALIDATE statement on the affected sections. For the
complete syntax of the VALIDATE statement, refer to the
ALLBASE/SQL Reference Manual.
Use the UPDATE STATISTICS statement with care, because it, like any
DDL statement, invalidates stored sections. For best performance,
once the initial statistics for a table have been established, issue
UPDATE STATISTICS only during periods of low DBEnvironment activity
(at night, for example), and then use VALIDATE to revalidate the
affected modules and procedures. Issuing these statements during
quiet periods keeps the resulting performance degradation to a
minimum.
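For example, after refreshing statistics during off hours, you might
revalidate everything in one pass. The following is only a sketch;
the table name comes from the sample database, and the full syntax of
both statements is given in the ALLBASE/SQL Reference Manual:
     isql=> UPDATE STATISTICS FOR TABLE PurchDB.Parts;
     isql=> VALIDATE ALL MODULES;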
Developing Application Programs
The following tips relate to preprocessing of application programs:
* Use separate development and production DBEnvironments, because
during development, frequent re-preprocessing of applications
locks system catalog pages for extended periods. You can later
use the ISQL INSTALL command to move finished modules into your
production DBEnvironments. The INSTALL command locks the same
system catalog pages as the preprocessor, but it only does so a
single time.
* Avoid using DDL when running in multiuser mode, since DDL places
exclusive locks on system catalog resources and can reduce
throughput significantly. (DDL includes the UPDATE STATISTICS and
CREATE INDEX statements, among other statements.) You can also
restrict DDL to single-user applications that run during off-peak
hours. When you are not using DDL, you can set the DDL Enabled
flag to NO to achieve significantly higher throughput in a
multiuser environment.
* Use section caching. Note that the first connect to a
DBEnvironment is slower when DDL is disabled, since extra caching
is being done. See the section "Section Caching and Directory
Caching" in this chapter for more information.
   *   Use the TERMINATE USER statement to abort a user program or ISQL
       session, if necessary. DBCore enters a critical section while
       accessing the DBEnvironment and allows a program to abort only
       at certain strategic points; for example, if DBCore is filling
       the tuple buffer, it completes that operation before allowing a
       break. TERMINATE USER allows the program to abort immediately.
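For example, a DBA might end a runaway ISQL session with a statement
along the following lines. The session identification shown is only
a placeholder; the exact form required is described with the
TERMINATE USER statement in the ALLBASE/SQL Reference Manual:
     isql=> TERMINATE USER 'MGR.PAYROLL';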
Balancing System Load
Balancing system load is an effort to spread out I/O in your system among
the available disk devices for the most efficient operation. Large
systems can benefit significantly from an analysis of system load,
followed by reallocating the DBEFiles for specific DBEFileSets or within
a DBEFileSet. In general, the DBEFiles on which the most I/O is
performed should be stored on the fastest devices available.
For best performance, separate data DBEFiles, index DBEFiles, and log
files across the available disks on your system. Specifically, make sure
to:
   *   Store log files on devices separate from data DBEFiles, both to
       improve performance and to avoid losing data and log together
       in a head crash. For the same reason, store dual log files on
       different devices.
* Scatter files over several devices by storing them in different
volume sets, which specify different devices. By default, any
files over 10 MB are scattered among devices.
   *   Move DBEFiles to different devices with the SQLUtil MOVEFILE
       command, and log files with the SQLUtil MOVELOG command (see
       the example following this list).
* Separate index DBEFiles on large tables from their data DBEFiles.
You can use the SQLUtil MOVEFILE command to move the index
DBEFile. For small tables, this technique probably yields a less
noticeable performance gain.
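For example, moving an index DBEFile to another device is an SQLUtil
dialog along the following lines. The prompts are paraphrased here;
see the MOVEFILE description in the SQLUtil documentation for the
exact dialog:
     :RUN SQLUTIL.PUB.SYS
     >> MOVEFILE
        (SQLUtil prompts for the DBEnvironment name, the DBEFile
        name, and the new system file name on the target device
        or volume set)
     >> EXIT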
Placing Concurrently Used Objects on Different Drives.
If two database objects are used concurrently, you should store them on
different disk drives to minimize disk head contention.
This includes any two tables that show I/O at the same time, any two
B-tree indexes that show I/O at the same time, or a table that shows I/O
at the same time as the B-tree index defined on it. To monitor I/O, run
SQLMON and access the SampleIO subsystem. When you access objects
simultaneously, the disk heads may move back and forth between the
objects. This movement slows access to data. To avoid this problem, use
the SQLUtil MOVEFILE command to place the competing objects on different
drives.
Remember that the mapping of database objects to DBEFiles is not
one-to-one, and that MOVEFILE moves DBEFiles, not DBEFileSets. In order
to achieve the goal of placing different objects on different drives, you
must ensure that they are in different DBEFiles. Therefore, it is useful
to create separate INDEX and TABLE DBEFiles, and to define a different
DBEFileSet for each table or index that may need to be separate from
other objects.
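For example, the following sketch gives a table's data and index
pages their own DBEFiles within one DBEFileSet, so that the index
file can later be placed on a different drive with MOVEFILE. The
object names and page counts are illustrative:
     isql=> CREATE DBEFILESET PartsFS;
     isql=> CREATE DBEFILE PartsDataF WITH PAGES = 500,
            NAME = 'PARTSDF', TYPE = TABLE;
     isql=> CREATE DBEFILE PartsIndexF WITH PAGES = 250,
            NAME = 'PARTSXF', TYPE = INDEX;
     isql=> ADD DBEFILE PartsDataF TO DBEFILESET PartsFS;
     isql=> ADD DBEFILE PartsIndexF TO DBEFILESET PartsFS;
Because index pages are placed only in the INDEX DBEFile, MOVEFILE
can later move PARTSXF to a drive of its own.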
Calculating Shared Memory Allocation
As you adjust the various configurable parameters in ALLBASE/SQL, be sure
you do not exceed the amount of shared memory available on your system.
You can use the following formula to derive an approximate figure (in
kilobytes) for the total shared memory used:
SM = (4.2*DBP) + (0.5*LBP) + (0.16*NTXN) + (4.1*CBP)
where
     SM   = Total Shared Memory in kilobytes
     DBP  = Data Buffer Pages [15-50,000]
     LBP  = Log Buffer Pages [24-1,024; default 24]
     NTXN = Number of Concurrent Transactions [no limit]
     CBP  = Control Block Pages [17-2,000; default 37]
The result of this equation must be within the amount of shared memory
you have available. This equation also gives an idea of the relative
effect of changing different parameters on the total shared memory size.
With the default values for the other parameters, this reduces to
     SM = 4.2*DBP + 164
and with their maximum values, to
     SM = 4.2*DBP + 3830
These equations give a first approximation of the amount of shared
memory that will be used.
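For example, for a DBEnvironment started with 500 data buffer pages,
20 concurrent transactions, and the default log buffer and control
block settings (illustrative values), the formula gives:
     SM = (4.2*500) + (0.5*24) + (0.16*20) + (4.1*37)
        = 2100 + 12 + 3.2 + 151.7
        = approximately 2267 kilobytes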
Choosing a Number of Data Buffer Pages
Data buffers are used to cache data pages touched by ALLBASE/SQL. Up to
the limit of buffer space, a data page is retained in memory until the
space it occupies is needed for another page that is on disk. You can
set the number of data buffer pages your DBEnvironment uses. The minimum
number of data buffer pages is 15, the maximum is 50,000, and the default
is 100. How many data buffer pages should you allocate? There is no
explicit equation to find the ideal buffer pool size. The following
general suggestions may help, though this is not an exhaustive list:
* Total shared memory should not exceed free real memory available
on the system. Available real memory can be determined only at
run time, since it depends on the total number of processes on the
system and on the type of application programs running on the
system. The DBA may be able to make estimates. Excessive shared
memory causes page faults.
   *   You never get a performance benefit from defining more buffer
       space than stays in real memory; if the data buffers force
       paging of virtual space, the extra buffers degrade performance.
       When you are not using pseudomapped files, use roughly 6 to 12
       buffer pages per active transaction; shared memory holds
       buffers only temporarily, since the real buffering occurs in
       the operating system. With pseudomapped files, use as large a
       buffer pool as the first two suggestions allow. Note, however,
       that a larger pool usually slows down checkpoints if the
       fraction of dirty pages remains the same: there is a trade-off
       between checkpoint time and response time.
   *   When possible, with pseudomapped files, the buffer pool should
       accommodate all frequently accessed pages. In particular, if a
       small table or index is accessed frequently, the buffer pool
       should hold all of its pages.
   *   If users access a table without much locality of reference
       (that is, with almost random access) and the table is much
       larger than any practical buffer pool, increasing the size of
       the buffer pool does not help. For example, if a table
       occupies 100 MB, a 2000-page buffer pool does not perform much
       better than a 1000-page pool.
* If you are using pseudomapped files, and if there is a high hit
ratio, increasing the number of buffer pages can help performance.
For example, with pseudomapped files, if the table has a 2000-page
index that is used frequently, then a 2000-page buffer pool
performs better than a 1000-page pool.
When should you increase the number of buffer pages? If you are not
using pseudomapped files, increasing the buffer pool beyond the
guidelines above gives no advantage. If the set of applications has
no locality of data and index pages, it may not matter how many data
buffers there are. Once you have enough data buffer pages,
performance is not very sensitive to adding more.
Keeping a Small Group of Pages in Memory.
How do you keep a small, frequently used group of pages in memory
consistently? The trick is to cause other pages to be swapped out first.
ALLBASE/SQL employs a least recently used (LRU) algorithm to discard
older pages and make room for new pages. To keep a special group of
pages in memory, you must have other pages in memory as well that can age
past the pages in the special group. If you use a large enough number of
pages outside the special group, each page will age to the point of being
least recently used before any of the special group is swapped out.
Use the following suggestions to determine how many data buffer pages is
enough:
* Examine the performance-crucial transactions. (Usually there are
only a few that are sensitive or dominate performance). Look at
the tables that are used in these transactions.
* INSERT actions tend to cluster at the end of a table, so if
there is insert activity, add a buffer page for each active
user (that is, the instantaneously active count).
* If a table or index is small enough to fit in real memory
and is heavily accessed, add the number of pages in the
table or index.
* If an index is large, but the number of high level index
pages is small and the table is heavily accessed, add a
number of pages corresponding to the number of index levels
above the leaf level.
          *   If a table is too big to fit in real memory and access
              to it is random, add a buffer page per transaction so
              that the large table's pages can age past those of the
              small tables under the LRU algorithm.
* Estimate the table access pattern in your transactions. If
small tables are accessed in 50% of them or more, you
should gain by having a large enough buffer pool to hold
the small tables.
* The right number of buffers is approximately the number needed to
hold the small tables and the aged pages of the large tables.
Basic Example.
For this example, assume that a sample application
* updates a small table
* updates a large table
* inserts to a third table
Assume that the tables for the sample occupy the following space in
DBEFiles:
* The small table and its index use 50 pages: one root index page,
10 leaf index pages, and 39 data pages.
* The large table uses 50,000 pages: one root index page, 100
second-level index pages, 11,899 leaf index pages, and 38,000 data
pages.
Also assume that:
* Each insert into the third table requires 40 bytes, and the pages
of the third table fill up about every 100 transactions.
* There are about 10 active transactions in the database at any
instant.
At a given instant, the buffer pool looks like this:
-------------------------------------------------------------------------------------------------
| | | | |
| Source | Oper- | Pages Already | Pages Requiring |
| | ation | In Data Buffer | Physical I/O |
| | | | |
-------------------------------------------------------------------------------------------------
| | | | |
| Small table | Update | 11 (index root and leaf) | |
| | | | |
| | | 10 (data) | 40 |
| | | | |
-------------------------------------------------------------------------------------------------
| | | | |
| Large table | Update | 1 (index root) | |
| | | | |
| | | 10 (second-level index pages | 100 (total index second-level |
| | | active per transaction) | pages) |
| | | | |
| | | 10 (leaf pages active per | 12,000 (total index leaf |
| | | transaction) | pages) |
| | | | |
| | | 10 (data pages active per | 50,000 (total data pages) |
| | | transaction) | |
| | | | |
-------------------------------------------------------------------------------------------------
| | | | |
| Third table | Insert | 10 (data pages active per | |
| | | transaction) | |
| | | | |
-------------------------------------------------------------------------------------------------
| | | | |
| Total in | | 62 | |
| cache | | | |
| | | | |
-------------------------------------------------------------------------------------------------
Based on the numbers of pages given above, and assuming the default
number of data buffer pages (100), the application will require an
average of 7 I/Os per transaction, as shown in the following table:
----------------------------------------------------------------------------------------------
| | | |
| Source | Operation | Number of I/Os |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Small table | Read data | 1 |
| | | |
| | Write data | 1 |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Large table | Read second-level index | 1 |
| | | |
| | Read leaf index | 1 |
| | | |
| | Read data | 1 |
| | | |
| | Write data | 1 |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Third table | Write data | .01 |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Commit | Write log | 1 |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Total | | 7.01 |
| | | |
----------------------------------------------------------------------------------------------
There is a 25% chance that a random access to the small table will want a
page that is already in the buffer pool. There is a 10% chance that the
large index second-level page is already in the buffer pool. These
reduce the I/O count to about 6.4 instead of 7, as shown in the following
table:
----------------------------------------------------------------------------------------------
| | | |
| Source | Operation | Number of I/Os |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Small table | Read data | .75 |
| | | |
| | Write data | .75 |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Large table | Read second-level index | .9 |
| | | |
| | Read leaf index | 1 |
| | | |
| | Read data | 1 |
| | | |
| | Write data | 1 |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Third table | Write data | .01 |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Commit | Write log | 1 |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Total | | 6.41 |
| | | |
----------------------------------------------------------------------------------------------
First Threshold for Performance Gain.
System performance improves if you add more data buffers. The first
threshold of performance gain is when there are enough data buffers to
prevent the pages of the small table from being swapped out. This will
happen only if the small table's pages never become least recently used.
Suppose that we want 62 pages to stay in memory from the previous
calculation, and that we will insert about 10 pages. In each
transaction, accessing the large table will require touching a root page,
two more index pages, and a data page. If the two nonroot index pages
and the data page are read each time, we touch three pages that are new.
We need enough pages in memory so that pages from the large table are
less recently used than pages from the small table. A minimal estimate
would be 120 pages; a more comfortable estimate would allow 160 pages.
Using the 120-page minimum, the buffer pool would look like this:
-------------------------------------------------------------------------------------------------
| | | | |
| Source | Oper- | Pages Already | Pages Requiring |
| | ation | In Data Buffer | Physical I/O |
| | | | |
-------------------------------------------------------------------------------------------------
| | | | |
| Small table | Update | 11 (index root and leaf) | |
| | | | |
| | | 40 (all pages in buffer) | |
| | | | |
-------------------------------------------------------------------------------------------------
| | | | |
| Large table | Update | 1 (index root) | |
| | | | |
| | | 40 (second-level index | 100 (total index second-level |
| | | pages--active and "aging") | pages) |
| | | | |
| | | 40 (leaf pages--active and | 12,000 (total index leaf |
| | | "aging") | pages) |
| | | | |
| | | 40 (data pages--active and | 50,000 (total data pages) |
| | | "aging") | |
| | | | |
-------------------------------------------------------------------------------------------------
| | | | |
| Third table | Insert | 10 (data pages active per | |
| | | transaction) | |
| | | | |
-------------------------------------------------------------------------------------------------
| | | | |
| Total in | | 182 | |
| cache | | | |
| | | | |
-------------------------------------------------------------------------------------------------
Here is the resultant pattern of I/Os:
----------------------------------------------------------------------------------------------
| | | |
| Source | Operation | Number of I/Os |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Small table | Read data | 0 |
| | | |
| | Write data | 0 (deferred until |
| | | checkpoint) |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Large table | Read second-level index | .9 |
| | | |
| | Read leaf index | 1 |
| | | |
| | Read data | 1 |
| | | |
| | Write data | 1 |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Third table | Write data | .01 |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Commit | Write log | 1 |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Total | | 4.91 |
| | | |
----------------------------------------------------------------------------------------------
Second Threshold for Performance Gain.
The next performance gain comes from fitting all the pages from the
second level of the large table's index into the data buffer cache.
There are 100 pages in the second-level index, so the buffer pool in this
scenario looks like the following:
-------------------------------------------------------------------------------------------------
| | | | |
| Source | Oper- | Pages Already | Pages Requiring |
| | ation | In Data Buffer | Physical I/O |
| | | | |
-------------------------------------------------------------------------------------------------
| | | | |
| Small table | Update | 11 (index root and leaf) | |
| | | | |
| | | 40 | |
| | | | |
-------------------------------------------------------------------------------------------------
| | | | |
| Large table | Update | 1 (index root) | |
| | | | |
| | | 100 (second-level index pages) | |
| | | | |
| | | 100 (leaf pages--active and | 12,000 (total index leaf |
| | | "aging") | pages) |
| | | | |
| | | 100 (data pages--active and | 50,000 (total data pages) |
| | | "aging") | |
| | | | |
-------------------------------------------------------------------------------------------------
| | | | |
| Third table | Insert | 10 (data pages active per | |
| | | transaction) | |
| | | | |
-------------------------------------------------------------------------------------------------
| | | | |
| Total in | | 362 | |
| cache | | | |
| | | | |
-------------------------------------------------------------------------------------------------
Here is the resultant pattern of I/Os:
----------------------------------------------------------------------------------------------
| | | |
| Source | Operation | Number of I/Os |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Small table | Read data | 0 |
| | | |
| | Write data | 0 (deferred until |
| | | checkpoint) |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Large table | Read second-level index | 0 |
| | | |
| | Read leaf index | 1 |
| | | |
| | Read data | 1 |
| | | |
| | Write data | 1 |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Third table | Write data | .01 |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Commit | Write log | 1 |
| | | |
----------------------------------------------------------------------------------------------
| | | |
| Total | | 4.01 |
| | | |
----------------------------------------------------------------------------------------------
This analysis predicts one plateau of performance at about 180-200
pages in the buffer pool and a second plateau at about 360-400 pages.
The next plateau would require something like 25,000 pages, which is
impractical for most installations.
In practice, there is little penalty for extra pages, so use a
generous allocation (say, 200 to 400 pages or more) if you can afford
it.
NOTE The previous calculations assume only the pages involved in one
transaction type involving a small and a large table. If you have
other transactions contending for the database, be sure to allow
adequate buffer space for them too.
Cautions.
* The size of the page pool combined with other demands on memory
should not exceed available real memory.
* Checkpoints may take longer if the buffer pool is larger. A
checkpoint writes to disk all dirty pages in the buffer pool. If
there is a consistent fraction of dirty pages, the checkpoint
takes longer with a large buffer pool. However, if the large
buffer pool is filled with read-only data, the checkpoint may not
take much longer. A rough estimate can be obtained by looking at
the buffer pool size derived in the analysis above.
An Empirical Approach.
Using SQLMON, you can observe the data buffer miss rate and adjust the
number of data buffer pages accordingly. Start by allocating the largest
number of data buffer pages you are willing to reserve for ALLBASE/SQL.
This number should be less than the following:
(Total Size Virtual Memory) - (Number of Log Buffer Pages) * 0.5K
- (Number of Run-Time Control Block Pages) * 4K
- 45K
As you observe the data buffer miss rate on SQLMON's IO screen, gradually
reduce the number of data buffer pages until the miss rate increases
sharply. Allocate just enough data buffer pages to avoid the sharp
increase of the data buffer miss rate. Since the load on your system
probably varies, you should monitor the data buffer miss rate at
different times throughout the day.
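For example, with the default 24 log buffer pages and 37 run-time
control block pages (illustrative values), the bound works out to:
     (Total Size Virtual Memory) - (24 * 0.5K) - (37 * 4K) - 45K
        = (Total Size Virtual Memory) - 12K - 148K - 45K
        = (Total Size Virtual Memory) - 205K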
Choosing the Size of the Runtime Control Block
The runtime control block is an area of shared memory containing global,
runtime information for the DBEnvironment. ALLBASE/SQL allocates control
blocks from the runtime control block. Lock management is the single
greatest user of control blocks. Each table, page, or row lock acquired
needs one control block. As the granularity of the lock decreases, the
number of locks required is likely to increase.
For example, locking an entire table with row level locking requires more
locks than locking the table with the LOCK TABLE statement. The
following table lists the maximum number of locks that can be associated
with table, page, and row-level locking:
-----------------------------------------------------------------
| | |
| Granularity | Maximum Number of Locks |
| | |
-----------------------------------------------------------------
| | |
| Table | 1 |
| | |
-----------------------------------------------------------------
| | |
| Page | (number of pages in table) + 1 |
| | |
-----------------------------------------------------------------
| | |
| Row | (number of pages in table) + (number of rows |
| | in table) + 1 |
| | |
-----------------------------------------------------------------
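For example, for a hypothetical table of 1,000 pages holding 50,000
rows, the maximum number of locks (and therefore of lock control
blocks) at each granularity is:
     Table:  1
     Page:   1,000 + 1           = 1,001
     Row:    1,000 + 50,000 + 1  = 51,001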
An application that manages locks well is less likely to deplete the
amount of shared memory available. If the runtime control block is too
small, ALLBASE/SQL is unable to allocate the necessary control blocks, an
error is returned, and the transaction is rolled back. If the runtime
control block is too large, other processes on the system may not have
adequate access to memory.
To allocate pages for the runtime control block, you can use the START
DBE statement or the ALTDBE command. To monitor the usage of the runtime
control block, start SQLMON and go to the Overview screen, as described
in the chapter "Getting Started with SQLMON."
Choosing a Number of Log Buffer Pages
The number of log buffers is independent of the number of data buffers.
The log buffer pages are only 512 bytes (in contrast with the data buffer
pages, which are 4096 bytes). If the log buffer pool becomes full
before any transaction completes, then at least two I/Os are needed to
complete the transaction.
You should provide enough log buffers to allow all active transactions to
have log space. Additional buffers do not help performance, but adding
pages probably doesn't hurt either. To monitor the I/O incurred by
logging, run SQLMON and access the IO screen.
For example, suppose that a user is making changes to a sample database
and that
* Before and after images of the changed data cause about 600 bytes
to be logged.
* The insert is 100 bytes.
* We allow 100 bytes for log overhead.
The figures listed above total 800 bytes, which rounds up to 1024
bytes, or two 512-byte pages, per transaction. This implies that 20
log buffer pages should be sufficient for 10 concurrent transactions.
If there are occasional bursts of more than the average, 40 or 50
pages might be a good idea.
Choosing the Number and Size of Log Files
ALLBASE/SQL gives you great flexibility in choosing the number and size
of log files. The following suggestions may be useful in making choices
for your system.
Nonarchive Log Guidelines.
The size of nonarchive log files determines the frequency of automatic
system checkpoints, and therefore the rollback recovery time. The
total size of the log (all files taken together) determines the
largest transaction that can be accommodated in the DBEnvironment.
Some guidelines for determining the size of the nonarchive log are listed
below:
   *   Total log space should be large enough to accommodate changes
       from the largest transaction. Otherwise, large transactions
       may abort.
To determine the amount of log file space available, run SQLMON
and go to the Overview screen.
* ALLBASE/SQL performs an automatic checkpoint each time a log file
becomes full. By choosing a specific file size, you can force a
checkpoint to take place at specific intervals, thereby
controlling the amount of time required for rollback recovery.
* In general, a large log file is good for nonarchive logging.
However, a checkpoint can cause significant loading to be put on
the system when the log file is large. When a checkpoint occurs,
all data pages relating to the transactions currently held in the
log file must be forced to disk if they have not already been
swapped out. If you use very large log files, the
checkpoint process can flood the I/O subsystem and cause
interference with other higher priority processing. If such
flooding occurs, use a smaller log file or use the CHECKPOINT
statement to force a checkpoint at regular intervals.
* The transaction that runs out of log space triggers a checkpoint,
and this causes all transactions to wait until the buffers are
flushed.
* When you use a large log buffer size, you avoid frequent flushing
of the buffer to disk.
* Large nonarchive log files can reduce the number of system
checkpoints. To display the number of system checkpoints, run
SQLMON and examine the IO screen. If you observe a large number
of checkpoints and log buffer writes on the IO screen, you should
increase the size of the nonarchive log files.
   *   Checkpoints should be postponed as long as rollback recovery
       time stays within acceptable limits. Checkpoints flush dirty
       pages, so frequent checkpoints forfeit the write savings gained
       when the same page is updated many times between flushes.
       Checkpoints also increase disk contention.
Archive Log Guidelines.
* The transaction that runs out of log space triggers a checkpoint,
and this causes the transaction to wait until the buffers are
flushed. Other transactions may continue if logging is in archive
mode.
* In archive mode, use at least two log files, so that one can be
backed up while the other is being used.
* In archive mode, use large log files to minimize the frequency of
log backups and checkpoints.
Sorting Operations
The sort method in ALLBASE/SQL is memory-intensive, using the tournament
sort algorithm for sorting and merging. It sorts in two phases:
* Run generation: It reads the input rows, sorts them in the
tournament tree, and generates runs. (A run is a list of sorted
tuples.) The number of runs generated depends on the input data
and on the sort memory size. For large sort operations, the
sorted runs (intermediate results) may be written to scratch files
created dynamically in temporary spaces you define and purged
after sorting is completed. If all input tuples fit in sort
memory, then the merge phase is completely avoided, and no scratch
files are used.
* Merge phase: In the second phase, the sorted runs are merged.
The merge width depends on the size of the sort memory. If the
number of generated runs is less than the merge width, only one
merge is done. Otherwise, several merge passes may be necessary.
Creating Temporary Spaces.
Use the CREATE TEMPSPACE statement to create a group that will contain
temporary space for scratch files. ALLBASE/SQL needs scratch files to
store the intermediate results of sorting and creating indexes. You
should create at least one temporary space in the DBEnvironment you are
working in. If you do not create a temporary space, ALLBASE/SQL opens
scratch files in the current group.
The CREATE TEMPSPACE statement itself does not use any disk space, and
ALLBASE/SQL does not create any scratch files when you issue the CREATE
TEMPSPACE statement. Instead, they are dynamically created and purged
during sort operations. When you define a temporary space, the amount of
available disk space may change dynamically as ALLBASE/SQL creates and
purges scratch files.
By default, the number of pages used for each temporary space file is
256. The total amount of space used is all that is available in the
current group.
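For example, the following sketch defines a temporary space in a
group set aside for scratch files. The names and sizes are
illustrative, and the exact keyword forms are given with the CREATE
TEMPSPACE statement in the ALLBASE/SQL Reference Manual:
     isql=> CREATE TEMPSPACE SortTemp
            WITH MaxTempFileSize = 2048,
            LOCATION = 'TEMPGRP';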
Tips for Using Temporary Spaces.
* If you do large sorts or create large indexes, use CREATE
TEMPSPACE to define temporary spaces. Be sure to specify a
MaxTempFileSize large enough for your needs.
* Create one temporary space per volume set to spread scratch files
across volume sets and avoid filling up a single volume set.
* ALLBASE/SQL opens a new scratch file after reaching the limit
specified in the CREATE TEMPSPACE statement. If the sort is large
and the default is small, ALLBASE/SQL may have to open many small
scratch files. To avoid this, specify a large MaxTempFileSize
value. For small sorts, ALLBASE/SQL does not create large scratch
files even if MaxTempFileSize is large; hence, it is safe to
specify MaxTempFileSize generously.
* If you define more than one temporary space in the DBEnvironment,
ALLBASE/SQL opens scratch files by rotation in each temporary
space.
Disk Space for Sorting.
All intermediate sort results are written to scratch files, not to any
DBEFileSet in the DBEnvironment. Sorting may require 100% to 200% of the
input data size in temporary disk files for intermediate results. If a
join and an ORDER BY are both involved, the amount could be as much as
300%. However, scratch files are purged as soon as the sort operation is
finished.
Final sort output for queries with GROUP BY, ORDER BY or
DISTINCT is written to the SYSTEM DBEFileSet, requiring space worth 100%
of the sorted output. For CREATE INDEX, the final output is directly
written to the DBEFileSet containing the index, requiring space worth the
size of the index.
The disk space required for sort operations is summarized below:
* CREATE INDEX: 100-200% of the size of the input key data in
temporary files and 100% in the INDEX DBEFileSet.
   *   DML statements using a sort: 100-300% of the size of the input
       data in temporary files and 100% in the SYSTEM DBEFileSet.
Controlling the Use of Temporary Space.
The order in which you carry out database operations may determine the
amount of temporary space required. For example, if you create a large
table, load it, then create an index, you will probably use scratch file
space for sorting. However, if you create the table, then create the
index, then load the data, you will not use scratch files. In the
second case, however, the load runs 3 to 10 times slower, because
ALLBASE/SQL creates index entries during loading; and if the data you
load is unsorted, the resulting index is about 50% larger.
Memory Utilization in Sorting.
Sorting is memory-intensive, and the amount of memory required is
estimated from the number of input tuples. The number of input tuples is
in turn estimated from table statistics. Therefore, it is important to
have up-to-date statistics when you execute ORDER BY, GROUP BY, or CREATE
INDEX statements, especially on large tables.
Performance Hints for Large Sorts.
   *   Issue the UPDATE STATISTICS statement after loading a large
       table and before issuing the CREATE INDEX statement. After you
       issue CREATE INDEX, issue another UPDATE STATISTICS statement
       to get the correct statistics for the index, as sketched after
       this list.
* Make sure that you have allocated enough temporary space to hold
the scratch files.
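For example, using the sample database (the table and column names
are from the PartsDBE sample; the index name is hypothetical):
     isql=> UPDATE STATISTICS FOR TABLE PurchDB.Parts;
     isql=> CREATE INDEX PartsIndex ON PurchDB.Parts (PartNumber);
     isql=> UPDATE STATISTICS FOR TABLE PurchDB.Parts;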
Join Methods.
ALLBASE/SQL uses two join methods: nested loop join and sort/merge join.
Nested loop joins are usually much faster than sort/merge joins. By
default, the optimizer chooses a join method depending on the query and
the statistics in the system catalog for the tables involved. To
override the join method the optimizer chooses, you can use the SETOPT
statement.
The nested loop method scans the second table once for each qualifying
row in the first table. If the scan on the second table is a serial scan
(rather than an index scan), the nested loop join can require
considerable I/O and CPU time when the table is larger than the buffer
cache. However, a nested loop join may be the best approach for joins
involving small tables.
A sort/merge join has two phases: the sort phase and the merge/join
phase.
* The sort phase: If the scan on the table is an index scan on the
joined column (that is, there is a join on col1 and the index
being picked is also on col1 or on col1, col2), then the sort can
be skipped, since the returned tuples from the scan are already
sorted.
   *   The merge/join phase: Multiple tables are merged or joined
       without scanning any table repeatedly, as a nested loop join
       does. This saves a significant amount of I/O and CPU time,
       especially for large tables.
A sort/merge join can help with tables that have indexes that are larger
than the buffer cache size.
Temporary Space in the SYSTEM DBEFileSet.
Query results from queries using ORDER BY and GROUP BY clauses are stored
in temporary pages in the SYSTEM DBEFileSet. Be sure to include enough
TABLE or MIXED DBEFile space in SYSTEM to accommodate your largest sorted
query. As your system catalog grows, you should monitor the DBEFile
space in SYSTEM occasionally (after doing an UPDATE STATISTICS on at
least one system view) to make sure enough temporary space is still
available. As needed, add DBEFiles to SYSTEM to supply temporary pages.
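For example, a MIXED DBEFile can be added to SYSTEM to supply
temporary pages (the file name and size are illustrative):
     isql=> CREATE DBEFILE SysTempF WITH PAGES = 1000,
            NAME = 'SYSTMPF', TYPE = MIXED;
     isql=> ADD DBEFILE SysTempF TO DBEFILESET SYSTEM;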
Section Caching and Directory Caching.
Section caching holds a number of sections in memory between transactions
so that they do not need to be read again for the next execution.
ALLBASE/SQL can keep up to 12 sections in memory. Section caching is
more efficient when DDL Enabled is set to NO. Directory caching stores in
memory the DBCore directory, which contains the locations of tables and
indexes. Directory caching is turned on when you set the DDL Enabled
flag to NO by using ALTDBE in SQLUtil.
Section caching is subject to the following rules:
* When there are 12 or fewer sections in memory, the system does not
try to delete any permanent sections at the end of a transaction.
* When there are more than 12 sections in memory, the system deletes
only those sections that are not opened. Sections are considered
opened if they are cursor sections and have been opened by the
OPEN CursorName statement.
* The system does not delete dynamic sections, even if the cache
limit is exceeded.
   *   The system can keep more sections in memory than the configured
       cache limit (whether the default of 12 or a lower setting such
       as 4) if all of them are opened cursors. The only limit is the
       amount of user heap space available.
This feature is helpful for an application that executes the same
sections repeatedly, especially if it has a small number of sections.
Setting Limits for Section Caching
By default, ALLBASE/SQL allocates shared memory for up to 12 sections.
You can change this limit by setting the HPSQLSECTCACHE JCW (job
control word) to any value from 4 to 128, as shown below.
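For example, one way to set the limit is with the MPE/iX CI SETVAR
command (the value 64 is illustrative):
     :SETVAR HPSQLSECTCACHE 64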
Using Multiconnect Functionality
It is possible to establish a maximum of 32 simultaneous database
connections. When your application must access more than one
DBEnvironment, there is no need to release one before connecting to
another. Performance is greatly improved by using this method rather
than connecting to and releasing each DBEnvironment individually.
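For example, an embedded SQL application might hold two connections
open and switch between them. This is a sketch; the DBEnvironment
and connection names are illustrative, and the full syntax of CONNECT
and SET CONNECTION appears in the ALLBASE/SQL Reference Manual:
     EXEC SQL CONNECT TO 'PartsDBE' AS 'Parts';
     EXEC SQL CONNECT TO 'SalesDBE' AS 'Sales';
     EXEC SQL SET CONNECTION 'Parts';
     ... work in PartsDBE ...
     EXEC SQL SET CONNECTION 'Sales';
     ... work in SalesDBE ...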
For more detailed information, refer to the chapter "Transaction
Management with Multiple DBEnvironment Connections" in the ALLBASE/SQL
Advanced Application Programming Guide and the section "Using Multiple
Connections and Transactions with Timeouts" in the ALLBASE/SQL Reference
Manual chapter "Using ALLBASE/SQL."
Using Timeouts to Tune Performance
When an application requests a database resource that is unavailable, the
application is placed on a wait queue. If the application waits longer
than the timeout value specified for its DBEnvironment connection, an
error occurs, and the transaction is rolled back. Your strategy for
specifying timeout values and handling timeout errors depends on the
specific needs of your application and on your business procedures. By
default, the timeout value is infinite.
For example, a transaction may require resources that are locked by a
second transaction that requires a great deal of execution time. In this
case, you could specify a reasonable amount of time for the first
transaction to wait before a timeout occurs. Such a strategy might be
essential in a distributed environment. For more detailed information,
refer to the chapter "Transaction Management with Multiple DBEnvironment
Connections" in the ALLBASE/SQL Advanced Application Programming Guide
and the section "Setting Timeout Values" in the ALLBASE/SQL Reference
Manual chapter "Using ALLBASE/SQL."
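For example, a session might limit its own waits to 30 seconds with a
statement along the following lines (a sketch; see the SET USER
TIMEOUT description in the ALLBASE/SQL Reference Manual for the exact
syntax and scope):
     isql=> SET USER TIMEOUT 30 SECONDS;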