Designing Indexes [ ALLBASE/SQL Database Administration Guide ] MPE/iX 5.5 Documentation
ALLBASE/SQL Database Administration Guide
Designing Indexes
An index is defined on one or more columns in a table to facilitate rapid
retrieval of data. ALLBASE/SQL decides whether to use an index in the
process of creating an optimal access path for the data during query
optimization. By default, you cannot specify the use of an index in an
SQL statement; the optimizer will decide to use an index if it will
improve performance during query processing. However, if you use the
SETOPT statement you can override the optimizer and specify the use of an
index.
ALLBASE/SQL uses a B-tree (balanced tree) design for indexes, as shown in
Figure 2-7 .
Figure 2-7. B-Tree Index Design
The tree has different levels:
* Root level-- a single root page containing key values that
subdivide the index search.
* Intermediate levels-- pages containing key values that further
subdivide an index search.
* Leaf level-- pages which point to data values in the indexed
column or columns.
When an index is used to access a row on the basis of a key value,
ALLBASE/SQL first looks at the root page, then follows pointers down the
tree until it finds an appropriate leaf entry, which points to a specific
location in a data page. In Figure 2-7, the root page contains one
subdividing value, Larry. In order to find student Glen, we would first
examine the root page, which points to the leftmost page on the next
level, since Glen comes before Larry in this collating sequence.
This page in turn points to the second leaf page from the left, since
Glen comes after Dave and before Larry. Finally, the leaf page points to
a specific data page where the desired row is found.
Note that ALLBASE/SQL B-tree indexes are doubly linked, in that the leaf
pages point to other leaf pages, so that the next higher or lower value
in key order can be quickly located without "backing out" to a higher
level in the tree.
Keep in mind the following general guidelines when designing indexes:
* Indexes can only be defined on tables.
* An index can only be defined on one table.
* The maximum number of columns allowed per index is 16.
* There is no limit to the number of indexes you can define.
* Too many indexes can degrade performance because when a row is
inserted, deleted, or updated in a table, the indexes must also be
updated.
* A small table may not need an index.
* Cyclic applications requiring multiple access paths (for example,
accounting applications for quarterly and annual reports) can
create indexes, then drop them when finished. This reduces index
overhead and maintenance and improves application performance.
DDL must be enabled for this approach to be possible.
* An index may improve the performance of queries containing a WHERE
clause involving a comparison on the first column of a multicolumn
index key.
ALLBASE/SQL updates indexes automatically as rows in the table are
updated, inserted, or deleted. This adds considerable overhead to the
UPDATE, DELETE, and INSERT operations. ALLBASE/SQL maintains index
information and statistics in the SYSTEM.INDEX view in the system
catalog, which is presented fully in the chapter, "System Catalog."
Determining Index Keys
All of the columns in an index together compose what is called a key. An
index on a single column has a simple key. An index on multiple columns
has a compound key.
Any column that you specify as a search item in a query will cause
ALLBASE/SQL to look for an index with that column specified as part of a
key. If no appropriate index is found, or if the optimizer decides not
to use the index, ALLBASE/SQL performs a sequential search on that table.
Sequential searches may degrade performance when most of the performed
queries have a WHERE clause.
The ideal index has the one or two most common columns used in a WHERE
clause defined as keys. Keep the following guidelines in mind when
determining which columns to use for a key:
* A simple key can be as long as 1010 bytes.
* An index should contain a column that is used in the WHERE clause
of a frequently executed query.
* An index should contain a column that is used to verify the
existence of a value, especially if it is a unique index.
If you are considering a compound key, you should also keep the following
in mind:
* A compound key can have up to 16 columns.
* The length of a compound key must be less than or equal to 1024
bytes. Use the following formula to determine whether or not a
proposed compound key is acceptable:
Where NC is the number of columns, and KL is the sum of the column
lengths in the key. A single column index may not exceed 1010
bytes.
* The order in which you specify columns in the CREATE INDEX
statement determines the sort ordering: the first column listed
is the primary, or first, order of sort; the second column listed
is second in order; and so on.
The most common index is created on the primary key of a table. For
example, the order number is the primary key in the Orders table.
Therefore an index with the OrderNumber column as a key should be created
on the Orders table.
How Index Keys are Used.
ALLBASE/SQL uses simple and compound key indexes differently. Suppose
most of your queries have a WHERE clause for the VendorNumber and
OrderNumber, but the two columns are never used in the same WHERE clause:
WHERE VendorNumber='StringConstant'
WHERE OrderNumber=Integer
In this case, you should create two separate indexes, each with a simple
key.
Even if the two columns are used in the same WHERE clause, but they are
compared against each other, only one of the columns can be used as a
search item.
WHERE VendorNumber=OrderNumber
In this case as well, you should create two indexes with simple keys.
On the other hand, suppose the two columns are always used together in a
WHERE clause:
WHERE VendorNumber='StringConstant' AND OrderNumber=Integer
In this case you should create one index with a compound key.
While it is not required to have an index on each table in a join,
defining an index on one or both of the tables may improve performance.
In general, you should define an index on the columns specified in the
join. For example, if the sample database has an application that joins
the SupplyPrice table with the OrderItems table, then an index with
VendPartNumber (the common column) should be defined on one of the tables
if not both of them.
Determining Index Type
You can design one of four different types of index:
* Unique
* Clustering
* Clustering and unique
* Neither clustering nor unique
The unique index will not permit duplicate values for the key columns
specified. The clustering index attempts to place rows with similar key
values physically close to each other on disk.
Defining Unique Indexes.
If you specify the unique option when creating an index, the columns
named in the index key are kept unique. Unique indexes prevent duplicate
data in the columns used as keys of the index.
In the sample database, the Vendors table has a unique index created on
the VendorNumber column to ensure that a vendor number refers to only one
vendor. A table can have multiple unique indexes. However, ALLBASE/SQL
will not allow you to create a unique index on a table that already
contains rows with duplicate values in the key columns.
Defining Clustering Indexes.
Clustering indexes are closely related to physical design of the
database. To understand what a clustering index does, you must know that
ALLBASE/SQL arranges data in sections of a DBEFile called pages. A
clustering index attempts to place all rows with similar key values on
the same or consecutive pages. Because the rows are physically close,
I/O overhead is reduced and performance improved whenever the rows are
retrieved in key order. This can be helpful with queries in which you
make use of LIKE and BETWEEN predicates.
A clustering index should be defined on a table after the initial loading
of the table but before any additional rows are inserted in the table.
It is recommended that you sort data on the clustering key before you
load it into the table. Rows inserted before a clustering index is
created are not repositioned after the index is created. Therefore, if
you want to create a clustering index on a table that already contains
data, you should unload the table using a SELECT statement with an ORDER
BY clause, reload the table, then create the clustering index.
Keep the following guidelines in mind when considering a clustering
index:
* Regardless of the number of indexes you define on a table, only
one index can be a clustering index.
* A clustering index can be either non-unique or unique.
* A clustering index should be based on the most commonly used sort
sequence.
* Clustering benefits applications that must search large amounts of
data to retrieve rows in sequential order based on a key value.
* A clustering index can be defined on an existing table; however,
existing rows will not be repositioned.
* A clustering index may improve the performance of a query
containing a DISTINCT, GROUP BY or ORDER BY clause if the columns
in the sort list match the first columns in the index definition.
* A clustering index may improve the performance of a query
containing a WHERE clause involving a comparison on the first few
columns listed in the index.
* Any index makes INSERTs and UPDATEs more expensive.
* Clustering of data is maintained when there is a relatively large
volume of DELETE operations followed by a similar volume of INSERT
operations on the table. Clustering of data is not maintained
when there is a larger volume of INSERT operations than DELETE
operations.
MPE/iX 5.5 Documentation