ALLBASE/SQL Reference Manual
> Chapter 2 Using ALLBASE/SQLUnderstanding Data Access Paths |
||||||||||||||||||||||||
|
Serial AccessSerial access does not require the existence of any special object in addition to the table itself. If ALLBASE/SQL chooses serial access when you issue a query, it starts reading data from the first page in the table and continues to the end. Serial access is probably the best access method when you intend to read all the data in the table. For example, an application that updates every row in a table in exactly the same way would perform best using a serial scan. Indexed AccessIndexed access requires the use of a named index defined on specific columns in the table to be accessed. Indexes can be plain, or they can be unique and/or clustering. Tables having a unique index cannot have duplicate data values in the key column(s). A clustering index causes rows with similar key values to be stored near to each other on disk when this is possible. A table that is to use a clustering index should be loaded in the key order specified by the clustering index. A clustering index can be defined on a unique or referential constraint. Whenever you issue a query, the query processor checks to see if an index exists for one or more of the columns in the query. If an index is available and if the optimizer decides that using the index is the fastest way to access the data, ALLBASE/SQL looks up the key values in the index first, then goes directly to the pages containing table data. For example, in the following query, assume that PurchDB.Parts contains a large number of rows and that a unique index exists on the PartNumber column: isql=> SELECT PartName, SalesPrice FROM PurchDB.Parts > WHERE PartNumber = '1323-D-01';The optimizer would probably choose this unique index for access to the single row because the alternative choice--a serial scan--would require reading each page in the table until the qualifying row is reached. You define an index with the CREATE INDEX statement. The components of an index definition are as follows:
1 | CREATE UNIQUE INDEX PartIndex --2 ON PurchDB.Parts --3 (Partno) --4ALLBASE/SQL can choose to use an index when processing the SELECT, UPDATE, or DELETE statements if the following criteria are satisfied:
Hashed AccessHashed access requires you to specify hashing when you create the table, before loading data. Because a hash structure is specified as part of the table definition, you do not assign a name to it, as you do with an index. However, you must identify specific key columns and a number of primary pages for data storage. ALLBASE/SQL determines the placement of rows based on specific unique key values. You can define one hash structure per table at table creation time; and if a hash is defined, you cannot define a clustering index on the table. You can define a multiple-column key for a hash structure; up to 16 columns are permitted in the key. A hash structure is a group of designated pages in a DBEFile that are set aside for the storage of tuples according to the values in a unique hash key. The key enforces uniqueness; duplicate values cannot exist in the hash key column(s). A well-chosen hash key, like a good index key, provides the optimizer with the choice of a potentially faster data access method than a serial scan. Create a hash structure at the time you create a table. In addition to the components of a table definition, a hash structure definition includes:
CREATE PUBLIC TABLE PurchDB.Vendors (VendorNumber INTEGER NOT NULL, VendorName CHAR(30) NOT NULL, ContactName CHAR(30), PhoneNumber CHAR(15), VendorStreet CHAR(30) NOT NULL, VendorCity CHAR(20) NOT NULL, VendorState CHAR(2) NOT NULL, VendorZipCode CHAR(10) NOT NULL, VendorRemarks VARCHAR(60) ) UNIQUE HASH ON (VendorNumber) -- 1 PAGES = 101 -- 2 IN PurchFSUse the UNIQUE HASH clause or the HASH ON CONSTRAINT clause to specify one or more columns for a hash key. Use the PAGES= clause to define a number of primary pages in which to store the data in the table. This is different from ordinary data storage, which does not require a number of primary pages. Based on the key and the number of primary pages you specify, ALLBASE/SQL calculates a page number for each row before insertion into the table. The page number depends directly on the data in the key. Because a specific number of primary pages is specified, you must create the hash structure as you create the table; you cannot modify a table from normal to hash storage at a later time. The optimizer can decide to use hashed access provided the statement contains a WHERE clause with an EQUAL factor for each column in the hash key. This makes hashing especially useful for tables on which you need quick random access to a specific row. For example, assuming you have defined a hash key on VendorNumber, the optimizer might choose hashed access for the following: isql=> SELECT * FROM PurchDB.Vendors > WHERE VendorNumber = 9002;However, it would not consider hash access for the following: isql=> SELECT * FROM PurchDB.Vendors > WHERE VendorNumber > 9002 > ORDER BY VendorName;Hash structures operate like unique indexes; that is, they enforce the uniqueness of each key in the table. If you attempt to insert a duplicate key, ALLBASE/SQL will return an error message. Differences between Hashed and Indexed AccessHashing may provide faster access than B-tree lookups for many types of common queries, and it does not require the overhead of additional file space required by B-tree indexes. In addition, hashing is not subject to the overhead of updating index pages when you insert or modify rows. However, updating key values in a hash table requires you to delete the row containing the key value and then insert a row containing the new value. This means that you should choose a non-volatile key for hashing whenever possible. When to Use a Hash StructureHashing offers high performance when you need essentially random access to individual tuples. It is not appropriate for applications that require sorting of the query result. In cases where both random access and sorting are required at different times, you can define a B-tree index as well as a hashing structure. This allows the optimizer the choice of the most efficient method for the specific query. The best candidates for the use of hash structures are applications in which the following occur:
TID AccessEach row of a table has a unique address called the tuple identifier, or TID. TID functionality provides the fastest possible data access to a single row. You can obtain the TID of any row with the SELECT statement. For more information on TID access refer to the ALLBASE/SQL application programming manual for the language you are using.
|