HP 3000 Manuals

Using Subqueries [ ALLBASE/SQL Performance Guidelines ] MPE/iX 5.0 Documentation


ALLBASE/SQL Performance Guidelines

Using Subqueries 

You can nest queries within the predicates of other queries.  This makes
it easier to express complex queries, and it makes it possible to
formulate queries in several different ways.  Thus you can create
different queries which return the same rows but which do so with vastly
different performance.  In the following discussion, queries 1, 2, and 3
are all equivalent.

Most subqueries can also be expressed in an equivalent join form.  For
example, assuming that the PartNumber column in PurchDB.Parts is unique,
the query:

     (1) SELECT * FROM PurchDB.SupplyPrice
         WHERE PartNumber IN (SELECT PartNumber from PurchDB.Parts
                          WHERE PartName = 'Cache Memory Unit')

can also be written:

     (2) SELECT PurchDB.SupplyPrice.*
         FROM PurchDB.SupplyPrice, PurchDB.Parts
         WHERE PurchDB.SupplyPrice.PartNumber = PurchDB.Parts.PartNumber
         AND PurchDB.Parts.PartName = 'Cache Memory Unit'

Both queries return the same information.  In general, while the subquery
is more easily understood and easier to formulate, the join actually
improves performance because it gives the optimizer more efficient
choices of how to execute the query. 
There are two types of subqueries:  correlated and non-correlated.  A
correlated subquery is one in which the subquery makes reference to one
or more columns of an outer query.  For example,

     (3) SELECT * FROM PurchDB.SupplyPrice
         WHERE EXISTS (SELECT PartNumber
                       FROM PurchDB.Parts
                       WHERE PartName = 'Cache Memory Unit'
                       AND PartNumber = PurchDB.SupplyPrice.PartNumber)

This is a correlated subquery, since the subquery makes reference to
PartNumber from the outer query.  In this case, the subquery must be
executed for each row returned from the outer query.  For the above
query, since the SupplyPrice table has 69 rows in it, there must be one
scan on the outer query and 69 scans on the inner query for a total of 70
scans.  The equivalent join shown in query (2) can be executed using a
sort/merge technique, which takes two sorts and two scans.

It should be noted that the predicate WHERE PartNumber IN of query (1) is
internally the same as the predicate WHERE EXISTS in query (3).  All
quantified predicates involving subqueries get transformed into EXISTS
predicates internally.  Therefore the nested query using the IN predicate
also takes 70 scans.

In general, non-correlated subqueries are faster than joins, and
correlated subqueries are slower than joins.  You should know whether the
subquery will return a single row or not so as to take advantage of the
speed of non-correlated subqueries.  Since we know that each part in the
Parts table is unique, there is no reason why we could not express query
(1) as a non-correlated subquery by simply replacing the IN with an equal
sign (=).  Then the subquery does not depend on the outer query and the
subquery can be executed only once.  In this case, there is one scan for
the outer query and one scan for the subquery for a total of two scans.
This is even faster than the sort/merge join since there is no sorting
involved.

One exception to the rule that correlated subqueries are slower than an
equivalent join is when the equivalent join involves an aggregate.  For
example, consider the following query, which is transformed into a
correlated subquery internally by the query processor:

     SELECT VendorNumber, PartNumber, DiscountQty
     FROM PurchDB.SupplyPrice
     WHERE DiscountQty < ALL (SELECT DiscountQty
                              FROM PurchDB.SupplyPrice
                              WHERE VendorNumber = 9010)

becomes

     SELECT VendorNumber, PartNumber, DiscountQty
     FROM PurchDB.SupplyPrice sp1
     WHERE NOT EXISTS (SELECT DiscountQty
                       FROM PurchDB.SupplyPrice sp2
                       WHERE VendorNumber =9010
                       AND sp2.DiscountQty <= sp1.DiscountQty)

The equivalent join would require the use of an aggregate, and would
therefore be slower:

     SELECT sp1.VendorNumber, sp1.PartNumber, sp1.DiscountQty
     FROM PurchDB.SupplyPrice sp1, PurchDB.SupplyPrice sp2
     WHERE sp2.VendorNumber = 9010
     GROUP BY sp1.VendorNumber, sp1.PartNumber, sp1.DiscountQty
     HAVING sp1.DiscountQty < MIN(sp2.DiscountQty)

In this case, because of the complexity of the query, the subquery is a
better choice than the join.  The best solution, however, is an
equivalent non-correlated subquery:

     SELECT VendorNumber, PartNumber,
            DiscountQty from PurchDB.SupplyPrice
     WHERE DiscountQty < (SELECT MIN(DiscountQty)
                          FROM PurchDB.SupplyPrice
                          WHERE VendorNumber = 9010)



MPE/iX 5.0 Documentation