Managing the HP3000 Part 28

⇓ ⇐

The HP 3000--For Complete Novices
Part 28: IMAGE/SQL Application Logic--DBUPDATE and DBPUT

by George Stachnik

In the previous article in this series, we continued to explore IMAGE/SQL applications, and the intrinsics that they use to interact with databases. We saw examples of how IMAGE/SQL uses the DBGET intrinsic to retrieve records using a variety of different access methods (chained, serial, direct). This time around, we're going to begin examining intrinsics that are used to write records to a database. There are two of them: DBUPDATE and DBPUT. In this article, we'll focus on DBUPDATE.

To begin with, you might be wondering why IMAGE/SQL has two different intrinsics for the same thing. DBUPDATE and DBPUT are both used for essentially the same purpose--to write records (or data entries, if you prefer) to an IMAGE/SQL database. So what's the difference?

The answer is that the two intrinsics actually serve quite different purposes--or at least they did when the IMAGE DBMS was first designed. Over time, HP has enhanced IMAGE/SQL in response to user demands, and in the process, the differences between DBUPDATE and DBPUT have become somewhat blurred. In this article, we'll take a historical point of view, and explain what the differences were and what they are today.

Let's begin by explaining what the difference originally was.

In the original design of IMAGE/SQL, DBUPDATE was used to update existing records in place. That is, if a record already existed in an IMAGE/SQL database, you could use DBUPDATE to modify the contents of that record in place.
DBPUT, on the other hand, was used to insert new records into a database.

It's important to note the restriction that was placed on DBUPDATE: it could only modify the contents of records in place. This distinction has often been oversimplified as follows: "DBUPDATE is for changing existing records, and DBPUT is for creating new records in a database." In actual fact, that's not quite correct. DBUPDATE is for changing existing records in a database, but there are some restrictions on exactly what kinds of changes it can make.

Specifically, DBUPDATE can modify the contents of any data item in an existing record except for so called "critical items," such as search items and sort items. The reason for this restriction is that modifying these items would force IMAGE/SQL to change the location of the record, either in the chain, or in the dataset. Changing the location of records once represented a very significant performance penalty, especially in the early days of the HP 3000 architecture. So the architects of IMAGE/SQL simply disallowed this functionality from DBUPDATE.

To understand why the original designers of IMAGE/SQL didn't want DBUPDATE to be able to relocate records (and why they eventually changed their minds), we need to explore the structure of IMAGE/SQL databases in a bit more detail.

Consider Figure 1. This figure shows a schematic of a sample IMAGE/SQL database. It's a very simple database, consisting of only two datasets--CUSTOMER-MASTER (a manual master dataset), and ORDER-NUMBER-DETAIL (a detail dataset). (If you read the article in the last issue, this figure should look familiar to you.)

Figure 1: Sample Database

The master dataset (CUSTOMER-MASTER) is represented by the inverted triangle in the upper lefthand corner of the figure. It contains four records corresponding to our company's four customers: George, Ken, Andrea, and Greg.

For the sake of simplicity, we've drawn the figure to show only one data item (CUST-NAME) in each record. In actual fact, any database designer worth his salt would have designed the records to contain additional information about each customer. In other words, each record in CUSTOMER-MASTER would contain the CUST-NAME field (which is shown in the figure), as well as other fields such as CUSTOMER-ADDDRESS, PHONE-NUM, EMAIL-ADDR, and so forth.

Now let's turn our attention to the ORDER-NUMBER-DETAIL dataset. This dataset contains one record for each order that has been placed with our company. Each order is associated with the customer who placed that order.

The first record in the dataset represents an order that was placed by Greg. Figure 1 shows five fields in the ORDER-NUMBER-DETAIL records:

CUST-NAME, the name of the customer who placed the order
ORDER-NO, a unique number that's assigned to each order
QUANT, the quantity of the item being ordered
DESC, a description of the item being ordered
AMT, the amount being paid by the customer.

Order number 0001 is represented by the first record in ORDER-NUMBER-DETAIL. Figure 1 shows that an order for one "do-job" was placed by Greg. It also tells you three other important things about Greg's account:

Greg has placed a total of three orders with our company (order numbers 0001, 0004, and 0009). Consequently, there are three records in ORDER-NUMBER-DETAIL that have the string "Greg" in the CUST-NAME field.

The Greg record in CUSTOMER-MASTER contains a pointer to the first Greg record in ORDER-NUMBER-DETAIL.

Each Greg record in ORDER-NUMBER-DETAIL contains pointers to the next Greg record, and to the previous Greg record.

Chains and Pointers

In the last article, we explored the concept of chains in some detail. From a programmatic point of view, we saw that the DBFIND intrinsic can be used to locate chains of records. Once a chain has been located, the DBGET intrinsic can be used to retrieve records in the chain. Using chained access, DBGET can retrieve either the next record or the previous record in a chain.

Because of the way pointers are structured, DBGET can navigate these chains very quickly. Let's take a closer look at how DBGET works to understand how. Suppose that you've used DBGET to retrieve the first Greg record (ORDER-NO = 0001) shown in Figure 1. In addition to the data shown in the figure (CUST-NAME, ORDER-NO and so forth), DBGET will also retrieve a pointer to the second Greg record.

To better understand how these pointers are stored in the database, and represented in an application program, refer to Figure 2.

Figure 2: Pointers in the STATUS-ARRAY

002380 01  STATUS-ARRAY.
002390     05  CONDITION-CODE          PIC S9(4) COMP.
002400     05  ENTRY-LENGTH            PIC S9(4) COMP.
002410     05  RECORD-NUMBER           PIC S9(9) COMP.
002420     05  CHAIN-LENGTH            PIC S9(9) COMP.
002430     05  BACKWARD-POINTER        PIC S9(9) COMP.
002440     05  FORWARD-POINTER         PIC S9(9) COMP.
002450

054900
055000       CALL INTRINSIC "DBGET" USING DB-NAME,
055100                         SET-NAME,
055200                         DB-MODE,
055300                         STATUS-ARRAY,
055400                         ITEMS-LIST,
055500                         ORDER-DETAIL-BUFFER,
055600                         IGNORED-PARM
055700

Figure 2 shows a fragment of COBOL code that calls the DBGET intrinsic (beginning at line 5500). Note that one of the parameters being passed to DBGET is the STATUS-ARRAY. This parameter is shared among almost all the IMAGE/SQL intrinsics. It's an array of data items that is used by IMAGE/SQL to track errors and to pass information from intrinsic to intrinsic. (It was discussed in detail in part 26 of this series).

The COBOL code that defines the status array is shown at the top of Figure 2 (beginning with line 2380). Up until now, our discussions of the status array have focused upon the CONDITION-CODE. Now we're going to turn our attention to the items called CHAIN-LENGTH, BACKWARD-POINTER, and FORWARD-POINTER:

CHAIN-LENGTH is a count of how many records there are in the chain containing the current record.
BACKWARD-POINTER and FORWARD-POINTER contain the disk addresses, respectively, of the previous record and the next record in the chain.

Every time you use DBGET to retrieve a record in a chain, you automatically also retrieve the pointers to the next record and previous record in that chain. These pointers are placed in the status array, in the items shown in Figure 2. Similarly, when DBFIND locates a chain of records, it places an integer value in CHAIN-LENGTH indicating how many records are in the chain that it has located. It also fills in the FORWARD-POINTER field with the disk address of the first record in the chain.

These pointers are one of the keys to IMAGE/SQL's outstanding performance. Because the pointers are stored in the form of disk addresses (that is, the actual hardware addresses on the disk of the records in question), IMAGE/SQL can retrieve the next record in the chain, or the previous one, very quickly. There's no need to calculate disk addresses, or search the database. Every record is linked directly to its neighbors.

DBUPDATE and Pointers

The downside of IMAGE/SQL's system of pointers is that it can get in the way when you need to change the contents of a record. In the original version of IMAGE/SQL, the rule was as follows: You can use DBUPDATE to change the contents of any data item in a record except for "critical items." Critical items are items that can't be changed without affecting the pointers specifically, sort items and search items.

Let's return to Figure 1 for a moment to see why critical items were originally considered "hands off" for DBUPDATE. Imagine that an application program was working its way down the Greg Chain shown in the figure. Suppose that you had just used DBGET to retrieve the 2^nd Greg record in the chain (order number 0009).

Using DBUPDATE, you are allowed to change the price, or the quantity, or even the order number in this record, because none of these are critical items. That is, none of them has any bearing on the record's location in the chain, or in the dataset.

However, you could not change the contents of the CUST-NAME item from "Greg" to something different, say, to "Ken." CUST-NAME is considered a critical item because the previous and the next record in the chain contain pointers to the record that we're working on. If DBUPDATE allowed you to change Greg's name to Ken, it would also have to make a number of other changes in the database.

The length of the Greg chain (which was 3) would have to be changed to 2. This entails changes to the master dataset CUSTOMER-MASTER, where the chain lengths are stored.

Also the records for order number 0001 and 0004 would have to be changed. Currently, these records contain pointers to order number 0009, but if we're going to change the contents of the critical CUST-NAME item from Greg to something else, then this record will no longer be a member of the Greg chain. So the forward pointer from order number 0001 would have to be changed to point to 0004, and, similarly, the backward pointer in 0004 would have to be changed to reference 0001.

If all that seems like a lot of work, well, "you ain't seen nothin' yet." If we allow DBUPDATE to change the name from "Greg" to Ken" in order number 0009, then it will no longer be a part of the Greg chain. But which chain will it be a part of?

DBUPDATE will now literally have to do a new DBFIND to see if there is a record in CUSTOMER-MASTER that matches the new name "Ken." If not (assuming that this is a manual master), it would have to return an error. In Figure 1, we can see that there is a "Ken" chain that already exists. So order number 0009 would then be added to that chain, which is currently made up of orders number 0003, 0005, 0008, and 0011.

You can see that allowing DBUPDATE to make changes to search items like CUST-NAME takes an intrinsic that's blazingly fast (so long as it only writes to non-critical items) and changes it to an intrinsic that could be numbingly slow (because of all the work it now would now have to do). This is especially true if you're operating on a comparatively slow hardware platform, like one of the early versions of the HP 3000, which were all that was available when IMAGE/SQL was originally designed.

Consequently, early versions of IMAGE/SQL did not allow DBUPDATE to change critical items. If you wanted to change the CUST-NAME field in Figure 1, you'd have to physically delete the record (using DBDELETE) and then replace it with an entirely new record (using DBPUT).

Critical Item Update

Over the years, HP brought newer and faster hardware to HP 3000 customers. Larger memory configurations made it possible to cache entire databases in memory. This greatly reduced the performance impact of changes to critical items. By the early 1990s, the performance penalty had been reduced to milliseconds, simply because the hardware had gotten so fast. Customers told HP that it was time to "take the gloves off" of DBUPDATE.

So, in the mid 1990s, HP finally made it possible for DBUPDATE to change critical items. This so called "Critical Item Update" feature meant that application programs could use DBUPDATE to change any item in a data entry.

In order to ensure compatibility with earlier versions, the new feature was designed so that it must be enabled using DBUTIL. If it isn't enabled, then DBUPDATE works just as it always has. This ensured that application programs that assumed that DBUPDATE could not change critical items would not be "broken" by the enhancement. But if the Critical Item Update feature is enabled, then DBUPDATE will allow an application program to change the contents of any item (including critical items), and automatically make the corresponding changes to the pointers and so forth that we've just described.

Because most legacy applications were written prior to the mid 90s, most IMAGE/SQL applications that use DBUPDATE behave as if they could not change critical items. Instead, when critical items must be changed, they use DBDELETE and DBPUT to replace the entire record.

Clearly, if an application does a lot of this sort of thing, there could now be a significant performance payoff from enabling Critical Item Update and enhancing the program logic to use DBUPDATE instead of DBDELETE and DBPUT.

Sort Items

We've seen that search items (like CUST-NAME in Figure 1) cannot be changed using DBUPDATE unless Critical Item Update is enabled. Search items are therefore referred to as critical items. There is another kind of critical item, and that is a sort item.

Suppose that the CUST-NAME chain shown in Figure 1 were defined in the schema to be a sorted chain. Ordinarily, chains of records are not sorted; they are built in chronological sequence. That is, each record that is added to a chain is simply tacked on to the end of the chain.

But when a chain is defined to be a sorted chain, then a specific data item must be identified in the schema as a sort item, and the chain is maintained in sequence by the sort item. It's important to understand that defining a chain to be a sorted chain is not free from a performance perspective. You must pay (and pay dearly) for the convenience of having every chain maintained in sorted order. The payments are in the form of additional overhead every time you make a change to a chain of records. Let's see how.

First of all, in order to define a sorted chain, you name the sort item in the schema. For example, suppose we wanted the CUST-NAME chains to be sorted by the contents of DESC. When we defined that data item in the schema, we'd specify the sort field as follows:

ENTRY: CUST-NAME(CUSTOMER-MASTER(DESC)),

The data entry CUST-NAME is a search item, which defines a chain. We can tell because it's followed by the name of a corresponding master dataset (CUSTOMER-MASTER).

The CUST-NAME chain of records is sorted by the DESC data item, and we can tell that because it appears in parenthesis after the master dataset name. This is how we know that DESC is a sort item.

In the case of Figure 1, assume that the CUST-NAME chain is sorted by the description (DESC). As a result, the order for a "Do-job" appears at the head of the chain, followed by the order for a "thingumbob," followed in turn by the order for the "Widget."

There are a number of important things to note about sorted chains:

The order of the chain is different from the order in which the records are physically stored. That is, if you were to read through all the records in ORDER-NUMBER-DETAIL sequentially, you'd begin with order number 0001, and continue with order 0002, 0003, and so forth. You would encounter all the records in the Greg chain; order number 0001 would appear before number 0004 or 0009.
But if you were to read the records in the Greg Chain using chained access, they'd be ordered by description. Thus order number 0001 (for a Do-job) would appear first, followed by 0009 (for a thingumbob) and then 0004 (for a widget).
When you write a new record to a sorted chain using DBPUT, the record is not simply "tacked on" to the end of the chain. Rather, DBPUT must read the entire chain to figure out where to insert the new record. In databases with lengthy chains made up of thousands of records, this can represent a very significant performance hit. This is where the hidden performance cost of sorted chains lies.
You can use DBUPDATE to update a record in a sorted chain. But just as we saw with search items, you cannot use DBUPDATE to change the contents of a sort item unless Critical Item Update is enabled.

The reason for the restriction on DBUPDATE is similar to the reason why DBUPDATE cannot change search items. If we allowed DBUPDATE to change sort items, then the targeted record would need to be relocated in the chain. And since the chain is sorted, relocating the record represents a significant performance penalty.

The designers of TurboIMAGE thought that this would be too great a performance hit to pay. So sort items, like search items, were considered to be "hands off" for DBUPDATE until the Critical Item Update feature was added in the mid 1990s.

Sample Code

We'll wrap this article up by looking at some sample COBOL code that uses DBGET and DBUPDATE to change a record in place. (Note that this sample code does not use the same database that we've been showing in Figure 1.)

Look at the sample code fragment in Figure 3. DBGET is used to retrieve a record from a dataset called PARTS. The record is stored in a data item called PARTS-BUFFER. We are using mode 7, which retrieves a record using a search item value. In this example, we prompt the user for the search item beginning at line 59400. The COBOL ACCEPT verb allows the user to type the desired part number. The actual call to DBGET follows at line 60400.

Lines 61300 through 61800 contain error handling code. Assuming that there has been no error (that is, that the part number entered by the user actually exists in the database), we will fall through to line 62100. Here we display the information that we've retrieved from the database, including the part number, the description (line 62700), and the current quantity in stock (line 63200).

The sample program fragment shown in Figure 3 gives the user opportunities to change each of these three values using the COBOL ACCEPT verb. These opportunities are at line 62500, 63000, and 63500. The user can either accept the existing values or type new ones. Either way, the final values are placed in PARTS-BUFFER. Then, at line 64100, we call the DBUPDATE intrinsic to update the information in the database using the new data in PARTS-BUFFER. The error handling routine that begins at line 64900 is instructive. If CONDITION-CODE contains a value of zero after we return from DBUPDATE, we will fall through to line 65900, which displays a message that the part was successfully updated. On the other hand, if we determine that there has been an error (that is, that CONDITION-CODE was non-zero), we must test for specific values.

Figure 3: Using DBGET and DBUPDATE

058500 UPDATE-PART-INFORMATION.
058800
059400     DISPLAY "Please enter the Part Number to update: "
059500         WITH NO ADVANCING.
059600     ACCEPT SEARCH-ITEM-VALUE FREE.
059800
059900     MOVE 7         TO DB-MODE.
060000     MOVE "@;"      TO ITEMS-LIST.
060100     MOVE "PARTS;"  TO SET-NAME.
060200
060400     CALL INTRINSIC "DBGET" USING DB-NAME,
060500                        SET-NAME,
060600                        DB-MODE,
060700                        STATUS-ARRAY,
060800                        ITEMS-LIST,
060900                        PARTS-BUFFER,
061000                        SEARCH-ITEM-VALUE.
061200
061300     IF CONDITION-CODE NOT = 0
061500       IF CONDITION-CODE = 17
061600         DISPLAY "The Part does not exist."
061700       ELSE
061800         PERFORM IMAGE-ERROR
061900     ELSE
062100       DISPLAY "Current Part Number = ",PART-NO-VALUE
062200                                  OF PARTS-BUFFER
062300       DISPLAY "          New Value : "
062400          WITH NO ADVANCING
062500       ACCEPT PART-NO-VALUE OF PARTS-BUFFER FREE
062700       DISPLAY "Current Description = ",PART-DESCRIPTION-VALUE
062800       DISPLAY "          New Value : "
062900          WITH NO ADVANCING
063000       ACCEPT PART-DESCRIPTION-VALUE FREE
063200       DISPLAY "Current Quantity in Stock = ",QTY-IN-STOCK-VALUE
063300       DISPLAY "               New Value  : "
063400          WITH NO ADVANCING
063500       ACCEPT QTY-IN-STOCK-VALUE FREE
063700
063800       MOVE 1 TO DB-MODE
063900
064100       CALL INTRINSIC "DBUPDATE" USING DB-NAME,
064200                             SET-NAME,
064300                             DB-MODE,
064400                             STATUS-ARRAY,
064500                             ITEMS-LIST,
064600                             PARTS-BUFFER
064800
064900       IF CONDITION-CODE NOT= 0
065100          IF CONDITION-CODE = 41
065200             DISPLAY "The value of the Part Number",
065201                     " cannot be changed"
065210          ELSE
065300             IF CONDITION-CODE = 42
065400                DISPLAY "You do not have the ability",
065500                        " to update the data item."
065600             ELSE
065700                PERFORM IMAGE-ERROR
065800       ELSE
065900         DISPLAY "The Part was successfully updated".
066000
066100     DISPLAY SPACE.

A condition code of 41 (line 65100) means that we tried to change the part number, which is a critical item. Don't forget that this error will not occur if Critical Item Update has been enabled. In that case, IMAGE/SQL would have changed the part number (paying the performance penalty to re-chain the record if necessary), and returned a condition code of zero (assuming that nothing else went wrong.)

Changing Non-Critical Items

In Figure 3, Line 65300 reminds us that there is at least one other reason why DBUPDATE might not let us change a particular item in a record. Suppose that a user who is running the program shown in Figure 3 tries to change a non-critical item, such as QTY-IN-STOCK. Ordinarily, you wouldn't expect DBUPDATE to give you any trouble. But DBUPDATE is in the unenviable position of having to report trouble from another quarter.

Line 65300 of our program tests for a condition code of 42. The error message indicates that "You do not have the ability to update the data item." This is a trap for a security violation. Recall that the database schema contains a read/write list for every item in the database. This list defines who can read or write each individual data item in the database. Depending on the database password used to open the database, a user may or may not be allowed to update a particular item.

If IMAGE/SQL detects that the user is trying to update an item that he's not allowed to change, the error code of 42 will be returned from DBUPDATE.

There's one more error check in the program code shown in Figure 3. If the error that we've trapped is neither 41 nor 42, then some unforeseen error has occurred. To handle that contingency, the program transfers control to a routine called IMAGE-ERROR, which closes the database and aborts the application program. (The code that makes up IMAGE-ERROR is not shown in Figure 3.)

This kind of generic test for a non-zero condition code should be made after every IMAGE/SQL intrinsic call.

We've seen how DBUPDATE is used to update records in a database. Next time, we'll turn our attention to DBPUT.

George Stachnik works in Hewlett-Packard's Commercial Systems Division in Cupertino, California. He is responsible for the development and delivery of e3000 training.

←Part 27 Part 29→