SORT-MERGE/XL General User's Guide > Chapter 6 SORT-MERGE/XL Commands

ALTSEQ

The >ALTSEQ command defines a collating sequence other than the standard ASCII or EBCDIC format. The >ALTSEQ command must be preceded by a >DATA command. It is effective only if the keys are of type BYTE and if the input data is ASCII. (Refer to Appendix B for information on ASCII and EBCDIC character set values.)

SYNTAX



  >A[LTSEQ] modspec1[, modspec2]...[, modspecN]

PARAMETERS


modspec

A set of parameters you use to define your own collating sequence. You can use more than one group of these parameters in one or more successive >ALTSEQ commands until the desired collating sequence is defined.

The modspec parameter set has the following form:


                  { =  }
  [EACH] leftspec {    } rightspec
                  {WITH}

         OR

                  {WITH}
  MERGE  leftspec {    } rightspec
                  { =  } 

To specify leftspec and rightspec use the following form:

  {string        }
  {num byte      }
  {range string  } 

EACH

The EACH parameter indicates that the collating sequence is to be modified by assigning each character of leftspec the ordinal value obtained by taking the ASCII code decimal value of the corresponding character in rightspec. If leftspec is longer than rightspec, rightspec is concatenated to itself enough times to make it equal in length to leftspec.

MERGE

The MERGE parameter indicates that the collating sequence is to be modified by merging leftspec and rightspec. Characters are selected alternately from leftspec and rightspec.


NOTE: If neither EACH nor MERGE is specified, the collating sequence is modified as if EACH was specified, but rightspec is padded with blanks if it is shorter than leftspec.

=

When used in the modspec parameter, the equal sign (=) functions as a separator between leftspec and rightspec.

WITH

The WITH parameter can be used interchangeably with the equal sign (=) and is generally used when MERGE is specified.

string

A string is a single character or a group of ASCII or EBCDIC characters specified by enclosing them in quotation marks, for example, "J" or "JAS".

num byte

A numerical specification used in the following form:


  [%[(bb)]] nnn

The bb is the base, given as a decimal number between 2 and 16, inclusive. You specify %(bb) to indicate a base other than 8 or 10.

The % indicates base 8 when no (bb) is specified. If both % and (bb) are omitted, the nnn parameter is assumed to be a decimal number (that is, base 10).

The nnn represents a number (integer) with a value between 0 and decimal 255, inclusive. Each n is a digit between 0 and 9, inclusive, or one of the letters A, B, C, D, E, or F. The letters A through F represent the digits 10 through 15 when a base greater than 10 is used. Each digit n of nnn must be less than the base bb.

For example, 12 represents the decimal value 12; %12 represents the octal value 12, which is equivalent to the decimal value 10; and %(16)12 represents the hexadecimal value 12, which is equivalent to the decimal value 18.
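The num byte rules above can be sketched in Python. This is only an illustration of the notation, not SORT/XL code; the function name parse_num_byte and the pattern _NUM_BYTE are hypothetical, and the sketch assumes that blanks around the parts of the specification are allowed.

```python
import re

# Hypothetical helper (not part of SORT/XL): parses the [%[(bb)]] nnn form.
_NUM_BYTE = re.compile(r"\s*(%(?:\((\d+)\))?)?\s*([0-9A-Fa-f]+)\s*")


def parse_num_byte(spec: str) -> int:
    """Return the value (0..255) of a num byte such as '12', '%12', or '%(16)12'."""
    match = _NUM_BYTE.fullmatch(spec)
    if match is None:
        raise ValueError(f"not a num byte: {spec!r}")
    percent, bb, digits = match.groups()
    if bb is not None:
        base = int(bb)   # %(bb): explicit base, written in decimal
    elif percent:
        base = 8         # % alone means octal
    else:
        base = 10        # no prefix means decimal
    if not 2 <= base <= 16:
        raise ValueError(f"base must be between 2 and 16: {base}")
    value = int(digits, base)  # raises ValueError if a digit is not below the base
    if value > 255:
        raise ValueError(f"value exceeds 255: {value}")
    return value


for spec in ("12", "%12", "%(16)12"):
    print(spec, "->", parse_num_byte(spec))
```

The three specifications in the loop are the ones from the example above and evaluate to 12, 10, and 18.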

range string

Specifies two characters separated by a minus sign (-) and enclosed in quotation marks, or two numeric byte specifications separated by a minus sign. For example, "A-Z" or %101-%132 (which is the octal specification for the character range "A-Z").


NOTE: Whenever a minus sign (-) is the second character in a group of three characters, the group is treated as a range. In all other cases, the minus sign is treated the same as any other character. For example, "A-D" represents the four characters A, B, C, and D while "AD-" represents the three characters A, D, and -.
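The range rule in the note can be illustrated with a short Python sketch. The function name expand_string is hypothetical, and the sketch assumes that the rule applies only to a quoted string of exactly three characters; it takes the text between the quotation marks as its argument.

```python
def expand_string(text: str) -> str:
    """Expand the contents of a quoted string, applying the three-character range rule."""
    # Assumption: a range is a string of exactly three characters whose
    # second character is a minus sign, for example "A-D".
    if len(text) == 3 and text[1] == "-":
        first, last = ord(text[0]), ord(text[2])
        return "".join(chr(code) for code in range(first, last + 1))
    # In all other cases the minus sign is an ordinary character.
    return text


print(expand_string("A-D"))  # the four characters A, B, C, and D
print(expand_string("AD-"))  # the three characters A, D, and -

# A numeric range such as %101-%132 covers the same codes as "A-Z".
numeric_range = "".join(chr(code) for code in range(0o101, 0o132 + 1))
print(numeric_range == expand_string("A-Z"))
```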

DISCUSSION


Each modification of the collating sequence changes the ordinal values in the translation table assigned to the characters specified by leftspec. Refer to the >SHOW command for a discussion of the translation table. If rightspec is longer than leftspec, the extra characters are ignored. If leftspec is longer than rightspec and neither EACH nor MERGE has been specified, rightspec is padded with blanks to make it equal in length to leftspec. For example, the command, >ALTSEQ "SAW"="TG" gives S, A, and W the ordinal values T, G, and space. (See the discussion below for explanations of modspec with EACH and MERGE.) These assignments of new ordinal values are only for collating purposes. That is, the identity of the character is not lost; data is unchanged and appears in its original form in the output.

You must issue a >DATA command, specifying data type and a collating sequence type before you can use the >ALTSEQ command in any SORT/XL or MERGE/XL operation. The system displays the error message THE DATA COMMAND MUST BE ISSUED BEFORE THE ALTSEQ COMMAND CAN BE ISSUED, if the >ALTSEQ command is not preceded by a >DATA command.


NOTE: The operation of SORT/XL (or MERGE/XL) is slower when you define a collating sequence with the >ALTSEQ command than when a standard ASCII or EBCDIC collating sequence is used.

Using modspec With EACH


If EACH is specified, the modifications of the collating sequence are the same as explained above, except that if leftspec is longer than rightspec, rightspec is concatenated to itself a sufficient number of times to make it equal in length to leftspec. For example, the command >ALTSEQ EACH "ADW"="FG" gives A, D, and W the ordinal values obtained by taking the ASCII code decimal values of F, G, and F. Assuming the basic collating sequence has been specified as ASCII, this means A=70 appears in the sixth row, fifth column of the translation table, D=71 in the sixth row, eighth column, and W=70 in the eighth row, seventh column. Note that 70 and 71 are the ASCII code decimal values of the characters F and G, respectively. For additional information refer to the "EXAMPLES" section below.
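The difference between a modspec with EACH and one with neither EACH nor MERGE can be sketched as follows. This is an illustration under the rules stated in this section, not the SORT/XL implementation; the function name assign_ordinals is hypothetical, and leftspec and rightspec are assumed to be already expanded into plain character strings.

```python
from itertools import cycle, islice


def assign_ordinals(left: str, right: str, each: bool = False) -> dict[str, int]:
    """Return {character: new ordinal value} for one modspec without MERGE."""
    if len(right) < len(left):
        if each:
            # EACH: rightspec is concatenated to itself until it is long enough.
            right = "".join(islice(cycle(right), len(left)))
        else:
            # Neither EACH nor MERGE: rightspec is padded with blanks.
            right = right.ljust(len(left))
    # zip() stops at the end of leftspec, so extra rightspec characters are ignored.
    return {l: ord(r) for l, r in zip(left, right)}


print(assign_ordinals("ADW", "FG", each=True))  # A and W take the value of F, D that of G
print(assign_ordinals("SAW", "TG"))             # S takes T, A takes G, W takes space
```

Only the ordinal values used for comparison change; the characters in the data stay as they are.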

Using modspec With MERGE


When MERGE is specified in the modspec parameter, the values in the translation table assigned to the characters specified by leftspec and rightspec, and to the characters in between, are modified. Characters are selected alternately from leftspec and rightspec, and the translation table is modified so the characters collate in this order. The first character is always selected from leftspec. If leftspec precedes rightspec in the collating sequence, the sequence is modified so the characters between the two ranges collate after the merger of the ranges. If rightspec precedes leftspec, the characters between the two ranges collate before the first character of the first range. When either range is exhausted, the characters from the other range are simply appended until that range is also exhausted. Note that the strings specified by leftspec and rightspec must be strictly increasing and contiguous whenever MERGE is specified.

If you wish to do an alphabetic sorting in which each upper case letter collates ahead of the corresponding lower case letter, use the command >ALTSEQ MERGE "A-Z" WITH "a-z". The following six special characters follow the lower case z since the first range precedes the second range:

  [  \  ]  ^  _  and `  

If the modspec is MERGE "a-z" WITH "A-Z", the same six characters precede the lower case a. For additional information, refer to the "EXAMPLES" section below.

Consider this form of modspec as a shorthand for the modspec specifying EACH. For example, the command, >ALTSEQ MERGE "A-Z" WITH "a-z", is equivalent to the longer command >ALTSEQ "AaBb...Zz"= "AB...Zab...z", where ... represents all the necessary characters.
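The MERGE rules described above can be sketched in Python as a function that returns the resulting collating order. This is a reading of the rules in this section, not the SORT/XL implementation; the function name merge_sequence is hypothetical, the specs are assumed to be already expanded into character strings, and the sketch does not check that they are strictly increasing and contiguous.

```python
from itertools import zip_longest
from string import ascii_lowercase, ascii_uppercase


def merge_sequence(left: str, right: str) -> str:
    """Return the 128 ASCII characters in collating order after MERGE left WITH right."""
    # Take characters alternately, starting with leftspec; when one spec is
    # exhausted, the rest of the other is appended.
    merged = "".join(a + b for a, b in zip_longest(left, right, fillvalue=""))
    low = min(ord(left[0]), ord(right[0]))
    high = max(ord(left[-1]), ord(right[-1]))
    between = "".join(chr(c) for c in range(low, high + 1) if chr(c) not in merged)
    head = "".join(chr(c) for c in range(low))
    tail = "".join(chr(c) for c in range(high + 1, 128))
    if left[0] < right[0]:
        middle = merged + between   # leftspec first: in-between characters follow
    else:
        middle = between + merged   # rightspec first: in-between characters precede
    return head + middle + tail


upper_first = merge_sequence(ascii_uppercase, ascii_lowercase)
print(upper_first[64:70])    # @ followed by AaBbC
print(upper_first[115:124])  # Zz, then the six special characters, then {
```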

EXAMPLES


The following examples show how to use various parameters with the >ALTSEQ command, as well as the resulting collating sequences.

Standard ASCII Collating Sequence


To display the standard collating sequence, enter the >DATA IS ASCII, SEQUENCE IS ASCII and >SHOW SEQUENCE commands, as shown below. Refer to this display for comparison when the following examples use >ALTSEQ to change the collating sequence.

  :SORT

  HP32214A.01.00  SORT/3000 THU, JUN  4, 1987,  8:10 AM
  (C) HEWLETT-PACKARD CO. 1986

  >DATA IS ASCII, SEQUENCE IS ASCII
  >SHOW SEQUENCE

  nul soh stx etx eot enq ack bel  bs  ht  lf  vt  ff  cr  so  si
  dle dc1 dc2 dc3 dc4 nak syn etb can  em sub esc  fs  gs  rs  us
   sp   !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
    0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
    @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
    P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _
    `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
    p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~ del

Using the EACH Parameter


The following example shows how to use the >ALTSEQ command with the EACH parameter followed by a string specification:

  :SORT

  HP32214A.01.00  SORT/3000 THU, JUN  4, 1987,  8:10 AM
  (C) HEWLETT-PACKARD CO. 1986

  >DATA IS ASCII, SEQUENCE IS ASCII
  >ALTSEQ EACH "LMN"="ST"
  >SHOW SEQUENCE

  nul soh stx etx eot enq ack bel  bs  ht  lf  vt  ff  cr  so  si
  dle dc1 dc2 dc3 dc4 nak syn etb can  em sub esc  fs  gs  rs  us
   sp   !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
    0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
    @   A   B   C   D   E   F   G   H   I   J   K   O   P   Q   R
    L=  N=  S   M=  T   U   V   W   X   Y   Z   [   \   ]   ^   _
    `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
    p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~ del

The result of the modspec in the above example where EACH "LMN"="ST" is shown below:

     Original List                 Sorted Result

     TOKEN                         COST
     MOP                           COME
     COST                          SING
     COME                          NOSE
     TABLE                         LONESOME
     MISS                          SOLE
     SING                          TABLE
     NOSE                          MISS
     LONESOME                      TOKEN
     SOLE                          MOP

During the sort operation, L and N are equated to S, and M is equated to T.
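The effect of these equated values on key comparison can be sketched with a 256-entry translation table. This is an illustration only; the names make_table, TABLE, and collating_key are hypothetical, and the sketch compares whole records rather than a key field declared with a >KEY command.

```python
def make_table(assignments: dict[str, str]) -> bytes:
    """Build a 256-byte table in which each listed character takes another's ordinal."""
    table = bytearray(range(256))
    for character, target in assignments.items():
        table[ord(character)] = ord(target)
    return bytes(table)


# EACH "LMN"="ST": L takes the value of S, M of T, and N of S again.
TABLE = make_table({"L": "S", "M": "T", "N": "S"})


def collating_key(record: str) -> bytes:
    """Translate a record for comparison only; the record itself is unchanged."""
    return record.encode("ascii").translate(TABLE)


print(collating_key("COST") < collating_key("COME"))   # S collates before M, which equals T
print(collating_key("NOSE") == collating_key("SOLE"))  # N, S, and L compare as equal
print(sorted(["TOKEN", "MOP", "COST", "COME"], key=collating_key))
```

The sorted list contains the original strings, which mirrors the point that the data appears in its original form in the output.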

Using >ALTSEQ Without the EACH Parameter


The following example shows how to use the >ALTSEQ command without including the EACH parameter. You may use abbreviated forms for >ALTSEQ (>A), >SHOW SEQUENCE (>SH S), and SEQUENCE IS ASCII (SEQ A), if you wish.

  :SORT
  
  HP32214A.01.00  SORT/3000 THU, JUN  4, 1987,  8:15 AM
  (C) HEWLETT-PACKARD CO. 1986
  
  >DATA IS ASCII, SEQUENCE IS ASCII
  >ALTSEQ "ABC" = "X"
  >SHOW SEQUENCE
  
  nul soh stx etx eot enq ack bel  bs  ht  lf  vt  ff  cr  so  si
  dle dc1 dc2 dc3 dc4 nak syn etb can  em sub esc  fs  gs  rs  us
   sp=  B=  C   !   "   #   $   %   &   '   (   )   *   +   ,   -
    .   /   0   1   2   3   4   5   6   7   8   9   :   ;   <   =
    >   ?   @   D   E   F   G   H   I   J   K   L   M   N   O   P
    Q   R   S   T   U   V   W   A=  X   Y   Z   [   \   ]   ^   _
    `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
    p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~ del

The >ALTSEQ command pads X with two blank characters, making it equal to ABC in length. Note that the character sp (space) is equated to B and C, and the character A is equated to X. The table position identified by each character of the left string is replaced by the corresponding character of the right string until the string ABC is exhausted.

Numeric Byte Specification


The following example shows how to use the >ALTSEQ command for a numeric byte specification:

  :SORT

  HP32214A.01.00  SORT/3000 THU, JUN  4, 1987,  8:20 PM
  (C) HEWLETT-PACKARD CO. 1986

  >DATA IS ASCII, SEQUENCE IS ASCII
  >ALTSEQ 65=%141
  >SHOW SEQUENCE
  
  nul soh stx etx eot enq ack bel  bs  ht  lf  vt  ff  cr  so  si
  dle dc1 dc2 dc3 dc4 nak syn etb can  em sub esc  fs  gs  rs  us
   sp   !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
    0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
    @   B   C   D   E   F   G   H   I   J   K   L   M   N   O   P
    Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _   `
    A=  a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
    p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~ del

In this example, the upper case A (represented by the decimal value 65) is assigned the same ordinal value as the lower case a (represented by the octal value %141) in the final collating sequence.

Using a Range String Specification


The following example shows how to use the >ALTSEQ command for a range string specification:

  :SORT

  HP32214A.01.00  SORT/3000 THU, JUN  4, 1987,  8:25 AM
  (C) HEWLETT-PACKARD CO. 1986
  
  >ALTSEQ %101-%132="a-z"
  >SHOW SEQUENCE
  
  nul soh stx etx eot enq ack bel  bs  ht  lf  vt  ff  cr  so  si
  dle dc1 dc2 dc3 dc4 nak syn etb can  em sub esc  fs  gs  rs  us
   sp   !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
    0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
    @   [   \   ]   ^   _   `   A=  a   B=  b   C=  c   D=  d   E=
    e   F=  f   G=  g   H=  h   I=  i   J=  j   K=  k   L=  l   M=
    m   N=  n   O=  o   P=  p   Q=  q   R=  r   S=  s   T=  t   U=
    u   V=  v   W=  w   X=  x   Y=  y   Z=  z   {   |   }   ~ del

The left range in the above example is specified by two numeric byte specifications separated by a minus sign. Note that the same range can be represented by "A-Z" (characters), %101-%132 (octal representation), or 65-90 (decimal representation).

Collating Upper Case Before Lower Case


The following example shows how to use the >ALTSEQ command for collating upper case, then lower case characters. This is a commonly used alternative to the standard collating sequence.

  :SORT
  
  HP32214A.01.00  SORT/3000 THU, JUN  4, 1987,  8:30 AM
  (C) HEWLETT-PACKARD CO. 1986
  
  >ALTSEQ MERGE "A-Z" WITH "a-z"
  >SHOW SEQUENCE
  
  nul soh stx etx eot enq ack bel  bs  ht  lf  vt  ff  cr  so  si
  dle dc1 dc2 dc3 dc4 nak syn etb can  em sub esc  fs  gs  rs  us
   sp   !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
    0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
    @   A   a   B   b   C   c   D   d   E   e   F   f   G   g   H
    h   I   i   J   j   K   k   L   l   M   m   N   n   O   o   P
    p   Q   q   R   r   S   s   T   t   U   u   V   v   W   w   X
    x   Y   y   Z   z   [   \   ]   ^   _   `   {   |   }   ~ del

The six characters [, \, ], ^, _, and ` follow the lower case z. The result of MERGE "A-Z" WITH "a-z" is as follows:

        Original        Sorted List        Sorted List
          List         Without MERGE       Using MERGE

        CAN                 AXE                AXE
        shovel              BROOM              BROOM
        MAN                 CAN                boy
        BROOM               DOG                CAN
        TABLE               MAN                DOG
        AXE                 TABLE              drawer
        drawer              boy                MAN
        boy                 drawer             shovel 
        DOG                 shovel             TABLE
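The two sorted columns above can be reproduced with a small Python sketch. It is an illustration only: the names ORDER, RANK, and merge_key are hypothetical, and the key function handles letters only, which is enough for this list.

```python
from string import ascii_lowercase, ascii_uppercase

# Collating order produced by MERGE "A-Z" WITH "a-z", letters only: AaBb...Zz.
ORDER = "".join(u + l for u, l in zip(ascii_uppercase, ascii_lowercase))
RANK = {character: position for position, character in enumerate(ORDER)}


def merge_key(word: str) -> list[int]:
    """Map each letter to its position in the merged collating order."""
    return [RANK[character] for character in word]


words = ["CAN", "shovel", "MAN", "BROOM", "TABLE", "AXE", "drawer", "boy", "DOG"]

without_merge = sorted(words)              # standard ASCII order
using_merge = sorted(words, key=merge_key)

print(without_merge)
print(using_merge)
```

Swapping the two alphabets when ORDER is built gives the order for MERGE "a-z" = "A-Z" instead.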

Collating Lower Case Before Upper Case


The following shows how to use the >ALTSEQ command to collate lower case alphabetic characters, and have each followed by its corresponding upper case character:

  :SORT
  
  HP32214A.01.00  SORT/3000 THU, JUN  4, 1987,  8:35 AM
  (C) HEWLETT-PACKARD CO. 1986
  
  >ALTSEQ MERGE "a-z" = "A-Z"
  >SHOW SEQUENCE
  
  nul soh stx etx eot enq ack bel  bs  ht  lf  vt  ff  cr  so  si
  dle dc1 dc2 dc3 dc4 nak syn etb can  em sub esc  fs  gs  rs  us
   sp   !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
    0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
    @   [   \   ]   ^   _   `   a   A   b   B   c   C   d   D   e
    E   f   F   g   G   h   H   i   I   j   J   k   K   l   L   m
    M   n   N   o   O   p   P   q   Q   r   R   s   S   t   T   u
    U   v   V   w   W   x   X   y   Y   z   Z   {   |   }   ~ del

The six characters [, \, ], ^, _, and ` precede the lower case a.

The result of MERGE "a-z" = "A-Z" is as follows:

        Original        Sorted List        Sorted List
          List         Without MERGE       Using MERGE

        CAN                AXE                 AXE
        shovel             BROOM               boy
        MAN                CAN                 BROOM
        BROOM              DOG                 CAN
        TABLE              MAN                 drawer
        AXE                TABLE               DOG
        drawer             boy                 MAN
        boy                drawer              shovel
        DOG                shovel              TABLE

Merging Unequal Strings


The following example shows how to use the >ALTSEQ command to merge unequal strings:

  :SORT
  
  HP32214A.01.00  SORT/3000 THU, JUN  4, 1987,  8:40 AM
  (C) HEWLETT-PACKARD CO. 1986
  
  >ALTSEQ MERGE "ABCD" WITH "ab"
  >SHOW SEQUENCE
  
  nul soh stx etx eot enq ack bel  bs  ht  lf  vt  ff  cr  so  si
  dle dc1 dc2 dc3 dc4 nak syn etb can  em sub esc  fs  gs  rs  us
   sp   !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
    0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
    @   A   a   B   b   C   D   E   F   G   H   I   J   K   L   M
    N   O   P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]
    ^   _   `   c   d   e   f   g   h   i   j   k   l   m   n   o
    p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~ del

The collating sequence appears as AaBbCDE...Z. The merging of the strings continues until the right string ab is exhausted; the remaining characters of the left string are then appended.

ADDITIONAL DISCUSSION


Refer to the >DATA and >SHOW commands in this chapter.



