
Collating Sequences [ MPE XL Native Language Programmer's Guide ] MPE/iX 5.0 Documentation


MPE XL Native Language Programmer's Guide

Appendix B  Collating Sequences 

Collating is defined as arranging character strings into order (usually
alphabetic).  To do this, a mechanism must be available that, given two
character strings, decides which one comes first.  In Native Language
Support (NLS) this mechanism is the NLCOLLATE intrinsic.


NOTE  This appendix deals with collating or lexical ordering and does
      not include matching.  For matching purposes, there is generally
      a difference between A and a.

Look at the full ROMAN8 character set and consider that all these
characters can appear in every European language.  Even if a character
does not exist in a language, it can still show up in names and
addresses.  It is quite useful to address a letter to Spain correctly,
even if it originates in Germany.  Therefore, the full ROMAN8 character
set is considered to be used in all languages, and a collating sequence
has been defined for every ROMAN8 character in each language that NLS
supports.

Table B-2 lists the collating sequence for American-English,
Canadian-French, Danish, Dutch, English, Finnish, French, German,
Italian, Norwegian, Portuguese, Spanish, and Swedish.

All characters in an alpha or numeric group collate the same.  These
characters usually differ only in uppercase versus lowercase priority,
or accent priority.  (Refer to Table B-2 for collating sequences.)  In
sorting, characters in the same group are initially considered equal.
If this first comparison does not determine which string comes first,
the priorities of the characters are used to determine the order.
Refer to Table B-1 for examples of collating sequence priority.

Table B-1.  Collating Sequence Priority

---------------------------------------------------------------------------
| Example          | Priority Explanation                                 |
| Sorted Strings   |                                                      |
---------------------------------------------------------------------------
| aEb, aEc         | The third character in each string is different.     |
|                  | The "b" precedes the "c".                            |
---------------------------------------------------------------------------
| aeb, aEb         | The characters in the two strings are identical, so  |
|                  | accent priority determines the order.  The "e"       |
|                  | precedes the "E".                                    |
---------------------------------------------------------------------------
| abc, Abd         | The last characters in the strings are different.    |
|                  | The "c" precedes the "d".                            |
---------------------------------------------------------------------------
| aBc, abc         | The characters in the two strings are the same, so   |
|                  | the uppercase priority determines the order.  The    |
|                  | "B" precedes the "b".                                |
---------------------------------------------------------------------------

Table B-2 displays the collating sequence in three ways:

   *   The graphic representation of the character.
   *   The decimal equivalent of the character's binary value.
   *   A description of the character.

Table B-2.  Collating Sequence
[Table B-2 image not reproduced]
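
The two-level comparison described above (group first, then priority)
can be sketched in Python.  This is a minimal illustration that assumes
plain ASCII letters only.  The function name collation_key and the
numeric group and priority values are hypothetical; they are not taken
from Table B-2 and do not represent how the NLCOLLATE intrinsic is
implemented.

```python
def collation_key(s):
    """Build a two-level sort key for an ASCII string (illustrative only).

    Level 1 (groups): upper- and lowercase forms of a letter share one
    group, so they compare as equal on the first pass.
    Level 2 (priorities): used only when level 1 is a tie; uppercase is
    given priority over lowercase, as in the "aBc, abc" example.
    """
    groups = tuple(ord(ch.lower()) for ch in s)
    priorities = tuple(0 if ch.isupper() else 1 for ch in s)
    # Tuples compare element by element, so priorities are consulted
    # only if the groups of the two strings are identical.
    return (groups, priorities)


# The last characters differ, so case plays no role: "c" precedes "d".
print(sorted(["Abd", "abc"], key=collation_key))   # ['abc', 'Abd']

# Groups are identical, so uppercase priority decides: "B" precedes "b".
print(sorted(["abc", "aBc"], key=collation_key))   # ['aBc', 'abc']
```

Keeping the priorities in a separate second tuple, rather than mixing
them into a single per-character weight, is what prevents a case
difference early in a string from overriding a letter difference later
in it.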

NOTE  The Æ (uppercase AE ligature) and æ (lowercase ae ligature) are
      expanded for collating purposes to AE or ae and collate as:

         ad  AE  Ae  aE  ae  AF

      The ß (sharp s) is expanded for collating purposes to ss and
      collates according to the German standard as:

         sr  ss  st

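
The expansion described in the note can be sketched in Python as a
preprocessing step applied before the sort key is built.  This is a
minimal sketch under stated assumptions: the names EXPANSIONS, expand,
and expanded_key are hypothetical, only the three characters named in
the note are handled, and the ordering of an expanded character relative
to the same two letters typed out in full is not modeled.

```python
# Hypothetical expansion table covering only the characters in the note.
EXPANSIONS = {
    "\u00c6": "AE",   # uppercase AE ligature
    "\u00e6": "ae",   # lowercase ae ligature
    "\u00df": "ss",   # sharp s
}


def expand(s):
    """Replace each expandable character with its two-letter form."""
    return "".join(EXPANSIONS.get(ch, ch) for ch in s)


def expanded_key(s):
    """Two-level key (letter group, then case priority) on the expanded text."""
    e = expand(s)
    groups = tuple(ord(ch.lower()) for ch in e)
    priorities = tuple(0 if ch.isupper() else 1 for ch in e)
    return (groups, priorities)


# The sharp s sorts where "ss" would: between "sr" and "st".
german = sorted(["st", "\u00df", "sr"], key=expanded_key)

# The ae ligature sorts where "ae" would: between "ad" and "AF".
ligature = sorted(["AF", "\u00e6", "ad"], key=expanded_key)
```

With this key, the plain strings ad, AE, Ae, aE, ae, AF also sort in the
order listed in the note.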
Table B-3 through Table B-6 show the language-dependent variations to the collating sequence.

