Character data in the DBEnvironment can be represented in the native
language that is specified as the DBEnvironment language.
When native language character columns are created, they follow
the same rules as CHAR and VARCHAR columns.
For character columns, size is defined in bytes, not characters. Thus a
column defined as CHAR (20) can hold 20 one-byte ASCII characters or
10 two-byte Japanese Kanji characters.
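
The byte arithmetic can be checked with a short sketch. It assumes
Python's shift_jis codec as a stand-in for the character set of a
Japanese DBEnvironment (an assumption for illustration only); the
helper name fits is hypothetical.

```python
ENCODING = "shift_jis"  # assumption: stand-in for the DBEnvironment character set

def fits(text: str, column_bytes: int, encoding: str = ENCODING) -> bool:
    """Return True if the encoded text fits in a column of column_bytes bytes."""
    return len(text.encode(encoding)) <= column_bytes

ascii_text = "A" * 20        # 20 one-byte characters
kanji_text = "\u6f22" * 10   # 10 two-byte Kanji characters

print(len(ascii_text.encode(ENCODING)))  # 20 bytes
print(len(kanji_text.encode(ENCODING)))  # 20 bytes
print(fits("\u6f22" * 11, 20))           # 11 Kanji need 22 bytes, so False
```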
Numeric data must be in ASCII representation.
Pattern matching is performed in terms of conceptual characters rather
than bytes. This is necessary for languages in which one-byte and
two-byte characters are frequently mixed in the same string.
An example is Japanese,
in which the Kanji and Hiragana characters occupy 16 bits each,
whereas the Katakana characters use only 8 bits.
Conceptual character matching is also necessary to establish a
collating sequence that includes the one-byte ASCII character
set as a subset of a two-byte character set such as Chinese.
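
The following sketch suggests why byte-oriented matching is not
sufficient. It assumes Python's shift_jis codec as a stand-in for the
DBEnvironment character set; the function names are hypothetical. In
that encoding the second byte of the Kanji character used here has the
same value as the single byte of the Katakana character, so a search
over raw bytes reports a match that does not exist at the character
level.

```python
ENCODING = "shift_jis"  # assumption: stand-in for the DBEnvironment character set

def byte_match(needle: str, haystack: str) -> bool:
    """Naive search over raw bytes; may match inside a two-byte character."""
    return needle.encode(ENCODING) in haystack.encode(ENCODING)

def char_match(needle: str, haystack: str) -> bool:
    """Search over conceptual characters."""
    return needle in haystack

kanji = "\u6f22"     # a two-byte Kanji character
katakana = "\uff7f"  # a one-byte (half-width) Katakana character

print(byte_match(katakana, kanji))  # byte search finds a false match
print(char_match(katakana, kanji))  # character search does not
```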
Truncation is done on a character basis.
For example, imagine a column defined as CHAR (20). If you try to
insert a string of 11 Kanji characters (22 bytes) into the column,
the last character is truncated, leaving 10 characters (20 bytes).
If a string contains both Kanji and Katakana characters and is
21 bytes long, the truncation depends on the size of the last
character. If it is a 2-byte Kanji character, the data is truncated
to 19 bytes; if it is a 1-byte Katakana character, the data is
truncated to 20 bytes.
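
The truncation rule can be sketched as follows. This is an
illustration only, not the system's implementation: it assumes
Python's shift_jis codec as a stand-in for the DBEnvironment character
set, and the function name truncate_to_column is hypothetical.

```python
ENCODING = "shift_jis"  # assumption: stand-in for the DBEnvironment character set

def truncate_to_column(text: str, column_bytes: int) -> bytes:
    """Keep whole characters only, never exceeding column_bytes bytes."""
    out = bytearray()
    for ch in text:
        encoded = ch.encode(ENCODING)
        if len(out) + len(encoded) > column_bytes:
            break  # the next character would not fit, so stop here
        out += encoded
    return bytes(out)

KANJI = "\u6f22"     # two-byte character
KATAKANA = "\uff71"  # one-byte (half-width) character

all_kanji = KANJI * 11                    # 22 bytes
ends_in_kanji = KANJI * 9 + KATAKANA + KANJI   # 21 bytes, last character 2 bytes
ends_in_katakana = KANJI * 10 + KATAKANA       # 21 bytes, last character 1 byte

print(len(truncate_to_column(all_kanji, 20)))         # 20
print(len(truncate_to_column(ends_in_kanji, 20)))     # 19
print(len(truncate_to_column(ends_in_katakana, 20)))  # 20
```

Truncating at a fixed byte offset instead would split the final
two-byte character in the second case and leave an invalid trailing
byte in the column.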
An implicit type conversion occurs when a NATIVE-3000 string
is compared to a native language CHAR or VARCHAR type. The
shorter string is padded with ASCII blanks before the
comparison is done.
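
The effect of blank padding on a comparison can be sketched as shown
here. The function name padded_equal is hypothetical, and the sketch
covers only the padding step, not the type conversion itself.

```python
def padded_equal(a: bytes, b: bytes) -> bool:
    """Pad the shorter operand with ASCII blanks, then compare."""
    width = max(len(a), len(b))
    return a.ljust(width, b" ") == b.ljust(width, b" ")

print(padded_equal(b"abc", b"abc   "))  # trailing blanks do not matter
print(padded_equal(b"abc", b"abd"))     # different content still differs
```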
When a case-insensitive ASCII expression is compared to a
case-insensitive NLS expression, the two expressions are compared
using the NLS collation rules. The case-insensitive NLS comparison is
done by using the NLSCANMOVE and NLCOLLATE intrinsics. Uppercase and
lowercase forms of the same ASCII character are equivalent. Uppercase
and lowercase forms of the same accented character (extended
character) are also equivalent. However, an accented character may not
be the same as its ASCII equivalent, depending on the specific
language collation table.
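
As a rough illustration of these equivalences, the sketch below uses
Python's Unicode case folding. This is a stand-in only: it does not
call NLSCANMOVE or NLCOLLATE and does not consult any language
collation table, so it always treats an accented character as
different from its unaccented ASCII counterpart. The function name
ci_equal is hypothetical.

```python
def ci_equal(a: str, b: str) -> bool:
    """Case-insensitive equality using Unicode case folding (an assumption)."""
    return a.casefold() == b.casefold()

print(ci_equal("a", "A"))            # same ASCII letter, different case
print(ci_equal("\u00e9", "\u00c9"))  # same accented letter, different case
print(ci_equal("\u00e9", "e"))       # accented letter versus ASCII letter
```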