Oracle® Database PL/SQL Packages and Types Reference 10g Release 2 (10.2) Part Number B14258-02 |
|
|
PDF · Mobi · ePub |
UTL_I18N
is a set of services that provides additional globalization functionality for applications written in PL/SQL.
The chapter contains the following topics:
Overview
Constants
This section contains topics which relate to using the UTL_I18N
package.
The UTL_I18N
PL/SQL package consists of the following categories of services:
String conversion functions for various datatypes.
Functions that convert a text string to character references and vice versa.
Functions that map between Oracle, Java, and ISO languages and territories.
Functions that map between Oracle, Internet Assigned Numbers Authority (IANA), and e-mail safe character sets.
A function that returns the Oracle character set name from an Oracle language name.
A function that performs script transliteration.
Functions that return the ISO currency code, local time zones, and local languages supported for a given territory.
Functions that return the most appropriate linguistic sort, a listing of all the applicable linguistic sorts, and the local territories supported for a given language.
Functions that map between the Oracle full and short language names.
A function that returns the language translation of a given language and territory name.
A function that returns a listing of the most commonly used time zones.
The UTL_I18N
package uses the constants shown in Table 170-1.
Table 170-1 UTL_I18N Constants
Constant | Type | Value | Description |
---|---|---|---|
|
|
|
Returns the default character set for general cases. |
|
|
|
Map from an Oracle character set name to an email safe character set name on a non-Windows platform. |
|
|
|
Map from an Oracle character set name to an IANA character set name. |
|
|
|
Used with |
|
|
|
Map from an IANA character set name to an Oracle character set name. |
|
|
|
The mapping is between an Oracle character set name and an email safe character set name. |
|
|
|
Map from an Oracle character set name to an email safe character set name on a Windows platform. |
|
|
|
|
|
|
|
Converts only fullwidth Katakana characters to fullwidth Hiragana characters. |
|
|
|
Converts only fullwidth Katakana characters to halfwidth Katakana characters. |
|
|
|
Converts only fullwidth Hiragana characters to fullwidth Katakana characters. |
|
|
|
Converts only fullwidth Hiragana characters to halfwidth Katakana characters. |
|
|
|
Converts only halfwidth Katakana characters to fullwidth Katakana characters. |
|
|
|
Converts only halfwidth Katakana characters to fullwidth Hiragana characters. |
|
|
|
Converts any type of Kana character to a fullwidth Katakana character. |
|
|
|
Converts any type of Kana character to a fullwidth Hiragana character. |
|
|
|
Converts any type of Kana character to a halfwidth Katakana character. |
Table 170-2 UTL_I18N Package Subprograms
Procedure | Description |
---|---|
Converts a given text string to its character reference counterparts, for characters that fall outside the document character set. |
|
Returns the list of common time zone IDs that are independent of the locales. |
|
Returns the default Oracle character set name or the default e-mail safe character set name from an Oracle language name. |
|
Returns the default ISO 4217 currency code for the specified territory. |
|
Returns the default linguistic sort name for the specified language. |
|
Returns the local language names for the specified territory. |
|
Returns the local linguistic sort names for the specified language. |
|
Returns the local territory names for the specified language. |
|
Returns the local time zone IDs for the specified territory. |
|
Returns the translation of the language and territory name in the specified translation language. |
|
|
|
Maps an Oracle short language name to an Oracle language name. |
|
Returns an Oracle language name from an ISO locale name. |
|
Returns an ISO locale name from the Oracle language and territory name. |
|
Returns an Oracle territory name from an ISO locale name. |
|
Maps an Oracle language name to an Oracle short language name. |
|
Converts |
|
Converts |
|
Converts a |
|
Transliterates between Japanese hiragana and katakana. |
|
Converts an input string that contains character references to a text string. |
This function converts a text string to its character reference counterparts for characters that fall outside the character set used by the current document. Character references are mainly used in HTML and XML documents to represent characters independently of the encoding of the document.Character references may appear in two forms, numeric character references and character entity references. Numeric character references specify the Unicode code point value of a character, while character entity references use symbolic names to refer to the same character. For example, å
is the numeric character reference for the small letter "a" with a ring above, whereas å
is the character entity reference for the same character. Character entity references are also used to escape special characters, as an example, <
represents the < (less than) sign. This is to avoid possible confusion with the beginning of a tag in Markup languages.
UTL_I18N.ESCAPE_REFERENCE( str IN VARCHAR2 CHARACTER SET ANY_CS, page_cs_name IN VARCHAR2 DEFAULT NULL) RETURN VARCHAR2 CHARACTER SET str%CHARSET;
Table 170-3 ESCAPE_REFERENCE Function Parameters
Parameter | Description |
---|---|
|
Specifies the input string |
|
Specifies the character set of the document. If |
If the user specifies an invalid character set or a NULL
string, then the function returns a NULL
string.
UTL_I18N.ESCAPE_REFERENCE('hello < '||chr(229),'us7ascii')
This returns 'hello < å'
.
This function returns a listing of the most commonly used time zones. This list contains a subset of the time zones that are supported in the database.
UTL_I18N.GET_COMMON_TIME_ZONES RETURN STRING_ARRAY;
Returns the list of the most commonly used time zones.
DECLARE retval UTL_I18N.STRING_ARRAY; BEGIN retval := UTL_I18N.GET_COMMON_TIME_ZONES; END; /
This function returns the default Oracle character set name or the default e-mail safe character set name from an Oracle language name.
See Also:
"MAP_CHARSET Function" for an explanation of an e-mail safe character setUTL_I18N.GET_DEFAULT_CHARSET( language IN VARCHAR2, context IN PLS_INTEGER DEFAULT GENERIC_CONTEXT, iswindows IN BOOLEAN DEFAULT FALSE) RETURN VARCHAR2;
Table 170-4 GET_DEFAULT_CHARSET Function Parameters
Parameter | Description |
---|---|
|
Specifies a valid Oracle language |
|
|
|
If
|
If the user specifies an invalid language name or an invalid flag, then the function returns a NULL
string.
GENERIC_CONTEXT, iswindows=FALSE
UTL_I18N.GET_DEFAULT_CHARSET('French', UTL_I18N.GENERIC_CONTEXT, FALSE)
This returns 'WE8ISO8859P1'
.
UTL_I18N.GET_DEFAULT_CHARSET('French', UTL_I18N.MAIL_CONTEXT, TRUE)
This returns 'WE8MSWIN1252
'.
UTL_I18N.GET_DEFAULT_CHARSET('French', UTL_I18N.MAIL_CONTEXT, FALSE)
This returns 'WE8ISO8859P1
'.
This function returns the default ISO 4217 currency code for the specified territory.
UTL_I18N.GET_DEFAULT_ISO_CURRENCY ( territory IN VARCHAR2 CHARACTER SET ANY_CS) RETURN VARCHAR2;
Table 170-5 GET_DEFAULT_ISO_CURRENCY Function Parameters
Parameter | Description |
---|---|
|
Specifies a valid Oracle territory. It is case-insensitive. |
If the user specifies an invalid territory name, then the function returns a NULL
string.
Displays the default ISO currency code for China.
DECLARE retval VARCHAR2(50); BEGIN retval := UTL_I18N.GET_DEFAULT_ISO_CURRENCY('CHINA'); DBMS_OUTPUT.PUT_LINE(retval); END; /
This function returns the most commonly used Oracle linguistic sort for the specified language.
UTL_I18N.GET_DEFAULT_LINGUISTIC_SORT ( language IN VARCHAR2 CHARACTER SET ANY_CS) RETURN VARCHAR2;
Table 170-6 GET_DEFAULT_LINGUISTIC_SORT Function Parameters
Parameter | Description |
---|---|
|
Specifies a valid Oracle language. It is case-insensitive. |
If the user specifies an invalid language name, then the function returns a NULL
string.
Displays the name of the most appropriate linguistic sort name for the language used in the current SQL session.
DECLARE retval VARCHAR2(50); BEGIN SELECT value INTO retval FROM nls_session_parameters WHERE parameter = 'NLS_LANGUAGE'; retval := UTL_I18N.GET_DEFAULT_LINGUISTIC_SORT(retval); DBMS_OUTPUT.PUT_LINE(retval); END; /
This function returns the local language names for the specified territory.
UTL_I18N.GET_LOCAL_LANGUAGES ( territory IN VARCHAR2 CHARACTER SET ANY_CS) RETURN STRING_ARRAY;
Table 170-7 GET_LOCAL_LANGUAGES Function Parameters
Parameter | Description |
---|---|
|
Specifies a valid Oracle territory. It is case-insensitive. |
If the user specifies an invalid territory name, then the function returns a NULL
string.
Returns the list of local languages used in Belgium.
DECLARE retval UTL_I18N.STRING_ARRAY; cnt INTEGER; BEGIN retval := UTL_I18N.GET_LOCAL_LANGUAGES('BELGIUM'); DBMS_OUTPUT.PUT('Count = '); DBMS_OUTPUT.PUT_LINE(retval.LAST); cnt := retval.FIRST; WHILE cnt IS NOT NULL LOOP DBMS_OUTPUT.PUT_LINE(retval(cnt)); cnt := retval.NEXT(cnt); END LOOP; END; / ... Count = 2 DUTCH FRENCH
This function returns a list of the Oracle linguistic sort names that are appropriate for the specified language. A BINARY
sort is included for all languages.
UTL_I18N.GET_LOCAL_LINGUISTIC_SORTS ( language IN VARCHAR2 CHARACTER SET ANY_CS) RETURN STRING_ARRAY;
Table 170-8 GET_LOCAL_LINGUISTIC_SORTS Function Parameters
Parameter | Description |
---|---|
|
Specifies a valid Oracle language. It is case-insensitive. |
If the user specifies an invalid language name, then the function returns a NULL
string.
Displays the local linguistic sort names for JAPANESE
.
DECLARE retval UTL_I18N.STRING_ARRAY; cnt INTEGER; BEGIN retval := UTL_I18N.GET_LOCAL_LINGUISTIC_SORTS('Japanese'); DBMS_OUTPUT.PUT('Count = '); DBMS_OUTPUT.PUT_LINE(retval.COUNT); cnt := retval.FIRST; WHILE cnt IS NOT NULL LOOP DBMS_OUTPUT.PUT_LINE(retval(cnt)); cnt := retval.NEXT(cnt); END LOOP; END; / ... Count = 2 JAPANESE_M BINARY
This function returns the local territory names for the specified language.
UTL_I18N.GET_LOCAL_TERRITORIES ( language IN VARCHAR2 CHARACTER SET ANY_CS) RETURN STRING_ARRAY;
Table 170-9 GET_LOCAL_TERRITORIES Function Parameters
Parameter | Description |
---|---|
|
Specifies a valid Oracle language. It is case-insensitive. |
If the user specifies an invalid language name, then the function returns a NULL
string.
Returns the list of Oracle territories that use German as one of their local languages.
DECLARE retval UTL_I18N.STRING_ARRAY; cnt INTEGER; BEGIN retval := UTL_I18N.GET_LCOAL_TERRITORIIES('GERMAN'); DBMS_OUTPUT.PUT('Count = '); DBMS_OUTPUT.PUT_LINE(retval.LAST); cnt := retval.FIRST; WHILE cnt IS NOT NULL LOOP DBMS_OUTPUT.PUT_LINE(retval(cnt)); cnt := retval.NEXT(cnt)); END LOOP; END; / ... Count = 4 GERMANY AUSTRIA LUXEMBOURG SWITZERLAND
This function returns the local time zone IDs for the specified territory.
UTL_I18N.GET_LOCAL_TIME_ZONES ( territory IN VARCHAR2 CHARACTER SET ANY_CS DEFAULT NULL) RETURN STRING_ARRAY;
Table 170-10 GET_LOCAL_TIME_ZONES Function Parameters
Parameter | Description |
---|---|
|
Specifies a valid Oracle territory. It is case-insensitive. |
If the user specifies an invalid territory name, then the function returns a NULL
string.
Creates a function that returns the list of time zones locally used in the territory AZERBAIJAN
followed by the general common time zones. This is useful for when the user's territory is known and the application still allows the user to choose other time zones as a user's preference.
CREATE OR REPLACE FUNCTION get_time_zones (territory IN VARCHAR2 CHARACTER SET ANY_CS) RETURN utl_i18n.string_array IS retval utl_i18n.string_array; retval2 utl_i18n.string_array; stpos INTEGER; BEGIN retval := utl_i18n.get_local_time_zones( territory); retval2 := utl_i18n.get_common_time_zones; stpos := retval.LAST + 1; retval(stpos) := '-----'; -- a separator FOR i IN retval2.FIRST..retval2.LAST LOOP stpos := stpos + 1; retval(stpos) := retval2(i); END LOOP; RETURN retval; END; /
Returns the list of local time zones for AZERBAIJAN
followed by the common time zones with a separator string of five dashes (-----).
DECLARE retval UTL_I18N.STRING_ARRAY; cnt INTEGER; BEGIN DBMS_OUTPUT.ENABLE(100000); retval UTL_I18N.GET_TIME_ZONES('AZERBAIJAN'); cnt := retval.FIRST; WHILE cnt IS NOT NULL LOOP DBMS_OUTPUT.PUT_LINE(retval(cnt)); cnt := retval.NEXT(cnt); END LOOP; END; / Asia/Baku ----- Pacific/Pago_Pago Pacific/Honolulu America/Anchorage America/Vancouver America/Los_Angeles America/Tijuana America/Edmonton America/Denver America/Phoenix America/Mazatlan America/Winnipeg America/Regina America/Chicago America/Mexico_City America/Guatemala America/El_Salvador America/Managua America/Costa_Rica America/Montreal ...
This function returns the translation of the language and territory name in the specified translation language.
UTL_I18N.GET_TRANSLATION ( parameter IN VARCHAR2 CHARCTER SET ANY_CS, trans_language IN VARCHAR2 'AMERICAN', flag IN PLS_INTEGER DEFAULT LANGUAGE_TRANS) RETURN VARCHAR2 CHARACTER SET parameter%CHARSET;
Table 170-11 GET_TRANSLATION Function Parameters
Parameter | Description |
---|---|
|
Specifies a valid language name, territory name, or a combined string in the form of |
|
Specifies a translation language name. For example, |
|
Specifies the translation type:
The default translation type is |
If VARCHAR2
is used as a parameter type, the returned translation text can be corrupted due to the conversion to the database character set. Using NVARCHAR2
as the parameter type will preserve the translation text because Unicode can encode all translated languages.
If the specified translation language is not available or an invalid name is provided, the default "American English" translations are returned. For example, Oracle does not provide GUJARATI
translations, so the returned translation would be in American English.
The following returns the names of all the Oracle-supported languages in Italian.
DECLARE CURSOR c1 IS SELECT value FROM V$NLS_VALID_VALUES WHERE parameter = 'LANGUAGE' ORDER BY value; retval NVARCHAR2(100); BEGIN FOR item IN c1 LOOP retval := UTL_I18N.GET_TRANSLATION (TO_NCHAR(item.value), 'italian'); END LOOP; END;
This function maps the following:
An Oracle character set name to an IANA character set name.
An IANA character set name to an Oracle character set name.
An Oracle character set to an e-mail safe character set name.
UTL_I18N.MAP_CHARSET( charset IN VARCHAR2, context IN PLS_INTEGER DEFAULT GENERIC_CONTEXT, flag IN PLS_INTEGER DEFAULT ORACLE_TO_IANA) RETURN VARCHAR2;
Table 170-12 MAP_CHARSET Function Parameters
Parameter | Description |
---|---|
|
Specifies the character set name to be mapped. The mapping is case-insensitive. |
|
|
|
|
An e-mail safe character set is an Oracle character set that is commonly used by applications when they submit e-mail messages. The character set is usually used to convert contents in the database character set to e-mail safe contents. To specify the character set name in the mail header, you should use the corresponding IANA character set name obtained by calling the MAP_CHARSET
function with the ORACLE_TO_IANA
option, providing the e-mail safe character set name as input.
For example, no e-mail client recognizes message contents in the WE8DEC
character set, whose corresponding IANA name is DEC-MCS
. If WE8DEC
is passed to the MAP_CHARSET
function with the MAIL_CONTEXT
option, then the function returns WE8ISO8859P1
. Its corresponding IANA name, ISO-8859-1
, is recognized by most e-mail clients.
The steps in this example are as follows:
Call the MAP_CHARSET
function with the MAIL_CONTEXT | MAIL_GENERIC
option with the database character set name, WE8DEC
. The result is WE8ISO8859P1
.
Convert the contents stored in the database to WE8ISO8859P1
.
Call the MAP_CHARSET
function with the ORACLE_TO_IANA | GENERIC_CONTEXT
option with the e-mail safe character set, WE8ISO8859P1
. The result is ISO-8859-1
.
Specify ISO-8859-1
in the mail header when the e-mail message is submitted.
The function returns a character set name if a match is found. If no match is found or if the flag is invalid, then it returns NULL
.
Note:
Many Oracle character sets can map to one e-mail safe character set. There is no function that maps an e-mail safe character set to an Oracle character set name.UTL_I18N.MAP_CHARSET('iso-8859-1',UTL_I18N.GENERIC_CONTEXT,UTL_I18N.IANA_TO_ORACLE)
This returns 'WE8ISO8859P1'
.
UTL_I18N.MAP_CHARSET('WE8DEC', utl_i18n.mail_context, utl_i18n.mail_generic)
This returns 'WE8ISO8859P1'
.
See Also:
Oracle Database Globalization Support Guide for a list of valid Oracle character setsThis function maps an Oracle short language name to an Oracle language name.
UTL_I18N.MAP_FROM_SHORT_LANGUAGE ( language IN VARCHAR2 CHARACTER SET ANY_CS) RETURN VARCHAR2;
Table 170-13 MAP_FROM_SHORT_LANGUAGE Function Parameters
Parameter | Description |
---|---|
|
Specifies a valid short language name. It is case-insensitive. |
If the user specifies an invalid language name, then the function returns a NULL
string.
Returns the default linguistic sort name for the customer with the ID of 9000. Note that the table customers
is from the oe
user in the Common Schema. Because the customer's language preference is stored using a short language name, you need to convert to a full language name by calling the GET_DEFAULT_LINGUISTIC_SORT
procedure.
DECLARE short_n VARCHAR2(10); ling_n VARCHAR2(50); BEGIN SELECT nls_language INTO short FROM customers WHERE customer_id = 9000; ling_n := UTL_I18N.GET_DEFAULT_LINGUISTIC_SORT ( UTL_I18N.MAP_FROM_SHORT_LANGUAGE(short_n)); DBMS_OUTPUT.PUT_LINE(ling_n); END; /
This function returns an Oracle language name from an ISO locale name.
UTL_I18N.MAP_LANGUAGE_FROM_ISO( isolocale IN VARCHAR2) RETURN VARCHAR2;
Table 170-14 MAP_LANGUAGE_FROM_ISO Function Parameters
Parameter | Description |
---|---|
|
Specifies the ISO locale. The mapping is case-insensitive. |
If the user specifies an invalid locale string, then the function returns a NULL
string.
If the user specifies a locale string that includes only the language (for example, en_
instead of en_US
), then the function returns the default language name for the specified language (for example, American
).
UTL_I18N.MAP_LANGUAGE_FROM_ISO('en_US')
This returns 'American'
.
See Also:
Oracle Database Globalization Support Guide for a list of valid Oracle languagesThis function returns an ISO locale name from an Oracle language name and an Oracle territory name. A valid string must include at least one of the following: a valid Oracle language name or a valid Oracle territory name.
UTL_I18N.MAP_LOCALE_TO_ISO ( ora_language IN VARCHAR2, ora_territory IN VARCHAR2) RETURN VARCHAR2;
Table 170-15 MAP_LOCALE_TO_ISO Function Parameters
Parameter | Description |
---|---|
|
Specifies an Oracle language name. It is case-insensitive. |
ora_territory |
Specifies an Oracle territory name. It is case-insensitive. |
If the user specifies an invalid string, then the function returns a NULL
string.
UTL_I18N.MAP_LOCALE_TO_ISO('American','America')
This returns 'en_US'
.
See Also:
Oracle Database Globalization Support Guide for a list of valid Oracle languages and territoriesThis function returns an Oracle territory name from an ISO locale.
UTL_I18N.MAP_TERRITORY_FROM_ISO ( isolocale IN VARCHAR2) RETURN VARCHAR2;
Table 170-16 MAP_TERRITORY_FROM_ISO Function Parameters
Parameter | Description |
---|---|
|
Specifies the ISO locale. The mapping is case-insensitive. |
If the user specifies an invalid locale string, then the function returns a NULL
string.
If the user specifies a locale string that includes only the territory (for example, _fr
instead of fr_fr
), then the function returns the default territory name for the specified territory (for example, French
).
UTL_I18N.MAP_TERRITORY_FROM_ISO('en_US')
This returns 'America'
.
See Also:
Oracle Database Globalization Support Guide for a list of valid Oracle territoriesThis function maps an Oracle language name to an Oracle short language name.
UTL_I18N.MAP_TO_SHORT_LANGUAGE ( language IN VARCHAR2 CHARACTER SET ANY_CS) RETURN VARCHAR2;
Table 170-17 MAP_TO_SHORT_LANGUAGE Function Parameters
Parameter | Description |
---|---|
|
Specifies a valid full language name. It is case-insensitive. |
If the user specifies an invalid language name, then the function returns a NULL
string.
Returns the short language name for the language.
DECLARE retval VARCHAR2(100);BEGIN retval := UTL_I18N.MAP_TO_SHORT_LANGUAGE('american'); DBMS_OUTPUT.PUT_LINE(retval);END;/US
This function converts RAW
data from a valid Oracle character set to a VARCHAR2
string in the database character set.
The function is overloaded. The different forms of functionality are described along with the syntax declarations.
Buffer Conversion:
UTL_I18N.RAW_TO_CHAR( data IN RAW, src_charset IN VARCHAR2 DEFAULT NULL) RETURN VARCHAR2;
Piecewise conversion converts raw data into character data piece by piece:
UTL_I18N.RAW_TO_CHAR ( data IN RAW, src_charset IN VARCHAR2 DEFAULT NULL, scanned_length OUT PLS_INTEGER, shift_status IN OUT PLS_INTEGER) RETURN VARCHAR2;
Table 170-18 RAW_TO_CHAR Function Parameters
Parameter | Description |
---|---|
|
Specifies the |
|
Specifies the character set that the |
|
Specifies the number of bytes of source data scanned |
|
Specifies the shift status at the end of the scan. The user must set it to Note: ISO 2022 character sets use escape sequences instead of shift characters to indicate the encoding method. |
If the user specifies an invalid character set, NULL
data, or data whose length is 0, then the function returns a NULL
string.
UTL_I18N.RAW_TO_CHAR(hextoraw('616263646566C2AA'), 'utf8')
This returns the following string in the database character set:
'abcde'||chr(170)
UTL_I18N.RAW_TO_CHAR(hextoraw('616263646566C2AA'),'utf8',shf,slen)
This expression returns the following string in the database character set:
'abcde'||chr(170)
It also sets shf
to SHIFT_IN
and slen
to 8
.
The following example converts data from the Internet piece by piece to the database character set.
rvalue RAW(1050); nvalue VARCHAR2(1024); conversion_state PLS_INTEGER = 0; converted_len PLS_INTEGER; rtemp RAW(10) = ''; conn utl_tcp.connection; tlen PLS_INTEGER; ... conn := utl_tcp.open_connection ( remote_host => 'localhost', remote_port => 2000); LOOP tlen := utl_tcp.read_raw(conn, rvalue, 1024); rvalue := utl_raw.concat(rtemp, rvalue); nvalue := utl_i18n.raw_to_char(rvalue, 'JA16SJIS', converted_len, conversion_stat); if (converted_len < utl_raw.length(rvalue) ) then rtemp := utl_raw.substr(rvalue, converted_len+1); else rtemp := ''; end if; /* do anything you want with nvalue */ /* e.g htp.prn(nvalue); */ END LOOP; utl_tcp.close_connection(conn); EXCEPTION WHEN utl_tcp.end_of_input THEN utl_tcp.close_connection(conn); END;
This function converts RAW
data from a valid Oracle character set to an NVARCHAR2
string in the national character set.
The function is overloaded. The different forms of functionality are described along with the syntax declarations.
Buffer Conversion:
UTL_I18N.RAW_TO_NCHAR ( data IN RAW, src_charset IN VARCHAR2 DEFAULT NULL) RETURN NVARCHAR2;
Piecewise conversion converts raw data into character data piece by piece:
UTL_I18N.RAW_TO_NCHAR ( data IN RAW, src_charset IN VARCHAR2 DEFAULT NULL, scanned_length OUT PLS_INTEGER, shift_status IN OUT PLS_INTEGER) RETURN NVARCHAR2;
Table 170-19 RAW_TO_NCHAR Function Parameters
Parameter | Description |
---|---|
|
Specifies the |
|
Specifies the character set that the |
|
Specifies the number of bytes of source data scanned |
|
Specifies the shift status at the end of the scan. The user must set it to Note: ISO 2022 character sets use escape sequences instead of shift characters to indicate the encoding method. |
If the user specifies an invalid character set, NULL
data, or data whose length is 0, then the function returns a NULL
string.
UTL_I18N.RAW_TO_NCHAR(hextoraw('616263646566C2AA'),'utf8')
This returns the following string in the national character set:
'abcde'||chr(170)
UTL_I18N.RAW_TO_NCHAR(hextoraw('616263646566C2AA'),'utf8', shf, slen)
This expression returns the following string in the national character set:
'abcde'||chr(170)
It also sets shf
to SHIFT_IN
and slen
to 8
.
The following example converts data from the Internet piece by piece to the national character set.
rvalue RAW(1050); nvalue NVARCHAR2(1024); converstion_state PLS_INTEGER = 0; converted_len PLS_INTEGER; rtemp RAW(10) = ''; conn utl_tcp.connection; tlen PLS_INTEGER; ... conn := utl_tcp.open_connection ( remote_host => 'localhost', remote_port => 2000); LOOP tlen := utl_tcp.read_raw(conn, rvalue, 1024); rvalue := utl_raw.concat(rtemp, rvalue); nvalue := utl_i18n.raw_to_nchar(rvalue, 'JA16SJIS', converted_len, conversion_stat); if (converted_len < utl_raw.length(rvalue) ) then rtemp := utl_raw.substr(rvalue, converted_len+1); else rtemp := ''; end if; /* do anything you want with nvalue */ /* e.g htp.prn(nvalue); */ END LOOP; utl_tcp.close_connection(conn); EXCEPTION WHEN utl_tcp.end_of_input THEN utl_tcp.close_connection(conn); END;
This function converts a VARCHAR2
or NVARCHAR2
string to another valid Oracle character set and returns the result as RAW
data.
UTL_I18N.STRING_TO_RAW( data IN VARCHAR2 CHARACTER SET ANY_CS, dst_charset IN VARCHAR2 DEFAULT NULL) RETURN RAW;
Table 170-20 STRING_TO_RAW Function Parameters
Parameter | Description |
---|---|
|
Specifies the |
|
Specifies the destination character set. If |
If the user specifies an invalid character set, a NULL
string, or a string whose length is 0, then the function returns a NULL
string.
DECLARE r raw(50); s varchar2(20); BEGIN s:='abcdef'||chr(170); r:=utl_i18n.string_to_raw(s,'utf8'); dbms_output.put_line(rawtohex(r)); end; /
This returns a hex value of '616263646566C2AA'
.
This function performs script transliteration. In this release, the TRANSLITERATE
function only supports Japanese Kana conversion.
UTL_I18N.TRANSLITERATE ( data IN VARCHAR2 CHARACTER SET ANY_CS, name IN VARCHAR2) RETURN VARCHAR2 CHARACTER SET data%CHARSET;
Table 170-21 TRANSLITERATE Function Parameters
Parameter | Description |
---|---|
|
Specifies the data to be converted. Either |
|
Specifies the transliteration name string. For a list of valid names, see Table 170-22. |
These options specify Japanese Kana conversions.
Table 170-22 TRANSLITERATE Function Constants
Constant Name | Value | Description |
---|---|---|
|
|
Converts any type of Kana character to a fullwidth Katakana character. |
|
|
Converts any type of Kana character to a halfwidth Katakana character. |
|
|
Converts any type of Kana character to a fullwidth Hiragana character. |
|
|
Converts only fullwidth Katakana characters to halfwidth Katakana characters. |
|
|
Converts only fullwidth Katakana characters to fullwidth Hiragana characters. |
|
|
Converts only halfwidth Katakana characters to fullwidth Katakana characters. |
|
|
Converts only halfwidth Katakana characters to fullwidth Hiragana characters. |
|
|
Converts only fullwidth Hiragana characters to fullwidth Katakana characters. |
|
|
Converts only fullwidth Hiragana characters to halfwidth Katakana characters. |
The function returns the converted string.
Given a table japanese_emp
, containing an NVARCHAR2
column ename
, the following statement can be used to normalize all the kana names in ename
to hiragana:
UPDATE japanese_emp SET ename = UTL_I18N.TRANSLITERATE (ename, 'kana_hiragana');
Figure shows how this output might look.
Figure 170-1 Loading Locale-Specific Data to the Database
The following statement normalizes one kana name to hiragana:
DECLARE Name japanese_emp.ename%TYPE; Eno CONSTANT NUMBER(4) := 1; BEGIN SELECT ename INTO name FROM japanese_emp WHERE enumber = eno; name := UTL_I18N.TRANSLITERATE(name, UTL_I18N.KANA_HIRAGANA); UPDATE japanese_emp SET ename = name WHERE enumber = eno; EXCEPTION WHEN UTL_I18N.UNSUPPORTED_TRANSLITERATION THEN DBMS_OUTPUT.PUT_LINE('transliteration not supported'); END; /
This function returns a string from an input string that contains character references. It decodes each character reference to the corresponding character value.
See Also:
"ESCAPE_REFERENCE Function" for more information about escape sequencesUTL_I18N.UNESCAPE_REFERENCE ( str IN VARCHAR2 CHARACTER SET ANY_CS) RETURN VARCHAR2 CHARACTER SET str%CHARSET;
Table 170-23 UNESCAPE_REFERENCE Function Parameters
Parameter | Description |
---|---|
|
Specifies the input string |
If the user specifies a NULL
string or a string whose length is 0, then the function returns a NULL
string. If the function fails, then it returns the original string.
UTL_I18N.UNESCAPE_REFERENCE('hello < å')
This returns 'hello <'||chr(229)
.