Garbled Chinese characters in ORACLE databases

Abstract: Improper character set settings are the key problem affecting the display of Chinese characters in ORACLE databases. Based on practical experience, this paper introduces the classification, composition, and setting methods of ORACLE character sets, analyzes the common symptoms and causes of garbled Chinese characters in ORACLE databases, and proposes a solution for each symptom and cause.

Keywords: ORACLE; character set; garbled characters; solutions; UTF-8

1 Introduction
    As an industry-leading database product, ORACLE has been widely adopted by domestic large and medium-sized enterprises in recent years. Although the product is quite mature in terms of localization, many users still report that Chinese characters are displayed as garbled text. For example, two users of the same database query the user name in the same table and get different results: "ORACLE??????" and "ORACLE China Co., Ltd.". In the first result the Chinese characters are clearly garbled. Why? Improper character set settings are the key problem affecting the display of Chinese characters in ORACLE databases.

2 About character sets
    ORACLE provides character sets to support the display of different languages. The character sets relevant to displaying Chinese characters mainly include ZHS16CGB231280, ZHS16GBK, US7ASCII, and UTF-8. Character sets exist on both the server side and the client side. The server-side character set is specified when ORACLE is installed, and the character set information is recorded in V$NLS_PARAMETERS in the ORACLE data dictionary; the client-side character set is set in the system registry (WINDOWS) or in the user's environment variables (UNIX).

3 Composition and setting of character sets
    The composition and setting method of a character set differ between the client side and the server side:

(1) Composition and setting of the client character set. The client character set is determined by the current user's environment variable NLS_LANG, which has the form: NLS_LANG=language_territory.charset

where: language specifies the language of server messages

      territory specifies the date and number formats

      charset specifies the character set

The three components can be combined arbitrarily, for example:

      AMERICAN_AMERICA.US7ASCII

      SIMPLIFIED CHINESE_CHINA.ZHS16GBK

      AMERICAN_AMERICA.ZHS16GBK
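
The structure of NLS_LANG described above can be illustrated with a short Python sketch. The function name parse_nls_lang and the NlsLang type are hypothetical names introduced only for this illustration, and the sketch assumes that all three components are present, which is stricter than what ORACLE itself accepts.

```python
from typing import NamedTuple


class NlsLang(NamedTuple):
    language: str
    territory: str
    charset: str


def parse_nls_lang(value: str) -> NlsLang:
    """Split 'language_territory.charset' into its three components."""
    # The charset follows the last dot. The language may contain spaces
    # (for example "SIMPLIFIED CHINESE"), so split on the first underscore.
    head, dot, charset = value.strip().rpartition(".")
    if not dot or not charset:
        raise ValueError(f"missing '.charset' part in {value!r}")
    language, underscore, territory = head.partition("_")
    if not underscore or not language or not territory:
        raise ValueError(f"missing 'language_territory' part in {value!r}")
    return NlsLang(language, territory, charset)


print(parse_nls_lang("SIMPLIFIED CHINESE_CHINA.ZHS16GBK"))
print(parse_nls_lang("AMERICAN_AMERICA.ZHS16GBK").charset)
```

Only the charset component decides how Chinese characters are converted; language and territory affect messages and formats.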

The way to set the client character set differs slightly between operating systems:

    On WINDOWS, it is set in the registry value HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\HOME0\NLS_LANG;

    On UNIX, it is set in the current user's environment variables, for example by adding the following line to the current user's profile file (the quotes are required because the value contains a space):

    NLS_LANG="SIMPLIFIED CHINESE_CHINA.ZHS16GBK"; export NLS_LANG

(2) Composition and setting of the server character set. The server character set is reflected in the values of NLS_LANGUAGE, NLS_TERRITORY, and NLS_CHARACTERSET in the data dictionary view V$NLS_PARAMETERS, where the value of NLS_CHARACTERSET is the database character set itself. For example, the query SQL> SELECT * FROM V$NLS_PARAMETERS; returns the following results:

PARAMETER           VALUE
NLS_LANGUAGE        SIMPLIFIED CHINESE
NLS_TERRITORY       CHINA
NLS_CHARACTERSET    ZHS16GBK

    That is, the character set used by the current database is ZHS16GBK.
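
The same lookup can be scripted. The following is a minimal Python sketch, assuming a DB-API 2.0 cursor already connected to ORACLE (for example one obtained from a driver such as python-oracledb). The helper name server_charset and the FakeCursor class are hypothetical; FakeCursor only stands in for a real connection so that the sketch runs without a database.

```python
def server_charset(cursor) -> str:
    """Return the value of NLS_CHARACTERSET using a DB-API 2.0 cursor."""
    cursor.execute(
        "SELECT value FROM v$nls_parameters "
        "WHERE parameter = 'NLS_CHARACTERSET'"
    )
    row = cursor.fetchone()
    if row is None:
        raise LookupError("NLS_CHARACTERSET not found in V$NLS_PARAMETERS")
    return row[0]


class FakeCursor:
    """Stand-in cursor that returns the sample value from the text."""

    def execute(self, sql):
        self.sql = sql  # remember the statement; no database is contacted

    def fetchone(self):
        return ("ZHS16GBK",)


print(server_charset(FakeCursor()))
```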

    The character set of the database server is set when the database is created. However, it can be modified afterwards by the following methods:

    Method 1: Rebuild the database. When creating a database, set the character set of the database to the required character set.

    Method 2: Modify the SYS.PROPS$ table. Log in to ORACLE as the SYS user, then run the following statements to update the character set and commit: SQL> UPDATE PROPS$ SET VALUE$='ZHS16GBK' WHERE NAME='NLS_CHARACTERSET'; SQL> COMMIT;

    Changing the database character set by this method only affects data written after the change; data already in the database is still stored in the original character set.
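
Why existing data is affected can be illustrated outside the database. The following Python sketch uses the standard codecs "utf-8" and "gbk" as stand-ins for a UTF-8 database character set and ZHS16GBK; this mapping is an assumption made for illustration, and the sample text is hypothetical.

```python
text = "中国"

# Bytes written while the database character set was UTF-8.
stored = text.encode("utf-8")

# Relabelling the database as ZHS16GBK does not rewrite the stored bytes,
# so old rows are now interpreted with the wrong character set.
old_row = stored.decode("gbk", errors="replace")

# Rows written after the change are encoded and decoded consistently.
new_row = text.encode("gbk").decode("gbk")

print(repr(old_row))
print(repr(new_row))
```

The old row no longer reads as the original text, while the new row does.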

    In addition, some users run the CREATE DATABASE CHARACTER SET ZHS16GBK command to change the character set temporarily. After the database is restarted, the database character set reverts to the original character set.

4 Common garbled Chinese character problems and solutions
    To display Chinese characters from an ORACLE database correctly on the client, two conditions must hold: first, the client character set must be consistent with the server character set; second, the character set of the data loaded into the ORACLE database must be consistent with the server character set. Accordingly, the garbled character problems fall roughly into the following cases:

    (1) The client character set differs from the server character set, while the server character set is consistent with that of the loaded data. This is the most common case, and it is resolved by setting the client character set correctly. Specific solution:

     Step 1: Query V$NLS_PARAMETERS to get the server character set: SQL> SELECT * FROM V$NLS_PARAMETERS;

PARAMETER           VALUE
NLS_LANGUAGE        SIMPLIFIED CHINESE
NLS_TERRITORY       CHINA
NLS_CHARACTERSET    ZHS16GBK

    Step 2: Set the client character set to match the server character set, using the client-side setting method described in Section 3. On UNIX, for example, add the following two lines to the current user's profile file:

    NLS_LANG="SIMPLIFIED CHINESE_CHINA.ZHS16GBK"
    export NLS_LANG
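
The symptom in case (1) can be reproduced with Python's standard codecs. This is only a sketch of the conversion behaviour, not of ORACLE's internal implementation: the mapping from ORACLE character set names to Python codec names is an assumption, the helper name as_seen_by_client is hypothetical, and the Chinese company name is an assumed original of the example in the Introduction.

```python
# Assumed mapping from ORACLE character set names to Python codec names.
ORACLE_TO_PYTHON = {
    "ZHS16GBK": "gbk",
    "ZHS16CGB231280": "gb2312",
    "US7ASCII": "ascii",
    "UTF8": "utf-8",
}


def as_seen_by_client(text: str, client_charset: str) -> str:
    """Convert server text to the client character set.

    Characters that the client character set cannot represent
    are replaced by '?'.
    """
    codec = ORACLE_TO_PYTHON[client_charset]
    return text.encode(codec, errors="replace").decode(codec)


name = "ORACLE中国有限公司"
print(as_seen_by_client(name, "US7ASCII"))  # ORACLE??????
print(as_seen_by_client(name, "ZHS16GBK"))  # ORACLE中国有限公司
```

The data on the server is intact in this case, which is why correcting NLS_LANG on the client is sufficient.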

   (2) The client character set is the same as the server character set, but the server character set is inconsistent with that of the loaded data. This generally happens when ORACLE is upgraded or the database is reinstalled with a character set different from the original one, while the backup data loaded for recovery was exported using the original character set. It also happens when data is loaded from other ORACLE databases that use a different character set. In both cases, Chinese characters cannot be displayed correctly regardless of whether the client character set matches the server character set. Specific solutions:

    Option 1: Change the server character set to match that of the data to be loaded, using the server-side modification methods described in Section 3, and then import the data.

    Option 2: Convert the data through an intermediate format to avoid the character set problem. First import the data into a database whose character set matches it, then export the data either as plain text (suitable for small amounts of data) or with a third-party tool (such as POWER BUILDER, ACCESS, or FOXPRO), and finally import the exported data into the target database.
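
The idea behind Option 2, passing the data through a character-set-neutral form, can be sketched in Python. The codecs "gbk" and "utf-8" stand in for the source and target database character sets, and the function name transcode is hypothetical.

```python
def transcode(data: bytes, source_codec: str, target_codec: str) -> bytes:
    """Decode with the source character set, then encode in the target one."""
    # The default error mode is strict: a character the target cannot
    # represent raises an exception instead of silently becoming '?'.
    return data.decode(source_codec).encode(target_codec)


exported = "汉字".encode("gbk")  # data unloaded from a GBK source
converted = transcode(exported, "gbk", "utf-8")

print(exported != converted)
print(converted.decode("utf-8"))
```

The byte sequences differ, but the text is preserved because each side uses the character set that actually matches its bytes.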

    (3) The client character set differs from the server character set, and the server character set differs from that of the entered data. This happens when Chinese text is entered from a client whose character set is inconsistent with the server character set. Text entered this way cannot be displayed as Chinese characters even after the client character set is corrected. Solution: set the client character set to match the server character set, then re-enter the data.
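
Why the data in case (3) must be re-entered can be seen in a small Python sketch. It assumes, for illustration only, that the text was entered through a US7ASCII client and that unmappable characters were replaced on the way in.

```python
original = "汉字"

# Entered through a US7ASCII client: each character that cannot be
# represented is replaced by '?' before it is stored.
stored = original.encode("ascii", errors="replace")

# Correcting the client character set later only changes how the stored
# bytes are decoded; the replaced characters cannot be recovered.
for codec in ("ascii", "gbk", "utf-8"):
    print(codec, stored.decode(codec))
```

Every decoding yields the same question marks, because the original characters were never stored.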

5 Conclusions
   According to the official ORACLE documentation, the character set of a database cannot be changed once the database has been created. It is therefore important to decide in advance which character set the database will use. The general rule is to choose a database character set that is a superset of the operating system's native character set and also a superset of all client character sets. In a Chinese environment, when choosing between ZHS16CGB231280 and ZHS16GBK, ZHS16GBK is usually preferable because it contains the ZHS16CGB231280 character set.
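
The superset relationship can be spot-checked with Python's standard codecs. This sketch assumes that the "gb2312" and "gbk" codecs correspond to ZHS16CGB231280 and ZHS16GBK, and it only compares the Chinese character rows of GB2312, not the symbol rows. The sample character 镕 is chosen for illustration.

```python
def gb2312_hanzi():
    """Yield (bytes, character) for each Chinese character in GB2312."""
    for lead in range(0xB0, 0xF8):
        for trail in range(0xA1, 0xFF):
            raw = bytes([lead, trail])
            try:
                yield raw, raw.decode("gb2312")
            except UnicodeDecodeError:
                continue  # unassigned position in the table


pairs = list(gb2312_hanzi())
# Every GB2312 byte pair should decode to the same character under GBK.
mismatches = [raw for raw, ch in pairs if raw.decode("gbk") != ch]
print(len(pairs), len(mismatches))

# The reverse does not hold: this character exists in GBK only.
try:
    "镕".encode("gb2312")
    only_in_gbk = False
except UnicodeEncodeError:
    only_in_gbk = True
print(only_in_gbk, "镕".encode("gbk").hex())
```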

References
1. (US) JONATHAN GENNICK, CAROL MCCULLOUGH-DIETER, GERRIT-JAN LINKER; translated by Zhao Yanqin, Liu Guanying, Qin Yujie, et al. "ORACLE8I DBA Collection". Electronic Industry Press.

2. JASON COUCHMAN, SUDHEER MARISETTI. "OCP ORACLE9I DATABASE: FUNDAMENTALS I EXAM GUIDE". MCGRAW-HILL.

3. ORACLE Corporation. "ORACLE 9i Database Administration Fundamentals I Student Guide".
