Files uploaded from Eclipse to Hadoop appear garbled

When this problem occurred, I worked through the following checks:

1. Is the file encoded in UTF-8?

2. Upload the file to Hadoop from Linux, then view it on Linux to see whether it is garbled.

3. Neither check showed a problem, so I turned to Eclipse and changed the project encoding to UTF-8, but that did not help.

4. After reading a guide, I changed Eclipse's workspace-wide settings to UTF-8, and that fixed it.
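
The first check can also be done in code rather than by eye. Below is a minimal sketch, not something from my original troubleshooting: the class name `Utf8Check` and the helper `isValidUtf8` are made up for illustration, and the file path is whatever you pass on the command line. It only uses the JDK's strict `CharsetDecoder`, which reports malformed bytes instead of silently replacing them.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper class, named here only for illustration.
class Utf8Check {
    // Returns true if the bytes decode as UTF-8 with no malformed sequence.
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            // The decoder hit bytes that are not legal UTF-8.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java Utf8Check <file>");
            return;
        }
        // args[0] is the local file you are about to upload.
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(isValidUtf8(data) ? "valid UTF-8" : "not valid UTF-8");
    }
}
```

Note that a file containing only ASCII passes this check too, so a "valid UTF-8" result is most informative for files that actually contain Chinese text.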

UTF-8 settings in Eclipse

1. Window->Preferences to open the "Preferences" dialog box.

2. Then General->Workspace; under Text file encoding on the right, select Other and change it to UTF-8.

3. Expand Web and set CSS, HTML, JSP, JavaScript, XML, etc. to UTF-8.

Alternatively, change it under General->Content Types: in the Content types tree on the right, expand Text, select Java Source File, enter UTF-8 in the Default encoding box below, and click Update to set the Java file encoding to UTF-8. Other file types can be changed the same way.

4. Java->Installed JREs.

Remove the original JDK 6.0 entry that comes with Eclipse and configure JDK 1.6.0_06 instead (click Add and choose the JDK under the Java folder in Program Files on the C drive: C:\Program Files\Java).

UTF-8 settings in MyEclipse

The steps are the same:

1. Window->Preferences to open the "Preferences" dialog box.

2. General->Workspace, and set the Text file encoding to UTF-8.

3. MyEclipse->Files and Editors, and set CSS, HTML, JSP, JavaScript, XML, etc. to UTF-8.

4. Java->Installed JREs: remove the original JDK 6.0 entry that comes with MyEclipse and configure JDK 1.6.0_06 instead (click Add and choose the JDK under the Java folder in Program Files on the C drive).

Reason for the change:
If you want plug-in development applications to have better internationalization support and to handle Chinese output as fully as possible, it is best to use UTF-8 encoding for Java files. However, the default character encoding of an Eclipse workspace is the operating system's default encoding, and on Simplified Chinese systems (Windows XP, Windows 2000 Simplified Chinese) that default is GB18030. Projects created in such a workspace are therefore encoded in GB18030, and so are the Java files created in those projects. If you want newly created projects and Java files to use UTF-8 directly, you need to make the changes above; this step should not be skipped.
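
To make the mismatch concrete, here is a small sketch of what goes wrong when text is written in one encoding and read in another. The class name `EncodingMismatch` and the helper `roundTrip` are hypothetical, and the sketch assumes the JDK in use ships the GB18030 charset (full JDK installs normally do). The default charset it prints depends on your JVM and operating system.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, named here only for illustration.
class EncodingMismatch {
    // Encodes the text with one charset and decodes the bytes with another,
    // which is what happens when the writer and the reader disagree.
    static String roundTrip(String text, Charset writeAs, Charset readAs) {
        return new String(text.getBytes(writeAs), readAs);
    }

    public static void main(String[] args) {
        // The encoding the JVM falls back to when none is given explicitly.
        System.out.println("JVM default charset: " + Charset.defaultCharset());

        String text = "\u4e2d\u6587"; // two Chinese characters, written as escapes
        Charset gb = Charset.forName("GB18030"); // assumes this charset is installed

        // Same charset on both sides: the text survives unchanged.
        System.out.println("GB18030 -> GB18030 intact: "
                + text.equals(roundTrip(text, gb, gb)));

        // Written as GB18030 but read as UTF-8: the text comes back garbled.
        System.out.println("GB18030 -> UTF-8 intact:   "
                + text.equals(roundTrip(text, gb, StandardCharsets.UTF_8)));
    }
}
```

This is the situation from the beginning of the post: a file saved by Eclipse in the workspace default encoding and then read on the Hadoop side as UTF-8.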


Origin blog.csdn.net/qq_41661800/article/details/94392344