Java Foundation - Character Streams of IO Stream Objects (FileWriter and FileReader)

Author: Yin Zhengjie

1. Introduction to common encodings

1>.ASCII

　　Computers were first developed in the English-speaking world, and their early designers did not anticipate worldwide use, so the first encodings ignored Chinese, Japanese, Thai, and other scripts. They covered only English upper- and lowercase letters, digits, punctuation, and some control characters. This encoding is called ASCII. Strictly speaking, ASCII defines 128 characters using 7 bits, but each character is normally stored in one byte, that is, 8 binary digits of "0" and "1". Even if all 8 bits are used, 2 to the 8th power is only 256, so a single byte cannot represent the characters of every language. For example, storing the uppercase letter "A" takes one byte, which in binary is "0100 0001".
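　　As a small illustration of the one-byte claim, the sketch below prints the bit pattern of "A" and shows what happens when a Chinese character is forced into ASCII. The class and method names (AsciiDemo, toBinary8, asciiBytes) are made up for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class AsciiDemo {
    // Formats the low 8 bits of a value as a zero-padded binary string.
    static String toBinary8(int value) {
        String bits = Integer.toBinaryString(value & 0xFF);
        return String.format("%8s", bits).replace(' ', '0');
    }

    // Encodes text as ASCII; characters outside ASCII are replaced with '?'.
    static byte[] asciiBytes(String text) {
        return text.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(toBinary8('A'));          // 01000001
        System.out.println(asciiBytes("A").length);  // 1
        System.out.println(asciiBytes("A")[0]);      // 65
        // U+4E2D is a Chinese character; ASCII cannot represent it.
        System.out.println((char) asciiBytes("\u4e2d")[0]); // ?
    }
}
```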

2>.GB2312 and GBK

　　These encodings were created because ASCII cannot store Chinese characters. GB2312 covers the commonly used simplified Chinese characters, and GBK is a superset of GB2312 that adds many more characters, including traditional Chinese characters. Both remain compatible with ASCII, but neither covers the scripts of other countries. This means that if you develop a Chinese game and install it on computers in Japan, Thailand, or elsewhere, its text may display as garbled characters or the game may fail to run, because those systems default to the encodings of their own languages. For Chinese text, GB2312 and GBK are more compact than UTF-8: UTF-8 needs 3 bytes (24 bits) for a common Chinese character, while GBK needs only 2 bytes.
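　　The 2-byte versus 3-byte difference can be checked with a short sketch like the one below. It assumes the Java runtime ships the GBK charset, which standard JDK builds usually do but trimmed-down runtimes may not, so the helper returns -1 in that case. ByteCountDemo and its methods are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class ByteCountDemo {
    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes in GBK, or -1 when this runtime has no GBK charset.
    static int gbkLength(String text) {
        if (!Charset.isSupported("GBK")) {
            return -1;
        }
        return text.getBytes(Charset.forName("GBK")).length;
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two common Chinese characters
        System.out.println("UTF-8 bytes: " + utf8Length(text)); // 6
        System.out.println("GBK bytes:   " + gbkLength(text));  // 4 when GBK is available
    }
}
```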

3>.Unicode (the "universal code")

　　As computers spread, each country created an encoding for its own script (GBK in China, Shift_JIS in Japan, EUC-KR in South Korea, and so on), and these encodings were incompatible with one another. A single standard was therefore created that covers the characters of nearly all languages; we call it the "universal code" (that is, Unicode). Unicode assigns every character a number, and its early fixed-width encoding stored every character in at least 16 bits (two bytes), that is, 16 consecutive binary digits of "0" and "1". This encoding can represent every ASCII character, but a character that used to take one byte now takes two. For example, the capital "A" is "0000 0000 0100 0001" in binary, which is clearly wasteful for English text: what used to fit in 8 bits now takes 16. A common Chinese character also fits in 16 bits, while rarer characters outside that range take 4 bytes in the 16-bit encoding (UTF-16).
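　　To see the 16-bit layout concretely, the following sketch encodes a few characters as big-endian UTF-16 and prints the bytes in hexadecimal. Utf16Demo and utf16Hex are hypothetical names, and U+1F600 is used only as an example of a character outside the 16-bit range.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class Utf16Demo {
    // Encodes text as big-endian UTF-16 (no byte-order mark) and renders the bytes as hex.
    static String utf16Hex(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_16BE)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(utf16Hex("A"));            // 0041 (two bytes for an ASCII letter)
        System.out.println(utf16Hex("\u4e2d"));       // 4E2D (two bytes for a common Chinese character)
        System.out.println(utf16Hex("\uD83D\uDE00")); // D83DDE00 (four bytes: a surrogate pair for U+1F600)
    }
}
```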

4>.UTF-8

　　Unicode covers the characters of almost every language, but the 16-bit encoding spends at least 2 bytes on every character, which wastes space for simple characters such as English letters. UTF-8 was introduced to address this. It is a variable-length encoding of Unicode that uses between 1 and 4 bytes per character, with a minimum of 8 bits (one byte). An English character takes one byte, identical to its ASCII encoding, while most Chinese characters take 3 bytes and rarer characters take 4.
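　　The variable length is easy to observe by dumping the UTF-8 bytes of characters from different ranges. This is only a sketch; Utf8Demo and utf8Hex are hypothetical names, and the sample characters are arbitrary.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class Utf8Demo {
    // Encodes text as UTF-8 and renders the bytes as hex, two digits per byte.
    static String utf8Hex(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(utf8Hex("A"));            // 41       (1 byte, same as ASCII)
        System.out.println(utf8Hex("\u00e9"));       // C3A9     (2 bytes, Latin letter with accent)
        System.out.println(utf8Hex("\u4e2d"));       // E4B8AD   (3 bytes, common Chinese character)
        System.out.println(utf8Hex("\uD83D\uDE00")); // F09F9880 (4 bytes, U+1F600)
    }
}
```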

5>.Recommended encoding format

　　There are many encoding formats. Although GBK stores Chinese characters in less space than UTF-8, we still recommend UTF-8 for three reasons:

　　　　First, GBK does not cover the scripts of other countries;

　　　　Second, much open source software is developed outside China, and most of it uses UTF-8;

　　　　Third, the Python 3.x interpreter encodes and decodes with UTF-8 by default (you can of course specify another encoding);

　　For more about encodings, see: http://www.cnblogs.com/yinzhengjie/p/7518172.html
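　　In Java, one way to follow this recommendation is to name the charset explicitly instead of relying on the platform default. The sketch below is an assumption about how one might do that with the standard bridge classes OutputStreamWriter and InputStreamReader; the class ExplicitCharsetDemo, the method roundTrip, and the temporary file are hypothetical.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class ExplicitCharsetDemo {
    // Writes text to a temporary file as UTF-8, reads it back as UTF-8, and returns what was read.
    static String roundTrip(String text) {
        try {
            File file = File.createTempFile("charset-demo", ".txt");
            file.deleteOnExit();
            try (Writer out = new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8)) {
                out.write(text);
            }
            StringBuilder sb = new StringBuilder();
            try (Reader in = new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8)) {
                char[] buf = new char[1024];
                int len;
                while ((len = in.read(buf)) != -1) {
                    sb.append(buf, 0, len);
                }
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Mixed ASCII and Chinese text survives the round trip on any platform.
        System.out.println(roundTrip("A\u4e2d\u6587").equals("A\u4e2d\u6587")); // true
    }
}
```

　　Because both sides name UTF-8, the result does not depend on the default encoding of the machine that runs it.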

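2. Writing text with a character output stream: the FileWriter class

　　java.io.Writer is the abstract class for writing to character streams; in other words, it is the superclass of all character output streams.

　　Purpose: write characters held in memory to a text file. Here we demonstrate one of its commonly used subclasses, FileWriter.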

/*
@author :yinzhengjie
Blog:http://www.cnblogs.com/yinzhengjie/tag/Java%E5%9F%BA%E7%A1%80/
*/

package cn.org.yinzhengjie.note5;

import java.io.FileWriter;
import java.io.IOException;

public class WriterDemo {
    public static void main(String[] args) throws IOException {
        FileWriter fw = new FileWriter("yinzhengjie.txt");
        // Write a string
        fw.write("尹正杰");
        // Flush the data buffered in memory to the file
        fw.flush();
        // Write a whole char array
        char[] buf = {'a','b','c','d','e'};
        fw.write(buf);
        // Write part of a char array: 3 chars starting at index 1 ("bcd")
        fw.write(buf, 1, 3);
        // Write a single character (97 is the code of 'a')
        fw.write(97);
        // Release the resource; close() flushes any remaining data first.
        fw.close();
    }
}
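　　Note that new FileWriter("yinzhengjie.txt") truncates an existing file. FileWriter also has constructors that take a boolean append flag; the sketch below shows the difference using a temporary file. AppendDemo and writeThenAppend are hypothetical names, and the text is kept to ASCII so the result does not depend on the default encoding.

```java
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; the names are illustrative only.
class AppendDemo {
    // Writes `first` to a fresh temporary file, appends `second`, and returns the file's content.
    static String writeThenAppend(String first, String second) {
        try {
            File file = File.createTempFile("append-demo", ".txt");
            file.deleteOnExit();
            // Without the flag, FileWriter starts from an empty file.
            try (FileWriter fw = new FileWriter(file)) {
                fw.write(first);
            }
            // With append = true, existing content is kept and new text goes at the end.
            try (FileWriter fw = new FileWriter(file, true)) {
                fw.write(second);
            }
            StringBuilder sb = new StringBuilder();
            try (FileReader fr = new FileReader(file)) {
                char[] buf = new char[1024];
                int len;
                while ((len = fr.read(buf)) != -1) {
                    sb.append(buf, 0, len);
                }
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThenAppend("abc", "def")); // abcdef
    }
}
```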

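3. Reading text with a character input stream: the FileReader class

　　java.io.Reader is the abstract class for reading character streams; it is also the superclass of all character input streams.

　　Purpose: read text files. Here we demonstrate FileReader.

Contents of yinzhengjie.txt:

尹正杰abcdebcda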

/*
@author :yinzhengjie
Blog:http://www.cnblogs.com/yinzhengjie/tag/Java%E5%9F%BA%E7%A1%80/
*/

package cn.org.yinzhengjie.note5;

import java.io.FileReader;
import java.io.IOException;

public class ReaderDemo {
    public static void main(String[] args) throws IOException {
        FileReader fr = new FileReader("yinzhengjie.txt");
        /*
        // Read one character at a time
        int ch;
        while((ch = fr.read()) != -1) {
            System.out.print((char)ch);
        }
        fr.close();
        */

        // Read into a char array; read() returns the number of chars read, or -1 at end of file
        char[] buf = new char[4096];
        int len = 0;
        while((len = fr.read(buf)) != -1) {
            // Only the first len chars of buf are valid after each read
            System.out.print(new String(buf,0,len));
        }
        System.out.println();
        // Release the resource
        fr.close();
    }
}

/*
Output of the code above:
尹正杰abcdebcda
*/
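　　The reason for new String(buf,0,len) rather than new String(buf) is that a read may fill only part of the buffer, and the rest still holds characters from an earlier read. The sketch below makes this visible by reading with a deliberately tiny buffer and collecting each chunk. ChunkedReadDemo and chunksOf are hypothetical names, and the exact chunk boundaries are not guaranteed by the Reader contract, so no particular split is assumed.

```java
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class ChunkedReadDemo {
    // Writes text to a temporary file, then reads it back in chunks of at most bufferSize chars.
    static List<String> chunksOf(String text, int bufferSize) {
        try {
            File file = File.createTempFile("chunk-demo", ".txt");
            file.deleteOnExit();
            try (FileWriter fw = new FileWriter(file)) {
                fw.write(text);
            }
            List<String> chunks = new ArrayList<>();
            try (FileReader fr = new FileReader(file)) {
                char[] buf = new char[bufferSize];
                int len;
                while ((len = fr.read(buf)) != -1) {
                    // Keep only the len chars filled by this read; the tail of buf may be stale.
                    chunks.add(new String(buf, 0, len));
                }
            }
            return chunks;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> chunks = chunksOf("abcdebcda", 4);
        System.out.println(chunks);                  // the pieces, each at most 4 chars long
        System.out.println(String.join("", chunks)); // abcdebcda
    }
}
```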

4. Copying a text file with character streams

/*
@author :yinzhengjie
Blog:http://www.cnblogs.com/yinzhengjie/tag/Java%E5%9F%BA%E7%A1%80/
*/

package cn.org.yinzhengjie.note5;

import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class CopyFile {
    public static void main(String[] args) {
        FileReader fr = null;
        FileWriter fw = null;

        try {
            fr = new FileReader("yinzhengjie.txt");
            fw = new FileWriter("yinzhengjie.backup");

            // Copy in chunks; only the first len chars of buf are valid after each read
            char[] buf = new char[4096];
            int len;
            while ((len = fr.read(buf)) != -1) {
                fw.write(buf, 0, len);
            }
            // Flush once after the loop; close() would also flush
            fw.flush();
        } catch (IOException e) {
            // Keep the original exception as the cause
            throw new RuntimeException("Copy failed!", e);
        } finally {
            // Close the writer first, then the reader, even if the first close fails
            try {
                if (fw != null) {
                    fw.close();
                }
            } catch (IOException e) {
                throw new RuntimeException("Failed to release resources!", e);
            } finally {
                try {
                    if (fr != null) {
                        fr.close();
                    }
                } catch (IOException e) {
                    throw new RuntimeException("Failed to release resources!", e);
                }
            }
        }
    }
}
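　　Since Java 7 the same copy can be written more compactly with try-with-resources, which closes both streams automatically in reverse order of creation. The version below is a sketch of that alternative, not the article's original approach; CopyWithResources, copy, and copyAndReadBack are hypothetical names, and the helper uses temporary files with ASCII text.

```java
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; the names are illustrative only.
class CopyWithResources {
    // Copies a text file using the platform default encoding on both sides.
    static void copy(String source, String target) throws IOException {
        try (FileReader fr = new FileReader(source);
             FileWriter fw = new FileWriter(target)) {
            char[] buf = new char[4096];
            int len;
            while ((len = fr.read(buf)) != -1) {
                fw.write(buf, 0, len);
            }
        } // fw is closed (and flushed) first, then fr
    }

    // Helper for the demo: writes text to a temp file, copies it, and returns the copy's content.
    static String copyAndReadBack(String text) {
        try {
            File src = File.createTempFile("copy-src", ".txt");
            File dst = File.createTempFile("copy-dst", ".txt");
            src.deleteOnExit();
            dst.deleteOnExit();
            try (FileWriter fw = new FileWriter(src)) {
                fw.write(text);
            }
            copy(src.getPath(), dst.getPath());
            StringBuilder sb = new StringBuilder();
            try (FileReader fr = new FileReader(dst)) {
                char[] buf = new char[1024];
                int len;
                while ((len = fr.read(buf)) != -1) {
                    sb.append(buf, 0, len);
                }
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(copyAndReadBack("abcdebcda")); // abcdebcda
    }
}
```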
