How many bytes is a char in Java?

Background

  How many bytes does a char occupy? You may remember from school that the textbook said 2 bytes, without ever looking into it more closely. This post explores how many bytes a char really takes.

Char

  char was designed to store characters, but the world has a great many of them. With 1 byte, only 256 distinct values can be represented, which is clearly not enough. With 2 bytes, 65,536 values are available, enough to cover the characters of most languages in use. For this reason Java uses Unicode for its characters, and a char is stored as 2 bytes (one 16-bit UTF-16 code unit).
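
  The fixed width of the primitive can be checked directly from the constants on java.lang.Character. The following is a minimal sketch; the class name CharWidthDemo and the helper charBytes are made up for illustration.

```java
class CharWidthDemo {
    // Hypothetical helper: number of bytes the char primitive occupies.
    static int charBytes() {
        return Character.BYTES;
    }

    public static void main(String[] args) {
        System.out.println(charBytes());                // 2
        System.out.println(Character.SIZE);             // 16 (bits)
        System.out.println((int) Character.MAX_VALUE);  // 65535, the largest char value
    }
}
```
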
  This raises two questions:
    1. Is a Java char always two bytes?
    2. Can a char hold a Chinese character?
  First question: is a char always two bytes? In memory, yes: the char primitive is always 16 bits. What varies is the number of bytes a character occupies once it is encoded, and that depends on the character encoding chosen. With "ISO-8859-1", every character is encoded as one byte. "UTF-8", "GB2312" and "GBK" are variable-length encodings: an English (ASCII) character takes one byte in all of them, while a Chinese character typically takes three bytes in "UTF-8" and two bytes in "GBK" and "GB2312". In "UTF-16", the encoding Java uses internally, a character takes two bytes, except for characters outside the Basic Multilingual Plane, which take four.
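
  To see these byte counts in practice, one can encode the same text with different charsets and compare the lengths. This is a sketch, not a definitive implementation: the class EncodedLengthDemo and the helper encodedLength are illustrative names, and "GBK" is assumed to be installed in the JDK (it is an extended charset, so the code checks for it first). UTF_16BE is used rather than "UTF-16" because the latter prepends a 2-byte byte-order mark when encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodedLengthDemo {
    // Hypothetical helper: number of bytes the text occupies in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String latin = "A";
        String chinese = "\u4E2D"; // the Chinese character U+4E2D

        System.out.println(encodedLength(latin, StandardCharsets.ISO_8859_1)); // 1
        System.out.println(encodedLength(latin, StandardCharsets.UTF_8));      // 1
        System.out.println(encodedLength(chinese, StandardCharsets.UTF_8));    // 3
        System.out.println(encodedLength(chinese, StandardCharsets.UTF_16BE)); // 2

        // GBK is not guaranteed on every runtime, so check before using it.
        if (Charset.isSupported("GBK")) {
            System.out.println(encodedLength(chinese, Charset.forName("GBK"))); // 2
        }
    }
}
```
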
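  Second question: a char itself can hold any character in the Basic Multilingual Plane, which includes most commonly used Chinese characters; rarer characters outside it need two chars (a surrogate pair). Whether the character survives encoding depends on the charset: "ISO-8859-1" cannot represent any Chinese character, while "UTF-8", "GB2312" and "GBK" can encode most Chinese characters ("UTF-8" covers all of Unicode).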
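
  Both points can be illustrated with a short sketch. The class ChineseCharDemo and the helper roundTrip are hypothetical names; U+20000 is picked only as an example of a character outside the Basic Multilingual Plane.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ChineseCharDemo {
    // Hypothetical helper: encode the text to bytes and decode it back with the same charset.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        char zhong = '\u4E2D';          // a common Chinese character fits in one char
        String rare = "\uD840\uDC00";   // U+20000 needs a surrogate pair

        System.out.println((int) zhong);                          // 20013
        System.out.println(rare.length());                        // 2 chars
        System.out.println(rare.codePointCount(0, rare.length())); // 1 character

        String text = String.valueOf(zhong);
        // UTF-8 can represent the character, so the round trip is lossless.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text)); // true
        // ISO-8859-1 cannot, so getBytes substitutes '?'.
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1));         // ?
    }
}
```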

Summary

  A Java char is always 2 bytes in memory, but the number of bytes a character takes once encoded, and whether a Chinese character can be represented at all, depend on the encoding. When data crosses platforms, specify the same charset explicitly for both encoding and decoding, so that a mismatch does not produce garbled text.
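
  As a rough sketch of this advice, the example below always passes a charset explicitly instead of relying on the platform default, and shows what a mismatched decode looks like. The class ExplicitCharsetDemo and its methods are illustrative names, and UTF-8 is an assumed choice of shared charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ExplicitCharsetDemo {
    // Assumption for this sketch: both sides have agreed on UTF-8.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Decoding with a different charset, to show the failure mode.
    static String decodeAs(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String original = "\u4E2D\u6587"; // two Chinese characters
        byte[] wire = encode(original);

        System.out.println(wire.length);                    // 6 bytes in UTF-8
        System.out.println(decode(wire).equals(original));  // true

        // Each byte is read as its own Latin-1 character: garbled text.
        String garbled = decodeAs(wire, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length());               // 6
        System.out.println(garbled.equals(original));       // false
    }
}
```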

Origin blog.csdn.net/zengchenacmer/article/details/75453373