# Qt character encodings: GB2312, GBK, BIG5, GB18030, Unicode, UTF-8, UTF-16

In Qt 4, I ran into a pitfall because the character set in use was too old: Furnace

These two characters cannot be displayed correctly, and most legacy projects still use GB2312 or GBK.

The initialization code in a Qt main function typically looks like this:

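#if (QT_VERSION < QT_VERSION_CHECK(5, 0, 0))
	// Qt 4 only: setCodecForCStrings() and setCodecForTr() were removed in Qt 5.
	// Requires #include <QTextCodec>.
	// GB2312 and GBK are outdated standards and are no longer recommended.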
	//QTextCodec	*pTextCodec = QTextCodec::codecForName("GB2312");
	//QTextCodec	*pTextCodec = QTextCodec::codecForName("GBK");
	
	QTextCodec	*pTextCodec = QTextCodec::codecForName("GB18030");
	if(NULL != pTextCodec)
	{
		QTextCodec::setCodecForCStrings(pTextCodec);
		QTextCodec::setCodecForLocale(pTextCodec);
		QTextCodec::setCodecForTr(pTextCodec);
	}
#endif

GB2312 encoding: The national standard for Simplified Chinese character encoding, issued in 1980 and in effect from May 1, 1981. GB2312 uses double-byte encoding for Chinese characters and contains 7445 graphic characters, of which 6763 are Chinese characters.
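
The 6763 figure can be checked against the layout of the GB2312 code table. The following is a minimal sketch, assuming the common EUC-CN byte form, in which Chinese characters occupy rows 16 to 87 of the 94 x 94 grid (lead bytes 0xB0 to 0xF7, trail bytes 0xA1 to 0xFE) and the last five cells of row 55 are unassigned. The function names `isGb2312Hanzi` and `countGb2312Hanzi` are illustrative and do not come from Qt or any other library.

```cpp
#include <cstdint>

// Hypothetical helper: true if (lead, trail) is an assigned cell in the
// Chinese-character area of GB2312 (EUC-CN byte form).
bool isGb2312Hanzi(std::uint8_t lead, std::uint8_t trail)
{
	// Rows 16..87 map to lead bytes 0xB0..0xF7.
	if (lead < 0xB0 || lead > 0xF7)
		return false;
	// Each row has 94 cells, mapped to trail bytes 0xA1..0xFE.
	if (trail < 0xA1 || trail > 0xFE)
		return false;
	// Row 55 (lead byte 0xD7) leaves its last five cells unassigned.
	if (lead == 0xD7 && trail > 0xF9)
		return false;
	return true;
}

// Counts every assigned Chinese-character cell by brute force.
int countGb2312Hanzi()
{
	int count = 0;
	for (int lead = 0; lead <= 0xFF; ++lead)
	{
		for (int trail = 0; trail <= 0xFF; ++trail)
		{
			if (isGb2312Hanzi(static_cast<std::uint8_t>(lead),
			                  static_cast<std::uint8_t>(trail)))
				++count;
		}
	}
	return count;
}
```

Calling `countGb2312Hanzi()` gives 72 rows x 94 cells minus the 5 unassigned cells, which matches the 6763 Chinese characters stated above.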

BIG5 encoding: The Traditional Chinese character set commonly used in Taiwan, introduced in 1984. It uses double-byte encoding and contains 13,053 Chinese characters.

GBK encoding: The Chinese character encoding specification released in December 1995, an extension of GB2312 that also uses double-byte encoding for Chinese characters. The GBK character set contains 21,003 Chinese characters, including all CJK (Chinese, Japanese, Korean) Han characters in the national standard GB 13000.1 and all Chinese characters in BIG5.

GB18030 encoding: The national standard for Chinese character encoding released on March 17, 2000, an extension of GBK that covers Chinese, Japanese, Korean, and the scripts of China's minority languages, including 27,484 Chinese characters. GB18030 encodes a character in one of three lengths: one byte, two bytes, or four bytes. It is compatible with the GBK and GB2312 character sets.
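
The three sequence lengths can be told apart from the byte values alone. The following is a minimal sketch of that classification, assuming the usual GB18030 byte ranges: 0x00 to 0x7F for one byte; a lead byte of 0x81 to 0xFE followed by 0x40 to 0x7E or 0x80 to 0xFE for two bytes; and 0x81 to 0xFE, 0x30 to 0x39, 0x81 to 0xFE, 0x30 to 0x39 for four bytes. The function name `gb18030SequenceLength` is hypothetical, and the sketch only checks the byte pattern, not whether the code position is actually assigned to a character.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns 1, 2 or 4 for a well-formed GB18030 byte
// sequence starting at p, or 0 if the bytes are malformed or truncated.
std::size_t gb18030SequenceLength(const std::uint8_t *p, std::size_t avail)
{
	if (NULL == p || 0 == avail)
		return 0;

	// One byte: ASCII range.
	if (p[0] <= 0x7F)
		return 1;

	// 0x80 and 0xFF are not treated as lead bytes in this sketch.
	if (p[0] < 0x81 || p[0] > 0xFE || avail < 2)
		return 0;

	const std::uint8_t second = p[1];

	// Two bytes: same layout as GBK.
	if ((second >= 0x40 && second <= 0x7E) || (second >= 0x80 && second <= 0xFE))
		return 2;

	// Four bytes: the second and fourth bytes are ASCII digits '0'..'9'.
	if (second >= 0x30 && second <= 0x39 && avail >= 4)
	{
		if (p[2] >= 0x81 && p[2] <= 0xFE && p[3] >= 0x30 && p[3] <= 0x39)
			return 4;
	}
	return 0;
}
```

Because the second byte of a four-byte sequence is always a digit, which is never a valid second byte of a two-byte sequence, a decoder can pick the length after reading at most two bytes. This is what keeps GB18030 byte-compatible with existing GBK and GB2312 text.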

Unicode encoding: An international standard character set that assigns a unique code point to every character in the world's languages, so that text can be exchanged across languages and platforms. Unicode itself defines code points (U+0000 to U+10FFFF) rather than a byte layout; the fixed-width UTF-32 form stores each code point in four bytes.

UTF-8 and UTF-16 encoding: Transformation formats of Unicode. Both are variable-length and usually more compact than the four-byte UTF-32 form: UTF-8 uses one to four bytes per character, and UTF-16 uses two or four. UTF-16 comes in two byte orders, big-endian and little-endian.
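
To make the variable-length layout and the two UTF-16 byte orders concrete, here is a minimal sketch that encodes a single code point by hand. The function names `encodeUtf8` and `encodeUtf16` are illustrative, not part of Qt or the standard library, and the sketch assumes its input is a valid code point (at most U+10FFFF and not a surrogate); it performs no validation.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: encodes one code point as 1 to 4 UTF-8 bytes.
std::string encodeUtf8(std::uint32_t cp)
{
	std::string out;
	if (cp <= 0x7F)
	{
		out += static_cast<char>(cp);
	}
	else if (cp <= 0x7FF)
	{
		out += static_cast<char>(0xC0 | (cp >> 6));
		out += static_cast<char>(0x80 | (cp & 0x3F));
	}
	else if (cp <= 0xFFFF)
	{
		out += static_cast<char>(0xE0 | (cp >> 12));
		out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
		out += static_cast<char>(0x80 | (cp & 0x3F));
	}
	else
	{
		out += static_cast<char>(0xF0 | (cp >> 18));
		out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
		out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
		out += static_cast<char>(0x80 | (cp & 0x3F));
	}
	return out;
}

// Hypothetical helper: encodes one code point as 2 or 4 UTF-16 bytes
// in the requested byte order.
std::vector<std::uint8_t> encodeUtf16(std::uint32_t cp, bool bigEndian)
{
	std::vector<std::uint16_t> units;
	if (cp <= 0xFFFF)
	{
		units.push_back(static_cast<std::uint16_t>(cp));
	}
	else
	{
		// Code points above U+FFFF become a surrogate pair.
		cp -= 0x10000;
		units.push_back(static_cast<std::uint16_t>(0xD800 | (cp >> 10)));
		units.push_back(static_cast<std::uint16_t>(0xDC00 | (cp & 0x3FF)));
	}

	std::vector<std::uint8_t> bytes;
	for (std::size_t i = 0; i < units.size(); ++i)
	{
		const std::uint8_t hi = static_cast<std::uint8_t>(units[i] >> 8);
		const std::uint8_t lo = static_cast<std::uint8_t>(units[i] & 0xFF);
		bytes.push_back(bigEndian ? hi : lo);
		bytes.push_back(bigEndian ? lo : hi);
	}
	return bytes;
}
```

For example, U+4E2D encodes to the three UTF-8 bytes E4 B8 AD, and to 4E 2D in big-endian UTF-16 or 2D 4E in little-endian UTF-16. The same code point would take four bytes in UTF-32.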

Origin blog.csdn.net/wangningyu/article/details/128078500