Day 02 Qt study notes: Chinese characters and garbled text

Character sets

The ASCII character set uses 7 bits and defines 128 characters. In standard ASCII the highest bit (b7) is used as a parity bit, so any data placed in that bit is lost.

ISO-8859-1 extends ASCII and uses the values 128-255 for additional Latin characters, so most Western countries can work with this character set.

ANSI (American National Standards Institute)

It defines a multi-byte character set (MBCS, Multi-Byte Character Set): a value in 0-127 still represents one character in a single byte, and other characters are represented by 2 bytes.

 

The GB2312 and GBK encodings are built on the multi-byte character set.

GB2312 and GBK are still ANSI encodings: a byte greater than 127 starts a two-byte sequence that represents one Chinese character. GB2312 can represent 6,763 commonly used Chinese characters. GBK extends GB2312 and can represent 21,003 Chinese characters.
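The one-byte versus two-byte rule can be sketched in plain C++. This is a minimal illustration, not a validating decoder: `countGbkChars` is a hypothetical helper name, and it assumes the input is well-formed GB2312/GBK, so every byte above 127 is treated as the lead byte of a two-byte character.

```cpp
#include <cstddef>
#include <string>

// Count the characters in a GB2312/GBK byte string.
// Assumption: the input is well formed; no range checks are done.
std::size_t countGbkChars(const std::string &bytes)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < bytes.size(); ++count) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        // A byte above 127 with a byte after it starts a two-byte character;
        // anything else is a single-byte (ASCII) character.
        i += (b > 127 && i + 1 < bytes.size()) ? 2 : 1;
    }
    return count;
}
```

For instance, the six bytes 61 D6 D0 62 CE C4 (two ASCII letters and two Chinese characters in GBK) count as four characters, which is why a byte count is not a character count in these encodings.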

 

UTF-8 is a variable-length encoding whose single-byte codes are identical to ASCII. For a symbol of n bytes (n > 1), the first n bits of the first byte are 1 and bit n + 1 is 0; every following byte starts with the two bits 10.

UTF-8: 1 to 4 bytes per character

UTF-16: 2 or 4 bytes per character

UTF-32: 4 bytes per character
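The UTF-8 bit layout is easier to follow in code. The sketch below encodes a single code point by hand; `encodeUtf8` is a hypothetical helper written for illustration (in Qt code you would normally rely on QString's own conversions), and it does not reject surrogate values.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Encode one Unicode code point as UTF-8 (1 to 4 bytes).
std::string encodeUtf8(std::uint32_t cp)
{
    assert(cp <= 0x10FFFF);       // largest valid Unicode code point
    std::string out;
    if (cp < 0x80) {              // 1 byte:  0xxxxxxx (same as ASCII)
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {      // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {    // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                      // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

As a check, U+4E2D (a common Chinese character) falls in the three-byte range and comes out as E4 B8 AD, while U+0041 stays the single byte 41.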

In Qt programming, QString stores its text in memory as UTF-16. On a Chinese-language system the default local encoding is GB2312 / GBK.
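To see why UTF-16 needs either 2 or 4 bytes, here is a rough sketch of the encoding step in plain C++. `encodeUtf16` is a hypothetical helper, not a Qt function; it assumes the input is a valid code point and not itself a surrogate.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode one Unicode code point as one or two 16-bit UTF-16 code units.
std::vector<std::uint16_t> encodeUtf16(std::uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    std::vector<std::uint16_t> units;
    if (cp < 0x10000) {
        // Basic Multilingual Plane: the code point is the code unit.
        units.push_back(static_cast<std::uint16_t>(cp));
    } else {
        // Supplementary planes: split 20 bits into a surrogate pair.
        cp -= 0x10000;
        units.push_back(static_cast<std::uint16_t>(0xD800 | (cp >> 10)));   // high surrogate
        units.push_back(static_cast<std::uint16_t>(0xDC00 | (cp & 0x3FF))); // low surrogate
    }
    return units;
}
```

Most Chinese characters fit in one unit (U+4E2D becomes 4E2D), while a code point such as U+1F600 becomes the pair D83D DE00. Since a QString is a sequence of 16-bit units, such a character occupies two of them.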

 

Byte order and the BOM

LE (little endian): the low-order byte is stored at the lower address.

BE (big endian): the high-order byte is stored at the lower address.

BOM (byte order mark): a marker placed at the very start of a text file.

FE FF denotes BE, i.e. big endian

FF FE denotes LE, i.e. little endian
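A reader can use the two BOM bytes to decide how to assemble the 16-bit units that follow. The sketch below shows one way to do that in plain C++; `ByteOrder`, `detectUtf16Bom` and `readUnit` are hypothetical names, and only the two UTF-16 marks listed above are recognized.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

enum class ByteOrder { Unknown, BigEndian, LittleEndian };

// Inspect the first two bytes of a buffer for a UTF-16 byte order mark.
ByteOrder detectUtf16Bom(const std::string &data)
{
    if (data.size() < 2)
        return ByteOrder::Unknown;
    const unsigned char b0 = static_cast<unsigned char>(data[0]);
    const unsigned char b1 = static_cast<unsigned char>(data[1]);
    if (b0 == 0xFE && b1 == 0xFF) return ByteOrder::BigEndian;
    if (b0 == 0xFF && b1 == 0xFE) return ByteOrder::LittleEndian;
    return ByteOrder::Unknown;
}

// Combine two bytes, given in file (address) order, into one code unit.
std::uint16_t readUnit(unsigned char first, unsigned char second, ByteOrder order)
{
    assert(order != ByteOrder::Unknown);
    return order == ByteOrder::BigEndian
        ? static_cast<std::uint16_t>((first << 8) | second)
        : static_cast<std::uint16_t>((second << 8) | first);
}
```

With a little-endian file the bytes 2D 4E are read as the unit 4E2D; with a big-endian file the same character is stored as 4E 2D.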

 

QString

Internally, QString stores the string as 16-bit QChars (ushort, Unicode 4.0). When a string is passed to an interface outside Qt, pay attention to encoding conversion.

QString encapsulates common string handling:

Empty check: == "", isNull(), isEmpty()

String concatenation: +=

String formatting: %1, %2 with arg()

For example, formatting:

    QString ssr;

    ssr = QString("name=%1 %2 %3 %4 %5")

            .arg("xiaoming")

            .arg(15)

            .arg(14.5)

            .arg(123,0,2)

            .arg(255,8,16);

    qDebug() << ssr;

 

QString's find and replace functions also support regular expressions.

 

 

Garbled Chinese text in Qt

 

Garbled text appears when the default character set, or the codec in use, does not match the encoding of the data source.
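A small plain C++ sketch makes the mismatch concrete. It deliberately misreads UTF-8 bytes as ISO-8859-1 (one character per byte) and re-encodes the result as UTF-8, which is a typical way garbled text is produced. `latin1ToUtf8` is a hypothetical helper used only for this illustration.

```cpp
#include <string>

// Treat every input byte as one ISO-8859-1 character and re-encode it as UTF-8.
// Feeding UTF-8 data into this function simulates decoding with the wrong codec.
std::string latin1ToUtf8(const std::string &bytes)
{
    std::string out;
    for (unsigned char b : bytes) {
        if (b < 0x80) {
            out += static_cast<char>(b);                  // ASCII is unchanged
        } else {
            out += static_cast<char>(0xC0 | (b >> 6));    // 110xxxxx
            out += static_cast<char>(0x80 | (b & 0x3F));  // 10xxxxxx
        }
    }
    return out;
}
```

The three UTF-8 bytes E4 B8 AD encode one Chinese character, but after the wrong decoding they become three separate Latin characters (C3 A4, C2 B8, C2 AD). ASCII text passes through unchanged, which is why the problem often only shows up with Chinese text.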

Setting the character set in Qt:

    QTextCodec *codec = QTextCodec::codecForName("UTF-8");

    QTextCodec::setCodecForLocale(codec);

    // list the codecs that Qt supports

    QTextCodec::availableCodecs();

Setting the locale codec only affects QString::fromLocal8Bit and toLocal8Bit.

For example:

    const char *src = "Metadata Chinese GBK";

    // the source data is multi-byte GBK or GB2312; store it in a QString

    // the default local encoding is GBK

    QString str1 = QString::fromLocal8Bit(src);

    qDebug() << str1;

 

    // convert the QString back to GBK and print it

    cout << str1.toLocal8Bit().toStdString() << endl;

 

    // change the local encoding to UTF-8

    QTextCodec::setCodecForLocale(QTextCodec::codecForName("UTF-8"));

    QString str2 = QString::fromLocal8Bit(str1.toUtf8());

    // equivalent to: QString str2 = QString::fromLocal8Bit(str1.toStdString().c_str());

 

    // on Windows, pass a QString as an argument to a Win32 API

    MessageBox(0, str2.toStdWString().c_str(), L"Chinese title q", 0);

 

 

Source file character set (Visual Studio and Qt Creator are configured differently).

 

Qt Creator file encoding: Tools -> Options -> Text Editor -> Behavior

 

In VS you can either save the file in UTF-8 encoding, or add

#pragma execution_character_set("utf-8") to declare the encoding used for the strings in the code.

Note that if the file is already saved as UTF-8 and the #pragma is added as well, the text becomes garbled again.

Setting the character set in the VS project properties has no effect here.

 

The QStringLiteral macro performs the character set conversion at compile time, turning a string literal into QString's UTF-16 data.


Origin www.cnblogs.com/merlinzjl/p/11391860.html