[Character encoding] C++ encoding formats and conversion

The difference between wstring and string
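wstring and string are two string types in the C++ standard library. string is a sequence of char elements and wstring is a sequence of wchar_t elements, so they differ mainly in the character encodings they usually hold and in how much storage each element takes.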

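  1. Character encoding:

    • string stores char elements of one byte each. The type itself does not fix an encoding; the bytes typically hold ASCII or a byte-oriented encoding such as UTF-8 or a local code page.
    • wstring stores wchar_t elements and typically holds UTF-16 (on Windows, where wchar_t is 16 bits) or UTF-32 (on most Linux and macOS toolchains, where wchar_t is 32 bits). One element covers far more characters than one byte can, which makes wstring convenient for processing multilingual text, especially text containing non-Latin characters.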
  2. Storage method:

    • string uses one byte per element. An ASCII character occupies exactly one byte, which makes string compact for English text. In UTF-8, a non-ASCII character spans two to four bytes, so size() counts bytes rather than characters.
    • wstring uses one wchar_t per element, which occupies two or four bytes depending on the platform. Most characters then fit in a single element (with a 16-bit wchar_t, characters outside the Basic Multilingual Plane need a surrogate pair), at the cost of more memory for text that is mostly ASCII.
  3. Use cases:

    • Generally, if the text you are dealing with mainly contains English characters, string is appropriate because it uses less memory.
    • If you need to process multilingual text character by character, especially text containing non-Latin characters, or need to call Windows APIs that take wide strings, wstring may be more appropriate.
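
The storage difference can be checked directly. The following is a minimal sketch; the helper names narrow_bytes and wide_bytes are hypothetical and only used for illustration. The sample text is written with escape sequences so the result does not depend on the encoding of the source file.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: bytes used by the character data of a narrow string.
// sizeof(char) is always 1, so this equals s.size().
std::size_t narrow_bytes(const std::string& s) {
    return s.size() * sizeof(char);
}

// Hypothetical helper: bytes used by the character data of a wide string.
// sizeof(wchar_t) is 2 on Windows and 4 on most Linux/macOS toolchains.
std::size_t wide_bytes(const std::wstring& w) {
    return w.size() * sizeof(wchar_t);
}

// Two CJK characters (U+4F60, U+597D) as explicit UTF-8 bytes:
// each one takes three bytes, so size() reports 6 elements.
const std::string kUtf8Sample = "\xE4\xBD\xA0\xE5\xA5\xBD";

// The same two characters as a wide string: both lie in the Basic
// Multilingual Plane, so size() reports 2 elements on either platform.
const std::wstring kWideSample = L"\u4F60\u597D";
```

For the sample above, narrow_bytes(kUtf8Sample) is 6, while wide_bytes(kWideSample) is 4 or 8 depending on whether wchar_t is two or four bytes. For plain ASCII text such as "hello", string needs 5 bytes and wstring needs 10 or 20.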

Note that neither string nor wstring records which encoding it holds, so you need to keep encodings consistent yourself to avoid garbled characters. For conversion between wstring and string, C++11 introduced std::wstring_convert (used together with facets such as std::codecvt_utf8), but it has been deprecated since C++17. Newer code therefore often relies on platform APIs such as MultiByteToWideChar and WideCharToMultiByte on Windows, on a library such as ICU, or on a hand-written converter.
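
As an illustration of what such a conversion involves, here is a minimal hand-written sketch that converts between a UTF-8 string and a wstring without std::wstring_convert. The function names are hypothetical, the string is assumed to hold UTF-8, and validation is deliberately minimal (overlong encodings and out-of-range values are not rejected), so treat it as a teaching example rather than a production converter.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode UTF-8 bytes into Unicode code points.
std::u32string utf8_to_utf32(const std::string& in) {
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t extra;  // number of continuation bytes that follow
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + extra >= in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}

// Hypothetical helper: UTF-8 string -> wstring (UTF-16 if wchar_t is
// 16 bits, UTF-32 otherwise).
std::wstring utf8_to_wstring(const std::string& in) {
    std::wstring out;
    for (char32_t cp : utf8_to_utf32(in)) {
        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            cp -= 0x10000;  // split into a UTF-16 surrogate pair
            out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<wchar_t>(cp));
        }
    }
    return out;
}

// Hypothetical helper: wstring -> UTF-8 string.
std::string wstring_to_utf8(const std::wstring& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = static_cast<char32_t>(in[i]);
        // With a 16-bit wchar_t, merge a surrogate pair into one code point.
        if (sizeof(wchar_t) == 2 && cp >= 0xD800 && cp <= 0xDBFF &&
            i + 1 < in.size()) {
            char32_t low = static_cast<char32_t>(in[i + 1]);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++i;
            }
        }
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

A round trip such as wstring_to_utf8(utf8_to_wstring(s)) should return the original bytes for well-formed UTF-8 input. If the narrow string holds a code page other than UTF-8, this sketch does not apply and a platform or library conversion is needed instead.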

Origin blog.csdn.net/qq_30340349/article/details/133854843