Original article. Please credit the source when reprinting.
There are several ways to convert between multibyte strings (char*) and wide-character strings (wchar_t*). They are introduced below one by one, from the most general to the most specialized.
1. C library functions
1.1 Key functions
(1) setlocale()
Purpose: configures the program's locale (localization settings)
Header file: <locale.h>
Function prototype:
char *setlocale(int category, const char *locale);
Function parameters:
category: selects which part of the locale to set. It can take the following values:
LC_ALL All of the categories below
LC_COLLATE String comparison (collation), used by strcoll()
LC_CTYPE Character classification and conversion, such as toupper(), and the multibyte/wide-character conversion functions
LC_MONETARY Currency formatting
LC_NUMERIC Number formatting, such as the decimal-point character
LC_TIME Time and date formatting, used by strftime()
locale: the name of the locale to select
Return value: if locale is NULL, the locale is not changed and the current locale name is returned (the default is "C"). If locale is not NULL, the locale is set according to category and locale; the new locale name is returned on success, and NULL on failure.
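The return-value rules above suggest a save, switch, and restore pattern. The following is a minimal sketch of it, not code from the original example; the helper names locale_is_available and current_locale_name are hypothetical and chosen only for illustration.

```cpp
#include <clocale>
#include <string>

// Hypothetical helper: returns the name of the locale currently in
// effect for LC_ALL. Passing a null pointer only queries the locale.
std::string current_locale_name() {
    const char* name = std::setlocale(LC_ALL, nullptr);
    return name ? name : "";
}

// Hypothetical helper: tries to switch LC_ALL to `name`, reports whether
// that worked, and restores the previous locale in either case.
bool locale_is_available(const char* name) {
    // Copy the current name before changing anything: the pointer
    // returned by setlocale() may be invalidated by a later call.
    std::string saved = current_locale_name();
    bool ok = std::setlocale(LC_ALL, name) != nullptr;
    std::setlocale(LC_ALL, saved.c_str());
    return ok;
}
```

A caller could check locale_is_available("chs") before relying on that locale. Which locale names are accepted depends on the platform and C runtime.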
(2) wcstombs_s()
Purpose: converts a wide-character string into a multibyte string (the security-enhanced version provided by the Microsoft C runtime)
Header file: <stdlib.h>
Function prototype:
errno_t __cdecl wcstombs_s(size_t * _PtNumOfCharConverted, char * _Dst, size_t _DstSizeInBytes, const wchar_t * _Src, size_t _MaxCountInBytes);
Function parameters:
_PtNumOfCharConverted: receives the length of the converted string in bytes, including the terminator
_Dst: address of the buffer that receives the converted multibyte string
_DstSizeInBytes: size of the destination buffer, in bytes
_Src: address of the source wide-character string
_MaxCountInBytes: the maximum number of bytes to store in the multibyte buffer, not counting the terminator; pass _TRUNCATE to convert as much as fits and truncate the rest
Return value: 0 on success, an error code on failure
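For comparison, here is a hedged sketch of the same wide-to-multibyte conversion using the standard wcstombs() instead of the Microsoft-specific wcstombs_s(), so that it also compiles outside MSVC. The function name wide_to_multibyte is hypothetical. The result depends on the current LC_CTYPE locale, which the caller is assumed to have set already.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper, not part of the original example.
// Converts `ws` to a multibyte string using the current LC_CTYPE locale.
// Returns false if a character cannot be represented in that locale.
bool wide_to_multibyte(const std::wstring& ws, std::string& out) {
    // Worst case: every wide character expands to MB_CUR_MAX bytes,
    // plus one byte for the terminating '\0'.
    std::vector<char> buf(ws.size() * MB_CUR_MAX + 1);
    std::size_t written = std::wcstombs(buf.data(), ws.c_str(), buf.size());
    if (written == static_cast<std::size_t>(-1)) {
        return false;  // conversion error
    }
    // Unlike wcstombs_s(), the count returned by wcstombs() does not
    // include the terminator.
    out.assign(buf.data(), written);
    return true;
}
```

Sizing the buffer from MB_CUR_MAX avoids hard-coding an assumption about how many bytes one character can occupy in the active locale.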
(3) mbstowcs_s()
Purpose: converts a multibyte string into a wide-character string (the security-enhanced version provided by the Microsoft C runtime)
Header file: <stdlib.h>
Function prototype:
errno_t __cdecl mbstowcs_s(size_t * _PtNumOfCharConverted, wchar_t * _DstBuf, size_t _SizeInWords, const char * _SrcBuf, size_t _MaxCount );
Function parameters:
_PtNumOfCharConverted: receives the length of the converted string in wchar_t units, including the terminator
_DstBuf: address of the buffer that receives the converted wide-character string
_SizeInWords: size of the destination buffer, in wchar_t units
_SrcBuf: address of the source multibyte string
_MaxCount: the maximum number of wide characters to store in the wide-character buffer, not counting the terminator; pass _TRUNCATE to convert as much as fits and truncate the rest
Return value: 0 on success, an error code on failure
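The reverse direction can be sketched in the same way with the standard mbstowcs(), again as a portable stand-in for the Microsoft-specific mbstowcs_s(). The function name multibyte_to_wide is hypothetical, and the input is assumed to be encoded according to the current LC_CTYPE locale.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper, not part of the original example.
// Converts `s` to a wide string using the current LC_CTYPE locale.
// Returns false if `s` contains an invalid multibyte sequence.
bool multibyte_to_wide(const std::string& s, std::wstring& out) {
    // Each wide character consumes at least one input byte, so
    // s.size() + 1 elements (one for the terminator) always suffice.
    std::vector<wchar_t> buf(s.size() + 1);
    std::size_t written = std::mbstowcs(buf.data(), s.c_str(), buf.size());
    if (written == static_cast<std::size_t>(-1)) {
        return false;  // invalid multibyte sequence
    }
    // The count returned by mbstowcs() does not include the terminator.
    out.assign(buf.data(), written);
    return true;
}
```

For a pure ASCII input such as "123ABC", the wide result has the same number of characters as the input has bytes. For double-byte text it has fewer.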
1.2 Conversion example
The example is written in C++ so that string and wstring can be used as parameters and return values. Internally, string is still converted to char* and wstring to wchar_t*.
If you need a C version, extract just the char*/wchar_t* conversion code from inside the functions.
#include <iostream>
#include <locale.h>
#include <stdlib.h>   // wcstombs_s, mbstowcs_s, MB_CUR_MAX
#include <string>
#include <vector>
using namespace std;

// Convert a wide string to a multibyte string using the "chs" locale.
string ws2s(const wstring& ws)
{
    size_t convertedChars = 0;
    string curLocale = setlocale(LC_ALL, NULL);  // save the current locale ("C" by default)
    setlocale(LC_ALL, "chs");
    // One wide character expands to at most MB_CUR_MAX bytes; +1 for the terminator.
    size_t dByteNum = MB_CUR_MAX * ws.size() + 1;
    cout << "ws.size():" << ws.size() << endl;  // 12 for the string used in main()
    vector<char> dest(dByteNum);
    wcstombs_s(&convertedChars, dest.data(), dByteNum, ws.c_str(), _TRUNCATE);
    cout << "convertedChars:" << convertedChars << endl;  // 13, terminator included
    string result = dest.data();
    setlocale(LC_ALL, curLocale.c_str());  // restore the original locale
    return result;
}

// Convert a multibyte string to a wide string using the "chs" locale.
wstring s2ws(const string& s)
{
    size_t convertedChars = 0;
    string curLocale = setlocale(LC_ALL, NULL);  // save the current locale ("C" by default)
    setlocale(LC_ALL, "chs");
    // Each wide character consumes at least one byte; +1 for the terminator.
    size_t charNum = s.size() + 1;
    cout << "s.size():" << s.size() << endl;  // 12 for the string used in main()
    vector<wchar_t> dest(charNum);
    mbstowcs_s(&convertedChars, dest.data(), charNum, s.c_str(), _TRUNCATE);
    cout << "s2ws_convertedChars:" << convertedChars << endl;  // 13, terminator included
    wstring result = dest.data();
    setlocale(LC_ALL, curLocale.c_str());  // restore the original locale
    return result;
}

int main()
{
    const wchar_t* pwstr = L"123ABC Hello";
    string obj = ws2s(pwstr);
    cout << obj << endl;  // Output: 123ABC Hello
    const char* pstr = "123ABC Hello";
    wstring objw = s2ws(pstr);
    setlocale(LC_ALL, "chs");  // so that wcout can output Chinese
    // or: wcout.imbue(locale("chs"));
    wcout << objw << endl;  // Output: 123ABC Hello
    return 0;
}
2. Windows API