rapidxml cannot open a file whose path contains Chinese characters

The solution first:

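setlocale(LC_ALL, "");        // switch to the system's native locale
rapidxml::file<> f(szPath);   // szPath may contain Chinese characters
setlocale(LC_ALL, "C");       // switch back to the default "C" locale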

Here is what MSDN says about the setlocale function:

char *setlocale(  
   int category,  
   const char *locale   
);  

Parameters

category
The category affected by the locale.

locale
The locale specifier.

Return value

If a valid locale and category are given, setlocale returns a pointer to the string associated with the specified locale and category. If the locale or category is invalid, it returns a null pointer and the program's current locale is not changed.

For example, the call

setlocale( LC_ALL, "en-US" );

sets all categories and returns only the string

en-US

You can copy the string returned by setlocale and pass it to a later call to restore that part of the program's locale information. The string returned by setlocale is kept in global or thread-local storage. Later calls to setlocale overwrite it, which invalidates the string pointers returned by earlier calls.
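The advice about copying the returned string can be sketched as follows. This is only a minimal illustration and not part of the MSDN text; the helper names snapshot_locale and restore_locale are hypothetical.

```cpp
#include <clocale>
#include <string>

// Hypothetical helper: returns a private copy of the current locale string.
// Passing a null pointer as the locale queries the setting without changing it.
std::string snapshot_locale(int category = LC_ALL) {
    const char* current = std::setlocale(category, nullptr);
    // Copy right away: the buffer behind `current` may be overwritten
    // by the next call to setlocale.
    return current ? std::string(current) : std::string();
}

// Hypothetical helper: restores a locale saved with snapshot_locale.
// Returns false if the saved string is empty or setlocale rejects it.
bool restore_locale(const std::string& saved, int category = LC_ALL) {
    if (saved.empty()) return false;  // "" would select the native locale instead
    return std::setlocale(category, saved.c_str()) != nullptr;
}
```

A caller would take the snapshot before switching to setlocale(LC_ALL, "") and pass it to restore_locale afterwards, rather than assuming that the previous locale was "C".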

 

If locale points to an empty string, i.e.

setlocale(LC_ALL, "");

then the locale is the implementation-defined native environment. A value of C specifies the minimal ANSI-conforming environment for C translation. The C locale assumes that all char data types are 1 byte and that their value is always less than 256.

 

Why do you have to call setlocale?

The C/C++ language standard defines the locale in effect at program startup as "C", whose character set is essentially ASCII. Under that locale, mbstowcs treats the string in cstr as single-byte characters and does not recognize it as chs-encoded (Simplified Chinese) text. Each Chinese character is therefore split into 2 bytes that are converted separately, so a string of two Chinese characters comes out as 4 wchar_t characters.

How can mbstowcs be made to work properly? Before calling it, you must explicitly tell it that cstr holds a chs-encoded string, which is done by calling setlocale( LC_ALL, "chs" ). mbstowcs then treats cstr as a Chinese string and converts it into 2 wchar_t characters instead of 4. Note that this call changes the locale of the entire application, so it must be restored afterwards by calling setlocale( LC_ALL, "C" ) again.

Without the locale change, the converted path does not actually exist, so the file cannot be opened, and that is why rapidxml fails with a runtime error.
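The conversion step described here can be tried out with a small helper. This is a sketch under stated assumptions: the function name to_wide is hypothetical, the locale name "chs" is the one used in this article and is specific to the Microsoft C runtime, and how the "C" locale handles non-ASCII bytes differs between runtimes (some map each byte to one wchar_t as described here, others may report a conversion error).

```cpp
#include <clocale>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: converts a multibyte string to a wide string under the
// given locale, then restores the locale that was active before the call.
// Returns false if the locale is unavailable or the bytes are invalid in it.
bool to_wide(const char* locale_name, const std::string& bytes, std::wstring& out) {
    const char* current = std::setlocale(LC_ALL, nullptr);
    std::string saved = current ? current : "C";  // copy before switching
    if (std::setlocale(LC_ALL, locale_name) == nullptr) return false;

    // Every wide character consumes at least one input byte, so the result
    // never needs more elements than the input has bytes (plus a terminator).
    std::vector<wchar_t> buffer(bytes.size() + 1);
    std::size_t n = std::mbstowcs(buffer.data(), bytes.c_str(), buffer.size());

    std::setlocale(LC_ALL, saved.c_str());  // undo the process-wide change
    if (n == static_cast<std::size_t>(-1)) return false;
    out.assign(buffer.data(), n);
    return true;
}
```

Calling to_wide once with "C" and once with the Chinese locale on the same path bytes, and comparing the length of the two results, is a quick way to check whether the locale is the cause of a failed file open.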

Remember: after calling setlocale(LC_ALL, ""), you need to switch back to the original locale with setlocale( LC_ALL, "C" ) at an appropriate place, otherwise other errors may occur.
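One way to make sure the locale is always switched back is a small scope guard. The class below is a hypothetical sketch, not something rapidxml provides; it restores whatever locale was active before, which is "C" only if nothing else changed it.

```cpp
#include <clocale>
#include <stdexcept>
#include <string>

// Hypothetical RAII guard: switches the locale in its constructor and puts the
// previous one back in its destructor, even when an exception is thrown.
class ScopedLocale {
public:
    explicit ScopedLocale(const char* name, int category = LC_ALL)
        : category_(category) {
        const char* current = std::setlocale(category, nullptr);
        if (current != nullptr) saved_ = current;  // copy before switching
        // On failure setlocale leaves the locale unchanged, so nothing to undo.
        if (std::setlocale(category, name) == nullptr)
            throw std::runtime_error("setlocale rejected the requested locale");
    }
    ~ScopedLocale() {
        if (!saved_.empty()) std::setlocale(category_, saved_.c_str());
    }
    ScopedLocale(const ScopedLocale&) = delete;
    ScopedLocale& operator=(const ScopedLocale&) = delete;

private:
    int category_;
    std::string saved_;
};
```

With such a guard, the three lines at the top of this article would become a block containing ScopedLocale guard(""); followed by rapidxml::file<> f(szPath);. If opening the file throws, the destructor still runs and restores the locale, whereas a plain setlocale( LC_ALL, "C" ) placed after the file is opened would be skipped.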
