Visual C ++. NET string conversion methods (reprint)

Visual C ++. NET string conversion methods [1]

2002-12-06 14:48:39

Ding and

   Visual C ++. NET involves the ATL / ATL Server, MFC and Managed C ++ and other programming, not only powerful and widely used. In programming, we often encounter ANSI, Unicode and encoding of different types of BSTR string conversion operations. This article first describes the basic string type, then the description related classes, such as CComBSTR, _bstr_t, CStringT, and finally discuss their conversion method further comprising using the latest conversion ATL7.0 classes and macros, such as CA2CT, CA2TEX like.

  A, BSTR, LPSTR and LPWSTR

  In Visual C ++. NET way all programming, we often use such a basic string types, such as BSTR, LPSTR LPWSTR and so on. These data types similar to the above has arisen, because the data exchange between different programming languages, and support for ANSI, Unicode and multi-byte character sets (MBCS) a.

  So what is BSTR, LPSTR and LPWSTR it?

   BSTR (Basic STRing, Basic string) is a Unicode string OLECHAR * type. It is described as a type compatible with automation. Because the operating system to provide the appropriate API function (e.g., the SysAllocString) to manage its scheduling and some default code, a COM actually BSTR string, but it is widely used in a variety of applications beyond the automation technology. 1 depicts the structure of a BSTR, which is the number of bytes actually DWORD value string occupied, and its value is twice the Unicode character string.

   LPWSTR and LPSTR data type is a string and VC ++ Win32 used. LPSTR is defined to be a pointer to NULL ( '\ 0') at the end of 8 ANSI character array pointer, pointing LPWSTR is a NULL-terminated array of pointers 16 double-byte characters. In VC ++, as well as similar types of strings, such as LPTSTR, LPCTSTR the like, the meanings thereof as shown in FIG.

  For example, LPCTSTR means "long pointer to a constant generic string", it means "one point generally constant length string pointer type", and C / C ++ const char * is mapped, the mapped LPTSTR char *.

  In general, there are definitions of the following types:

#ifdef UNICODE
 typedef LPWSTR LPTSTR;
 typedef LPCWSTR LPCTSTR;
#else
 typedef LPSTR LPTSTR;
 typedef LPCSTR LPCTSTR;
#endif 

  二、CString、CStringA 和 CStringW

   Visual C ++. NET will CStringT as ATL and MFC shared "General" string class, it has CString, CStringA CStringW and three forms of different operation types of strings of characters. These character type is TCHAR, char and wchar_t. In Unicode TCHAR platform equivalent to WCHAR (16-bit Unicode character), is equivalent to the ANSI char. wchar_t is usually defined as unsigned short. Since CString often used in MFC applications are not repeated here.

  Three, VARIANT, COleVariant and _variant_t

  在OLE、ActiveX和COM中,VARIANT数据类型提供了一种非常有效的机制,由于它既包含了数据本身,也包含了数据的类型,因而它可以实现各种不同的自动化数据的传输。下面让我们来看看OAIDL.H文件中VARIANT定义的一个简化版:

struct tagVARIANT {
 VARTYPE vt;
 union {
  short iVal; // VT_I2.
  long lVal; // VT_I4.
  float fltVal; // VT_R4.
  double dblVal; // VT_R8.
  DATE date; // VT_DATE.
  BSTR bstrVal; // VT_BSTR.
  …
  short * piVal; // VT_BYREF|VT_I2.
  long * plVal; // VT_BYREF|VT_I4.
  float * pfltVal; // VT_BYREF|VT_R4.
  double * pdblVal; // VT_BYREF|VT_R8.
  DATE * pdate; // VT_BYREF|VT_DATE.
  BSTR * pbstrVal; // VT_BYREF|VT_BSTR.
 };
};

  显然,VARIANT类型是一个C结构,它包含了一个类型成员vt、一些保留字节以及一个大的union类型。例如,如果vt为VT_I2,那么我们可以从iVal中读出VARIANT的值。同样,当给一个VARIANT变量赋值时,也要先指明其类型。例如:

VARIANT va;
:: VariantInit(&va); // 初始化
int a = 2002;
va.vt = VT_I4; // 指明long数据类型
va.lVal = a; // 赋值

  为了方便处理VARIANT类型的变量,Windows还提供了这样一些非常有用的函数:

  VariantInit —— 将变量初始化为VT_EMPTY;

  VariantClear —— 消除并初始化VARIANT;

  VariantChangeType —— 改变VARIANT的类型;

  VariantCopy —— 释放与目标VARIANT相连的内存并复制源VARIANT。

   COleVariant类是对VARIANT结构的封装。它的构造函数具有极为强大大的功能,当对象构造时首先调用VariantInit进行初始化, 然后根据参数中的标准类型调用相应的构造函数,并使用VariantCopy进行转换赋值操作,当VARIANT对象不在有效范围时,它的析构函数就会被 自动调用,由于析构函数调用了VariantClear,因而相应的内存就会被自动清除。除此之外,COleVariant的赋值操作符在与 VARIANT类型转换中为我们提供极大的方便。例如下面的代码:

COleVariant v1("This is a test"); // 直接构造
COleVariant v2 = "This is a test";
// 结果是VT_BSTR类型,值为"This is a test"
COleVariant v3((long)2002);
COleVariant v4 = (long)2002;
// 结果是VT_I4类型,值为2002

  _variant_t是一个用于COM的VARIANT类,它的功能与COleVariant相似。不过在Visual C++.NET的MFC应用程序中使用时需要在代码文件前面添加下列两句:

  #include "comutil.h"

  #pragma comment( lib, "comsupp.lib" )

  四、CComBSTR和_bstr_t

  CComBSTR是对BSTR数据类型封装的一个ATL类,它的操作比较方便。例如:

CComBSTR bstr1;
bstr1 = "Bye"; // 直接赋值
OLECHAR* str = OLESTR("ta ta"); // 长度为5的宽字符
CComBSTR bstr2(wcslen(str)); // 定义长度为5
wcscpy(bstr2.m_str, str); // 将宽字符串复制到BSTR中
CComBSTR bstr3(5, OLESTR("Hello World"));
CComBSTR bstr4(5, "Hello World");
CComBSTR bstr5(OLESTR("Hey there"));
CComBSTR bstr6("Hey there");
CComBSTR bstr7(bstr6);
// 构造时复制,内容为"Hey there"

  _bstr_t是是C++对BSTR的封装,它的构造和析构函数分别调用SysAllocString和SysFreeString函数,其他操作是借用BSTR API函数。与_variant_t相似,使用时也要添加comutil.h和comsupp.lib。

五、BSTR、char*和CString转换

  (1) char*转换成CString

  若将char*转换成CString,除了直接赋值外,还可使用CString::Format进行。例如:

char chArray[] = "This is a test";
char * p = "This is a test";

  或

LPSTR p = "This is a test";

  或在已定义Unicode应的用程序中

TCHAR * p = _T("This is a test");

  或

LPTSTR p = _T("This is a test");
CString theString = chArray;
theString.Format(_T("%s"), chArray);
theString = p;

  (2) CString转换成char*

  若将CString类转换成char*(LPSTR)类型,常常使用下列三种方法:

  方法一,使用强制转换。例如:

CString theString( "This is a test" );
LPTSTR lpsz =(LPTSTR)(LPCTSTR)theString; 

  方法二,使用strcpy。例如:

CString theString( "This is a test" );
LPTSTR lpsz = new TCHAR[theString.GetLength()+1];
_tcscpy(lpsz, theString);

  需要说明的是,strcpy(或可移值Unicode/MBCS的_tcscpy)的第二个参数是 const wchar_t* (Unicode)或const char* (ANSI),系统编译器将会自动对其进行转换。

  方法三,使用CString::GetBuffer。例如:

CString s(_T("This is a test "));
LPTSTR p = s.GetBuffer();
// 在这里添加使用p的代码
if(p != NULL) *p = _T('\0');
s.ReleaseBuffer();
// 使用完后及时释放,以便能使用其它的CString成员函数

  (3) BSTR转换成char*

  方法一,使用ConvertBSTRToString。例如:

#include
#pragma comment(lib, "comsupp.lib")
int _tmain(int argc, _TCHAR* argv[]){
BSTR bstrText = ::SysAllocString(L"Test");
char* lpszText2 = _com_util::ConvertBSTRToString(bstrText);
SysFreeString(bstrText); // 用完释放
delete[] lpszText2;
return 0;

  方法二,使用_bstr_t的赋值运算符重载。例如:

_bstr_t b = bstrText;
char* lpszText2 = b;

  (4) char*转换成BSTR

  方法一,使用SysAllocString等API函数。例如:

BSTR bstrText = ::SysAllocString(L"Test");
BSTR bstrText = ::SysAllocStringLen(L"Test",4);
BSTR bstrText = ::SysAllocStringByteLen("Test",4);

  方法二,使用COleVariant或_variant_t。例如:

//COleVariant strVar("This is a test");
_variant_t strVar("This is a test");
BSTR bstrText = strVar.bstrVal;

  方法三,使用_bstr_t,这是一种最简单的方法。例如:

BSTR bstrText = _bstr_t("This is a test");

  方法四,使用CComBSTR。例如:

BSTR bstrText = CComBSTR("This is a test");

  或

CComBSTR bstr("This is a test");
BSTR bstrText = bstr.m_str;

  方法五,使用ConvertStringToBSTR。例如:

char* lpszText = "Test";
BSTR bstrText = _com_util::ConvertStringToBSTR(lpszText);

  (5) CString转换成BSTR

  通常是通过使用CStringT::AllocSysString来实现。例如:

CString str("This is a test");
BSTR bstrText = str.AllocSysString();

SysFreeString(bstrText); // 用完释放 

  (6) BSTR转换成CString

  一般可按下列方法进行:

BSTR bstrText = ::SysAllocString(L"Test");
CStringA str;
str.Empty();
str = bstrText; 

  或

CStringA str(bstrText);

  (7) ANSI、Unicode和宽字符之间的转换

  方法一,使用MultiByteToWideChar将ANSI字符转换成Unicode字符,使用WideCharToMultiByte将Unicode字符转换成ANSI字符。

  方法二,使用“_T”将ANSI转换成“一般”类型字符串,使用“L”将ANSI转换成Unicode,而在托管C++环境中还可使用S将ANSI字符串转换成String*对象。例如:

TCHAR tstr[] = _T("this is a test");
wchar_t wszStr[] = L"This is a test";
String* str = S”This is a test”;

  方法三,使用ATL 7.0的转换宏和类。ATL7.0在原有3.0基础上完善和增加了许多字符串转换宏以及提供相应的类,它具有如图3所示的统一形式:

   其中,第一个C表示“类”,以便于ATL 3.0宏相区别,第二个C表示常量,2表示“to”,EX表示要开辟一定大小的缓冲。SourceType和DestinationType可以是A、 T、W和OLE,其含义分别是ANSI、Unicode、“一般”类型和OLE字符串。例如,CA2CT就是将ANSI转换成一般类型的字符串常量。下面 是一些示例代码:

LPTSTR tstr= CA2TEX<16>("this is a test");
LPCTSTR tcstr= CA2CT("this is a test");
wchar_t wszStr[] = L"This is a test";
char* chstr = CW2A(wszStr); 

  六、结语

  几乎所有的程序都要用到字符串,而Visual C++.NET由于功能强大、应用广泛,因而字符串之间的转换更为频繁。本文几乎涉及到目前的所有转换方法。当然对于.NET框架来说,还可使用Convert和Text类进行不同数据类型以及字符编码之间的相互转换。

转载于:https://www.cnblogs.com/userinterface/archive/2005/03/30/128617.html

Guess you like

Origin blog.csdn.net/weixin_33998125/article/details/93668147