V745. A 'wchar_t *' type string is incorrectly converted to 'BSTR' type string.


The analyzer detected that a string of type "wchar_t *" is handled as a string of type BSTR. It is very strange, and this code is very likely to be incorrect. To figure out why such string handling is dangerous, let's first recall what the BSTR type is.

Actually, we will quote the article from MSDN. I know, people don't like reading MSDN documentation, but we'll have to. We need to understand the danger behind errors of this type - and diagnostic V745 does indicate serious errors in most cases.

typedef wchar_t OLECHAR;
typedef OLECHAR * BSTR;

A BSTR (Basic string or binary string) is a string data type that is used by COM, Automation, and Interop functions. Use the BSTR data type in all interfaces that will be accessed from script.

  • Length prefix. A four-byte integer that contains the number of bytes in the following data string. It appears immediately before the first character of the data string. This value does not include the terminating null character.
  • Data string. A string of Unicode characters. May contain multiple embedded null characters.
  • Terminator. Two null characters.

A BSTR is a pointer. The pointer points to the first character of the data string, not to the length prefix.

BSTRs are allocated using COM memory allocation functions, so they can be returned from methods without concern for memory allocation.

The following code is incorrect:

BSTR MyBstr = L"I am a happy BSTR";

This code builds (compiles and links) correctly, but it will not function properly because the string does not have a length prefix. If you use a debugger to examine the memory location of this variable, you will not see a four-byte length prefix preceding the data string.

Instead, use the following code:

BSTR MyBstr = SysAllocString(L"I am a happy BSTR");

A debugger that examines the memory location of this variable will now reveal a length prefix containing the value 34. This is the expected value for a 17-byte single-character string that is converted to a wide-character string through the inclusion of the "L" string modifier. The debugger will also show a two-byte terminating null character (0x0000) that appears after the data string.

If you pass a simple Unicode string as an argument to a COM function that is expecting a BSTR, the COM function will fail.

I hope this excerpt from MSDN has explained well enough why one should not mix BSTR strings and ordinary strings of type "wchar_t *".

Also, keep in mind that the analyzer can't tell for sure if there is a real error in the code or not. If an incorrect BSTR string is passed somewhere outside the code, it will cause a failure. But if a BSTR string is cast back to "wchar_t *", all is fine. What is meant here is the code of the following pattern:

wchar_t *wstr = Foo();
BSTR tmp = wstr;
wchar_t *wstr2 = tmp;

True, there's no real error here. But this code still "smells" and has to be fixed. When fixed, it won't bewilder the programmer maintaining the code, and neither will it trigger the analyzer's warning. Use proper data types:

wchar_t *wstr = Foo();
wchar_t *tmp = wstr;
wchar_t *wstr2 = tmp;

We also recommend reading the sources mentioned at the end of the article: they will help you figure out what BSTR strings are all about and how to cast them to strings of other types.

Here's another example:

wchar_t *wcharStr = L"123";
wchar_t *foo = L"12345";
int n = SysReAllocString(&wcharStr, foo);

This is the description of function SysReAllocString:

INT SysReAllocString(BSTR *pbstr, const OLECHAR *psz);

Reallocates a previously allocated string to be the size of a second string and copies the second string into the reallocated memory.

As you see, the function expects, as its first argument, a pointer to a variable referring to the address of a BSTR string. Instead, it receives a pointer to an ordinary string. Since the "wchar_t **" type is actually the same thing as "BSTR *", the code compiles correctly. In practice, however, it doesn't make sense and will cause a runtime error.

The fixed version of the code:

BSTR wcharStr = SysAllocString(L"123");
wchar_t *foo = L"12345";
int n = SysReAllocString(&wcharStr, foo);

A special case we should discuss is when the 'auto' keyword is used. The analyzer produces a warning on the following harmless code:

auto bstr = ::SysAllocStringByteLen(foo, 3);
ATL::CComBSTR value;
value.Attach(bstr);  // Warning: V745

True, this is a false positive, but the analyzer is technically correct when issuing the warning. The 'bstr' variable is of type 'wchar_t *'. When deducing the type of the 'auto' variable, the C++ compiler does not take into account that the function returns a value of type 'BSTR'. When deducing 'auto', the 'BSTR' type is simply a synonym of 'whar_t *'. This means that the code above is equivalent to this:

wchar_t *bstr = ::SysAllocStringByteLen(foo, 3);
ATL::CComBSTR value;
value.Attach(bstr);

This is why the PVS-Studio analyzer generates the warning: it is a bad practice to store a pointer to a 'BSTR' string in a standard 'wchar_t *' pointer. To eliminate the warning, you should specify the type explicitly rather than use 'auto' here:

BSTR *bstr = ::SysAllocStringByteLen(foo, 3);

ATL::CComBSTR value;

value.Attach(bstr);

This is an interesting case when the 'auto' operator loses type information and makes things worse rather than help.

Another way to eliminate the warning is to use one of the false-positive suppression mechanisms described in the documentation.

References:

According to Common Weakness Enumeration, potential errors found by using this diagnostic are classified as CWE-704.

You can look at examples of errors detected by the V745 diagnostic.


Bugs Found

Checked Projects
355
Collected Errors
13 303