A Spin-off: Firebird Checked by PVS-Studio

21.02.2014 Andrey Karpov

We are currently working on a great task of carrying out a detailed comparison of four code analyzers: CppCat, Cppcheck, PVS-Studio and Visual Studio 2013 (i.e. its built-in code analyzer). As a set of materials to base this comparison on, we decided to check at least 10 open-source projects and study the reports from all the analyzers. This is a very labor-intensive task and it is not over yet. However, we have already checked a few projects and can share some of the results with you. And that's what I'm going to do in this article. We'll start with interesting bugs we have managed to find in Firebird with the help of PVS-Studio.

Picture 1

Firebird

Firebird (FirebirdSQL) is a relational database offering many ANSI SQL standard features that runs on Linux, Windows, and a variety of Unix platforms. Firebird offers excellent concurrency, high performance, and powerful language support for stored procedures and triggers.

The project website: http://www.firebirdsql.org/

Wikipedia article: Firebird

Let's see what interesting defects PVS-Studio managed to find in this project's code.

Uninitialized variables

static const UCHAR* compile(const UCHAR* sdl, sdl_arg* arg)
{
  SLONG n, count, variable, value, sdl_operator;
  ....
  switch (op)
  {
    ....
    case isc_sdl_add:
      sdl_operator = op_add;
    case isc_sdl_subtract:
      if (!sdl_operator)
        sdl_operator = op_subtract;
  ......
}

V614 Uninitialized variable 'sdl_operator' used. sdl.cpp 404

I suspect that the 'break' operator was deliberately omitted between "case isc_sdl_add:" and "case isc_sdl_subtract:". This code doesn't take account of the case when we may get to the line "case isc_sdl_subtract:" right away. And if that happens, the 'sdl_operator' variable will not be initialized by then yet.

Here's another similar issue. The 'fieldNode' variable may stay uninitialized if "field == false".

void blb::move(....)
{
  ....
  const FieldNode* fieldNode;
  if (field)
  {
    if ((fieldNode = ExprNode::as<FieldNode>(field)))
    ....
  }
  ....
  const USHORT id = fieldNode->fieldId;
  ....
}

V614 Potentially uninitialized pointer 'fieldNode' used. blb.cpp 1043

That is why it is not a good idea to give the same name to different variables in one function:

void realign(....)
{
  for (....)
  {
    UCHAR* p = buffer + field->fld_offset;
    ....
    for (const burp_fld* field = relation->rel_fields;
         field; field = field->fld_next)
    {
      ....
      UCHAR* p = buffer + FB_ALIGN(p - buffer, sizeof(SSHORT));
  ........
}

V573 Uninitialized variable 'p' was used. The variable was used to initialize itself. restore.cpp 17535

When initializing the second variable 'p', the programmer wanted to use the value of the first variable 'p'. Instead, the second variable - not initialized yet - is used.

A note for the project's authors. Have a look at this fragment too: restore.cpp 17536

Dangerous string comparison (a vulnerability)

Note that the result of the memcmp() function is stored in a variable of the 'SSHORT' type. 'SSHORT' is actually but a synonym of the 'short' type.

SSHORT TextType::compare(
  ULONG len1, const UCHAR* str1, ULONG len2, const UCHAR* str2)
{
  ....
  SSHORT cmp = memcmp(str1, str2, MIN(len1, len2));

  if (cmp == 0)
    cmp = (len1 < len2 ? -1 : (len1 > len2 ? 1 : 0));

  return cmp;
}

V642 Saving the 'memcmp' function result inside the 'short' type variable is inappropriate. The significant bits could be lost breaking the program's logic. texttype.cpp 338

Wondering what's wrong here?

Let me remind you that the memcmp() function returns a value of the 'int' type. In our case, the result is written into a variable of the 'short' type, so hi bits are lost. This is dangerous!

The function returns the following values: less than zero, zero, or larger than zero. "Larger than zero" implies any positive number. It may be either 1 or 2 or 19472341. That's why one can't store the result of the memcmp() function in a type smaller than the 'int' type.

This problem may seem farfetched. But it is actually a true vulnerability. For example, a similar bug in the MySQL code was acknowledged as a vulnerability, too: Security vulnerability in MySQL/MariaDB sql/password.c. In that case, the result was written into a variable of the 'char' type. The 'short' type is no better from the viewpoint of security.

Similar dangerous comparisons were found in the following fragments:

  • cvt2.cpp 256
  • cvt2.cpp 522

Typos

Typos can be found in any code, at any time. Most of them are usually caught soon during the testing procedure. But some still survive and can be found almost in any project.

int Parser::parseAux()
{
  ....
  if (yyps->errflag != yyps->errflag) goto yyerrlab;
  ....
}

V501 There are identical sub-expressions to the left and to the right of the '!=' operator: yyps->errflag != yyps->errflag parse.cpp 23523

No need in comments here. And in the following fragment, Copy-Paste must have been used:

bool CMP_node_match( const qli_nod* node1, const qli_nod* node2)
{
  ....
  if (node1->nod_desc.dsc_dtype != node2->nod_desc.dsc_dtype ||
      node2->nod_desc.dsc_scale != node2->nod_desc.dsc_scale ||
      node2->nod_desc.dsc_length != node2->nod_desc.dsc_length)
  ....
}

V501 There are identical sub-expressions 'node2->nod_desc.dsc_scale' to the left and to the right of the '!=' operator. compile.cpp 156

V501 There are identical sub-expressions 'node2->nod_desc.dsc_length' to the left and to the right of the '!=' operator. compile.cpp 157

It causes an incorrect comparison of the members of the classes 'nod_desc.dsc_scale' and 'nod_desc.dsc_length' in the CMP_node_match() function.

One more typo was found in the following line: compile.cpp 183

Strange loops

static processing_state add_row(TEXT* tabname)
{
  ....
  unsigned i = n_cols;
  while (--i >= 0)
  {
    if (colnumber[i] == ~0u)
  {
       bldr->remove(fbStatus, i);
       if (ISQL_errmsg(fbStatus))
         return (SKIP);
    }
  }
  msg.assignRefNoIncr(bldr->getMetadata(fbStatus));
  ....
}

V547 Expression '-- i >= 0' is always true. Unsigned type value is always >= 0. isql.cpp 3421

The 'i' variable is 'unsigned'. It means that it is always larger than or equal to 0. Because of that, the (--i >= 0) condition makes no sense as it is always true.

The loop below will, on the contrary, terminate sooner as it was meant to:

SLONG LockManager::queryData(....)
{
  ....
  for (const srq* lock_srq = (SRQ) 
         SRQ_ABS_PTR(data_header.srq_backward);
     lock_srq != &data_header;
     lock_srq = (SRQ) SRQ_ABS_PTR(lock_srq->srq_backward))
  {
    const lbl* const lock = ....;
    CHECK(lock->lbl_series == series);
    data = lock->lbl_data;
    break;
  }
  ....
}

What for is there that suspicious 'break'?

Another similar issue was be found in the following line: pag.cpp 217

Classics

As usual, there are a lot of classic defects related to pointers, for example when a pointer is first dereferenced and then is checked for being null. It is far not always an error, but this code is still poorly-written and potentially dangerous. I will show only one example in this article; all the rest instances are listed in a special text file.

int CCH_down_grade_dbb(void* ast_object)
{
  ....
  SyncLockGuard bcbSync(
    &bcb->bcb_syncObject, SYNC_EXCLUSIVE, "CCH_down_grade_dbb");
  bcb->bcb_flags &= ~BCB_exclusive;

  if (bcb && bcb->bcb_count)
  ....
}

V595 The 'bcb' pointer was utilized before it was verified against nullptr. Check lines: 271, 274. cch.cpp 271

At first the 'bcb' pointer is dereferenced in the expression "bcb->bcb_flags &= ....". As you can conclude from the next check, 'bcb' may be equal to zero.

Check the list of other examples of this error (31 warnings in total): firebird-V595.txt

Shift operators

Since Firebird is built by different compilers for different platforms, there is sense in fixing shifts which may cause undefined behavior. They may well show up with very unpleasant consequences in the future.

const ULONG END_BUCKET = (~0) << 1;

V610 Undefined behavior. Check the shift operator '<<. The left operand '(~0)' is negative. ods.h 337

One can't shift negative numbers. To learn more on this issue, see the article "Wade not in unknown waters. Part three".

This code should be rewritten in the following way:

const ULONG END_BUCKET = (~0u) << 1;

Here are two other shifts of that kind:

  • exprnodes.cpp 6185
  • array.cpp 845

Meaningless checks

static processing_state add_row(TEXT* tabname)
{
  ....
  unsigned varLength, scale;
  ....
  scale = msg->getScale(fbStatus, i);
  ....
  if (scale < 0)
  ....
}

V547 Expression 'scale < 0' is always false. Unsigned type value is never < 0. isql.cpp 3716

The 'scale' variable is 'unsigned'. The (scale < 0) comparison is meaningless.

A similar issue: isql.cpp 4437

Have a look at another function:

static bool get_switches(....)
  ....
  if (**argv != 'n' || **argv != 'N')
  {
    fprintf(stderr, "-sqlda :  "
            "Deprecated Feature: you must use XSQLDA\n ");
    print_switches();
    return false;
  }
  ....
}

Command line arguments are processed incorrectly here. The (**argv != 'n' || **argv != 'N') condition is always true.

Miscellaneous

void FB_CARG Why::UtlInterface::getPerfCounters(
  ...., ISC_INT64* counters)
{
  unsigned n = 0;
  ....
  memset(counters, 0, n * sizeof(ISC_INT64));
  ....
}

V575 The 'memset' function processes '0' elements. Inspect the third argument. perf.cpp 487

I suspect that the programmer forgot to assign a value different from zero to the variable 'n' in the function body.

The convert() function receives a string length as its third argument:

ULONG convert(const ULONG srcLen,
              const UCHAR* src,
              const ULONG dstLen,
              UCHAR* dst,
              ULONG* badInputPos = NULL,
              bool ignoreTrailingSpaces = false);

However, the function is used in an incorrect way:

string IntlUtil::escapeAttribute(....)
{
  ....
  ULONG l;
  UCHAR* uc = (UCHAR*)(&l);
  const ULONG uSize =
    cs->getConvToUnicode().convert(size, p, sizeof(uc), uc);
  ....
}

V579 The convert function receives the pointer and its size as arguments. It is possibly a mistake. Inspect the third argument. intlutil.cpp 668

We're dealing with a 64-bit error here which will show up in Win64.

The 'sizeof(uc)' expression returns the pointer size, not the buffer size. It is not important if the pointer size coincides with the size of the 'unsigned long' type. It is the case when working under Linux. No troubles will occur on Win32 either.

The bug will reveal itself in the Win64 version of the application. The convert() function will assume that the buffer size is 8 bytes (like the pointer size), though it is really 4 bytes.

Note. Perhaps there are also other 64-bit errors in the program, but I didn't examine those diagnostics. They are boring to write about and it is not always possible to figure out if such a bug will show up or not without knowing a program's logic. The 64-bit bug described above was found in an indirect way, through general diagnostics.

Conclusion

Perhaps the readers are interested to know if we have managed to find anything worthy in this project with Cppcheck and VS2013. Yes, these analyzers did manage to find a few defects that PVS-Studio had missed. But they are very few. So PVS-Studio is surely in the lead for this project. You will learn more about the comparison results from the article we are going to publish quite soon.

I would also like to point out that all the defects described in the article can be found with the CppCat analyzer as well. The PVS-Studio produces more warnings if you turn on the 3-rd level diagnostics (64-bit ones and so on). But, again, we would have got the same results if we had used CppCat instead of PVS-Studio. CppCat is a good tool to start improving your code every day.