Can We Trust the Libraries We Use?

11.08.2014 Andrey Karpov

Any large modern application consists of numerous third-party libraries, and I'd like to discuss the topic of our trust in these libraries. In books and articles, there are lots of debates about code quality, testing methods, development methodologies, and so on. But I don't remember anyone discussing the quality of bricks applications are built from. So let's talk about it today. For example, there is the Medicine Insight Segmentation and Registration Toolkit (ITK). I find it to be implemented pretty well. At least, I have noticed just a few bugs in its code. But I cannot say the same about the code of the third-party libraries used there. So the question is: how much can we trust such systems? Much food for thought.

Picture 2

When developing medical applications, everyone is talking about quality and coding standards; programmers are demanded to follow such standards as MISRA and so on. To tell the truth, I'm not well familiar with methodologies used when writing safety-critical applications. But I do suspect that the question of the quality of third-party libraries used in development is often ignored. Application code and code of third-party libraries live their own separate lives.

This conclusion is drawn from my subjective observations. I very often come across very high-quality applications where I can't find even half a dozen of serious bugs. At the same time, such applications may include third-party libraries of an extremely bad quality.

Assume a doctor makes an incorrect diagnosis because of some image artifacts caused by a bug in software. In this case, it doesn't matter a little bit whether this bug is in the program itself or in the image handling library. Think of it.

What set me thinking about it all again was the check of the source codes of the ITK project:

Picture 3

Insight Segmentation and Registration Toolkit (ITK). ITK is an open-source, cross-platform system that provides developers with an extensive suite of software tools for image analysis. Developed through extreme programming methodologies, ITK employs leading-edge algorithms for registering and segmenting multidimensional data.

While analyzing the ITK project with PVS-Studio, I once more noticed the following thing. There were few suspicious fragments related to the ITK project itself, but at the same time lots of suspicious fragments and obvious bugs in the files stored in the "ThirdParty" folder.

No wonder. ITK includes quite a lot of libraries. But that's pretty sad indeed: some bugs in those libraries may affect ITK's operation.

I'm not going to appeal for any drastic acts or give any recommendations; my goal is to attract people's attention to my findings so they could think them over. To make my words stick in your memory, I will show you some suspicious fragments that have caught my attention.

Let's start with the OpenJPEG library

Poor case

typedef enum PROG_ORDER {
  PROG_UNKNOWN = -1,
  LRCP = 0,
  RLCP = 1,
  RPCL = 2,
  PCRL = 3,
  CPRL = 4
} OPJ_PROG_ORDER;

OPJ_INT32 pi_check_next_level(....)
{
  ....
  case 'P':
    switch(tcp->prg)
    {
      case LRCP||RLCP:
        if(tcp->prc_t == tcp->prcE){
          l=pi_check_next_level(i-1,cp,tileno,pino,prog);
  ....
}

PVS-Studio's diagnostic message: V560 A part of conditional expression is always true: RLCP. pi.c 1708

The programmer forgot how to use the 'case' operator properly. The statement "case LRCP||RLCP:" is equivalent to "case 1:". And this is obviously not what the programmer intended.

The correct code should look as follows:

case LRCP:
case RLCP:

And that's exactly what is written in other places of the program. Well, I would also add a comment – something like this:

case LRCP: // fall through
case RLCP:

Null pointer dereferencing

bool j2k_write_rgn(....)
{
  OPJ_BYTE * l_current_data = 00;
  OPJ_UINT32 l_nb_comp;
  OPJ_UINT32 l_rgn_size;
  opj_image_t *l_image = 00;
  opj_cp_t *l_cp = 00;
  opj_tcp_t *l_tcp = 00;
  opj_tccp_t *l_tccp = 00;
  OPJ_UINT32 l_comp_room;

  // preconditions
  assert(p_j2k != 00);
  assert(p_manager != 00);
  assert(p_stream != 00);

  l_cp = &(p_j2k->m_cp);
  l_tcp = &l_cp->tcps[p_tile_no];
  l_tccp = &l_tcp->tccps[p_comp_no];

  l_nb_comp = l_image->numcomps;
  ....
}

PVS-Studio's diagnostic message: V522 Dereferencing of the null pointer 'l_image' might take place. j2k.c 5205

The 'l_image' pointer is initialized to zero and is not changed anywhere after that. So, when calling the j2k_write_rgn() function, the null pointer will be dereferenced.

A variable assigned to itself

OPJ_SIZE_T opj_stream_write_skip (....)
{
  ....
  if (!l_is_written)
  {
    p_stream->m_status |= opj_stream_e_error;
    p_stream->m_bytes_in_buffer = 0;
    p_stream->m_current_data = p_stream->m_current_data;
    return (OPJ_SIZE_T) -1;
  }
  ....
}

PVS-Studio's diagnostic message: V570 The 'p_stream->m_current_data' variable is assigned to itself. cio.c 675

Something is messed up in this code. A variable is assigned its own value.

Incorrect check

typedef struct opj_stepsize
{
  OPJ_UINT32 expn;
  OPJ_UINT32 mant;
};

bool j2k_read_SQcd_SQcc(
  opj_j2k_t *p_j2k,
  OPJ_UINT32 p_comp_no,
  OPJ_BYTE* p_header_data,
  OPJ_UINT32 * p_header_size,
  struct opj_event_mgr * p_manager
  )
{  
  ....
  OPJ_UINT32 l_band_no;
  ....
  l_tccp->stepsizes[l_band_no].expn =
    ((l_tccp->stepsizes[0].expn) - ((l_band_no - 1) / 3) > 0) ?
      (l_tccp->stepsizes[0].expn) - ((l_band_no - 1) / 3) : 0;
  ....
}

PVS-Studio's diagnostic message: V555 The expression of the 'A - B > 0' kind will work as 'A != B'. itkopenjpeg j2k.c 3421

It's not easy to quickly find the error in this fragment, so I've made a simplified artificial example:

unsigned A, B;
....
X = (A - B > 0) ? (A - B) : 0;

As far as I understand, the programmer intended to do the following. If the A variable is larger than B, than the difference should be calculated; if not, the expression should evaluate to zero.

He chose a wrong way to write this comparison. Since the (A - B) expression is 'unsigned', it will always be greater or equal to 0. For instance, if "A = 3, B = 5', than (A - B) equals 0xFFFFFFFE (4294967294).

So it appears this expression can be simplified:

X = (A != B) ? (A - B) : 0;

If (A == B), we'll get 0 as the difference. It means the expression can be simplified even more:

X = A - B;

Something is obviously wrong. The correct way of writing this comparison is as follows:

X = (A > B) ? (A - B) : 0;

GDCM

Well, enough of Jpeg; we don't want the article to turn into a reference book. There are other libraries to discuss – for example the Grassroots DICOM library (GDCM).

Incorrect loop condition

bool Sorter::StableSort(std::vector<std::string> const & filenames)
{
  ....
  std::vector< SmartPointer<FileWithName> >::iterator
    it2 = filelist.begin();

  for( Directory::FilenamesType::const_iterator it =
         filenames.begin();
       it != filenames.end(), it2 != filelist.end();
       ++it, ++it2)
  {
  ....
}

PVS-Studio's diagnostic message: V521 Such expressions using the ',' operator are dangerous. Make sure the expression is correct. gdcmsorter.cxx 82

The comma ',' operator in the loop condition is meaningless. The result of this operator is its right operand. So the "it != filenames.end()" expression is not taken into account in any way.

The loop should probably look like this:

for(Directory::FilenamesType::const_iterator it = ....;
    it != filenames.end() && it2 != filelist.end();
    ++it, ++it2)

A bit further in the code, there is another similar incorrect loop (gdcmsorter.cxx 123).

Potential null pointer dereferencing

bool PrivateTag::ReadFromCommaSeparatedString(const char *str)
{
  unsigned int group = 0, element = 0;
  std::string owner;
  owner.resize( strlen(str) );
  if( !str || sscanf(str, "%04x,%04x,%s", &group ,
                     &element, &owner[0] ) != 3 )
  {
    gdcmDebugMacro( "Problem reading Private Tag: " << str );
    return false;
  }
  ....
}

PVS-Studio's diagnostic message: V595 The 'str' pointer was utilized before it was verified against nullptr. Check lines: 26, 27. gdcmprivatetag.cxx 26

You can see from the condition that the 'str' pointer may be equal to nullptr. Nevertheless, this pointer is dereferenced without being checked in the following line:

owner.resize( strlen(str) );

Unspecified behavior

bool ImageCodec::DoOverlayCleanup(
  std::istream &is, std::ostream &os)
{
  ....
  // nmask : to propagate sign bit on negative values
  int16_t nmask = (int16_t)0x8000;
  nmask = nmask >>
          ( PF.GetBitsAllocated() - PF.GetBitsStored() - 1 );
  ....
}

PVS-Studio's diagnostic message: V610 Unspecified behavior. Check the shift operator '>>. The left operand 'nmask' is negative. gdcmimagecodec.cxx 397

Shifting negative values through the ">>" operator leads to unspecified behavior. Relying on luck is unacceptable for such libraries.

Dangerous reading from file

void LookupTable::Decode(....) const
{
  ....
  while( !is.eof() )
  {
    unsigned short idx;
    unsigned short rgb[3];
    is.read( (char*)(&idx), 2);
    if( is.eof() ) break;
    if( IncompleteLUT )
    {
      assert( idx < Internal->Length[RED] );
      assert( idx < Internal->Length[GREEN] );
      assert( idx < Internal->Length[BLUE] );
    }
    rgb[RED]   = rgb16[3*idx+RED];
    rgb[GREEN] = rgb16[3*idx+GREEN];
    rgb[BLUE]  = rgb16[3*idx+BLUE];
    os.write((char*)rgb, 3*2);
  }
  ....
}

PVS-Studio's diagnostic message: V663 Infinite loop is possible. The 'cin.eof()' condition is insufficient to break from the loop. Consider adding the 'cin.fail()' function call to the conditional expression. gdcmMSFF gdcmlookuptable.cxx 280

You see, the program may hang at this place. If something triggers an error reading from file, the "is.eof()" check will fail to stop the loop. In case of an error, the program cannot read from the file. But the end of the file has not been reached yet. And these are quite different things.

An additional check should be added that can be implemented through a call of the is.fail() function.

There are quite a lot of other dangerous errors reading from file. I recommend the developers to check all the fragments where the eof() function is called. These fragments can be found both in GDCM and other libraries.

ITK

Let's finish with the libraries here. I think I've managed to make my worry clear for you.

Perhaps the readers are interested to know if I have found anything in the ITK library itself. Yes, there were a few interesting issues.

The Last Line Effect

I recently wrote a funny article titled "The Last Line Effect". If you haven't read it yet, I do recommend to.

Here is another way this effect manifests itself. In the last, third line, the index should be '2' instead of '1'.

int itkPointSetToSpatialObjectDemonsRegistrationTest(....)
{
  ....
  // Set its position
  EllipseType::TransformType::OffsetType offset;
  offset[0]=50;
  offset[1]=50;
  offset[1]=50;
  ....
}

PVS-Studio's diagnostic message: V519 The 'offset[1]' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 41, 42. itkpointsettospatialobjectdemonsregistrationtest.cxx 42

A typo

Here is one more typo with an array index:

template< typename TCoordRepType >
void
VoronoiDiagram2D< TCoordRepType >::SetOrigin(PointType vorsize)
{
  m_VoronoiBoundaryOrigin[0] = vorsize[0];
  m_VoronoiBoundaryOrigin[0] = vorsize[1];
}

PVS-Studio's diagnostic message: V519 The 'm_VoronoiBoundaryOrigin[0]' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 74, 75. itkvoronoidiagram2d.hxx 75

A missing index

void MultiThreader::MultipleMethodExecute()
{
  ....
  HANDLE process_id[ITK_MAX_THREADS];
  ....
  process_id[thread_loop] = (void *) _beginthreadex(0, 0, ....);

  if ( process_id == 0 )
  {
    itkExceptionMacro("Error in thread creation !!!");
  }
  ....
}

PVS-Studio's diagnostic message: V600 Consider inspecting the condition. The 'process_id' pointer is always not equal to NULL. itkmultithreaderwinthreads.cxx 90

The "if ( process_id == 0 )" check is meaningless. The programmer wanted to check an array item and the code was meant to look like this:

if ( process_id[thread_loop] == 0 )

Identical checks

template< typename T >
void WriteCellDataBufferAsASCII(....)
{
  ....
  if( this->m_NumberOfCellPixelComponents == 3 )
  {
    ....
  }
  else if( this->m_NumberOfCellPixelComponents == 3 )
  {
    ....
  }
  ....
}

PVS-Studio's diagnostic messages: V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 948, 968. itkvtkpolydatameshio.h 948

Suspicious constructor

template<typename LayerType, typename TTargetVector>
QuickPropLearningRule <LayerType,TTargetVector>
::QuickPropLearningRule()
{
  m_Momentum = 0.9; //Default
  m_Max_Growth_Factor = 1.75;
  m_Decay = -0.0001;
  m_SplitEpsilon = 1;
  m_Epsilon = 0.55;
  m_Threshold = 0.0;
  m_SigmoidPrimeOffset = 0;
  m_SplitEpsilon = 0;
}

PVS-Studio's diagnostic messages: V519 The 'm_SplitEpsilon' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 35, 39. itkquickproplearningrule.hxx 39

Notice how the 'm_SplitEpsilon' variable is initialized. At first, this class member is assigned value 1 and then 0. That's pretty strange.

Incorrect cache clearing

template <typename TInputImage, typename TOutputImage>
void
PatchBasedDenoisingImageFilter<TInputImage, TOutputImage>
::EmptyCaches()
{
  for (unsigned int threadId = 0;
       threadId < m_ThreadData.size(); ++threadId)
  {
    SizeValueType cacheSize =
      m_ThreadData[threadId].eigenValsCache.size();
    for (SizeValueType c = 0; c < cacheSize; ++c)
    {
      delete m_ThreadData[threadId].eigenValsCache[c];
      delete m_ThreadData[threadId].eigenVecsCache[c];
    }
    m_ThreadData[threadId].eigenValsCache.empty();
    m_ThreadData[threadId].eigenVecsCache.empty();
  }
}

PVS-Studio's diagnostic messages:

  • V530 The return value of function 'empty' is required to be utilized. itkpatchbaseddenoisingimagefilter.hxx 85
  • V530 The return value of function 'empty' is required to be utilized. itkpatchbaseddenoisingimagefilter.hxx 86

Due to inattention, the programmer implemented a call of the 'empty()' function instead of 'clear()'. It leads to adding garbage to the cache, so using it becomes dangerous. This bug is hard to find and it can lead to very strange side effects.

Other bugs

There were other bugs, both in ITK and the third-party libraries. But I promised myself to fit the article into 12 pages, while typing it in Microsoft Word. You see, I don't like that my articles tend to grow in size more and more each time. So I have to restrict myself. The reason why articles are getting lengthy is that the PVS-Studio analyzer is learning to find more and more bugs.

It's OK that I haven't described all the suspicious fragments. To be honest, I was just quickly scanning through the report and surely missed much. Don't treat this article as a collection of warnings; instead, I want it to stimulate some of you to start using static analyzers in your work regularly. It will be much better that way, for I cannot possibly check all the programs in the world.

If ITK's authors check their project themselves, it will be much better than making fixes relying on my article. Unfortunately, PVS-Studio generates too many false positives on ITK. The reason is that the code uses a few special macros. Analysis results can be significantly improved through slight customization. If necessary, ask me for advice, I'll be glad to help you.

Conclusion

Dear readers, please remember that one-time checks by static analyzers give you just a small benefit. Only using them regularly will truly help you save your time. This idea is discussed in detail in the post "Leo Tolstoy and static code analysis".

May your programs and libraries stay bugless!