22.01.2016

ChakraCore: analysis of JavaScript-engine for Microsoft Edge

Picture 1

On the JSConf US conference in December 2015 the developers announced that they were planning to make open the source code of Chakra key components, a JavaScript-engine, operating in Microsoft Edge. Recently the ChackraCore source code became available under the MIT license in the corresponding repository on GitHub. In this article you will find interesting code fragments that were detected with the help of PVS-Studio code analyzer.

Introduction

ChakraCore is the core part of Chakra, the high-performance JavaScript engine that powers Microsoft Edge and Windows applications written in HTML/CSS/JS. ChakraCore supports Just-in-time (JIT) compilation of JavaScript for x86/x64/ARM, garbage collection, and a wide range of the latest JavaScript features.

PVS-Studio is a static analyzer for bug detection in the source code of programs, written in C, C++ and C#. PVS-Studio tool is designed for developers of contemporary applications and integrates into the Visual Studio environments of 2010-2015.

In an article on an open source project check, we report only about a limited number of all of the warnings issued by the analyzer, therefore we recommend the authors of the project to run the analyzer on their code themselves and study complete analysis results. We also provide developers of open source projects with a temporary key.

Various errors

Picture 2

V501 There are identical sub-expressions 'this->propId == Js::PropertyIds::_superReferenceSymbol' to the left and to the right of the '||' operator. diagobjectmodel.cpp 123

IDiagObjectModelDisplay * ResolvedObject::CreateDisplay()
{
 ....
  if (this->isConst ||
    this->propId == Js::PropertyIds::_superReferenceSymbol ||
    this->propId == Js::PropertyIds::_superReferenceSymbol)
  {
      pOMDisplay->SetDefaultTypeAttribute(....);
  }
  ....
}

There two similar checks in the condition. Probably, while writing the code, a programmer accidentally chose the same constant in the IntelliSense menu, for instance, instead of "Js::PropertyIds:: _superCtorReferenceSymbol".

V501 There are identical sub-expressions 'GetVarSymID(srcIndexOpnd->GetStackSym())' to the left and to the right of the '==' operator. globopt.cpp 20795

void GlobOpt::EmitMemop(....)
{
  ....
  IR::RegOpnd *srcBaseOpnd = nullptr;
  IR::RegOpnd *srcIndexOpnd = nullptr;
  IRType srcType;
  GetMemOpSrcInfo(...., srcBaseOpnd, srcIndexOpnd, srcType);
  Assert(GetVarSymID(srcIndexOpnd->GetStackSym()) ==        // <=
         GetVarSymID(srcIndexOpnd->GetStackSym()));         // <=
  ....
}

Two more identical comparisons. Most likely "srcIndexOpnd->GetStackSym()" was meant to be compared with "srcBaseOpnd ->GetStackSym()" .

V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 3220, 3231. lower.cpp 3220

bool Lowerer::GenerateFastBrSrEq(....,
                                 IR::RegOpnd * srcReg1,
                                 IR::RegOpnd * srcReg2,
                                 ....)
{
  if (srcReg2 && IsConstRegOpnd(srcReg2))
  {
    ....
  }
  else if (srcReg1 && IsConstRegOpnd(srcReg1))
  {
    ....
  }
  else if (srcReg2 && (srcReg2->m_sym->m_isStrConst))
  {
    ....
  }
  else if (srcReg1 && (srcReg1->m_sym->m_isStrConst))       // <=
  {
    ....
  }
  else if (srcReg2 && (srcReg2->m_sym->m_isStrEmpty))
  {
    ....
  }
  else if (srcReg1 && (srcReg1->m_sym->m_isStrConst))       // <=
  {
    ....
  }

  return false;
}

The analyzer detected two similar checks in the cascade of conditional operators; as a result a code block in the last condition never gets control. Full code of the given example is very long and it's hard to notice a misprint. It is a good example, showing benefits from using a static code analyzer during the work with similar code, when the programmer gets quickly tired and lets his guard down.

Most likely, last two conditions were meant to be like this:

....
else if (srcReg2 && (srcReg2->m_sym->m_isStrEmpty))
{
  ....
}
else if (srcReg1 && (srcReg1->m_sym-> m_isStrEmpty))       // <=
{
  ....
}

V713 The pointer scriptContext was utilized in the logical expression before it was verified against nullptr in the same logical expression. diaghelpermethodwrapper.cpp 214

template <bool doCheckParentInterpreterFrame>
void HandleHelperOrLibraryMethodWrapperException(....)
{
  ....
  if (!exceptionObject->IsDebuggerSkip() ||
    exceptionObject == scriptContext->GetThreadContext()->.... ||
    exceptionObject == scriptContext->GetThreadContext()->.... ||
    !scriptContext)    // <=
  {
    throw exceptionObject->CloneIfStaticExceptionObject(....);
  }
  ....
}

Dereference of "scriptContext" pointer is done before its validity is checked. It was luck that such a bug wasn't noticed and didn't affect the program. Such bugs can live in the code for a very long time and show up in rare atypical situations.

V570 The 'this->isInlined' variable is assigned to itself. functioncodegenjittimedata.h 625

void SetupRecursiveInlineeChain(
    Recycler *const recycler,
    const ProfileId profiledCallSiteId)
{
  if (!inlinees)
  {
    inlinees = RecyclerNewArrayZ(....);
  }
  inlinees[profiledCallSiteId] = this;
  inlineeCount++;
  this->isInlined = isInlined;   // <=
}

It's very suspicious that the same value is stored in the 'isInlined' boolean variable. Most likely the programmer intended to write something else.

Here is one more fragment, where the variable is assigned to itself:

  • V570 The 'sym->m_isTaggableIntConst' variable is assigned to itself. linearscan.cpp 3170

V590 Consider inspecting the 'sub[i] != '-' && sub[i] == '/'' expression. The expression is excessive or contains a misprint. rl.cpp 1388

const char *
stristr
(
  const char * str,
  const char * sub
)
{
  ....
  for (i = 0; i < len; i++)
  {
    if (tolower(str[i]) != tolower(sub[i]))
    {
      if ((str[i] != '/' && str[i] != '-') ||
            (sub[i] != '-' && sub[i] == '/')) {              / <=
           // if the mismatch is not between '/' and '-'
           break;
      }
    }
  }
  ....
}

The analyzer detected, that a part of the (sub[i] != '-') conditional expression doesn't influence the check result. To make sure, let's have a look at the truth table. Most likely there is a misprint here; it's hard to say how to write this code correctly.

Picture 3

V603 The object was created but it is not being used. If you wish to call constructor, 'this->StringCopyInfo::StringCopyInfo(....)' should be used. stringcopyinfo.cpp 64

void StringCopyInfo::InstantiateForceInlinedMembers()
{
    AnalysisAssert(false);

    StringCopyInfo copyInfo;
    JavascriptString *const string = nullptr;
    wchar_t *const buffer = nullptr;

    (StringCopyInfo());                     // <=
    (StringCopyInfo(string, buffer));       // <=
    copyInfo.SourceString();
    copyInfo.DestinationBuffer();
}

Programmers often make mistakes, trying to explicitly call the constructor to initialize the object. In this example we see new unnamed objects of "StringCopyInfo" type that get created and then immediately destroyed. As a result, the class fields are left uninitialized.

The correct solution would be to create an initialization function and call it from the constructors in this fragment as well.

V610 Undefined behavior. Check the shift operator '<<'. The left operand '-1' is negative. constants.h 39

class Constants
{
public:
  ....
  static const int Int31MinValue = -1 << 30;
  ....
};

According to the latest standard of the C++ language, a shift of a negative number results in undefined behavior.

V557 Array overrun is possible. The value of 'i' index could reach 8. rl.cpp 2375

enum TestInfoKind::_TIK_COUNT = 9

const char * const TestInfoEnvLstFmt[] =
{
   " TESTFILE=\"%s\"",
   " BASELINE=\"%s\"",
   " CFLAGS=\"%s\"",
   " LFLAGS=\"%s\"",
   NULL,
   NULL,
   NULL,
   NULL    // <= TestInfoEnvLstFmt[7]
};

void
WriteEnvLst
(
   Test * pDir, TestList * pTestList
)
{
  ....
  // print the other TIK_*
  for(int i=0;i < _TIK_COUNT; i++) {
    if (variants->testInfo.data[i] && TestInfoEnvLstFmt[i]){// <=
       LstFilesOut->Add(TestInfoEnvLstFmt[i],               // <=
                        variants->testInfo.data[i]);
    }
    ....
  }
  ....
}

The analyzer detected that array index is out of bounds. The things is that the for() loop performs 9 iterations, but there are only 8 elements in the "TestInfoEnvLstFmt[]" array.

Perhaps, one more NULL was forgotten in the end:

const char * const TestInfoEnvLstFmt[] =
{
   " TESTFILE=\"%s\"",
   " BASELINE=\"%s\"",
   " CFLAGS=\"%s\"",
   " LFLAGS=\"%s\"",
   NULL,
   NULL,
   NULL,
   NULL    // <= TestInfoEnvLstFmt[7]
   NULL    // <= TestInfoEnvLstFmt[8]
};

But there is a chance that some string is missing in the middle of the array!

Dangerous pointers

Picture 7

V595 diagnostic looks for such code fragments where the pointer is dereferenced before it is compared with null. Usually in the projects there are several of such warnings. Such an error holds the record in out error base according to the number of issues found (see examples). But in general V595 diagnostics are too boring to give a lot of examples from a project. Also the check and dereference of a pointer can be located quite far away from each other in the function, having dozens or even hundreds of strings between them; that makes the explanation of this bug more complicated in the scope of an article.

That's why I'll provide several short examples of the code that most probably contain an error related to the pointer handling.

V595 The 'instrLd' pointer was utilized before it was verified against nullptr. Check lines: 1823, 1831. flowgraph.cpp 1823

IR::Instr *
FlowGraph::PeepTypedCm(IR::Instr *instr)
{
 ....
 if (instrLd && !instrLd->GetSrc1()->IsEqual(instr->GetDst()))
 {
   return nullptr;
 }
 
 if(instrLd2 && !instrLd2->GetSrc1()->IsEqual(instrLd->GetDst()))
 {
   return nullptr;
 }
 ....
}

Have a look at the pointer with the name "instrLd". In the first case we see that it is dereferenced and compared with null, in the second case a programmer forgot to do so, that's why it can cause null pointer dereference.

V595 The 'src2Val' pointer was utilized before it was verified against nullptr. Check lines: 9717, 9725. globopt.cpp 9717

bool GlobOpt::TypeSpecializeIntBinary(....)
{
  ....
  bool isIntConstMissingItem = src2Val->GetValueInfo()->....

  if(isIntConstMissingItem)
  {
      isIntConstMissingItem = Js::SparseArraySegment<int>::....
  }

  if (!src2Val || !(src2Val->GetValueInfo()->IsLikelyInt()) ||
      isIntConstMissingItem)
  {
      return false;
  }
  ....
}

Pointer "Src2Val" is used at the beginning of the function, but then the developers actively began checking whether this pointer is equal to zero.

V595 The 'm_lastInstr' pointer was utilized before it was verified against nullptr. Check lines: 214, 228. irbuilderasmjs.cpp 214

void
IRBuilderAsmJs::AddInstr(IR::Instr * instr, uint32 offset)
{
  m_lastInstr->InsertAfter(instr);                  // <=
  if (offset != Js::Constants::NoByteCodeOffset)
  {
    ....
  }
  else if (m_lastInstr)                             // <=
  {
      instr->SetByteCodeOffset(m_lastInstr->GetByteCodeOffset());
  }
  m_lastInstr = instr;
  ....
}

One more example of careless use of a pointer that can potentially be a null pointer.

A list of similar fragments:

  • V595 The 'arrayData' pointer was utilized before it was verified against nullptr. Check lines: 868, 870. immutablelist.h 868
  • V595 The 'pMembersList' pointer was utilized before it was verified against nullptr. Check lines: 2012, 2015. diagobjectmodel.cpp 2012
  • V595 The 'walkerRef' pointer was utilized before it was verified against nullptr. Check lines: 3191, 3193. diagobjectmodel.cpp 3191
  • V595 The 'block->loop' pointer was utilized before it was verified against nullptr. Check lines: 981, 1002. globopt.cpp 981
  • V595 The 'src2Val' pointer was utilized before it was verified against nullptr. Check lines: 12528, 12536. globopt.cpp 12528
  • V595 The 'symDst' pointer was utilized before it was verified against nullptr. Check lines: 1966, 1967. irbuilderasmjs.cpp 1966
  • V595 The 'symDst' pointer was utilized before it was verified against nullptr. Check lines: 2010, 2011. irbuilderasmjs.cpp 2010
  • V595 The 'symDst' pointer was utilized before it was verified against nullptr. Check lines: 2076, 2077. irbuilderasmjs.cpp 2076
  • V595 The 'symDst' pointer was utilized before it was verified against nullptr. Check lines: 3591, 3592. irbuilderasmjs.cpp 3591
  • V595 The 'symDst' pointer was utilized before it was verified against nullptr. Check lines: 4113, 4114. irbuilderasmjs.cpp 4113
  • V595 The 'symDst' pointer was utilized before it was verified against nullptr. Check lines: 4510, 4511. irbuilderasmjs.cpp 4510
  • V595 The 'm_lastInstr' pointer was utilized before it was verified against nullptr. Check lines: 1102, 1116. irbuilder.cpp 1102

This list shows some simplest and clearest examples. To examine all fragments like that, developers should have a look at the analysis result themselves.

V522 Dereferencing of the null pointer 'tempNumberTracker' might take place. backwardpass.cpp 578

void
BackwardPass::MergeSuccBlocksInfo(BasicBlock * block)
{
  TempNumberTracker * tempNumberTracker = nullptr; // <= line 346
  ....
  if (!block->isDead)
  {
      ....
      if(!IsCollectionPass())
      {
          ....
          if (this->DoMarkTempNumbers())
          {
              tempNumberTracker = JitAnew(....);   // <= line 413
          }
      ....
  ....
  if (blockSucc->tempNumberTracker != nullptr)
  {
      ....
      tempNumberTracker->MergeData(....);          // <= line 578
      if (deleteData)
      {
          blockSucc->tempNumberTracker = nullptr;
      }
  }
  ....
}

Example of another diagnostic which is also connected with the pointers. Here is a code fragment of the MergeSuccBlocksInfo() function, which is quite long - 707 strings. But with the help of static analysis we managed to find "tempNumberTracker" pointer, whose initialization can potentially fail due to several conditions. As a result, if the odds are against the programmer, there will be dereference of a null pointer.

Stop! Check Assert!

Picture 5

Assert, located in the program indicates that a programmer assumes that some expression is true for a correctly running program. But can these "successful checks be trusted"?

V547 Expression 'srcIndex - src->left >= 0' is always true. Unsigned type value is always >= 0. sparsearraysegment.inl 355

class SparseArraySegmentBase
{
public:
    static const uint32 MaxLength;
    ....
    uint32 size;
    ....
}

template<typename T>
SparseArraySegment<T>* SparseArraySegment<T>::CopySegment(....,
  uint32 srcIndex, ....)
{
  ....
  AssertMsg(srcIndex - src->left >= 0,                    // <=
    "src->left > srcIndex resulting in \
     negative indexing of src->elements");
  js_memcpy_s(dst->elements + dstIndex - dst->left,
              sizeof(T) * inputLen,
              src->elements + srcIndex - src->left,
              sizeof(T) * inputLen);
  return dst;
}

Note the comparison "srcIndex-src-> left > = 0". Difference of two unsigned numbers will always be greater than or equal to zero. Further on this function is used for working with the memory. The result can be different from the one that the programmer expected.

V547 Expression is always true. Probably the '&&' operator should be used here. bytecodegenerator.cpp 805

void ByteCodeGenerator::AssignRegister(Symbol *sym)
{
  AssertMsg(sym->GetDecl() == nullptr ||
            sym->GetDecl()->nop != knopConstDecl ||      // <=
            sym->GetDecl()->nop != knopLetDecl, "...."); // <=
            
  if (sym->GetLocation() == Js::Constants::NoRegister)
  {
    sym->SetLocation(NextVarRegister());
  }
}

In this Assert testing of some values is done only partially. If the expression "sym->GetDecl() == nullptr" is false, then the following conditions are always true. You can check it by building a truth table.

Picture 6

V547 Expression 'callSiteId >= 0' is always true. Unsigned type value is always >= 0. inline.cpp 1181

typedef uint16 ProfileId;

Func * Inline::BuildInlinee(Js::FunctionBody* funcBody, ....)
{
  ....
  Js::ProfileId callSiteId = static_cast<Js::ProfileId>(....);
  Assert(callSiteId >= 0);
  ....
}

In this and couple of other fragments the analyzer detected incorrect comparison of an unsigned number with null:

  • V547 Expression 'callSiteId >= 0' is always true. Unsigned type value is always >= 0. inline.cpp 2627
  • V547 Expression 'callSiteId >= 0' is always true. Unsigned type value is always >= 0. inline.cpp 3657

Conclusion

Microsoft has a positive tendency to release their projects under free licenses. For us it is an additional opportunity to test the analyzer on new projects and also a way to demonstrate usefulness and effectiveness of static code analysis on the projects of such a large and well known software vendor.

You may be interested to see the full list of all scanned projects that includes other projects from Microsoft, such as .NET CoreCLR, CoreFX .NET and Microsoft Code Contracts.