Our website uses cookies to enhance your browsing experience.
Accept
to the top
close form

Fill out the form in 2 simple steps below:

Your contact information:

Step 1
Congratulations! This is your promo code!

Desired license type:

Step 2
Team license
Enterprise license
** By clicking this button you agree to our Privacy Policy statement
close form
Request our prices
New License
License Renewal
--Select currency--
USD
EUR
* By clicking this button you agree to our Privacy Policy statement

close form
Free PVS‑Studio license for Microsoft MVP specialists
* By clicking this button you agree to our Privacy Policy statement

close form
To get the licence for your open-source project, please fill out this form
* By clicking this button you agree to our Privacy Policy statement

close form
I am interested to try it on the platforms:
* By clicking this button you agree to our Privacy Policy statement

close form
check circle
Message submitted.

Your message has been sent. We will email you at


If you haven't received our response, please do the following:
check your Spam/Junk folder and click the "Not Spam" button for our message.
This way, you won't miss messages from our team in the future.

>
>
>
Finding bugs in the code of LLVM projec…

Finding bugs in the code of LLVM project with the help of PVS-Studio

Oct 31 2016
Author:

About two months ago I wrote an article about the analysis of GCC using PVS-Studio. The idea of the article was as follows: GCC warnings are great, but they're not enough. It is necessary to use specialized tools for code analysis, for example, PVS-Studio. As proof of my words I showed errors that PVS-Studio was able to find the GCC code. A number of readers have noticed that the quality of the GCC code, and its diagnosis, aren't really great; while Clang compiler is up to date, of high quality, and fresh. In general Clang is awesome! Well, apparently, it's time to check LLVM project with the help of PVS-Studio.

0446_LLVM/image1.png

Checking LLVM with the help of the Linux version of PVS-Studio

I think there are few who do not know what LLVM is. Nevertheless, I'll keep the tradition of giving a short description of the project that has been tested.

LLVM (Low Level Virtual Machine) - a universal system of analysis, transformation and optimization of programs, implementing a virtual machine with RISC-based instructions. It can be used as an optimizing compiler of byte code into machine code for various architectures, or for its interpretation and JIT-compilation (for some platforms). Within the scope of LLVM project, the developers made the Clang front-end for C,C++, and Objective-C, translating the source code into byte code LLVM and allowing the use of LLVM as a full-fledged compiler.

Official site: http://llvm.org/

We checked the revision 282481. The code was checked with a PVS-Studio version, working under Linux. Since PVS-Studio for Linux is a new product, I'll give more details about the process of analysis. I'm sure this will demonstrate that it's really not hard to use our analyzer on Linux, and that you should try it out on your project without hesitation.

0446_LLVM/image2.png

The Linux version of the analyzer is available for downloading on this page: http://www.viva64.com/en/pvs-studio-download-linux/

The previous projects were checked with a universal mechanism that tracks the compiler runs. This time we will use the information that PVS-Studio takes from the JSON Database Compilation for the analysis. Details can be found in the section "How to run PVS-Studio on Linux".

In LLVM 3.9 we have completely ceased using autoconf in favor of Cmake, and it was a good reason to try the support for JSON Compilation Database. What is it? This is a format used by the Clang utilities. It stores a list of compiler calls in the following way:

[
  {
    "directory": "/home/user/llvm/build",
    "command": "/usr/bin/c++ .... file.cc",
    "file": "file.cc"
  },
  ....
]

It's very simple to get such a file for CMake-projects - you simply generate the project with an additional option:

cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=On ../llvm

After that there will be compile_commands.json in the current directory. It is this file that we need. Let's build the project first, because some projects use code generation.

make -j8

Now everything is ready for analysis. It starts with a single line:

pvs-studio-analyzer analyze -l ~/PVS-Studio.lic -o PVS-Studio.log -j

You can get compile_commands.json with the help of the Bear utility, for projects that aren't using CMake. But for complex assembly systems actively using environment variables or cross-compilation, the commands don't always provide detailed information about the translation unit.

Note N1. How to work with the report of PVS-Studio in Linux.

Note N2. We provide high-quality and quick support for our clients and potential users. So, if something is unclear or does not work, please contact us at support. You will like our service.

The analysis results

By the way, this is not the first check of LLVM. The article was inspired by previous checks:

Unfortunately, I can't say anything about the number of false positives or the density of errors found. The project is large, there are a lot of warnings, and I had quite a quick look at them. As an excuse I can say that the preparation for the Linux version of PVS-Studio took a lot of time, so I could not work on the article alone.

Enough talking, let's move on to the most interesting material. Let's take a look at the suspicious fragments in the LLVM code which PVS-Studio detected.

Non bit fields

So we have such enumeration in the code:

enum Type {
  ST_Unknown, // Type not specified
  ST_Data,
  ST_Debug,
  ST_File,
  ST_Function,
  ST_Other
};

This is a "classical enumeration", if we can say so. Each name in the enumeration is assigned with an integer value that corresponds to a specific place in the order of the values in the enumeration:

  • ST_Unknown = 0
  • ST_Data = 1
  • ST_Debug = 2
  • ST_File = 3
  • ST_Function = 4
  • ST_Other = 5

Again, let me emphasize that this is just enumeration, not a set of masks. If the constants could be combined, they would be a power of 2.

Now it's time to look at the code, where this enumeration is used incorrectly:

void MachODebugMapParser::loadMainBinarySymbols(....)
{
  ....
  SymbolRef::Type Type = *TypeOrErr;
  if ((Type & SymbolRef::ST_Debug) ||
      (Type & SymbolRef::ST_Unknown))
    continue;
  ....
}

PVS-Studio warning: V616 The 'SymbolRef::ST_Unknown' named constant with the value of 0 is used in the bitwise operation. MachODebugMapParser.cpp 448

Let's recall from memory, that the ST_Unknown constant is zero. Therefore, you can shorten the expression:

if (Type & SymbolRef::ST_Debug)

Clearly something is wrong here. Apparently the programmer who wrote this code decided that he is working with enumeration consisting of flags. That is, he expected that one or another bit matches every constant. But it is not so. I think the correct check should be like this:

if ((Type == SymbolRef::ST_Debug) || (Type == SymbolRef::ST_Unknown))

I think, enum class should have been used here to avoid such errors. In this case, an incorrect expression simply would not be compiled.

Single-iteration loops

The function is not a very complicated one, so I decided to cite it entirely. Before continuing to read the article, I suggest you try to guess what is suspicious here.

Parser::TPResult Parser::TryParseProtocolQualifiers() {
  assert(Tok.is(tok::less) && "Expected '<' for qualifier list");
  ConsumeToken();
  do {
    if (Tok.isNot(tok::identifier))
      return TPResult::Error;
    ConsumeToken();
    
    if (Tok.is(tok::comma)) {
      ConsumeToken();
      continue;
    }
    
    if (Tok.is(tok::greater)) {
      ConsumeToken();
      return TPResult::Ambiguous;
    }
  } while (false);
  
  return TPResult::Error;
}

PVS-Studio warning: V696 The 'continue' operator will terminate 'do { ... } while (FALSE)' loop because the condition is always false. Check lines: 1642, 1649. ParseTentative.cpp 1642

LLVM developers, of course, will be able to understand if there is a bug here or not. I have to play detective. Looking at the code, I was thinking in the following direction: The function should read the opening bracket '<', then it reads the identifiers and commas in the loop. If there is no comma, we expected a closing bracket. If something goes wrong, then the function returns an error code. I reckon there was supposed to be the following algorithm of the function work (pseudocode):

  • The beginning of the loop:
  • Read the identifier. If this is not an identifier, then return the status of an error.
  • Read the comma. If it is a comma, go back to the beginning of the loop.
  • Yup, we do not have a comma. If it is a closing parenthesis, then everything is fine, we exit from the function.
  • Otherwise, return the status of an error.

The trouble is that the programmer tries to resume the loop with the help of the continue operator. It passes the control not to the beginning of the loop body, but to the check of the condition of the loop continuation. And the condition is always false. As a result, the loop exits, and the algorithm becomes the following:

  • The beginning of the loop:
  • Read the identifier. If this is not an identifier, then return the status of an error.
  • Read the comma. If it is a comma, complete the loop and return an error status from the function.
  • Yup, we do not have a comma. If it is a closing parenthesis, then everything is fine, we exit from the function.
  • Otherwise, return the status of an error.

Thus, only the sequence from one element enclosed in square brackets can be correct. If there is more than one item in the sequence, separated by a comma, then the function will return an error status:TPResult::Error.

Now let's consider another case, when not more than one loop iteration is executed:

static bool checkMachOAndArchFlags(....) {
  ....
  unsigned i;
  for (i = 0; i < ArchFlags.size(); ++i) {
    if (ArchFlags[i] == T.getArchName())
      ArchFound = true;
    break;
  }
  ....
}

PVS-Studio warning: V612 An unconditional 'break' within a loop. MachODump.cpp 1206

Pay attention to the break statement. It will break the loop after the first iteration. I think the break statement must refer to a condition, so the correct code will look like this:

for (i = 0; i < ArchFlags.size(); ++i) {
  if (ArchFlags[i] == T.getArchName())
  {
    ArchFound = true;
    break;
  }
}

There are two more similar fragments, but in order not to make the article too long, I'll only copy the analyzer warnings here:

  • V612 An unconditional 'return' within a loop. R600OptimizeVectorRegisters.cpp 54
  • V612 An unconditional 'break' within a loop. llvm-size.cpp 525

The || and && operators are mixed up

static bool containsNoDependence(CharMatrix &DepMatrix,
                                 unsigned Row,
                                 unsigned Column) {
  for (unsigned i = 0; i < Column; ++i) {
    if (DepMatrix[Row][i] != '=' || DepMatrix[Row][i] != 'S' ||
        DepMatrix[Row][i] != 'I')
      return false;
  }
  return true;
}

PVS-Studio warning: V547 Expression is always true. Probably the '&&' operator should be used here. LoopInterchange.cpp 208

The expression makes no sense. I'll simplify the code to highlight the essence of the error:

if (X != '=' || X != 'S' || X != 'I')

The variable X will never be equal to something. As a result, the condition is always true. Most likely, instead of the "||" operators, the "&&" should have been used, then the expression would make sense.

A function returns a reference to a local object

SingleLinkedListIterator<T> &operator++(int) {
  SingleLinkedListIterator res = *this;
  ++*this;
  return res;
}

PVS-Studio warning: V558 Function returns the reference to temporary local object: res. LiveInterval.h 679

The function is a traditional implementation of a postfix increment:

  • The current state is stored in a temporary object;
  • The current state of an object gets changed;
  • The old state of an object returns.

The error is that the function returns a reference. This reference is not valid, because the temporary object res gets destroyed when the function is exited.

To fix this, you need to return a value, rather than a reference:

SingleLinkedListIterator<T> operator++(int) { .... }

Repeated assignment

I'll copy the entire function, so as to show that before the repeating assignment the variable ZeroDirective isn't used in any way.

HexagonMCAsmInfo::HexagonMCAsmInfo(const Triple &TT) {
  Data16bitsDirective = "\t.half\t";
  Data32bitsDirective = "\t.word\t";
  Data64bitsDirective = nullptr;
  ZeroDirective = "\t.skip\t";                            // <=
  CommentString = "//";

  LCOMMDirectiveAlignmentType = LCOMM::ByteAlignment;
  InlineAsmStart = "# InlineAsm Start";
  InlineAsmEnd = "# InlineAsm End";
  ZeroDirective = "\t.space\t";                           // <=
  AscizDirective = "\t.string\t";

  SupportsDebugInformation = true;
  MinInstAlignment = 4;
  UsesELFSectionDirectiveForBSS  = true;
  ExceptionsType = ExceptionHandling::DwarfCFI;
}

PVS-Studio warning: V519 The 'ZeroDirective' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 25, 31. HexagonMCAsmInfo.cpp 31

The variable ZeroDirective is a simple pointer of const char * type. In the beginning it points to a string "\t.skip\t", but further on it is assigned with a line address "\t.space\t". It's weird, and doesn't make sense. There is a high probability that one of the assignments should change a completely different variable.

Let's look at another case of repetitive assignment.

template <class ELFT>
void GNUStyle<ELFT>::printFileHeaders(const ELFO *Obj) {
  ....
  Str = printEnum(e->e_ident[ELF::EI_OSABI], makeArrayRef(ElfOSABI));
  printFields(OS, "OS/ABI:", Str);
  Str = "0x" + to_hexString(e->e_version);                  // <=
  Str = to_hexString(e->e_ident[ELF::EI_ABIVERSION]);       // <=
  printFields(OS, "ABI Version:", Str);
  Str = printEnum(e->e_type, makeArrayRef(ElfObjectFileType));
  printFields(OS, "Type:", Str);
  ....
}

PVS-Studio warning: V519 The 'Str' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 2407, 2408. ELFDumper.cpp 2408

Apparently we are dealing with a typo. Instead of doing the reassignment, the programmer had to link two lines with the help of += operator. Then the correct code could be like this:

Str = "0x" + to_hexString(e->e_version);
Str += to_hexString(e->e_ident[ELF::EI_ABIVERSION]);

There are several more code fragments with the repeated assignment. In my opinion, these repetitive assignments do not pose any danger, so I'll just copy the warnings as a list:

  • V519 The variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 55, 57. coff2yaml.cpp 57
  • V519 The 'O' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 394, 395. llvm-pdbdump.cpp 395
  • V519 The 'servAddr.sin_family' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 63, 64. server.cpp 64

Suspicious handling of smart pointers

Expected<std::unique_ptr<PDBFile>>
PDBFileBuilder::build(
  std::unique_ptr<msf::WritableStream> PdbFileBuffer)
{
  ....
  auto File = llvm::make_unique<PDBFile>(
    std::move(PdbFileBuffer), Allocator);

  File->ContainerLayout = *ExpectedLayout;

  if (Info) {
    auto ExpectedInfo = Info->build(*File, *PdbFileBuffer);
  ....
}

PVS-Studio warning: V522 Dereferencing of the null pointer 'PdbFileBuffer' might take place. PDBFileBuilder.cpp 106

The code is not clear to me, as I have not studied what llvm::make_unique is, and how it works in general. Nevertheless, both myself and the analyzer are confused by the fact that at first glance the possession of an object from a smart pointer PdbFileBuffer goes to File. After that we have dereferencing of a null pointer PdbFileBuffer that already contains nullptr. Specifically, this fragment looks strange:

.... llvm::make_unique<PDBFile>(::move(PdbFileBuffer), Allocator);
....
.... Info->build(*File, *PdbFileBuffer);

If this is a bug, then it should be fixed in 3 more fragments in the same file:

  • V522 Dereferencing of the null pointer 'PdbFileBuffer' might take place. PDBFileBuilder.cpp 113
  • V522 Dereferencing of the null pointer 'PdbFileBuffer' might take place. PDBFileBuilder.cpp 120
  • V522 Dereferencing of the null pointer 'PdbFileBuffer' might take place. PDBFileBuilder.cpp 127

A typo in the condition

static bool areExclusiveRanges(BinaryOperatorKind OpcodeLHS,
                               const APSInt &ValueLHS,
                               BinaryOperatorKind OpcodeRHS,
                               const APSInt &ValueRHS) {
  ....
  // Handle cases where the constants are different.
  if ((OpcodeLHS == BO_EQ ||
       OpcodeLHS == BO_LE ||                 // <=
       OpcodeLHS == BO_LE)                   // <=
      &&
      (OpcodeRHS == BO_EQ ||
       OpcodeRHS == BO_GT ||
       OpcodeRHS == BO_GE))
    return true;
  ....
}

PVS-Studio warning: V501 There are identical sub-expressions 'OpcodeLHS == BO_LE' to the left and to the right of the '||' operator. RedundantExpressionCheck.cpp 174

This is a classic typo. The variable OpcodeLHS is compared with the BO_LE constant twice. It seems to me that one of the BO_LE constants should be replaced by BO_LT. As you can see the names of the constants are very similar and can be easily confused.

The following example demonstrates how static analysis complements other methodologies of writing high quality code. Let's inspect the incorrect code:

std::pair<Function *, Function *>
llvm::createSanitizerCtorAndInitFunctions(
    ....
    ArrayRef<Type *> InitArgTypes, ArrayRef<Value *> InitArgs,
    ....)
{
  assert(!InitName.empty() && "Expected init function name");
  assert(InitArgTypes.size() == InitArgTypes.size() &&
    "Sanitizer's init function expects "
    "different number of arguments");
  ....
}

PVS-Studio warning: V501 There are identical sub-expressions 'InitArgTypes.size()' to the left and to the right of the '==' operator. ModuleUtils.cpp 107

One of the many good ways to improve code safety, is to use assert() macros. This macro, and ones similar to it, help to detect various errors at the developmental stage, and during debugging. But I won't go into detail here about the benefits of such macros, as it is beyond the scope of this article.

It is important to us that the assert() macros are used in the function createSanitizerCtorAndInitFunctions() to check the correctness of the input data. Too bad the second assert() macro is useless, because of a typo.

Fortunately, the static analyzer is of great help here, as it notices that the array size is compared with itself. As a result we can fix this check, and the correct condition in assert() may help to prevent some other error in the future.

Apparently, in the condition the array sizes InitArgTypes and InitArgs should be compared:

assert(InitArgTypes.size() == InitArgs.size() &&
  "Sanitizer's init function expects "
  "different number of arguments");

Confusion between the release() and reset()

In the std::unique_ptr class there are two functions with similar names: release and reset. My observations show, they are sometimes confused. Apparently this is what happened here:

std::unique_ptr<DiagnosticConsumer> takeClient()
  { return std::move(Owner); }

VerifyDiagnosticConsumer::~VerifyDiagnosticConsumer() {
  ....
  SrcManager = nullptr;
  CheckDiagnostics();
  Diags.takeClient().release();
}

PVS-Studio warning: V530 The return value of function 'release' is required to be utilized. VerifyDiagnosticConsumer.cpp 46

Perhaps there is no error here, and the programmer used some tricky logic. But it looks more like a resource leak. In any case, the developers should take one more look at this code fragment.

Redundant conditions

bool ARMDAGToDAGISel::tryT1IndexedLoad(SDNode *N) {
  LoadSDNode *LD = cast<LoadSDNode>(N);
  EVT LoadedVT = LD->getMemoryVT();
  ISD::MemIndexedMode AM = LD->getAddressingMode();
  if (AM == ISD::UNINDEXED ||
      LD->getExtensionType() != ISD::NON_EXTLOAD ||
      AM != ISD::POST_INC ||
      LoadedVT.getSimpleVT().SimpleTy != MVT::i32)
    return false;
  ....
}

PVS-Studio warning: V590 Consider inspecting this expression. The expression is excessive or contains a misprint. ARMISelDAGToDAG.cpp 1565

The condition is long, so I'll highlight the most important part:

AM == ISD::UNINDEXED || AM != ISD::POST_INC

This condition is redundant, and you can simplify it to:

AM != ISD::POST_INC

So, we see here redundancy in the condition, or some error. There is a chance, that the redundancy shows that some other condition was meant here. I cannot judge how dangerous this is, but it is certainly worth reviewing. Also, I want to draw the developers attention to two more analyzer warnings:

  • V590 Consider inspecting this expression. The expression is excessive or contains a misprint. ASTReader.cpp 4178
  • V590 Consider inspecting this expression. The expression is excessive or contains a misprint. BracesAroundStatementsCheck.cpp 46

My favorite V595 warnings

Pointers in C and C++ - an endless headache for programmers. You verify them against null, and then somewhere there is null pointer dereference again! The V595 diagnostic detects situations where the verification against null is done too late. Before this check the pointer already gets used. This is one of the most typical errors that we find in the code of various applications (proof). However, speaking in support of C/C++, I will say that the situation in C# is not much better. Despite the fact that the C# pointers are now called references, such bugs haven't disappeared (proof).

Let's go back to the LLVM code and look at a simple variant of the bug:

bool PPCDarwinAsmPrinter::doFinalization(Module &M) {
  ....
  MachineModuleInfoMachO &MMIMacho =
      MMI->getObjFileInfo<MachineModuleInfoMachO>();

  if (MAI->doesSupportExceptionHandling() && MMI) {
  ....
}

PVS-Studio warning: V595 The 'MMI' pointer was utilized before it was verified against nullptr. Check lines: 1357, 1359. PPCAsmPrinter.cpp 1357

The case is simple, and everything is quite obvious. The check (... && MMI) tells us that the pointer MMI can be null. If so, the program won't get to this check during execution. It will be terminated earlier because of the null pointer dereference.

Let's look at one more code fragment:

void Sema::CodeCompleteObjCProtocolReferences(
  ArrayRef<IdentifierLocPair> Protocols)
{
  ResultBuilder 
    Results(*this, CodeCompleter->getAllocator(),
            CodeCompleter->getCodeCompletionTUInfo(),
            CodeCompletionContext::CCC_ObjCProtocolName);
  
  if (CodeCompleter && CodeCompleter->includeGlobals()) {
    Results.EnterNewScope();
  ....
}

PVS-Studio warning: V595 The 'CodeCompleter' pointer was utilized before it was verified against nullptr. Check lines: 5952, 5955. SemaCodeComplete.cpp 5952

The pointer CodeCompleter gets dereferenced first, and further on there is verification of the pointer against null. The same code was detected three more times in the same file:

  • V595 The 'CodeCompleter' pointer was utilized before it was verified against nullptr. Check lines: 5980, 5983. SemaCodeComplete.cpp 5980
  • V595 The 'CodeCompleter' pointer was utilized before it was verified against nullptr. Check lines: 7455, 7458. SemaCodeComplete.cpp 7455
  • V595 The 'CodeCompleter' pointer was utilized before it was verified against nullptr. Check lines: 7483, 7486. SemaCodeComplete.cpp 7483

These were simple cases, but sometimes the code is more complex, where it's hard to say how dangerous it is. So my suggestion to the developers, is to check the following fragments of the LLVM code:

  • V595 The 'Receiver' pointer was utilized before it was verified against nullptr. Check lines: 2543, 2560. SemaExprObjC.cpp 2543
  • V595 The 'S' pointer was utilized before it was verified against nullptr. Check lines: 1267, 1296. SemaLookup.cpp 1267
  • V595 The 'TargetDecl' pointer was utilized before it was verified against nullptr. Check lines: 4037, 4046. CGExpr.cpp 4037
  • V595 The 'CurrentToken' pointer was utilized before it was verified against nullptr. Check lines: 705, 708. TokenAnnotator.cpp 705
  • V595 The 'FT' pointer was utilized before it was verified against nullptr. Check lines: 540, 554. Expr.cpp 540
  • V595 The 'II' pointer was utilized before it was verified against nullptr. Check lines: 448, 450. IdentifierTable.cpp 448
  • V595 The 'MF' pointer was utilized before it was verified against nullptr. Check lines: 268, 274. X86RegisterInfo.cpp 268
  • V595 The 'External' pointer was utilized before it was verified against nullptr. Check lines: 40, 45. HeaderSearch.cpp 40
  • V595 The 'TLI' pointer was utilized before it was verified against nullptr. Check lines: 4239, 4244. CodeGenPrepare.cpp 4239
  • V595 The 'SU->getNode()' pointer was utilized before it was verified against nullptr. Check lines: 292, 297. ResourcePriorityQueue.cpp 292
  • V595 The 'BO0' pointer was utilized before it was verified against nullptr. Check lines: 2835, 2861. InstCombineCompares.cpp 2835
  • V595 The 'Ret' pointer was utilized before it was verified against nullptr. Check lines: 2090, 2092. ObjCARCOpts.cpp 2090

Strange code

I apologize that I cite here a hard to read code fragment. A little more patience, please, the article is almost at an end.

static bool print_class_ro64_t(....) {
  ....
  const char *r;
  uint32_t offset, xoffset, left;
  ....
  r = get_pointer_64(p, offset, left, S, info);
  if (r == nullptr || left < sizeof(struct class_ro64_t))
    return false;
  memset(&cro, '\0', sizeof(struct class_ro64_t));
  if (left < sizeof(struct class_ro64_t)) {
    memcpy(&cro, r, left);
    outs() << "   (class_ro_t entends past the .......)\n";
  } else
    memcpy(&cro, r, sizeof(struct class_ro64_t));
  ....
}

PVS-Studio warning: V649 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains function return. This means that the second 'if' statement is senseless. Check lines: 4410, 4413. MachODump.cpp 4413

Pay attention to the check:

if (.... || left < sizeof(struct class_ro64_t))
  return false;

If the value in the left variable is less than the class size, then the function will exit. It turns out that this choice of behavior does not make sense:

if (left < sizeof(struct class_ro64_t)) {
  memcpy(&cro, r, left);
  outs() << "   (class_ro_t entends past the .......)\n";
} else
  memcpy(&cro, r, sizeof(struct class_ro64_t));

The condition is always false, and therefore the else-branch always executes. This is very strange. Perhaps the program contains a logical error, or, we are dealing with a typo.

This place also needs some revision:

  • V649 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains function return. This means that the second 'if' statement is senseless. Check lines: 4612, 4615. MachODump.cpp 4615

A couple of small notes

A class SequenceNumberManager is declared inside a template class RPC. It has such a move assignment operator:

SequenceNumberManager &operator=(SequenceNumberManager &&Other) {
  NextSequenceNumber = std::move(Other.NextSequenceNumber);
  FreeSequenceNumbers = std::move(Other.FreeSequenceNumbers);
}

PVS-Studio warning: V591 Non-void function should return a value. RPCUtils.h 719

As you can see return was forgotten in the end:

return *this;

Actually there is nothing terrible here. Compilers generally do not work with bodies of functions of template classes in any way, if these functions are not used. Apparently, we have this case here. Although I have not tested it, I am quite sure: if you call this move operator, the compiler will generate an error, or yield a warning. So, there is nothing wrong here, but I decided to point out this flaw.

There were several strange code fragments, where the value of the pointer returned by the new operator is verified against null. This code does not make sense, because if you are not able to allocate the memory, the exception std::bad_alloc will be thrown. Here's one such place:

LLVMDisasmContextRef LLVMCreateDisasmCPUFeatures(....) {
  ....
  // Set up the MCContext for creating symbols and MCExpr's.
  MCContext *Ctx = new MCContext(MAI, MRI, nullptr);
  if (!Ctx)
    return nullptr;
  ....
}

PVS-Studio warning: V668 There is no sense in testing the 'Ctx' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. Disassembler.cpp 76

Two more warnings:

  • V668 There is no sense in testing the 'DC' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. Disassembler.cpp 103
  • V668 There is no sense in testing the 'JITCodeEntry' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. GDBRegistrationListener.cpp 180

These code fragments don't look dangerous, so I decided to describe them in the section on unimportant warnings. Most likely, all three of these checks can simply be removed.

Conclusion

As you can see, the compiler warnings are good, but they are not enough. Specialized tools for static analysis, such as PVS-Studio will always outpace the compilers in diagnostic capabilities and configuration flexibility working with false positives. That's actually how the analyzer developers make money.

It's also important to note that the main effect from static analysis, will only be achieved with regular use of static code analyzers. A lot of errors will be detected at the earliest stage, so there won't be need to debug, or ask users to give a detailed description of the actions that led to the program crash. In the static analysis we have the warnings similar to the warnings of a compiler (actually, they are almost the same, but more intelligent). I think that everybody always checks the compiler warnings, not just one a month?!

I suggest downloading and trying out PVS-Studio on your project's code.



Comments (0)

Next comments next comments
close comment form