23.12.2015

New Year PVS-Studio 6.00 Release: Scanning Roslyn

The long wait is finally over. We have released a static code analyzer PVS-Studio 6.00 that supports the analysis of C# projects. It can now analyze projects written in languages C, C++, C++/CLI, C++/CX, and C#. For this release, we have prepared a report based on the analysis of open-source project Roslyn. It is thanks to Roslyn that we were able to add the C# support to PVS-Studio, and we are very grateful to Microsoft for this project.

Picture 1

PVS-Studio 6.00

PVS-Studio is a static code analyzer designed to detect software bugs at the coding stage and created to be easy-to-use.

We regularly add new diagnostic rules to enable the search of new bug patterns in applications written in C/C++. For example, we have recently added search of class members that are not initialized in constructors, and it was quite a non-trivial task indeed. However, improving the diagnostics can't be deemed a reason for upgrading the product's major version number; for this we waited until we brought something really new to our analyzer. And now, it has finally happened. Meet the sixth version of PVS-Studio with support for C# programming language.

PVS-Studio 6.00 trial version can be downloaded here:

http://www.viva64.com/en/pvs-studio-download/

In the sixth version of our tool, we ended support for older Visual Studio versions, VS2005 and VS2008. If your team still uses one of these, we suggest that you stick with the previous version of PVS-Studio, 5.31, or its updates if there are any.

The demo version has just one limitation. You have 50 click-jumps to the code. Once you've used these, the tool will suggest filling out a small questionnaire. If you agree, you'll be granted 50 more jumps.

While the demo version's limitation may seem severe, we experimented a lot before we shaped this limitation, and there are solid reasons behind it.

The small number of "clicks" will prompt you to get engaged in communication via email sooner. If you want to view the other messages, we can grant you a registration key for, say, one week. Just send us an email. Communication by email usually allows us to help users start taking advantage of the analyzer quicker and easier.

Roslyn

Programmers are immune to ads. You can't lure them with words like "mission", "perfectly", and "innovation-focused". Every programmer is accustomed to ignoring ads and knows how to disable banners using Adblock Plus.

The only way to attract their attention is to show how they can benefit by using a particular software tool. It is this path we have taken; we show how useful static analysis tools can be by scanning open-source projects.

PVS-Studio can find bugs in well-known projects such as CoreCLR, LibreOffice, Linux Kernel, Qt, Unreal Engine 4, Chromium, and so on. By now, our team has scanned 230 open-source projects and found a total of 9355 bugs. Yes, that's right: 9355 is the number of bugs, not diagnostic messages. To read about the most interesting scans, see the corresponding articles.

Now we've finally got to C# projects too. It is little wonder that we picked Roslyn as one of the first projects to be analyzed. After all, it is thanks to this software that we were given the opportunity to support C#-code analysis in PVS-Studio.

I'd like to thank the Microsoft company for developing this open-source project; it will surely have a significant impact on the C#-infrastructure's development. Well, it has already started to happen! For example, such famous projects as ReSharper and CodeRush are being ported to the Roslyn platform.

Now a few words about the Roslyn project itself.

.NET Compiler Platform, better known by its codename "Roslyn", is a set of open-source compilers and code analysis APIs for C# and Visual Basic .NET languages from Microsoft.

The project notably includes self-hosting versions of the C# and VB.NET compilers - compilers written in the languages themselves. The compilers are available via the traditional command-line programs, but also as APIs available natively from within .NET code. Roslyn exposes modules for syntactic (lexical) analysis of code, semantic analysis, dynamic compilation to CIL, and code emission. Roslyn APIs are of three types, namely feature APIs, work-space APIs and compiler APIs. Feature APIs make the refactoring and fixing process easier. Work-space APIs allow plugin developers to perform actions specifically required in integrated development environments (IDEs) like Visual Studio such as finding references of a variable or code formatting. Compiler APIs allow even more sophisticated analysis of source code, by exposing direct calls to perform syntax tree and binding flow analysis.

References:

Bugs found

There are not many errors that PVS-Studio found in Roslyn, and this is not surprising for such a famous project, especially since it is developed by Microsoft, with their long established quality-control systems. Finding anything at all in Roslyn's code would be already a victory, and we are proud that we can do it.

Many of the bugs discussed refer to tests or error handlers, and this is normal too. Errors in the code which gets executed most frequently are fixed through testing and user reports. But it is the fact that PVS-Studio can catch these bugs beforehand that matters. It means that many bugs, which previously took much time and effort to fix, could potentially be fixed immediately by using PVS-Studio on a regular basis.

In other words: it is regular use that makes the analyzer valuable, not casual runs. View static code analysis as an extension to compiler warnings. It's not a good idea to enable compiler warnings once a year, is it? You have to address them as they appear; it helps save development time on tracking down silly mistakes through debugging. It's just the same with static analyzers.

Let's see now what interesting bugs we managed to find with PVS-Studio in the Roslyn project:

Error No. 1, in tests. Copy-Paste.

public void IndexerMemberRace()
{
  ....
  for (int i = 0; i < 20; i++)
  {
    ....
    if (i % 2 == 0)
    {
      thread1.Start();
      thread2.Start();
    }
    else
    {
      thread1.Start();
      thread2.Start();
    }
    ....
  }
  ....
}

PVS-Studio diagnostic message: V3004 The 'then' statement is equivalent to the 'else' statement. GetSemanticInfoTests.cs 2269

This is an example of errors found in tests. They can live there for years since they don't cause any trouble. It's just that the test doesn't check all that it was meant to. In both branches, thread 1 starts all the time, followed by thread 2. The code was most likely meant to look like this:

if (i % 2 == 0)
{
  thread1.Start();
  thread2.Start();
}
else
{
  // Threads' start order changed
  thread2.Start();
  thread1.Start();
}

Error No. 2, in tests. Typo.

public DiagnosticAsyncToken(
  AsynchronousOperationListener listener,
  string name,
  object tag,
  string filePath,
  int lineNumber)
  : base(listener)
{
  Name = Name;
  Tag = tag;
  FilePath = filePath;
  LineNumber = lineNumber;
  StackTrace = PortableShim.StackTrace.GetString();
}

PVS-Studio diagnostic message: V3005 The 'Name' variable is assigned to itself. AsynchronousOperationListener.DiagnosticAsyncToken.cs 32

It's not easy to spot the error here. Poor variable naming is to blame. Class attributes' names differ from those of function arguments only in having the first letter capitalized. That way, it's easy to make a typo, and that's exactly what happened: Name = Name.

Errors No. 3, No. 4. Copy-Paste.

private Task<SyntaxToken>
GetNewTokenWithRemovedOrToggledPragmaAsync(....)
{
  var result = isStartToken
    ? GetNewTokenWithPragmaUnsuppress(
        token, indexOfTriviaToRemoveOrToggle, _diagnostic, Fixer,
        isStartToken, toggle)
    : GetNewTokenWithPragmaUnsuppress(
        token, indexOfTriviaToRemoveOrToggle, _diagnostic, Fixer,
        isStartToken, toggle);
        
  return Task.FromResult(result);
}

PVS-Studio diagnostic message: V3012 The '?:' operator, regardless of its conditional expression, always returns one and the same value. AbstractSuppressionCodeFixProvider.RemoveSuppressionCodeAction_Pragma.cs 177

No matter what value 'isStartToken' refers to, the GetNewTokenWithPragmaUnsuppress() function is called with the same set of arguments. It seems that the function call was duplicated through Copy-Paste and the programmer forgot to edit the copy.

Here is another similar case:

private void DisplayDiagnostics(....)
{
  ....
  _console.Out.WriteLine(
    string.Format((notShown == 1) ?
      ScriptingResources.PlusAdditionalError :
      ScriptingResources.PlusAdditionalError,
      notShown));
  ....
}

PVS-Studio diagnostic message: V3012 The '?:' operator, regardless of its conditional expression, always returns one and the same value: ScriptingResources.PlusAdditionalError. CommandLineRunner.cs 428

Errors No. 5, No. 6. Carelessness.

I haven't collected many statistics on typical mistakes made by C# programmers yet, but, besides ordinary typos, there is obviously one bug pattern that's in the lead. It has to do with casting a reference using the 'as' operator, and checking the source reference instead of the resulting one. Here's a synthetic example:

var A = B as T;
if (B) A.Foo();

And this is how this bug looks in real-life code:

public override bool Equals(object obj)
{
  var d = obj as DiagnosticDescription;

  if (obj == null)
    return false;
    
  if (!_code.Equals(d._code))
    return false;
  ....
}

PVS-Studio diagnostic message: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'obj', 'd'. DiagnosticDescription.cs 201

The next example is lengthier, but the problem is the same:

protected override bool AreEqual(object other)
{
  var otherResourceString = other as LocalizableResourceString;
  return
    other != null &&
    _nameOfLocalizableResource == 
      otherResourceString._nameOfLocalizableResource &&
    _resourceManager == otherResourceString._resourceManager &&
    _resourceSource == otherResourceString._resourceSource &&
    ....
}

PVS-Studio diagnostic message: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'other', 'otherResourceString'. LocalizableResourceString.cs 121

Error No. 7. Double detection.

Sometimes a bug may trigger two, or even three, different warnings. Here's such an example:

private bool HasMatchingEndTag(
  XmlElementStartTagSyntax parentStartTag)
{
  if (parentStartTag == null)
  {
    return false;
  }

  var parentElement = parentStartTag.Parent as XmlElementSyntax;
  if (parentStartTag == null)
  {
    return false;
  }
  var endTag = parentElement.EndTag;
  ....
}

PVS-Studio diagnostic messages:

  • V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'parentStartTag', 'parentElement'. XmlTagCompletionCommandHandler.cs 123
  • V3021 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the second 'if' statement is senseless XmlTagCompletionCommandHandler.cs 117

In the beginning of the function's body, the 'parentStartTag' argument is checked for null. If it is null, the function returns.

After that, the programmer wanted to check if the reference really points to a class of type 'XmlElementSyntax', but at this point, a typo sneaked in. Instead of 'parentElement', 'parentStartTag' is checked for the second time.

The analyzer detects two anomalies at once here. The first has to do with rechecking the value of 'parentStartTag' as it doesn't make sense, since the function has already returned if it was a null reference. The second deals with the analyzer's suspicions that a wrong variable might be checked after the 'as' operator.

The fixed version of that code should look like this:

if (parentStartTag == null)
{
  return false;
}
var parentElement = parentStartTag.Parent as XmlElementSyntax;
if (parentElement == null)
{
  return false;
}

Errors No. 8, No. 9. Copy-Paste.

Sorry for a long sample, but I didn't feel it is right to replace it with a synthetic one:

internal static bool ReportConflictWithParameter(....)
{
  ....
  if (newSymbolKind == SymbolKind.Parameter ||
      newSymbolKind == SymbolKind.Local)
  {
    diagnostics.Add(ErrorCode.ERR_LocalSameNameAsTypeParam,
                    newLocation, name);
    return true;
  }
  if (newSymbolKind == SymbolKind.TypeParameter)
  {
    return false;
  }
  if (newSymbolKind == SymbolKind.Parameter ||
      newSymbolKind == SymbolKind.Local)
  {
    diagnostics.Add(ErrorCode.ERR_LocalSameNameAsTypeParam,
                    newLocation, name);
    return true;
  }
  ....
}

PVS-Studio diagnostic message: V3021 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the second 'if' statement is senseless InMethodBinder.cs 264

In this code, the first and the third 'if' statements are the same. It's probably because the programmer copied a code block and forgot to change it. On the other hand, the duplicate 'if' might just be an extra line, and should be removed.

There was another code fragment like that, but don't worry, I won't make you read it. Just note the diagnostic message:

V3021 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the second 'if' statement is senseless WithLambdaParametersBinder.cs 131

Error No. 10. Incorrect condition.

public enum TypeCode
{
  ....
  Object = 1,
  ....
  DateTime = 16,
  ....
}

static object GetHostObjectValue(Type lmrType, object rawValue)
{
  var typeCode = Metadata.Type.GetTypeCode(lmrType);
  return (lmrType.IsPointer || lmrType.IsEnum ||
          typeCode != TypeCode.DateTime ||
          typeCode != TypeCode.Object)
            ? rawValue : null;
}

PVS-Studio diagnostic message: V3022 Expression 'lmrType.IsPointer || lmrType.IsEnum || typeCode != TypeCode.DateTime || typeCode != TypeCode.Object' is always true. DkmClrValue.cs 136

The expression is pretty complicated, so here's the gist of it:

(typeCode != 1 || typeCode != 16)

This expression is always true, no matter what value the 'typeCode' variable refers to.

Error No. 11. Redundant condition.

public enum EventCommand
{
  Disable = -3,
  Enable = -2,
  SendManifest = -1,
  Update = 0
}

protected override void OnEventCommand(
  EventCommandEventArgs command)
{
  base.OnEventCommand(command);

  if (command.Command == EventCommand.SendManifest ||
      command.Command != EventCommand.Disable ||
       FunctionDefinitionRequested(command))
  ....
}

PVS-Studio diagnostic message: V3023 Consider inspecting this expression. The expression is excessive or contains a misprint. RoslynEventSource.cs 79

Again, the main idea is this:

if (A == -1 || A != -3)

This expression is either incorrect or redundant and can be reduced to the following:

if (A != -3)

Error No. 12. Logging error

static CompilerServerLogger()
{
  ....
  loggingFileName = Path.Combine(loggingFileName,
    string.Format("server.{1}.{2}.log",
                  loggingFileName,
                  GetCurrentProcessId(),
                  Environment.TickCount));
  ....
}

PVS-Studio diagnostic message: V3025 Incorrect format. A different number of format items is expected while calling 'Format' function. Expected: 2. Present: 3. CompilerServerLogger.cs 49

The 'loggingFileName' variable is not used in any way in the call on function Format(). It doesn't look right.

Error No. 13, in error handler.

private const string WriteFileExceptionMessage =
  @"{1}
  To reload the Roslyn compiler package, close Visual Studio and
  any MSBuild processes, then restart Visual Studio.";
  
private void WriteMSBuildFiles(....)
{
  ....
  catch (Exception e)
  {
    VsShellUtilities.ShowMessageBox(
      this,
      string.Format(WriteFileExceptionMessage, e.Message),
      null,
      OLEMSGICON.OLEMSGICON_WARNING,
      OLEMSGBUTTON.OLEMSGBUTTON_OK,
      OLEMSGDEFBUTTON.OLEMSGDEFBUTTON_FIRST);
  }
}

PVS-Studio diagnostic message: V3025 Incorrect format. A different number of format items is expected while calling 'Format' function. Expected: 2. Present: 1. CompilerPackage.cs 105

An exception is very likely to be raised when the program tries displaying the Message Box. The reason is that the Format() function is trying to print the second additional argument which is absent.

I suspect that the constant format string should begin with the following:

@"{0} 

Errors No. 14, No. 15, in error handler.

I can't agree with the statement that function DumpAttributes() is not used in any way for now. There are two bugs at once found in it, each of which triggers an exception:

private void DumpAttributes(Symbol s)
{
  int i = 0;
  foreach (var sa in s.GetAttributes())
  {
    int j = 0;
    foreach (var pa in sa.CommonConstructorArguments)
    {
      Console.WriteLine("{0} {1} {2}", pa.ToString());
      j += 1;
    }
    j = 0;
    foreach (var na in sa.CommonNamedArguments)
    {
      Console.WriteLine("{0} {1} {2} = {3}",
        na.Key, na.Value.ToString());
      j += 1;
    }
    i += 1;
  }
}

PVS-Studio diagnostic messages:

  • V3025 Incorrect format. A different number of format items is expected while calling 'WriteLine' function. Expected: 3. Present: 1. LoadingAttributes.cs 551
  • V3025 Incorrect format. A different number of format items is expected while calling 'WriteLine' function. Expected: 4. Present: 2. LoadingAttributes.cs 558

In both calls on function WriteLine(), it receives fewer arguments than expected. As a result, FormatExceptions are raised.

Error No. 16. Dangerous expression.

I bet you'll just glance over the code below and skip it for good. It's excellent proof that we need those tireless code analyzers.

private static bool SymbolsAreCompatibleCore(....)
{
  ....
  var type = methodSymbol.ContainingType;
  var newType = newMethodSymbol.ContainingType;
  if ((type != null && type.IsEnumType() &&
       type.EnumUnderlyingType != null &&
       type.EnumUnderlyingType.SpecialType ==
         newType.SpecialType) ||
      (newType != null && newType.IsEnumType() &&
       newType.EnumUnderlyingType != null &&
       newType.EnumUnderlyingType.SpecialType ==
         type.SpecialType))
  {
    return true;
  }
  ....
}

PVS-Studio diagnostic message: V3027 The variable 'newType' was utilized in the logical expression before it was verified against null in the same logical expression. AbstractSpeculationAnalyzer.cs 383

To show what makes this code dangerous, here's a simple synthetic example based on it:

if ((A != null && A.x == B.y) || (B != null && B.q == A.w))

As you can see, the condition's logic implies that A and B may be null references. The expression consists of two parts: in the first part reference A is checked, but reference B isn't; in the second part reference B is checked, but reference A isn't.

This code may be lucky enough to stay runnable, but it does look strange and dangerous.

Errors No. 17, No. 18. Double assignments.

public static string Stringize(this Diagnostic e)
{
  var retVal = string.Empty;
  if (e.Location.IsInSource)
  {
    retVal = e.Location.SourceSpan.ToString() + ": ";
  }
  else if (e.Location.IsInMetadata)
  {
    return "metadata: ";
  }
  else
  {
    return "no location: ";
  }
  retVal = e.Severity.ToString() + " " + e.Id + ": " +
           e.GetMessage(CultureInfo.CurrentCulture);
  return retVal;
}

PVS-Studio diagnostic message: V3008 The 'retVal' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 324, 313. DiagnosticExtensions.cs 324

Notice how variable 'retVal' is assigned a value in one of the 'if' statement's branches, but is then assigned another value at the end of the function's body. I'm not sure, but the second assignment should probably be rewritten in the following way:

retVal = retVal  + e.Severity.ToString() + " " + e.Id + ": " +
         e.GetMessage(CultureInfo.CurrentCulture);

Here is another similar case:

public int GetMethodsInDocument(
  ISymUnmanagedDocument document,
  int bufferLength, 
  out int count,
  ....)
{
  ....
  if (bufferLength > 0)
  {
    ....
    count = actualCount;
  }
  else
  {
    count = extentsByMethod.Length;
  }
  count = 0;
  return HResult.S_OK;
}

PVS-Studio diagnostic message: V3008 The 'count' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 317, 314. SymReader.cs 317

The function returns a value by 'count' reference. In different parts of the function, 'count' is assigned different values. What doesn't look right is that 'count' is for some reason assigned 0 at the end of the function's body all the time. This is quite strange.

Error No. 19, in tests. Typo.

internal void VerifySemantics(....)
{
  ....
  if (additionalOldSources != null)
  {
    oldTrees = oldTrees.Concat(
      additionalOldSources.Select(s => ParseText(s)));
  }
  
  if (additionalOldSources != null)
  {
    newTrees = newTrees.Concat(
      additionalNewSources.Select(s => ParseText(s)));
  }
  ....
}

PVS-Studio diagnostic message: V3029 The conditional expressions of the 'if' operators situated alongside each other are identical. Check lines: 223, 228. EditAndContinueTestHelpers.cs 223

In the second condition, 'additionalNewSources' should be checked instead of 'additionalOldSources'. If the 'additionalNewSources' reference turns out to be null, an exception will be raised when trying to call function Select().

Error No. 20. Questionable.

I've not shown all the warnings output by the PVS-Studio analyzer, of course. There are lots of warnings that are obviously false positives, but there are even more cases where I'm simply not familiar with Roslyn well enough to tell if they are bugs or not. This is one such case:

public static SyntaxTrivia Whitespace(string text)
{
  return Syntax.InternalSyntax.SyntaxFactory.Whitespace(
           text, elastic: false);
}
public static SyntaxTrivia ElasticWhitespace(string text)
{
  return Syntax.InternalSyntax.SyntaxFactory.Whitespace(
           text, elastic: false);
}

V3013 It is odd that the body of 'Whitespace' function is fully equivalent to the body of 'ElasticWhitespace' function (118, line 129). SyntaxFactory.cs 118

Two functions have the same bodies. The analyzer doesn't like it, and neither do I. But I don't know the project well enough to be sure; this code may well be correct. So, I can only make an assumption: in the ElasticWhitespace() function, argument 'elastic', which equals 'true', should probably be used.

Error Nxx.

I hope you understand that I can't investigate every case like the one above in detail. I scan lots of projects, and I don't have much knowledge about each of them. That's why I discuss only the most evident errors in my articles. In this one, I've discussed 20 such bugs, but I suspect PVS-Studio found many more. This is why I encourage Roslyn developers to scan the project themselves, rather than relying solely on this article. The demo version won't be sufficient for this task, but we can grant a temporary registration key.

Comparison with ReSharper

I've written just a few articles on C# analysis, and have given only one conference talk at this point. But what I've already found, is that one question is asked all the time: "Do you have a comparison with ReSharper?"

I don't like this for two reasons. Firstly, these tools belong to different fields. PVS-Studio is a typical code analyzer designed for bug searching. ReSharper is a productivity tool designed to facilitate programing, and capable of generating a large set of recommendations.

PVS-Studio and ReSharper employ totally different approaches to many aspects. For example, PVS-Studio comes with documentation with detailed descriptions for each diagnostic, accompanied by examples and advice on corrections. ReSharper claims to apply "1400 code inspections in C#, VB.NET, XAML, XML, ASP.NET, ASP.NET MVC, Razor, JavaScript, TypeScript, HTML, CSS, ResX". The figure 1400 does look impressive, but it doesn't actually tell you anything. The descriptions of all these code inspections are probably somewhere out there, but personally I failed to find them. How can I compare our tool with ReSharper when I can't even know what errors in particular ReSharper can detect in C# applications?

Secondly, any comparison we could offer would be knocked. We've already been through such an experience before, and sworn off doing any such comparisons again. For example, we carried out a thorough comparison of PVS-Studio with Cppcheck and Visual Studio SCA once, and it took us lots of time and effort. The results were presented in brief and detailed versions. After that, there was probably no programmer left who hadn't criticized us for doing everything wrong, or accused us of being biased, and that it was unfair to choose these projects for comparison. So we see no point in wasting time on a comparison. No matter how thorough and honest it was, one could always label it biased.

However, I still have to answer the question if we are in any way better than ReSharper. And I have an answer.

Do you use ReSharper? Nice. Now install PVS-Studio and see if it can find bugs in your project!

Conclusion

I suggest without further delay, download PVS-Studio and running it on your projects. You've got warnings? But your code is quite runnable, isn't it? The bugs you've found are most likely inhabiting rarely used code areas because in frequently used ones, you had them fixed long ago, though hard and painfully. Now imagine how much effort you could save using PVS-Studio regularly. Of course, it can't catch all the bugs. But instead of wasting time hunting typos and slip-ups, spend it on something more worthwhile, and let PVS-Studio take care of those silly mistakes.

P.S. You don't make silly mistakes? Well, well! It only seems you don't. Everyone does - just have a look here.