Source Lines of Code

Jul 15 2013

Physical and Logical SLOC
An Example of SLOC Measurement
SLOC and Software Properties
Conclusion
References

Source Lines of Code (SLOC) is a software metric frequently used to measure the size and complexity of a software project. It is typically used to predict the effort and time required to develop a program, and to estimate programming productivity once the software has been produced.

Physical and Logical SLOC

There are two major types of SLOC measures: physical SLOC and logical SLOC. The exact definitions of these terms vary with the circumstances. The most common definition of physical SLOC is a count of lines in the text of the program's source code, including comment lines and, sometimes, blank lines. Logical SLOC attempts to measure the number of executable statements (such as loops, function calls, etc.), but its precise definition is tied to the specific programming language.

Each approach has its own strengths and weaknesses: physical SLOC is easier to measure, but it is very sensitive to coding style conventions and code formatting, while logical SLOC is less sensitive to these factors but harder to measure.
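To show why physical SLOC is the easy one to measure, here is a minimal sketch of a physical line counter in C. The LineCounts type and the count_physical_lines function are hypothetical names introduced for this illustration, and the sketch assumes that only whole-line // comments need to be recognized; a real tool would also have to handle /* ... */ comments and comments that trail code.

```c
#include <stddef.h>

/* Hypothetical result type: per-category line counts for one source text. */
typedef struct {
    size_t total;    /* every line, including blank and comment lines */
    size_t blank;    /* lines that contain only whitespace */
    size_t comment;  /* lines whose first non-blank characters are "//" */
} LineCounts;

/* Count the lines of a NUL-terminated source text.
 * Assumption: only whole-line //-style comments are recognized. */
LineCounts count_physical_lines(const char *src)
{
    LineCounts c = {0, 0, 0};
    const char *p = src;

    while (*p != '\0') {
        const char *q = p;

        /* Skip leading whitespace on the current line. */
        while (*q == ' ' || *q == '\t' || *q == '\r')
            q++;

        c.total++;
        if (*q == '\n' || *q == '\0')
            c.blank++;
        else if (q[0] == '/' && q[1] == '/')
            c.comment++;

        /* Advance to the start of the next line. */
        while (*q != '\0' && *q != '\n')
            q++;
        p = (*q == '\n') ? q + 1 : q;
    }
    return c;
}
```

Depending on which definition of physical SLOC is in use, the result is either total or total minus blank. Nothing here needs to know the language's grammar, which is why the measure is cheap to compute and also why it changes whenever the formatting does.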

An Example of SLOC Measurement

Have a look at this code:

for (i=0; i<100; ++i) printf("%d bottles of beer on the wall\n", i);
// How many LOCs are here?

It has 2 physical SLOC, 2 logical SLOC (the for loop statement and the printf function call) and 1 comment line.

Now let's change the code formatting in the following way:

for (i=0; i<100; ++i)
{
    printf("%d bottles of beer on the wall\n", i);
}
// How many LOCs are here?

Now we have 5 physical SLOC, but still the same 2 logical SLOC and 1 comment line.
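The logical count used in the two examples above can be approximated with a rough heuristic. The sketch below is only an illustration under stated assumptions, not a standard definition: it counts one logical line for each control keyword (for, while, if, switch) and one for each semicolon that sits outside parentheses, string literals and comments. The function names count_logical_sloc and is_control_keyword are hypothetical, and constructs such as do ... while are over-counted.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the identifier [s, s+len) is a control keyword we count. */
static int is_control_keyword(const char *s, size_t len)
{
    static const char *const kw[] = { "for", "while", "if", "switch" };
    for (size_t i = 0; i < sizeof kw / sizeof kw[0]; i++)
        if (strlen(kw[i]) == len && strncmp(s, kw[i], len) == 0)
            return 1;
    return 0;
}

/* Rough logical SLOC estimate for C-like source text (heuristic only). */
size_t count_logical_sloc(const char *src)
{
    size_t count = 0;
    int depth = 0;                      /* parenthesis nesting depth */
    const char *p = src;

    while (*p != '\0') {
        if (p[0] == '/' && p[1] == '/') {            /* line comment */
            while (*p != '\0' && *p != '\n')
                p++;
        } else if (p[0] == '/' && p[1] == '*') {     /* block comment */
            p += 2;
            while (*p != '\0' && !(p[0] == '*' && p[1] == '/'))
                p++;
            if (*p != '\0')
                p += 2;
        } else if (*p == '"' || *p == '\'') {        /* string or char literal */
            char quote = *p++;
            while (*p != '\0' && *p != quote) {
                if (*p == '\\' && p[1] != '\0')
                    p++;                             /* skip the escaped character */
                p++;
            }
            if (*p != '\0')
                p++;
        } else if (isalpha((unsigned char)*p) || *p == '_') {  /* identifier */
            const char *start = p;
            while (isalnum((unsigned char)*p) || *p == '_')
                p++;
            if (is_control_keyword(start, (size_t)(p - start)))
                count++;
        } else {
            if (*p == '(')
                depth++;
            else if (*p == ')' && depth > 0)
                depth--;
            else if (*p == ';' && depth == 0)
                count++;                             /* statement terminator */
            p++;
        }
    }
    return count;
}
```

Applied to either formatting of the loop, this heuristic returns 2: one for the for keyword and one for the semicolon that ends the printf call. The semicolons inside the loop header are ignored because they sit inside parentheses. Even this crude version needs to understand comments, literals and keywords, which hints at why logical SLOC is harder to measure and tied to a particular language.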

SLOC and Software Properties

The SLOC metric is closely associated with system complexity: as a rule, the larger the code, the more complex the system. For instance, Windows NT 3.1 has about 4-5 million SLOC and Windows XP about 45 million; Linux kernel 2.6 has 5.6 million SLOC, and Linux kernel 3.6 has 15.9 million SLOC.

However, the picture is less clear-cut when it comes to software quality and reliability. Every real-life software product contains bugs, and larger programs tend to have more of them. The point becomes clear once we introduce the "bugs/SLOC" ratio: even if it stays constant, the absolute number of bugs grows along with the program's size. Intuition tells us that this is due to rising system complexity (A. Tanenbaum). And not just intuition (see diagram: "typical error density"). This consideration underlies development principles such as KISS, DRY and SOLID. To support this idea, let me quote E. Dijkstra: "Simplicity is prerequisite for reliability", along with a paragraph from his work "The Fruits of Misunderstanding":

...Yet people talk about programming as if it were a production process and measure "programmer productivity" in terms of "number of lines of code produced". In so doing they book that number on the wrong side of the ledger: we should always refer to "the number of lines of code spent".
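The "bugs/SLOC" argument made before the quote can be written out as simple arithmetic. The sketch below assumes a constant defect density expressed in defects per 1000 SLOC; the expected_defects function is a hypothetical helper, and any density value passed to it is an illustrative assumption, not a measured figure.

```c
/* Expected number of defects in a code base of `sloc` lines,
 * assuming a constant density in defects per 1000 SLOC (KSLOC).
 * The density is an input assumption, not a measured value. */
double expected_defects(double sloc, double defects_per_ksloc)
{
    return sloc / 1000.0 * defects_per_ksloc;
}
```

With a purely hypothetical density of 1 defect per KSLOC, the 5.6 million SLOC cited earlier for Linux kernel 2.6 would correspond to 5,600 expected defects, and the 15.9 million SLOC of kernel 3.6 to 15,900. The ratio is unchanged, yet the absolute number nearly triples, simply because the code grew.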

Conclusion

Thus, we have seen that a software project grows in complexity as it grows in size (as measured by SLOC), which leads to more bugs. Unfortunately (or otherwise), technological progress is unstoppable, and the complexity of computer systems will keep growing, requiring ever more resources to find and fix bugs (not without adding new ones along the way, of course). That is why developers should consider using static analysis and specialized static analysis tools to reduce the number of bugs and make the development process more efficient overall.