Software quality has been a topic and an area of interest since the dawn of software itself. And as software evolved, so did the techniques and approaches to assuring its high quality. Better computers providing more computing power, bigger storage and faster communication have allowed software developers to detect issues in their code sooner and faster. And so we went from getting a syntax error after two days of waiting for a box of punch cards to make it through the queue and get loaded into a computer running a compiler, to getting such errors from a compiler in seconds or even in real time from the code editor. And we went from bugs being detected by literally seeing real bugs on punch cards with machine instructions, to operating systems providing bug reports with coredumps, tracebacks and lots of other information helping developers identify the problem, to tests detecting problems before the code gets into production, or compilers and tooling detecting them before the code is even executed for the first time. We can afford to do things like fuzz testing, we have enough computational power for compilers and special tools to analyze the code and check all possible paths through it, and much more. At the same time, software has become a part of almost everything we use or interact with every day, and so with the incomparably greater amount of software potentially affecting our lives there is an incomparably greater amount of bugs that need to be detected and fixed, or at least handled gracefully. Some software is more critical than other software, and bugs range from minor annoyances to losses of human lives. Many things have changed in this evolution, but one rule has always been key:
The sooner a bug is detected, the cheaper it is to fix it.
Static checking
Static checking or, more formally, static program analysis is a set of software quality assurance techniques that analyze software without executing (running) it. Instead, they work on the source code level, with some intermediate representation, or with object (machine) code. The complexity of static checking techniques and tools ranges from the basic warnings and errors provided by compilers, through special tools analyzing paths through the code and combinations of possible values of variables and parameters, to very advanced tools doing formal verification. The more advanced and powerful the analysis is, the longer it takes, but it's generally much faster and easier to do than testing the code in a production-like environment. And unlike dynamic testing of particular pieces of code (unit testing) and testing of specific cases in integration testing, static analysis is much more generic. Often the results from static analysis are warnings and/or errors that cannot practically happen in the real world due to some implicit facts or assumptions that are not expressed in the code. But even those are worth attention, because those implicit assumptions should either be expressed in the code, if the programming language (or environment) allows it, or at least in comments, potentially suppressing the specific errors/warnings from the static checking tools. It is then up to the person doing code review (another software quality assurance technique) to check those assumptions, and if/when the execution environment for the program later changes, those assumptions need to be revisited. However, the assumptions need to be written down first, and the closer to the code they are, the better. Thus techniques that require, or at least encourage, this can prevent bugs.
On the other hand, while static program analysis is evolving and getting better and better, it often yields false positives. For this reason, all the tools provide mechanisms for suppressing some types of warnings/errors, for suppressing specific warnings/errors in specific places in the code and/or for completely disabling some types of checks. However, sometimes it is best to change the code to avoid the particular false positive because then no suppression is needed and the code often becomes easier to read and understand for developers.
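As a minimal sketch of what "expressing the assumption in the code" can look like (this is an illustrative, hypothetical function, not code from CFEngine), an assert() puts the assumption right next to the code that relies on it, where both reviewers and static checkers can see it:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper used only for illustration. */
size_t NameLength(const char *name)
{
    /* The implicit assumption "callers never pass NULL" is written down
     * in the code itself instead of living only in the author's head;
     * a reviewer can question it and a static checker can rely on it. */
    assert(name != NULL);

    return strlen(name);
}
```

When a finding really is a false positive, most tools additionally accept targeted, per-line or per-check suppressions, so the rest of the analysis stays intact.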
As described above, the most basic, easiest and cheapest static checking technique is to enable as many warnings from a compiler as possible and, ideally, to treat them as errors. Or, in other words, for people familiar with the GCC compilers, use of the -Wall, -Wextra and -Werror flags. This sounds like a trivial thing to do and one could even ask why these are not enabled by default. Well, even the mere existence of the -Wextra option adding extra warnings on top of the -Wall option gives the answer – those warnings break many builds when treated as errors or produce a lot of noise when treated as warnings. This is a problem especially with code that has been developed without all the warnings enabled, and very often that is the case with old code, because many of the warnings are new or at least provide much better results (i.e. fewer false positives) than they used to. Last but not least, some of those warnings actually trigger on code using techniques that used to be considered good/best practice at some point or are still considered good/best practice in special cases.
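As a small illustration (a made-up snippet, not CFEngine code), the following compiles silently with default GCC options, but with -Wall the assignment-instead-of-comparison typo is reported (-Wparentheses), with -Wall -Wextra the unused parameter is reported as well, and with -Werror both findings fail the build:

```c
#include <stdio.h>

/* The parameter is never used -- harmless, but -Wall -Wextra reports it
 * (-Wunused-parameter), which is exactly the kind of noise that shows up
 * when the flags are first enabled on an older codebase. */
static int Answer(int flags)
{
    return 42;
}

int main(void)
{
    int ready = 0;

    /* A classic typo: '=' where '==' was intended. -Wall (-Wparentheses)
     * flags it, and -Werror turns it into a build failure. */
    if (ready = 1)
    {
        printf("%d\n", Answer(0));
    }

    return 0;
}
```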
Static program analysis has been evolving rapidly in the last couple of years; new tools are appearing and existing tools (e.g. compilers) are being enhanced in this regard. New programming languages are increasingly adopting techniques from languages focused on critical software, techniques that make static checking easier, better, or possible to extend to cover more aspects. However, much of the software (still) "running the world" was written years or even decades ago, in languages and using techniques that didn't pay much attention to static analysis.
Static checking of CFEngine
CFEngine’s origins go back to the year 1993 and while most of the code has been rewritten over the years, there is still some code written more than 20 years ago that is part of the packages released these days. C code, of course. The naïve approach of "Let's just add -Wall -Wextra -Werror everywhere and fix all the issues" was thus impossible. And not only because some parts of the code are quite old and weren't written with static checking in mind, but also because CFEngine is built on many platforms that only provide old compilers. The good news is that static checking is almost platform-independent. Of course, some parts of the code depend on platform-specific macros and may not even be compiled on all platforms, but the vast majority of the code is shared and platform-independent, much more so than machine code, of course. And even though compilers are good elementary static checking tools, static program analysis working on the source code level doesn't require the code to be compiled or even to be compilable. Thus issues found by static checking on one platform are very often a problem on other platforms too, and very little is left uncovered.
Container-based checking
Of course, every developer working on CFEngine can run the compiler with special flags on their own system. However, as mentioned above, newer versions of compilers provide better results, i.e. more warnings/errors. Thus we decided to use containers, because while it's unfortunately not easy to run CFEngine in a container, it's very easy to build it in a container. And with that flexibility, we were able to choose a container image that has new versions of the tools we want to use – gcc (GNU project C and C++ compiler), Clang (C front-end for LLVM) and cppcheck (a static analysis tool for C and C++ code). We decided to use Fedora images because Fedora is a distribution that adopts new versions of compilers, standard libraries, etc. very early. To make it simple for the developers, we added a new make target – static-check – which simply runs a script that creates a new container from a base Fedora image with the required tools installed, copies all sources into it and then runs another script inside the container that builds CFEngine using gcc and Clang and then runs cppcheck. The builds use the -Werror -Wall -Wextra -Wno-sign-compare options to enable all warnings except for the sign-compare warning, which is triggered by lines of code that are very hard to change (but we still hope to do it in the future). The cppcheck run uses a list of suppressions that prevents some (currently 5) false positives from being reported. We also experimented with the -fanalyzer option of gcc, but it was producing too many false positives for it to be practically usable. However, many improvements have been made in recent versions of gcc, so we will surely give it another try soon.
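For illustration, this is the shape of code (a hypothetical example, not a line from the CFEngine sources) that triggers the sign-compare warning: the loop counter is a signed int while strlen() returns an unsigned size_t, so the comparison mixes signedness even though the code works fine for any realistic string length:

```c
#include <stdio.h>
#include <string.h>

void PrintChars(const char *str)
{
    /* gcc -Wextra (or -Wsign-compare) reports: "comparison of integer
     * expressions of different signedness: 'int' and 'size_t'" */
    for (int i = 0; i < strlen(str); i++)
    {
        printf("%c\n", str[i]);
    }
}
```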
This simple solution has detected many issues in our code: real bugs, as well as missing or badly expressed assumptions, and pieces of code that were correct but too cryptic for the above-mentioned tools – and, it turned out, too cryptic for human readers as well. Plus, with every new release of Fedora (and thus of the tools we use), we discover and fix new issues.
Having an easy way to run the checks locally is one thing. It's nice and handy, and it takes less than 5 minutes on an average developer laptop, but we don't want to require all developers to do it and swear they did it before they push or merge changes. That's why we also have a job in our CI system that runs the static-check target on all pull requests, as well as part of our nightly builds.
LGTM
Apart from using gcc, Clang and cppcheck, all of which we have available locally, we also use LGTM – an online code analysis platform which is available for free for open source projects.
Even with the default configuration it has many rules that detect issues in code written in various programming languages, and it can easily be extended with custom rules. We have two custom rules in CFEngine:
- a boolean type mismatch between a function's return type and its return value, and
- a missing NULL check for arguments.
The first one detected, and prevents us from re-introducing, problems where a function returning bool values actually returns int values. true and false are 1 and 0, respectively, in C, so they are integer values. But the problem is that all other integer values can also be treated as boolean values – 0 meaning false and everything else meaning true. A typical problem is when the code in a function with return type bool returns -1 or some other negative value to indicate an error. This is a common (good) practice, but then the function needs to have int (or some other signed integer type) as its return type, and the calling code should check the return value for being less than, equal to or greater than 0 instead of just doing if (retval), which evaluates to true if retval is -1.
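A hedged sketch of what that first rule catches (hypothetical functions, not actual CFEngine code): the buggy version declares bool but returns -1 on error, so a caller's if (retval) treats the error as success; one fix is a signed int return type with callers comparing against 0:

```c
#include <stdbool.h>
#include <stddef.h>

/* Buggy: the return type says bool, but the error path returns -1, which
 * is converted to true, so 'if (OpenStore(path))' treats the error as
 * success. This is the mismatch the custom rule reports. */
bool OpenStore(const char *path)
{
    if (path == NULL)
    {
        return -1;      /* silently becomes true */
    }
    /* ... */
    return true;
}

/* One fix: keep the -1 / 0-or-greater convention, but make the signedness
 * explicit in the return type and let callers compare the result against 0. */
int OpenStoreStatus(const char *path)
{
    if (path == NULL)
    {
        return -1;      /* error */
    }
    /* ... */
    return 0;           /* success */
}
```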
The second rule requires all pointer arguments that are dereferenced in function bodies to be checked for not being NULL. Either the code dereferencing an argument needs to check that the argument is not NULL before dereferencing it, or there must be an assert(arg != NULL) for every argument that is dereferenced in the function body. This rule actually has two benefits – a failure of such an assert() stops the program in a way that makes it easy to detect where the problem is (instead of potentially reading from or writing to random bits of memory), and the assert() tells the developer using the function that the particular argument(s) cannot be NULL. The asserts are only effective in code compiled with debugging options and so they don't introduce any performance penalty in production code.
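A minimal sketch of the pattern the second rule enforces (again a hypothetical function, not CFEngine code): every pointer argument that gets dereferenced is either guarded by an explicit NULL check or covered by an assert(arg != NULL):

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical function for illustration. Both pointer arguments are
 * dereferenced by fprintf(), so each one is either asserted to be
 * non-NULL or explicitly handled. */
void PrintGreeting(const char *name, FILE *out)
{
    assert(name != NULL);   /* documents that callers must not pass NULL */

    if (out == NULL)        /* ... or the NULL case is handled explicitly */
    {
        out = stdout;
    }

    fprintf(out, "Hello, %s!\n", name);

    /* With NDEBUG defined (as in production builds), the assert() compiles
     * away, so the check costs nothing outside debug builds. */
}
```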
The above two rules have helped us to identify bugs as well as to improve our code quality and readability. The amazing thing about LGTM is that it can easily be used with a codebase that already triggers alerts and warnings (ideally eliminating them over time), with only newly introduced alerts and warnings being reported for open pull requests and merged commits. This greatly facilitates the integration of LGTM into existing projects like CFEngine.
Summary
Static checking is a great software quality assurance technique which has recently gone through significant evolution and development that keeps continuing. It provides fast analysis of an existing codebase and quick feedback on newly introduced code, and it can detect serious issues. It has also become easily available and easy to use. With CFEngine, a project whose codebase goes back to the 90s, we are showing that it can be of great help not just for new projects using modern programming languages and modern coding practices. We definitely plan to keep using static checking, and even extend it, in the future, and we are open to sharing our experience with it. If you have any questions about how we do static checking, don't hesitate to ask, for example at GitHub discussions.