This is a modified (primarily extended) version of an article that appeared in the February 1997 issue of °Dr. Dobbs' Journal. Please note that the material in this article has, for the most part, not been updated since the research was originally performed in 1996.

A First Look at C++ Program Analyzers

by °Scott Meyers and °Martin Klaus

C++ has a well-deserved reputation for power and flexibility. It has an equally well-deserved reputation for complexity; its gotchas are legion. For example, omitting a virtual destructor in a base class typically leads to incomplete destruction of derived class objects when they are deleted through base class pointers (see ONMOUSEOVER = "self.status = 'Item E14'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">Item E14).

Experienced C++ programmers learn to avoid these kinds of problematic constructs, but experience should not really be necessary: troublesome C++ can often be detected by static analysis, i.e., by tools that parse and analyze C++ source code. Such tools are becoming widely available, and during the summer and fall of 1996, we undertook an investigation to identify these tools and to assess their capabilities. In this article, we summarize the initial results of our investigation.

We were interested in the following questions:

This article addresses only the first two questions.

Identifying Products

When we began this project, we were aware of several static analysis tools for C++, but we suspected there were tools we didn't know about. To fill the gaps in our knowledge, we posted a request for information to several Usenet newsgroups, including groups devoted to C++ programming, object-oriented programming, and programming on various platforms. Based on the responses we received, we identified the offerings in Table 1.

VendorProductSupported Under
°Abraxas Software °CodeCheck DOS,Windows,Unix
°CenterLine Software °AcquaProva Unix
°ConSol Software GmbH °xlint++ Unix
°Gimpel Software °Flexe-/PC-Lint DOS,Windows,Unix
°Hewlett Packard °CodeAdvisor Unix
°ParaSoft °CodeWizard Unix
°Productivity through Software °ProLint Unix
°Programming Research Ltd. °QA/C++ Unix
°Rational Software °Apex C/C++ Unix
Table 1:Static Analysis Tools for C++

We then attempted to contact the vendors to get information on each product. We were unable to get in touch with ConSol Software GmbH, so we eliminated xlint++ from our study. [Note: after this article went to press, ConSol Software GmbH contacted us, and we have included in Table 3 the information they gave us.] We also eliminated Productivity through Software's ProLint, because it is fundamentally a GUI wrapper around Gimpel Software's (command-line based) FlexeLint, and we had already decided to include FlexeLint in our examination.

That left us with the following seven products:

To these choices we added our own noncommercial program, CCEL (References), purely for purposes of comparison. CCEL began as a research project on static analysis of C++ programs under the direction of one of us (Meyers) and was eventually implemented through independent work by the other (Klaus). We added CCEL to our investigation because we were familiar with its capabilities and limitations, and we felt it would be interesting to compare commercial approaches to our research-based initiative. (Regrettably, the CCEL implementation cannot be made available, not even for research purposes. To find everything about CCEL that is available, consult the references below.)

Approach

Our testing of the tools was broken into three phases:

  1. We developed of a set of benchmark rules constraining the structure of C++ programs. For example, one rule is that all base classes must have virtual destructors. We tried to develop a set of rules that was representative of the kinds of rules real programmers would find useful.
  2. We contacted the vendor for each tool and asked them which rules their product could enforce. This information proved useful during our later empirical tests, because discrepancies between vendor claims and our empirical findings often identified subtle differences between our rules and those enforced by the vendors.
  3. We developed of a set of sample source files seeded with rule violations. We ran each tool on each source file to see whether the seeded rule violation was correctly identified.

Our results yielded a table showing how well each tool enforced our benchmark rules on our benchmark programs.

Choosing Rules

One can imagine many ways to compose a set of benchmark rules for C++ programs, but it is difficult to argue that one set is "better" than another. As a result, we made no attempt to develop the "best" set of rules. Instead, we fell back on the fact that one of us (Meyers) has authored two books containing guidelines for C++ programming (Effective C++ and More Effective C++), and we chose nearly all our rules from those books.

This approach is not as gratuitous as it might appear. Effective C++ and More Effective C++ have been well-received in the C++ programming community, and one or both form the basis for many sets of corporate coding guidelines. In addition, at least two of the static analysis tools in our investigation are based on these books. Finally, by drawing our rules from well-known and easily accessible sources, we avoided the need to explicitly justify individual rules in our benchmark set. Instead, the justification for nearly every rule is available in the books, and we simply refer to the appropriate book location as the rationale for each rule.

We chose 36 rules (34 of which were drawn from Effective C++ or More Effective C++), which we divided into eight categories. These rules are listed in Table 2. Each rule begins with its rule number, followed by a reference to the Item number in either Effective C++ (E) or More Effective C++ (M) from which it is derived. The text of the rule is often different from the text of the book Item, because the book Items tend to be worded too generally to be checked.

Table 2: Benchmark Rules
General
1 ONMOUSEOVER = "self.status = 'E1'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E1Use const instead of #define for constants at global and file scope.
2 ONMOUSEOVER = "self.status = 'M2'; return true" ONMOUSEOUT = "self.status = self.defaultStatus" >M2 Use new-style casts instead of C-style casts.
3 ONMOUSEOVER = "self.status = 'M3'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">M3 Don't treat a pointer to Derived[] as a pointer to Base[].
Use of new and delete
4 ONMOUSEOVER = "self.status = 'E5'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E5 Use the same form for calls to new and delete. (In general, this calls for dynamic analysis, but static analysis can catch some special cases, e.g., calls to new in ctors and to delete in dtors.)
5 ONMOUSEOVER = "self.status = 'E6'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E6When the result of a new expression in a ctor is stored in a dumb pointer class member, make sure delete is called on that member in the dtor.
6 ONMOUSEOVER = "self.status = 'E9'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E9 Avoid hiding the default signature for operator new and operator delete.
Constructors/Destructors/Assignment
7a ONMOUSEOVER = "self.status = 'E11'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E11Declare a copy constructor for each class declaring a pointer data member.
7b ONMOUSEOVER = "self.status = 'E11'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E11 Declare an assignment operator for each class declaring a pointer data member.
8 ONMOUSEOVER = "self.status = 'E12'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E12 Initialize each class data member via the member initialization list.
9 ONMOUSEOVER = "self.status = 'E13'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E13List members in a member initialization list in an order consistent with the order in which they are actually initialized.
10 ONMOUSEOVER = "self.status = 'E14'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E14 Make destructors virtual in base classes.
11 ONMOUSEOVER = "self.status = 'E15'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E15 Have the definition of operator= return a reference to *this. (Note: this says nothing about declarations.)
12a ONMOUSEOVER = "self.status = 'E16'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E16Assign to every local data member inside operator=.
12b ONMOUSEOVER = "self.status = 'E16'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E16Call a base class operator= from a derived class operator=.
12c ONMOUSEOVER = "self.status = 'E16'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E16Use the member initialization list to ensure that a base class copy ctor is called from a derived class copy ctor.
13N/ADon't call virtual functions in constructors or destructors.
Design
14 ONMOUSEOVER = "self.status = 'E19'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E19 Use non-member functions for binary operations like +-/* when a class has a converting ctor.
15 ONMOUSEOVER = "self.status = 'E20'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E20Avoid public data members.
16 ONMOUSEOVER = "self.status = 'E22'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E22Use pass-by-ref-to-const instead of pass-by-value where both are valid and the former is likely to be more efficient.
17a ONMOUSEOVER = "self.status = 'E23'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E23Have operators like +-/* return an object, not a reference.
17b ONMOUSEOVER = "self.status = 'M6'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">M6And make those return values const.
18 ONMOUSEOVER = "self.status = 'E25'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E25Don't overload on a pointer and an int.
19 ONMOUSEOVER = "self.status = 'M33'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">M33Make non-leaf classes abstract.
20 ONMOUSEOVER = "self.status = 'M24'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">M24Avoid gratuitous use of virtual inheritance, i.e., make sure there are at least two inheritance paths to each virtual base class.
Implementation
21 ONMOUSEOVER = "self.status = 'E29'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E29Don't return pointers/references to internal data structures unless they are pointers/references-to-const.
22 ONMOUSEOVER = "self.status = 'M26'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">M26Never define a static variable inside a non-member inline function unless the function is declared extern. [Note: In July 1996, changes to the nascent standard for ANSI/ISO C++ obviated the need for this rule, at least on paper. However, the need still exists in practice, because many compilers continue to heed the older rules that can lead to duplicated variables in inline non-member functions.]
23N/AAvoid use of "..." in function parameter lists.
Inheritance
24 ONMOUSEOVER = "self.status = 'E37'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E37Don't redefine an inherited non virtual function.
25 ONMOUSEOVER = "self.status = 'E38'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">E38Don't redefine an inherited default parameter value.
Operators
26 ONMOUSEOVER = "self.status = 'M5'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">M5Avoid use of user-defined conversion operators (i.e., non-explicit single-argument ctors and implicit type conversion operators).
27 ONMOUSEOVER = "self.status = 'M7'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">M7Don't overload &&, ||, or ,.
28 ONMOUSEOVER = "self.status = 'M6'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">M6Make sure operators ++ and — have the correct return type.
29 ONMOUSEOVER = "self.status = 'M6'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">M6Use prefix ++ and — when the result of the increment or decrement expression is unused.
30 ONMOUSEOVER = "self.status = 'M22'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">M22 Declare op= if you declare binary op (e.g., declare += if you declare +, declare -= if you declare -, etc.). One way to satisfy this constraint is by providing a template that yields the appropriate function.
Exceptions
31 ONMOUSEOVER = "self.status = 'M11'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">M11Prevent exceptions from leaving destructors.
32 ONMOUSEOVER = "self.status = 'M13'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">M13Catch exceptions by reference.

Some of the rules may seem controversial, especially in light of the C++ found in many popular class libraries. Rule 15 (no public data members), for example, is widely violated in the Microsoft Foundation Classes, while almost no library adheres to Rule 19 (make all non-leaf classes abstract). With the exceptions of Rules 13 and 23 (which, in this article, we hope are self-explanatory), Effective C++ and More Effective C++ offer firm technical foundations for each rule. We believe it is therefore important that programmers be able to enforce those constraints, even if the majority of programmers choose not to. Furthermore, our decision to include rules that are commonly violated helps us evaluate the effectiveness of the tools' filtering capabilities. (We do not report on this aspect of the tools in this paper, but it is an important consideration in the practical application of any tool; see below.)

Benchmark Programs

For each of our 36 rules, we developed a trivial source file seeded with a violation of the rule. We then executed each tool on each source file to see if the tools correctly identified the seeded errors. These source files were truly trivial — many were under 10 lines long. Our goal with these files was not to provide a realistic test of the tools, it was just to see whether the tools could identify rule violations in the simplest of cases. (Sometimes this backfired and yielded misleading results — see below.)

As an example, here is the complete source code for the file used to test Rule 20:

Compilers Versus Special Tools

Several people responded to our request for information on static analysis tools by remarking that they found little need for such tools. Instead, they wrote, they relied on their compilers to flag conditions that were likely to lead to trouble. The following comment was typical:

I find GNU G++ with -ansi -pedantic -Wall -O flags useful.

In fact, the Gnu compiler was singled out by more than one writer as being especially good at warning about troublesome C++. This piqued our curiosity about compiler warnings. How many of our candidate rules would compilers identify?

To find out, we submitted our benchmark programs to five compilers, in each case enabling as many warnings as possible. As Table 3 shows, the results were disappointing. Even g++ identified at most two of the 36 rule violations, and three of the compilers identified none. This confirmed our impression (based on our experience as C++ programmers) that compilers are good at many things, but identifying legal, but potentially troublesome, C++ source code is not one of them.

Specifying Constraints

The tools in our study allow programmers to specify what conditions to check for in one of two ways. Most tools follow the lint model, whereby the tool is created with the ability to enforce some set of predefined constraints, and programmers turn these constraints on or off. There is no way to extend the capabilities of such tools. For example, a tool is either capable of detecting that an exception may leave a destructor (Rule 31) or it's not. If it's not, there is no way for a tool user to add that capability.

A different approach, one employed by °Abraxas' °Codecheck and °HP's °CodeAdvisor, and our research program CCEL, is to provide tool users with a language in which to express constraints of their own. Such tools may or not be useful "out of the box" (it depends on the existence and utility of predefined rule libraries), but they can be extended to check for new, user-defined conditions. This approach is more powerful, but, as in the case of C++ itself, complexity often accompanies power. The power is inaccessible until one has mastered the constraint expression language. Furthermore, the addition of user-defined constraints may affect an analysis tool's performance, because enforcement of such constraints may require arbitrary amounts of time, memory, or other resources.

We made no attempt to master the various constraint expression languages used by the different tools, but the examples we saw (see the appendix) reinforced the lessons we learned during the design and implementation of CCEL: it's hard to design a language for expressing constraints on a language as feature-filled as C++, and given such a constraint language, it's nontrivial to learn to use it. °Abraxas, for example, reports that it takes between 3 and 6 months to become proficient in the °CodeCheck constraint language. Most of their customers hire specialists to compose their rules instead of learning to write the rules themselves.

Most programmable tools attempt to offer the best of both worlds by shipping with a set of predefined rule libraries that check for commonly-desired constraints. This eliminates the need to write rules to cover common constraints.

Results And Discussion

The results of running the various tools on the collection of benchmark programs is shown in Table 3.

Table 3: Results of Submitting Sample Programs to Tools
  Compilers Static Analysis Tools
Product Borland C++ GNU g++ Visual C++ SunSoft C++ Symantec C++ CodeWizard PC-Lint CodeCheck QA/C++ CCEL CodeAdvisor AcquaProva Apex C++
xlint++
Version 5.0 2.7.2 10.00.5270 SC3.0.1 7.20B1 1.0 7.00c 6.04 3.0.1 N/A C.05.00 1.0 2.0.6C Not Tested
Options -w -ansi -Wall -W4 +w2 -w- unsuppress all -w4 Note 14 (Defaults) N/A (Defaults) (Defaults) (Defaults) Not Tested
Tested Under NT 3.51 SunOS 4.1.4 NT 3.51 Solaris 2.5 NT 3.51 Solaris 2.5 NT 3.51 NT 3.51 Solaris 2.5 Solaris 2.5 Solaris 2.5 Solaris 2.5 Solaris 2.5
Not Tested
Rule 1 - - - - - - - X X X CWP - - C
Rule 2 - - - - - - - CWP X - - - - -
Rule 3 - - - - - - Note 5 CWP - - CWP CD - -
Rule 4 - - - - - Note 4 - CWP Note 4 - CWP CD - -
Rule 5 - - - - - X X CWP - - CWP CD - -
Rule 6 - - - - - X - CWP - X CWP X Note 3 -
Rule 7a - - - - - X Note 6 CWP X X Note 6 X - C
Rule 7b - - - - - X Note 6 CWP X X Note 6 X - C
Rule 8 - - - - - Note 3 X Note 11 X - CWP Note 13 - C
Rule 9 - X - - - X X Note 11 X - Note 7 X - C
Rule 10 - Note 2 - - - Note 3 Note 10 Note 2 Note 2 X Note 2 Note 3 X C
Rule 11 - - - - - Note 3 - CWP Note 3 - CWP X - -
Rule 12a - - - - - X - CWP - - CWP - - C
Rule 12b - - - - - - - CWP - - CWP - - -
Rule 12c - - - - - - Note 7 Note 11 X - CWP - - -
Rule 13 - - - - - - X CWP - - X - - -
Rule 14 - - - - - - - CWP - X CWP - - -
Rule 15 - - - - - X - CWP X X CWP X X C
Rule 16 - - - - - X - CWP Note 3 X CWP Note 3 - -
Rule 17a - - - - - - - CWP - X CWP - - -
Rule 17b - - - - - - - CWP - X
CWP
- - -
Rule 18
- - - - -
Note 3
-
CWP
- X
CWP
X -
C
Rule 19
- - - - - - -
CWP
- X
CWP
- - -
Rule 20
- - - - - - - - - X
CWP
- - -
Rule 21
- - - - - X
Note 3
CWP
Note 3
-
CWP
- - -
Rule 22
Note 1
- - - - - -
CWP
Note 7
-
CWP
- - -
Rule 23
- - - - - - - X X X
CWP
- -
C
Rule 24
- - - - - X X
CWP
- X
Note 3
X -
C
Rule 25
- - - - - X - - - X
CWP
X - -
Rule 26
- - - - - - -
CWP
Note 3
-
Note 12
- - -
Rule 27
- - - - - - -
CWP
- X
CWP
X -
C
Rule 28
- - - - - - -
CWP
- X
CWP
- - -
Rule 29
- - - - - - -
CWP
X - - - - -
Rule 30
- - - - - - -
CWP
- X
CWP
- -
C
Rule 31
- - - - - - -
CWP
- - - - - -
Rule 32
- - - - - - -
CWP
- - - X -
-
Legend
- The tool failed to detect violations of this rule on our benchmark programs.
X The tool detected the seeded rule violation in our benchmark programs.
C The vendor claims the tool will detect violations of this rule, but we were unable to acquire a copy of the tool to test the claim.
CD The vendor claims the tool will detect violations of this rule dynamically, but we were unable to acquire a copy of the tool to test the claim.
CWPThe vendor claims that users can program the tool to detect violations of the constraint. We did not attempt to test the claim.
Note 1 Warning issued that functions containing static variables are not expanded inline.
Note 2 Will note a non virtual desTRuctor in classes with virtual functions.
Note 3 Vendor claims the condition will be detected, but tests exposed at least one failure to do so.
Note 4 A diagnostic is issued only when the calls to new and delete are within the same function.
Note 5 Erroneously diagnosed an error in a valid source file.
Note 6 Doesn't diagnose this particular condition, but uses similar heuristics to identify classes where assignment operators and copy constructors should be declared.
Note 7 A diagnostic is issued that hints at the problem, but the problem is not directly identified.
Note 8 A diagnostic is issued only when the member is initialized in the body of the consTRuctor.
Note 9 (This note is not used.)
Note 10 Attempts to warn about this only when the condition would cause a runtime problem, but tests showed at least one failure to do so.
Note 11 The vendor claims that users can program the tool to detect violations of the constraint, but technical support was unable to describe how when asked.
Note 12 A constraint to enforce the rule can be written, but the °HP parser doesn't yet support explicit.
Note 13 This condition isn't detected, but use of uninitialized memory is dynamically detected.
Note 14 "-Rcplus" used, but °Abraxas describes this as a "tutorial and example file" not designed for production use.

Several features in the data are of interest. First, no tool was able to enforce all our 36 benchmark rules, not even the tools supporting user-defined constraints. Thus, even the best of currently available tools offers only partial coverage of C++. This is especially noteworthy, because our benchmark rules themselves failed to exercise all major language features; templates are a particularly obvious omission.

Second, the number of benchmark rules that can be enforced without programming — i.e., "out of the box" — is at most 17 of 36. (CCEL supports 19, but CCEL is a discontinued research project, not a commercial product.) If we speculate that our set of benchmark rules is somehow representative of the kinds of constraints real programmers might want to enforce, this suggests that current tools cover at best only about half those constraints. Of course, automatic enforcement of half a set of requirements is better than no enforcement at all, but the data in Table 3 suggest that there is much room for increased language coverage by static analysis tools for C++.

Third, it is not uncommon for there to be subtle mismatches between a benchmark rule and the conditions detected by the analysis tools. In most cases, this is an outgrowth of the vendors' attempts to avoid generating warning messages when no truly harmful condition exists. For example, consider again Rule 10:

Make destructors virtual in base classes.

Many programmers consider this rule too aggressive, and a common alternative form of the same rule is this:

Make destructors virtual in classes containing virtual functions.

This form has the advantage that no virtual table pointer is added to a class simply to satisfy the rule (see Items ONMOUSEOVER = "self.status = 'E14'; return true" ONMOUSEOUT = "self.status = self.defaultStatus" >E14 and ONMOUSEOVER = "self.status = 'M24'; return true" ONMOUSEOUT = "self.status = self.defaultStatus">M24). (This is the rule variant that's employed by the Gnu C++ compiler, by °HP's °CodeAdvisor, and by °Programming Research's °QA/C++.)

The motivation for this rule (in any form) is that the following code is generally harmful if the base class lacks a virtual destructor:

In truth, this code is only harmful if one or more of the following conditions holds:

At least one tool vendor attempts to issue a diagnostic only if these more stringent conditions exist, and the conditions do not exist in our test program:

The tool in question thus issues no diagnostic on our sample program, but if class Derived were nontrivial, the tool might issue a warning.

This more precise analysis should be beneficial for users, because a diagnostic should be issued only if a problem truly exists. However, the rules of C++ can be both complicated and unintuitive, and their subtlety can cut both ways. In the case of the vendor attempting to check for the more detailed conditions outlined above, the test for data members with destructors in the derived class was omitted. Hence, though the tool avoids issuing warnings in harmless cases, it also avoids issuing warnings in some rare, but harmful cases. Yet these are precisely the cases in which static analysis tools that correctly understand the detailed rules of C++ are most useful!

Another tool had trouble issuing correct diagnostics when compiler-generated functions — default constructors, copy constructors, assignment operators, and destructors (especially derived class destructors) — were involved. Because of the minimalist nature of our test cases, our programs had many instances of such functions, and this led to incorrect results from some tools.

Whether such shortcomings would cause problems when the tools are applied to real programs is unknown, but it hints at a deeper problem we found with some of the tools: the vendors didn't seem to understand the subtleties of C++ as well as they need to. We believe that vendors of C++ analysis tools must understand C++ as well as compiler vendors, but based on our experience with the tools in this study, we must report that such expertise cannot yet be taken for granted.

Caveats

The results in Table 3 provide insight into the state of existing lint-like tools for C++ (as of August/September 1996), but it is important to recognize what they do not show. We were interested only in the capability of such tools to handle the "++" part of C++, but most of the tools also provide significant other capabilities. For example:

In addition, our set of benchmark rules was far from exhaustive. Some vendors check for C++-specific conditions we didn't consider, and Table 3 says nothing about such capabilities.

All this is to say that Table 3 is anything but a "Buyer's Guide." Furthermore, there are many non-technical characteristics of analysis tools you should consider before deciding which, if any, is suitable for your circumstances. The following questions immediately come to mind:

Our study considered none of these issues.

Finally, it is important to bear in mind that the results in Table 3 are based on tests we performed in August and September, 1996. Virtually all of the tools we examined are under active development, so it's likely that new versions of the tools exist even as you read this report. For example, we know that °Abraxas is currently beta-testing a set of predefined constraints derived from material in Meyers' books, and we know that °CenterLine and °Rational are planning upgrades to °AcquaProva and °Apex, respectively, that will allow users to define new constraints. Other tool vendors are similarly active. The data in Table 3 represents a mere snapshot of the commercial state of the art in September, 1996, and that snapshot is, alas, somewhat out of date by the time you see it.

Summary

A number of analysis tools is now available that read C++ source code and warn about possible behavioral problems. They cover varying aspects of C++, though none offers truly comprehensive coverage of the language. Based on simple tests, we believe that many dangerous C++ constructs can be detected, though the complexity of C++ leads to incorrect behavior on the part of some tools, especially where compiler-generated functions are concerned. C++ analysis tools are under active development, and it is likely that the data in this article fails to accurately reflect the current capabilities of the tools we examined. If you are interested in static analysis tools for C++, we encourage you to contact the vendors in Table 1, conduct your own tests, come to your own conclusions, and share them with us.

Acknowledgements

We are grateful to Jill Huchital and to an anonymous reviewer for their comments on drafts of this article, and we thank the editorial staff at °Dr. Dobb's Journal for helping us publish our findings much more quickly than is usually possible.

References

Appendix: Constraint Expression Languages
As an example of the different ways in which constraints on C++ programs may be expressed, consider our Rule 10. In English, the constraint is: Make destructors virtual in base classes. When this constraint is formalized and made amenable to enforcement by computer, it quickly becomes more complicated. CCEL offers a fairly succinct way to express this constraint, because CCEL was specifically designed to make the expression of constraints like this straightforward. Still, CCEL's formal nature makes it wordier than English. (It also makes it more precise.) This is Rule 10 in CCEL:
BaseClassDtor (
    Class B;
    Class D | D.is_descendant(B);

    Assert(MemberFunction B::m; | m.name() == "~" +  B.name() && m.is_virtual());
  );
CCEL embodies a declarative approach to the specification of constraints: you specify what you want to enforce, not how to enforce it. This CCEL constraint can be read like this: For all classes B and all classes D that inherit from B, B must have a member function m such that m is a destructor and m is virtual.

Compared to CCEL, all the programmable analysis tools we investigated are long-winded. That's because they employ a procedural approach to constraint specification: you write code specifying how to go about detecting violations of the constraints you define. In practice, a procedural approach is more powerful, but it's also more complicated.

For example, °Abraxas, which offers a C-like programming language for new constraints, suggests the following implementation for our Rule 10:

int isBaseClass;
  int inBase;
  int inClassHead;
  char tagList[ MAXLIST ];

  if ( dcl_virtual )
  {
    if ( dcl_base == DESTRUCTOR_TYPE )
      {
	if ( tagList[ 0 ] == 0 )
	{
	  i = 0;
	  while ( i <= strlen( tag_name() ) )
	    {
	      tagList[ i ] = tag_name()[ i ];
	      i++;
	    }
	  }
	  else
	  {
	    p = tagList;
	    while ( 1 )
	    {
	      if ( p[ 0 ] == 0 )
		{
		  i = 0;
		  while ( i <= strlen( tag_name() ) )
		    {
		      p[ i ] = tag_name()[ i ];
		      i++;
		    }
		    break;
		  }
		  else
		  {
		    p = p + strlen( p ) +1;
	   }
	 }
       }
    }
  }

  if ( keyword( "class" ) )
  {
    if ( strcmp( prev_token(), "<" ) != 0 )
    {
      inClassHead = TRUE;
    }
  }

  if ( op_colon_1 )
  {
    if ( inClassHead )
    {
      inBase = TRUE;
    }
  }

  if ( op_separator )
  {
    if ( inBase )
    {
      if ( strcmp( prev_token(), ">" ) != 0 )
	{
	  isBaseClass = TRUE;
	}
    }
  }

  if ( op_open_angle )
  {
    if ( inBase )
    {
      if ( angleLevel == 0 )
	{
	  isBaseClass = TRUE;
	}
	angleLevel++;
    }
  }

  if ( op_close_angle )
  {
    if ( inBase )
      {
	angleLevel--;
	if ( angleLevel == 0 )
	   {
	     isBaseClass = FALSE;
      }
    }
  }

  if ( isBaseClass )
  {
    p = tagList;
    while ( p[ 0 ] != 0 )
    {
      if (strcmp( p, prev_token() ) == 0 )
	{
	  break;
	  }
	  else
	  {
	     p = p + strlen( p ) + 1;
	  }
    }
    if ( p[ 0 ] == 0 )
    {
      warn( 1401, "The destructor of class %s should be declared as virtual.", tag_name() );
    }
  }

  if ( tag_begin )
  {
    inBase = FALSE;
    inClassHead = FALSE;
  }

°HP's °CodeAdvisor uses C++ as an extension language. New constraints are implemented via classes inheriting from the predefined Rule class. These classes are then compiled and linked to a runtime library. Here is our implementation of Rule 10:

class Rule_VirtDtorInBaseClass : public Rule
  {
  public:
    virtual int kindMask() const;
    virtual Language langMask() const;
    void check(SymbolTable *, const Symbol &);
    virtual const char *errorMess() const;
    virtual const char *name() const;
  };

  // This rule is invoked for all "Tag" objects.  Tags include all
  // compound objects, such as Classes, Templates, Structs, Unions, and Enums.
  // Can't specify a kindMask() of KIND_CLASS, since check() is invoked
  // only on Symbol objects.  Class is not a Symbol; Tag is.

  int Rule_VirtDtorInBaseClass::kindMask() const
  { return 1 << KIND_TAG; }

  // RETURNS A MASK OF THE LANGUAGE(S) THIS RULE APPLIES TO.

  LANGUAGE RULE_VIRTDTORINBASECLASS::LANGMASK() CONST
  { RETURN LANGUAGE_CPP; }

  // FIND ALL BASE CLASSES NOT DECLARING A VIRTUAL DESTRUCTOR.

  VOID RULE_VIRTDTORINBASECLASS::CHECK( SYMBOLTABLE *, CONST SYMBOL &SYM)
  {
    TAG TAG;
    CLASS CL;

    // DON'T WANT TO CHECK INSTANCES. THE !TAG.CLASSTYPE(CL) CALL ALSO REJECTS ENUMS.

    IF (!SYM.SYMBOLTOTAG(TAG) || !TAG.CLASSTYPE(CL) || IS_INSTANTIATED(CL.ATTRIB()))
      RETURN;

    // REJECT STRUCTS AND UNIONS, WHICH ARE ALSO REPRESENTED AS CLASSES.

    IF (WAS_STRUCT(CL.ATTRIB()) || WAS_UNION(CL.ATTRIB()))
      RETURN;

    // GET POSSIBLE BASE CLASSES AND DERIVED CLASSES.

    ATTRIBUTE_ITERATOR(TAG) BCL=CL.BASECLASSES();
    ATTRIBUTE_ITERATOR(TAG) DCL=CL.DERIVEDCLASSES();

    // REJECT DERIVED CLASSES.

    ITERATE_BEGIN(BCL) { RETURN; } ITERATE_END(BCL);

    // CHECK ONLY BASE CLASSES.

    ITERATE_BEGIN(DCL) {

    FUNCTIONMEMBER FM;

    CHAR DTOR_NAME[256];
    CHAR MSG[1024];

    // DESTRUCTORS CAN BE IDENTIFIED BY THEIR NAME.

    STRCPY(DTOR_NAME, TAG.NAME());
    STRCAT(DTOR_NAME, "::~");
    STRCAT(DTOR_NAME, TAG.NAME());
    STRCAT(DTOR_NAME, "()");

    // BASE CLASS MUST DECLARE A DESTRUCTOR MEMBER FUNCTION THAT IS VIRTUAL.

    IF (CL.FINDFUNCTIONMEMBER(DTOR_NAME, FM) && IS_VIRTUAL(FM.ATTRIB())) {
      RETURN;
    }

    // ISSUE THE VIOLATION MESSAGE.

    SPRINTF(MSG, "BASECLASS '%S' SHOULD DECLARE A VIRTUAL DESTRUCTOR", TAG.NAME());
    VIOLATION(TAG, MSG);
    RETURN;

    } ITERATE_END(DCL);
  }

  // RETURNS A ONE-LINE SUMMARY OF THE VIOLATION WITH NO INSTANCE-SPECIFIC INFORMATION.

  CONST CHAR *RULE_VIRTDTORINBASECLASS::ERRORMESS() CONST
  {
    RETURN("BASE CLASS MUST DECLARE A VIRTUAL DESTRUCTOR.");
  }

  // RETURNS THE NAME OF THIS RULE.

  CONST CHAR *RULE_VIRTDTORINBASECLASS::NAME() CONST
  {
    RETURN("RULE_VIRTDTORINBASECLASS");
  }

  STATIC RULE_VIRTDTORINBASECLASS INSTANCE;

Finally, here's Rule 10 in the constraint expression language °CenterLine plans to release in an upcoming version of °AcquaProva. This code was provided by CenterLine:

advBool advSMNonvirtualDestructor::DoesEntityViolateRule
  (const advClassStructOrUnionType& t,
   advDiagnostic& d) const
  {
    advBool result = adv_false;

    do {
      if (t.HasBaseClasses() && t.HasDestructor()) {
	if (t.Destructor()->IsCompilerGenerated()) {

      d << "Class type:" << t <<
      "Base classes without virtual destructors:";

	//  Make sure that all direct
	// bases have virtual destructors.  (There's no need to check
	// indirect bases since they will have been checked previously.)
	class CheckForBaseClassesWithNonvirtualDestructors :
      public
      advForEachFunction<advClassStructOrUnionType::BaseClassInfo>
	{
	public:
      CheckForBaseClassesWithNonvirtualDestructors(advDiagnostic& d,
	advBool& result)
	: d_(d), result_(result) {}

      virtual void operator()(const
       advClassStructOrUnionType::BaseClassInfo& b,
	advBool&) {
	if (!b.base_class_.HasDestructor() ||
	  !b.base_class_.Destructor()->IsVirtual()) {
	// A base class with a non-virtual destructor!
	result_ = adv_true;

	      // Add the offending base class to the diagnostic so that
	      // the user will know what was wrong.
	d_ << b.base_class_;
      } // if

  }

	private:
      advDiagnostic& d_;
      advBool& result_;
	} check_for_base_classes_with_nonvirtual_destructors(d, result);

	(void)
      t.ForEachBaseClass
      (check_for_base_classes_with_nonvirtual_destructors);
      } // if
    } while (adv_false);

    return result;
  }