Effective C++, 2E | Item 18: Strive for class interfaces that are complete and minimal

Item 18: Strive for class interfaces that are complete and minimal.

The client interface for a class is the interface that is accessible to the programmers who use the class. Typically, only functions exist in this interface, because having data members in the client interface has a number of drawbacks (see Item 20).

Trying to figure out what functions should be in a class interface can drive you crazy. You're pulled in two completely different directions. On the one hand, you'd like to build a class that is easy to understand, straightforward to use, and easy to implement. That usually implies a fairly small number of member functions, each of which performs a distinct task. On the other hand, you'd like your class to be powerful and convenient to use, which often means adding functions to provide support for commonly performed tasks. How do you decide which functions go into the class and which ones don't?

Try this: aim for a class interface that is complete and minimal.

A complete interface is one that allows clients to do anything they might reasonably want to do. That is, for any reasonable task that clients might want to accomplish, there is a reasonable way to accomplish it, although it may not be as convenient as clients might like. A minimal interface, on the other hand, is one with as few functions in it as possible, one in which no two member functions have overlapping functionality. If you offer a complete, minimal interface, clients can do whatever they want to do, but the class interface is no more complicated than absolutely necessary.

The desirability of a complete interface seems obvious enough, but why a minimal interface? Why not just give clients everything they ask for, adding functionality until everyone is happy?

Aside from the moral issue — is it really right to mollycoddle your clients? — there are definite technical disadvantages to a class interface that is crowded with functions. First, the more functions in an interface, the harder it is for potential clients to understand. The harder it is for them to understand, the more reluctant they will be to learn how to use it. A class with 10 functions looks tractable to most people, but a class with 100 functions is enough to make many programmers run and hide. By expanding the functionality of your class to make it as attractive as possible, you may actually end up discouraging people from learning how to use it.

A large interface can also lead to confusion. Suppose you create a class that supports cognition for an artificial intelligence application. One of your member functions is called think, but you later discover that some people want the function to be called ponder, and others prefer the name ruminate. In an effort to be accommodating, you offer all three functions, even though they do the same thing. Consider then the plight of a potential client of your class who is trying to figure things out. The client is faced with three different functions, all of which are supposed to do the same thing. Can that really be true? Isn't there some subtle difference between the three, possibly in efficiency or generality or reliability? If not, why are there three different functions? Rather than appreciating your flexibility, such a potential client is likely to wonder what on earth you were thinking (or pondering, or ruminating over).

A second disadvantage to a large class interface is that of maintenance (see Item M32). It's simply more difficult to maintain and enhance a class with many functions than it is a class with few. It is more difficult to avoid duplicated code (with the attendant duplicated bugs), and it is more difficult to maintain consistency across the interface. It's also more difficult to document.

Finally, long class definitions result in long header files. Because header files typically have to be read every time a program is compiled (see Item 34), class definitions that are longer than necessary can incur a substantial penalty in total compile-time over the life of a project.

The long and short of it is that the gratuitous addition of functions to an interface is not without costs, so you need to think carefully about whether the convenience of a new function (a new function can only be added for convenience if the interface is already complete) justifies the additional costs in complexity, comprehensibility, maintainability, and compilation speed.

Yet there's no sense in being unduly miserly. It is often justifiable to offer more than a minimal set of functions. If a commonly performed task can be implemented much more efficiently as a member function, that may well justify its addition to the interface. (Then again, it may not. See Item M16.) If the addition of a member function makes the class substantially easier to use, that may be enough to warrant its inclusion in the class. And if adding a member function is likely to prevent client errors, that, too, is a powerful argument for its being part of the interface.

Consider a concrete example: a template for classes that implement arrays with client-defined upper and lower bounds and that offer optional bounds-checking. The beginning of such an array template is shown below:

template<class T>
class Array {
public:
  enum BoundsCheckingStatus {NO_CHECK_BOUNDS = 0,
                             CHECK_BOUNDS = 1};

  Array(int lowBound, int highBound,
       BoundsCheckingStatus check = NO_CHECK_BOUNDS);

  Array(const Array& rhs);

  ~Array();

  Array& operator=(const Array& rhs);

private:
  int lBound, hBound;         // low bound, high bound

  vector<T> data;             // contents of array; see
                              // Item 49 for vector info

  BoundsCheckingStatus checkingBounds;
};

The member functions declared so far are the ones that require basically no thinking (or pondering or ruminating). You have a constructor to allow clients to specify each array's bounds, a copy constructor, an assignment operator, and a destructor. In this case, you've declared the destructor nonvirtual, which implies that this class is not to be used as a base class (see Item 14).

The declaration of the assignment operator is actually less clear-cut than it might at first appear. After all, built-in arrays in C++ don't allow assignment, so you might want to disallow it for your Array objects, too (see Item 27). On the other hand, the array-like vector template (in the standard library; see Item 49) permits assignments between vector objects. In this example, you'll follow vector's lead, and that decision, as you'll see below, will affect other portions of the class's interface.

Old-time C hacks would cringe to see this interface. Where is the support for declaring an array of a particular size? It would be easy enough to add another constructor,

Array(int size,
      BoundsCheckingStatus check = NO_CHECK_BOUNDS);

but this is not part of a minimal interface, because the constructor taking an upper and lower bound can be used to accomplish the same thing. Nonetheless, it might be a wise political move to humor the old geezers, possibly under the rubric of consistency with the base language.

What other functions do you need? Certainly it is part of a complete interface to index into an array:

// return element for read/write
T& operator[](int index);

// return element for read-only
const T& operator[](int index) const;

By declaring the same function twice, once const and once non-const, you provide support for both const and non-const Array objects. The difference in return types is significant, as is explained in Item 21.
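
To make the const/non-const pair concrete, here is one way the two operators might be implemented. Treat it as a sketch only: the helper functions count and offset are hypothetical, and throwing std::out_of_range when a check fails is an assumption, because nothing above says how a bounds violation should be reported.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

template<class T>
class Array {
public:
  enum BoundsCheckingStatus { NO_CHECK_BOUNDS = 0, CHECK_BOUNDS = 1 };

  Array(int lowBound, int highBound,
        BoundsCheckingStatus check = NO_CHECK_BOUNDS)
    : lBound(lowBound), hBound(highBound),
      data(count(lowBound, highBound)), checkingBounds(check) {}

  // read/write access, selected for non-const Arrays
  T& operator[](int index) { return data[offset(index)]; }

  // read-only access, selected for const Arrays
  const T& operator[](int index) const { return data[offset(index)]; }

private:
  // Hypothetical helper. Assumption: bounds are inclusive and ordered.
  static std::size_t count(int low, int high) {
    assert(low <= high);
    return static_cast<std::size_t>(high - low) + 1;
  }

  // Hypothetical helper: map a client index onto the zero-based vector,
  // checking it first if the client asked for checking. Reporting a
  // violation with std::out_of_range is this sketch's choice.
  std::size_t offset(int index) const {
    if (checkingBounds == CHECK_BOUNDS &&
        (index < lBound || index > hBound))
      throw std::out_of_range("Array index out of bounds");
    return static_cast<std::size_t>(index - lBound);
  }

  int lBound, hBound;
  std::vector<T> data;
  BoundsCheckingStatus checkingBounds;
};
```

Both operators share one private helper, so the index arithmetic and the optional check live in a single place and the two overloads cannot drift apart.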

As it now stands, the Array template supports construction, destruction, pass-by-value, assignment, and indexing, which may strike you as a complete interface. But look closer. Suppose a client wants to loop through an array of integers, printing out each of its elements, like so:

Array<int> a(10, 20);      // bounds on a are 10 to 20

...

for (int i = lower bound of a; i <= upper bound of a; ++i)
  cout << "a[" << i << "] = " << a[i] << '\n';

How is the client to get the bounds of a? The answer depends on what happens during assignment of Array objects, i.e., on what happens inside Array::operator=. In particular, if assignment can change the bounds of an Array object, you must provide member functions to return the current bounds, because the client has no way of knowing a priori what the bounds are at any given point in the program. In the example above, if a was the target of an assignment between the time it was defined and the time it was used in the loop, the client would have no way to determine the current bounds of a.

On the other hand, if the bounds of an Array object cannot be changed during assignment, then the bounds are fixed at the point of definition, and it would be possible (though cumbersome) for a client to keep track of these bounds. In that case, though it would be convenient to offer functions to return the current bounds, such functions would not be part of a truly minimal interface.

Proceeding on the assumption that assignment can modify the bounds of an object, the bounds functions could be declared thus:

int lowBound() const;
int highBound() const;

Because these functions don't modify the object on which they are invoked, and because you prefer to use const whenever you can (see Item 21), these are both declared const member functions. Given these functions, the loop above would be written as follows:

for (int i = a.lowBound(); i <= a.highBound(); ++i)
  cout << "a[" << i << "] = " << a[i] << '\n';

Needless to say, for such a loop to work for an array of objects of type T, an operator<< function must be defined for objects of type T. (That's not quite true. What must exist is an operator<< for T or for some other type to which T may be implicitly converted (see Item M5). But you get the idea.)
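
To see why the bounds functions earn their place once assignment can change bounds, consider the sketch below. It is a stripped-down stand-in for Array, not the full template: it relies on the compiler-generated assignment operator, which copies the bounds and the contents memberwise, and the function sum is a hypothetical client helper written purely in terms of the public interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template<class T>
class Array {
public:
  // Assumption: lowBound <= highBound; bounds checking is omitted here.
  Array(int lowBound, int highBound)
    : lBound(lowBound), hBound(highBound),
      data(static_cast<std::size_t>(highBound - lowBound + 1)) {}

  // No operator= is declared, so the compiler-generated one is used.
  // It assigns lBound, hBound, and data, so the target's bounds change.

  int lowBound() const { return lBound; }
  int highBound() const { return hBound; }

  T& operator[](int index) {
    assert(index >= lBound && index <= hBound);
    return data[static_cast<std::size_t>(index - lBound)];
  }
  const T& operator[](int index) const {
    assert(index >= lBound && index <= hBound);
    return data[static_cast<std::size_t>(index - lBound)];
  }

private:
  int lBound, hBound;
  std::vector<T> data;
};

// Hypothetical client code: it asks the object for its current bounds
// instead of remembering the bounds used at the point of definition.
template<class T>
T sum(const Array<T>& a) {
  T total = T();
  for (int i = a.lowBound(); i <= a.highBound(); ++i)
    total += a[i];
  return total;
}
```

If a is defined with bounds 10 to 20 and b with bounds 1 to 3, then after a = b the calls a.lowBound() and a.highBound() yield 1 and 3. A loop written in terms of the bounds functions stays correct, whereas one that hard-coded 10 and 20 would index outside the array.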

Some designers would argue that the Array class should also offer a function to return the number of elements in an Array object. The number of elements is simply highBound()-lowBound()+1, so such a function is not really necessary, but in view of the frequency of off-by-one errors, it might not be a bad idea to add such a function.
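
Such a function is tiny, which is part of its appeal. The sketch below uses a stand-in for Array that stores only the bounds; the name size is an assumption, since the text does not name the function.

```cpp
#include <cassert>

// Stand-in for the Array template. Element storage is omitted because
// only the bounds matter for this illustration.
template<class T>
class Array {
public:
  Array(int lowBound, int highBound)
    : lBound(lowBound), hBound(highBound) {
    assert(lowBound <= highBound);  // assumption: inclusive, ordered bounds
  }

  int lowBound() const { return lBound; }
  int highBound() const { return hBound; }

  // Hypothetical convenience function. It gets the "+ 1" right once,
  // so that clients need not repeat the arithmetic themselves.
  int size() const { return hBound - lBound + 1; }

private:
  int lBound, hBound;
};
```

For an Array<int> with bounds 10 to 20, size() yields 11, not the 10 that a hurried client might compute by subtracting the bounds.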

Other functions that might prove worthwhile for this class include those for input and output, as well as the various relational operators (e.g., <, >, and ==). None of those functions is part of a minimal interface, however, because they can all be implemented in terms of loops containing calls to operator[].
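
As an illustration of that last point, here is a sketch of operator== written as a non-member function that uses nothing but the bounds functions and operator[]. The Array shown is again a stripped-down stand-in. Treating arrays with different bounds as unequal is a design assumption made for this example; the text does not say what equality should mean.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template<class T>
class Array {
public:
  // Assumption: lowBound <= highBound; bounds checking is omitted here.
  Array(int lowBound, int highBound)
    : lBound(lowBound), hBound(highBound),
      data(static_cast<std::size_t>(highBound - lowBound + 1)) {}

  int lowBound() const { return lBound; }
  int highBound() const { return hBound; }

  T& operator[](int index) {
    assert(index >= lBound && index <= hBound);
    return data[static_cast<std::size_t>(index - lBound)];
  }
  const T& operator[](int index) const {
    assert(index >= lBound && index <= hBound);
    return data[static_cast<std::size_t>(index - lBound)];
  }

private:
  int lBound, hBound;
  std::vector<T> data;
};

// Needs no access to Array's private members. Assumption: two Arrays
// are equal only if their bounds match and every element compares equal.
template<class T>
bool operator==(const Array<T>& lhs, const Array<T>& rhs) {
  if (lhs.lowBound() != rhs.lowBound() ||
      lhs.highBound() != rhs.highBound())
    return false;

  for (int i = lhs.lowBound(); i <= lhs.highBound(); ++i)
    if (!(lhs[i] == rhs[i]))
      return false;

  return true;
}
```

Because the function compiles without being a member or a friend, it shows that the comparison adds convenience rather than completeness.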

Speaking of functions like operator<<, operator>>, and the relational operators, Item 19 discusses why they are frequently implemented as non-member friend functions instead of as member functions. That being the case, don't forget that friend functions are, for all practical purposes, part of a class's interface. That means that friend functions count toward a class interface's completeness and minimalness.
