Item 36: Differentiate between inheritance of interface and inheritance of implementation.
The seemingly straightforward notion of (public) inheritance turns out, upon closer examination, to be composed of two separable parts: inheritance of function interfaces and inheritance of function implementations. The difference between these two kinds of inheritance corresponds exactly to the difference between function declarations and function definitions discussed in the Introduction to this book.
As a class designer, you sometimes want derived classes to inherit only the interface (declaration) of a member function; sometimes you want derived classes to inherit both the interface and the implementation for a function, but you want to allow them to override the implementation you provide; and sometimes you want them to inherit both interface and implementation without allowing them to override anything.
To get a better feel for the differences among these options, consider a class hierarchy for representing geometric shapes in a graphics application:
class Shape {
public:
  virtual void draw() const = 0;
  virtual void error(const string& msg);
  int objectID() const;
  ...
};
class Rectangle: public Shape { ... };
class Ellipse: public Shape { ... };
Shape is an abstract class; its pure virtual function draw marks it as such. As a result, clients cannot create instances of the Shape class, only of the classes derived from it. Nonetheless, Shape exerts a strong influence on all classes that (publicly) inherit from it, because member function interfaces are always inherited.
Three functions are declared in the Shape class. The first, draw, draws the current object on an implicit display. The second, error, is called by member functions if they need to report an error. The third, objectID, returns a unique integer identifier for the current object; Item 17 gives an example of how such a function might be used. Each function is declared in a different way: draw is a pure virtual function; error is a simple (impure?) virtual function; and objectID is a nonvirtual function. What are the implications of these different declarations?
Consider first the pure virtual function draw. The two most salient features of pure virtual functions are that they must be redeclared by any concrete class that inherits them, and they typically have no definition in abstract classes. Put these two traits together, and you realize that the purpose of declaring a pure virtual function is to have derived classes inherit a function interface only.
This makes perfect sense for the Shape::draw function, because it is a reasonable demand that all Shape objects must be drawable, but the Shape class can provide no reasonable default implementation for that function. The algorithm for drawing an ellipse is very different from the algorithm for drawing a rectangle, for example. A good way to interpret the declaration of Shape::draw is as saying to designers of subclasses, "You must provide a draw function, but I have no idea how you're going to implement it."
Incidentally, it is possible to provide a definition for a pure virtual function. That is, you could provide an implementation for Shape::draw, and C++ wouldn't complain, but the only way to call it would be to fully specify the call with the class name:
Shape *ps = new Shape;           // error! Shape is abstract

Shape *ps1 = new Rectangle;      // fine
ps1->draw();                     // calls Rectangle::draw

Shape *ps2 = new Ellipse;        // fine
ps2->draw();                     // calls Ellipse::draw

ps1->Shape::draw();              // calls Shape::draw
ps2->Shape::draw();              // calls Shape::draw
Aside from helping impress fellow programmers at cocktail parties, knowledge of this feature is generally of limited utility. As you'll see below, however, it can be employed as a mechanism for providing a safer-than-usual default implementation for simple (impure) virtual functions.
Sometimes it's useful to declare a class containing nothing but pure virtual functions. Such a Protocol class can provide only function interfaces for derived classes, never implementations. Protocol classes are described in Item 34 and are mentioned again in Item 43.
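As a concrete illustration, here is a minimal sketch of what such a Protocol class might look like. The class name DataCollection and its member functions are hypothetical, invented for this example rather than drawn from the hierarchy above:

class DataCollection {                        // a Protocol class: nothing
public:                                       // but pure virtual functions
  virtual ~DataCollection() {}                // virtual destructor, so
                                              // deletion through a base
                                              // pointer is safe (Item 14)
  virtual void insert(int value) = 0;         // interface only; concrete
  virtual bool contains(int value) const = 0; // derived classes must
};                                            // provide implementations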
The story behind simple virtual functions is a bit different from that behind pure virtuals. As usual, derived classes inherit the interface of the function, but simple virtual functions traditionally provide an implementation that derived classes may or may not choose to override. If you think about this for a minute, you'll realize that the purpose of declaring a simple virtual function is to have derived classes inherit a function interface as well as a default implementation.
In the case of Shape::error, the interface says that every class must support a function to be called when an error is encountered, but each class is free to handle errors in whatever way it sees fit. If a class doesn't want to do anything special, it can just fall back on the default error-handling provided in the Shape class. That is, the declaration of Shape::error says to designers of subclasses, "You've got to support an error function, but if you don't want to write your own, you can fall back on the default version in the Shape class."
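A brief sketch of how that might look in practice. The function bodies below are assumptions made for the sake of the example; they are not part of the hierarchy above:

void Shape::error(const string& msg)           // one plausible default:
{                                              // this body is assumed,
  cerr << "Shape error: " << msg << '\n';      // not given in the text
}

class Ellipse: public Shape {
public:
  virtual void draw() const;                   // required: draw is pure virtual
  virtual void error(const string& msg)        // Ellipse chooses its own
  { cerr << "Ellipse error: " << msg << '\n'; }// way of handling errors
  ...
};

class Rectangle: public Shape {
public:
  virtual void draw() const;                   // required
  ...                                          // no error function declared:
};                                             // Rectangle inherits and uses
                                               // Shape::error by default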
It turns out that it can be dangerous to allow simple virtual functions to specify both a function declaration and a default implementation. To see why, consider a hierarchy of airplanes for XYZ Airlines. XYZ has only two kinds of planes, the Model A and the Model B, and both are flown in exactly the same way. Hence, XYZ designs the following hierarchy:
class Airport { ... };                   // represents airports

class Airplane {
public:
  virtual void fly(const Airport& destination);
  ...
};

void Airplane::fly(const Airport& destination)
{
  default code for flying an airplane to
  the given destination
}

class ModelA: public Airplane { ... };
class ModelB: public Airplane { ... };
To express that all planes have to support a fly function, and in recognition of the fact that different models of plane could, in principle, require different implementations for fly, Airplane::fly is declared virtual. However, in order to avoid writing identical code in the ModelA and ModelB classes, the default flying behavior is provided as the body of Airplane::fly, which both ModelA and ModelB inherit.
This is a classic object-oriented design. Two classes share a common feature (the way they implement fly), so the common feature is moved into a base class, and the feature is inherited by the two classes. This design makes common features explicit, avoids code duplication, facilitates future enhancements, and eases long-term maintenance: all the things for which object-oriented technology is so highly touted. XYZ Airlines should be proud.
Now suppose that XYZ, its fortunes on the rise, decides to acquire a new type of airplane, the Model C. The Model C differs from the Model A and the Model B. In particular, it is flown differently.
XYZ's programmers add the class for Model C to the hierarchy, but in their haste to get the new model into service, they forget to redefine the fly function:
class ModelC: public Airplane {
  ...                                    // no fly function is declared
};

In their code, then, they have something akin to the following:
Airport JFK(...);                        // JFK is an airport in
                                         // New York City
Airplane *pa = new ModelC;

...

pa->fly(JFK);                            // calls Airplane::fly!
This is a disaster: an attempt is being made to fly a ModelC object as if it were a ModelA or a ModelB. That's not the kind of behavior that inspires confidence in the traveling public.
The problem here is not that Airplane::fly has default behavior, but that ModelC was allowed to inherit that behavior without explicitly saying that it wanted to. Fortunately, it's easy to offer default behavior to subclasses, but not give it to them unless they ask for it. The trick is to sever the connection between the interface of the virtual function and its default implementation. Here's one way to do it:
class Airplane {
public:
  virtual void fly(const Airport& destination) = 0;
  ...

protected:
  void defaultFly(const Airport& destination);
};

void Airplane::defaultFly(const Airport& destination)
{
  default code for flying an airplane to
  the given destination
}
Notice how Airplane::fly has been turned into a pure virtual function. That provides the interface for flying. The default implementation is also present in the Airplane class, but now it's in the form of an independent function, defaultFly. Classes like ModelA and ModelB that want to use the default behavior simply make an inline call to defaultFly inside their body of fly (but see Item 33 for information on the interaction of inlining and virtual functions):
class ModelA: public Airplane {
public:
  virtual void fly(const Airport& destination)
  { defaultFly(destination); }
  ...
};

class ModelB: public Airplane {
public:
  virtual void fly(const Airport& destination)
  { defaultFly(destination); }
  ...
};
For the ModelC class, there is no possibility of accidentally inheriting the incorrect implementation of fly, because the pure virtual in Airplane forces ModelC to provide its own version of fly.
class ModelC: public Airplane {
public:
  virtual void fly(const Airport& destination);
  ...
};

void ModelC::fly(const Airport& destination)
{
  code for flying a ModelC airplane to
  the given destination
}
This scheme isn't foolproof (programmers can still copy-and-paste themselves into trouble), but it's more reliable than the original design. As for Airplane::defaultFly, it's protected because it's truly an implementation detail of Airplane and its derived classes. Clients using airplanes should care only that they can be flown, not how the flying is implemented.
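To make that point concrete, here is what a client who tries to bypass fly would see; this snippet is purely illustrative:

Airplane *pa = new ModelA;

pa->fly(JFK);                            // fine: flying is the
                                         // public interface
pa->defaultFly(JFK);                     // error! defaultFly is
                                         // protected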
It's also important that Airplane::defaultFly is a nonvirtual function. This is because no subclass should redefine this function, a truth to which Item 37 is devoted. If defaultFly were virtual, you'd have a circular problem: what if some subclass forgets to redefine defaultFly when it's supposed to?
Some people object to the idea of having separate functions for providing interface and default implementation, such as fly and defaultFly above. For one thing, they note, it pollutes the class namespace with a proliferation of closely related function names. Yet they still agree that interface and default implementation should be separated. How do they resolve this seeming contradiction? By taking advantage of the fact that pure virtual functions must be redeclared in subclasses, but they may also have implementations of their own. Here's how the Airplane hierarchy could take advantage of the ability to define a pure virtual function:
class Airplane {
public:
  virtual void fly(const Airport& destination) = 0;
  ...
};

void Airplane::fly(const Airport& destination)
{
  default code for flying an airplane to
  the given destination
}

class ModelA: public Airplane {
public:
  virtual void fly(const Airport& destination)
  { Airplane::fly(destination); }
  ...
};

class ModelB: public Airplane {
public:
  virtual void fly(const Airport& destination)
  { Airplane::fly(destination); }
  ...
};

class ModelC: public Airplane {
public:
  virtual void fly(const Airport& destination);
  ...
};

void ModelC::fly(const Airport& destination)
{
  code for flying a ModelC airplane to
  the given destination
}
This is almost exactly the same design as before, except that the body of the pure virtual function Airplane::fly takes the place of the independent function Airplane::defaultFly. In essence, fly has been broken into its two fundamental components. Its declaration specifies its interface (which derived classes must use), while its definition specifies its default behavior (which derived classes may use, but only if they explicitly request it). In merging fly and defaultFly, however, you've lost the ability to give the two functions different protection levels: the code that used to be protected (by being in defaultFly) is now public (because it's in fly).
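One consequence of that loss, sketched below for illustration: clients can now reach the default flying code directly through a qualified call (the same mechanism shown earlier for Shape::draw), something the protected defaultFly design prevented:

Airplane *pa = new ModelC;

pa->Airplane::fly(JFK);                  // legal, and statically bound:
                                         // invokes the default flying
                                         // code, even though pa points
                                         // to a ModelC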
Finally, we come to Shape's nonvirtual function, objectID. When a member function is nonvirtual, it's not supposed to behave differently in derived classes. In fact, a nonvirtual member function specifies an invariant over specialization, because it identifies behavior that is not supposed to change, no matter how specialized a derived class becomes. As such, the purpose of declaring a nonvirtual function is to have derived classes inherit a function interface as well as a mandatory implementation.
You can think of the declaration for Shape::objectID as saying, "Every Shape object has a function that yields an object identifier, and that object identifier is always computed in the same way. That way is determined by the definition of Shape::objectID, and no derived class should try to change how it's done." Because a nonvirtual function identifies an invariant over specialization, it should never be redefined in a subclass, a point that is discussed in detail in Item 37.
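A short sketch of why such a redefinition is dangerous (the matter Item 37 treats in detail). The Triangle class and its objectID redefinition below are hypothetical, deliberately bad code:

class Triangle: public Shape {           // hypothetical derived class
public:
  virtual void draw() const;             // required: draw is pure virtual
  int objectID() const;                  // bad idea: hides, rather than
  ...                                    // overrides, Shape::objectID
};

Triangle t;
Shape *ps = &t;
Triangle *pt = &t;

ps->objectID();                          // calls Shape::objectID
pt->objectID();                          // calls Triangle::objectID; the
                                         // supposed invariant now varies
                                         // with the type of the pointer!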
The differences in declarations for pure virtual, simple virtual, and nonvirtual functions allow you to specify with precision what you want derived classes to inherit: interface only, interface and a default implementation, or interface and a mandatory implementation, respectively. Because these different types of declarations mean fundamentally different things, you must choose carefully among them when you declare your member functions. If you do, you should avoid the two most common mistakes made by inexperienced class designers.
The first mistake is to declare all functions nonvirtual. That leaves no room for specialization in derived classes; nonvirtual destructors are particularly problematic (see Item 14). Of course, it's perfectly reasonable to design a class that is not intended to be used as a base class. Item M34 gives an example of a case where you might want to. In that case, a set of exclusively nonvirtual member functions is appropriate. Too often, however, such classes are declared either out of ignorance of the differences between virtual and nonvirtual functions or as a result of an unsubstantiated concern over the performance cost of virtual functions (see Item M24). The fact of the matter is that almost any class that's to be used as a base class will have virtual functions (again, see Item 14).
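The destructor problem deserves a brief illustration. This sketch assumes the Airplane hierarchy above, with destructors added purely for the sake of the example:

class Airplane {
public:
  ~Airplane();                           // nonvirtual destructor:
  ...                                    // trouble ahead
};

Airplane *pa = new ModelA;
...
delete pa;                               // undefined behavior: a ModelA
                                         // is deleted through a base
                                         // class pointer whose destructor
                                         // isn't virtual (see Item 14)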
If you're concerned about the cost of virtual functions, allow me to bring up the rule of 80-20 (see Item M16), which states that in a typical program, 80 percent of the runtime will be spent executing just 20 percent of the code. This rule is important, because it means that, on average, 80 percent of your function calls can be virtual without having the slightest detectable impact on your program's overall performance. Before you go gray worrying about whether you can afford the cost of a virtual function, then, take the simple precaution of making sure that you're focusing on the 20 percent of your program where the decision might really make a difference.
The other common problem is to declare all member functions virtual. Sometimes this is the right thing to do; witness Protocol classes (see Item 34), for example. However, it can also be a sign of a class designer who lacks the backbone to take a firm stand. Some functions should not be redefinable in derived classes, and whenever that's the case, you've got to say so by making those functions nonvirtual. It serves no one to pretend that your class can be all things to all people if they'll just take the time to redefine all your functions. Remember that if you have a base class B, a derived class D, and a member function mf, then each of the following calls to mf must work properly:
D *pd = new D;
B *pb = pd;

pb->mf();                                // call mf through a
                                         // pointer-to-base
pd->mf();                                // call mf through a
                                         // pointer-to-derived
Sometimes, you must make mf a nonvirtual function to ensure that everything behaves the way it's supposed to (see Item 37). If you have an invariant over specialization, don't be afraid to say so!