More Effective C++ | Item 3: Never treat arrays polymorphically

Item 3: Never treat arrays polymorphically.

One of the most important features of inheritance is that you can manipulate derived class objects through pointers and references to base class objects. Such pointers and references are said to behave polymorphically — as if they had multiple types. C++ also allows you to manipulate arrays of derived class objects through base class pointers and references. This is no feature at all, because it almost never works the way you want it to.

For example, suppose you have a class BST (for binary search tree objects) and a second class, BalancedBST, that inherits from BST:

class BST { ... };

class BalancedBST: public BST { ... };

In a real program such classes would be templates, but that's unimportant here, and adding all the template syntax just makes things harder to read. For this discussion, we'll assume BST and BalancedBST objects contain only ints.

Consider a function to print out the contents of each BST in an array of BSTs:

void printBSTArray(ostream& s,
                   const BST array[],
                   int numElements)
{
  for (int i = 0; i < numElements; ++i) {
    s << array[i];              // this assumes an
  }                             // operator<< is defined
}                               // for BST objects

This will work fine when you pass it an array of BST objects:

BST BSTArray[10];

...

printBSTArray(cout, BSTArray, 10);          // works fine

Consider, however, what happens when you pass printBSTArray an array of BalancedBST objects:

BalancedBST bBSTArray[10];

...

printBSTArray(cout, bBSTArray, 10);         // works fine?

Your compilers will accept this function call without complaint, but look again at the loop for which they must generate code:

for (int i = 0; i < numElements; ++i) {
  s << array[i];
}

Now, array[i] is really just shorthand for an expression involving pointer arithmetic: it stands for *(array+i). We know that array is a pointer to the beginning of the array, but how far away from the memory location pointed to by array is the memory location pointed to by array+i? The distance between them is i*sizeof(an object in the array), because there are i objects between array[0] and array[i]. In order for compilers to emit code that walks through the array correctly, they must be able to determine the size of the objects in the array. This is easy for them to do. The parameter array is declared to be of type array-of-BST, so each element of the array must be a BST, and the distance between array and array+i must be i*sizeof(BST).
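The stride calculation can be checked directly. The following is a minimal sketch, assuming a hypothetical stand-in for BST that holds a single int (the real class's members are elided above). The helpers byteDistance and checkStride are illustrative names, not part of the original code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for BST; the real members are elided in the text.
struct BST {
  int value = 0;
};

// Illustrative helper: the number of bytes between array and array + i.
std::ptrdiff_t byteDistance(const BST array[], int i)
{
  const char* first = reinterpret_cast<const char*>(array);
  const char* ith   = reinterpret_cast<const char*>(array + i);
  return ith - first;
}

// Checks the two claims from the text on a small array of BSTs.
void checkStride()
{
  BST trees[4];
  for (int i = 0; i < 4; ++i) {
    assert(&trees[i] == trees + i);   // array[i] is *(array + i)
    assert(byteDistance(trees, i) ==
           static_cast<std::ptrdiff_t>(i * sizeof(BST)));
  }
}
```

Calling checkStride() should pass on any conforming implementation, because the stride is computed from the declared element type, BST, and here that is also the real element type.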

At least that's how your compilers look at it. But if you've passed an array of BalancedBST objects to printBSTArray, your compilers are probably wrong. In that case, they'd assume each object in the array is the size of a BST, but each object would actually be the size of a BalancedBST. Derived classes usually have more data members than their base classes, so derived class objects are usually larger than base class objects. We thus expect a BalancedBST object to be larger than a BST object. If it is, the pointer arithmetic generated for printBSTArray will be wrong for arrays of BalancedBST objects, and there's no telling what will happen when printBSTArray is invoked on a BalancedBST array. Whatever does happen, it's a good bet it won't be pleasant.
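To make the mismatch concrete, here is a small sketch under stated assumptions: BST holds one int and BalancedBST adds one more data member, here called balanceFactor. Both layouts are hypothetical, as are the helper names assumedOffset and actualOffset. The sketch only compares byte offsets; it never indexes a BalancedBST array through a BST pointer, which is the thing to avoid.

```cpp
#include <cstddef>

// Hypothetical stand-ins; the real members are elided in the text.
struct BST {
  int value = 0;
};

struct BalancedBST : public BST {
  int balanceFactor = 0;   // assumed extra data member
};

// Byte offset at which code compiled for "array of BST" looks for element i.
constexpr std::size_t assumedOffset(std::size_t i)
{
  return i * sizeof(BST);
}

// Byte offset at which element i of a BalancedBST array actually begins.
constexpr std::size_t actualOffset(std::size_t i)
{
  return i * sizeof(BalancedBST);
}

// With the assumed members, the derived class is the larger of the two.
static_assert(sizeof(BalancedBST) > sizeof(BST),
              "derived objects are larger in this sketch");

// Element 0 lines up, but every later element is looked for too early.
static_assert(assumedOffset(0) == actualOffset(0), "element 0 lines up");
static_assert(assumedOffset(1) < actualOffset(1),  "element 1 does not");
```

The gap between the two offsets grows with i, so under these assumptions every element after the first would be read from the wrong address.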

The problem pops up in a different guise if you try to delete an array of derived class objects through a base class pointer. Here's one way you might innocently attempt to do it:

// delete an array, but first log a message about its
// deletion
void deleteArray(ostream& logStream, BST array[])
{
  logStream << "Deleting array at address "
            << static_cast<void*>(array) << '\n';

  delete [] array;
}

BalancedBST *balTreeArray =                  // create a BalancedBST
  new BalancedBST[50];                       // array

...

deleteArray(cout, balTreeArray);             // log its deletion

You can't see it, but there's pointer arithmetic going on here, too. When an array is deleted, a destructor for each element of the array must be called (see Item 8). When compilers see the statement

delete [] array;

they must generate code that does something like this:

// destroy the objects in the array in the reverse of the
// order in which they were constructed
for (int i = the number of elements in the array - 1;
     i >= 0;
     --i)
{
  array[i].BST::~BST();                     // call array[i]'s
}                                           // destructor

Just as this kind of loop failed to work when you wrote it, it will fail to work when your compilers write it, too. The language specification says the result of deleting an array of derived class objects through a base class pointer is undefined, but we know what that really means: executing the code is almost certain to lead to grief. Polymorphism and pointer arithmetic simply don't mix. Array operations almost always involve pointer arithmetic, so arrays and polymorphism don't mix.
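One possible way to keep a logging helper like deleteArray without the type mismatch is to make it a template, so that the element type is deduced from the pointer the caller passes. This is offered as an assumption, not something the text prescribes. The sketch below uses hypothetical stand-ins for BST and BalancedBST, plus a liveObjects counter and a createAndDelete function that exist only to make the behavior observable.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-ins; the real members are elided in the text.
struct BST {
  static int liveObjects;            // bookkeeping for this sketch only
  BST()  { ++liveObjects; }
  ~BST() { --liveObjects; }
  int value = 0;
};
int BST::liveObjects = 0;

struct BalancedBST : public BST {
  int balanceFactor = 0;             // assumed extra data member
};

// Log a message, then delete the array. T is deduced from the argument,
// so delete [] sees the same element type the caller's pointer has.
template <typename T>
void deleteArray(std::ostream& logStream, T* array)
{
  logStream << "Deleting array at address "
            << static_cast<void*>(array) << '\n';

  delete [] array;
}

// Creates a BalancedBST array, deletes it through the template, and
// returns the text that was logged.
std::string createAndDelete()
{
  std::ostringstream log;
  BalancedBST* balTreeArray = new BalancedBST[50];
  assert(BST::liveObjects == 50);

  deleteArray(log, balTreeArray);    // T is deduced as BalancedBST
  return log.str();
}
```

This helps only when the caller's pointer already has the array's real element type. A BST* that points to a BalancedBST array would still deduce T as BST and reproduce the original problem.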

Note that you're unlikely to make the mistake of treating an array polymorphically if you avoid having a concrete class (like BalancedBST) inherit from another concrete class (such as BST). As Item 33 explains, designing your software so that concrete classes never inherit from one another has many benefits. I encourage you to turn to Item 33 and read all about them.
