
Item 17: Consider using lazy evaluation.

From the perspective of efficiency, the best computations are those you never perform at all. That's fine, but if you don't need to do something, why would you put code in your program to do it in the first place? And if you do need to do something, how can you possibly avoid executing the code that does it?

The key is to be lazy.

Remember when you were a child and your parents told you to clean your room? If you were anything like me, you'd say "Okay," then promptly go back to what you were doing. You would not clean your room. In fact, cleaning your room would be the last thing on your mind — until you heard your parents coming down the hall to confirm that your room had, in fact, been cleaned. Then you'd sprint to your room and get to work as fast as you possibly could. If you were lucky, your parents would never check, and you'd avoid all the work cleaning your room normally entails.

It turns out that the same delay tactics that work for a five-year-old work for a C++ programmer. In computer science, however, we dignify such procrastination with the name lazy evaluation. When you employ lazy evaluation, you write your classes in such a way that they defer computations until the results of those computations are required. If the results are never required, the computations are never performed, and neither your software's clients nor your parents are any the wiser.

Perhaps you're wondering exactly what I'm talking about. Perhaps an example would help. Well, lazy evaluation is applicable in an enormous variety of application areas, so I'll describe four.

Reference Counting

Consider this code:
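
    class String { ... };             // a string class; the standard
                                      // string type may be implemented
                                      // this way, but it doesn't have to be

    String s1 = "Hello";

    String s2 = s1;                   // call String copy constructor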

A common implementation for the String copy constructor would result in s1 and s2 each having its own copy of "Hello" after s2 is initialized with s1. Such a copy constructor would incur a relatively large expense, because it would have to make a copy of s1's value to give to s2, and that would typically entail allocating heap memory via the new operator (see Item 8) and calling strcpy to copy the data in s1 into the memory allocated by s2. This is eager evaluation: making a copy of s1 and putting it into s2 just because the String copy constructor was called. At this point, however, there has been no real need for s2 to have a copy of the value, because s2 hasn't been used yet.

The lazy approach is a lot less work. Instead of giving s2 a copy of s1's value, we have s2 share s1's value. All we have to do is a little bookkeeping so we know who's sharing what, and in return we save the cost of a call to new and the expense of copying anything. The fact that s1 and s2 are sharing a data structure is transparent to clients, and it certainly makes no difference in statements like the following, because they only read values, they don't write them:
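
    cout << s1;                       // read s1's value; this assumes String
                                      // supports stream output
    cout << s1 + s2;                  // read s1's and s2's values (and assumes
                                      // String supports concatenation)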

In fact, the only time the sharing of values makes a difference is when one or the other string is modified; then it's important that only one string be changed, not both. In this statement,
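
    s2.convertToUpperCase();          // write s2's value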

it's crucial that only s2's value be changed, not s1's also.

To handle statements like this, we have to implement String's convertToUpperCase function so that it makes a copy of s2's value and makes that value private to s2 before modifying it. Inside convertToUpperCase, we can be lazy no longer: we have to make a copy of s2's (shared) value for s2's private use. On the other hand, if s2 is never modified, we never have to make a private copy of its value. It can continue to share a value as long as it exists. If we're lucky, s2 will never be modified, in which case we'll never have to expend the effort to give it its own value.

The details on making this kind of value sharing work (including all the code) are provided in Item 29, but the idea is lazy evaluation: don't bother to make a copy of something until you really need one. Instead, be lazy — use someone else's copy as long as you can get away with it. In some application areas, you can often get away with it forever.

Distinguishing Reads from Writes

Pursuing the example of reference-counting strings a bit further, we come upon a second way in which lazy evaluation can help us. Consider this code:
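
    String s = "Homer's Iliad";       // assume s is a reference-counted string

    ...

    cout << s[3];                     // call operator[] to read s[3]

    s[3] = 'x';                       // call operator[] to write s[3]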

The first call to operator[] is to read part of a string, but the second call is to perform a write. We'd like to be able to distinguish the read call from the write, because reading a reference-counted string is cheap, but writing to such a string may require splitting off a new copy of the string's value prior to the write.

This puts us in a difficult implementation position. To achieve what we want, we need to do different things inside operator[] (depending on whether it's being called to perform a read or a write). How can we determine whether operator[] has been called in a read or a write context? The brutal truth is that we can't. By using lazy evaluation and proxy classes as described in Item 30, however, we can defer the decision on whether to take read actions or write actions until we can determine which is correct.

Lazy Fetching

As a third example of lazy evaluation, imagine you've got a program that uses large objects containing many constituent fields. Such objects must persist across program runs, so they're stored in a database. Each object has a unique object identifier that can be used to retrieve the object from the database:
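
    class LargeObject {               // large persistent objects
    public:
      LargeObject(ObjectID id);       // restore object from disk; ObjectID is
                                      // whatever type identifies objects in
                                      // the database

      const string& field1() const;   // value of field 1
      int field2() const;             // value of field 2
      ...                             // accessors for the many other fields

    };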

Now consider the cost of restoring a LargeObject from disk:
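
    void restoreAndProcessObject(ObjectID id)
    {
      LargeObject object(id);         // restoring the object fetches all its
      ...                             // data, even if most of it goes unused
    }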

Because LargeObject instances are big, getting all the data for such an object might be a costly database operation, especially if the data must be retrieved from a remote database and pushed across a network. In some cases, the cost of reading all that data would be unnecessary. For example, consider this kind of application:
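
    void restoreAndProcessObject(ObjectID id)
    {
      LargeObject object(id);

      if (object.field2() == 0) {     // only field2's value is ever used here
        cout << "Object " << id << ": null field2.\n";
      }
    }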

Here only the value of field2 is required, so any effort spent setting up the other fields is wasted.

The lazy approach to this problem is to read no data from disk when a LargeObject object is created. Instead, only the "shell" of an object is created, and data is retrieved from the database only when that particular data is needed inside the object. Here's one way to implement this kind of "demand-paged" object initialization:
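
    class LargeObject {
    public:
      LargeObject(ObjectID id);

      const string& field1() const;
      int field2() const;
      ...                             // accessors for the other fields

    private:
      ObjectID oid;

      mutable string *field1Value;    // pointers to the fields' data; see
      mutable int *field2Value;       // below for a discussion of why
      ...                             // they're declared mutable

    };

    LargeObject::LargeObject(ObjectID id)
    : oid(id), field1Value(0), field2Value(0)   // null pointers mean the
    {}                                          // fields haven't been read yet

    const string& LargeObject::field1() const
    {
      if (field1Value == 0) {                       // field not yet fetched?
        field1Value =                               // read it from the database;
          new string(fetchField1FromDatabase(oid)); // fetchField1FromDatabase is a
      }                                             // hypothetical accessor for
                                                    // whatever database you use
      return *field1Value;
    }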

Each field in the object is represented as a pointer to the necessary data, and the LargeObject constructor initializes each pointer to null. Such null pointers signify fields that have not yet been read from the database. Each LargeObject member function must check the state of a field's pointer before accessing the data it points to. If the pointer is null, the corresponding data must be read from the database before performing any operations on that data.

When implementing lazy fetching, you must confront the problem that null pointers may need to be initialized to point to real data from inside any member function, including const member functions like field1. However, compilers get cranky when you try to modify data members inside const member functions, so you've got to find a way to say, "It's okay, I know what I'm doing." The best way to say that is to declare the pointer fields mutable, which means they can be modified inside any member function, even inside const member functions (see Item E21). That's why the fields inside LargeObject above are declared mutable.

The mutable keyword is a relatively recent addition to C++, so it's possible your vendors don't yet support it. If not, you'll need to find another way to convince your compilers to let you modify data members inside const member functions. One workable strategy is the "fake this" approach, whereby you create a pointer-to-non-const that points to the same object as this does. When you want to modify a data member, you access it through the "fake this" pointer:
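
    const string& LargeObject::field1() const
    {
      // make a pointer that points where this does, but through which
      // the object can be modified
      LargeObject * const fakeThis = const_cast<LargeObject*>(this);

      if (field1Value == 0) {
        fakeThis->field1Value =                     // this assignment is legal,
          new string(fetchField1FromDatabase(oid)); // because what fakeThis points
      }                                             // to isn't const (again,
                                                    // fetchField1FromDatabase is a
                                                    // hypothetical accessor)
      return *field1Value;
    }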

This function employs a const_cast (see Item 2) to cast away the constness of *this. If your compilers don't support const_cast, you can use an old C-style cast:
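
    // the same idea, with an old C-style cast instead of const_cast
    LargeObject * const fakeThis = (LargeObject*)this;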

Look again at the pointers inside LargeObject. Let's face it, it's tedious and error-prone to have to initialize all those pointers to null, then test each one before use. Fortunately, such drudgery can be automated through the use of smart pointers, which you can read about in Item 28. If you use smart pointers inside LargeObject, you'll also find you no longer need to declare the pointers mutable. Alas, it's only a temporary respite, because you'll wind up needing mutable once you sit down to implement the smart pointer classes. Think of it as conservation of inconvenience.

Lazy Expression Evaluation

A final example of lazy evaluation comes from numerical applications. Consider this code:
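
    template<class T>                 // class template for 2D matrices
    class Matrix { ... };

    Matrix<int> m1(1000, 1000);       // a 1000 by 1000 matrix
    Matrix<int> m2(1000, 1000);       // ditto

    ...

    Matrix<int> m3 = m1 + m2;         // add m1 and m2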

The usual implementation of operator+ would use eager evaluation; in this case it would compute and return the sum of m1 and m2. That's a fair amount of computation (1,000,000 additions), and of course there's the cost of allocating the memory to hold all those values, too.

The lazy evaluation strategy says that's way too much work, so it doesn't do it. Instead, it sets up a data structure inside m3 that indicates that m3's value is the sum of m1 and m2. Such a data structure might consist of nothing more than a pointer to each of m1 and m2, plus an enum indicating that the operation on them is addition. Clearly, it's going to be faster to set up this data structure than to add m1 and m2, and it's going to use a lot less memory, too.
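
In rough outline, the bookkeeping inside each Matrix might look something like this (the names here are purely illustrative):

    enum MatrixOp { NONE, ADD, MULTIPLY };   // operations whose evaluation
                                             // can be deferred

    template<class T>
    class Matrix {
      ...
    private:
      MatrixOp pendingOp;             // the operation not yet performed
      const Matrix<T> *lhs;           // the operands that operation
      const Matrix<T> *rhs;           // would apply to
      ...                             // plus storage for any values that
    };                                // have actually been computed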

Suppose that later in the program, before m3 has been used, this code is executed:
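
    Matrix<int> m4(1000, 1000);

    ...                               // give m4 a value; m3 is still unused

    m3 = m4 * m1;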

Now we can forget all about m3 being the sum of m1 and m2 (and thereby save the cost of the computation), and in its place we can start remembering that m3 is the product of m4 and m1. Needless to say, we don't perform the multiplication. Why bother? We're lazy, remember?

This example looks contrived, because no good programmer would write a program that computed the sum of two matrices and failed to use it, but it's not as contrived as it seems. No good programmer would deliberately compute a value that's not needed, but during maintenance, it's not uncommon for a programmer to modify the paths through a program in such a way that a formerly useful computation becomes unnecessary. The likelihood of that happening is reduced by defining objects immediately prior to use (see Item E32), but it's still a problem that occurs from time to time.

Nevertheless, if that were the only time lazy evaluation paid off, it would hardly be worth the trouble. A more common scenario is that we need only part of a computation. For example, suppose we use m3 as follows after initializing it to the sum of m1 and m2:
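
    cout << m3[4];                    // print out the fourth row of m3
                                      // (assuming operator[] yields a row)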

Clearly we can be completely lazy no longer — we've got to compute the values in the fourth row of m3. But let's not be overly ambitious, either. There's no reason we have to compute any more than the fourth row of m3; the remainder of m3 can remain uncomputed until it's actually needed. With luck, it never will be.

How likely are we to be lucky? Experience in the domain of matrix computations suggests the odds are in our favor. In fact, lazy evaluation lies behind the wonder that is APL. APL was developed in the 1960s for interactive use by people who needed to perform matrix-based calculations. Running on computers that had less computational horsepower than the chips now found in high-end microwave ovens, APL was seemingly able to add, multiply, and even divide large matrices instantly! Its trick was lazy evaluation. The trick was usually effective, because APL users typically added, multiplied, or divided matrices not because they needed the entire resulting matrix, but only because they needed a small part of it. APL employed lazy evaluation to defer its computations until it knew exactly what part of a result matrix was needed, then it computed only that part. In practice, this allowed users to perform computationally intensive tasks interactively in an environment where the underlying machine was hopelessly inadequate for an implementation employing eager evaluation. Machines are faster today, but data sets are bigger and users less patient, so many contemporary matrix libraries continue to take advantage of lazy evaluation.

To be fair, laziness sometimes fails to pay off. If m3 is used in this way,
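
    cout << m3;                       // print out all of m3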

the jig is up and we've got to compute a complete value for m3. Similarly, if one of the matrices on which m3 is dependent is about to be modified, we have to take immediate action:
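
    m3 = m1 + m2;                     // remember that m3 is the
                                      // sum of m1 and m2

    m1 = m4;                          // now m3 is the sum of m2
                                      // and the OLD value of m1!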

Here we've got to do something to ensure that the assignment to m1 doesn't change m3. Inside the Matrix<int> assignment operator, we might compute m3's value prior to changing m1 or we might make a copy of the old value of m1 and make m3 dependent on that, but we have to do something to guarantee that m3 has the value it's supposed to have after m1 has been the target of an assignment. Other functions that might modify a matrix must be handled in a similar fashion.

Because of the need to store dependencies between values; to maintain data structures that can store values, dependencies, or a combination of the two; and to overload operators like assignment, copying, and addition, lazy evaluation in a numerical domain is a lot of work. On the other hand, it often ends up saving significant amounts of time and space during program runs, and in many applications, that's a payoff that easily justifies the significant effort lazy evaluation requires.

Summary

These four examples show that lazy evaluation can be useful in a variety of domains: to avoid unnecessary copying of objects, to distinguish reads from writes using operator[], to avoid unnecessary reads from databases, and to avoid unnecessary numerical computations. Nevertheless, it's not always a good idea. Just as procrastinating on your clean-up chores won't save you any work if your parents always check up on you, lazy evaluation won't save your program any work if all your computations are necessary. Indeed, if all your computations are essential, lazy evaluation may slow you down and increase your use of memory, because, in addition to having to do all the computations you were hoping to avoid, you'll also have to manipulate the fancy data structures needed to make lazy evaluation possible in the first place. Lazy evaluation is only useful when there's a reasonable chance your software will be asked to perform computations that can be avoided.

There's nothing about lazy evaluation that's specific to C++. The technique can be applied in any programming language, and several languages — notably APL, some dialects of Lisp, and virtually all dataflow languages — embrace the idea as a fundamental part of the language. Mainstream programming languages employ eager evaluation, however, and C++ is mainstream. Yet C++ is particularly suitable as a vehicle for user-implemented lazy evaluation, because its support for encapsulation makes it possible to add lazy evaluation to a class without clients of that class knowing it's been done.

Look again at the code fragments used in the above examples, and you can verify that the class interfaces offer no hints about whether eager or lazy evaluation is used by the classes. That means it's possible to implement a class using a straightforward eager evaluation strategy, but then, if your profiling investigations (see Item 16) show that class's implementation is a performance bottleneck, you can replace its implementation with one based on lazy evaluation. (See also Item E34.) The only change your clients will see (after recompilation or relinking) is improved performance. That's the kind of software enhancement clients love, one that can make you downright proud to be lazy.
