Item 23: Don't try to return a reference when you must return an object.
It is said that Albert Einstein once offered this advice: make things as simple as possible, but no simpler. The C++ analogue might well be to make things as efficient as possible, but no more efficient.
Once programmers grasp the efficiency implications of pass-by-value for objects (see Item 22), they become crusaders, determined to root out the evil of pass-by-value wherever it may hide. Unrelenting in their pursuit of pass-by-reference purity, they invariably make a fatal mistake: they start to pass references to objects that don't exist. This is not a good thing.
Consider a class for representing rational numbers, including a friend function (see Item 19) for multiplying two rationals:
class Rational {
public:
  Rational(int numerator = 0, int denominator = 1);
  ...
private:
  int n, d;                              // numerator and denominator

  friend
    const Rational                       // see Item 21 for why
      operator*(const Rational& lhs,     // the return value is
                const Rational& rhs);    // const
};

inline const Rational operator*(const Rational& lhs,
                                const Rational& rhs)
{
  return Rational(lhs.n * rhs.n, lhs.d * rhs.d);
}
Clearly, this version of operator* is returning its result object by value, and you'd be shirking your professional duties if you failed to worry about the cost of that object's construction and destruction. Another thing that's clear is that you're cheap and you don't want to pay for such a temporary object (see Item M19) if you don't have to. So the question is this: do you have to pay?
Well, you don't have to if you can return a reference instead. But remember that a reference is just a name, a name for some existing object. Whenever you see the declaration for a reference, you should immediately ask yourself what it is another name for, because it must be another name for something (see Item M1). In the case of operator*, if the function is to return a reference, it must return a reference to some other Rational object that already exists and that contains the product of the two objects that are to be multiplied.
There is certainly no reason to expect that such an object exists prior to the call to operator*. That is, if you have
Rational a(1, 2);          // a = 1/2
Rational b(3, 5);          // b = 3/5
Rational c = a * b;        // c should be 3/10
it seems unreasonable to expect that there already exists a rational number with the value three-tenths. No, if operator* is to return a reference to such a number, it must create that number object itself.
A function can create a new object in only two ways: on the stack or on the heap. Creation on the stack is accomplished by defining a local variable. Using that strategy, you might try to write your operator* as follows:
// the first wrong way to write this function
inline const Rational& operator*(const Rational& lhs,
                                 const Rational& rhs)
{
  Rational result(lhs.n * rhs.n, lhs.d * rhs.d);
  return result;
}
You can reject this approach out of hand, because your goal was to avoid a constructor call, and result will have to be constructed just like any other object. In addition, this function has a more serious problem in that it returns a reference to a local object, an error that is discussed in depth in Item 31.
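To see why a reference to result would be useless to the caller, it helps to watch object lifetimes. The following is a minimal sketch, not the Item's own code: TrackedRational, liveCount, liveInsideMultiply, and demoLifetime are hypothetical names invented for illustration. The function deliberately returns a count and never a reference to its local, because using a dangling reference is undefined behavior.

```cpp
#include <cassert>

// Hypothetical stand-in for Rational that counts how many instances are alive.
struct TrackedRational {
    static int liveCount;
    int n, d;
    TrackedRational(int numerator = 0, int denominator = 1)
        : n(numerator), d(denominator) { ++liveCount; }
    TrackedRational(const TrackedRational& other)
        : n(other.n), d(other.d) { ++liveCount; }
    ~TrackedRational() { --liveCount; }
};
int TrackedRational::liveCount = 0;

// Builds the product in a local variable, as the first wrong operator* does,
// and reports how many TrackedRational objects are alive at that moment.
int liveInsideMultiply(const TrackedRational& lhs, const TrackedRational& rhs) {
    TrackedRational result(lhs.n * rhs.n, lhs.d * rhs.d);
    return TrackedRational::liveCount;  // the caller's objects plus result
}   // result is destroyed here, before the caller regains control

// Usage sketch: one extra object exists during the call, none afterwards.
void demoLifetime() {
    TrackedRational a(1, 2), b(3, 5);
    const int before = TrackedRational::liveCount;
    assert(liveInsideMultiply(a, b) == before + 1);
    assert(TrackedRational::liveCount == before);  // result is already gone
}
```

By the time control returns to the caller, the product no longer exists, so a reference to it would be a name for nothing.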
That leaves you with the possibility of constructing an object on the heap and then returning a reference to it. Heap-based objects come into being through the use of new. This is how you might write operator* in that case:
// the second wrong way to write this function
inline const Rational& operator*(const Rational& lhs,
                                 const Rational& rhs)
{
  Rational *result = new Rational(lhs.n * rhs.n, lhs.d * rhs.d);
  return *result;
}
Well, you still have to pay for a constructor call, because the memory allocated by new is initialized by calling an appropriate constructor (see Items 5 and M8), but now you have a different problem: who will apply delete to the object that was conjured up by your use of new?
In fact, this is a guaranteed memory leak. Even if callers of operator* could be persuaded to take the address of the function's result and use delete on it (astronomically unlikely; Item 31 shows what the code would have to look like), complicated expressions would yield unnamed temporaries that programmers would never be able to get at. For example, in
Rational w, x, y, z;
w = x * y * z;
both calls to operator* yield unnamed temporaries that the programmer never sees, hence can never delete. (Again, see Item 31.)
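The leak can be made visible with a little bookkeeping. The sketch below is an illustration under stated assumptions: HeapRational, allocatedByMultiply, objectsLeftBehindByChain, and releaseAll are hypothetical names, and the list of allocated pointers exists only so the demonstration can count the stranded objects and clean up after itself. Real client code would have no such list, which is exactly the problem.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Rational, with public members for brevity.
struct HeapRational {
    int n, d;
    HeapRational(int numerator = 0, int denominator = 1)
        : n(numerator), d(denominator) {}
};

// Demo-only bookkeeping: remembers every object operator* allocates.
std::vector<const HeapRational*>& allocatedByMultiply() {
    static std::vector<const HeapRational*> ptrs;
    return ptrs;
}

// The second wrong way: allocate the product with new, return a reference.
const HeapRational& operator*(const HeapRational& lhs,
                              const HeapRational& rhs) {
    HeapRational* result = new HeapRational(lhs.n * rhs.n, lhs.d * rhs.d);
    allocatedByMultiply().push_back(result);
    return *result;
}

// Evaluates w = x * y * z and returns how many heap objects that single
// statement created without giving the caller any way to name them.
std::size_t objectsLeftBehindByChain() {
    HeapRational w, x(1, 2), y(3, 5), z(2, 3);
    const std::size_t before = allocatedByMultiply().size();
    w = x * y * z;                    // two calls to operator*, two uses of new
    assert(w.n == 6 && w.d == 30);    // w holds a copy; the originals linger
    return allocatedByMultiply().size() - before;
}

// Demo-only cleanup through the bookkeeping list.
void releaseAll() {
    for (const HeapRational* p : allocatedByMultiply()) delete p;
    allocatedByMultiply().clear();
}
```

Each evaluation of the chained expression strands two objects: the intermediate x * y and the final product that was copied into w.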
But perhaps you think you're smarter than the average bear, or the average programmer. Perhaps you notice that both the on-the-stack and the on-the-heap approaches suffer from having to call a constructor for each result returned from operator*. Perhaps you recall that our initial goal was to avoid such constructor invocations. Perhaps you think you know of a way to avoid all but one constructor call. Perhaps the following implementation occurs to you, an implementation based on operator* returning a reference to a static Rational object, one defined inside the function:
// the third wrong way to write this function
inline const Rational& operator*(const Rational& lhs,
                                 const Rational& rhs)
{
  static Rational result;      // static object to which a
                               // reference will be returned

  somehow multiply lhs and rhs and put the resulting value inside result;

  return result;
}
This looks promising, though when you try to compose real C++ for the pseudocode above, you'll find that it's all but impossible to give result the correct value without invoking a Rational constructor, and avoiding such a call is the whole reason for this game. Let us posit that you manage to find a way, however, because no amount of cleverness can ultimately save this star-crossed design.
To see why, consider this perfectly reasonable client code:
bool operator==(const Rational& lhs,      // an operator==
                const Rational& rhs);     // for Rationals

Rational a, b, c, d;

...

if ((a * b) == (c * d)) {
  do whatever's appropriate when the products are equal;
} else {
  do whatever's appropriate when they're not;
}
Now ponder this: the expression ((a*b) == (c*d)) will always evaluate to true, regardless of the values of a, b, c, and d!
It's easiest to understand this vexing behavior by rewriting the test for equality in its equivalent functional form:
if (operator==(operator*(a, b), operator*(c, d)))
Notice that when operator== is called, there will already be two active calls to operator*, each of which will return a reference to the static Rational object inside operator*. Thus, operator== will be asked to compare the value of the static Rational object inside operator* with the value of the static Rational object inside operator*. It would be surprising indeed if they did not compare equal.
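The effect is easy to reproduce. What follows is a sketch under assumptions rather than the Item's own code: StaticRational and demoAlwaysEqual are hypothetical names, the members are public, and the member-by-member assignment is just one way to stand in for the pseudocode line. The operator== shown is a plain memberwise comparison invented for the demonstration.

```cpp
#include <cassert>

// Hypothetical stand-in for Rational, with public members for brevity.
struct StaticRational {
    int n, d;
    StaticRational(int numerator = 0, int denominator = 1)
        : n(numerator), d(denominator) {}
};

// The third wrong way: every call overwrites, and returns a reference to,
// the same function-local static object.
const StaticRational& operator*(const StaticRational& lhs,
                                const StaticRational& rhs) {
    static StaticRational result;
    result.n = lhs.n * rhs.n;
    result.d = lhs.d * rhs.d;
    return result;
}

// Hypothetical memberwise comparison (no reduction to lowest terms).
bool operator==(const StaticRational& lhs, const StaticRational& rhs) {
    return lhs.n == rhs.n && lhs.d == rhs.d;
}

// Usage sketch: 1/2 * 1/3 and 2/1 * 5/1 are clearly different values.
bool demoAlwaysEqual() {
    StaticRational a(1, 2), b(1, 3), c(2, 1), d(5, 1);
    assert(&(a * b) == &(c * d));   // both calls name the same static object
    return (a * b) == (c * d);      // true: the object is compared to itself
}
```

Whichever product is computed last overwrites the other, so operator== ends up comparing one object with itself.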
With luck, this is enough to convince you that returning a reference from a function like operator* is a waste of time, but I'm not so naive as to believe that luck is always sufficient. Some of you (and you know who you are) are at this very moment thinking, "Well, if one static isn't enough, maybe a static array will do the trick..."
Stop. Please. Haven't we suffered enough?
I can't bring myself to dignify this design with example code, but I can sketch why even entertaining the notion should cause you to blush in shame. First, you must choose n, the size of the array. If n is too small, you may run out of places to store function return values, in which case you'll have gained nothing over the single-static design we just discredited. But if n is too big, you'll decrease the performance of your program, because every object in the array will be constructed the first time the function is called. That will cost you n constructors and n destructors, even if the function in question is called only once. If "optimization" is the process of improving software performance, this kind of thing should be called "pessimization." Finally, think about how you'd put the values you need into the array's objects and what it would cost you to do it. The most direct way to move a value between objects is via assignment, but what is the cost of an assignment? In general, it's about the same as a call to a destructor (to destroy the old value) plus a call to a constructor (to copy over the new value). But your goal is to avoid the costs of construction and destruction! Face it: this approach just isn't going to pan out.
No, the right way to write a function that must return a new object is to have that function return a new object. For Rational's operator*, that means either the following code (which we first saw at the beginning of this Item) or something essentially equivalent:
inline const Rational operator*(const Rational& lhs,
                                const Rational& rhs)
{
  return Rational(lhs.n * rhs.n, lhs.d * rhs.d);
}
Sure, you may incur the cost of constructing and destructing operator*'s return value, but in the long run, that's a small price to pay for correct behavior. Besides, the bill that so terrifies you may never arrive. Like all programming languages, C++ allows compiler implementers to apply certain optimizations to improve the performance of the generated code, and it turns out that in some cases, operator*'s return value can be safely eliminated (see Item M20). When compilers take advantage of that fact (and current compilers often do), your program continues to behave the way it's supposed to, it just does it faster than you expected.
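You can check whether the bill arrives by counting copy-constructor calls. This is a rough sketch with hypothetical names (CountedRational, copiesForOneProduct) and public members, not the Item's own class. One fact from outside this Item is worth stating: under C++17 and later, returning an unnamed temporary this way is required to construct the result directly in the caller's object, whereas earlier standards merely permit that, so an older or specially configured compiler may report a higher count.

```cpp
#include <cassert>

// Hypothetical stand-in for Rational that counts copy-constructor calls.
struct CountedRational {
    static int copies;
    int n, d;
    CountedRational(int numerator = 0, int denominator = 1)
        : n(numerator), d(denominator) {}
    CountedRational(const CountedRational& other)
        : n(other.n), d(other.d) { ++copies; }
};
int CountedRational::copies = 0;

// The right way: return a new object by value.
const CountedRational operator*(const CountedRational& lhs,
                                const CountedRational& rhs) {
    return CountedRational(lhs.n * rhs.n, lhs.d * rhs.d);
}

// Returns how many copy-constructor calls were made while evaluating
// CountedRational c = a * b.
int copiesForOneProduct() {
    CountedRational a(1, 2), b(3, 5);
    const int copiesBefore = CountedRational::copies;
    CountedRational c = a * b;          // c should be 3/10
    assert(c.n == 3 && c.d == 10);
    return CountedRational::copies - copiesBefore;
}
```

When the count is zero, the by-value version has cost one constructor call for the product itself and nothing more, which is no worse than any of the wrong versions above.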
It all boils down to this: when deciding between returning a reference and returning an object, your job is to make the choice that does the right thing. Let your compiler vendors wrestle with figuring out how to make that choice as inexpensive as possible.