Item 17: Check for assignment to self in operator=.

An assignment to self occurs when you do something like this:
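
A minimal sketch, assuming a hypothetical class X with a single int member (the helper function assignToSelf is also just for illustration):

```cpp
// Hypothetical class used only to illustrate the syntax.
class X {
public:
    int value = 0;
};

// Assigns an object to itself and reports the value it holds afterwards.
inline int assignToSelf() {
    X a;
    a.value = 7;
    a = a;            // a is assigned to itself
    return a.value;
}
```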

This looks like a silly thing to do, but it's perfectly legal, so don't doubt for a moment that programmers do it. More importantly, assignment to self can appear in this more benign-looking form:
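
A sketch of the disguised form, again assuming a hypothetical class X; here b is declared as a reference to a, which is one of several ways the two names could come to denote the same object:

```cpp
// Hypothetical class used only to illustrate the syntax.
class X {
public:
    int value = 0;
};

// Returns true if the two names used in the assignment denote one object.
inline bool aliasAssign() {
    X a;
    X& b = a;         // b is another name for a
    a = b;            // looks innocent, but is an assignment to self
    return &a == &b;
}
```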

If b is another name for a (for example, a reference that has been initialized to a), then this is also an assignment to self, though it doesn't outwardly look like it. This is an example of aliasing: having two or more names for the same underlying object. As you'll see at the end of this Item, aliasing can crop up in any number of nefarious disguises, so you need to take it into account any time you write a function.

Two good reasons exist for taking special care to cope with possible aliasing in assignment operator(s). The lesser of them is efficiency. If you can detect an assignment to self at the top of your assignment operator(s), you can return right away, possibly saving a lot of work that you'd otherwise have to go through to implement assignment. For example, Item 16 points out that a proper assignment operator in a derived class must call an assignment operator for each of its base classes, and those classes might themselves be derived classes, so skipping the body of an assignment operator in a derived class might save a large number of other function calls.

A more important reason for checking for assignment to self is to ensure correctness. Remember that an assignment operator must typically free the resources allocated to an object (i.e., get rid of its old value) before it can allocate the new resources corresponding to its new value. When assigning to self, this freeing of resources can be disastrous, because the old resources might be needed during the process of allocating the new ones.

Consider assignment of String objects, where the assignment operator fails to check for assignment to self:
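
A minimal sketch of such a class, assuming String owns a heap-allocated, null-terminated char array through a pointer named data. The constructors, destructor, and c_str accessor are assumptions added so the example is complete:

```cpp
#include <cstring>

// Hypothetical String that owns its characters through a raw pointer.
class String {
public:
    String(const char* value) : data(new char[std::strlen(value) + 1]) {
        std::strcpy(data, value);
    }
    String(const String& rhs) : data(new char[std::strlen(rhs.data) + 1]) {
        std::strcpy(data, rhs.data);
    }
    ~String() { delete[] data; }

    // Omits the check for assignment to self.
    String& operator=(const String& rhs) {
        delete[] data;                                // free the old value
        data = new char[std::strlen(rhs.data) + 1];   // undefined if rhs is *this
        std::strcpy(data, rhs.data);
        return *this;
    }

    const char* c_str() const { return data; }       // assumed accessor

private:
    char* data;
};
```

This operator behaves correctly only when the two operands are distinct objects.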

Consider now what happens in this case:
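
The case is a String assigned to itself. Actually running that against an operator= with no self-check is undefined, so the sketch below uses a hypothetical Probe class instead: its operator= frees nothing and merely records whether this and &rhs coincide.

```cpp
// Hypothetical class whose operator= only observes its two operands.
class Probe {
public:
    bool aliased = false;   // set when *this and rhs turn out to be one object

    Probe& operator=(const Probe& rhs) {
        aliased = (this == &rhs);   // same address means same object
        return *this;
    }
};

// Performs a self-assignment and reports what operator= saw.
inline bool selfAssignmentIsAliased() {
    Probe a;
    a = a;                  // inside operator=, *this and rhs both name a
    return a.aliased;
}
```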

Inside the assignment operator, *this and rhs seem to be different objects, but in this case they happen to be different names for the same object. That means this->data and rhs.data are one and the same pointer, pointing to one and the same block of characters.

The first thing the assignment operator does is use delete on data, which leaves the object's only pointer to its characters dangling.

Now when the assignment operator tries to do a strlen on rhs.data, the results are undefined. This is because rhs.data was deleted when data was deleted, which happened because data, this->data, and rhs.data are all the same pointer! From this point on, things can only get worse.

By now you know that the solution to the dilemma is to check for an assignment to self and to return immediately if such an assignment is detected. Unfortunately, it's easier to talk about such a check than it is to write it, because you are immediately forced to figure out what it means for two objects to be "the same."

The problem you confront is known as object identity, and it's a well-known topic in object-oriented circles. This book is no place for a discourse on object identity, but it is worthwhile to mention the two basic approaches to the problem.

One approach is to say that two objects are the same (have the same identity) if they have the same value. For example, two String objects would be the same if they represented the same sequence of characters:
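
A small sketch of the idea, using std::string as a stand-in for the String class (an assumption made to keep the example short); the function name sameValueDemo is hypothetical:

```cpp
#include <string>

// Returns true if a and c compare equal by value, b differs from both,
// and a and c are nevertheless distinct objects in memory.
inline bool sameValueDemo() {
    std::string a = "Hello";
    std::string b = "World";
    std::string c = "Hello";
    return a == c && a != b && c != b && &a != &c;
}
```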

Here a and c have the same value, so they are considered identical; b is different from both of them. If you wanted to use this definition of identity in your String class, your assignment operator might look like this:
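
A minimal sketch, assuming String keeps its characters in a null-terminated array named data so that std::strcmp can compare values. The constructors, destructor, and c_str accessor are assumptions added to make the example complete:

```cpp
#include <cstring>

// Hypothetical String whose operator= treats equal values as identical.
class String {
public:
    String(const char* value) : data(new char[std::strlen(value) + 1]) {
        std::strcpy(data, value);
    }
    String(const String& rhs) : data(new char[std::strlen(rhs.data) + 1]) {
        std::strcpy(data, rhs.data);
    }
    ~String() { delete[] data; }

    String& operator=(const String& rhs) {
        // Same sequence of characters: nothing to do, and nothing is freed.
        if (std::strcmp(data, rhs.data) == 0) return *this;

        delete[] data;
        data = new char[std::strlen(rhs.data) + 1];
        std::strcpy(data, rhs.data);
        return *this;
    }

    const char* c_str() const { return data; }   // assumed accessor

private:
    char* data;
};
```

Because an object always has the same value as itself, the early return also covers assignment to self, at the cost of a character-by-character comparison on every assignment.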

Value equality is usually determined by operator==, so the general form for an assignment operator for a class C that uses value equality for object identity is this:
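
A sketch of that general form, assuming a class C whose value is a single int; the copies counter is a hypothetical addition that makes it visible when the body of the operator actually runs:

```cpp
// Hypothetical class C that uses value equality for object identity.
class C {
public:
    explicit C(int v) : value(v) {}
    C(const C&) = default;

    bool operator==(const C& rhs) const { return value == rhs.value; }

    C& operator=(const C& rhs) {
        if (*this == rhs) return *this;   // compares objects, not pointers
        value = rhs.value;
        ++copies;
        return *this;
    }

    int value;
    int copies = 0;   // counts assignments that did real work (illustration only)
};
```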

Note that this function is comparing objects (via operator==), not pointers. Using value equality to determine identity, it doesn't matter whether two objects occupy the same memory; all that matters is the values they represent.

The other possibility is to equate an object's identity with its address in memory. Using this definition of object equality, two objects are the same if and only if they have the same address. This definition is more common in C++ programs, probably because it's easy to implement and the computation is fast, neither of which is always true when object identity is based on values. Using address equality, a general assignment operator looks like this:
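
A sketch of the address-based form, assuming the same kind of hypothetical class C with an int value and a copies counter added purely for illustration:

```cpp
// Hypothetical class C that uses address equality for object identity.
class C {
public:
    explicit C(int v) : value(v) {}
    C(const C&) = default;

    C& operator=(const C& rhs) {
        if (this == &rhs) return *this;   // same address: assignment to self
        value = rhs.value;
        ++copies;
        return *this;
    }

    int value;
    int copies = 0;   // counts assignments that did real work (illustration only)
};
```

Under this definition, two distinct objects that happen to hold equal values are still assigned; only a true assignment to self is skipped.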

This suffices for a great many programs.

If you need a more sophisticated mechanism for determining whether two objects are the same, you'll have to implement it yourself. The most common approach is based on a member function that returns some kind of object identifier:
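
One way this might look, as a sketch only: ObjectID is assumed here to wrap a number handed out by a simple counter, and every newly constructed C (including a copy) gets a fresh one. The counter is not thread-safe, and a real design might derive identifiers quite differently.

```cpp
// Hypothetical identifier type; equality is defined by the author of the type.
struct ObjectID {
    unsigned long id;
};

inline bool operator==(const ObjectID& lhs, const ObjectID& rhs) {
    return lhs.id == rhs.id;
}

// Hypothetical class whose objects carry an identifier.
class C {
public:
    explicit C(int v = 0) : value(v), id_{nextId()} {}
    C(const C& rhs) : value(rhs.value), id_{nextId()} {}   // a copy is a new object

    C& operator=(const C& rhs) {
        if (identity() == rhs.identity()) return *this;    // same object
        value = rhs.value;                                 // identity stays unchanged
        return *this;
    }

    ObjectID identity() const { return id_; }

    int value;

private:
    // Hands out 1, 2, 3, ... in construction order.
    static unsigned long nextId() {
        static unsigned long counter = 0;
        return ++counter;
    }

    ObjectID id_;
};
```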

Given object pointers a and b, then, the objects they point to are identical if and only if a->identity() == b->identity(). Of course, you are responsible for writing operator== for ObjectIDs.

The problems of aliasing and object identity are hardly confined to operator=. That's just a function in which you are particularly likely to run into them. In the presence of references and pointers, any two names for objects of compatible types may in fact refer to the same object. Here are some other situations in which aliasing can show its Medusa-like visage:
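
A sketch of a few such situations, assuming a hypothetical Base and Derived pair. The function names are illustrative, and each body does nothing except report whether its operands turned out to be the same object:

```cpp
class Base {
public:
    virtual ~Base() = default;

    // rb and *this could be the same object
    bool mf1(Base& rb) { return this == &rb; }
};

// rb1 and rb2 could be the same object
inline bool f1(Base& rb1, Base& rb2) { return &rb1 == &rb2; }

class Derived : public Base {
public:
    // rb and *this could be the same object
    bool mf2(Base& rb) { return static_cast<Base*>(this) == &rb; }
};

// rd and rb could be the same object, despite their different static types
inline bool f2(Derived& rd, Base& rb) { return static_cast<Base*>(&rd) == &rb; }
```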

These examples happen to use references, but pointers would serve just as well.

As you can see, aliasing can crop up in a variety of guises, so you can't just forget about it and hope you'll never run into it. Well, maybe you can, but most of us can't. At the expense of mixing my metaphors, this is a clear case in which an ounce of prevention is worth its weight in gold. Anytime you write a function in which aliasing could conceivably be present, you must take that possibility into account when you write the code.
