Effective C++, 2E | Introduction

Learning the fundamentals of a programming language is one thing; learning how to design and implement effective programs in that language is something else entirely. This is especially true of C++, a language boasting an uncommon range of power and expressiveness. Built atop a full-featured conventional language (C), it also offers a wide range of object-oriented features, as well as support for templates and exceptions.

Properly used, C++ can be a joy to work with. An enormous variety of designs, both object-oriented and conventional, can be expressed directly and implemented efficiently. You can define new data types that are all but indistinguishable from their built-in counterparts, yet are substantially more flexible. A judiciously chosen and carefully crafted set of classes (one that automatically handles memory management, aliasing, initialization and clean-up, type conversions, and all the other conundrums that are the bane of software developers) can make application programming easy, intuitive, efficient, and nearly error-free. It isn't unduly difficult to write effective C++ programs, if you know how to do it.

Used without discipline, C++ can lead to code that is incomprehensible, unmaintainable, inextensible, inefficient, and just plain wrong. The trick is to discover those aspects of C++ that are likely to trip you up and to learn how to avoid them. That is the purpose of this book.

I assume you already know C++ as a language and that you have some experience in its use. What I provide here is a guide to using the language effectively, so that your software is comprehensible, maintainable, extensible, efficient, and likely to behave as you expect.

The advice I proffer falls into two broad categories: general design strategies, and the nuts and bolts of specific language features.
The design discussions concentrate on how to choose between different approaches to accomplishing something in C++. How do you choose between inheritance and templates? Between templates and generic pointers? Between public and private inheritance? Between private inheritance and layering? Between function overloading and parameter defaulting? Between virtual and nonvirtual functions? Between pass-by-value and pass-by-reference? It is important to get these decisions right at the outset, because an incorrect choice may not become apparent until much later in the development process, at which point its rectification is often difficult, time-consuming, demoralizing, and expensive.

Even when you know exactly what you want to do, getting things just right can be tricky. What's the proper return type for the assignment operator? How should operator new behave when it can't find enough memory? When should a destructor be virtual? How should you write a member initialization list? It's crucial to sweat details like these, because failure to do so almost always leads to unexpected, possibly mystifying, program behavior. More importantly, the aberrant behavior may not be immediately apparent, giving rise to the specter of code that passes through quality control while still harboring a variety of undetected bugs: ticking time-bombs just waiting to go off.

This is not a book that must be read cover to cover to make any sense. You need not even read it front to back. The material is broken down into 50 Items, each of which stands more or less on its own. Frequently, however, one Item will refer to others, so one way to read the book is to start with a particular Item of interest and then follow the references to see where they lead you.
The Items are grouped into general topic areas, so if you are interested in discussions related to a particular issue, such as memory management or object-oriented design, you can start with the relevant section and either read straight through or start jumping around from there. You will find, however, that all of the material in this book is pretty fundamental to effective C++ programming, so almost everything is eventually related to everything else in one way or another.

This is not a reference book for C++, nor is it a way for you to learn the language from scratch. For example, I'm eager to tell you all about the gotchas in writing your own operator new (see Items 7-10), but I assume you can go elsewhere to discover that that function must return a void* and its first argument must be of type size_t. There are a number of introductory books on C++ that contain information such as that.

The purpose of this book is to highlight those aspects of C++ programming that are usually treated superficially (if at all). Other books describe the different parts of the language. This book tells you how to combine those parts so you end up with effective programs. Other books tell you how to get your programs to compile. This book tells you how to avoid problems that compilers won't tell you about.

Like most languages, C++ has a rich folklore that is usually passed from programmer to programmer as part of the language's grand oral tradition. This book is my attempt to record some of that accumulated wisdom in a more accessible form.

At the same time, this book limits itself to legitimate, portable C++. Only language features in the ISO/ANSI language standard (see Item M35) have been used here. In this book, portability is a key concern, so if you're looking for implementation-dependent hacks and kludges, this is not the place to find them. Alas, C++ as described by the standard is sometimes different from the C++ supported by your friendly neighborhood compiler vendors.
As a result, when I point out places where relatively new language features are useful, I also show you how to produce effective software in their absence. After all, it would be foolish to labor in ignorance of what the future is sure to bring, but by the same token, you can't just put your life on hold until the latest, greatest, be-all-and-end-all C++ compilers appear on your computer. You've got to work with the tools available to you, and this book helps you do just that.

Notice that I refer to compilers, plural. Different compilers implement varying approximations to the standard, so I encourage you to develop your code under at least two compilers. Doing so will help you avoid inadvertent dependence on one vendor's proprietary language extension or its misinterpretation of the standard. It will also help keep you away from the bleeding edge of compiler technology, i.e., from new features supported by only one vendor. Such features are often poorly implemented (buggy or slow, frequently both), and upon their introduction, the C++ community lacks experience to advise you in their proper application. Blazing trails can be exciting, but when your goal is producing reliable code, it's often best to let others do the bushwhacking for you.

One thing you will not find in this book is the C++ Gospel, the One True Path to perfect C++ software. Each of the 50 Items in this book provides guidance on how to come up with better designs, how to avoid common problems, or how to achieve greater efficiency, but none of the Items is universally applicable. Software design and implementation is a complex task, one invariably colored by the constraints of the hardware, the operating system, and the application, so the best I can do is provide guidelines for creating better programs. If you follow all the guidelines all the time, you are unlikely to fall into the most common traps surrounding C++, but guidelines, by their very nature, have exceptions.
That's why each Item has an explanation. The explanations are the most important part of the book. Only by understanding the rationale behind an Item can you reasonably determine whether it applies to the software you are developing and to the unique constraints under which you toil. The best use of this book, then, is to gain insight into how C++ behaves, why it behaves that way, and how to use its behavior to your advantage. Blind application of the Items in this book is clearly inappropriate, but at the same time, you probably shouldn't violate any of the guidelines without having a good reason for doing so.

There's no point in getting hung up on terminology in a book like this; that form of sport is best left to language lawyers. However, there is a small C++ vocabulary that everybody should understand. The following terms crop up often enough that it is worth making sure we agree on what they mean.

A declaration tells compilers about the name and type of an object, function, class, or template, but it omits certain details. These are declarations:

    extern int x;                         // object declaration

    int numDigits(int number);            // function declaration

    class Clock;                          // class declaration

    template<class T>
    class SmartPointer;                   // template declaration

A definition, on the other hand, provides compilers with the details. For an object, the definition is where compilers allocate memory for the object. For a function or a function template, the definition provides the code body. For a class or a class template, the definition lists the members of the class or template:

    int x;                                // object definition

    int numDigits(int number)             // function definition
    {                                     // (this function returns
      int digitsSoFar = 1;                // the number of digits in
                                          // its parameter)
      if (number < 0) {
        number = -number;
        ++digitsSoFar;
      }

      while (number /= 10) ++digitsSoFar;

      return digitsSoFar;
    }

    class Clock {                         // class definition
    public:
      Clock();
      ~Clock();

      int hour() const;
      int minute() const;
      int second() const;
      ...
    };

    template<class T>
    class SmartPointer {                  // template definition
    public:
      SmartPointer(T *p = 0);
      ~SmartPointer();

      T * operator->() const;
      T& operator*() const;
      ...
    };

That brings us to constructors. A default constructor is one that can be called without any arguments. Such a constructor either has no parameters or has a default value for every parameter. You generally need a default constructor if you want to define arrays of objects:

    class A {
    public:
      A();                                // default constructor
    };

    A arrayA[10];                         // 10 constructors called

    class B {
    public:
      B(int x = 0);                       // default constructor
    };

    B arrayB[10];                         // 10 constructors called,
                                          // each with an arg of 0

    class C {
    public:
      C(int x);                           // not a default constructor
    };

    C arrayC[10];                         // error!

You may find that your compilers reject arrays of objects when a class's default constructor has default parameter values. For example, some compilers refuse to accept the definition of arrayB above, even though it receives the blessing of the C++ standard. This is an example of the kind of discrepancy that can exist between the standard's description of C++ and a particular compiler's implementation of the language. Every compiler I know of has a few of these shortcomings. Until compiler vendors catch up to the standard, be prepared to be flexible, and take solace in the certainty that someday in the not-too-distant future, the C++ described in the standard will be the same as the language accepted by C++ compilers.

Incidentally, if you want to create an array of objects for which there is no default constructor, the usual ploy is to define an array of pointers instead. Then you can initialize each pointer separately by using new:

    C *ptrArray[10];                      // no constructors called

    ptrArray[0] = new C(22);              // allocate and construct
                                          // 1 C object
    ptrArray[1] = new C(4);               // ditto
    ...

This suffices almost all the time. When it doesn't, you'll probably have to fall back on the more advanced (and hence more obscure) "placement new" approach described in Item M4.
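The pointer-array ploy above leaves cleanup to the reader. The following is a minimal sketch of the whole life cycle, not code from this book: the liveCount counter, the get() accessor, and the function name buildSumAndRelease are hypothetical additions made only so that the effect of each new and delete can be checked.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for class C above: it has no default constructor.
// liveCount and get() are hypothetical additions for this sketch.
class C {
public:
  C(int x) : value(x) { ++liveCount; }
  ~C() { --liveCount; }
  int get() const { return value; }
  static int liveCount;               // number of C objects currently alive
private:
  int value;
};

int C::liveCount = 0;

// Defines an array of pointers, constructs each object separately
// with new, sums the stored values, and then deletes every object.
int buildSumAndRelease()
{
  const std::size_t count = 3;
  C *ptrArray[count];                 // no constructors called

  ptrArray[0] = new C(22);            // allocate and construct 1 C object
  ptrArray[1] = new C(4);             // ditto
  ptrArray[2] = new C(-1);            // ditto
  assert(C::liveCount == 3);

  int sum = 0;
  for (std::size_t i = 0; i < count; ++i) sum += ptrArray[i]->get();

  for (std::size_t i = 0; i < count; ++i) {
    delete ptrArray[i];               // every new needs a matching delete
    ptrArray[i] = 0;
  }
  assert(C::liveCount == 0);          // nothing leaked

  return sum;
}
```

Calling buildSumAndRelease() constructs three objects, none of them through a default constructor, and destroys all three before returning. The price of the ploy is that each element must be deleted individually; forgetting to do so leaks the objects.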
Back on the terminology front, a copy constructor is used to initialize an object with a different object of the same type:

    class String {
    public:
      String();                           // default constructor
      String(const String& rhs);          // copy constructor
      ...

    private:
      char *data;
    };

    String s1;                            // call default constructor
    String s2(s1);                        // call copy constructor
    String s3 = s2;                       // call copy constructor

Probably the most important use of the copy constructor is to define what it means to pass and return objects by value. As an example, consider the following (inefficient) way of writing a function to concatenate two String objects:

    const String operator+(String s1, String s2)
    {
      String temp;

      delete [] temp.data;

      temp.data =
        new char[strlen(s1.data) + strlen(s2.data) + 1];

      strcpy(temp.data, s1.data);
      strcat(temp.data, s2.data);

      return temp;
    }

    String a("Hello");
    String b(" world");

    String c = a + b;                     // c = String("Hello world")

This operator+ takes two String objects as parameters and returns one String object as a result. Both the parameters and the result will be passed by value, so there will be one copy constructor called to initialize s1 with a, one to initialize s2 with b, and one to initialize c with temp. In fact, there might even be some additional calls to the copy constructor if a compiler decides to generate intermediate temporary objects, which it is allowed to do (see Item M19). The important point here is that pass-by-value means "call the copy constructor."

By the way, you wouldn't really implement operator+ for Strings like this. Returning a const String object is correct (see Items 21 and 23), but you would want to pass the two parameters by reference (see Item 22). Actually, you wouldn't write operator+ for Strings at all if you could help it, and you should be able to help it almost all the time.
That's because the standard C++ library (see Item 49) contains a string type (cunningly named string), as well as an operator+ for string objects that does almost exactly what the operator+ above does.

In this book, I use both String and string objects, but I use them in different ways. (Note that the former name is capitalized, the latter name is not.) If I need just a generic string and I don't care how it's implemented, I use the string type that is part of the standard C++ library. That's what you should do, too. Often, however, I want to make a point about how C++ behaves, and in those cases, I need to show some implementation code. That's when I use the (nonstandard) String class. As a programmer, you should use the standard string type whenever you need a string object; the days of developing your own string class as a C++ rite of passage are behind us. However, you still need to understand the issues that go into the development of classes like string. String is convenient for that purpose (and for that purpose only). As for raw char*-based strings, you shouldn't use those antique throw-backs unless you have a very good reason. Well-implemented string types can now be superior to char*s in virtually every way, including efficiency (see Item 49 and Items M29-M30).

The next two terms we need to grapple with are initialization and assignment. An object's initialization occurs when it is given a value for the very first time. For objects of classes or structs with constructors, initialization is always accomplished by calling a constructor.
This is quite different from object assignment, which occurs when an object that is already initialized is given a new value:

    string s1;                            // initialization
    string s2("Hello");                   // initialization
    string s3 = s2;                       // initialization

    s1 = s3;                              // assignment

From a purely operational point of view, the difference between initialization and assignment is that the former is performed by a constructor while the latter is performed by operator=. In other words, the two processes correspond to different function calls.

The reason for the distinction is that the two kinds of functions must worry about different things. Constructors usually have to check their arguments for validity, whereas most assignment operators can take it for granted that their argument is legitimate (because it has already been constructed). On the other hand, the target of an assignment, unlike an object undergoing construction, may already have resources allocated to it. These resources typically must be released before the new resources can be assigned. Frequently, one of these resources is memory. Before an assignment operator can allocate memory for a new value, it must first deallocate the memory that was allocated for the old value.
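One way to see that initialization and assignment really are different function calls is to log them. The sketch below is an illustration under stated assumptions rather than code from this book: the Traced class, the callLog string, and the helper functions are all hypothetical names invented for the demonstration. It also shows the earlier point that passing by value calls the copy constructor, while passing by reference does not.

```cpp
#include <cassert>
#include <string>

// Hypothetical log: each special member function of Traced appends
// one letter, so the sequence of calls can be inspected afterwards.
std::string callLog;

class Traced {
public:
  Traced()                         { callLog += 'D'; }  // default constructor
  Traced(const Traced&)            { callLog += 'C'; }  // copy constructor
  Traced& operator=(const Traced&) { callLog += 'A'; return *this; }  // assignment
};

void takeByValue(Traced) {}             // parameter is copy-constructed
void takeByReference(const Traced&) {}  // no new object is created

// Runs a fixed sequence of initializations and assignments and
// returns the letters logged along the way.
std::string traceCalls()
{
  callLog.clear();

  Traced t1;              // initialization by default constructor: 'D'
  Traced t2(t1);          // initialization by copy constructor:    'C'
  Traced t3 = t2;         // still initialization, despite the '=': 'C'

  t1 = t3;                // assignment via operator=:              'A'

  takeByValue(t1);        // pass-by-value calls copy constructor:  'C'
  takeByReference(t1);    // pass-by-reference logs nothing

  return callLog;
}

// Self-check of the expected call sequence.
void checkTrace() { assert(traceCalls() == "DCCAC"); }
```

The returned log reads "DCCAC": three of the five calls are copy-constructor calls, and only the statement that gives an already-initialized object a new value reaches operator=.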
Here is how a String constructor and assignment operator could be implemented:

    // a possible String constructor
    String::String(const char *value)
    {
      if (value) {                        // if value ptr isn't null
        data = new char[strlen(value) + 1];
        strcpy(data, value);
      }
      else {                              // handle null value ptr
                                          // (see footnote 3)
        data = new char[1];
        *data = '\0';                     // add trailing null char
      }
    }

    // a possible String assignment operator
    String& String::operator=(const String& rhs)
    {
      if (this == &rhs)
        return *this;                     // see Item 17

      delete [] data;                     // delete old memory

      data =                              // allocate new memory
        new char[strlen(rhs.data) + 1];

      strcpy(data, rhs.data);

      return *this;                       // see Item 15
    }

Notice how the constructor must check its parameter for validity and how it must take pains to ensure that the member data is properly initialized, i.e., points to a char* that is properly null-terminated. On the other hand, the assignment operator takes it for granted that its parameter is legitimate. Instead, it concentrates on detecting pathological conditions, such as assignment to itself (see Item 17), and on deallocating old memory before allocating new memory. The differences between these two functions typify the differences between object initialization and object assignment.

By the way, if the "[]" notation in the use of delete is new to you (pardon the pun), Items 5 and M8 should dispel any confusion you may have.

A final term that warrants discussion is client. A client is a programmer, one who uses the code you write. When I talk about clients in this book, I am referring to people looking at your code, trying to figure out what it does; to people reading your class definitions, attempting to determine whether they want to inherit from your classes; to people examining your design decisions, hoping to glean insights into their rationale. You may not be used to thinking about your clients, but I'll spend a good deal of time trying to convince you to make their lives as easy as you can.
After all, you are a client of the software other people develop. Wouldn't you want those people to make things easy for you? Besides, someday you may find yourself in the uncomfortable position of having to use your own code, in which case your client will be you!

I use two constructs in this book that may not be familiar to you. Both are relatively recent additions to C++. The first is the bool type, which has as its values the keywords true and false. This is the type now returned by the built-in relational operators (e.g., <, >, ==) and tested in the condition part of if, for, while, and do statements. If your compilers haven't implemented bool, an easy way to approximate it is to use a typedef for bool and constant objects for true and false:

    typedef int bool;
    const bool false = 0;
    const bool true = 1;

This is compatible with the traditional semantics of C and C++. The behavior of programs using this approximation won't change when they're ported to bool-supporting compilers. For a different way of approximating bool (including a discussion of the advantages and disadvantages of each approach), turn to the Introduction of More Effective C++.

The second new construct is really four constructs, the casting forms static_cast, const_cast, dynamic_cast, and reinterpret_cast. Conventional C-style casts look like this:

    (type) expression                     // cast expression to be of
                                          // type type

The new casts look like this:

    static_cast<type>(expression)         // cast expression to be of
                                          // type type
    const_cast<type>(expression)
    dynamic_cast<type>(expression)
    reinterpret_cast<type>(expression)

These different casting forms serve different purposes:

const_cast is designed to cast away the constness of objects and pointers, a topic I examine in Item 21.

dynamic_cast is used to perform "safe downcasting," a subject we'll explore in Item 39.

reinterpret_cast is engineered for casts that yield implementation-dependent results, e.g., casting between function pointer types. (You're not likely to need reinterpret_cast very often. I don't use it at all in this book.)

static_cast is sort of the catch-all cast. It's what you use when none of the other casts is appropriate. It's the closest in meaning to the conventional C-style casts.

Conventional casts continue to be legal, but the new casting forms are preferable. They're much easier to identify in code (both for humans and for tools like grep), and the more narrowly specified purpose of each casting form makes it possible for compilers to diagnose usage errors. For example, only const_cast can be used to cast away the constness of something. If you try to cast away an object's or a pointer's constness using one of the other new casts, your cast expression won't compile. For more information on the new casts, see Item M2 or consult a recent introductory textbook on C++.

In the code examples in this book, I have tried to select meaningful names for objects, classes, functions, etc. Many books, when choosing identifiers, embrace the time-honored adage that brevity is the soul of wit, but I'm not as interested in being witty as I am in being clear. I have therefore striven to break the tradition of using cryptic identifiers in books on programming languages. Nonetheless, I have at times succumbed to the temptation to use two of my favorite parameter names, and their meanings may not be immediately apparent, especially if you've never done time on a compiler-writing chain gang. The names are lhs and rhs, and they stand for "left-hand side" and "right-hand side," respectively. I use them as parameter names for functions implementing binary operators, especially operator== and arithmetic operators like operator*.
For example, if a and b are objects representing rational numbers, and if rational numbers can be multiplied via a non-member operator* function, the expression

    a * b

is equivalent to the function call

    operator*(a, b)

As you will discover in Item 23, I declare operator* like this:

    const Rational operator*(const Rational& lhs, const Rational& rhs);

As you can see, the left-hand operand, a, is known as lhs inside the function, and the right-hand operand is known as rhs.

I've also chosen to abbreviate names for pointers according to this rule: a pointer to an object of type T is often called pt, "pointer to T." Here are some examples:

    string *ps;                           // ps = ptr to string

    class Airplane;
    Airplane *pa;                         // pa = ptr to Airplane

    class BankAccount;
    BankAccount *pba;                     // pba = ptr to BankAccount

I use a similar convention for references. That is, rs might be a reference-to-string and ra a reference-to-Airplane.

I occasionally use the name mf when I'm talking about member functions.

On the off chance there might be some confusion, any time I mention the C programming language in this book, I mean the ISO/ANSI-sanctified version of C, not the older, less strongly-typed, "classic" C.

Footnote 3: My String's constructor taking a const char* argument handles the case where a null pointer is passed in, but the standard string type is not required to be so tolerant. Attempts to create a string from a null pointer yield undefined results. However, it is safe to create a string object from an empty char*-based string, i.e., from "".
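To make the lhs and rhs naming convention described above concrete, here is a minimal sketch of a Rational class with a non-member operator*. Only the operator* signature comes from the text; the constructor, the numerator() and denominator() accessors, and the sameResult helper are assumptions made for this illustration, and the sketch does not reduce results to lowest terms.

```cpp
#include <cassert>

// Hypothetical minimal Rational class; its members are assumptions
// made for this sketch, not the class developed later in the book.
class Rational {
public:
  Rational(int numerator = 0, int denominator = 1)
    : n(numerator), d(denominator) { assert(denominator != 0); }

  int numerator() const   { return n; }
  int denominator() const { return d; }

private:
  int n, d;
};

// Non-member operator*: inside the function, the left-hand operand
// is known as lhs and the right-hand operand as rhs.
const Rational operator*(const Rational& lhs, const Rational& rhs)
{
  return Rational(lhs.numerator() * rhs.numerator(),
                  lhs.denominator() * rhs.denominator());
}

// Checks that the expression a * b and the explicit function call
// operator*(a, b) produce the same value.
bool sameResult(const Rational& a, const Rational& b)
{
  const Rational viaOperator = a * b;
  const Rational viaFunction = operator*(a, b);

  return viaOperator.numerator() == viaFunction.numerator()
      && viaOperator.denominator() == viaFunction.denominator();
}
```

With a holding 1/2 and b holding 3/5, a * b yields a Rational whose numerator is 3 and whose denominator is 10, and sameResult(a, b) returns true because both spellings invoke the same function.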