Some guidelines for effective C++ programming defy convenient categorization. This section is where such guidelines come to roost. Not that that diminishes their importance. If you are to write effective software, you must understand what compilers are doing for you (to you?) behind your back, how to ensure that non-local static objects are initialized before they are used, what you can expect from the standard library, and where to go for insights into the language's underlying design philosophy. In this final section of the book, I expound on these issues.
Item 45: Know what functions C++ silently writes and calls.
When is an empty class not an empty class? When C++ gets through with it. If you don't declare them yourself, your thoughtful compilers will declare their own versions of a copy constructor, an assignment operator, a destructor, and a pair of address-of operators. Furthermore, if you don't declare any constructors, they will declare a default constructor for you, too. All these functions will be public. In other words, if you write
class Empty{};
it's the same as if you'd written
class Empty {
public:
  Empty();                              // default constructor
  Empty(const Empty& rhs);              // copy constructor

  ~Empty();                             // destructor (see below for
                                        // whether it's virtual)
  Empty& operator=(const Empty& rhs);   // assignment operator

  Empty* operator&();                   // address-of operators
  const Empty* operator&() const;
};
Now these functions are generated only if they are needed, but it doesn't take much to need them. The following code will cause each function to be generated:
const Empty e1;             // default constructor; destructor
Empty e2(e1);               // copy constructor
e2 = e1;                    // assignment operator
Empty *pe2 = &e2;           // address-of operator (non-const)
const Empty *pe1 = &e1;     // address-of operator (const)
Given that compilers are writing functions for you, what do the functions do? Well, the default constructor and the destructor don't really do anything. They just enable you to create and destroy objects of the class. (They also provide a convenient place for implementers to place code whose execution takes care of "behind the scenes" behavior; see Items 33 and M24.) Note that the generated destructor is nonvirtual (see Item 14) unless it's for a class inheriting from a base class that itself declares a virtual destructor. The default address-of operators just return the address of the object. These functions are effectively defined like this:
inline Empty::Empty() {}
inline Empty::~Empty() {}
inline Empty * Empty::operator&() { return this; }
inline const Empty * Empty::operator&() const { return this; }
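The availability of the generated constructors, assignment operator, and destructor can be checked at compile time. The following is a minimal sketch, assuming a C++11 compiler (the <type_traits> header and static_assert postdate this text); the names WithVirtualDtor and Inherits are hypothetical and exist only for illustration.

```cpp
#include <type_traits>

class Empty {};   // nothing declared by hand

// Each trait is true only if the corresponding operation is usable.
static_assert(std::is_default_constructible<Empty>::value, "default constructor");
static_assert(std::is_copy_constructible<Empty>::value,    "copy constructor");
static_assert(std::is_copy_assignable<Empty>::value,       "assignment operator");
static_assert(std::is_destructible<Empty>::value,          "destructor");

// Empty has no base class with a virtual destructor, so its
// generated destructor is nonvirtual.
static_assert(!std::has_virtual_destructor<Empty>::value, "nonvirtual destructor");

// Hypothetical classes: the generated destructor becomes virtual
// when a base class declares a virtual destructor.
struct WithVirtualDtor { virtual ~WithVirtualDtor() {} };
struct Inherits : WithVirtualDtor {};
static_assert(std::has_virtual_destructor<Inherits>::value, "virtual via base class");
```

If any of these assertions failed, the translation unit would not compile, which is the kind of early detection Item 46 argues for.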
As for the copy constructor and the assignment operator, the official rule is this: the default copy constructor (assignment operator) performs memberwise copy construction (assignment) of the nonstatic data members of the class. That is, if m is a nonstatic data member of type T in a class C and C declares no copy constructor (assignment operator), m will be copy constructed (assigned) using the copy constructor (assignment operator) defined for T, if there is one. If there isn't, this rule will be recursively applied to m's data members until a copy constructor (assignment operator) or built-in type (e.g., int, double, pointer, etc.) is found. By default, objects of built-in types are copy constructed (assigned) using bitwise copy from the source object to the destination object. For classes that inherit from other classes, this rule is applied to each level of the inheritance hierarchy, so user-defined copy constructors and assignment operators are called at whatever level they are declared.
In case that rule isn't entirely clear, here's an example. Consider the definition of a NamedObject template, whose instances are classes allowing you to associate names with objects:
template<class T>
class NamedObject {
public:
  NamedObject(const char *name, const T& value);
  NamedObject(const string& name, const T& value);

  ...

private:
  string nameValue;
  T objectValue;
};
Because the NamedObject classes declare at least one constructor, compilers won't generate default constructors, but because the classes fail to declare copy constructors or assignment operators, compilers will generate those functions (if they are needed).
Consider the following call to a copy constructor:
NamedObject<int> no1("Smallest Prime Number", 2);
NamedObject<int> no2(no1); // calls copy constructor
The copy constructor generated by your compilers must initialize no2.nameValue and no2.objectValue using no1.nameValue and no1.objectValue, respectively. The type of nameValue is string, and string has a copy constructor (which you can verify by examining string in the standard library; see Item 49), so no2.nameValue will be initialized by calling the string copy constructor with no1.nameValue as its argument. On the other hand, the type of NamedObject<int>::objectValue is int (because T is int for this template instantiation), and no copy constructor is defined for ints, so no2.objectValue will be initialized by copying the bits over from no1.objectValue.
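To watch memberwise copying happen, it helps to use a member type whose copy constructor leaves a trace. The sketch below is illustrative only: Counted, the accessors name() and value(), and the function checkMemberwiseCopy() are hypothetical additions that do not appear in the text, and std::string stands in for the unqualified string used above.

```cpp
#include <cassert>
#include <string>

// Hypothetical member type: counts calls to its copy constructor.
struct Counted {
    static int copies;
    Counted() {}
    Counted(const Counted&) { ++copies; }
};
int Counted::copies = 0;

template<class T>
class NamedObject {
public:
    NamedObject(const std::string& name, const T& value)
        : nameValue(name), objectValue(value) {}
    const std::string& name() const { return nameValue; }   // hypothetical accessor
    const T& value() const { return objectValue; }          // hypothetical accessor
    // no copy constructor declared: compilers generate one
private:
    std::string nameValue;
    T objectValue;
};

// Self-check: the generated copy constructor copies each member
// using that member's own copy semantics.
inline void checkMemberwiseCopy() {
    NamedObject<int> no1("Smallest Prime Number", 2);
    NamedObject<int> no2(no1);                  // generated copy constructor
    assert(no2.name() == "Smallest Prime Number");
    assert(no2.value() == 2);

    NamedObject<Counted> c1("counted", Counted());
    const int before = Counted::copies;
    NamedObject<Counted> c2(c1);                // must call Counted's copy ctor
    (void)c2;
    assert(Counted::copies - before == 1);
}
```

Copying a NamedObject<Counted> invokes Counted's user-defined copy constructor exactly once, while the int member of a NamedObject<int> is simply copied by value.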
The compiler-generated assignment operator for NamedObject<int> would behave the same way, but in general, compiler-generated assignment operators behave as I've described only when the resulting code is both legal and has a reasonable chance of making sense. If either of these tests fails, compilers will refuse to generate an operator= for your class, and you'll receive some lovely diagnostic during compilation.
For example, suppose NamedObject were defined like this, where nameValue is a reference to a string and objectValue is a const T:
template<class T>
class NamedObject {
public:
  // this ctor no longer takes a const name, because nameValue
  // is now a reference-to-non-const string. The char* ctor
  // is gone, because we must have a string to refer to
  NamedObject(string& name, const T& value);

  ...                          // as above, assume no
                               // operator= is declared
private:
  string& nameValue;           // this is now a reference
  const T objectValue;         // this is now const
};
Now consider what should happen here:
string newDog("Persephone");
string oldDog("Satch");

NamedObject<int> p(newDog, 2);     // as I write this, our dog
                                   // Persephone is about to
                                   // have her second birthday

NamedObject<int> s(oldDog, 29);    // the family dog Satch
                                   // (from my childhood)
                                   // would be 29 if she were
                                   // still alive

p = s;                             // what should happen to
                                   // the data members in p?
Before the assignment, p.nameValue refers to some string object and s.nameValue also refers to a string, though not the same one. How should the assignment affect p.nameValue? After the assignment, should p.nameValue refer to the string referred to by s.nameValue, i.e., should the reference itself be modified? If so, that breaks new ground, because C++ doesn't provide a way to make a reference refer to a different object (see Item M1). Alternatively, should the string object to which p.nameValue refers be modified, thus affecting other objects that hold pointers or references to that string, i.e., objects not directly involved in the assignment? Is that what the compiler-generated assignment operator should do?
Faced with such a conundrum, C++ refuses to compile the code. If you want to support assignment in a class containing a reference member, you must define the assignment operator yourself. Compilers behave similarly for classes containing const members (such as objectValue in the modified class above); it's not legal to modify const members, so compilers are unsure how to treat them during an implicitly generated assignment function. Finally, compilers refuse to generate assignment operators for derived classes that inherit from base classes declaring the standard assignment operator private. After all, compiler-generated assignment operators for derived classes are supposed to handle base class parts, too (see Items 16 and M33), but in doing so, they certainly shouldn't invoke member functions the derived class has no right to call.
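These refusals can be observed without triggering a compilation error by asking whether a type is assignable. This is a hedged sketch that assumes a C++11 compiler and its <type_traits> header, neither of which existed when this text was written; Base and Derived are hypothetical names, and std::string stands in for string.

```cpp
#include <string>
#include <type_traits>

template<class T>
class NamedObject {
public:
    NamedObject(std::string& name, const T& value)
        : nameValue(name), objectValue(value) {}
private:
    std::string& nameValue;    // reference member
    const T objectValue;       // const member
};

// Copy construction is unaffected by reference and const members...
static_assert(std::is_copy_constructible<NamedObject<int> >::value,
              "copy constructor is still generated");
// ...but no usable assignment operator is generated.
static_assert(!std::is_copy_assignable<NamedObject<int> >::value,
              "no generated operator=");

// Hypothetical base class that declares assignment private.
class Base {
public:
    Base() {}
private:
    Base& operator=(const Base&);   // private and never defined
};

class Derived : public Base {};     // declares no operator= of its own

// Derived's generated operator= would have to call Base's private one,
// so compilers decline to provide it.
static_assert(!std::is_copy_assignable<Derived>::value,
              "no generated operator= for Derived");
```

Writing p = s for two NamedObject<int> objects, or assigning one Derived to another, would be rejected at compile time for the same reasons.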
All this talk of compiler-generated functions gives rise to the question, what do you do if you want to disallow use of those functions? That is, what if you deliberately don't declare, for example, an operator= because you never ever want to allow assignment of objects in your class? The solution to that little teaser is the subject of Item 27. For a discussion of the often-overlooked interactions between pointer members and compiler-generated copy constructors and assignment operators, check out Item 11.
Item 46: Prefer compile-time and link-time errors to runtime errors.
Other than in the few situations that cause C++ to throw exceptions (e.g., running out of memory; see Item 7), the notion of a runtime error is as foreign to C++ as it is to C. There's no detection of underflow, overflow, division by zero, no checking for array bounds violations, etc. Once your program gets past a compiler and linker, you're on your own: there's no safety net of any consequence. Much as with skydiving, some people are exhilarated by this state of affairs, others are paralyzed with fear. The motivation behind the philosophy, of course, is efficiency: without runtime checks, programs are smaller and faster.
There is a different way to approach things. Languages like Smalltalk and LISP generally detect fewer kinds of errors during compilation and linking, but they provide hefty runtime systems that catch errors during execution. Unlike C++, these languages are almost always interpreted, and you pay a performance penalty for the extra flexibility they offer.
Never forget that you are programming in C++. Even if you find the Smalltalk/LISP philosophy appealing, put it out of your mind. There's a lot to be said for adhering to the party line, and in this case, that means eschewing runtime errors. Whenever you can, push the detection of an error back from runtime to link-time, or, ideally, to compile-time.
Such a methodology pays dividends not only in terms of program size and speed, but also in terms of reliability. If your program gets through compilers and a linker without eliciting error messages, you may be confident there aren't any compiler- or linker-detectable errors in your program, period. (The other possibility, of course, is that there are bugs in your compilers or linkers, but let us not depress ourselves by admitting to such possibilities.)
With runtime errors, the situation is very different. Just because your program doesn't generate any runtime errors during a particular run, how can you be sure it won't generate errors during a different run, when you do things in a different order, use different data, or run for a longer or shorter period of time? You can test your program until you're blue in the face, but you'll still never cover all the possibilities. As a result, detecting errors at runtime is simply less secure than is catching them during compilation or linking.
Often, by making relatively minor changes to your design, you can catch during compilation what might otherwise be a runtime error. This frequently involves the addition of new types to the program. (See also Item M33.) For example, suppose you are writing a class to represent dates in time. Your first cut might look like
class Date {
public:
  Date(int day, int month, int year);

  ...

};
If you were to implement this constructor, one of the problems you'd face would be that of sanity checking on the values for the day and the month. Let's see how you can eliminate the need to validate the value passed in for the month.
One obvious approach is to employ an enumerated type instead of an integer:
enum Month { Jan = 1, Feb = 2, ... , Nov = 11, Dec = 12 };

class Date {
public:
  Date(int day, Month month, int year);

  ...

};
Unfortunately, this doesn't buy you that much, because enums don't have to be initialized:
Month m;
Date d(22, m, 1857);       // m is undefined
As a result, the Date constructor would still have to validate the value of the month parameter.
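An uninitialized enum is not the only way to end up with a Month that names no month; a cast will do it too. The following sketch assumes a C++11 compiler (for constexpr and static_assert). The function isValidMonth is a hypothetical stand-in for the runtime check a Date constructor would still have to perform, and the enumerators are written out in full here only so the example compiles.

```cpp
enum Month { Jan = 1, Feb, Mar, Apr, May, Jun,
             Jul, Aug, Sep, Oct, Nov, Dec };

// Hypothetical validity check of the kind the Date constructor
// would still need if months were represented by this enum.
constexpr bool isValidMonth(Month m) { return m >= Jan && m <= Dec; }

// Every enumerator passes the check...
static_assert(isValidMonth(Jan) && isValidMonth(Dec), "enumerators are valid");

// ...but a cast yields a Month whose value matches no enumerator,
// and the type system does not object.
static_assert(!isValidMonth(static_cast<Month>(13)), "13 is not a month");
```

The enum type narrows the set of values a caller is likely to pass, but it cannot guarantee that only the twelve named values ever arrive.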
To achieve enough security to dispense with runtime checks, you've got to use a class to represent months, and you must ensure that only valid months are created:
class Month {
public:
  static const Month Jan() { return 1; }
  static const Month Feb() { return 2; }
  ...
  static const Month Dec() { return 12; }

  int asInt() const            // for convenience, make
  { return monthNumber; }      // it possible to convert
                               // a Month to an int

private:
  Month(int number): monthNumber(number) {}

  const int monthNumber;
};

class Date {
public:
  Date(int day, const Month& month, int year);
  ...
};
Several aspects of this design combine to make it work the way it does. First, the Month constructor is private. This prevents clients from creating new months. The only ones available are those returned by Month's static member functions, plus copies thereof. Second, each Month object is const, so it can't be changed. (Otherwise the temptation to transform January into June might sometimes prove overwhelming, at least in northern latitudes.) Finally, the only way to get a Month object is by calling a function or by copying an existing Month (via the implicit Month copy constructor; see Item 45). This makes it possible to use Month objects anywhere and anytime; there's no need to worry about accidentally using one before it's been initialized. (Item 47 explains why this might otherwise be a problem.)
Given these classes, it is all but impossible for a client to specify an invalid month. It would be completely impossible were it not for the following:
Month *pm; // define uninitialized ptr
Date d(1, *pm, 1997); // arghhh! use it!
However, this involves dereferencing an uninitialized pointer, the results of which are undefined. (See Item 3 for my feelings about undefined behavior.) Unfortunately, I know of no way to prevent or detect this kind of heresy. However, if we assume this never happens, or if we don't care how our software behaves if it does, the Date constructor can dispense with sanity checking on its Month parameter. On the other hand, the constructor must still check the day parameter for validity: how many days hath September, April, June, and November?
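Here is a compilable rendering of the Month and Date design, offered as a sketch rather than a definitive implementation. It differs from the listing above in a few labeled ways: only three of the twelve month functions are written out, the Month constructor is marked explicit, the Date members and the month() accessor are hypothetical, and the static_asserts assume a C++11 compiler.

```cpp
#include <cassert>
#include <type_traits>

class Month {
public:
    // Only three month functions are shown; the others would
    // follow the same pattern.
    static Month Jan() { return Month(1); }
    static Month Feb() { return Month(2); }
    static Month Dec() { return Month(12); }

    int asInt() const { return monthNumber; }

private:
    explicit Month(int number) : monthNumber(number) {}   // clients can't call this
    const int monthNumber;
};

class Date {
public:
    // No validation of month is needed: every Month object is valid.
    Date(int day, const Month& month, int year)
        : dayValue(day), monthValue(month.asInt()), yearValue(year) {}
    int month() const { return monthValue; }   // hypothetical accessor
private:
    int dayValue, monthValue, yearValue;       // hypothetical representation
};

// Clients cannot manufacture a Month from an arbitrary integer.
static_assert(!std::is_constructible<Month, int>::value, "Month(int) is private");
static_assert(!std::is_convertible<int, Month>::value,   "no int-to-Month conversion");

// Self-check: a Date built from Month::Dec() records month 12.
inline void checkDate() {
    Date d(25, Month::Dec(), 1997);
    assert(d.month() == 12);
}
```

An attempt such as Date d(25, 13, 1997) fails to compile, which moves the month check from runtime to compile time; the day argument remains an int and still needs a runtime check.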
This Date example replaces runtime checks with compile-time checks. You may be wondering when it is possible to use link-time checks. In truth, not very often. C++ uses the linker to ensure that needed functions are defined exactly once (see Item 45 for a description of what it takes to "need" a function). It also uses the linker to ensure that static objects (see Item 47) are defined exactly once. You'll tend to use the linker in the same way. For example, Item 27 describes how the linker's checks can make it useful to deliberately avoid defining a function you explicitly declare.
Now don't get carried away. It's impractical to eliminate the need for all runtime checking. Any program that accepts interactive input, for example, is likely to have to validate that input. Similarly, a class implementing arrays that perform bounds checking (see Item 18) is usually going to have to validate the array index against the bounds every time an array access is made. Nonetheless, shifting checks from runtime to compile- or link-time is always a worthwhile goal, and you should pursue that goal whenever it is practical. Your reward for doing so is programs that are smaller, faster, and more reliable.
Item 47: Ensure that non-local static objects are initialized before they're used.
You're an adult now, so you don't need me to tell you it's foolhardy to use an object before it's been initialized. In fact, the whole notion may strike you as absurd; constructors make sure objects are initialized when they're created, n'est-ce pas?
Well, yes and no. Within a particular translation unit (i.e., source file), everything works fine, but things get trickier when the initialization of an object in one translation unit depends on the value of another object in a different translation unit and that second object itself requires initialization.
For example, suppose you've authored a library offering an abstraction of a file system, possibly including such capabilities as making files on the Internet look like they're local. Since your library makes the world look like a single file system, you might create a special object, theFileSystem, within your library's namespace (see Item 28) for clients to use whenever they need to interact with the file system abstraction your library provides:
class FileSystem { ... };      // this class is in your
                               // library

FileSystem theFileSystem;      // this is the object
                               // with which library
                               // clients interact
Because theFileSystem represents something complicated, it's no surprise that its construction is both nontrivial and essential; use of theFileSystem before it had been constructed would yield very undefined behavior. (However, consult Item M17 for ideas on how the effective initialization of objects like theFileSystem can safely be delayed.)
Now suppose some client of your library creates a class for directories in a file system. Naturally, their class uses theFileSystem:
class Directory {              // created by library client
public:
  Directory();
  ...
};

Directory::Directory()
{
  // create a Directory object by invoking member
  // functions on theFileSystem
}
Further suppose this client decides to create a distinguished global Directory object for temporary files:
Directory tempDir;             // directory for temporary
                               // files
Now the problem of initialization order becomes apparent: unless theFileSystem is initialized before tempDir, tempDir's constructor will attempt to use theFileSystem before it's been initialized. But theFileSystem and tempDir were created by different people at different times in different files. How can you be sure that theFileSystem will be created before tempDir?
This kind of question arises anytime you have non-local static objects that are defined in different translation units and whose correct behavior is dependent on their being initialized in a particular order. Non-local static objects are objects that are defined at global or namespace scope (e.g., theFileSystem and tempDir), declared static in a class, or defined static at file scope.
Regrettably, there is no shorthand term for "non-local static objects," so you should accustom yourself to this somewhat awkward phrase.
You do not want the behavior of your software to be dependent on the initialization order of non-local static objects in different translation units, because you have no control over that order. Let me repeat that. You have absolutely no control over the order in which non-local static objects in different translation units are initialized.
It is reasonable to wonder why this is the case.
It is the case because determining the "proper" order in which to initialize non-local static objects is hard. Very hard. Halting-Problem hard. In its most general form, with multiple translation units and non-local static objects generated through implicit template instantiations (which may themselves arise via implicit template instantiations), it's not only impossible to determine the right order of initialization, it's typically not even worth looking for special cases where it is possible to determine the right order.
In the field of Chaos Theory, there is a principle known as the "Butterfly Effect." This principle asserts that the tiny atmospheric disturbance caused by the beating of a butterfly's wings in one part of the world can lead to profound changes in weather patterns in places far distant. Somewhat more rigorously, it asserts that for some types of systems, minute perturbations in inputs can lead to radical changes in outputs.
The development of software systems can exhibit a Butterfly Effect of its own. Some systems are highly sensitive to the particulars of their requirements, and small changes in requirements can significantly affect the ease with which a system can be implemented. For example, Item 29 describes how changing the specification for an implicit conversion from String-to-char* to String-to-const-char* makes it possible to replace a slow or error-prone function with a fast, safe one.
The problem of ensuring that non-local static objects are initialized before use is similarly sensitive to the details of what you want to achieve. If, instead of demanding access to non-local static objects, you're willing to settle for access to objects that act like non-local static objects (except for the initialization headaches), the hard problem vanishes. In its stead is left a problem so easy to solve, it's hardly worth calling a problem any more.
The technique, sometimes known as the Singleton pattern, is simplicity itself. First, you move each non-local static object into its own function, where you declare it static. Next, you have the function return a reference to the object it contains. Clients call the function instead of referring to the object. In other words, you replace non-local static objects with objects that are static inside functions. (See also Item M26.)
The basis of this approach is the observation that although C++ says next to nothing about when a non-local static object is initialized, it specifies quite precisely when a static object inside a function (i.e., a local static object) is initialized: it's when the object's definition is first encountered during a call to that function. So if you replace direct accesses to non-local static objects with calls to functions that return references to local static objects inside them, you're guaranteed that the references you get back from the functions will refer to initialized objects. As a bonus, if you never call a function emulating a non-local static object, you never incur the cost of constructing and destructing the object, something that can't be said for true non-local static objects.
Here's the technique applied to both theFileSystem and tempDir:
class FileSystem { ... };         // same as before

FileSystem& theFileSystem()       // this function replaces
{                                 // the theFileSystem object

  static FileSystem tfs;          // define and initialize
                                  // a local static object
                                  // (tfs = "the file system")

  return tfs;                     // return a reference to it
}

class Directory { ... };          // same as before

Directory::Directory()
{
  // same as before, except references to theFileSystem are
  // replaced by references to theFileSystem()
}

Directory& tempDir()              // this function replaces
{                                 // the tempDir object

  static Directory td;            // define/initialize local
                                  // static object

  return td;                      // return reference to it
}
Clients of this modified system program exactly as they used to, except they now refer to theFileSystem() and tempDir() instead of theFileSystem and tempDir. That is, they refer only to functions returning references to those objects, never to the objects themselves.
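The following is a minimal, self-checking sketch of the function-based scheme. The class bodies are hypothetical stand-ins, since the text elides them: FileSystem merely counts its constructions, and Directory records whether the file system existed by the time its constructor finished.

```cpp
#include <cassert>

// Hypothetical stand-in for the library's class.
class FileSystem {
public:
    FileSystem() { ++constructions; }
    static int constructions;          // number of FileSystem objects built
};
int FileSystem::constructions = 0;

FileSystem& theFileSystem() {
    static FileSystem tfs;             // initialized on the first call only
    return tfs;
}

// Hypothetical stand-in for the client's class.
class Directory {
public:
    Directory() {
        theFileSystem();               // forces construction if not yet done
        fileSystemWasReady = (FileSystem::constructions == 1);
    }
    bool sawInitializedFileSystem() const { return fileSystemWasReady; }
private:
    bool fileSystemWasReady;
};

Directory& tempDir() {
    static Directory td;
    return td;
}

// Self-check: repeated calls return the same objects, and the file
// system is constructed exactly once, before tempDir's constructor ends.
inline void checkInitializationOrder() {
    Directory& first = tempDir();
    Directory& second = tempDir();
    assert(&first == &second);
    assert(first.sawInitializedFileSystem());
    assert(FileSystem::constructions == 1);
}
```

Because tempDir() is the only route to the Directory object, and its constructor reaches the FileSystem only through theFileSystem(), the order of construction follows the order of first use rather than anything the linker decides.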
The reference-returning functions dictated by this scheme are always simple: define and initialize a local static object on line 1, return it on line 2. That's it. Because they're so simple, you may be tempted to declare them inline. Item 33 explains that late-breaking revisions to the C++ language specification make this a perfectly valid implementation strategy, but it also explains why you'll want to confirm your compilers' conformance with this aspect of the standard.
Now, there's no magic going on here. For this technique to be effective, it must be possible to come up with a reasonable initialization order for your objects. If you set things up such that object A must be initialized before object B, and you also make A's initialization dependent on B's having already been initialized, you are going to get in trouble, and frankly, you deserve it. If you steer shy of such pathological situations, however, the scheme described in this Item should serve you quite well.
Item 48: Pay attention to compiler warnings.
Many programmers routinely ignore compiler warnings. After all, if the problem were serious, it'd be an error, right? This kind of thinking may be relatively harmless in other languages, but in C++, it's a good bet compiler writers have a better grasp of what's going on than you do. For example, here's an error everybody makes at one time or another:
class B {
public:
  virtual void f() const;
};

class D: public B {
public:
  virtual void f();
};
The idea is for D::f to redefine the virtual function B::f, but there's a mistake: in B, f is a const member function, but in D it's not declared const. One compiler I know says this about it:
warning: D::f() hides virtual B::f()
Too many inexperienced programmers respond to this message by saying to themselves, "Of course D::f hides B::f; that's what it's supposed to do!" Wrong. What this compiler is trying to tell you is that the f declared in B has not been redeclared in D, it's been hidden entirely (see Item 50 for a description of why this is so). Ignoring this compiler warning will almost certainly lead to erroneous program behavior, followed by a lot of debugging to find out about something that this compiler detected in the first place.
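The consequence of the hidden function shows up as soon as an object is used through a base class reference. In this sketch the functions return int instead of void so the call that actually runs can be observed; that change, the class Fixed, and the function checkHiding() are hypothetical additions for illustration.

```cpp
#include <cassert>

class B {
public:
    virtual ~B() {}
    virtual int f() const { return 1; }
};

class D : public B {
public:
    virtual int f() { return 2; }         // not const: hides B::f, overrides nothing
};

class Fixed : public B {                  // hypothetical corrected version
public:
    virtual int f() const { return 3; }   // same signature: overrides B::f
};

// Self-check: calls through a B reference reach D::f only when
// the signatures match.
inline void checkHiding() {
    D d;
    Fixed fixed;
    const B& dAsBase = d;
    const B& fixedAsBase = fixed;

    assert(dAsBase.f() == 1);       // B::f runs; D::f was never an override
    assert(fixedAsBase.f() == 3);   // Fixed::f runs, as intended
    assert(d.f() == 2);             // D::f is reachable only through a non-const D
}
```

In C++11 and later, marking the derived function with the override specifier turns this mistake into a compile-time error instead of a warning that some compilers never issue.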
After you gain experience with the warning messages from a particular compiler, of course, you'll learn to understand what the different messages mean (which is often very different from what they seem to mean, alas). Once you have that experience, there may be a whole range of warnings you'll choose to ignore. That's fine, but it's important to make sure that before you dismiss a warning, you understand exactly what it's trying to tell you.
As long as we're on the topic of warnings, recall that warnings are inherently implementation-dependent, so it's not a good idea to get sloppy in your programming, relying on compilers to spot your mistakes for you. The function-hiding code above, for instance, goes through a different (but widely used) compiler with nary a squawk. Compilers are supposed to translate C++ into an executable format, not act as your personal safety net. You want that kind of safety? Program in
Item 49: Familiarize yourself with the standard library.
C++'s standard library is big. Very big. Incredibly big. How big? Let me put it this way: the specification takes over 300 closely-packed pages in the C++ standard.
Bigger isn't always better, of course, but in this case, bigger is better, because a big library contains lots of functionality. The more functionality in the standard library, the more functionality you can lean on as you develop your applications. The C++ library doesn't offer everything (support for concurrency and for graphical user interfaces is notably absent), but it does offer a lot. You can lean almost anything against it.
Before summarizing what's in the library, I need to tell you a bit about how it's organized. Because the library has so much in it, there's a reasonable chance you (or someone like you) may choose a class or function name that's the same as a name in the standard library. To shield you from the name conflicts that would result, virtually everything in the standard library is nestled in the namespace std (see Item 28). But that leads to a new problem. Gazillions of lines of existing C++ rely on functionality in the pseudo-standard library that's been in use for years, e.g., functionality declared in the headers <iostream.h>, <complex.h>, <limits.h>, etc. That existing software isn't designed to use namespaces, and it would be a shame if wrapping the standard library by std caused the existing code to break. (Authors of the broken code would likely use somewhat harsher language than "shame" to describe their feelings about having the library rug pulled out from underneath them.)
Mindful of the destructive power of rioting bands of incensed programmers, the standardization committee decided to create new header names for the std-wrapped components. The algorithm they chose for generating the new header names is as trivial as the results it produces are jarring: the .h on the existing C++ headers was simply dropped. So <iostream.h> became <iostream>, <complex.h> became <complex>, etc. For C headers, the same algorithm was applied, but a c was prepended to each result. Hence C's <string.h> became <cstring>, <stdio.h> became <cstdio>, etc. For a final twist, the old C++ headers were officially deprecated (i.e., listed as no longer supported), but the old C headers were not (to maintain C compatibility). In practice, compiler vendors have no incentive to disavow their customers' legacy software, so you can expect the old C++ headers to be supported for many years.
Practically speaking, then, this is the C++ header situation:
Old C++ header names like <iostream.h> are likely to continue to be supported, even though they aren't in the official standard. The contents of such headers are not in namespace std.
New C++ header names like <iostream> contain the same basic functionality as the corresponding old headers, but the contents of the headers are in namespace std. (During standardization, the details of some of the library components were modified, so there isn't necessarily an exact match between the entities in an old C++ header and those in a new one.)
Standard C headers like <stdio.h> continue to be supported. The contents of such headers are not in std.
New C++ headers for the functionality in the C library have names like <cstdio>. They offer the same contents as the corresponding old C headers, but the contents are in std.
All this seems a little weird at first, but it's really not that hard to get used to. The biggest challenge is keeping all the string headers straight: <string.h> is the old C header for char*-based string manipulation functions, <string> is the std-wrapped C++ header for the new string classes (see below), and <cstring> is the std-wrapped version of the old C header. If you can master that (and I know you can), the rest of the library is easy.
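A short sketch may help keep the string headers apart. It uses <cstring> for the char*-based C functions (reached through std) and <string> for the string class; the function name checkStringHeaders is a hypothetical label for the example.

```cpp
#include <cassert>
#include <cstring>   // std-wrapped C header: std::strlen, std::strcmp, ...
#include <string>    // C++ header: std::string

// Self-check: the C functions and the string class agree about
// the same sequence of characters.
inline void checkStringHeaders() {
    const char* cString = "Persephone";
    std::string cppString(cString);

    assert(std::strlen(cString) == cppString.size());        // both report 10
    assert(std::strcmp(cString, cppString.c_str()) == 0);    // same characters
}
```

Including <string.h> instead of <cstring> would provide the same C functions, but declared outside namespace std.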
The next thing you need to know about the standard library is that almost everything in it is a template. Consider your old friend iostreams. (If you and iostreams aren't friends, turn to Item 2 to find out why you should cultivate a relationship.) Iostreams help you manipulate streams of characters, but what's a character? Is it a char? A wchar_t? A Unicode character? Some other multi-byte character? There's no obviously right answer, so the library lets you choose. All the stream classes are really class templates, and you specify the character type when you instantiate a stream class. For example, the standard library defines the type of cout to be ostream, but ostream is really a typedef for basic_ostream<char>.
Similar considerations apply to most of the other classes in the standard library. string isn't a class, it's a class template: a type parameter defines the type of characters in each string class. complex isn't a class, it's a class template: a type parameter defines the type of the real and imaginary components in each complex class. vector isn't a class, it's a class template. On and on it goes.
You can't escape the templates in the standard library, but if you're used to working with only streams and strings of chars, you can mostly ignore them. That's because the library defines typedefs for char instantiations for these components of the library, thus letting you continue to program in terms of the objects cin, cout, cerr, etc., and the types istream, ostream, string, etc., without having to worry about the fact that cin's real type is basic_istream<char> and string's is basic_string<char>.
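The relationship between the familiar stream names and the underlying templates can be confirmed by the compiler itself. This sketch assumes a C++11 compiler, since static_assert, decltype, and <type_traits> are newer than this text.

```cpp
#include <iostream>
#include <type_traits>

// The everyday stream types are typedefs for char instantiations
// of the stream class templates.
static_assert(std::is_same<std::ostream, std::basic_ostream<char> >::value,
              "ostream is basic_ostream<char>");
static_assert(std::is_same<std::istream, std::basic_istream<char> >::value,
              "istream is basic_istream<char>");

// The standard stream objects have exactly those types.
static_assert(std::is_same<decltype(std::cout), std::basic_ostream<char> >::value,
              "cout is a basic_ostream<char>");
static_assert(std::is_same<decltype(std::cin), std::basic_istream<char> >::value,
              "cin is a basic_istream<char>");

// The wide-character streams instantiate the same templates with wchar_t.
static_assert(std::is_same<std::wostream, std::basic_ostream<wchar_t> >::value,
              "wostream is basic_ostream<wchar_t>");
```

Code written in terms of ostream and cout therefore already uses the templates, whether or not it ever mentions them.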
Many components in the standard library are templatized much more than this suggests. Consider again the seemingly straightforward notion of a string. Sure, it can be parameterized based on the type of characters it holds, but different character sets differ in details, e.g., special end-of-file characters, most efficient way of copying arrays of them, etc. Such characteristics are known in the standard as traits, and they are specified for string instantiations by an additional template parameter. In addition, string objects are likely to perform dynamic memory allocation and deallocation, but there are lots of different ways to approach that task (see Item 10). Which is best? You get to choose: the string template takes an Allocator parameter, and objects of type Allocator are used to allocate and deallocate the memory used by string objects.
Here's a full-blown declaration for the basic_string template and the string typedef that builds on it; you can find this (or something equivalent to it) in the header <string>:
namespace std {

  template<class charT,
           class traits = char_traits<charT>,
           class Allocator = allocator<charT> >
  class basic_string;

  typedef basic_string<char> string;

}
Notice how basic_string has default values for its traits and Allocator parameters. This is typical of the standard library. It offers flexibility to those who need it, but "typical" clients who just want to do the "normal" thing can ignore the complexity that makes possible the flexibility. In other words, if you just want string objects that act more or less like C strings, you can use string objects and remain merrily ignorant of the fact that you're really using objects of type basic_string<char, char_traits<char>, allocator<char> >.
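If you'd like to confirm that string is nothing more than a name for a fully spelled-out basic_string instantiation, you can ask your compilers directly. The following is only a sketch, and it assumes compilers that support C++11's static_assert and <type_traits>. The helper lengthOf is a hypothetical name invented for this illustration; it is not part of the library.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <type_traits>

// All three spellings name the same type; these checks run at compile time.
static_assert(std::is_same<std::string, std::basic_string<char> >::value,
              "string is a typedef for basic_string<char>");
static_assert(std::is_same<std::string,
                           std::basic_string<char,
                                             std::char_traits<char>,
                                             std::allocator<char> > >::value,
              "the default arguments supply traits and Allocator");

// Hypothetical helper written against the full template. Callers pass a
// plain string (or wstring) and never mention traits or allocators.
template<class charT, class traits, class Allocator>
std::size_t lengthOf(const std::basic_string<charT, traits, Allocator>& s)
{
  return s.size();
}
```

Code that calls lengthOf never has to say anything about traits or allocators. The template machinery is there, but it stays out of sight.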
Well, usually you can. Sometimes you have to peek under the hood a bit. For example, Item 34 discusses the advantages of declaring a class without providing its definition, and it remarks that the following is the wrong way to declare the string type:
class string;                   // this will compile, but
                                // you don't want to do it
Setting aside namespace considerations for a moment, the real problem here is that string isn't a class, it's a typedef. It would be nice if you could solve the problem this way:
typedef basic_string<char> string;
but that won't compile. "What is this basic_string of which you speak?," your compilers will wonder, though they'll probably phrase the question rather differently. No, to declare string, you would first have to declare all the templates on which it depends. If you could do it, it would look something like this:
template<class charT> struct char_traits;
template<class T> class allocator;
template<class charT, class traits = char_traits<charT>, class Allocator = allocator<charT> > class basic_string;
typedef basic_string<char> string;
However, you can't declare string. At least you shouldn't. That's because library implementers are allowed to declare string (or anything else in the std namespace) differently from what's specified in the standard, as long as the result behaves the way the standard requires. For example, a basic_string implementation could add a fourth template parameter, but that parameter's default value would have to yield code that acts as the standard says an unadorned basic_string must.
End result? Don't try to manually declare string (or any other part of the standard library). Instead, just include the appropriate header, e.g., <string>.
With this background on headers and templates under our belts, we're in a position to survey the primary components of the standard C++ library:
stringstream classes) and internationalization (via locales; see below). Still, most everything you've come to expect from the iostream library continues to exist. In particular, it still supports stream buffers, formatters, manipulators, and files, plus the objects cin, cout, cerr, and clog. That means you can treat strings and files as streams, and you have extensive control over stream behavior, including buffering and formatting.
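Treating a string as a stream is easiest to appreciate in code. What follows is a small sketch rather than a definitive recipe. The functions describe and sumOf are hypothetical names chosen for this example; the stream classes they use come from the standard header <sstream>.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats its arguments by writing to a string-backed
// stream with the same << operators you would use on cout.
std::string describe(const std::string& name, int count)
{
  std::ostringstream out;
  out << name << " x" << count;
  return out.str();                 // the text accumulated so far
}

// Hypothetical helper: reads whitespace-separated ints from a string with
// the same >> operator you would use on cin. Reading stops at the first
// token that is not an int.
int sumOf(const std::string& text)
{
  std::istringstream in(text);
  int total = 0;
  int value = 0;
  while (in >> value) {
    total += value;
  }
  return total;
}
```

Because the formatting and parsing rules are those of the iostream library, anything you know about manipulators and stream state applies here, too.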
string objects were designed to eliminate the need to use char* pointers in most applications. They support the operations you'd expect (e.g., concatenation, constant-time access to individual characters via operator[], etc.), they're convertible to const char*s (via c_str) for compatibility with legacy code, and they handle memory management automatically. Some string implementations employ reference counting (see Item M29), which can lead to better performance (in both time and space) than char*-based strings.
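Here is a brief sketch of those operations in action. The functions greet and legacyLength are hypothetical, made up for this illustration; strlen stands in for whatever legacy char*-based interface you need to call.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: builds a greeting without any manual buffer handling.
std::string greet(const std::string& name)
{
  std::string result = "Hello, ";
  result += name;                   // concatenation; the string grows itself
  result[0] = 'h';                  // constant-time access via operator[]
  return result;
}

// Hypothetical helper: hands a string to a C-style function. c_str()
// yields a null-terminated const char* for exactly this purpose.
std::size_t legacyLength(const std::string& s)
{
  return std::strlen(s.c_str());
}
```

Note that the pointer returned by c_str is to const characters, so legacy functions that insist on a non-const char* need a copy they can modify.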
strings are containers. That's important, because it means anything you can do to a container (see below), you can also do to a string.
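To make that concrete, here is a sketch that applies two standard algorithms to strings. The wrapper names reversed and countChar are hypothetical; reverse and count are the genuine articles from <algorithm>.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: a string offers begin() and end() like any other
// container, so std::reverse works on it unchanged.
std::string reversed(std::string s)
{
  std::reverse(s.begin(), s.end());
  return s;
}

// Hypothetical helper: std::count neither knows nor cares that the
// sequence it walks happens to be a string.
std::ptrdiff_t countChar(const std::string& s, char c)
{
  return std::count(s.begin(), s.end(), c);
}
```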
What's that? You want to know how I know the library implementations are efficient? Easy: the library specifies each class's interface, and part of each interface specification is a set of performance guarantees. So, for example, no matter how vector is implemented, it's not enough to offer just access to its elements, it must offer constant-time access. If it doesn't, it's not a valid vector implementation.
In many C++ programs, dynamically allocated strings and arrays account for most uses of new and delete, and new/delete errors (especially leaks caused by failure to delete newed memory) are distressingly common. If you use string and vector objects (both of which perform their own memory management) instead of char*s and pointers to dynamically allocated arrays, many of your news and deletes will vanish, and so will the difficulties that frequently accompany their use (e.g., Items 6 and 11).
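The sketch below shows what code looks like once the news and deletes have vanished. Both squares and repeat are hypothetical helpers written for this example. Neither contains a single new or delete, yet both work with dynamically sized data.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds a table of the first n squares. The vector
// owns its storage, so there is no new[] here and no delete[] to forget.
std::vector<int> squares(std::size_t n)
{
  std::vector<int> table(n);
  for (std::size_t i = 0; i < n; ++i) {
    table[i] = static_cast<int>(i * i);   // constant-time element access
  }
  return table;                           // the caller's vector cleans up after itself
}

// Hypothetical helper: repeats a word without managing a char buffer by hand.
std::string repeat(const std::string& word, std::size_t times)
{
  std::string result;
  for (std::size_t i = 0; i < times; ++i) {
    result += word;                       // the string grows its buffer as needed
  }
  return result;
}
```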
Algorithms treat the contents of a container as a sequence, and each algorithm may be applied to either the sequence corresponding to all the values in a container or to a subsequence. Among the standard algorithms are for_each (apply a function to each element of a sequence), find (find the first location in a sequence holding a given value; Item M35 shows its implementation), count_if (count the number of elements in a sequence for which a given predicate is true), equal (determine whether two sequences hold equal-valued elements), search (find the first position in one sequence where a second sequence occurs as a subsequence), copy (copy one sequence into another), unique (remove adjacent duplicate values from a sequence), rotate (rotate the values in a sequence) and sort (sort the values in a sequence). Note that this is just a sampling of the algorithms available; the library contains many more.
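A few of these algorithms are sketched below. The functions isVowel, countVowels, sortedUnique, and containsRun are hypothetical names invented for the example; the algorithms they call are the standard ones. Two details are worth noticing. First, unique collapses only adjacent duplicates, so it is usually preceded by sort. Second, unique can't shrink a container by itself, so the leftover tail is removed with erase. As written, sortedUnique assumes a container with random access iterators, such as vector or string, because that's what sort requires.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical predicate for use with count_if.
bool isVowel(char c)
{
  return std::string("aeiou").find(c) != std::string::npos;
}

// count_if: how many elements satisfy a predicate?
std::ptrdiff_t countVowels(const std::string& s)
{
  return std::count_if(s.begin(), s.end(), isVowel);
}

// sort + unique: order the values, then drop the adjacent duplicates.
template<class Container>
Container sortedUnique(Container c)
{
  std::sort(c.begin(), c.end());
  c.erase(std::unique(c.begin(), c.end()), c.end());
  return c;
}

// search: does one sequence occur as a subsequence of another?
bool containsRun(const std::string& text, const std::string& run)
{
  return std::search(text.begin(), text.end(),
                     run.begin(), run.end()) != text.end();
}
```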
Just as container operations come with performance guarantees, so do algorithms. For example, the stable_sort algorithm is required to perform no more than O(N log N) comparisons when enough extra memory is available. (If the "Big O" notation in the previous sentence is foreign to you, don't sweat it. What it really means is that, broadly speaking, stable_sort must offer performance at the same level as the most efficient general-purpose serial sorting algorithms.)
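Performance isn't the only thing stable_sort promises. It also preserves the relative order of elements that compare equivalent, and a short sketch makes that visible. The names lessIgnoringCase and stableSortIgnoringCase are hypothetical, made up for this example.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical comparison: orders characters without regard to case, so
// 'a' and 'A' are considered equivalent.
bool lessIgnoringCase(char a, char b)
{
  return std::tolower(static_cast<unsigned char>(a)) <
         std::tolower(static_cast<unsigned char>(b));
}

// Hypothetical helper: equivalent characters keep the relative order they
// had in the input, which is what distinguishes stable_sort from sort.
std::string stableSortIgnoringCase(std::string s)
{
  std::stable_sort(s.begin(), s.end(), lessIgnoringCase);
  return s;
}
```

Given "bBaA", the result is "aAbB": the lowercase letter of each pair came first in the input, so it comes first in the output. Plain sort makes no such promise.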
The primary library components supporting internationalization are facets and locales. Facets describe how particular characteristics of a culture should be handled, including collation rules (i.e., how strings in the local character set should be sorted), how dates and times should be expressed, how numeric and monetary values should be presented, how to map from message identifiers to (natural) language-specific messages, etc. Locales bundle together sets of facets. For example, a locale for the United States would include facets describing how to sort strings in American English, read and write dates and times, read and write monetary and numeric values, etc., in a way appropriate for people in the USA. A locale for France, on the other hand, would describe how to perform these tasks in a manner to which the French are accustomed. C++ allows multiple locales to be active within a single program, so different parts of an application may employ different conventions.
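Named locales such as one for France depend on what your platform has installed, so the sketch below sidesteps them. It builds a locale by hand from the classic "C" locale plus one custom facet. The class CommaDecimal and the functions commaLocale and formatWith are hypothetical names for this example; numpunct, locale, and imbue are standard.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: numbers are written with a comma as the decimal point.
class CommaDecimal : public std::numpunct<char> {
protected:
  virtual char do_decimal_point() const { return ','; }
};

// A locale bundles facets. This one is the classic "C" locale with its
// numeric punctuation facet replaced. The locale takes ownership of the
// facet object and destroys it when no longer needed.
std::locale commaLocale()
{
  return std::locale(std::locale::classic(), new CommaDecimal);
}

// Hypothetical helper: formats a value according to the given locale.
std::string formatWith(const std::locale& loc, double value)
{
  std::ostringstream out;
  out.imbue(loc);                   // only this stream uses loc
  out << value;
  return out.str();
}
```

Because each stream can be imbued with its own locale, one part of a program can write 3.5 while another writes 3,5.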
float, double, or long double) as well as for special array types specifically designed to facilitate numeric programming. Objects of type valarray, for example, are defined to hold elements that are free from aliasing. This allows compilers to be much more aggressive in their optimizations, especially for vector machines. The library also offers support for two different types of array slices, as well as providing algorithms to compute inner products, partial sums, adjacent differences, and more.
Exceptions of type logic_error (or its subclasses) represent errors in the logic of software. In theory, such errors could have been prevented by more careful programming. Exceptions of type runtime_error (or its derived classes) represent errors detectable only at runtime.
You may use these classes as is, you may inherit from them to create your own exception classes, or you may ignore them. Their use is not mandatory.
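The sketch below shows both sides of the hierarchy. ConfigError is a hypothetical exception class derived from runtime_error, and classify is a hypothetical function that exists only to provoke and catch the exceptions. The out_of_range exception thrown by vector's at member is a standard class derived from logic_error.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical exception class built on the standard hierarchy.
class ConfigError : public std::runtime_error {
public:
  explicit ConfigError(const std::string& what) : std::runtime_error(what) {}
};

// Hypothetical driver: reports which branch of the hierarchy was caught.
std::string classify(int scenario)
{
  try {
    if (scenario == 0) {
      std::vector<int> v(3);
      v.at(10) = 1;                 // bad index: throws std::out_of_range
    }
    if (scenario == 1) {
      throw ConfigError("missing file");
    }
    return "no error";
  }
  catch (const std::logic_error&) {
    return "logic_error";           // out_of_range is a logic_error
  }
  catch (const std::runtime_error&) {
    return "runtime_error";         // ConfigError is a runtime_error
  }
}
```

Catching by reference to the base class is what lets clients handle ConfigError without ever having heard of it.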
This list doesn't describe everything in the standard library. Remember, the specification runs over 300 pages. Still, it should give you the basic lay of the land.
The part of the library pertaining to containers and algorithms is commonly known as the Standard Template Library (the STL; see Item M35). There is actually a third component to the STL, iterators, that I haven't described. Iterators are pointer-like objects that allow STL algorithms and containers to work together. You need not understand iterators for the high-level description of the standard library I give here. If you're interested in them, however, you can find examples of their use in Items 39 and M35.
The STL is the most revolutionary part of the standard library, not because of the containers and algorithms it offers (though they are undeniably useful), but because of its architecture. Simply put, the architecture is extensible: you can add to the STL. Of course, the components of the standard library itself are fixed, but if you follow the conventions on which the STL is built, you can write your own containers, algorithms, and iterators that work as well with the standard STL components as the STL components work with one another. You can also take advantage of STL-compliant containers, algorithms, and iterators written by others, just as they can take advantage of yours. What makes the STL revolutionary is that it's not really software, it's a set of conventions. The STL components in the standard library are simply manifestations of the good that can come from following those conventions.
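To see what following the conventions buys you, consider this sketch of a home-grown algorithm. isPalindrome is a hypothetical algorithm, not part of the library. It sticks to the STL's conventions by taking a half-open range [first, last) of bidirectional iterators and by using only the operations such iterators support. The readsSameBothWays overloads are hypothetical wrappers that exist to show it working with two unrelated standard containers.

```cpp
#include <list>
#include <string>

// Hypothetical algorithm written to STL conventions: it knows nothing
// about containers, only about the iterators delimiting [first, last).
template<class BidirectionalIterator>
bool isPalindrome(BidirectionalIterator first, BidirectionalIterator last)
{
  while (first != last) {
    --last;                              // last now refers to a real element
    if (first == last) break;            // met in the middle (odd length)
    if (!(*first == *last)) return false;
    ++first;
  }
  return true;
}

// The same algorithm applied to a string...
bool readsSameBothWays(const std::string& s)
{
  return isPalindrome(s.begin(), s.end());
}

// ...and to a list, with no changes to the algorithm itself.
bool readsSameBothWays(const std::list<int>& values)
{
  return isPalindrome(values.begin(), values.end());
}
```

Pointers are iterators, too, so isPalindrome works equally well on built-in arrays.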
By using the components in the standard library, you can generally dispense with designing your own from-the-ground-up mechanisms for stream I/O, strings, containers (including iteration and common manipulations), internationalization, numeric data structures, and diagnostics. That leaves you a lot more time and energy for the really important part of software development: implementing the things that distinguish your wares from those of your competitors.
Item 50: Improve your understanding of C++.
There's a lot of stuff in C++. C stuff. Overloading stuff. Object-oriented stuff. Template stuff. Exception stuff. Namespace stuff. Stuff, stuff, stuff! Sometimes it can be overwhelming. How do you make sense of all that stuff?
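It's not that hard once you understand the design goals that forged C++ into what it is. Foremost amongst those goals are compatibility with C, efficiency, compatibility with existing development tools and environments, and applicability to the needs of real programmers.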
These goals explain a multitude of language details that might otherwise merely chafe. Why do implicitly-generated copy constructors and assignment operators behave the way they do, especially for pointers (see Items 11 and 45)? Because that's how C copies and assigns structs, and compatibility with C is important. Why aren't destructors automatically virtual (see Item 14), and why must implementation details appear in class definitions (see Item 34)? Because doing otherwise would impose a performance penalty, and efficiency is important. Why can't C++ detect initialization dependencies between non-local static objects (see Item 47)? Because C++ supports separate translation (i.e., the ability to compile source modules separately, then link several object files together to form an executable), relies on existing linkers, and doesn't mandate the existence of program databases. As a result, C++ compilers almost never know everything about an entire program. Finally, why doesn't C++ free programmers from tiresome duties like memory management (see Items 5-10) and low-level pointer manipulations? Because some programmers need those capabilities, and the needs of real programmers are of paramount importance.
This barely hints at how the design goals behind C++ shape the behavior of the language. To cover everything would take an entire book, so it's convenient that Stroustrup wrote one. That book is The Design and Evolution of C++, or D&E for short. Among other things, it recounts how the dynamic_cast feature (see Items 39 and M2) was considered, rejected, reconsidered, then accepted, and why. If you're having trouble making sense of C++, D&E should dispel much of your confusion.
The Design and Evolution of C++ offers a wealth of insights into how C++ came to be what it is, but it's nothing like a formal specification for the language. For that you must turn to the standard for C++, which is filled with formal prose such as this:
A virtual function call uses the default arguments in the declaration of the virtual function determined by the static type of the pointer or reference denoting the object. An overriding function in a derived class does not acquire default arguments from the function it overrides.
This paragraph is the basis for Item 38 ("Never redefine an inherited default parameter value"), but I hope my treatment of the topic is somewhat more accessible than the text above.
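In case the standard's wording leaves you cold, here is a sketch of the behavior it describes. Shape, Circle, drawThroughBase, and drawDirectly are hypothetical names invented for this illustration. Circle deliberately commits the sin Item 38 warns against, so you can see the consequences.

```cpp
#include <string>

class Shape {
public:
  virtual ~Shape() {}
  virtual std::string draw(const std::string& color = "red") const
  { return "Shape:" + color; }
};

class Circle : public Shape {
public:
  // Redefines the inherited default parameter value. Don't do this.
  virtual std::string draw(const std::string& color = "green") const
  { return "Circle:" + color; }
};

// The static type here is Shape, so Shape's default ("red") is used,
// even though the function that runs is chosen by the dynamic type.
std::string drawThroughBase(const Shape& s) { return s.draw(); }

// The static type here is Circle, so Circle's default ("green") is used.
std::string drawDirectly(const Circle& c) { return c.draw(); }
```

Pass a Circle to drawThroughBase and you get Circle's function body combined with Shape's default argument, i.e., "Circle:red". The same object passed to drawDirectly yields "Circle:green".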
The standard is hardly bedtime reading, but it's your best recourse, your standard recourse, if you and someone else (a compiler vendor, say, or a developer of some other tool that processes source code) disagree on what is and isn't C++. The whole purpose of a standard is to provide definitive information that settles arguments like this.
The standard's official title is a mouthful, but if you need to know it, you need to know it. Here it is: International Standard for Information Systems - Programming Language C++. It's published by Working Group 21 of the
As I said, The Design and Evolution of C++ is fine for insights into the language's design, and the standard is great for nailing down language details, but it would be nice if there were a comfortable middle ground between D&E's view from 10,000 meters and the standard's micron-level examination. Textbooks are supposed to fill this niche, but they generally drift toward the standard's perspective, whereby what the language is receives a lot more attention than why it's that way.
Enter the ARM. The ARM is another book,
What makes the ARM really useful, however, isn't the RM part (the Reference Manual), it's the A part: the annotations. The ARM provides extensive commentary on why many features of C++ behave the way they do. Some of this information is in D&E, but much of it isn't, and you do want to know it. For instance, here's something that drives most people crazy when they first encounter it:
class Base {
public:
  virtual void f(int x);
};

class Derived: public Base {
public:
  virtual void f(double *pd);
};

Derived *pd = new Derived;
pd->f(10);                         // error!
The problem is that Derived::f hides Base::f, even though they take different parameter types, so compilers demand that the call to f take a double*, which the literal 10 most certainly is not.
This is inconvenient, but the ARM provides an explanation for this behavior. Suppose that when you called f, you really did want to call the version in Derived, but you accidentally used the wrong parameter type. Further suppose that Derived is way down in an inheritance hierarchy and that you were unaware that Derived indirectly inherits from some base class BaseClass, and that BaseClass declares a virtual function f that takes an int. In that case, you would have inadvertently called BaseClass::f, a function you didn't even know existed! This kind of error could occur frequently where large class hierarchies are used, so Stroustrup decided to nip it in the bud by having derived class members hide base class members on a per-name basis.
Note, by the way, that if the writer of Derived wants to allow clients to access Base::f, this is easily accomplished via a simple using declaration:
class Derived: public Base {
public:
  using Base::f;                   // import Base::f into
                                   // Derived's scope
  virtual void f(double *pd);
};

Derived *pd = new Derived;
pd->f(10);                         // fine, calls Base::f
For compilers not yet supporting using declarations, an alternative is to employ an inline function:
class Derived: public Base {
public:
  virtual void f(int x) { Base::f(x); }
  virtual void f(double *pd);
};

Derived *pd = new Derived;
pd->f(10);                         // fine, calls Derived::f(int),
                                   // which calls Base::f(int)
Between D&E and the ARM, you'll gain insights into the design and implementation of C++ that make it possible to appreciate the sound, no-nonsense architecture behind a sometimes baroque-looking facade. Fortify those insights with the detailed information in the standard, and you've got a foundation for software development that leads to truly effective C++.