Effective C++, 2E | Item 49: Familiarize yourself with the standard library


Item 49: Familiarize yourself with the standard library.

C++'s standard library is big. Very big. Incredibly big. How big? Let me put it this way: the specification takes over 300 closely-packed pages in the C++ standard, and that all but excludes the standard C library, which is included in the C++ library "by reference." (That's the term they use, honest.)

Bigger isn't always better, of course, but in this case, bigger is better, because a big library contains lots of functionality. The more functionality in the standard library, the more functionality you can lean on as you develop your applications. The C++ library doesn't offer everything (support for concurrency and for graphical user interfaces is notably absent), but it does offer a lot. You can lean almost anything against it.

Before summarizing what's in the library, I need to tell you a bit about how it's organized. Because the library has so much in it, there's a reasonable chance you (or someone like you) may choose a class or function name that's the same as a name in the standard library. To shield you from the name conflicts that would result, virtually everything in the standard library is nestled in the namespace std (see Item 28). But that leads to a new problem. Gazillions of lines of existing C++ rely on functionality in the pseudo-standard library that's been in use for years, e.g., functionality declared in the headers <iostream.h>, <complex.h>, <limits.h>, etc. That existing software isn't designed to use namespaces, and it would be a shame if wrapping the standard library by std caused the existing code to break. (Authors of the broken code would likely use somewhat harsher language than "shame" to describe their feelings about having the library rug pulled out from underneath them.)

Mindful of the destructive power of rioting bands of incensed programmers, the standardization committee decided to create new header names for the std-wrapped components. The algorithm they chose for generating the new header names is as trivial as the results it produces are jarring: the .h on the existing C++ headers was simply dropped. So <iostream.h> became <iostream>, <complex.h> became <complex>, etc. For C headers, the same algorithm was applied, but a c was prepended to each result. Hence C's <string.h> became <cstring>, <stdio.h> became <cstdio>, etc. For a final twist, the old C++ headers were left out of the standard entirely, but the old C headers were kept (to maintain C compatibility), albeit marked as deprecated, i.e., still supported but candidates for removal in some future revision. In practice, compiler vendors have no incentive to disavow their customers' legacy software, so you can expect the old C++ headers to be supported for many years.

Practically speaking, then, this is the C++ header situation:

Old C++ header names like <iostream.h> are likely to continue to be supported, even though they aren't in the official standard. The contents of such headers are not in namespace std.
New C++ header names like <iostream> contain the same basic functionality as the corresponding old headers, but the contents of the headers are in namespace std. (During standardization, the details of some of the library components were modified, so there isn't necessarily an exact match between the entities in an old C++ header and those in a new one.)
Standard C headers like <stdio.h> continue to be supported. The contents of such headers are not in std.
New C++ headers for the functionality in the C library have names like <cstdio>. They offer the same contents as the corresponding old C headers, but the contents are in std.

All this seems a little weird at first, but it's really not that hard to get used to. The biggest challenge is keeping all the string headers straight: <string.h> is the old C header for char*-based string manipulation functions, <string> is the std-wrapped C++ header for the new string classes (see below), and <cstring> is the std-wrapped version of the old C header. If you can master that (and I know you can), the rest of the library is easy.
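To make the three string headers concrete, here is a minimal sketch. The function names c_length and cpp_length are hypothetical, invented only for this illustration; the library calls themselves (std::strlen from <cstring> and std::string::size from <string>) are standard.

```cpp
#include <cstring>   // std-wrapped C header: std::strlen, std::strcmp, etc.
#include <string>    // C++ header: the std::string class

// Hypothetical helper: length of a char*-based string, computed with the
// std-qualified version of C's strlen.
std::size_t c_length(const char* s)
{
  return std::strlen(s);
}

// Hypothetical helper: length of the same text held in a std::string.
std::size_t cpp_length(const std::string& s)
{
  return s.size();
}
```

Had the code included <string.h> instead of <cstring>, the call would typically be written as plain strlen, because the old C header is not required to put its names in std.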

The next thing you need to know about the standard library is that almost everything in it is a template. Consider your old friend iostreams. (If you and iostreams aren't friends, turn to Item 2 to find out why you should cultivate a relationship.) Iostreams help you manipulate streams of characters, but what's a character? Is it a char? A wchar_t? A Unicode character? Some other multi-byte character? There's no obviously right answer, so the library lets you choose. All the stream classes are really class templates, and you specify the character type when you instantiate a stream class. For example, the standard library defines the type of cout to be ostream, but ostream is really a typedef for basic_ostream<char>.

Similar considerations apply to most of the other classes in the standard library. string isn't a class, it's a class template: a type parameter defines the type of characters in each string class. complex isn't a class, it's a class template: a type parameter defines the type of the real and imaginary components in each complex class. vector isn't a class, it's a class template. On and on it goes.

You can't escape the templates in the standard library, but if you're used to working with only streams and strings of chars, you can mostly ignore them. That's because the library defines typedefs for char instantiations for these components of the library, thus letting you continue to program in terms of the objects cin, cout, cerr, etc., and the types istream, ostream, string, etc., without having to worry about the fact that cin's real type is basic_istream<char> and string's is basic_string<char>.
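If you'd like to see the typedefs at work, the following sketch may help. The same_type template and the functions greet and typedefs_match are hypothetical helpers written for this illustration; they are not part of the library. The point is only that the short names and the basic_ names refer to exactly the same types.

```cpp
#include <iostream>
#include <string>

// Hypothetical helper (not a library component): value is true only when
// both template arguments name exactly the same type.
template<class T, class U>
struct same_type { static const bool value = false; };

template<class T>
struct same_type<T, T> { static const bool value = true; };

// Declared in terms of the "real" template names, yet callable with
// std::cout and with ordinary std::string objects.
std::basic_ostream<char>& greet(std::basic_ostream<char>& out,
                                const std::basic_string<char>& name)
{
  return out << "Hello, " << name << '\n';
}

// True if the familiar names are typedefs for the char instantiations.
bool typedefs_match()
{
  return same_type<std::string,  std::basic_string<char> >::value
      && same_type<std::ostream, std::basic_ostream<char> >::value
      && same_type<std::istream, std::basic_istream<char> >::value;
}
```

A call such as greet(std::cout, "world") compiles because std::cout is an ostream, and ostream is just another name for basic_ostream<char>.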

Many components in the standard library are templatized much more than this suggests. Consider again the seemingly straightforward notion of a string. Sure, it can be parameterized based on the type of characters it holds, but different character sets differ in details, e.g., special end-of-file characters, most efficient way of copying arrays of them, etc. Such characteristics are known in the standard as traits, and they are specified for string instantiations by an additional template parameter. In addition, string objects are likely to perform dynamic memory allocation and deallocation, but there are lots of different ways to approach that task (see Item 10). Which is best? You get to choose: the string template takes an Allocator parameter, and objects of type Allocator are used to allocate and deallocate the memory used by string objects.

Here's a full-blown declaration for the basic_string template and the string typedef that builds on it; you can find this (or something equivalent to it) in the header <string>:

namespace std {

  template<class charT,
           class traits = char_traits<charT>,
           class Allocator = allocator<charT> >
     class basic_string;

  typedef basic_string<char> string;

}

Notice how basic_string has default values for its traits and Allocator parameters. This is typical of the standard library. It offers flexibility to those who need it, but "typical" clients who just want to do the "normal" thing can ignore the complexity that makes possible the flexibility. In other words, if you just want string objects that act more or less like C strings, you can use string objects and remain merrily ignorant of the fact that you're really using objects of type basic_string<char, char_traits<char>, allocator<char> >.
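For those who do need the flexibility, here is a sketch of what supplying your own traits might look like. The names ci_char_traits, ci_string, and same_ignoring_case are hypothetical; the technique (deriving from char_traits<char> and replacing its comparison functions) is a well-known way to get a case-insensitive string, but treat this as an illustration under those assumptions rather than a production-quality class.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical traits class: behaves like char_traits<char>, except that
// every comparison ignores case.
struct ci_char_traits : public std::char_traits<char> {
  static char up(char c)
  {
    return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
  }

  static bool eq(char a, char b) { return up(a) == up(b); }
  static bool lt(char a, char b) { return up(a) <  up(b); }

  // Three-way comparison of the first n characters of a and b.
  static int compare(const char* a, const char* b, std::size_t n)
  {
    for (std::size_t i = 0; i < n; ++i) {
      if (lt(a[i], b[i])) return -1;
      if (lt(b[i], a[i])) return 1;
    }
    return 0;
  }

  // Pointer to the first character among s[0..n) equal to c, or null.
  static const char* find(const char* s, std::size_t n, char c)
  {
    for (std::size_t i = 0; i < n; ++i)
      if (eq(s[i], c)) return s + i;
    return 0;
  }
};

// Same template as std::string, different traits argument.
typedef std::basic_string<char, ci_char_traits> ci_string;

bool same_ignoring_case(const char* a, const char* b)
{
  return ci_string(a) == ci_string(b);
}
```

Note that ci_string and std::string are different types, so they don't convert to one another implicitly. That is the price of choosing a non-default traits parameter.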

Well, usually you can. Sometimes you have to peek under the hood a bit. For example, Item 34 discusses the advantages of declaring a class without providing its definition, and it remarks that the following is the wrong way to declare the string type:

class string;                   // this will compile, but
                                // you don't want to do it

Setting aside namespace considerations for a moment, the real problem here is that string isn't a class, it's a typedef. It would be nice if you could solve the problem this way:

typedef basic_string<char> string;

but that won't compile. "What is this basic_string of which you speak?," your compilers will wonder, though they'll probably phrase the question rather differently. No, to declare string, you would first have to declare all the templates on which it depends. If you could do it, it would look something like this:

template<class charT> struct char_traits;

template<class T> class allocator;

template<class charT,
         class traits = char_traits<charT>,
         class Allocator = allocator<charT> >
   class basic_string;

typedef basic_string<char> string;

However, you can't declare string. At least you shouldn't. That's because library implementers are allowed to declare string (or anything else in the std namespace) differently from what's specified in the standard as long as the result offers standard-conforming behavior. For example, a basic_string implementation could add a fourth template parameter, but that parameter's default value would have to yield code that acts as the standard says an unadorned basic_string must.

End result? Don't try to manually declare string (or any other part of the standard library). Instead, just include the appropriate header, e.g. <string>.
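As a sketch of what "include the appropriate header" can look like in practice, consider a small class whose interface mentions both string and ostream. The Person class and the describe function are hypothetical. For the stream classes, the library supplies the header <iosfwd>, which contains forward declarations only, so a header that merely mentions ostream in a function signature need not pull in all of <iostream>.

```cpp
// ---- what might go in a header file ----
#include <iosfwd>    // forward declarations for the iostream classes
#include <string>    // full definition of std::string (needed for a member)

class Person {       // hypothetical class, for illustration only
public:
  explicit Person(const std::string& name) : name_(name) {}
  const std::string& name() const { return name_; }
private:
  std::string name_;
};

// Only a reference to ostream appears here, so <iosfwd> is enough.
std::ostream& operator<<(std::ostream& out, const Person& p);

// ---- what might go in the implementation file ----
#include <ostream>   // now the definition of std::ostream is required
#include <sstream>

std::ostream& operator<<(std::ostream& out, const Person& p)
{
  return out << "Person(" << p.name() << ")";
}

// Hypothetical helper: render a Person as text via the inserter above.
std::string describe(const Person& p)
{
  std::ostringstream s;
  s << p;
  return s.str();
}
```

At no point does the code declare string, ostream, or basic_string itself; the library's own headers do all the declaring.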

With this background on headers and templates under our belts, we're in a position to survey the primary components of the standard C++ library:

The standard C library. It's still there, and you can still use it. A few minor things have been tweaked here and there, but for all intents and purposes, it's the same C library that's been around for years.
Iostreams. Compared to "traditional" iostream implementations, it's been templatized, its inheritance hierarchy has been modified, it's been augmented with the ability to throw exceptions, and it's been updated to support strings (via the stringstream classes) and internationalization (via locales — see below). Still, most everything you've come to expect from the iostream library continues to exist. In particular, it still supports stream buffers, formatters, manipulators, and files, plus the objects cin, cout, cerr, and clog. That means you can treat strings and files as streams, and you have extensive control over stream behavior, including buffering and formatting.
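Here is a small sketch of treating strings as streams. The function names format_price and sum_of_ints are hypothetical; ostringstream, istringstream, and the manipulators fixed and setprecision are standard.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: format a value into a string using the ordinary
// stream inserters and manipulators.
std::string format_price(double amount)
{
  std::ostringstream out;
  out << "Total: " << std::fixed << std::setprecision(2) << amount;
  return out.str();
}

// Hypothetical helper: read whitespace-separated integers out of a string
// and add them up; reading stops at the first thing that isn't an int.
int sum_of_ints(const std::string& text)
{
  std::istringstream in(text);
  int total = 0;
  int value;
  while (in >> value) total += value;
  return total;
}
```

Everything you know about formatting output to cout or reading input from cin applies unchanged; only the source or destination of the characters differs.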
Strings. string objects were designed to eliminate the need to use char* pointers in most applications. They support the operations you'd expect (e.g., concatenation, constant-time access to individual characters via operator[], etc.), they can be converted to const char*s (via the c_str member function) for compatibility with legacy code, and they handle memory management automatically. Some string implementations employ reference counting (see Item M29), which can lead to better performance (in both time and space) than char*-based strings.
Containers. Stop writing your own basic container classes! The library offers efficient implementations of vectors (they act like dynamically extensible arrays), lists (doubly-linked), queues, stacks, deques, maps, sets, and bitsets. Alas, there are no hash tables in the library (though many vendors offer them as extensions), but compensating somewhat is the fact that strings are containers. That's important, because it means anything you can do to a container (see below), you can also do to a string.
What's that? You want to know how I know the library implementations are efficient? Easy: the library specifies each class's interface, and part of each interface specification is a set of performance guarantees. So, for example, no matter how vector is implemented, it's not enough to offer just access to its elements, it must offer constant-time access. If it doesn't, it's not a valid vector implementation.

In many C++ programs, dynamically allocated strings and arrays account for most uses of new and delete, and new/delete errors — especially leaks caused by failure to delete newed memory — are distressingly common. If you use string and vector objects (both of which perform their own memory management) instead of char*s and pointers to dynamically allocated arrays, many of your news and deletes will vanish, and so will the difficulties that frequently accompany their use (e.g., Items 6 and 11).
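To see how the news and deletes vanish, here is a sketch of two routines that would traditionally involve manually allocated buffers. The function names join and squares are hypothetical; the code contains no explicit memory management because string and vector do it themselves.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: glue the parts together with sep between them.
// The result string grows as needed; there is no buffer to size or free.
std::string join(const std::vector<std::string>& parts,
                 const std::string& sep)
{
  std::string result;
  for (std::vector<std::string>::size_type i = 0; i < parts.size(); ++i) {
    if (i != 0) result += sep;
    result += parts[i];
  }
  return result;
}

// Hypothetical helper: the squares of 0 through n-1. The vector acts like
// a dynamically extensible array and releases its memory in its destructor.
std::vector<int> squares(int n)
{
  std::vector<int> v;
  for (int i = 0; i < n; ++i) v.push_back(i * i);
  return v;
}
```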
Algorithms. Having standard containers is nice, but it's even nicer when there's an easy way to do things with them. The standard library offers over two dozen easy ways (i.e., predefined functions, officially known as algorithms — they're really function templates), most of which work with all the containers in the library — as well as with built-in arrays!
Algorithms treat the contents of a container as a sequence, and each algorithm may be applied to either the sequence corresponding to all the values in a container or to a subsequence. Among the standard algorithms are for_each (apply a function to each element of a sequence), find (find the first location in a sequence holding a given value; Item M35 shows its implementation), count_if (count the number of elements in a sequence for which a given predicate is true), equal (determine whether two sequences hold equal-valued elements), search (find the first position in one sequence where a second sequence occurs as a subsequence), copy (copy one sequence into another), unique (remove adjacent duplicate values from a sequence), rotate (rotate the values in a sequence) and sort (sort the values in a sequence). Note that this is just a sampling of the algorithms available; the library contains many others.

Just as container operations come with performance guarantees, so do algorithms. For example, the stable_sort algorithm is required to perform no more than O(N log N) comparisons when enough extra memory is available. (If the "Big O" notation in the previous sentence is foreign to you, don't sweat it. What it really means is that, broadly speaking, stable_sort must offer performance at the same level as the most efficient general-purpose serial sorting algorithms.)
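A few of the algorithms named above can be sketched in action as follows. The wrapper functions is_even, count_even, sorted_unique, and contains_char are hypothetical; count_if, sort, unique, and find are the standard algorithms. Notice that one call works on a built-in array, one on a vector, and one on a string.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

bool is_even(int x) { return x % 2 == 0; }

// count_if applied to a built-in array: pointers serve as the iterators.
std::ptrdiff_t count_even(const int* first, const int* last)
{
  return std::count_if(first, last, is_even);
}

// sort brings equal values together; unique then shifts the adjacent
// duplicates to the end and returns where they begin; erase drops them.
std::vector<int> sorted_unique(std::vector<int> v)
{
  std::sort(v.begin(), v.end());
  v.erase(std::unique(v.begin(), v.end()), v.end());
  return v;
}

// Strings are containers, so find works on them, too.
bool contains_char(const std::string& s, char c)
{
  return std::find(s.begin(), s.end(), c) != s.end();
}
```

The sort-before-unique step matters: unique considers only neighboring elements, so duplicates that aren't adjacent would survive without it.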
Support for internationalization. Different cultures do things in different ways. Like the C library, the C++ library offers features to facilitate the production of internationalized software, but the C++ approach, though conceptually akin to that of C, is different. It should not surprise you, for example, to learn that C++'s support for internationalization makes extensive use of templates, and it takes advantage of inheritance and virtual functions, too.
The primary library components supporting internationalization are facets and locales. Facets describe how particular characteristics of a culture should be handled, including collation rules (i.e., how strings in the local character set should be sorted), how dates and times should be expressed, how numeric and monetary values should be presented, how to map from message identifiers to (natural) language-specific messages, etc. Locales bundle together sets of facets. For example, a locale for the United States would include facets describing how to sort strings in American English, read and write dates and times, read and write monetary and numeric values, etc., in a way appropriate for people in the USA. A locale for France, on the other hand, would describe how to perform these tasks in a manner to which the French are accustomed. C++ allows multiple locales to be active within a single program, so different parts of an application may employ different conventions.
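To give a feel for how facets and locales fit together, here is a minimal sketch that changes how one stream presents numbers. The class comma_grouping and the functions with_commas and plain are hypothetical. The approach assumed here is to derive from the standard facet numpunct<char>, override two of its virtual functions, and install the result in a copy of the stream's locale. That avoids depending on which named locales a particular platform happens to provide.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: group digits in threes, separated by commas.
class comma_grouping : public std::numpunct<char> {
protected:
  virtual char do_thousands_sep() const { return ','; }
  virtual std::string do_grouping() const { return "\3"; }
};

// Format value on a stream whose locale uses the facet above.
std::string with_commas(long value)
{
  std::ostringstream out;
  // Build a locale identical to the stream's current one except for the
  // numeric punctuation facet. The locale takes over responsibility for
  // deleting the facet object.
  out.imbue(std::locale(out.getloc(), new comma_grouping));
  out << value;
  return out.str();
}

// For comparison: the same value on a stream left in its default locale.
std::string plain(long value)
{
  std::ostringstream out;
  out << value;
  return out.str();
}
```

Because each stream carries its own locale, with_commas and plain can coexist in one program, which is the "multiple locales active at once" capability described above.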
Support for numeric processing. The end for FORTRAN may finally be near. The C++ library offers a template for complex number classes (the precision of the real and imaginary parts may be float, double, or long double) as well as for special array types specifically designed to facilitate numeric programming. Objects of type valarray, for example, are defined to hold elements that are free from aliasing. This allows compilers to be much more aggressive in their optimizations, especially for vector machines. The library also offers support for two different types of array slices, as well as providing algorithms to compute inner products, partial sums, adjacent differences, and more.
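A sketch of the numeric facilities follows. The function names dot, running_totals, sum_of_scaled, and i_squared are hypothetical; inner_product and partial_sum come from <numeric>, and complex and valarray from the headers of the same names.

```cpp
#include <complex>
#include <numeric>
#include <valarray>
#include <vector>

// Inner (dot) product of two equally long sequences.
double dot(const std::vector<double>& a, const std::vector<double>& b)
{
  return std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
}

// Element i of the result is the sum of v[0] through v[i].
std::vector<int> running_totals(const std::vector<int>& v)
{
  std::vector<int> out(v.size());
  std::partial_sum(v.begin(), v.end(), out.begin());
  return out;
}

// valarray supports whole-array arithmetic: every element is multiplied
// by factor in a single expression, then the elements are summed.
double sum_of_scaled(const std::valarray<double>& v, double factor)
{
  std::valarray<double> scaled = v * factor;
  return scaled.sum();
}

// complex<double> is one instantiation of the complex template.
std::complex<double> i_squared()
{
  const std::complex<double> i(0.0, 1.0);
  return i * i;
}
```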
Diagnostic support. The standard library offers support for three ways to report errors: via C's assertions (see Item 7), via error numbers, and via exceptions. To help provide some structure to exception types, the library defines the following hierarchy of exception classes (each class is indented under the class from which it inherits):

exception
   logic_error
      domain_error
      invalid_argument
      length_error
      out_of_range
   runtime_error
      range_error
      overflow_error
      underflow_error

Exceptions of type logic_error (or its derived classes) represent errors in the logic of software. In theory, such errors could have been prevented by more careful programming. Exceptions of type runtime_error (or its derived classes) represent errors detectable only at runtime.

You may use these classes as is, you may inherit from them to create your own exception classes, or you may ignore them. Their use is not mandatory.
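Here is a sketch of both options: using the standard exception classes as is and inheriting from them. The class ConfigError and the function classify are hypothetical. The standard pieces are the headers, the classes out_of_range, invalid_argument, logic_error, and runtime_error, and the fact that vector's at member function throws out_of_range for a bad index.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical application-specific exception, derived from runtime_error.
class ConfigError : public std::runtime_error {
public:
  explicit ConfigError(const std::string& what) : std::runtime_error(what) {}
};

// Provoke a different kind of error for each value of which, then report
// which handler caught it.
std::string classify(int which)
{
  try {
    if (which == 0) {
      std::vector<int> v(3);
      v.at(10) = 1;                        // throws std::out_of_range
    }
    if (which == 1) throw ConfigError("missing setting");
    if (which == 2) throw std::invalid_argument("bad value");
    return "no error";
  }
  catch (const std::out_of_range&) {       // most derived class first
    return "out_of_range";
  }
  catch (const std::logic_error&) {        // any other logic_error
    return "logic_error";
  }
  catch (const std::runtime_error& e) {    // catches ConfigError, too
    return std::string("runtime_error: ") + e.what();
  }
}
```

The handler for out_of_range must precede the one for logic_error; in the opposite order, the base class handler would catch everything and the more specific one would never run.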

This list doesn't describe everything in the standard library. Remember, the specification runs over 300 pages. Still, it should give you the basic lay of the land.

The part of the library pertaining to containers and algorithms is commonly known as the Standard Template Library (the STL; see Item M35). There is actually a third component to the STL, iterators, that I haven't described. Iterators are pointer-like objects that allow STL algorithms and containers to work together. You need not understand iterators for the high-level description of the standard library I give here. If you're interested in them, however, you can find examples of their use in Items 39 and M35.

The STL is the most revolutionary part of the standard library, not because of the containers and algorithms it offers (though they are undeniably useful), but because of its architecture. Simply put, the architecture is extensible: you can add to the STL. Of course, the components of the standard library itself are fixed, but if you follow the conventions on which the STL is built, you can write your own containers, algorithms, and iterators that work as well with the standard STL components as the STL components work with one another. You can also take advantage of STL-compliant containers, algorithms, and iterators written by others, just as they can take advantage of yours. What makes the STL revolutionary is that it's not really software, it's a set of conventions. The STL components in the standard library are simply manifestations of the good that can come from following those conventions.
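To illustrate what "following the conventions" can mean, here is a sketch of a home-grown algorithm. The name count_rises is hypothetical, and the only convention assumed is the central one: an algorithm takes a pair of iterators denoting a half-open range and uses nothing but iterator operations on it. Written that way, it should work with vectors, lists, and built-in arrays alike.

```cpp
#include <list>
#include <vector>

// Hypothetical algorithm in the STL style: counts the adjacent pairs in
// [first, last) where the second element is greater than the first.
// It requires only forward iterators and operator< on the elements.
template<class ForwardIterator>
int count_rises(ForwardIterator first, ForwardIterator last)
{
  int rises = 0;
  if (first == last) return rises;         // empty range: nothing to compare

  ForwardIterator next = first;
  for (++next; next != last; ++first, ++next)
    if (*first < *next) ++rises;

  return rises;
}
```

Nothing in count_rises mentions a particular container. It knows about iterators only, which is what lets one function template serve every container that supplies them.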

By using the components in the standard library, you can generally dispense with designing your own from-the-ground-up mechanisms for stream I/O, strings, containers (including iteration and common manipulations), internationalization, numeric data structures, and diagnostics. That leaves you a lot more time and energy for the really important part of software development: implementing the things that distinguish your wares from those of your competitors.
