More Effective C++ | Item 26: Limiting the number of objects of a class
Back to Item 25: Virtualizing constructors and non-member functions
Continue to Item 27: Requiring or prohibiting heap-based objects

Item 26: Limiting the number of objects of a class.

Okay, you're crazy about objects, but sometimes you'd like to bound your insanity. For example, you've got only one printer in your system, so you'd like to somehow limit the number of printer objects to one. Or you've got only 16 file descriptors you can hand out, so you've got to make sure there are never more than that many file descriptor objects in existence. How can you do such things? How can you limit the number of objects?

If this were a proof by mathematical induction, we might start with n = 1, then build from there. Fortunately, this is neither a proof nor an induction. Moreover, it turns out to be instructive to begin with n = 0, so we'll start there instead. How do you prevent objects from being instantiated at all?

Allowing Zero or One Objects

Each time an object is instantiated, we know one thing for sure: a constructor will be called. That being the case, the easiest way to prevent objects of a particular class from being created is to declare the constructors of that class private:

    class CantBeInstantiated {
    private:
      CantBeInstantiated();
      CantBeInstantiated(const CantBeInstantiated&);
      ...
    };

Having thus removed everybody's right to create objects, we can selectively loosen the restriction. If, for example, we want to create a class for printers, but we also want to abide by the constraint that there is only one printer available to us, we can encapsulate the printer object inside a function so that everybody has access to the printer, but only a single printer object is created:

    class PrintJob;                    // forward declaration
                                       // see Item E34
    class Printer {
    public:
      void submitJob(const PrintJob& job);
      void reset();
      void performSelfTest();
      ...
      friend Printer& thePrinter();

    private:
      Printer();
      Printer(const Printer& rhs);
      ...
    };

    Printer& thePrinter()
    {
      static Printer p;                // the single printer object
      return p;
    }

There are three separate components to this design. First, the constructors of the Printer class are private. That suppresses object creation. Second, the global function thePrinter is declared a friend of the class. That lets thePrinter escape the restriction imposed by the private constructors. Finally, thePrinter contains a static Printer object. That means only a single object will be created.

Client code refers to thePrinter whenever it wishes to interact with the system's lone printer. By returning a reference to a Printer object, thePrinter can be used in any context where a Printer object itself could be:

    class PrintJob {
    public:
      PrintJob(const string& whatToPrint);
      ...
    };

    string buffer;
    ...                                // put stuff in buffer
    thePrinter().reset();
    thePrinter().submitJob(buffer);

It's possible, of course, that thePrinter strikes you as a needless addition to the global namespace. "Yes," you may say, "as a global function it looks more like a global variable, but global variables are gauche, and I'd prefer to localize all printer-related functionality inside the Printer class." Well, far be it from me to argue with someone who uses words like gauche. thePrinter can just as easily be made a static member function of Printer, and that puts it right where you want it. It also eliminates the need for a friend declaration, which many regard as tacky in its own right. Using a static member function, Printer looks like this:

    class Printer {
    public:
      static Printer& thePrinter();
      ...

    private:
      Printer();
      Printer(const Printer& rhs);
      ...
    };

    Printer& Printer::thePrinter()
    {
      static Printer p;
      return p;
    }

Clients must now be a bit wordier when they refer to the printer:

    Printer::thePrinter().reset();
    Printer::thePrinter().submitJob(buffer);

Another approach is to move Printer and thePrinter out of the global scope and into a namespace (see Item E28). Namespaces are a recent addition to C++. Anything that can be declared at global scope can also be declared in a namespace. This includes classes, structs, functions, variables, objects, typedefs, etc. The fact that something is in a namespace doesn't affect its behavior, but it does prevent name conflicts between entities in different namespaces. By putting the Printer class and the thePrinter function into a namespace, we don't have to worry about whether anybody else happened to choose the names Printer or thePrinter for themselves; our namespace prevents name conflicts.

Syntactically, namespaces look much like classes, but there are no public, protected, or private sections; everything is public. This is how we'd put Printer and thePrinter into a namespace called PrintingStuff:

    namespace PrintingStuff {
      class Printer {                  // this class is in the
      public:                          // PrintingStuff namespace
        void submitJob(const PrintJob& job);
        void reset();
        void performSelfTest();
        ...
        friend Printer& thePrinter();

      private:
        Printer();
        Printer(const Printer& rhs);
        ...
      };

      Printer& thePrinter()            // so is this function
      {
        static Printer p;
        return p;
      }
    }                                  // this is the end of the
                                       // namespace

Given this namespace, clients can refer to thePrinter using a fully-qualified name (i.e., one that includes the name of the namespace),

    PrintingStuff::thePrinter().reset();
    PrintingStuff::thePrinter().submitJob(buffer);

but they can also employ a using declaration to save themselves keystrokes:

    using PrintingStuff::thePrinter;   // import the name
                                       // "thePrinter" from the
                                       // namespace "PrintingStuff"
                                       // into the current scope
    thePrinter().reset();              // now thePrinter can be
    thePrinter().submitJob(buffer);    // used as if it were a
                                       // local name

There are two subtleties in the implementation of thePrinter that are worth exploring. First, it's important that the single Printer object be static in a function and not in a class. An object that's static in a class is, for all intents and purposes, always constructed (and destructed), even if it's never used. In contrast, an object that's static in a function is created the first time through the function, so if the function is never called, the object is never created. (You do, however, pay for a check each time the function is called to see whether the object needs to be created.) One of the philosophical pillars on which C++ was built is the idea that you shouldn't pay for things you don't use, and defining an object like our printer as a static object in a function is one way of adhering to this philosophy. It's a philosophy you should adhere to whenever you can.

There is another drawback to making the printer a class static versus a function static, and that has to do with its time of initialization. We know exactly when a function static is initialized: the first time through the function at the point where the static is defined. The situation with a class static (or, for that matter, a global static, should you be so gauche as to use one) is less well defined.
C++ offers certain guarantees regarding the order of initialization of statics within a particular translation unit (i.e., a body of source code that yields a single object file), but it says nothing about the initialization order of static objects in different translation units (see Item E47). In practice, this turns out to be a source of countless headaches. Function statics, when they can be made to suffice, allow us to avoid these headaches. In our example here, they can, so why suffer?

The second subtlety has to do with the interaction of inlining and static objects inside functions. Look again at the code for the non-member version of thePrinter:

    Printer& thePrinter()
    {
      static Printer p;
      return p;
    }

Except for the first time through this function (when p must be constructed), this is a one-line function: it consists entirely of the statement "return p;". If ever there were a good candidate for inlining, this function would certainly seem to be the one. Yet it's not declared inline. Why not?

Consider for a moment why you'd declare an object to be static. It's usually because you want only a single copy of that object, right? Now consider what inline means. Conceptually, it means compilers should replace each call to the function with a copy of the function body, but for non-member functions, it also means something else. It means the functions in question have internal linkage.

You don't ordinarily need to worry about such linguistic mumbo jumbo, but there is one thing you must remember: functions with internal linkage may be duplicated within a program (i.e., the object code for the program may contain more than one copy of each function with internal linkage), and this duplication includes static objects contained within the functions. The result? If you create an inline non-member function containing a local static object, you may end up with more than one copy of the static object in your program!
So don't create inline non-member functions that contain local static data. [9]

But maybe you think this business of creating a function to return a reference to a hidden object is the wrong way to go about limiting the number of objects in the first place. Perhaps you think it's better to simply count the number of objects in existence and throw an exception in a constructor if too many objects are requested. In other words, maybe you think we should handle printer creation like this:

    class Printer {
    public:
      class TooManyObjects{};          // exception class for use
                                       // when too many objects
                                       // are requested
      Printer();
      ~Printer();
      ...

    private:
      static size_t numObjects;

      Printer(const Printer& rhs);     // there is a limit of 1
                                       // printer, so never allow
    };                                 // copying (see Item E27)

The idea is to use numObjects to keep track of how many Printer objects are in existence. This value will be incremented in the class constructor and decremented in its destructor. If an attempt is made to construct too many Printer objects, we throw an exception of type TooManyObjects:

    // Obligatory definition of the class static
    size_t Printer::numObjects = 0;

    Printer::Printer()
    {
      if (numObjects >= 1) {
        throw TooManyObjects();
      }

      // proceed with normal construction here

      ++numObjects;
    }

    Printer::~Printer()
    {
      // perform normal destruction here

      --numObjects;
    }

This approach to limiting object creation is attractive for a couple of reasons. For one thing, it's straightforward: everybody should be able to understand what's going on. For another, it's easy to generalize so that the maximum number of objects is some number other than one.

Contexts for Object Construction

There is also a problem with this strategy. Suppose we have a special kind of printer, say, a color printer. The class for such printers would have much in common with our generic printer class, so of course we'd inherit from it:

    class ColorPrinter: public Printer {
      ...
    };

Now suppose we have one generic printer and one color printer in our system:

    Printer p;
    ColorPrinter cp;

How many Printer objects result from these object definitions? The answer is two: one for p and one for the Printer part of cp. At runtime, a TooManyObjects exception will be thrown during the construction of the base class part of cp. For many programmers, this is neither what they want nor what they expect. (Designs that avoid having concrete classes inherit from other concrete classes do not suffer from this problem. For details on this design philosophy, see Item 33.)

A similar problem occurs when Printer objects are contained inside other objects:

    class CPFMachine {                 // for machines that can
    private:                           // copy, print, and fax
      Printer p;                       // for printing capabilities
      FaxMachine f;                    // for faxing capabilities
      CopyMachine c;                   // for copying capabilities
      ...
    };

    CPFMachine m1;                     // fine
    CPFMachine m2;                     // throws TooManyObjects exception

The problem is that Printer objects can exist in three different contexts: on their own, as base class parts of more derived objects, and embedded inside larger objects. The presence of these different contexts significantly muddies the waters regarding what it means to keep track of the "number of objects in existence," because what you consider to be the existence of an object may not jibe with your compilers'.

Often you will be interested only in allowing objects to exist on their own, and you will wish to limit the number of those kinds of instantiations. That restriction is easy to satisfy if you adopt the strategy exemplified by our original Printer class, because the Printer constructors are private, and (in the absence of friend declarations) classes with private constructors can't be used as base classes, nor can they be embedded inside other objects.
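
To make the base-class context concrete, here is a minimal, self-contained sketch of the counting constructor applied to Printer and ColorPrinter. It is only an illustration: the accessor count and the helper secondPrinterThrows are hypothetical names added here, not part of the classes discussed above, and the copy constructor is declared private and left undefined as in the text.

```cpp
#include <cassert>
#include <cstddef>

class Printer {
public:
    class TooManyObjects {};           // thrown when the limit of 1 is exceeded

    Printer()
    {
        if (numObjects >= 1) throw TooManyObjects();
        ++numObjects;
    }
    ~Printer() { --numObjects; }

    // Hypothetical accessor, added only so the count can be inspected.
    static std::size_t count() { return numObjects; }

private:
    Printer(const Printer&);           // declared but never defined: no copying
    static std::size_t numObjects;
};

std::size_t Printer::numObjects = 0;

class ColorPrinter : public Printer {};

// Hypothetical helper: reports whether constructing a ColorPrinter while a
// stand-alone Printer exists throws TooManyObjects.
bool secondPrinterThrows()
{
    Printer p;                         // first Printer object: fine
    assert(Printer::count() == 1);
    try {
        ColorPrinter cp;               // its Printer base part is a second Printer
    } catch (const Printer::TooManyObjects&) {
        return true;
    }
    return false;
}
```

Because the exception leaves the base class constructor before the increment, the counter is not disturbed by the failed construction, and it drops back to zero once p is destroyed.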

The fact that you can't derive from classes with private constructors leads to a general scheme for preventing derivation, one that doesn't necessarily have to be coupled with limiting object instantiations. Suppose, for example, you have a class, FSA, for representing finite state automata. (Such state machines are useful in many contexts, among them user interface design.) Further suppose you'd like to allow any number of FSA objects to be created, but you'd also like to ensure that no class ever inherits from FSA. (One reason for doing this might be to justify the presence of a nonvirtual destructor in FSA. Item E14 explains why base classes generally need virtual destructors, and Item 24 explains why classes without virtual functions yield smaller objects than do equivalent classes with virtual functions.) Here's how you can design FSA to satisfy both criteria:

    class FSA {
    public:
      // pseudo-constructors
      static FSA * makeFSA();
      static FSA * makeFSA(const FSA& rhs);
      ...

    private:
      FSA();
      FSA(const FSA& rhs);
      ...
    };

    FSA * FSA::makeFSA()
    { return new FSA(); }

    FSA * FSA::makeFSA(const FSA& rhs)
    { return new FSA(rhs); }

Unlike the thePrinter function that always returned a reference to a single object, each makeFSA pseudo-constructor returns a pointer to a unique object. That's what allows an unlimited number of FSA objects to be created.

This is nice, but the fact that each pseudo-constructor calls new implies that callers will have to remember to call delete. Otherwise a resource leak will be introduced. Callers who wish to have delete called automatically when the current scope is exited can store the pointer returned from makeFSA in an auto_ptr object (see Item 9); such objects automatically delete what they point to when they themselves go out of scope:

    // indirectly call default FSA constructor
    auto_ptr<FSA> pfsa1(FSA::makeFSA());

    // indirectly call FSA copy constructor
    auto_ptr<FSA> pfsa2(FSA::makeFSA(*pfsa1));

    ...
    // use pfsa1 and pfsa2 as normal pointers,
    // but don't worry about deleting them

Allowing Objects to Come and Go

We now know how to design a class that allows only a single instantiation, we know that keeping track of the number of objects of a particular class is complicated by the fact that object constructors are called in three different contexts, and we know that we can eliminate the confusion surrounding object counts by making constructors private. It is worthwhile to make one final observation. Our use of the thePrinter function to encapsulate access to a single object limits the number of Printer objects to one, but it also limits us to a single Printer object for each run of the program. As a result, it's not possible to write code like this:

    create Printer object p1;
    use p1;
    destroy p1;

    create Printer object p2;
    use p2;
    destroy p2;
    ...

This design never instantiates more than a single Printer object at a time, but it does use different Printer objects in different parts of the program. It somehow seems unreasonable that this isn't allowed. After all, at no point do we violate the constraint that only one printer may exist. Isn't there a way to make this legal?

There is. All we have to do is combine the object-counting code we used earlier with the pseudo-constructors we just saw:

    class Printer {
    public:
      class TooManyObjects{};

      // pseudo-constructor
      static Printer * makePrinter();

      ~Printer();

      void submitJob(const PrintJob& job);
      void reset();
      void performSelfTest();
      ...

    private:
      static size_t numObjects;

      Printer();

      Printer(const Printer& rhs);     // we don't define this
    };                                 // function, because we'll
                                       // never allow copying
                                       // (see Item E27)

    // Obligatory definition of class static
    size_t Printer::numObjects = 0;

    Printer::Printer()
    {
      if (numObjects >= 1) {
        throw TooManyObjects();
      }

      // proceed with normal object construction here

      ++numObjects;
    }

    Printer * Printer::makePrinter()
    { return new Printer; }

If the notion of throwing an exception when too many objects are requested strikes you as unreasonably harsh, you could have the pseudo-constructor return a null pointer instead. Clients would then have to check for this before doing anything with it, of course.

Clients use this Printer class just as they would any other class, except they must call the pseudo-constructor function instead of the real constructor:

    Printer p1;                        // error! default ctor is
                                       // private

    Printer *p2 =
      Printer::makePrinter();          // fine, indirectly calls
                                       // default ctor

    Printer p3 = *p2;                  // error! copy ctor is
                                       // private

    p2->performSelfTest();             // all other functions are
    p2->reset();                       // called as usual
    ...
    delete p2;                         // avoid resource leak; this
                                       // would be unnecessary if
                                       // p2 were an auto_ptr

This technique is easily generalized to any number of objects. All we have to do is replace the hard-wired constant 1 with a class-specific value, then lift the restriction against copying objects. For example, the following revised implementation of our Printer class allows up to 10 Printer objects to exist:

    class Printer {
    public:
      class TooManyObjects{};

      // pseudo-constructors
      static Printer * makePrinter();
      static Printer * makePrinter(const Printer& rhs);
      ...

    private:
      static size_t numObjects;
      static const size_t maxObjects = 10;       // see below

      Printer();
      Printer(const Printer& rhs);
    };

    // Obligatory definitions of class statics
    size_t Printer::numObjects = 0;
    const size_t Printer::maxObjects;

    Printer::Printer()
    {
      if (numObjects >= maxObjects) {
        throw TooManyObjects();
      }
      ...
    }

    Printer::Printer(const Printer& rhs)
    {
      if (numObjects >= maxObjects) {
        throw TooManyObjects();
      }
      ...
    }

    Printer * Printer::makePrinter()
    { return new Printer; }

    Printer * Printer::makePrinter(const Printer& rhs)
    { return new Printer(rhs); }

Don't be surprised if your compilers get all upset about the declaration of Printer::maxObjects in the class definition above. In particular, be prepared for them to complain about the specification of 10 as an initial value for that variable. The ability to specify initial values for static const members (of integral type, e.g., ints, chars, enums, etc.) inside a class definition was added to C++ only relatively recently, so some compilers don't yet allow it. If your compilers are as-yet-unupdated, pacify them by declaring maxObjects to be an enumerator inside a private anonymous enum,

    class Printer {
    private:
      enum { maxObjects = 10 };        // within this class,
      ...                              // maxObjects is the
    };                                 // constant 10

or by initializing the constant static like a non-const static member:

    class Printer {
    private:
      static const size_t maxObjects;  // no initial value given
      ...
    };

    // this goes in a single implementation file
    const size_t Printer::maxObjects = 10;

This latter approach has the same effect as the original code above, but explicitly specifying the initial value in the class definition is easier for other programmers to understand. When your compilers support the specification of initial values for const static members in class definitions, you should take advantage of that capability.

An Object-Counting Base Class

Initialization of statics aside, the approach above works like the proverbial charm, but there is one aspect of it that continues to nag. If we had a lot of classes like Printer whose instantiations needed to be limited, we'd have to write this same code over and over, once per class. That would be mind-numbingly dull. Given a fancy-pants language like C++, it somehow seems we should be able to automate the process.
Isn't there a way to encapsulate the notion of counting instances and bundle it into a class?

We can easily come up with a base class for counting object instances and have classes like Printer inherit from that, but it turns out we can do even better. We can actually come up with a way to encapsulate the whole counting kit and kaboodle, by which I mean not only the functions to manipulate the instance count, but also the instance count itself. (We'll see the need for a similar trick when we examine reference counting in Item 29. For a detailed examination of this design, see my article on counting objects.)

The counter in the Printer class is the static variable numObjects, so we need to move that variable into an instance-counting class. However, we also need to make sure that each class for which we're counting instances has a separate counter. Use of a counting class template lets us automatically generate the appropriate number of counters, because we can make the counter a static member of the classes generated from the template:

    template<class BeingCounted>
    class Counted {
    public:
      class TooManyObjects{};          // for throwing exceptions

      static int objectCount() { return numObjects; }

    protected:
      Counted();
      Counted(const Counted& rhs);

      ~Counted() { --numObjects; }

    private:
      static int numObjects;
      static const size_t maxObjects;

      void init();                     // to avoid ctor code
    };                                 // duplication

    template<class BeingCounted>
    Counted<BeingCounted>::Counted()
    { init(); }

    template<class BeingCounted>
    Counted<BeingCounted>::Counted(const Counted<BeingCounted>&)
    { init(); }

    template<class BeingCounted>
    void Counted<BeingCounted>::init()
    {
      if (numObjects >= maxObjects) throw TooManyObjects();
      ++numObjects;
    }

The classes generated from this template are designed to be used only as base classes, hence the protected constructors and destructor. Note the use of the private member function init to avoid duplicating the statements in the two Counted constructors.
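
As a rough, compilable illustration of how a client class might plug into such a template, consider the sketch below. Widget, its limit of 2, and the helper thirdWidgetThrows are hypothetical names chosen for this example. Two liberties are taken relative to the code above: the counter is a std::size_t rather than an int, to avoid a signed/unsigned comparison, and everything lives in one file. The template<> prefix on the maxObjects definition is what standard C++ requires when defining a static member for one particular instantiation.

```cpp
#include <cassert>
#include <cstddef>

template<class BeingCounted>
class Counted {
public:
    class TooManyObjects {};           // thrown when the limit is exceeded

    static std::size_t objectCount() { return numObjects; }

protected:
    Counted() { init(); }
    Counted(const Counted&) { init(); }
    ~Counted() { --numObjects; }

private:
    static std::size_t numObjects;
    static const std::size_t maxObjects;   // defined by each client class

    void init()                        // shared by both constructors
    {
        if (numObjects >= maxObjects) throw TooManyObjects();
        ++numObjects;
    }
};

// One counter per instantiation, starting at zero.
template<class BeingCounted>
std::size_t Counted<BeingCounted>::numObjects = 0;

// Hypothetical client class: at most two Widgets may exist at once.
class Widget : private Counted<Widget> {
public:
    using Counted<Widget>::objectCount;
    using Counted<Widget>::TooManyObjects;
};

// The client supplies the limit for its own instantiation.
template<>
const std::size_t Counted<Widget>::maxObjects = 2;

// Hypothetical helper: reports whether a third Widget is rejected.
bool thirdWidgetThrows()
{
    Widget a, b;
    assert(Widget::objectCount() == 2);
    try {
        Widget c;                      // would be the third live Widget
    } catch (const Widget::TooManyObjects&) {
        return true;
    }
    return false;
}
```

Each class that inherits from Counted gets its own numObjects and maxObjects, because each instantiation of the template is a distinct class with its own statics.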

We can now modify the Printer class to use the Counted template:

    class Printer: private Counted<Printer> {
    public:
      // pseudo-constructors
      static Printer * makePrinter();
      static Printer * makePrinter(const Printer& rhs);

      ~Printer();

      void submitJob(const PrintJob& job);
      void reset();
      void performSelfTest();
      ...

      using Counted<Printer>::objectCount;     // see below
      using Counted<Printer>::TooManyObjects;  // see below

    private:
      Printer();
      Printer(const Printer& rhs);
    };

The fact that Printer uses the Counted template to keep track of how many Printer objects exist is, frankly, nobody's business but the author of Printer's. Such implementation details are best kept private, and that's why private inheritance is used here (see Item E42). The alternative would be to use public inheritance between Printer and Counted<Printer>, but then we'd be obliged to give the Counted classes a virtual destructor. (Otherwise we'd risk incorrect behavior if somebody deleted a Printer object through a Counted<Printer>* pointer; see Item E14.) As Item 24 makes clear, the presence of a virtual function in Counted would almost certainly affect the size and layout of objects of classes inheriting from Counted. We don't want to absorb that overhead, and the use of private inheritance lets us avoid it.

Quite properly, most of what Counted does is hidden from Printer's clients, but those clients might reasonably want to find out how many Printer objects exist. The Counted template offers the objectCount function to provide this information, but that function becomes private in Printer due to our use of private inheritance. To restore the public accessibility of that function, we employ a using declaration:

    class Printer: private Counted<Printer> {
    public:
      ...
      using Counted<Printer>::objectCount;  // make this function
                                            // public for clients
      ...                                   // of Printer
    };

This is perfectly legitimate, but if your compilers don't yet support namespaces, they won't allow it.
If they don't, you can use the older access declaration syntax:

    class Printer: private Counted<Printer> {
    public:
      ...
      Counted<Printer>::objectCount;   // make objectCount
                                       // public in Printer
      ...
    };

This more traditional syntax has the same meaning as the using declaration, but it's deprecated. The class TooManyObjects is handled in the same fashion as objectCount, because clients of Printer must have access to TooManyObjects if they are to be able to catch exceptions of that type.

When Printer inherits from Counted<Printer>, it can forget about counting objects. The class can be written as if somebody else were doing the counting for it, because somebody else (Counted<Printer>) is. A Printer constructor now looks like this:

    Printer::Printer()
    {
      // proceed with normal object construction
    }

What's interesting here is not what you see, it's what you don't. No checking of the number of objects to see if the limit is about to be exceeded, no incrementing the number of objects in existence once the constructor is done. All that is now handled by the Counted<Printer> constructors, and because Counted<Printer> is a base class of Printer, we know that a Counted<Printer> constructor will always be called before a Printer constructor. If too many objects are created, a Counted<Printer> constructor throws an exception, and the Printer constructor won't even be invoked. Nifty, huh?

Nifty or not, there's one loose end that demands to be tied, and that's the mandatory definitions of the statics inside Counted. It's easy enough to take care of numObjects: we just put this in Counted's implementation file:

    template<class BeingCounted>       // defines numObjects
    int Counted<BeingCounted>::numObjects;  // and automatically
                                            // initializes it to 0

The situation with maxObjects is a bit trickier. To what value should we initialize this variable? If we want to allow up to 10 printers, we should initialize Counted<Printer>::maxObjects to 10. If, on the other hand, we want to allow up to 16 file descriptor objects, we should initialize Counted<FileDescriptor>::maxObjects to 16. What to do?
We take the easy way out: we do nothing. We provide no initialization at all for maxObjects. Instead, we require that clients of the class provide the appropriate initialization. The author of Printer must add this to an implementation file:

    template<>
    const size_t Counted<Printer>::maxObjects = 10;

Similarly, the author of FileDescriptor must add this:

    template<>
    const size_t Counted<FileDescriptor>::maxObjects = 16;

(The template<> prefix marks each definition as an explicit specialization; standard C++ requires it.)

What will happen if these authors forget to provide a suitable definition for maxObjects? Simple: they'll get an error during linking, because maxObjects will be undefined. Provided we've adequately documented this requirement for clients of Counted, they can then say "Duh" to themselves and go back and add the requisite initialization.

[9] In July 1996, the ISO/ANSI standardization committee changed the default linkage of inline functions to external, so the problem I describe here has been eliminated, at least on paper. Your compilers may not yet be in accord with the standard, however, so your best bet is still to shy away from inline functions with static data.