Effective C++, 2E | Item 33: Use inlining judiciously

Inline functions -- what a wonderful idea! They look like functions, they act like functions, they're ever so much better than macros (see Item 1), and you can call them without having to incur the overhead of a function call. What more could you possibly ask for?

You actually get more than you might think, because avoiding the cost of a function call is only half the story. Compiler optimization routines are typically designed to concentrate on stretches of code that lack function calls, so when you inline a function, you may enable compilers to perform context-specific optimizations on the body of the function. Such optimizations would be impossible for "normal" function calls.

However, let's not get carried away. In programming, as in life, there is no free lunch, and inline functions are no exception. The whole idea behind an inline function is to replace each call of that function with its code body, and it doesn't take a Ph.D. in statistics to see that this is likely to increase the overall size of your object code. On machines with limited memory, overzealous inlining can give rise to programs that are too big for the available space. Even with virtual memory, inline-induced code bloat can lead to pathological paging behavior (thrashing) that will slow your program to a crawl. (It will, however, provide your disk controller with a nice exercise regimen.) Too much inlining can also reduce your instruction cache hit rate, thus reducing the speed of instruction fetch from that of cache memory to that of primary memory.

On the other hand, if an inline function body is very short, the code generated for the function body may actually be smaller than the code generated for a function call.
If that is the case, inlining the function may actually lead to smaller object code and a higher cache hit rate!

Bear in mind that the inline directive, like register, is a hint to compilers, not a command. That means compilers are free to ignore your inline directives whenever they want to, and it's not that hard to make them want to. For example, most compilers refuse to inline "complicated" functions (e.g., those that contain loops or are recursive), and all but the most trivial virtual function calls stop inlining routines dead in their tracks. (This shouldn't be much of a surprise. virtual means "wait until runtime to figure out which function to call," and inline means "during compilation, replace the call site with the called function." If compilers don't know which function will be called, you can hardly blame them for refusing to make an inline call to it.) It all adds up to this: whether a given inline function is actually inlined is dependent on the implementation of the compiler you're using. Fortunately, most compilers have a diagnostic level that will result in a warning (see Item 48) if they fail to inline a function you've asked them to.

Suppose you've written some function f and you've declared it inline. What happens if a compiler chooses, for whatever reason, not to inline that function? The obvious answer is that f will be treated like a non-inline function: code for f will be generated as if it were a normal "outlined" function, and calls to f will proceed as normal function calls.

In theory, this is precisely what will happen, but this is one of those occasions when theory and practice may go their separate ways. That's because this very tidy solution to the problem of what to do about "outlined inlines" was added to C++ relatively late in the standardization process.
Earlier specifications for the language (such as the ARM; see Item 50) told compiler vendors to implement different behavior, and the older behavior is still common enough that you need to understand what it is.

Think about it for a minute, and you'll realize that inline function definitions are virtually always put in header files. This allows multiple translation units (source files) to include the same header files and reap the advantages of the inline functions that are defined within them. Here's an example, in which I adopt the convention that source files end in ".cpp"; this is probably the most prevalent of the file naming conventions in the world of C++:

    // This is file example.h
    inline void f() { ... }          // definition of f
    ...

    // This is file source1.cpp
    #include "example.h"             // includes definition of f
    ...                              // contains calls to f

    // This is file source2.cpp
    #include "example.h"             // also includes definition
                                     // of f
    ...                              // also calls f

Under the old "outlined inline" rules and the assumption that f is not being inlined, when source1.cpp is compiled, the resulting object file will contain a function called f, just as if f had never been declared inline. Similarly, when source2.cpp is compiled, its generated object file will also hold a function called f. When you try to link the two object files together, you can reasonably expect your linker to complain that your program contains two definitions of f, an error.

To prevent this problem, the old rules decreed that compilers treat an un-inlined inline function as if the function had been declared static, that is, local to the file currently being compiled. In the example you just saw, compilers following the old rules would treat f as if it were static in source1.cpp when that file was being compiled and as if it were static in source2.cpp when that file was being compiled.
This strategy eliminates the link-time problem, but at a cost: each translation unit that includes the definition of f (and that calls f) contains its own static copy of f. If f itself defines local static variables, each copy of f gets its own copy of the variables, something sure to astonish programmers who believe that "static" inside a function means "only one copy."

This leads to a stunning realization. Under both new rules and old, if an inline function isn't inlined, you still pay for the cost of a function call at each call site, but under the old rules, you can also suffer an increase in code size, because each translation unit that includes and calls f gets its own copy of f's code and f's static variables! (To make matters worse, each copy of f and each copy of f's static variables tend to end up on different virtual memory pages, so two calls to different copies of f are likely to entail one or more page faults.)

There's more. Sometimes your poor, embattled compilers have to generate a function body for an inline function even when they are perfectly willing to inline the function. In particular, if your program ever takes the address of an inline function, compilers must generate a function body for it. How can they come up with a pointer to a function that doesn't exist?

    inline void f() {...}            // as above

    void (*pf)() = f;                // pf points to f

    int main()
    {
      f();                           // an inline call to f
      pf();                          // a non-inline call to f
                                     // through pf
      ...
    }

In this case, you end up in the seemingly paradoxical situation whereby calls to f are inlined, but under the old rules each translation unit that takes f's address still generates a static copy of the function. (Under the new rules, only a single out-of-line copy of f will be generated, regardless of the number of translation units involved.)
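The interaction between address-taking and function-local statics can be sketched in a single file. The names nextTicket, pTicket, and demoTickets below are hypothetical and do not come from the text. The sketch assumes a compiler that follows the current rules, and because it uses only one translation unit it cannot reproduce the old per-file static copies; it only shows that a direct call and a call through a pointer reach the same function and therefore the same static object.

```cpp
#include <cassert>

// Hypothetical stand-in for the f discussed above: an inline function
// with a function-local static, so the number of copies is observable.
inline int nextTicket()
{
    static int counter = 0;   // a single object under the current rules
    return ++counter;
}

// Taking the address obliges the compiler to emit an out-of-line body,
// even if direct calls elsewhere are expanded inline.
int (*pTicket)() = nextTicket;

// Mixes direct calls with a call through the pointer and packs the three
// results into one number so they are easy to check.
int demoTickets()
{
    int a = nextTicket();     // direct call; a candidate for inline expansion
    int b = pTicket();        // call through the pointer
    int c = nextTicket();     // direct call again
    return a * 100 + b * 10 + c;
}
```

On the first call, demoTickets should return 123: all three calls advance the one shared counter, whichever way they reach the function.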
This aspect of un-inlined inline functions can affect you even if you never use function pointers, because programmers aren't necessarily the only ones asking for pointers to functions. Sometimes compilers do it. In particular, compilers sometimes generate out-of-line copies of constructors and destructors so that they can get pointers to those functions for use in constructing and destructing arrays of objects of a class (see also Item M8).

In fact, constructors and destructors are often worse candidates for inlining than a casual examination would indicate. For example, consider the constructor for class Derived below:

    class Base {
    public:
      ...
    private:
      string bm1, bm2;               // base members 1 and 2
    };

    class Derived: public Base {
    public:
      Derived() {}                   // Derived's ctor is
      ...                            // empty -- or is it?
    private:
      string dm1, dm2, dm3;          // derived members 1-3
    };

This constructor certainly looks like an excellent candidate for inlining, since it contains no code. But looks can be deceiving. Just because it contains no code doesn't necessarily mean it contains no code. In fact, it may contain a fair amount of code.

C++ makes various guarantees about things that happen when objects are created and destroyed. Items 5 and M8 describe how when you use new, your dynamically created objects are automatically initialized by their constructors, and how when you use delete, the corresponding destructors are invoked. Item 13 explains that when you create an object, each base class of and each data member in that object is automatically constructed, and the reverse process regarding destruction automatically occurs when an object is destroyed. Those items describe what C++ says must happen, but C++ does not say how they happen. That's up to compiler implementers, but it should be clear that those things don't just happen by themselves.
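One way to observe these guarantees at work is to count constructor calls. The sketch below is only an illustration: Counted is a hypothetical type standing in for the string members of the Base and Derived classes shown earlier, and constructionsForOneDerived is an invented helper. It shows that a Derived constructor with an empty body still causes every base-class and derived-class member to be constructed.

```cpp
#include <cassert>

// Hypothetical member type that counts its own constructions; it plays
// the role of string in the Base and Derived classes from the text.
struct Counted {
    static int constructions;
    Counted() { ++constructions; }
};
int Counted::constructions = 0;

class Base {
private:
    Counted bm1, bm2;          // base members 1 and 2
};

class Derived : public Base {
public:
    Derived() {}               // empty body, as in the text
private:
    Counted dm1, dm2, dm3;     // derived members 1-3
};

// Creates one Derived object and reports how many Counted constructors
// ran while it was being built.
int constructionsForOneDerived()
{
    Counted::constructions = 0;
    Derived d;
    (void)d;                   // silence unused-variable warnings
    return Counted::constructions;
}
```

Here constructionsForOneDerived returns 5: two constructions for the inherited members and three for Derived's own, none of them written out in the constructor body.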
There has to be some code in your program to make those things happen, and that code (the code written by compiler implementers and inserted into your program during compilation) has to go somewhere. Sometimes, it ends up in your constructors and destructors, so some implementations will generate code equivalent to the following for the allegedly empty Derived constructor above:

    // possible implementation of Derived constructor
    Derived::Derived()
    {
      // allocate heap memory for this object if it's supposed
      // to be on the heap; see Item 8 for info on operator new
      if (this object is on the heap)
        this = ::operator new(sizeof(Derived));

      Base::Base();                  // initialize Base part

      dm1.string();                  // construct dm1
      dm2.string();                  // construct dm2
      dm3.string();                  // construct dm3
    }

You could never hope to get code like this to compile, because it's not legal C++, not for you, anyway. For one thing, you have no way of asking whether an object is on the heap from inside its constructor. (For an examination of what it takes to reliably determine whether an object is on the heap, see Item M27.) For another, you're forbidden from assigning to this. And you can't invoke constructors via function calls, either. Your compilers, however, labor under no such constraints; they can do whatever they like. But the legality of the code is not the point. The point is that code to call operator new (if necessary), to construct base class parts, and to construct data members may be silently inserted into your constructors, and when it is, those constructors increase in size, thus making them less attractive candidates for inlining. Of course, the same reasoning applies to the Base constructor, so if it's inlined, all the code inserted into it is also inserted into the Derived constructor (via the Derived constructor's call to the Base constructor).
And if the string constructor also happens to be inlined, the Derived constructor will gain five copies of that function's code, one for each of the five strings in a Derived object (the two it inherits plus the three it declares itself). Now do you see why it's not necessarily a no-brain decision whether to inline Derived's constructor? Of course, similar considerations apply to Derived's destructor, which, one way or another, must see to it that all the objects initialized by Derived's constructor are properly destroyed. It may also need to free the dynamically allocated memory formerly occupied by the just-destroyed Derived object.

Library designers must evaluate the impact of declaring functions inline, because inline functions make it impossible to provide binary upgrades to the inline functions in a library. In other words, if f is an inline function in a library, clients of the library compile the body of f into their applications. If a library implementer later decides to change f, all clients who've used f must recompile. This is often highly undesirable (see also Item 34). On the other hand, if f is a non-inline function, a modification to f requires only that clients relink. This is a substantially less onerous burden than recompiling and, if the library containing the function is dynamically linked, one that may be absorbed in a way that's completely transparent to clients.

Static objects inside inline functions often exhibit counterintuitive behavior. For this reason, it's generally a good idea to avoid declaring functions inline if those functions contain static objects. For details, consult Item M26.

For purposes of program development, it is important to keep all these considerations in mind, but from a purely practical point of view during coding, one fact dominates all others: most debuggers have trouble with inline functions. This should be no great revelation. How do you set a breakpoint in a function that isn't there?
How do you step through such a function? How do you trap calls to it? Without being unreasonably clever (or deviously underhanded), you simply can't.

Happily, this leads to a logical strategy for determining which functions should be declared inline and which should not. Initially, don't inline anything, or at least limit your inlining to those functions that are truly trivial, such as age below:

    class Person {
    public:
      int age() const { return personAge; }
      ...
    private:
      int personAge;
      ...
    };

By employing inlines cautiously, you facilitate your use of a debugger, but you also put inlining in its proper place: as a hand-applied optimization. Don't forget the empirically determined rule of 80-20 (see Item M16), which states that a typical program spends 80 percent of its time executing only 20 percent of its code. It's an important rule, because it reminds you that your goal as a software developer is to identify the 20 percent of your code that is actually capable of increasing your program's overall performance. You can inline and otherwise tweak your functions until the cows come home, but it's all wasted effort unless you're focusing on the right functions.

Once you've identified the set of important functions in your application, the ones whose inlining will actually make a difference (a set that is itself dependent on the architecture on which you're running), don't hesitate to declare them inline. At the same time, however, be on the lookout for problems caused by code bloat, and watch out for compiler warnings (see Item 48) that indicate that your inline functions haven't been inlined.

Used judiciously, inline functions are an invaluable component of every C++ programmer's toolbox, but, as the foregoing discussion has revealed, they're not quite as simple and straightforward as you might have thought.