Template Constants as Compile-Time Flags

Ali Rahimi (last modified Sept 4, 2000)


1. Exposing The Problem

A C++ compiler cannot make assumptions about the values of function arguments unless the function is inlined. This limitation often forces programmers to resort to ugly hacks, write lots of code by hand, or settle for inefficient programs. Here's an example of an inefficient program:

void
do_some_processing(Arg a, const bool do_complicated_error_checking)
{
   for(...;...;...) {
      ...
      if(do_complicated_error_checking)
         do_some_intensive_error_checking();
   }
}

int
main()
{
   Arg a;
   do_some_processing(a, false);
}

If do_some_processing() is not inlined, the test on do_complicated_error_checking is performed in every iteration of its inner loop, even though it's pretty clear that the value of the flag never changes during the call made from main(). In theory the compiler could generate two versions of the loop and select between them with a single conditional on the flag, but this is rarely done (for example, I can't get g++ 2.92 to do it).
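The two-loop transformation described above (usually called loop unswitching) can be sketched by hand as follows. This is only an illustration: the std::vector<int> argument, the summing loop body, the name process_unswitched, and the g_checks counter are assumptions standing in for the elided Arg type and loop, not part of the original program.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the expensive check: it only counts its calls
// so that the effect of the flag can be observed.
static int g_checks = 0;
static void do_some_intensive_error_checking() { ++g_checks; }

// Hand-unswitched loop: the flag is tested once, outside the loop,
// and each branch carries its own copy of the loop body.
int process_unswitched(const std::vector<int>& a,
                       const bool do_complicated_error_checking)
{
    int sum = 0;
    if (do_complicated_error_checking) {
        for (std::size_t i = 0; i < a.size(); ++i) {
            sum += a[i];
            do_some_intensive_error_checking();
        }
    } else {
        for (std::size_t i = 0; i < a.size(); ++i)
            sum += a[i];
    }
    return sum;
}
```

Note that the loop body now exists twice in the source, which is the maintenance burden the rest of this article tries to avoid.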

A solution might involve the programmer writing two versions of do_some_processing(). In my opinion this solution is both cumbersome and unmaintainable. Diehard Lisp fans might suggest consing up the loop during the function call and performing the correct code motion at runtime. Unfortunately, this practice is not portable in C++ (or readable in Lisp).

2. Solving it with Template Constants

A good solution is to turn do_some_processing() into a template function:

template <bool do_complicated_error_checking>
void do_some_processing(Arg a)
{
   for(...;...;...) {
      ...
      if(do_complicated_error_checking)
         do_some_intensive_error_checking();
   }
}

int
main()
{
   Arg a;
   do_some_processing<false>(a);         // 1
   do_some_processing<true>(a);          // 2
}

In the above example, the value of do_complicated_error_checking is known at compile time through the template argument of do_some_processing(). As such, in call 1 the if statement is completely reduced away, along with the call to do_some_intensive_error_checking(). Similarly, in call 2 the flag test is reduced away, but do_some_intensive_error_checking() is still called in every iteration of the inner loop.
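For readers who want to try this, here is a filled-in sketch of the template version. The std::vector<int> argument, the summing loop, the int return value, and the g_checks counter are assumptions made for illustration; the original leaves Arg and the loop body unspecified.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the expensive check: counts its calls.
static int g_checks = 0;
static void do_some_intensive_error_checking() { ++g_checks; }

// The flag is a template argument, so each instantiation sees it as a
// compile-time constant and the compiler is free to drop the dead branch.
template <bool do_complicated_error_checking>
int do_some_processing(const std::vector<int>& a)
{
    int sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += a[i];
        if (do_complicated_error_checking)
            do_some_intensive_error_checking();
    }
    return sum;
}
```

Calling do_some_processing<false>(v) leaves g_checks untouched, while do_some_processing<true>(v) increments it once per element. Whether the test itself disappears from the generated code is up to the optimizer, but a constant condition is about the easiest case it can be given.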

Although this solution does what we want it to do, it has serious syntactic shortcomings. For example, we have created two kinds of arguments: template arguments and function parameters. Migrating a flag from one kind to the other changes the interface of the function and breaks user code unnecessarily.

Instead, we use template constants to provide the same functionality to the caller, but with a much cleaner interface.


template <class T, T V>
struct constant
{
    operator T() const { return V; }
};

template <class Boolean>
void do_some_processing(Arg a, const Boolean do_complicated_error_checking)
{
   for(...;...;...) {
      ...
      if(do_complicated_error_checking)
         do_some_intensive_error_checking();
   }
}

int
main()
{
   Arg a;
   ...
   do_some_processing(a, false);                             // 1
   do_some_processing(a, constant<bool,false>());      // 2
   do_some_processing(a, constant<bool,true>());       // 3
}

The first call generates code identical to that generated for the example in Section 1: the conditional still gets evaluated within the loop and we haven't gained anything.

However, our do_some_processing() function can now take a new type of argument: the constant template. An object of the constant<> template class converts implicitly to a value whose type is given by the first template argument and whose value is given by the second. Since the second template argument to constant<> is part of the type of do_complicated_error_checking, the if statement in do_some_processing() can be evaluated at compile time.
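A small sketch may make the "value is part of the type" point concrete. The helper names which() and forty_two() are hypothetical and exist only for this illustration; constant<> is the template defined above.

```cpp
template <class T, T V>
struct constant
{
    operator T() const { return V; }
};

// constant<bool,true> and constant<bool,false> are distinct types, so
// overload resolution can tell them apart without looking at any value.
inline int which(constant<bool, true>)  { return 1; }
inline int which(constant<bool, false>) { return 0; }

// The conversion operator yields the value baked into the type.
inline int forty_two() { return constant<int, 42>(); }
```

Here which(constant<bool,true>()) selects the first overload purely from the argument's type, and forty_two() returns 42 through constant<int,42>::operator int().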

Any dumb old compiler can determine the value of do_complicated_error_checking.operator bool() at compile time by inlining constant::operator T(). As a result, the if statement is easily reduced at compile time.

In the above, no test on do_complicated_error_checking or call to do_some_intensive_error_checking() appears anywhere in the code generated by the second call to do_some_processing(). Similarly, no test on do_complicated_error_checking is generated for the third call, but do_some_intensive_error_checking() is always called.
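The three calls can be exercised with a filled-in sketch along these lines. As before, the std::vector<int> argument, the summing loop, the int return value, and the g_checks counter are assumptions added for illustration and are not part of the original listing.

```cpp
#include <cstddef>
#include <vector>

template <class T, T V>
struct constant
{
    operator T() const { return V; }
};

// Hypothetical stand-in for the expensive check: counts its calls.
static int g_checks = 0;
static void do_some_intensive_error_checking() { ++g_checks; }

// Boolean may be plain bool (flag tested at run time) or a constant<bool,V>
// (flag fixed by the type, so the test can be folded at compile time).
template <class Boolean>
int do_some_processing(const std::vector<int>& a,
                       const Boolean do_complicated_error_checking)
{
    int sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += a[i];
        if (do_complicated_error_checking)
            do_some_intensive_error_checking();
    }
    return sum;
}
```

All three call forms compile against the same function template: do_some_processing(v, false), do_some_processing(v, constant<bool,false>()), and do_some_processing(v, constant<bool,true>()). The observable behavior of the first two is the same; what differs is that the second gives the compiler a condition it can evaluate without knowing anything about the caller.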

3. Conclusion

You can use template constants to tell the compiler, at compile time, how you want a function to behave, reducing run-time overhead by taking out unnecessary tests. The solution presented here supports template constants (which result in the reductions discussed), but also allows old-style boolean flags to be passed, resulting in classical-looking code.