void
do_some_processing(Arg a, const bool do_complicated_error_checking)
{
    for(...;...;...) {
        ...
        if(do_complicated_error_checking)
            do_some_intensive_error_checking();
    }
}

int
main()
{
    Arg a;
    do_some_processing(a, false);
}
If do_some_processing() is not inlined, the do_complicated_error_checking flag is tested in every iteration of the inner loop, even though it's pretty clear that the value of the flag doesn't change when do_some_processing() is called from main(). Though theoretically the compiler could generate two versions of the loop and select between them with a single conditional on the flag, this is rarely done (for example, I can't get g++ 2.92 to do it).
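To make the "two versions of the loop" idea concrete, here is a minimal hand-written sketch of that transformation. The counter checks_run, the iteration count n, and the names do_some_processing_unswitched() and count_checks() are assumptions made for illustration; the original elides the loop bounds and body.

```cpp
#include <cassert>

// Hypothetical stand-in for the expensive check: it only counts its calls.
static int checks_run = 0;
void do_some_intensive_error_checking() { ++checks_run; }

// The flag is tested once, outside the loop, and each branch gets its own
// copy of the loop. The iteration count n is an assumption.
void do_some_processing_unswitched(int n, const bool do_complicated_error_checking)
{
    if (do_complicated_error_checking) {
        for (int i = 0; i < n; ++i) {
            // ... loop body ...
            do_some_intensive_error_checking();
        }
    } else {
        for (int i = 0; i < n; ++i) {
            // ... loop body, with no flag test and no check ...
        }
    }
}

// Hypothetical helper: returns how many checks a single run performs.
int count_checks(int n, bool flag)
{
    assert(n >= 0);
    checks_run = 0;
    do_some_processing_unswitched(n, flag);
    return checks_run;
}
```

Written by hand, this duplicates the loop body, which is the maintenance problem discussed next.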
A solution might involve the programmer writing two versions of do_some_processing(). In my opinion this solution is both cumbersome and unmaintainable. Diehard Lisp fans might suggest consing up the loop during the function call and performing the correct code motion at runtime. Unfortunately, this practice is not portable in C++ (or readable in Lisp).
template <bool do_complicated_error_checking>
void do_some_processing(Arg a)
{
    for(...;...;...) {
        ...
        if(do_complicated_error_checking)
            do_some_intensive_error_checking();
    }
}

int
main()
{
    Arg a;
    do_some_processing<false>(a); // 1
    do_some_processing<true>(a);  // 2
}
In the above example, the value of do_complicated_error_checking is known at compile time through the template argument of do_some_processing(). In the first call, the if statement is eliminated entirely, along with the call to do_some_intensive_error_checking(). In the second call, the flag test is likewise eliminated, but do_some_intensive_error_checking() is still called in every iteration of the inner loop.
Although this solution does what we want, it has a serious syntactic shortcoming: there are now two kinds of arguments, template arguments and function parameters. Migrating a parameter from one kind to the other changes the interface of the function and breaks user code unnecessarily.
Instead, we use template constants to provide the same functionality to the caller, but with a much cleaner interface.
template <class T, T V>
struct constant
{
    operator T() const { return V; }
};

template <class Boolean>
void do_some_processing(Arg a, const Boolean do_complicated_error_checking)
{
    for(...;...;...) {
        ...
        if(do_complicated_error_checking)
            do_some_intensive_error_checking();
    }
}

int
main()
{
    Arg a;
    ...
    do_some_processing(a, false);                   // 1
    do_some_processing(a, constant<bool,false>());  // 2
    do_some_processing(a, constant<bool,true>());   // 3
}
The first call generates code identical to that of the first example, where the flag is an ordinary bool: the conditional still gets evaluated within the loop and we haven't gained anything.
However, our do_some_processing() function can now take a new type of argument: the constant template. An object of the constant<> class template converts implicitly to a value whose type is given by the first template argument and whose value is given by the second. Since the second template argument to constant<> is part of the type of do_complicated_error_checking, the if statement in do_some_processing() can be evaluated at compile time.
Any dumb old compiler can determine the value of do_complicated_error_checking.operator bool() at compile time by inlining constant::operator T(). As a result, the if statement is easily reduced at compile time.
In the above, no test on do_complicated_error_checking and no call to do_some_intensive_error_checking() appears anywhere in the code generated by the second call to do_some_processing(). Similarly, no test on do_complicated_error_checking is generated for the third call, but do_some_intensive_error_checking() is called in every iteration.
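For readers who want to try this, the following is a self-contained sketch of the three calls with the elided parts filled in by assumptions: the loop simply runs n times, Arg is dropped, and do_some_intensive_error_checking() only increments a counter (checks_run) so the behaviour can be observed. The helper count_checks() is hypothetical as well. The sketch shows only how many checks run; whether the flag test itself is removed depends on the compiler and has to be confirmed in the generated assembly.

```cpp
#include <cassert>

// Hypothetical stand-in for the expensive check: it only counts its calls.
static int checks_run = 0;
void do_some_intensive_error_checking() { ++checks_run; }

// The article's constant<> template: the value V is part of the type.
template <class T, T V>
struct constant
{
    operator T() const { return V; }
};

// Accepts either a plain bool or a constant<bool, ...> as the flag.
// The iteration count n is an assumption; the article elides the loop.
template <class Boolean>
void do_some_processing(int n, const Boolean do_complicated_error_checking)
{
    for (int i = 0; i < n; ++i) {
        // ... loop body ...
        if (do_complicated_error_checking)
            do_some_intensive_error_checking();
    }
}

// Hypothetical helper: returns how many checks a single run performs.
template <class Boolean>
int count_checks(int n, Boolean flag)
{
    assert(n >= 0);
    checks_run = 0;
    do_some_processing(n, flag);
    return checks_run;
}
```

Calling count_checks(4, false) and count_checks(4, constant<bool,false>()) both perform no checks, while count_checks(4, constant<bool,true>()) performs four; only the last two give the compiler the flag's value as part of the argument type.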