Monday 10 March 2014

Design Pattern for Scientific Computing - Curiously recurring template pattern VS. dynamic polymorphism

In scientific computing dynamic polymorphism can be expensive sometimes. Someone may wonder why, just because of an extra vtable pointer in memory and called by a few extra instructions of arithmetic operation on pointers and de-referencing of pointers. (Check the generated vtable by -fdump-class-hierarchy option with g++.)

All these reasons should not matter too much when the function is large and not very often called. (However this statement does not hold in scientific computing. Some small functions are often called over and over again. For instance in matrix operation. At this point dynamic polymorphism is not the best choice. Think about CRTP) This "trivial" impact will come to be magnified when the function is small and called many many times. Except these there is one more important issue about the nature of dynamic polymorphism in C++, defined at the run-time NOT in compile time. It prevents virtual functions from being inlined and optimized by compilers.  Based on this point the calling convention will contributes to the worsening performance. For instance automatic variables, , registers, program counter and so on need to push-into/pop-out-of the stack. What's more CPU has to do extra branching-pipelines if calling function could not be found in the current cache. 

In order to maintain the performance curiously recurring template pattern (CRTP) is employed to replace dynamic polymorphism. Fundamentally CRTP is implemented by template and uses static polymorphism to replace dynamic polymorphism. Hence the extra time will be spent on compilation (in theory). In code-wise the size will be bigger, if considering multiple template instances have been instantiated. But given the fact of better code sharing/generalization done in CRTP and modern compilers available, the compilation time and increment size are moderate.

Here is the simple implementation of CRTP.

template<typename Derived>
class Base {
public:
    Derived& GetIntance() {
         return static_cast<Derived&>(*this);
    }
   
    void Foo() {
        GetInstance().FooImp();
    }

    static void S_Bar() {
       Derived::S_BarImp(); 
    }
};

// template specification
class Derived1 : public Base<Derived1> {
public:
    void FooImp();
    static void S_BarImp();
};

class Derived2 : public Base<Derived2> {
public:
    void FooImp();
    static void S_BarImp();
};

// How to use
template<typename Derived>
void ProcessBase(Base<Derived>& obj) {
    // ......
    obj.Foo();
    // .......
}

No comments:

Post a Comment