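We all know that virtual functions behave polymorphically under reference semantics (when called through a pointer or reference), at the cost of one extra level of indirection.
In practice, most of the performance cost comes from the fact that virtual calls inhibit inlining.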
First, let's look at the output of the following code:

    #include <cstdint>
    #include <iostream>

    using std::cout;

    class A {
    public:
        virtual void f() { cout << "A::f()" << "\n"; }
        virtual void g() { cout << "A::g()" << "\n"; }
    };

    class B : public A {
    public:
        void f() override { cout << "B::f()" << "\n"; }
        void g() override { cout << "B::g()" << "\n"; }
    };

    class C : public B {
    public:
        void f() override { cout << "C::f()" << "\n"; }
        void g() override { cout << "C::g()" << "\n"; }
    };

    int main() {
        A* a = new A();
        a->f();
        a->g();

        B* b = new B();
        // Overwrite a's vptr with b's. This is undefined behavior in standard C++;
        // it relies on the vptr being the first 8 bytes of the object in this 64-bit build.
        *(std::uint64_t*)a = *(std::uint64_t*)b;
        a->f();
        a->g();

        A aa = *a;  // copy by value: aa is a plain A object
        aa.f();
        aa.g();

        a = new C();
        a->f();
        a->g();
    }

A::f()
A::g()
B::f()
B::g()
A::f()
A::g()
C::f()
C::g()
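The first pair of outputs is easy to understand: it is a plain call on an A object. In the second pair, the vptr of the object that a points to has been overwritten with B's vptr, so the lookup lands in B's vtable. In the fourth pair, a really does point to a C object, so the functions overridden in C are called. The third pair uses value semantics: `A aa = *a` copy-constructs a new A whose vptr refers to A's vtable (the overwritten vptr is not copied), so no polymorphism shows up.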
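The value-semantics case can also be reproduced without touching the vptr at all. The following is a minimal sketch; the types `Base` and `Derived` and the two helper functions are hypothetical names made up for illustration, and they return the name instead of printing it so the result is easy to check.

```cpp
#include <string>

// Hypothetical types for illustration; they mirror A and B above.
struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// Reference semantics: the dynamic type of the argument decides which
// override runs.
std::string name_by_reference(const Base& b) { return b.name(); }

// Value semantics: the parameter is a fresh Base copy-constructed from the
// argument (slicing), so its dynamic type is always Base.
std::string name_by_value(Base b) { return b.name(); }
```

Passing a `Derived` object to `name_by_reference` yields "Derived", while passing the same object to `name_by_value` yields "Base", which matches the third pair of outputs above.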
Now let's look at a second piece of code:

    #include <cstdint>
    #include <iostream>

    using std::cout;

    class A {
    public:
        virtual void f() { cout << "A::f()" << "\n"; }
        virtual void g() { cout << "A::g()" << "\n"; }
    };

    class B : public A {
    public:
        void f() override { cout << "B::f()" << "\n"; }
        void g() override { cout << "B::g()" << "\n"; }
    };

    class C : public B {
    public:
        void f() override { cout << "C::f()" << "\n"; }
        void g() override { cout << "C::g()" << "\n"; }
    };

    int main() {
        A* a = new B();
        C* c = new C();

        a->f();
        a->g();

        c->f();
        c->g();

        // Overwrite a's vptr with c's (undefined behavior, demonstration only).
        *(std::uint64_t*)a = *(std::uint64_t*)c;
        a->f();
        a->g();

        static_cast<B*>(a)->f();
        static_cast<B*>(a)->g();
    }

The output follows the same principle as before:
B::f()
B::g()
C::f()
C::g()
C::f()
C::g()
C::f()
C::g()
The assembly makes the mechanism easy to see: every call goes through the vtable. Because this is a 64-bit program, each vtable slot is 8 bytes, so f is called through `[rax]` and g through `[rax+8]`.

        a->f();
    00007FF7E38F4F17  48 8B 03     mov   rax,qword ptr [rbx]
    00007FF7E38F4F1A  48 8B CB     mov   rcx,rbx
    00007FF7E38F4F1D  FF 10        call  qword ptr [rax]
        a->g();
    00007FF7E38F4F1F  48 8B 03     mov   rax,qword ptr [rbx]
    00007FF7E38F4F22  48 8B CB     mov   rcx,rbx
    00007FF7E38F4F25  FF 50 08     call  qword ptr [rax+8]

        c->f();
    00007FF7E38F4F28  48 8B 07     mov   rax,qword ptr [rdi]
    00007FF7E38F4F2B  48 8B CF     mov   rcx,rdi
    00007FF7E38F4F2E  FF 10        call  qword ptr [rax]
        c->g();
    00007FF7E38F4F30  48 8B 07     mov   rax,qword ptr [rdi]
    00007FF7E38F4F33  48 8B CF     mov   rcx,rdi
    00007FF7E38F4F36  FF 50 08     call  qword ptr [rax+8]

        *(std::uint64_t*)a = *(std::uint64_t*)c;
    00007FF7E38F4F39  48 8B 07     mov   rax,qword ptr [rdi]
    00007FF7E38F4F3C  48 89 03     mov   qword ptr [rbx],rax
        a->f();
    00007FF7E38F4F3F  48 8B CB     mov   rcx,rbx
    00007FF7E38F4F42  FF 10        call  qword ptr [rax]
        a->g();
    00007FF7E38F4F44  48 8B 03     mov   rax,qword ptr [rbx]
    00007FF7E38F4F47  48 8B CB     mov   rcx,rbx
    00007FF7E38F4F4A  FF 50 08     call  qword ptr [rax+8]

        static_cast<B*>(a)->f();
    00007FF7E38F4F4D  48 8B 03     mov   rax,qword ptr [rbx]
    00007FF7E38F4F50  48 8B CB     mov   rcx,rbx
    00007FF7E38F4F53  FF 10        call  qword ptr [rax]
        static_cast<B*>(a)->g();
    00007FF7E38F4F55  48 8B 03     mov   rax,qword ptr [rbx]
    00007FF7E38F4F58  48 8B CB     mov   rcx,rbx
    00007FF7E38F4F5B  FF 50 08     call  qword ptr [rax+8]

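To make the "load the vptr, then call through a slot" pattern concrete, here is a hand-written model of what such a lookup amounts to. It is only a sketch under assumptions: the names `VTable`, `Object`, `call_f` and `call_g` are invented for illustration, and real ABIs keep more in a vtable than two function pointers.

```cpp
#include <string>

struct Object;  // forward declaration

// Hypothetical model of a vtable: one function-pointer slot per virtual function.
struct VTable {
    std::string (*f)(const Object*);  // slot 0
    std::string (*g)(const Object*);  // slot 1 (offset +8 with 8-byte pointers)
};

// The vptr is the first member, like the hidden pointer at the start of a
// polymorphic object in the build shown above.
struct Object {
    const VTable* vptr;
};

std::string a_f(const Object*) { return "A::f()"; }
std::string a_g(const Object*) { return "A::g()"; }
std::string b_f(const Object*) { return "B::f()"; }
std::string b_g(const Object*) { return "B::g()"; }

const VTable vtable_A{a_f, a_g};
const VTable vtable_B{b_f, b_g};

// Models: load the vptr from the object, then call through slot 0.
std::string call_f(const Object* obj) { return obj->vptr->f(obj); }

// Models: load the vptr from the object, then call through slot 1.
std::string call_g(const Object* obj) { return obj->vptr->g(obj); }
```

In this model, assigning `obj.vptr = &vtable_B` is the well-defined counterpart of the `*(std::uint64_t*)a = *(std::uint64_t*)c` trick: the object stays the same, but every later call is routed through a different table.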
Now let's modify the code slightly: mark B's f as `final`, and mark class C itself as `final`. Since B::f is final, C can no longer override it.

    #include <cstdint>
    #include <iostream>

    using std::cout;

    class A {
    public:
        virtual void f() { cout << "A::f()" << "\n"; }
        virtual void g() { cout << "A::g()" << "\n"; }
    };

    class B : public A {
    public:
        void f() override final { cout << "B::f()" << "\n"; }
        void g() override { cout << "B::g()" << "\n"; }
    };

    class C final : public B {
    public:
        void g() override { cout << "C::g()" << "\n"; }
    };

    int main() {
        A* a = new B();
        C* c = new C();

        a->f();
        a->g();

        c->f();
        c->g();

        // Overwrite a's vptr with c's (undefined behavior, demonstration only).
        *(std::uint64_t*)a = *(std::uint64_t*)c;
        a->f();
        a->g();

        static_cast<B*>(a)->f();
        static_cast<B*>(a)->g();
    }

The output is shown below. Note that it has changed: C inherits B::f instead of overriding it, so calling f on a C object now prints B::f().
B::f()
B::g()
B::f()
C::g()
B::f()
C::g()
B::f()
C::g()
Going back to the assembly shows clearly what happened:

        a->f();
    00007FF619A64F07  48 8B 03           mov   rax,qword ptr [rbx]
    00007FF619A64F0A  48 8B CB           mov   rcx,rbx
    00007FF619A64F0D  FF 10              call  qword ptr [rax]
        a->g();
    00007FF619A64F0F  48 8B 03           mov   rax,qword ptr [rbx]
    00007FF619A64F12  48 8B CB           mov   rcx,rbx
    00007FF619A64F15  FF 50 08           call  qword ptr [rax+8]

        c->f();
    00007FF619A64F18  48 8B CF           mov   rcx,rdi
    00007FF619A64F1B  E8 7D C9 FF FF     call  B::f (07FF619A6189Dh)
        c->g();
    00007FF619A64F20  48 8B CF           mov   rcx,rdi
    00007FF619A64F23  E8 1A C5 FF FF     call  C::g (07FF619A61442h)

        *(std::uint64_t*)a = *(std::uint64_t*)c;
    00007FF619A64F28  48 8B 07           mov   rax,qword ptr [rdi]
    00007FF619A64F2B  48 89 03           mov   qword ptr [rbx],rax
        a->f();
    00007FF619A64F2E  48 8B CB           mov   rcx,rbx
    00007FF619A64F31  FF 10              call  qword ptr [rax]
        a->g();
    00007FF619A64F33  48 8B 03           mov   rax,qword ptr [rbx]
    00007FF619A64F36  48 8B CB           mov   rcx,rbx
    00007FF619A64F39  FF 50 08           call  qword ptr [rax+8]

        static_cast<B*>(a)->f();
    00007FF619A64F3C  48 8B CB           mov   rcx,rbx
    00007FF619A64F3F  E8 59 C9 FF FF     call  B::f (07FF619A6189Dh)
        static_cast<B*>(a)->g();
    00007FF619A64F44  48 8B 03           mov   rax,qword ptr [rbx]
    00007FF619A64F47  48 8B CB           mov   rcx,rbx
    00007FF619A64F4A  FF 50 08           call  qword ptr [rax+8]

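For the first pair of outputs, the base-class pointer a points to a B object and its static type is A*, so the calls go through the vtable, exactly as before.
For the second pair, C is declared `final`, which guarantees that no other class can derive from C. Calls through a C* can therefore be resolved at compile time, and the vtable is no longer consulted: both calls become direct calls to B::f and C::g.
For the third pair, a now carries C's vptr, but its static type is still A*, so the calls keep going through the vtable.
For the fourth pair, we cast a to B*. Because B::f is marked `final`, no class derived from B can override f, so the call to f through a B* is also known at compile time and skips the vtable. g is not final in B, so the call to g still goes through the vtable.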
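The function-level `final` case from the fourth pair can be isolated in a small sketch. The hierarchy below (`Animal`, `Dog`, `Puppy`) and the two helper functions are hypothetical names used only for illustration; whether a particular compiler really emits a direct call is an assumption that has to be checked in the generated assembly.

```cpp
#include <string>

struct Animal {
    virtual ~Animal() = default;
    virtual std::string sound() const { return "..."; }
    virtual std::string kind() const { return "animal"; }
};

struct Dog : Animal {
    // final: classes derived from Dog cannot override sound().
    std::string sound() const final { return "woof"; }
    std::string kind() const override { return "dog"; }
};

struct Puppy final : Dog {
    // sound() cannot be overridden here; only kind() can.
    std::string kind() const override { return "puppy"; }
};

// Through a Dog&, sound() has exactly one possible target (Dog::sound),
// so the compiler is allowed to call it directly.
std::string sound_via_dog(const Dog& d) { return d.sound(); }

// kind() may still be overridden by a class derived from Dog, so this call
// generally needs a vtable lookup.
std::string kind_via_dog(const Dog& d) { return d.kind(); }
```

Passing a `Puppy` to both helpers gives "woof" and "puppy": the inherited final function and the overridden one, the same split as `B::f()` and `C::g()` in the output above.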
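From the examples above, it is easy to summarize what happens when `final` is used:
- When a pointer may point to an object of a derived class, calling a virtual function through it still requires a vtable lookup, because the target cannot be determined at compile time.
- When the pointer's static type is the final class itself, or the function is final in that type, the compiler can optimize the call: once the target is known at compile time, the vtable is no longer consulted.
This optimization is called devirtualization.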
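The two cases in the summary can be put side by side in a short sketch. The names `Shape`, `Square`, `area_via_base` and `area_via_final` are hypothetical, and the comments describe what a compiler is permitted to do, not what every compiler does at every optimization level.

```cpp
struct Shape {
    virtual ~Shape() = default;
    virtual int area() const { return 0; }
};

// `final` on the class: no type can derive from Square.
struct Square final : Shape {
    explicit Square(int side) : side_(side) {}
    int area() const override { return side_ * side_; }

private:
    int side_;
};

// Static type is Shape: the dynamic type is unknown here, so the call
// generally has to go through the vtable.
int area_via_base(const Shape& s) { return s.area(); }

// Static type is the final class Square: the only possible target is
// Square::area, so the compiler may call it directly and inline it.
int area_via_final(const Square& s) { return s.area(); }
```

Both functions return the same value for the same object; the difference is only in how the call can be compiled, which ties back to the point at the start that the real cost of virtual calls is the lost inlining opportunity.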