http://quick-bench.com/N35RRw_IFO74ZihFbtMu4BIKCJg
如果启用优化,您将获得不同的结果:
http://quick-bench.com/p6KIUtRxZdHJeiFiGI8gjbOumoc
这是我用于基准测试的代码;我相信有更好的实现方式使用变体而不是虚拟关键字(继承 vs. std::variant): < p > 删除旧代码;查看更新 有人能解释一下如何最好地使用 std::variant 实现这个使用案例,让我进行测试和基准测试吗:
我目前正在实现RFC 3986,它是'URI',对于我的用例,这个类将更多地用作常量,并且可能不会经常更改,用户更有可能使用该类来查找URI的每个特定部分,而不是创建URI; 因此,使用 std::string_view 而不是将URI的每个段分别放在自己的 std :: string 中是有意义的。问题是我需要为此实现两个类; 当我只需要一个const版本时使用一个类;当用户想要创建URI而不是提供一个URI并搜索它时使用另一个类。
所以我使用了一个模板
来解决这个问题,但是它也有自己的问题;但后来我意识到我可以使用std::variant<std::string, std::string_view>
(或者可能是std::variant<CustomStructHoldingAllThePieces, std::string_view>
);因此我开始研究是否实际上使用变量会有帮助。从这些结果来看,如果我不想实现两个不同的const_uri
和uri
类,则使用继承和virtual
是最好的选择。
你认为我应该怎么做?
更新(2)
感谢 @gan_ 提出并解决了我在基准测试代码中遇到的变量提升问题。
http://quick-bench.com/Mcclomh03nu8nDCgT3T302xKnXY
我对 try-catch 地狱的结果感到惊讶,但幸好有 这个评论,现在我理解了。
更新(3)
我移除了 try-catch
方法,因为它真的很糟糕;同时也随机更改了选定的值,看起来我得到了更加真实的基准测试结果。看来 virtual
并不是正确的答案。
http://quick-bench.com/o92Yrt0tmqTdcvufmIpu_fIfHt0
http://quick-bench.com/FFbe3bsIpdFsmgKfm94xGNFKVKs(没有内存泄漏哈哈)
更新(4)
我去掉了生成随机数的开销(在上次更新中已经做过了,但似乎我拿错了基准测试的URL),并添加了一个EmptyRandom来理解生成随机数的基准。还对Virtual进行了一些小改动,但我不认为这会影响什么。
http://quick-bench.com/EmhM-S-xoA0LABYK6yrMyBb8UeI
http://quick-bench.com/5hBZprSRIRGuDaBZ_wj0cOwnNhw(去掉Virtual,这样你可以更好地比较其他内容)
更新 (5)
正如Jorge Bellon在评论中所说的那样,我没有考虑到分配的成本;因此,我将每个基准测试都转换为使用指针。当然,这种间接性会影响性能,但现在更加公平。所以现在循环中没有分配。
以下是代码:
删除了旧代码; 请查看更新
我迄今为止运行了一些基准测试。看起来g++更好地优化了代码:
-------------------------------------------------------------------
Benchmark Time CPU Iterations
-------------------------------------------------------------------
EmptyRandom 0.756 ns 0.748 ns 746067433
TradeSpaceForPerformance 2.87 ns 2.86 ns 243756914
Virtual 12.5 ns 12.4 ns 60757698
Index 7.85 ns 7.81 ns 99243512
GetIf 8.20 ns 8.18 ns 92393200
HoldsAlternative 7.08 ns 7.07 ns 96959764
ConstexprVisitor 11.3 ns 11.2 ns 60152725
StructVisitor 10.7 ns 10.6 ns 60254088
Overload 10.3 ns 10.3 ns 58591608
对于clang:
-------------------------------------------------------------------
Benchmark Time CPU Iterations
-------------------------------------------------------------------
EmptyRandom 1.99 ns 1.99 ns 310094223
TradeSpaceForPerformance 8.82 ns 8.79 ns 87695977
Virtual 12.9 ns 12.8 ns 51913962
Index 13.9 ns 13.8 ns 52987698
GetIf 15.1 ns 15.0 ns 48578587
HoldsAlternative 13.1 ns 13.1 ns 51711783
ConstexprVisitor 13.8 ns 13.8 ns 49120024
StructVisitor 14.5 ns 14.5 ns 52679532
Overload 17.1 ns 17.1 ns 42553366
现在,对于clang来说,最好使用虚拟继承,但对于g ++来说,最好使用holds_alternative
或get_if
,但总体而言,std::visit
似乎不是我目前所有基准测试的一个好选择。
我认为,如果模式匹配(能够检查除整数文字之外的更多内容的switch语句)被添加到C++中,我们将编写更清晰、更易于维护的代码。
我想知道package.index()
的结果。它不应该更快吗?它是做什么的?
Clang版本:http://quick-bench.com/cl0HFmUes2GCSE1w04qt4Rqj6aI
使用One one
而不是auto one = new One
的版本,基于Maxim Egorushkin's comment: http://quick-bench.com/KAeT00__i2zbmpmUHDutAfiD6-Q(结果没有发生太大变化)
更新(6)
我做了一些更改,现在不同编译器的结果差异很大。但似乎std::get_if
和std::holds_alternatives
是最好的解决方案。virtual
现在似乎在clang中表现最佳,这让我感到非常惊讶,因为我记得在gcc中virtual
表现更好。而且std::visit
完全不具竞争力,在最后一个基准测试中甚至比vtable查找还要糟糕。
以下是基准测试(使用GCC / Clang以及libstdc ++和libc ++运行):
http://quick-bench.com/LhdP-9y6CqwGxB-WtDlbG27o_5Y
#include <benchmark/benchmark.h>
#include <array>
#include <variant>
#include <random>
#include <functional>
#include <algorithm>
using namespace std;
struct One {
auto get () const { return 1; }
};
struct Two {
auto get() const { return 2; }
};
struct Three {
auto get() const { return 3; }
};
struct Four {
auto get() const { return 4; }
};
template<class... Ts> struct overload : Ts... { using Ts::operator()...; };
template<class... Ts> overload(Ts...) -> overload<Ts...>;
std::random_device dev;
std::mt19937 rng(dev());
std::uniform_int_distribution<std::mt19937::result_type> random_pick(0,3); // distribution in range [1, 6]
template <std::size_t N>
std::array<int, N> get_random_array() {
std::array<int, N> item;
for (int i = 0 ; i < N; i++)
item[i] = random_pick(rng);
return item;
}
template <typename T, std::size_t N>
std::array<T, N> get_random_objects(std::function<T(decltype(random_pick(rng)))> func) {
std::array<T, N> a;
std::generate(a.begin(), a.end(), [&] {
return func(random_pick(rng));
});
return a;
}
static void TradeSpaceForPerformance(benchmark::State& state) {
One one;
Two two;
Three three;
Four four;
int index = 0;
auto ran_arr = get_random_array<50>();
int r = 0;
auto pick_randomly = [&] () {
index = ran_arr[r++ % ran_arr.size()];
};
pick_randomly();
for (auto _ : state) {
int res;
switch (index) {
case 0:
res = one.get();
break;
case 1:
res = two.get();
break;
case 2:
res = three.get();
break;
case 3:
res = four.get();
break;
}
benchmark::DoNotOptimize(index);
benchmark::DoNotOptimize(res);
pick_randomly();
}
}
// Register the function as a benchmark
BENCHMARK(TradeSpaceForPerformance);
static void Virtual(benchmark::State& state) {
struct Base {
virtual int get() const noexcept = 0;
virtual ~Base() {}
};
struct A final: public Base {
int get() const noexcept override { return 1; }
};
struct B final : public Base {
int get() const noexcept override { return 2; }
};
struct C final : public Base {
int get() const noexcept override { return 3; }
};
struct D final : public Base {
int get() const noexcept override { return 4; }
};
Base* package = nullptr;
int r = 0;
auto packages = get_random_objects<Base*, 50>([&] (auto r) -> Base* {
switch(r) {
case 0: return new A;
case 1: return new B;
case 3: return new C;
case 4: return new D;
default: return new C;
}
});
auto pick_randomly = [&] () {
package = packages[r++ % packages.size()];
};
pick_randomly();
for (auto _ : state) {
int res = package->get();
benchmark::DoNotOptimize(package);
benchmark::DoNotOptimize(res);
pick_randomly();
}
for (auto &i : packages)
delete i;
}
BENCHMARK(Virtual);
static void FunctionPointerList(benchmark::State& state) {
One one;
Two two;
Three three;
Four four;
using type = std::function<int()>;
std::size_t index;
auto packages = get_random_objects<type, 50>([&] (auto r) -> type {
switch(r) {
case 0: return std::bind(&One::get, one);
case 1: return std::bind(&Two::get, two);
case 2: return std::bind(&Three::get, three);
case 3: return std::bind(&Four::get, four);
default: return std::bind(&Three::get, three);
}
});
int r = 0;
auto pick_randomly = [&] () {
index = r++ % packages.size();
};
pick_randomly();
for (auto _ : state) {
int res = packages[index]();
benchmark::DoNotOptimize(index);
benchmark::DoNotOptimize(res);
pick_randomly();
}
}
BENCHMARK(FunctionPointerList);
static void Index(benchmark::State& state) {
One one;
Two two;
Three three;
Four four;
using type = std::variant<One, Two, Three, Four>;
type* package = nullptr;
auto packages = get_random_objects<type, 50>([&] (auto r) -> type {
switch(r) {
case 0: return one;
case 1: return two;
case 2: return three;
case 3: return four;
default: return three;
}
});
int r = 0;
auto pick_randomly = [&] () {
package = &packages[r++ % packages.size()];
};
pick_randomly();
for (auto _ : state) {
int res;
switch (package->index()) {
case 0:
res = std::get<One>(*package).get();
break;
case 1:
res = std::get<Two>(*package).get();
break;
case 2:
res = std::get<Three>(*package).get();
break;
case 3:
res = std::get<Four>(*package).get();
break;
}
benchmark::DoNotOptimize(package);
benchmark::DoNotOptimize(res);
pick_randomly();
}
}
BENCHMARK(Index);
static void GetIf(benchmark::State& state) {
One one;
Two two;
Three three;
Four four;
using type = std::variant<One, Two, Three, Four>;
type* package = nullptr;
auto packages = get_random_objects<type, 50>([&] (auto r) -> type {
switch(r) {
case 0: return one;
case 1: return two;
case 2: return three;
case 3: return four;
default: return three;
}
});
int r = 0;
auto pick_randomly = [&] () {
package = &packages[r++ % packages.size()];
};
pick_randomly();
for (auto _ : state) {
int res;
if (auto item = std::get_if<One>(package)) {
res = item->get();
} else if (auto item = std::get_if<Two>(package)) {
res = item->get();
} else if (auto item = std::get_if<Three>(package)) {
res = item->get();
} else if (auto item = std::get_if<Four>(package)) {
res = item->get();
}
benchmark::DoNotOptimize(package);
benchmark::DoNotOptimize(res);
pick_randomly();
}
}
BENCHMARK(GetIf);
static void HoldsAlternative(benchmark::State& state) {
One one;
Two two;
Three three;
Four four;
using type = std::variant<One, Two, Three, Four>;
type* package = nullptr;
auto packages = get_random_objects<type, 50>([&] (auto r) -> type {
switch(r) {
case 0: return one;
case 1: return two;
case 2: return three;
case 3: return four;
default: return three;
}
});
int r = 0;
auto pick_randomly = [&] () {
package = &packages[r++ % packages.size()];
};
pick_randomly();
for (auto _ : state) {
int res;
if (std::holds_alternative<One>(*package)) {
res = std::get<One>(*package).get();
} else if (std::holds_alternative<Two>(*package)) {
res = std::get<Two>(*package).get();
} else if (std::holds_alternative<Three>(*package)) {
res = std::get<Three>(*package).get();
} else if (std::holds_alternative<Four>(*package)) {
res = std::get<Four>(*package).get();
}
benchmark::DoNotOptimize(package);
benchmark::DoNotOptimize(res);
pick_randomly();
}
}
BENCHMARK(HoldsAlternative);
static void ConstexprVisitor(benchmark::State& state) {
One one;
Two two;
Three three;
Four four;
using type = std::variant<One, Two, Three, Four>;
type* package = nullptr;
auto packages = get_random_objects<type, 50>([&] (auto r) -> type {
switch(r) {
case 0: return one;
case 1: return two;
case 2: return three;
case 3: return four;
default: return three;
}
});
int r = 0;
auto pick_randomly = [&] () {
package = &packages[r++ % packages.size()];
};
pick_randomly();
auto func = [] (auto const& ref) {
using type = std::decay_t<decltype(ref)>;
if constexpr (std::is_same<type, One>::value) {
return ref.get();
} else if constexpr (std::is_same<type, Two>::value) {
return ref.get();
} else if constexpr (std::is_same<type, Three>::value) {
return ref.get();
} else if constexpr (std::is_same<type, Four>::value) {
return ref.get();
} else {
return 0;
}
};
for (auto _ : state) {
auto res = std::visit(func, *package);
benchmark::DoNotOptimize(package);
benchmark::DoNotOptimize(res);
pick_randomly();
}
}
BENCHMARK(ConstexprVisitor);
static void StructVisitor(benchmark::State& state) {
struct VisitPackage
{
auto operator()(One const& r) { return r.get(); }
auto operator()(Two const& r) { return r.get(); }
auto operator()(Three const& r) { return r.get(); }
auto operator()(Four const& r) { return r.get(); }
};
One one;
Two two;
Three three;
Four four;
using type = std::variant<One, Two, Three, Four>;
type* package = nullptr;
auto packages = get_random_objects<type, 50>([&] (auto r) -> type {
switch(r) {
case 0: return one;
case 1: return two;
case 2: return three;
case 3: return four;
default: return three;
}
});
int r = 0;
auto pick_randomly = [&] () {
package = &packages[r++ % packages.size()];
};
pick_randomly();
auto vs = VisitPackage();
for (auto _ : state) {
auto res = std::visit(vs, *package);
benchmark::DoNotOptimize(package);
benchmark::DoNotOptimize(res);
pick_randomly();
}
}
BENCHMARK(StructVisitor);
static void Overload(benchmark::State& state) {
One one;
Two two;
Three three;
Four four;
using type = std::variant<One, Two, Three, Four>;
type* package = nullptr;
auto packages = get_random_objects<type, 50>([&] (auto r) -> type {
switch(r) {
case 0: return one;
case 1: return two;
case 2: return three;
case 3: return four;
default: return three;
}
});
int r = 0;
auto pick_randomly = [&] () {
package = &packages[r++ % packages.size()];
};
pick_randomly();
auto ov = overload {
[] (One const& r) { return r.get(); },
[] (Two const& r) { return r.get(); },
[] (Three const& r) { return r.get(); },
[] (Four const& r) { return r.get(); }
};
for (auto _ : state) {
auto res = std::visit(ov, *package);
benchmark::DoNotOptimize(package);
benchmark::DoNotOptimize(res);
pick_randomly();
}
}
BENCHMARK(Overload);
// BENCHMARK_MAIN();
GCC编译器的结果:
-------------------------------------------------------------------
Benchmark Time CPU Iterations
-------------------------------------------------------------------
TradeSpaceForPerformance 3.71 ns 3.61 ns 170515835
Virtual 12.20 ns 12.10 ns 55911685
FunctionPointerList 13.00 ns 12.90 ns 50763964
Index 7.40 ns 7.38 ns 136228156
GetIf 4.04 ns 4.02 ns 205214632
HoldsAlternative 3.74 ns 3.73 ns 200278724
ConstexprVisitor 12.50 ns 12.40 ns 56373704
StructVisitor 12.00 ns 12.00 ns 60866510
Overload 13.20 ns 13.20 ns 56128558
针对clang编译器的结果(我对此感到惊讶):
-------------------------------------------------------------------
Benchmark Time CPU Iterations
-------------------------------------------------------------------
TradeSpaceForPerformance 8.07 ns 7.99 ns 77530258
Virtual 7.80 ns 7.77 ns 77301370
FunctionPointerList 12.1 ns 12.1 ns 56363372
Index 11.1 ns 11.1 ns 69582297
GetIf 10.4 ns 10.4 ns 80923874
HoldsAlternative 9.98 ns 9.96 ns 71313572
ConstexprVisitor 11.4 ns 11.3 ns 63267967
StructVisitor 10.8 ns 10.7 ns 65477522
Overload 11.4 ns 11.4 ns 64880956
目前为止最好的基准测试(将会更新):http://quick-bench.com/LhdP-9y6CqwGxB-WtDlbG27o_5Y(同时也请查看GCC)
One
)。try/catch备用方案基本上是为在选择第一种方案时非常快速而设计的。如果我将你的代码更改为持有第三种方案(Three
),那么try/catch就会比其他任何版本慢数千倍。http://quick-bench.com/yIDclCuxfzjzBlCjbVCDNrp13aU - Holt