在实践中，unordered_map比map更快吗？

Question

在实践中，unordered_map比map更快吗？

c++performancedictionaryunordered-map

31

当然，unordered_map的查找性能平均为常数级别，而map的查找性能为O（logN）。

但是，为了在unordered_map中找到一个对象，我们需要：

哈希要查找的键。
将该键与同一桶中的每个键进行equality_compare比较。

而在map中，我们只需要将要查找的键与log2（N）个键进行less_than比较，其中N是map中项目的数量。

考虑到哈希函数会增加开销，而equality_compare不比less_than比较便宜，因此我想知道实际的性能差异会有多大。

为了不打扰社区并回答自己的问题，我写了一个测试。

我已经分享了以下结果，以防其他人发现这很有趣或有用。

当然，如果有人能够并愿意添加更多信息，则可以提出更多答案。

- Richard Hodges

6

map 的问题并不在于它本身的 log N，而是在于遍历树时每次内存访问都是基本随机的。当 map 很小时，这不重要，但当 map 很大时，它就占主导地位。（访问缓存和内存之间的差异是一个数量级或两个数量级；参见例如 https://dev59.com/VW855IYBdhLWcg3w6IvV/。而且这种差异往往会随着 CPU 代数的增加而增加，因为相关的物理是局部的。）相比指针追踪，等于/小于操作是微不足道的。 - Nemo

@Nemo，请看一下我的测试结果，特别是 flat_map 和 map 的比较。乍一看，与（大！）排序向量相比，在 map 中指针追踪似乎导致查找时间翻倍。然而，这里可能还有其他因素在起作用。例如，clang 更愿意内联整个对向量的 lower_bound 搜索，而不是对 map 的 at 搜索。 - Richard Hodges

2个回答

9

在这个测试中，我使用了苹果clang的-O3编译选项，并采取了以下措施以确保测试的公正性：

对于通过虚表搜索的每个结果，调用一个接收器函数，以防止优化程序完全内联整个搜索！
并行运行包含相同数据、相同顺序的3种不同类型的映射的测试。这意味着如果一个测试开始“领先”，它会进入缓存未命中区域进行搜索（请参见代码）。这意味着没有一个测试会因为遇到“热”缓存而获得不公平的优势。
参数化键大小（因此复杂度）
参数化地图大小
测试三种不同类型的映射（包含相同数据）- 无序映射、映射和按键/值对排序的向量。
检查汇编输出，以确保优化程序无法通过死代码分析优化整个逻辑块。

下面是代码：

#include <iostream>
#include <random>
#include <algorithm>
#include <string>
#include <vector>
#include <map>
#include <unordered_map>
#include <chrono>
#include <tuple>
#include <future>
#include <stdexcept>
#include <sstream>

using namespace std;

// this sets the length of the string we will be using as a key.
// modify this to test whether key complexity changes the performance ratios
// of the various maps
static const size_t key_length = 20;

// the number of keys we will generate (the size of the test)
const size_t nkeys = 1000000;


// the types of map we will test
unordered_map<string, string> unordered;
map<string, string> ordered;
vector<pair<string, string>> flat_map;

// a vector of all keys, which we can shuffle in order to randomise
// access order of all our maps consistently
vector<string> keys;

// use a virtual method to prevent the optimiser from detecting that
// our sink function actually does nothing. otherwise it might skew the test
struct string_user
{
    virtual void sink(const std::string&) = 0;
    virtual ~string_user() = default;
};

struct real_string_user : string_user
{
    virtual void sink(const std::string&) override
    {
        
    }
};

struct real_string_user_print : string_user
{
    virtual void sink(const std::string& s) override
    {
        cout << s << endl;
    }
};

// generate a sink from a string - this is a runtime operation and therefore
// prevents the optimiser from realising that the sink does nothing
std::unique_ptr<string_user> make_sink(const std::string& name)
{
    if (name == "print")
    {
        return make_unique<real_string_user_print>();
    }
    if (name == "noprint")
    {
        return make_unique<real_string_user>();
    }
    throw logic_error(name);
}

// generate a random key, given a random engine and a distribution
auto gen_string = [](auto& engine, auto& dist)
{
    std::string result(key_length, ' ');
    generate(begin(result), end(result), [&] {
        return dist(engine);
    });
    return result;
};

// comparison predicate for our flat map.
struct pair_less
{
    bool operator()(const pair<string, string>& l, const string& r) const {
        return l.first < r;
    }

    bool operator()(const string& l, const pair<string, string>& r) const {
        return l < r.first;
    }
};

int main()
{
    // generate the sink, preventing the optimiser from realising what it
    // does.
    stringstream ss;
    ss << "noprint";
    string arg;
    ss >> arg;
    auto puser = make_sink(arg);
    
    // generate keys
    auto eng = std::default_random_engine(std::random_device()());
    auto alpha_dist = std::uniform_int_distribution<char>('A', 'Z');
    
    for (size_t i = 0 ; i < nkeys ; ++i)
    {
        bool inserted = false;
        auto value = to_string(i);
        while(!inserted) {
            // generate a key
            auto key = gen_string(eng, alpha_dist);
            // try to store it in the unordered map
            // if it already exists, force a regeneration
            // otherwise also store it in the ordered map and the flat map
            tie(ignore, inserted) = unordered.emplace(key, value);
            if (inserted) {
                flat_map.emplace_back(key, value);
                ordered.emplace(key, std::move(value));
                // record the key for later use
                keys.emplace_back(std::move(key));
            }
        }
    }
    // turn our vector 'flat map' into an actual flat map by sorting it by pair.first. This is the key.
    sort(begin(flat_map), end(flat_map),
         [](const auto& l, const auto& r) { return l.first < r.first; });
    
    // shuffle the keys to randomise access order
    shuffle(begin(keys), end(keys), eng);

    // spawn a thread to time access to the unordered map
    auto unordered_future = async(launch::async, [&]()
                                  {
                                      auto start_time = chrono::system_clock::now();

                                      for (auto const& key : keys)
                                      {
                                          puser->sink(unordered.at(key));
                                      }
                                      
                                      auto stop_time = chrono::system_clock::now();
                                      auto diff =  stop_time - start_time;
                                      return diff;
                                  });
    
    // spawn a thread to time access to the ordered map
    auto ordered_future = async(launch::async, [&]
                                {
                                    
                                    auto start_time = chrono::system_clock::now();
                                    
                                    for (auto const& key : keys)
                                    {
                                        puser->sink(ordered.at(key));
                                    }
                                    
                                    auto stop_time = chrono::system_clock::now();
                                    auto diff =  stop_time - start_time;
                                    return diff;
                                });

    // spawn a thread to time access to the flat map
    auto flat_future = async(launch::async, [&]
                                {
                                    
                                    auto start_time = chrono::system_clock::now();
                                    
                                    for (auto const& key : keys)
                                    {
                                        auto i = lower_bound(begin(flat_map),
                                                               end(flat_map),
                                                               key,
                                                               pair_less());
                                        if (i != end(flat_map) && i->first == key)
                                            puser->sink(i->second);
                                        else
                                            throw invalid_argument(key);
                                    }
                                    
                                    auto stop_time = chrono::system_clock::now();
                                    auto diff =  stop_time - start_time;
                                    return diff;
                                });

    // synchronise all the threads and get the timings
    auto ordered_time = ordered_future.get();
    auto unordered_time = unordered_future.get();
    auto flat_time = flat_future.get();
 
    // print
    cout << "  ordered time: " << ordered_time.count() << endl;
    cout << "unordered time: " << unordered_time.count() << endl;
    cout << " flat map time: " << flat_time.count() << endl;
    
    return 0;
}

结果：

  ordered time: 972711
unordered time: 335821
 flat map time: 559768

从结果可以看出，unordered_map明显比map和排序后的pair vector更快。而pair vector比map解决方案快两倍，这很有趣，因为lower_bound和map::at的复杂度几乎相同。

简述

在此测试中，无序映射查找速度约为有序映射的3倍，而排序后的向量比映射更快。

我实际上对它的速度有些震惊。

- Richard Hodges

1

我指的是更改 nkeys 值。 - Disillusioned

4

看起来你的“平面图”测试实际上搜索了排序向量和有序映射。所以我有点惊讶它的时间相同。实际上，这可能与同时运行测试有关。我个人认为，如果测试不是同时运行，以消除争用作为一个因素，会感觉更好。此外，平面图测试不应该对ordered对象做任何事情（除非我误解了什么）。 - Michael Burr

1

@RichardHodges：你的代码中没有显式的互斥锁，但是处理器数量仍然有限，它们之间存在竞争。 - Mooing Duck

2

@Richard 关于“测试速度太快以至于无法测量”的问题，检查复杂度的整个目的是为了了解不同N值的性能影响。特别是，较低复杂度的好处在于，无论常数开销有多大，都会有一个阈值N，从该阈值开始，较低的复杂度开始表现更好。为了使用较低的N值进行测试，您需要重复测试多次以获得可衡量的结果。 - Disillusioned

2

我刚刚在GCC和MSVC上运行了基准测试。在地图中有大约20个项目时，无序使用GCC的速度是2倍，但已经接近MSVC的3倍。MSVC对std::uniform_int_distribution<char>提出了抱怨，但使用<int>可以解决问题。 - Octo Poulos

显示剩余14条评论

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Richard Hodges · Accepted Answer

针对关于与未搜索数量相关的性能问题的提问，我已经重构了测试以参数化此问题。

示例结果：

searches	set_size	miss	ordered	unordered	flat_map
1000000	0	100%	4384	12901	681
1000000	99	99.99%	89127	42615	86091
1000000	172	99.98%	101283	53468	96008
1000000	303	99.97%	112747	53211	107343
1000000	396	99.96%	124179	59655	112687
1000000	523	99.95%	132180	51133	121669
1000000	599	99.94%	135850	55078	121072
1000000	695	99.93%	140204	60087	124961
1000000	795	99.92%	146071	64790	127873
1000000	916	99.91%	154461	50944	133194
1000000	988	99.9%	156327	54094	134288

关键：

searches = number of searches performed against each map
set_size = how big each map is (and therefore how many of the searches will result in a hit)
miss = the probability of generating a missed search. Used for generating searches and set_size.
ordered = the time spent searching the ordered map
unordered = the time spent searching the unordered_map
flat_map = the time spent searching the flat map

note: time is measured in std::system_clock::duration ticks.

TL;DR

结果：当地图中有数据时，unordered_map显示出其优越性。唯一比有序地图表现更差的情况是地图为空的时候。

这是新代码：

#include <iostream>
#include <iomanip>
#include <random>
#include <algorithm>
#include <string>
#include <vector>
#include <map>
#include <unordered_map>
#include <unordered_set>
#include <chrono>
#include <tuple>
#include <future>
#include <stdexcept>
#include <sstream>

using namespace std;

// this sets the length of the string we will be using as a key.
// modify this to test whether key complexity changes the performance ratios
// of the various maps
static const size_t key_length = 20;

// the number of keys we will generate (the size of the test)
const size_t nkeys = 1000000;



// use a virtual method to prevent the optimiser from detecting that
// our sink function actually does nothing. otherwise it might skew the test
struct string_user
{
    virtual void sink(const std::string&) = 0;
    virtual ~string_user() = default;
};

struct real_string_user : string_user
{
    virtual void sink(const std::string&) override
    {
        
    }
};

struct real_string_user_print : string_user
{
    virtual void sink(const std::string& s) override
    {
        cout << s << endl;
    }
};

// generate a sink from a string - this is a runtime operation and therefore
// prevents the optimiser from realising that the sink does nothing
std::unique_ptr<string_user> make_sink(const std::string& name)
{
    if (name == "print")
    {
        return make_unique<real_string_user_print>();
    }
    if (name == "noprint")
    {
        return make_unique<real_string_user>();
    }
    throw logic_error(name);
}

// generate a random key, given a random engine and a distribution
auto gen_string = [](auto& engine, auto& dist)
{
    std::string result(key_length, ' ');
    generate(begin(result), end(result), [&] {
        return dist(engine);
    });
    return result;
};

// comparison predicate for our flat map.
struct pair_less
{
    bool operator()(const pair<string, string>& l, const string& r) const {
        return l.first < r;
    }

    bool operator()(const string& l, const pair<string, string>& r) const {
        return l < r.first;
    }
};

template<class F>
auto time_test(F&& f, const vector<string> keys)
{
    auto start_time = chrono::system_clock::now();
    
    for (auto const& key : keys)
    {
        f(key);
    }
    
    auto stop_time = chrono::system_clock::now();
    auto diff =  stop_time - start_time;
    return diff;
}

struct report_key
{
    size_t nkeys;
    int miss_chance;
};

std::ostream& operator<<(std::ostream& os, const report_key& key)
{
    return os << "miss=" << setw(2) << key.miss_chance << "%";
}

void run_test(string_user& sink, size_t nkeys, double miss_prob)
{
    // the types of map we will test
    unordered_map<string, string> unordered;
    map<string, string> ordered;
    vector<pair<string, string>> flat_map;
    
    // a vector of all keys, which we can shuffle in order to randomise
    // access order of all our maps consistently
    vector<string> keys;
    unordered_set<string> keys_record;

    // generate keys
    auto eng = std::default_random_engine(std::random_device()());
    auto alpha_dist = std::uniform_int_distribution<char>('A', 'Z');
    auto prob_dist = std::uniform_real_distribution<double>(0, 1.0 - std::numeric_limits<double>::epsilon());
    
    auto generate_new_key = [&] {
        while(true) {
            // generate a key
            auto key = gen_string(eng, alpha_dist);
            // try to store it in the unordered map
            // if it already exists, force a regeneration
            // otherwise also store it in the ordered map and the flat map
            if(keys_record.insert(key).second) {
                return key;
            }
        }
    };
    
    for (size_t i = 0 ; i < nkeys ; ++i)
    {
        bool inserted = false;
        auto value = to_string(i);

        auto key = generate_new_key();
        if (prob_dist(eng) >= miss_prob) {
            unordered.emplace(key, value);
            flat_map.emplace_back(key, value);
            ordered.emplace(key, std::move(value));
        }
        // record the key for later use
        keys.emplace_back(std::move(key));
    }
    // turn our vector 'flat map' into an actual flat map by sorting it by pair.first. This is the key.
    sort(begin(flat_map), end(flat_map),
         [](const auto& l, const auto& r) { return l.first < r.first; });
    
    // shuffle the keys to randomise access order
    shuffle(begin(keys), end(keys), eng);
    
    auto unordered_lookup = [&](auto& key) {
        auto i = unordered.find(key);
        if (i != end(unordered)) {
            sink.sink(i->second);
        }
    };
    
    auto ordered_lookup = [&](auto& key) {
        auto i = ordered.find(key);
        if (i != end(ordered)) {
            sink.sink(i->second);
        }
    };
    
    auto flat_map_lookup = [&](auto& key) {
        auto i = lower_bound(begin(flat_map),
                             end(flat_map),
                             key,
                             pair_less());
        if (i != end(flat_map) && i->first == key) {
            sink.sink(i->second);
        }
    };
    
    // spawn a thread to time access to the unordered map
    auto unordered_future = async(launch::async,
                                  [&]()
                                  {
                                      return time_test(unordered_lookup, keys);
                                  });
    
    // spawn a thread to time access to the ordered map
    auto ordered_future = async(launch::async, [&]
                                {
                                    return time_test(ordered_lookup, keys);
                                });
    
    // spawn a thread to time access to the flat map
    auto flat_future = async(launch::async, [&]
                             {
                                 return time_test(flat_map_lookup, keys);
                             });
    
    // synchronise all the threads and get the timings
    auto ordered_time = ordered_future.get();
    auto unordered_time = unordered_future.get();
    auto flat_time = flat_future.get();
    
    cout << "searches=" << setw(7) << nkeys;
    cout << " set_size=" << setw(7) << unordered.size();
    cout << " miss=" << setw(7) << setprecision(6) << miss_prob * 100.0 << "%";
    cout << " ordered=" << setw(7) << ordered_time.count();
    cout << " unordered=" << setw(7) << unordered_time.count();
    cout << " flat_map=" << setw(7) << flat_time.count() << endl;

}

int main()
{
    // generate the sink, preventing the optimiser from realising what it
    // does.
    stringstream ss;
    ss << "noprint";
    string arg;
    ss >> arg;
    auto puser = make_sink(arg);
    
    for (double chance = 1.0 ; chance >= 0.0 ; chance -= 0.0001)
    {
        run_test(*puser, 1000000, chance);
    }
    
    
    return 0;
}