Boost ASIO segfault in release mode

3

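I wrote a small sample that shows the same problem as the program I'm writing: it runs fine in debug mode and segfaults in release mode. The problem seems to be that, in release mode, ui_context is nullptr when it is called on to execute the work posted to it. This runs on Fedora 33 with g++ (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9) and clang version 11.0.0 (Fedora 11.0.0-2.fc33); both compilers behave the same. The Boost version is 1.75.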

Code:


#include <iostream>
#include <vector>
#include <memory>
#include <chrono>
#include <thread>

#include <boost/asio.hpp>
#include <boost/signals2.hpp>

constexpr auto MAX_LOOP_COUNT = 100;

class network_client : public std::enable_shared_from_this<network_client>
{
private:
    using Signal = boost::signals2::signal<void(int)>;
public:
    network_client(boost::asio::io_context &context) : 
    strand(boost::asio::make_strand(context))
    {
        std::cout << "network client created" << std::endl;
    }
    void doNetworkWork()
    {
        std::cout << "doing network work" << std::endl;
        boost::asio::post(strand,std::bind(&network_client::onWorkComplete,shared_from_this()));
    }
    void onWorkComplete()
    {
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        std::cout << "signalling completion" << " from thread id:" << std::this_thread::get_id() << std::endl;
        signal(42);
    }
    void workCompleteHandler(const typename Signal::slot_type &slot)
    {
        signal.connect(slot);
    }

private : 
    boost::asio::strand<boost::asio::io_context::executor_type> strand;
    Signal signal;
};

class network_client_producer
{
public :
    network_client_producer() : work(boost::asio::make_work_guard(context))
    {
        using run_function = boost::asio::io_context::count_type (boost::asio::io_context::*)();        
        for (int i = 0; i < 2; i++)
        {
            context_threads.emplace_back(std::bind(static_cast<run_function>(&boost::asio::io_context::run), std::ref(context)));
        }
    }
    ~network_client_producer()
    {
        context.stop();
        for(auto&& thread : context_threads)
        {
            if(thread.joinable())
            {
                thread.join();
            }
        }
    }
    using NetworkClientPtr = std::shared_ptr<network_client>;
    NetworkClientPtr makeNetworkClient()
    {
        return std::make_shared<network_client>(context);
    }

private : 
    boost::asio::io_context context;
    std::vector<std::thread> context_threads;
    boost::asio::executor_work_guard<boost::asio::io_context::executor_type> work;
};


class desktop : public std::enable_shared_from_this<desktop>
{
public:
    desktop(const boost::asio::io_context::executor_type &executor):executor(executor)
    {
    }
    void doSomeNetworkWork()
    {
        auto client = client_producer.makeNetworkClient();
        client->workCompleteHandler([self = shared_from_this()](int i){
            //post work into the UI thread
            std::cout << "calling into the uiThreadWork with index " << i << " from thread id:" << std::this_thread::get_id() << std::endl;
            boost::asio::post(self->executor, std::bind(&desktop::uiThreadWorkComplete, self, i));
        });
        client->doNetworkWork();
    }
    void showDesktop()
    {
        std::this_thread::sleep_for(std::chrono::milliseconds(20));
    }
public:
    void uiThreadWorkComplete(int i)
    {
        std::cout << "Called in the UI thread with index:" << i << ", on thread id:" << std::this_thread::get_id() << std::endl;
    }
private:
    const boost::asio::io_context::executor_type& executor;
    network_client_producer client_producer;
};

int main()
{
    std::cout << "Starting application. Main thread id:"<<std::this_thread::get_id() << std::endl;
    
    int count = 0;
    boost::asio::io_context ui_context;
    auto work = boost::asio::make_work_guard(ui_context);
    /*auto work = boost::asio::require(ui_context.get_executor(),
                                     boost::asio::execution::outstanding_work.tracked);*/
    auto ui_desktop = std::make_shared<desktop>(ui_context.get_executor());

    ui_desktop->doSomeNetworkWork();

    while(true)
    {
        ui_context.poll_one();

        ui_desktop->showDesktop();

        if (count >= MAX_LOOP_COUNT)
            break;
        count++;
    }
    ui_context.stop();
    std::cout << "Stopping application" << std::endl;
    return 0;
}

Compiling with g++ -std=c++17 -g -o main -pthread -O3 main.cpp and running under gdb, I get the following:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Starting application. Main thread id:140737348183872
[New Thread 0x7ffff7a51640 (LWP 27082)]
[New Thread 0x7ffff7250640 (LWP 27083)]
network client created
doing network work
signalling completion from thread id:140737348179520
calling into the uiThreadWork with index 42 from thread id:140737348179520

Thread 2 "main" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7a51640 (LWP 27082)]
0x000000000040b7b8 in boost::asio::io_context::basic_executor_type<std::allocator<void>, 0u>::execute<std::_Bind<void (desktop::*(std::shared_ptr<desktop>, int))(int)> >(std::_Bind<void (desktop::*(std::shared_ptr<desktop>, int))(int)>&&) const (this=<optimized out>, f=...) at /usr/local/include/boost/asio/impl/io_context.hpp:309
309       io_context_->impl_.post_immediate_completion(p.p,


Compiled without any optimization, g++ -std=c++17 -g -o main -pthread -O0 main.cpp, it works as expected.

I kept it as close as I could to the real program, which does actual network I/O; that is also why I put the strand in there.

Clearly I'm doing something wrong here. The question is: what? Thanks for any pointers.

2 Answers

3

Adding the sanitizers with -fsanitize=undefined,address:

Starting application. Main thread id:139902898299968
network client created
doing network work
signalling completion from thread id:139902399940352
calling into the uiThreadWork with index 42 from thread id:139902399940352
=================================================================
==29084==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffc8393d7c0 at pc 0x000000507ce0 bp 0x7f3d90d9e3d0 sp 0x7f3d90d9e3c8
READ of size 8 at 0x7ffc8393d7c0 thread T1
    #0 0x507cdf in boost::asio::io_context::basic_executor_type<std::allocator<void>, ...
    #1 0x507cdf in boost::asio::detail::initiate_post_with_executor<boost::asio::io_co...
    #2 0x507cdf in auto boost::asio::post<boost::asio::io_context::basic_executor_type...
    #3 0x5077cf in desktop::doSomeNetworkWork()::'lambda'(int)::operator()(int) const ...
    #4 0x518ce2 in boost::function1<void, int>::operator()(int) const /home/sehe/custo...
    #5 0x518481 in boost::signals2::detail::void_type boost::signals2::detail::call_wi...
    #6 0x518481 in boost::signals2::detail::void_type boost::signals2::detail::variadi...
    #7 0x517f43 in boost::signals2::detail::slot_call_iterator_t<boost::signals2::deta...
    #8 0x516397 in void boost::signals2::optional_last_value<void>::operator()<boost::...
    #9 0x516397 in void boost::signals2::detail::combiner_invoker<void>::operator()<bo...
    #10 0x516397 in boost::signals2::detail::signal_impl<void (int), boost::signals2::...
    #11 0x50d9d4 in network_client::onWorkComplete() /home/sehe/Projects/stackoverflow...
    #12 0x51021d in void std::_Bind<void (network_client::* (std::shared_ptr<network_c...
    #13 0x51021d in void boost::asio::asio_handler_invoke<std::_Bind<void (network_cli...
    #14 0x51021d in void boost_asio_handler_invoke_helpers::invoke<std::_Bind<void (ne...
    #15 0x51021d in boost::asio::detail::executor_op<std::_Bind<void (network_client::...
    #16 0x51188e in boost::asio::detail::strand_executor_service::invoker<boost::asio:...
    #17 0x514311 in void boost::asio::asio_handler_invoke<boost::asio::detail::strand_...
    #18 0x514311 in void boost_asio_handler_invoke_helpers::invoke<boost::asio::detail...
    #19 0x514311 in boost::asio::detail::executor_op<boost::asio::detail::strand_execu...
    #20 0x4d8704 in boost::asio::detail::scheduler::do_run_one(boost::asio::detail::co...
    #21 0x4d70dc in boost::asio::detail::scheduler::run(boost::system::error_code&) /h...
    #22 0x523a6e in boost::asio::io_context::run() /home/sehe/custom/boost_1_75_0/boos...
    #23 0x5258ef in unsigned long std::_Bind<unsigned long (boost::asio::io_context::*...
    #24 0x5258ef in unsigned long std::__invoke_impl<unsigned long, std::_Bind<unsigne...
    #25 0x5258ef in std::__invoke_result<std::_Bind<unsigned long (boost::asio::io_con...
    #26 0x5258ef in unsigned long std::thread::_Invoker<std::tuple<std::_Bind<unsigned...
    #27 0x7f3da660bd7f  (/usr/lib/x86_64-linux-gnu/libstdc++.so.6+0xd0d7f)
    #28 0x7f3da5f856da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da)
    #29 0x7f3da5a9671e in clone /build/glibc-S7xCS9/glibc-2.27/misc/../sysdeps/unix/sy...

Address 0x7ffc8393d7c0 is located in stack of thread T0 at offset 224 in frame
    #0 0x4cb30f in main /home/sehe/Projects/stackoverflow/test.cpp:109

  This frame has 6 object(s):
    [32, 40) 'ref.tmp.i85' (line 96)
    [64, 80) 'ref.tmp.i'
    [96, 112) 'ui_context' (line 113)
    [128, 152) 'work' (line 114)
    [192, 208) 'ui_desktop' (line 117)
    [224, 240) 'ref.tmp' (line 117) <== Memory access at offset 224 is inside this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-use-after-scope /home/sehe/custom/boost_1_75_0/boost/asio/io_context.hpp:678:25 in boost::asio::io_context::basic_executor_type<std::allocator<void>, 0u>::basic_executor_type(boost::asio::io_context::basic_executor_type<std::allocator<void>, 0u> const&)
Shadow bytes around the buggy address:
  0x10001071faa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10001071fab0: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 f8 f2 f2 f2
  0x10001071fac0: f8 f2 f2 f2 00 f2 f2 f2 00 00 f3 f3 00 00 00 00
  0x10001071fad0: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
  0x10001071fae0: 00 f2 f2 f2 f8 f8 f2 f2 00 00 f2 f2 00 00 00 f2
=>0x10001071faf0: f2 f2 f2 f2 00 00 f2 f2[f8]f8 f3 f3 00 00 00 00
  0x10001071fb00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10001071fb10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10001071fb20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10001071fb30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10001071fb40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
Thread T1 created by T0 here:
    #0 0x483a6a in pthread_create (/home/sehe/Projects/stackoverflow/sotest+0x483a6a)
    #1 0x7f3da660c014 in std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std...
    #2 0x5241d4 in void std::vector<std::thread, std::allocator<std::thread> >::_M_realloc_ins...
    #3 0x523517 in std::thread& std::vector<std::thread, std::allocator<std::thread> >::emplac...

==29084==ABORTING

That's the culprit.

Searching for the culprit

  This frame has 6 object(s):
    [32, 40) 'ref.tmp.i85' (line 96)
    [64, 80) 'ref.tmp.i'
    [96, 112) 'ui_context' (line 113)
    [128, 152) 'work' (line 114)
    [192, 208) 'ui_desktop' (line 117)
    [224, 240) 'ref.tmp' (line 117) <== Memory access at offset 224 is inside this variable

What variable is that? Apparently it is on this line:
auto ui_desktop = std::make_shared<desktop>(ui_context.get_executor());

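There is a temporary on that line, and something holds a reference to it after it is gone. It must be the result of ui_context.get_executor(), because ui_desktop is named and has an "obvious" lifetime.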

Indeed, desktop declares its executor member as a reference:

const boost::asio::io_context::executor_type& executor;

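That's an obvious error. Executors are not services or execution contexts; they are designed to be cheap to copy and to be passed by value. The fix is simple, store the executor by value: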
boost::asio::io_context::executor_type executor;
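
To illustrate why storing by value is safe, here is a minimal Boost-free sketch. FakeContext, FakeExecutor, get_executor and Desktop are hypothetical stand-ins, not Asio types. The gdb frame in the question dereferences io_context_ inside the executor, which suggests the real executor is little more than a pointer back to its context; the sketch assumes that shape.

```cpp
#include <cassert>

// Hypothetical stand-in for boost::asio::io_context: it only counts posted work.
struct FakeContext {
    int posted = 0;
};

// Hypothetical stand-in for io_context::executor_type: a small handle that
// points back at its context, so copying it only copies a pointer.
struct FakeExecutor {
    FakeContext* context;
    void post() const { ++context->posted; }
};

// Returns a fresh handle by value, in the way ui_context.get_executor() does.
FakeExecutor get_executor(FakeContext& context) { return FakeExecutor{&context}; }

class Desktop {
public:
    // Take the handle by value and keep our own copy of it.
    explicit Desktop(FakeExecutor executor) : executor_(executor) {}
    void notify() const { executor_.post(); }

private:
    FakeExecutor executor_;  // a copy, not a reference to the caller's object
};

// Builds a Desktop from a handle that is not kept anywhere else, then posts
// once through the stored copy.
int posted_after_temporary_is_gone() {
    FakeContext context;
    Desktop desktop(get_executor(context));  // the returned handle is gone after this line
    desktop.notify();                        // the stored copy still points at context
    assert(context.posted == 1);
    return context.posted;
}
```

Calling posted_after_temporary_is_gone() from main returns 1. The only lifetime that still matters is that of the context itself, which must outlive every copy of the handle.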

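Bonus

As a bonus, here is a simplified version that runs the demo for half a second. Notes:

  • use boost::asio::thread_pool instead of a hand-written, flawed thread pool
  • consider not using .stop() on the execution context, or forgetting about the redundant work guards?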

Live On Compiler Explorer

#include <iostream>
#include <chrono>
#include <functional>
#include <iomanip>
#include <memory>
#include <sstream>
#include <string>
#include <thread>

#include <boost/asio.hpp>
#include <boost/signals2.hpp>

namespace {
    using namespace std::chrono_literals;
    auto now = std::chrono::high_resolution_clock::now;
    auto elapsed = [start=now()] { return (now()-start)/1ms; };

    inline std::string thread_hash() {
        static constexpr std::hash<std::thread::id> h{};
        std::ostringstream oss;
        oss << std::hex << std::setw(2) << std::setfill('0')
            << h(std::this_thread::get_id()) % 0xff;
        return oss.str();
    }

    auto trace = [](auto const&... args) {
        std::cout << "thread #" << thread_hash() << " at t+" << std::setw(3) << elapsed() << "ms\t";
        (std::cout << ... << args) << std::endl;
    };
} // namespace

struct network_client : std::enable_shared_from_this<network_client> {
    explicit network_client(const boost::asio::any_io_executor& context) : strand(make_strand(context)) {
        trace("network client created");
    }

    void doNetworkWork() {
        trace("doing network work");
        post(strand, std::bind(&network_client::onWorkComplete, shared_from_this()));
    }

    void onWorkComplete() {
        std::this_thread::sleep_for(10ms);
        trace("signalling completion");
        signal(42);
    }

    template <typename F> void workCompleteHandler(F slot) {
        signal.connect(std::move(slot));
    }

  private:
    boost::asio::strand<boost::asio::any_io_executor> strand;
    using Signal = boost::signals2::signal<void(int)>;
    Signal signal;
};

struct network_client_producer {
    auto makeNetworkClient() {
        return std::make_shared<network_client>(context_threads.get_executor());
    }

  private : 
    boost::asio::thread_pool context_threads {2};
};

struct desktop : std::enable_shared_from_this<desktop> {
    explicit desktop(boost::asio::io_context::executor_type executor) : executor(std::move(executor)) {}
    void doSomeNetworkWork() {
        auto client = client_producer.makeNetworkClient();
        client->workCompleteHandler([this, self = shared_from_this()](int i) {
            // post work into the UI thread
            trace("calling into the uiThreadWork with index ", i);
            post(executor, std::bind(&desktop::uiThreadWorkComplete, self, i));
        });
        client->doNetworkWork();
    }

    static void showDesktop() {
        trace("showDesktop");
        std::this_thread::sleep_for(20ms);
    }

    void uiThreadWorkComplete(int i) const {
        trace("Called in the UI thread with index:", i);
    }

  private:
    boost::asio::io_context::executor_type executor;
    network_client_producer client_producer;
};

int main() {
    trace("Starting application. Main thread is #", thread_hash());

    boost::asio::io_context ui_context;
    auto work = boost::asio::make_work_guard(ui_context);
    /*auto work = boost::asio::require(ui_context.get_executor(),
                                     boost::asio::execution::outstanding_work.tracked);*/
    auto ui_desktop = std::make_shared<desktop>(ui_context.get_executor());

    ui_desktop->doSomeNetworkWork();

    for (auto deadline = now() + 0.5s; now() < deadline;) {
        ui_context.poll_one();
        ui_desktop->showDesktop();
    }

    trace("Stopping application");
    work.reset();
    ui_context.run();
    // ui_context.stop();
    trace("Bye\n");
}

Prints

thread #97 at t+  0ms   Starting application. Main thread is #97
thread #97 at t+  0ms   network client created
thread #97 at t+  1ms   doing network work
thread #97 at t+  1ms   showDesktop
thread #2d at t+ 11ms   signalling completion
thread #2d at t+ 11ms   calling into the uiThreadWork with index 42
thread #97 at t+ 21ms   Called in the UI thread with index:42
thread #97 at t+ 21ms   showDesktop
thread #97 at t+ 41ms   showDesktop
thread #97 at t+ 61ms   showDesktop
thread #97 at t+ 81ms   showDesktop
thread #97 at t+101ms   showDesktop
thread #97 at t+122ms   showDesktop
thread #97 at t+142ms   showDesktop
thread #97 at t+162ms   showDesktop
thread #97 at t+182ms   showDesktop
thread #97 at t+202ms   showDesktop
thread #97 at t+222ms   showDesktop
thread #97 at t+242ms   showDesktop
thread #97 at t+262ms   showDesktop
thread #97 at t+282ms   showDesktop
thread #97 at t+302ms   showDesktop
thread #97 at t+323ms   showDesktop
thread #97 at t+343ms   showDesktop
thread #97 at t+363ms   showDesktop
thread #97 at t+383ms   showDesktop
thread #97 at t+403ms   showDesktop
thread #97 at t+423ms   showDesktop
thread #97 at t+443ms   showDesktop
thread #97 at t+463ms   showDesktop
thread #97 at t+483ms   showDesktop
thread #97 at t+503ms   Stopping application
thread #97 at t+504ms   Bye

Added a demo Live On Compiler Explorer that includes some simplification suggestions. - sehe
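Thanks, very educational. "Executors are not services or execution contexts, they are designed to be cheap to copy and passed by value." I didn't know that. Thanks for enlightening me. - serje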

1
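The problem is that your executor is a reference to a temporary object. In main you call ui_context.get_executor(), which returns a temporary. You pass that temporary to the desktop constructor, which stores a reference to it in the member variable executor. Once the auto ui_desktop = ... line in main completes, the temporary goes out of scope and the reference held by executor becomes invalid.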
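
The lifetime rule can be observed without touching the dangling reference. The following is a small sketch with hypothetical names (TrackedHandle, RefHolder, ValueHolder and g_alive are not Asio types): it counts live handle objects and shows that nothing is left alive behind a reference member, while a value member keeps its own copy.

```cpp
// Number of TrackedHandle objects currently alive (hypothetical bookkeeping).
int g_alive = 0;

// Hypothetical stand-in for an executor that records its own lifetime.
struct TrackedHandle {
    TrackedHandle() { ++g_alive; }
    TrackedHandle(const TrackedHandle&) { ++g_alive; }
    ~TrackedHandle() { --g_alive; }
};

// Returns a handle by value, so call sites receive a temporary.
TrackedHandle make_handle() { return TrackedHandle{}; }

// Mirrors the buggy desktop: keeps a reference to whatever was passed in.
struct RefHolder {
    explicit RefHolder(const TrackedHandle& h) : handle(h) {}
    const TrackedHandle& handle;  // dangles when bound to a temporary
};

// Mirrors the fixed desktop: keeps its own copy.
struct ValueHolder {
    explicit ValueHolder(TrackedHandle h) : handle(h) {}
    TrackedHandle handle;
};

// How many handles are alive once a RefHolder has been built from a temporary.
int alive_after_ref_holder() {
    RefHolder holder(make_handle());  // the temporary is destroyed at the end of this statement
    (void)holder;                     // holder.handle is deliberately never read
    return g_alive;                   // no handle is alive behind the reference
}

// How many handles are alive once a ValueHolder has been built from a temporary.
int alive_after_value_holder() {
    ValueHolder holder(make_handle());
    (void)holder;
    return g_alive;                   // the copy stored inside holder is alive
}
```

Here alive_after_ref_holder() returns 0 and alive_after_value_holder() returns 1: in the first case the reference member has nothing left to refer to as soon as the constructor call's statement ends.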

Compiling the program with AddressSanitizer enabled (-fsanitize=address) also detects this problem:

==24629==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffe8bee0270 at pc 0x5584b2d89b0c bp 0x7f3ac42fd970 sp 0x7f3ac42fd960
READ of size 8 at 0x7ffe8bee0270 thread T1
    #0 0x5584b2d89b0b in boost::asio::io_context::basic_executor_type<std::allocator<void>, 0u>::basic_executor_type(boost::asio::io_context::basic_executor_type<std::allocator<void>, 0u> const&) /usr/include/boost/asio/io_context.hpp:678
...

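I suspect that in your debug build the temporary effectively survives a little longer, in the sense that the stack memory it occupied is not reused right after it goes out of scope, so the dangling reference still appears to work. In the release build more aggressive optimizations are applied, the memory is reused sooner, the reference becomes garbage earlier, and the program crashes when the reference is accessed.
To fix this, you must make sure the executor returned by get_executor does not go out of scope, so that the reference held by the ui_desktop object stays valid. For example, you can assign the result of get_executor to a variable in main: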
  auto executor{ui_context.get_executor()};
  auto ui_desktop = std::make_shared<desktop>(executor);
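
If the member is kept as a reference, one possible guard (not part of the answer above, only a sketch under that assumption) is to delete the constructor overload that accepts rvalues, so that passing a temporary no longer compiles. ExecutorHandle and RefDesktop are hypothetical stand-ins for the executor type and for desktop.

```cpp
#include <type_traits>

// Hypothetical stand-in for the executor type.
struct ExecutorHandle {
    int id;
};

// Hypothetical variant of desktop that keeps the reference member.
class RefDesktop {
public:
    // Named (lvalue) handles are accepted; the caller must keep them alive.
    explicit RefDesktop(const ExecutorHandle& executor) : executor_(executor) {}

    // Temporaries are rejected at compile time instead of dangling at run time.
    RefDesktop(const ExecutorHandle&&) = delete;

    int executor_id() const { return executor_.id; }

private:
    const ExecutorHandle& executor_;
};

// Construction from a named handle is allowed, construction from a temporary is not.
static_assert(std::is_constructible<RefDesktop, ExecutorHandle&>::value,
              "named handles are accepted");
static_assert(!std::is_constructible<RefDesktop, ExecutorHandle>::value,
              "temporaries are rejected");

// Reads the handle through the stored reference while the named handle is alive.
int id_through_named_executor() {
    ExecutorHandle executor{7};    // named variable that outlives the desktop below
    RefDesktop desktop(executor);
    return desktop.executor_id();
}
```

With this overload in place, a call shaped like the one in the question, passing the result of a get_executor() call directly, would be a compile error rather than a release-only crash. Storing the handle by value remains the simpler option.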

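Wow, yes, you're right. I don't know why I thought context.get_executor() would give me an executor that lives as long as the context. I really thought it was part of the context itself. But now it makes sense. - serje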

Source: Stack Overflow.