Downloading multiple files over HTTP using boost::asio stackless coroutines


I translated the example from Roberto Ierusalimschy's Programming in Lua that downloads several files over HTTP from Lua to C++, using boost::asio and stackful coroutines. Here is the code:

#include <chrono>
#include <iostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>
#include <boost/asio.hpp>
#include <boost/asio/spawn.hpp>

using namespace std;
using namespace boost::asio;

io_service ioService;

void download(const string& host, const string& file, yield_context& yield)
{
  clog << "Downloading " << host << file << " ..." << endl;

  size_t fileSize = 0;
  boost::system::error_code ec;

  ip::tcp::resolver resolver(ioService);

  ip::tcp::resolver::query query(host, "80");
  auto it = resolver.async_resolve(query, yield[ec]);

  ip::tcp::socket socket(ioService);
  socket.async_connect(*it, yield[ec]);

  ostringstream req;
  req << "GET " << file << " HTTP/1.0\r\n\r\n";
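  // Write asynchronously as well, so a slow send does not block the other coroutines.
  async_write(socket, buffer(req.str()), yield[ec]);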

  while (true)
  {
    char data[8192];
    size_t bytesRead = socket.async_read_some(buffer(data), yield[ec]);
    if (0 == bytesRead) break;
    fileSize += bytesRead;
  }

  socket.shutdown(ip::tcp::socket::shutdown_both);
  socket.close();

  clog << file << " size: " << fileSize << endl;
}

int main()
{
  auto timeBegin = chrono::high_resolution_clock::now();

  vector<pair<string, string>> resources =
  {
    {"www.w3.org", "/TR/html401/html40.txt"},
    {"www.w3.org", "/TR/2002/REC-xhtml1-20020801/xhtml1.pdf"},
    {"www.w3.org", "/TR/REC-html32.html"},
    {"www.w3.org", "/TR/2000/REC-DOM-Level-2-Core-20001113/DOM2-Core.txt"},
  };

  for(const auto& res : resources)
  {
    spawn(ioService, [&res](yield_context yield)
    {
      download(res.first, res.second, yield);
    });
  }

  ioService.run();

  auto timeEnd = chrono::high_resolution_clock::now();

  clog << "Time: " << chrono::duration_cast<chrono::milliseconds>(
            timeEnd - timeBegin).count() << endl;

  return 0;
}

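Now I am trying to convert the code to use the stackless coroutines from boost::asio, but the documentation is not detailed enough for me to fully understand how to organize the code so that it works with stackless coroutines. Can someone provide a solution?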

1 Answer


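Here is a solution based on the stackless coroutines provided by Boost. Given that they are essentially a hack, I would not consider the solution particularly elegant. It could probably be done better with C++20, but I think that would be outside the scope of this question.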

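#include <algorithm>
#include <chrono>
#include <functional>
#include <iostream>
#include <iterator>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

#include <boost/asio.hpp>
#include <boost/asio/yield.hpp>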

using boost::asio::async_write;
using boost::asio::buffer;
using boost::asio::error::eof;
using boost::system::error_code;

using std::placeholders::_1;
using std::placeholders::_2;

/**
 * Stackless coroutine for downloading a file from a host.
 *
 * The lifetime of the object is limited to one () call. After that,
 * the object will be copied and the old object is discarded. For this
 * reason, the socket_ and resolver_ members are stored as shared_ptrs,
 * so that they can live as long as there is a live copy. An alternative
 * solution would be to manage these objects outside of the coroutine
 * and to pass them here by reference.
 */
class downloader : boost::asio::coroutine {

  using socket_t = boost::asio::ip::tcp::socket;
  using resolver_t = boost::asio::ip::tcp::resolver;

public:
  downloader(boost::asio::io_service &service, const std::string &host,
             const std::string &file)
      : socket_{std::make_shared<socket_t>(service)},
        resolver_{std::make_shared<resolver_t>(service)}, file_{file},
        host_{host} {}

  void operator()(error_code ec = error_code(), std::size_t length = 0,
                  const resolver_t::results_type &results = {}) {

    // Check if the last yield resulted in an error.
    if (ec) {
      if (ec != eof) {
        throw boost::system::system_error{ec};
      }
    }

    // Jump to after the previous yield.
    reenter(this) {

      yield {
        resolver_t::query query{host_, "80"};

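        // Use bind to skip the length parameter not provided by async_resolve.
        // Note that this binds the raw `this` pointer rather than a copy, so
        // it relies on the downloader outliving the resolve (true in main()).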
        auto result_func = std::bind(&downloader::operator(), this, _1, 0, _2);

        resolver_->async_resolve(query, result_func);
      }

      yield socket_->async_connect(*results, *this);

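      yield {
        // The request text must stay alive until the write completes, so it
        // lives in a shared_ptr member rather than in a local variable.
        *request_ = "GET " + file_ + " HTTP/1.0\r\n\r\n";
        async_write(*socket_, buffer(*request_), *this);
      }

      while (true) {
        // data_ is shared by all copies of the coroutine, so the buffer is
        // still valid when the read completes.
        yield socket_->async_read_some(buffer(*data_), *this);

        if (length == 0) {
          break;
        }

        fileSize_ += length;
      }

      std::cout << file_ << " size: " << fileSize_ << std::endl;

      socket_->shutdown(socket_t::shutdown_both);
      socket_->close();
    }

    // Uncomment this to show progress and to demonstrate interleaving
    // std::cout << file_ << " size: " << fileSize_ << std::endl;
  }

private:
  std::shared_ptr<socket_t> socket_;
  std::shared_ptr<resolver_t> resolver_;

  // Buffers handed to asynchronous operations; the shared_ptrs keep them
  // alive across copies of the coroutine object.
  std::shared_ptr<std::string> request_{std::make_shared<std::string>()};
  std::shared_ptr<std::vector<char>> data_{
      std::make_shared<std::vector<char>>(8192)};

  const std::string file_;
  const std::string host_;
  size_t fileSize_{};
};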

int main() {
  auto timeBegin = std::chrono::high_resolution_clock::now();

  try {
    boost::asio::io_service service;

    std::vector<std::pair<std::string, std::string>> resources = {
        {"www.w3.org", "/TR/html401/html40.txt"},
        {"www.w3.org", "/TR/2002/REC-xhtml1-20020801/xhtml1.pdf"},
        {"www.w3.org", "/TR/REC-html32.html"},
        {"www.w3.org", "/TR/2000/REC-DOM-Level-2-Core-20001113/DOM2-Core.txt"},
    };

    std::vector<downloader> downloaders{};
    std::transform(resources.begin(), resources.end(),
                   std::back_inserter(downloaders), [&](auto &x) {
                     return downloader{service, x.first, x.second};
                   });

    std::for_each(downloaders.begin(), downloaders.end(),
                  [](auto &dl) { dl(); });

    service.run();

  } catch (std::exception &e) {
    std::cerr << "exception: " << e.what() << "\n";
  }

  auto timeEnd = std::chrono::high_resolution_clock::now();

  std::cout << "Time: "
            << std::chrono::duration_cast<std::chrono::milliseconds>(timeEnd -
                                                                     timeBegin)
                   .count()
            << std::endl;

  return 0;
}

Compiled with Boost 1.72 using g++ -lboost_coroutine -lpthread test.cpp. Sample output:

$ ./a.out 
/TR/REC-html32.html size: 606
/TR/html401/html40.txt size: 629
/TR/2002/REC-xhtml1-20020801/xhtml1.pdf size: 115777
/TR/2000/REC-DOM-Level-2-Core-20001113/DOM2-Core.txt size: 229699
Time: 1644

Uncomment the log line at the end of downloader::operator() to see how the downloads interleave.
