Asynchronous connect and disconnect with epoll (Linux)

Question

8

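I need asynchronous connect and disconnect for a TCP client using epoll on Linux. On Windows there are extension functions such as ConnectEx, DisconnectEx, AcceptEx, and so on. In a TCP server the standard accept function works, but in a TCP client neither connect nor disconnect works. All sockets are non-blocking.

How can I do this?

Thanks!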

- Alexander

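This might help you: https://dev59.com/803Sa4cB1Zd3GeqPzOqD - Skippy Fastol

As a possible alternative to the suggestion that links to the DJB page, I would suggest trying to dup the descriptor and close it (and then use the copy). As I understand it, although I have not tested this, it should work. The documentation states that not checking the return value of close is a serious programming error, because it may report an earlier error. That is exactly what you want here (if close reports an error, connect failed). Of course, if you are using epoll, you are guaranteed to be on an operating system where getsockopt(SO_ERROR) works properly... - Damon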

1

If it is feasible, the simplest option is to set O_NONBLOCK only after connect() has returned. - CodeClown42
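
A minimal sketch of that idea, assuming it is acceptable for the calling thread to block while the connection is established. The helper names `set_nonblocking` and `connect_then_nonblock` are hypothetical and only used for illustration:

```c
#include <fcntl.h>
#include <sys/socket.h>

/* Hypothetical helper: add O_NONBLOCK to the descriptor's file status flags. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: connect in blocking mode, then switch to non-blocking. */
int connect_then_nonblock(int fd, const struct sockaddr *addr, socklen_t len)
{
    if (connect(fd, addr, len) == -1)   /* blocks until success or failure */
        return -1;
    return set_nonblocking(fd);
}
```

After `connect_then_nonblock()` returns 0, the descriptor can be added to epoll like any other connected socket. As the next comment points out, this is only asynchronous if the blocking call runs in a worker thread.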

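@goldilocks: +1. It is not asynchronous unless you use a worker thread, but I agree the simplicity is tempting. Also, DNS resolution, which you will probably need, requires a worker thread anyway unless you want to block on it (getaddrinfo_a does this internally as well). So while you are blocking in a worker thread, you might as well block on the connect too... - Damon

I have one worker thread for everything I need (TCP server/client, UDP sockets, timerfd). In that thread I use epoll for the asynchronous work. So I wait in epoll_wait(...) and then do whatever needs to be done. For example: if the socket is a listening socket, I call accept, create a new client from the returned socket, and add it to the epoll queue. But with a TCP client, I cannot add it to epoll before the connection has completed... If I do, the client connects multiple times (3-4 times)... - Alexander

4 Answers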

3

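Empirically, epoll behaves a little differently from select and poll when detecting completion of a non-blocking connect.

With epoll:

After calling connect(), check the return code.

If the connection cannot be completed immediately, register for the EPOLLOUT event with epoll.

Call epoll_wait().

If the connection failed, your event will have EPOLLERR or EPOLLHUP set; otherwise EPOLLOUT will be triggered.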
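
The steps above could be sketched roughly as follows. This is an illustration under assumptions, not part of the answer: the helper name `wait_for_connect` and its `timeout_ms` parameter are hypothetical, and the sketch creates a private epoll instance for a single wait instead of reusing an existing event loop. It is meant to be called only after connect() has failed with EINPROGRESS.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait for a pending non-blocking connect() on fd.
 * Returns 0 on success, -1 on failure with errno set
 * (ETIMEDOUT if nothing happened within timeout_ms; -1 waits forever).
 */
int wait_for_connect(int fd, int timeout_ms)
{
    int epfd = epoll_create1(0);
    if (epfd == -1)
        return -1;

    struct epoll_event ev = {0};
    struct epoll_event out;
    int rc = -1;

    ev.events = EPOLLOUT;   /* EPOLLERR and EPOLLHUP are always reported */
    ev.data.fd = fd;

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0) {
        int n = epoll_wait(epfd, &out, 1, timeout_ms);
        if (n == 0) {
            errno = ETIMEDOUT;
        } else if (n == 1) {
            /* Whatever event arrived, SO_ERROR holds the final result. */
            int err = 0;
            socklen_t len = sizeof err;
            if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == 0) {
                if (err == 0)
                    rc = 0;
                else
                    errno = err;
            }
        }
    }

    int saved = errno;      /* close() must not clobber the reported error */
    close(epfd);
    errno = saved;
    return rc;
}
```

A failed attempt shows up as EPOLLERR or EPOLLHUP in `out.events`; reading SO_ERROR afterwards yields the actual error code in either case.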

- Bugs

Yes, I forgot to mention in my answer that epoll can return EPOLLERR or EPOLLHUP in addition to EPOLLOUT. Thanks for the reminder, it is fixed now. - Ambroz Bizjak

1

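I tried Sonny's solution, but epoll_ctl returned "Invalid argument". So I think the correct approach may be as follows:

1. Create socketfd and epollfd.

2. Use epoll_ctl to register socketfd with epollfd, together with the epoll events you want.

3. Call connect(socketfd, ...).

4. Check the return value or errno.

5. If errno == EINPROGRESS, call epoll_wait.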
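
Steps 2 to 4 could look roughly like this. It is a sketch under assumptions: the helper name `register_then_connect` and its return convention are hypothetical, and the socket is assumed to be non-blocking already.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: register fd with epfd first, then start the connect.
 * Returns 1 if connected immediately, 0 if the connection is in progress
 * (wait for EPOLLOUT on epfd), -1 on error with errno set.
 */
int register_then_connect(int epfd, int fd,
                          const struct sockaddr *addr, socklen_t len)
{
    struct epoll_event ev = {0};
    ev.events = EPOLLOUT;
    ev.data.fd = fd;

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1)
        return -1;

    if (connect(fd, addr, len) == 0)
        return 1;                   /* connected right away */
    if (errno == EINPROGRESS)
        return 0;                   /* step 5: go on to epoll_wait() */

    int saved = errno;
    epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);   /* undo the registration */
    errno = saved;
    return -1;
}
```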

- Aaron

1

I am providing a "complete" answer here in case anyone else is looking for this information:

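#include <sys/epoll.h>
#include <sys/socket.h>   // connect(), getsockopt()
#include <errno.h>
#include <stdio.h>        // printf()
#include <stdlib.h>       // exit()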
....
....
int retVal = -1;
socklen_t retValLen = sizeof (retVal);

int status = connect(socketFD, ...);
if (status == 0)
 {
   // OK -- socket is ready for IO
 }
else if (errno == EINPROGRESS)
 {
    struct epoll_event newPeerConnectionEvent;
    int epollFD = -1;
    struct epoll_event processableEvents;
    int numEvents = -1;   // signed, because epoll_wait() returns -1 on error

    if ((epollFD = epoll_create (1)) == -1)
    {
       printf ("Could not create the epoll FD list. Aborting!");
       exit (2);
    }     

    newPeerConnectionEvent.data.fd = socketFD;
    newPeerConnectionEvent.events = EPOLLOUT | EPOLLIN | EPOLLERR;

    if (epoll_ctl (epollFD, EPOLL_CTL_ADD, socketFD, &newPeerConnectionEvent) == -1)
    {
       printf ("Could not add the socket FD to the epoll FD list. Aborting!");
       exit (2);
    }

    numEvents = epoll_wait (epollFD, &processableEvents, 1, -1);

    if (numEvents < 0)
    {
       printf ("Serious error in epoll setup: epoll_wait () returned < 0 status!");
       exit (2);
    }

    if (getsockopt (socketFD, SOL_SOCKET, SO_ERROR, &retVal, &retValLen) < 0)
    {
       // ERROR, fail somehow, close socket
    }

    if (retVal != 0) 
    {
       // ERROR: connect did not "go through"
    }   
}
else
{
   // ERROR: connect did not "go through" for other non-recoverable reasons.
   switch (errno)
   {
     ...
   }
}

- Sonny

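I think your error checking after epoll_wait() is incorrect - you should always check the result of the connection attempt with getsockopt(SO_ERROR), even if you did not get EPOLLERR. See EINPROGRESS in the man page at http://linux.die.net/man/2/connect. Also, assert() is the wrong way to handle critical errors - using it means you have proven that the condition can never happen. Use exit() instead, which terminates the program even when NDEBUG is defined. - Ambroz Bizjak

I have just added the suggested edits. The unedited version did seem to work for me, though. - Sonny

With the timeout set to -1, won't epoll_wait in the program above block indefinitely? - wonder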

- Ambroz Bizjak · Accepted Answer

To perform a non-blocking connect(), assuming the socket has already been put into non-blocking mode:

int res = connect(fd, ...);
if (res < 0 && errno != EINPROGRESS) {
    // error, fail somehow, close socket
    return;
}

if (res == 0) {
    // connection has succeeded immediately
} else {
    // connection attempt is in progress
}

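In the second case, where connect() failed with EINPROGRESS (and only in that case), you need to wait for the socket to become writable, for example by telling epoll that you are waiting for EPOLLOUT on this socket. Once you are notified that it is writable (with epoll, also expect to get an EPOLLERR or EPOLLHUP event), check the result of the connection attempt: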

int result;
socklen_t result_len = sizeof(result);
if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &result, &result_len) < 0) {
    // error, fail somehow, close socket
    return;
}

if (result != 0) {
    // connection failed; error code is in 'result'
    return;
}

// socket is ready for read()/write()

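In my experience, on Linux connect() never succeeds immediately and you always have to wait for writability. On FreeBSD, however, I have seen a non-blocking connect() to localhost succeed immediately.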
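
In a single-threaded epoll loop like the one described in the question, the SO_ERROR check can be combined with a change of the registered events once the connection is up. The following is a minimal sketch under assumptions: the helper name `finish_connect` is hypothetical, and the socket is assumed to have been added to `epfd` with EPOLLOUT while the connect was pending.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: call when epoll_wait() reports EPOLLOUT, EPOLLERR or
 * EPOLLHUP for a socket whose connect() returned EINPROGRESS.
 * Returns 0 if connected (fd is then watched for EPOLLIN), -1 on failure.
 */
int finish_connect(int epfd, int fd)
{
    int err = 0;
    socklen_t len = sizeof err;

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
        return -1;
    if (err != 0) {
        errno = err;        /* the connection attempt failed with this code */
        return -1;
    }

    /* Connected: stop watching for writability, because a level-triggered
       EPOLLOUT would otherwise be reported on every epoll_wait() call. */
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}
```

On failure the caller would typically close the socket, which also removes it from the epoll set as long as the descriptor has not been duplicated.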