MySQL - 0 [ERROR] Error in accept: Bad file descriptor

Question

5

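I recently upgraded to MySQL 5.7.12 on Debian (Debian 3.2.78-1 x86_64 GNU/Linux), but the server hangs after a few hours. The following error appears over and over in syslog and mysql.log:

2016-06-13T18:05:20.261209Z 0 [ERROR] Error in accept: Bad file descriptor

The MySQL version information is:

mysql Ver 14.14 Distrib 5.7.12-5, for debian-linux-gnu (x86_64) using 6.2

Here are some values from the mysqld section of my.cnf that may help with tuning: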

[mysqld]
max_allowed_packet      = 64M
thread_stack            = 256K
thread_cache_size       = 8

max_connections         = 150
max_connect_errors      = 10000
connect_timeout         = 30
wait_timeout            = 86400
table_open_cache        = 2048
open_files_limit        = 65535

query_cache_limit       = 4M
query_cache_size        = 128M
query_cache_type    = 1

server-id               = 1
log_bin                 = /var/log/mysql/mysql-bin.log
expire_logs_days        = 10
max_binlog_size         = 100M

# * InnoDB
innodb_file_per_table
innodb_buffer_pool_instances=2
innodb_buffer_pool_size=2G
thread_pool_size = 24

- gdakram

5 Answers

2

We ran into the same problem on an Ubuntu 16.04 system with MySQL 5.7.13 installed. We raised the maximum open files limit in systemd with the following file: /etc/systemd/system/mysql.service.d/10-ulimit.conf

[Service]
LimitNOFILE=1000000

So far the problem has not recurred. Perhaps MySQL now needs more file descriptors.
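To confirm that an override like this actually reached the running server, one option is to read the "Max open files" row from /proc/<pid>/limits on Linux. The following is a minimal Python sketch, not part of the original fix; `parse_max_open_files` and `max_open_files` are hypothetical helper names, and the pid of mysqld has to be supplied by the reader.

```python
def parse_max_open_files(limits_text):
    """Return (soft, hard) from the text of /proc/<pid>/limits.

    A limit reported as "unlimited" is returned as None.
    """
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            # The row looks like: "Max open files  <soft>  <hard>  files"
            soft, hard = line[len(prefix):].split()[:2]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' line found")


def max_open_files(pid):
    """Read the open-files limits of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as fh:
        return parse_max_open_files(fh.read())
```

Calling `max_open_files(pid)` with the pid of mysqld after a restart shows whether the soft and hard limits match the LimitNOFILE value set above.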

- nick

A heads-up: on my FC24 system, updating mariadb with DNF overwrote the systemd file, which made the problem come back. - 111

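Unfortunately, this fix does not work here. I hit this error on fresh Ubuntu 16.04 and Ubuntu 16.10 instances running MySQL 5.7.17. (By the way, I think you first need to create the mysql.service.d folder and then run systemctl daemon-reload.) - mahemoff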

1

I did some research and found the following:

Also seen in MariaDB

https://lists.launchpad.net/maria-discuss/msg03060.html
https://mariadb.atlassian.net/browse/MDEV-8995

Percona Server/Percona XtraDB Cluster

https://groups.google.com/forum/#!topic/percona-discussion/Tu0S2OvYqKA

Old bugs from 2010/2012

https://bugs.mysql.com/bug.php?id=48929
http://lists.mysql.com/commits/96472

Some interesting information (this should not happen)

https://lists.mysql.com/mysql/97275

[I work for Percona]

- Roel Van de Paar

0

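I hit the same error, but none of the solutions worked. After some investigation we found that AppArmor was denying access to our log directory, which produced the bad file descriptor error message.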

- Freddy

0

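I ran into the same problem after upgrading to Percona Cluster 5.7.14-26.17-1.trusty.

The ulimit.conf suggestion did not help; I had already made sure there were enough file handles by editing /etc/security/limits.conf and /etc/sysctl.conf.

I can easily reproduce the problem by telnetting to port 3306 and then disconnecting; the server then goes into a spin and logs this error.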
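The telnet reproduction can be scripted. This is a minimal Python sketch under the assumption that the server listens on 127.0.0.1:3306; `connect_and_drop` is a hypothetical helper name, and whether this triggers the error depends on the affected environment.

```python
import socket


def connect_and_drop(host="127.0.0.1", port=3306, timeout=5.0):
    """Open a TCP connection and close it at once without sending anything."""
    sock = socket.create_connection((host, port), timeout=timeout)
    sock.close()
```

Running `connect_and_drop()` while watching the MySQL error log is the scripted equivalent of connecting with telnet and disconnecting immediately.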

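A horrible workaround that looks promising in my environment is to avoid TCP connections on port 3306 and use the unix socket instead.

You can do this by changing the port number in /etc/mysql/my.cnf and then using socat to proxy from port 3306 to the socket.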

nohup socat TCP4-LISTEN:3306,fork UNIX-CONNECT:/var/run/mysqld/mysqld.sock&

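With this in place, I can no longer provoke the problem by telnetting to port 3306 and disconnecting. I will report back on how well it holds up over time.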
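For readers unfamiliar with socat, here is a rough Python sketch of what that command line does: accept TCP connections and relay each one to the Unix socket. It is not what the answer ran and is not meant as a replacement for socat; `make_proxy` and `_pump` are hypothetical names, and the socket path in the usage note is the one from the command above.

```python
import socket
import socketserver
import threading


def _pump(src, dst):
    """Copy bytes from src to dst until src reaches EOF, then half-close dst."""
    try:
        while True:
            data = src.recv(65536)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def make_proxy(listen_addr, unix_path):
    """Build a threaded TCP server that relays each client to unix_path."""

    class Handler(socketserver.BaseRequestHandler):
        def handle(self):
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as upstream:
                upstream.connect(unix_path)
                # One thread per direction: server -> client runs in the
                # background, client -> server runs in this handler thread.
                back = threading.Thread(
                    target=_pump, args=(upstream, self.request), daemon=True
                )
                back.start()
                _pump(self.request, upstream)
                back.join()

    class Server(socketserver.ThreadingTCPServer):
        allow_reuse_address = True
        daemon_threads = True

    return Server(listen_addr, Handler)
```

A call such as `make_proxy(("0.0.0.0", 3306), "/var/run/mysqld/mysqld.sock").serve_forever()` would play roughly the same role as the socat command, with mysqld itself moved off port 3306.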

Incidentally, the code seems to expect this to happen sometimes:

for (uint retry= 0; retry < MAX_ACCEPT_RETRY; retry++)
{
  socket_len_t length= sizeof(struct sockaddr_storage);
  connect_sock= mysql_socket_accept(key_socket_client_connection, listen_sock,
                                    (struct sockaddr *)(&cAddr), &length);
  if (mysql_socket_getfd(connect_sock) != INVALID_SOCKET ||
      (socket_errno != SOCKET_EINTR && socket_errno != SOCKET_EAGAIN))
    break;
}
if (mysql_socket_getfd(connect_sock) == INVALID_SOCKET)
{
  /*
    accept(2) failed on the listening port, after many retries.
    There is not much details to report about the client,
    increment the server global status variable.
  */
  connection_errors_accept++;
  if ((m_error_count++ & 255) == 0) // This can happen often
    sql_print_error("Error in accept: %s", strerror(errno));
  if (socket_errno == SOCKET_ENFILE || socket_errno == SOCKET_EMFILE)
    sleep(1);             // Give other threads some time
  return NULL;
}
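Two details of the excerpt above are worth spelling out. The `(m_error_count++ & 255) == 0` test writes only one log line per 256 failures, and the `sleep(1)` applies only to ENFILE and EMFILE, so as quoted there is no pause for EBADF. The Python sketch below models just the log throttle to show how many failures can hide behind each logged line; `logged_lines` is a hypothetical name, not MySQL code.

```python
def logged_lines(failures):
    """Count log lines written for a number of consecutive accept() failures,
    using the same (counter & 255) == 0 throttle as the excerpt above."""
    error_count = 0
    logged = 0
    for _ in range(failures):
        if (error_count & 255) == 0:
            logged += 1
        error_count += 1
    return logged
```

Under this throttle, 1000 consecutive failures produce only 4 log lines, so a log that fills steadily with this message implies a far larger number of failed accept calls.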

- Edward Hibbert

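The socat workaround is not pretty, but it has been holding up well, so I suggest that others who hit this problem give it a try. This looks like a long-standing issue that may affect only certain environments (see Google and @Roel's comment). - Edward Hibbert

Update: with the socat workaround in place I still see some loops producing these errors, but they appear to terminate eventually, so the system is basically stable. - Edward Hibbert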

- sethsn · Accepted Answer

I found the problem (or possibly one of the problems). Here is an excerpt from an strace of mysqld:

...
socket(PF_INET6, SOCK_STREAM, IPPROTO_TCP) = 20
write(2, "2017-01-29T22:22:45.433033Z 0 [N"..., 72) = 72
setsockopt(20, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
setsockopt(20, SOL_IPV6, IPV6_V6ONLY, [0], 4) = 0
bind(20, {sa_family=AF_INET6, sin6_port=htons(3306), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
listen(20, 70)                          = 0
fcntl(20, F_GETFL)                      = 0x2 (flags O_RDWR)
fcntl(20, F_SETFL, O_RDWR|O_NONBLOCK)   = 0
 ...
accept(20, {sa_family=AF_INET6, sin6_port=htons(58332), inet_pton(AF_INET6, "::ffff:127.0.0.1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 37
rt_sigaction(SIGCHLD, {SIG_DFL, [CHLD], SA_RESTORER|SA_RESTART, 0x7f3ddeac84b0}, {SIG_DFL, [], 0}, 8) = 0
getpeername(37, {sa_family=AF_INET6, sin6_port=htons(58332), inet_pton(AF_INET6, "::ffff:127.0.0.1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
getsockname(37, {sa_family=AF_INET6, sin6_port=htons(3306), inet_pton(AF_INET6, "::ffff:127.0.0.1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
open("/etc/hosts.allow", O_RDONLY)      = 38
fstat(38, {st_mode=S_IFREG|0644, st_size=589, ...}) = 0
read(38, "# /etc/hosts.allow: list of host"..., 4096) = 589
read(38, "", 4096)                      = 0
close(38)                               = 0
open("/etc/hosts.deny", O_RDONLY)       = 38
fstat(38, {st_mode=S_IFREG|0644, st_size=704, ...}) = 0
read(38, "# /etc/hosts.deny: list of hosts"..., 4096) = 704
close(38)                               = 0
socket(PF_LOCAL, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 38
connect(38, {sa_family=AF_LOCAL, sun_path="/dev/log"}, 110) = 0
sendto(38, "<36>Jan 29 14:23:08 mysqld[13052"..., 72, MSG_NOSIGNAL, NULL, 0) = 72
shutdown(20, SHUT_RDWR)                 = 0
close(20)                               = 0

poll([{fd=20, events=POLLIN}, {fd=22, events=POLLIN}], 2, -1) = 1 ([{fd=20, revents=POLLNVAL}])
accept(-1, 0x7ffe6ebd7160, 0x7ffe6ebd70fc) = -1 EBADF (Bad file descriptor)
write(2, "2017-01-29T22:23:08.109451Z 0 [E"..., 75) = 75
 ... rinse and repeat *REALLY* fast!

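While locking down the system with tcp_wrappers, I had accidentally removed mysqld from both hosts.allow and hosts.deny. It appears that after checking hosts.allow and hosts.deny, mysqld shuts down and closes the socket, just as you would expect (in the trace this is fd 20, the listening socket). However, it then immediately starts polling the (now nonexistent) socket for activity.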
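The same pair of results can be seen outside MySQL. The following Python sketch (an illustration on a Unix-like system, not MySQL code) closes a listening descriptor and then polls it: poll returns at once with POLLNVAL instead of blocking, and accept fails with EBADF, which is why a loop that keeps retrying spins so fast.

```python
import errno
import os
import select
import socket

# Create a listening socket on an ephemeral local port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
fd = listener.fileno()

poller = select.poll()
poller.register(fd, select.POLLIN)

# Close the descriptor while it is still registered with the poller.
os.close(fd)

# poll() does not wait for the timeout: the closed fd is reported as invalid.
events = poller.poll(1000)

accept_errno = None
try:
    listener.accept()
except OSError as exc:
    accept_errno = exc.errno  # EBADF, as in the strace above

# Forget the stale descriptor so the socket object does not close it again.
listener.detach()

print(events == [(fd, select.POLLNVAL)], accept_errno == errno.EBADF)
```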

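I just ran another test with tcp_wrappers configured correctly. Everything works when I connect from an authorized host, but when I connect from a blocked address the same problem occurs. Based on this, I suggest using some other tool to protect mysqld and keeping your tcp_wrappers configuration more open than your firewall. That said, this bug should still be fixed!

This fix has not yet stood the test of time, so, as always, your mileage may vary. I hope it helps.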

Nick