Ruby Process.spawn stdout => pipe buffer size limit

Question

4

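In Ruby, I am using Process.spawn to run a command in a new process. I have opened a pipe to capture stdout and stderr from the spawned process. This works well until the number of bytes written to the pipe (the command's stdout) exceeds 64 KB, at which point the command never finishes. I think the pipe buffer size has been reached and writes to the pipe are now blocked, so the process can never finish. In my actual application, I am running a long command that produces a lot of stdout, which I need to capture and save when the process finishes. Is there a way to raise the buffer size or, better yet, have the buffer drained so that the limit is never hit?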

cmd = "for i in {1..6600}; do echo '123456789'; done"  #works fine at 6500 iterations.

pipe_cmd_in, pipe_cmd_out = IO.pipe
cmd_pid = Process.spawn(cmd, :out => pipe_cmd_out, :err => pipe_cmd_out)

Process.wait(cmd_pid)
pipe_cmd_out.close
out = pipe_cmd_in.read
puts "child: cmd out length = #{out.length}"

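Update: Open3::capture2e does work for the simple example I showed. However, for my actual application I also need to be able to get the pid of the spawned process, and to control when I block execution. The general idea is that I fork a non-blocking process. In this fork, I spawn a command. I send the command pid back to the parent process, then wait for the command to finish to get the exit status. When the command finishes, the exit status is sent back to the parent. In the parent, a loop checks the DB every 1 second for control actions such as pause and resume. If it gets a control action, it sends the appropriate signal to the command pid to stop or continue it. When the command eventually finishes, the parent hits the rescue block, reads the exit status pipe, and saves it to the DB. Here is what my actual flow looks like: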

#pipes for communicating the command pid, and exit status from child to parent
pipe_parent_in, pipe_child_out = IO.pipe
pipe_exitstatus_read, pipe_exitstatus_write = IO.pipe

child_pid = fork do
    pipe_cmd_in, pipe_cmd_out = IO.pipe
    cmd_pid = Process.spawn(cmd, :out => pipe_cmd_out, :err => pipe_cmd_out)
    pipe_child_out.write cmd_pid  #send command pid to parent
    pipe_child_out.close
    Process.wait(cmd_pid)
    exitstatus = $?.exitstatus
    pipe_exitstatus_write.write exitstatus  #send exitstatus to parent
    pipe_exitstatus_write.close
    pipe_cmd_out.close
    out = pipe_cmd_in.read
    #save out to DB
end

#blocking read to get the command pid from the child
pipe_child_out.close
cmd_pid = pipe_parent_in.read.to_i

loop do
    begin
        Process.getpgid(cmd_pid)  #when command is done, this will except
        @job.reload #refresh from DB

        #based on status in the DB, pause / resume command
        if @job.status == 'pausing'
            Process.kill('SIGSTOP', cmd_pid)
        elsif @job.status == 'resuming'
            Process.kill('SIGCONT', cmd_pid)
        end
    rescue
        #command is no longer running
        pipe_exitstatus_write.close
        exitstatus = pipe_exitstatus_read.read
        #save exit status to DB
        break
    end
    sleep 1
end

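Note: I cannot have the parent process poll the command output pipe, because the parent would then block waiting for the pipe to close. It would then be unable to pause and resume the command via the control loop.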

- jnome

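I'm looking for an answer to the same question about increasing the pipe buffer size, but I can offer a solution that lets you poll the process state without blocking. You can do this: thrd = Process.detach(pid) # returns a thread; then test thrd.join(1), and if it is truthy the command has exited, so do whatever processing you want. thrd.join(1) returns the thread when the process has completed, or nil if the timeout of 1 second is reached (meaning the thread has not returned yet). - Sam Woods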
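
The polling idea in this comment could be sketched roughly as follows. This is only an illustration: the child here is a hypothetical stand-in (a short Ruby one-liner launched through RbConfig.ruby) that writes no output, and the 0.1 second timeout and the polls counter are assumptions chosen to keep the example short.

```ruby
require "rbconfig"

# Hypothetical stand-in for the real command: sleeps briefly, then exits with 3.
pid = Process.spawn(RbConfig.ruby, "-e", "sleep 0.3; exit 3")

# Process.detach returns a thread that reaps the child when it exits.
waiter = Process.detach(pid)

polls = 0
# join(timeout) returns nil on timeout and the thread once the child has exited.
until waiter.join(0.1)
  polls += 1
  # Control work (checking the DB, sending SIGSTOP / SIGCONT) would go here.
end

status = waiter.value # Process::Status of the finished child
puts "exit status: #{status.exitstatus}, polled #{polls} time(s)"
```

Because the waiter thread's value is a Process::Status, the exit status is still available without calling Process.wait in the polling loop.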

2 Answers

0

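Indeed, your diagnosis is most likely correct. You could implement a select-and-read loop on the pipe while waiting for the process to end, but you can probably get what you need more simply with the stdlib's Open3::capture2e.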
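
A select-and-read loop of the kind mentioned here might look like the sketch below. It is a minimal illustration under stated assumptions, not the answerer's code: the child is a hypothetical Ruby one-liner that writes 66,000 bytes, and the 1 second timeout and 4096-byte chunk size are arbitrary choices.

```ruby
require "rbconfig"

reader, writer = IO.pipe
# Stand-in child (assumption): writes 6600 lines of 10 bytes each.
pid = Process.spawn(RbConfig.ruby, "-e", "6600.times { puts '123456789' }",
                    out: writer, err: writer)
writer.close # the parent must close its copy, or EOF never arrives

output = String.new
loop do
  # Wait up to 1 second for data; IO.select returns nil on timeout.
  if IO.select([reader], nil, nil, 1).nil?
    # Timed out with no data: control work (pause/resume checks) could go here.
    next
  end

  begin
    output << reader.read_nonblock(4096)
  rescue IO::WaitReadable
    next  # spurious wakeup, try again
  rescue EOFError
    break # every write end of the pipe has been closed
  end
end
reader.close

# The pipe is drained, so waiting can no longer deadlock on a full buffer.
_, status = Process.wait2(pid)
puts "read #{output.bytesize} bytes, exit status #{status.exitstatus}"
```

The key point is the ordering: the pipe is emptied while the child is still running, and the wait happens only after EOF.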

- dbenhur

1

Process.wait is needed to get the exit status. It blocks execution, which makes a read loop on the pipe impossible. - jnome

- user1118597 · Accepted Answer

This code seems to do what you want, and may be illustrative.

cmd = "for i in {1..6600}; do echo '123456789'; done"

pipe_cmd_in, pipe_cmd_out = IO.pipe
cmd_pid = Process.spawn(cmd, :out => pipe_cmd_out, :err => pipe_cmd_out)

@exitstatus = :not_done
Thread.new do
  Process.wait(cmd_pid)
  @exitstatus = $?.exitstatus
end

pipe_cmd_out.close
out = pipe_cmd_in.read
sleep(0.1) while @exitstatus == :not_done
puts "child: cmd out length = #{out.length}; Exit status: #{@exitstatus}"

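In general, sharing data (@exitstatus) between threads requires more care, but it works here because it is written only once, by the thread, after being initialized. (It turns out that $?.exitstatus can return nil, which is why I initialized it to something else.) The call to sleep() is unlikely to execute even once, because the read() above it will not complete until the spawned process has closed its stdout.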
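
As a variation on the code above, the shared instance variable can be avoided by letting each thread return its result through Thread#value. The sketch below is an assumption-laden illustration rather than the answerer's method: the child is a hypothetical Ruby one-liner, and the drain and waiter thread names and the 1 second join timeout are made up for the example.

```ruby
require "rbconfig"

reader, writer = IO.pipe
# Stand-in child (assumption): writes 66,000 bytes, more than a 64 KB pipe buffer.
cmd_pid = Process.spawn(RbConfig.ruby, "-e", "6600.times { puts '123456789' }",
                        out: writer, err: writer)
writer.close

# Drain the pipe in the background so the child never blocks on a full buffer.
drain = Thread.new { reader.read }
# Reap the child in another thread; the block's result becomes the thread's value.
waiter = Thread.new { Process.wait2(cmd_pid).last }

# The main thread stays free for a control loop and still knows cmd_pid.
until waiter.join(1)
  # Pause / resume checks (Process.kill('SIGSTOP', cmd_pid), etc.) could go here.
end

out = drain.value     # joins the reader thread and returns everything it read
status = waiter.value # Process::Status
puts "child: cmd out length = #{out.length}; Exit status: #{status.exitstatus}"
```

Thread#value joins the thread before returning, so no sleep-based polling of a shared variable is needed.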