Bash: capture the output of a command run in the background

Question

77

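I'm trying to write a bash script that captures the output of a command running in the background. Unfortunately I can't get it to work: the variable I assign the output to is empty. If I replace the assignment with an echo command, everything works as expected.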

#!/bin/bash

function test {
    echo "$1"
}

echo $(test "echo") &
wait

a=$(test "assignment") &
wait

echo $a

echo done

This code produces the following output:

echo

done

Changing the assignment to

a=`echo $(test "assignment") &`

works, but it seems there should be a better way of doing this.

- rthur

6 Answers

48

A very robust way to deal with coprocesses in Bash is to use the coproc builtin.

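Suppose you have a script or function called banana that you want to run in the background, capturing all of its output while you do some other stuff, and then wait until it is done. I'll simulate that with the following: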

banana() {
    for i in {1..4}; do
        echo "gorilla eats banana $i"
        sleep 1
    done
    echo "gorilla says thank you for the delicious bananas"
}

stuff() {
    echo "I'm doing this stuff"
    sleep 1
    echo "I'm doing that stuff"
    sleep 1
    echo "I'm done doing my stuff."
}

You would run banana with coproc like this:

coproc bananafd { banana; }

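This is like running banana &, with one extra: it creates two file descriptors and stores them in the array bananafd (the coprocess's output at index 0, its input at index 1). You can capture banana's output with the read builtin: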

IFS= read -r -d '' -u "${bananafd[0]}" banana_output
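The input side at index 1 is not used anywhere in this answer. As a minimal sketch of how both ends fit together, the following writes one line to a coprocess and reads the transformed line back. The coprocess name UPPER, the use of tr, and the variable in_fd are illustrative assumptions, not part of the original example, and the {var}>&- form needs bash 4.1 or later:

```shell
#!/bin/bash
# Hypothetical example: a coprocess that upper-cases whatever it reads.
coproc UPPER { tr 'a-z' 'A-Z'; }

# Duplicate the read end to fd 3 right away, because bash unsets the
# UPPER array once the coprocess has terminated.
exec 3<&"${UPPER[0]}"
in_fd=${UPPER[1]}

echo "gorilla" >&"$in_fd"   # write to the coprocess's stdin
exec {in_fd}>&-             # close the write end so tr sees EOF and flushes

IFS= read -r -u 3 result    # blocks until tr has written its output
exec 3<&-                   # release our copy of the read end

echo "$result"
```

Closing the write end matters here: tr buffers its output when writing to a pipe, so nothing may arrive until it sees end of input.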

Try it:

#!/bin/bash

banana() {
    for i in {1..4}; do
        echo "gorilla eats banana $i"
        sleep 1
    done
    echo "gorilla says thank you for the delicious bananas"
}

stuff() {
    echo "I'm doing this stuff"
    sleep 1
    echo "I'm doing that stuff"
    sleep 1
    echo "I'm done doing my stuff."
}

coproc bananafd { banana; }

stuff

IFS= read -r -d '' -u "${bananafd[0]}" banana_output

echo "$banana_output"

Caveat: you must be done with stuff before banana ends! If the gorilla is quicker than you:

#!/bin/bash

banana() {
    for i in {1..4}; do
        echo "gorilla eats banana $i"
    done
    echo "gorilla says thank you for the delicious bananas"
}

stuff() {
    echo "I'm doing this stuff"
    sleep 1
    echo "I'm doing that stuff"
    sleep 1
    echo "I'm done doing my stuff."
}

coproc bananafd { banana; }

stuff

IFS= read -r -d '' -u "${bananafd[0]}" banana_output

echo "$banana_output"

In this case, you'll get an error like this:

./banana: line 22: read: : invalid file descriptor specification

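You can check whether it's too late (i.e., whether you took too long doing your stuff), because once the coprocess has finished, bash removes the values in the array bananafd. That is why we got the error above.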

#!/bin/bash

banana() {
    for i in {1..4}; do
        echo "gorilla eats banana $i"
    done
    echo "gorilla says thank you for the delicious bananas"
}

stuff() {
    echo "I'm doing this stuff"
    sleep 1
    echo "I'm doing that stuff"
    sleep 1
    echo "I'm done doing my stuff."
}

coproc bananafd { banana; }

stuff

if [[ -n ${bananafd[@]} ]]; then
    IFS= read -r -d '' -u "${bananafd[0]}" banana_output
    echo "$banana_output"
else
    echo "oh no, I took too long doing my stuff..."
fi

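Finally, if you really don't want to miss any of the gorilla's moves, even when your own stuff takes too long, you can copy banana's file descriptor to another fd, 3 for example, do your stuff, and then read from 3: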

#!/bin/bash

banana() {
    for i in {1..4}; do
        echo "gorilla eats banana $i"
        sleep 1
    done
    echo "gorilla says thank you for the delicious bananas"
}

stuff() {
    echo "I'm doing this stuff"
    sleep 1
    echo "I'm doing that stuff"
    sleep 1
    echo "I'm done doing my stuff."
}

coproc bananafd { banana; }

# Copy file descriptor bananafd[0] to 3 (as an input descriptor)
exec 3<&"${bananafd[0]}"

stuff

IFS= read -r -d '' -u 3 output
echo "$output"

This works very well! The final read also plays the role of wait, so output will contain the complete output of banana.

That was great: no temp files to deal with (bash handles everything silently) and 100% pure bash!

Hope this helps!

- gniourf_gniourf

7

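@user2352030's answer is not simpler! It is even more complex, since it requires you to create a file and then delete it. If you want to use his answer safely, you'll need mktemp, and maybe even a trap! Don't be fooled by apparent simplicity! - gniourf_gniourf

Maybe I'm wrong, but as I understand it, his portable sh approach forces me to do that, whereas the bash construct exec 3< <(command) does not. - rthur

1

@user2352030 Another advantage of coproc is that you can tell whether the background process has finished by checking whether the array bananafd is set. - gniourf_gniourf

@rthur, yes, you're right, I was talking about the last method. - gniourf_gniourf

3

This looked like the best solution to my problem, but officially only one coproc may run at a time, which limits its use. In my case I was initializing four different bash variables, which meant running four expensive commands that took 2.5 seconds sequentially; running them in parallel cut that to about 0.9 seconds. It did handle four coprocs nonetheless, but the three that were started before the first had finished each triggered a warning. - ShadowRanger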


16

One way to capture a background command's output is to redirect it to a file and read that file once the background process has finished:

test "assignment" > /tmp/_out &
wait
a=$(</tmp/_out)
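A comment on another answer points out that a safe version of the file approach needs mktemp and possibly a trap. A minimal sketch of that variant, assuming a hypothetical slow_job function in place of the asker's test function (job_pid and status are likewise illustrative names):

```shell
#!/bin/bash
# Hypothetical stand-in for the asker's function.
slow_job() { sleep 1; echo "$1"; }

tmp=$(mktemp) || exit 1          # unique, private temp file instead of /tmp/_out
trap 'rm -f "$tmp"' EXIT         # remove it when the script exits

slow_job "assignment" > "$tmp" &
job_pid=$!

# ... other work can happen here while the job runs ...

wait "$job_pid"                  # wait for this specific job
status=$?                        # its exit status
a=$(<"$tmp")                     # read the captured output

echo "$a (exit status $status)"
```

Waiting on the specific PID rather than on all jobs also makes the job's exit status available, which a bare wait does not give you.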

- anubhava

Is there a way to do this without using a file? - rthur

Yes, there is a (bash-only) way. See my answer. - Jo So

1

This was the only method that worked for me, thanks! - chrismarx

3

I also use file redirections. For example:

exec 3< <({ sleep 2; echo 12; })  # Launch as a job stdout -> fd3
cat <&3  # Lock read fd3

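A more realistic case: say I want the output of 4 parallel workers: toto, titi, tata and tutu. I redirect each one to a different file descriptor (held in the fd variable). Reading from those file descriptors then blocks until EOF, which happens when the pipe is closed, which in turn happens when the command completes.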

#!/usr/bin/env bash

# Declare data to be forked
a_value=(toto titi tata tutu)
a_pid=()
msg=""

# Spawn child sub-processes
# NOTE: $cmd is never set in this example, so it expands to nothing below.
for i in {0..3}; do
  ((fd=50+i))
  echo -e "1/ Launching command: $cmd with file descriptor: $fd!"
  eval "exec $fd< <({ sleep $((i)); echo ${a_value[$i]}; })"
  a_pid+=($!)  # Store pid
done

# Join child: wait them all and collect std-output
for i in {0..3}; do
  ((fd=50+i))
  echo -e "2/ Getting result of: $cmd with file descriptor: $fd!"
  msg+="$(cat <&$fd)\n"
  eval "exec $fd<&-"  # Close the descriptor once it has been drained
done

# Print result
echo -e "===========================\nResult:"
echo -e "$msg"

Should output:

1/ Launching command:  with file descriptor: 50!
1/ Launching command:  with file descriptor: 51!
1/ Launching command:  with file descriptor: 52!
1/ Launching command:  with file descriptor: 53!
2/ Getting result of:  with file descriptor: 50!
2/ Getting result of:  with file descriptor: 51!
2/ Getting result of:  with file descriptor: 52!
2/ Getting result of:  with file descriptor: 53!
===========================
Result:
toto
titi
tata
tutu

Note 1: coproc supports only one coprocess at a time, not several.

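Note 2: in older versions of bash (4.2), the wait command was buggy and could not retrieve the status of the jobs I had launched. It works well in bash 5, but file redirection works in all versions.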
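The eval and the hard-coded descriptor numbers above can be avoided. A sketch of the same idea, assuming bash 4.1 or later, where the {fd}< form lets bash pick a free descriptor and store its number in a variable (the fds and results arrays are names introduced here for illustration, and each worker sleeps one second instead of a varying amount):

```shell
#!/usr/bin/env bash
# Requires bash 4.1+ for the {var}< redirection form.
a_value=(toto titi tata tutu)
fds=()

# Start all workers; bash allocates a free descriptor for each one.
for i in "${!a_value[@]}"; do
  exec {fd}< <(sleep 1; echo "${a_value[$i]}")
  fds[i]=$fd
done

# Collect results in launch order; each cat blocks until that worker exits.
results=()
for i in "${!fds[@]}"; do
  fd=${fds[$i]}
  results[i]=$(cat <&"$fd")
  exec {fd}<&-        # close the descriptor once it has been drained
done

printf '%s\n' "${results[@]}"
```

Because all four workers sleep concurrently, the whole script should take roughly one second rather than four.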

- Tinmarino

2

When you run commands in the background and wait for both of them, group them.

{ echo a & echo b & wait; } | nl

The output will be:

     1  a
     2  b

But note that the output can come out of order if the second task runs faster than the first.

{ { sleep 1; echo a; } & echo b & wait; } | nl

Reversed output:

     1  b
     2  a

If the output of the two background jobs has to be kept separate, it must be buffered somewhere, typically in a file. For example:

#! /bin/bash

t0=$(date +%s)                               # Get start time

trap 'rm -f "$ta" "$tb"' EXIT                # Remove temp files on exit.

ta=$(mktemp)                                 # Create temp file for job a.
tb=$(mktemp)                                 # Create temp file for job b.

{ exec >$ta; echo a1; sleep 2; echo a2; } &  # Run job a.
{ exec >$tb; echo b1; sleep 3; echo b2; } &  # Run job b.

wait                                         # Wait for the jobs to finish.

cat "$ta"                                    # Print output of job a.
cat "$tb"                                    # Print output of job b.

t1=$(date +%s)                               # Get end time

echo "t1 - t0: $((t1-t0))"                   # Display execution time.

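The script runs for three seconds in total, even though the background jobs sleep for five seconds combined. The output of the background jobs is also in order.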

a1
a2
b1
b2
t1 - t0: 3

You can also use a memory buffer to store the output of your jobs. But this only works if the buffer is big enough to hold a job's entire output.

#! /bin/bash

t0=$(date +%s)

trap 'rm -f /tmp/{a,b}' EXIT
mkfifo /tmp/{a,b}

buffer() { dd of="$1" status=none iflag=fullblock bs=1K; }

pids=()
{ echo a1; sleep 2; echo a2; } > >(buffer /tmp/a) &
pids+=($!)
{ echo b1; sleep 3; echo b2; } > >(buffer /tmp/b) &
pids+=($!)

# Wait only for the jobs but not for the buffering `dd`.
wait "${pids[@]}" 

# This will wait for `dd`.
cat /tmp/{a,b}

t1=$(date +%s)

echo "t1 - t0: $((t1-t0))"

The above also works with cat instead of dd, but then you cannot control the buffer size.

- ceving

0

If you have GNU Parallel, you can probably use parset:

myfunc() {
  sleep 3
  echo "The input was"
  echo "$@"
}
export -f myfunc
parset a,b,c myfunc ::: myarg-a "myarg  b" myarg-c
echo "$a"
echo "$b"
echo "$c"

See: https://www.gnu.org/software/parallel/parset.html

- Ole Tange


- Jo So · Accepted Answer

Bash does indeed have a feature called Process Substitution to accomplish this.

$ echo <(yes)
/dev/fd/63

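Here, the expression <(yes) is replaced with the pathname of a (pseudo-device) file that is connected to the standard output of the asynchronous job yes (which prints the string y in an endless loop).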

Now let's try to read from it:

$ cat /dev/fd/63
cat: /dev/fd/63: No such file or directory

The problem here is that the yes process had already terminated in the meantime, because it received a SIGPIPE (its stdout had no readers).

The solution is the following construct:

$ exec 3< <(yes)  # Save stdout of the 'yes' job as (input) fd 3.

This opens the file as input fd 3 before the background job is started.

You can now read from the background job whenever you like. As a silly example:

$ for i in 1 2 3; do read <&3 line; echo "$line"; done
y
y
y
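When the descriptor is no longer needed, it can be closed with exec 3<&-. The following is a small sketch of the same loop in script form, collecting the lines into an array (the lines array is an assumption added for illustration) and then closing fd 3; once no reader remains, yes should receive SIGPIPE on a later write and terminate:

```shell
#!/bin/bash
exec 3< <(yes)          # start the endless writer, as above

lines=()
for i in 1 2 3; do
    IFS= read -r line <&3
    lines+=("$line")
done

exec 3<&-               # close our read end of the pipe

printf '%s\n' "${lines[@]}"
```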

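Note that this has slightly different semantics from having the background job write to a drive-backed file: the background job will block when the buffer is full (you empty the buffer by reading from the fd). By contrast, writing to a drive-backed file only blocks when the hard drive doesn't respond.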

Process substitution is not a POSIX sh feature.

Here's a quick hack to give an asynchronous job drive backing (almost) without assigning a filename to it:

$ yes > backingfile &  # Start job in background writing to a new file. Do also look at `mktemp(3)` and the `sh` option `set -o noclobber`
$ exec 3< backingfile  # open the file for reading in the current shell, as fd 3
$ rm backingfile       # remove the file. It will disappear from the filesystem, but there is still a reader and a writer attached to it which both can use it.

$ for i in 1 2 3; do read <&3 line; echo "$line"; done
y
y
y

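Linux also recently gained the O_TEMPFILE option, which makes such hacks possible without the file ever being visible at all. I don't know whether bash already supports it.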

Update:

@rthur, if you want to capture the whole output from fd 3, then use

output=$(cat <&3)
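Put together for the case in the question, this could look like the sketch below. The function name get_value is an assumption: it stands in for the question's test function and is renamed only to avoid shadowing the test builtin.

```shell
#!/bin/bash
# Hypothetical stand-in for the question's function.
get_value() { sleep 1; echo "$1"; }

exec 3< <(get_value "assignment")   # the job starts running in the background

echo "doing other work in the meantime"

a=$(cat <&3)    # blocks until the job closes its stdout, then captures it all
exec 3<&-       # release the descriptor

echo "$a"
```

The assignment itself runs in the current shell here, which is why the variable is set afterwards, unlike with a=$(...) &.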

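Note that in general you cannot capture binary data: it is only a well-defined operation if the output is text in the POSIX sense. The implementations I know of simply filter out all NUL bytes. Furthermore, POSIX specifies that all trailing newlines must be removed.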
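The trailing-newline behaviour is easy to see in isolation. The sketch below also shows a common workaround, which is not part of the original answer: append a sentinel character inside the command substitution and strip it afterwards (the variable names plain and kept are illustrative).

```shell
#!/bin/bash
plain=$(printf 'line\n\n\n')            # trailing newlines are stripped
echo "${#plain}"                        # prints 4, the length of "line"

# Sentinel idiom: append a marker character, then remove it again.
kept=$(printf 'line\n\n\n'; printf x)
kept=${kept%x}
echo "${#kept}"                         # prints 7: "line" plus three newlines
```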

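(Note also that capturing the output will result in OOM if the writer never stops, and yes never stops. But naturally that problem holds even for read if the line separator is never written.)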