Wait for background processes to finish before exiting the script

67

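How can I make sure that all background processes have finished before my script (Tcl/Bash) exits?

I was thinking of writing the PIDs of all my background processes to a pidfile and then, at the end, running pgrep against the pidfile to check whether any of them are still running before I exit.

Is there a simpler way to do this? And is there a Tcl-specific way?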


If you have /proc mounted, checking in there is probably the fastest way to look up a PID from pure Tcl. - Donal Fellows
5 Answers

144
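If you want to wait for jobs to finish, use wait. This makes the shell wait until all background jobs have completed. However, if any of your jobs daemonizes itself, it is no longer a child of the shell, and wait has no effect on it (as far as the shell is concerned, the child is already done). Indeed, when a process daemonizes itself, it does so by forking a new process that takes over its role and then exiting itself.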
#!/bin/sh
{ sleep 5; echo waking up after 5 seconds; } &
{ sleep 1; echo waking up after 1 second; } &
wait
echo all jobs are done!
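
A bare wait discards the jobs' exit statuses. If it matters whether each job succeeded, one option is to wait on each PID individually. The following is only a sketch: the subshells with sleep and exit are placeholder jobs, and the names pids and failures are made up for illustration.

```shell
#!/bin/sh
# Sketch: wait for each background job individually and count failures.
# The subshells below are placeholder jobs; replace them with real work.

pids=""
( sleep 1; exit 0 ) & pids="$pids $!"
( sleep 1; exit 3 ) & pids="$pids $!"
( sleep 1; exit 0 ) & pids="$pids $!"

failures=0
for pid in $pids; do
    # "wait PID" returns the exit status of that particular job
    if wait "$pid"; then
        echo "job $pid succeeded"
    else
        echo "job $pid failed with status $?"
        failures=$((failures + 1))
    fi
done

echo "all jobs are done, $failures failed"
```

As with a bare wait, this only covers jobs that are still children of the shell, so it does not help with processes that daemonize themselves.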

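This script works, but when I run it I get the warning ": command not found" for "sleep 10 & wait". Apart from that annoying warning, the wait command works as expected. Any idea what causes the warning? Note: there is a carriage return between "sleep 10 &" and "wait", but the forum is stripping it. - Marco Marsala
@MarcoMarsala, the & is followed by the next command, so the interpreter tries to find a program whose name is the carriage return character. - guest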

11
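You can use kill -0 to check whether a particular PID is still running.
Suppose you have a file named pid in the current working directory that contains a list of PIDs, one per line.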
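while true
do
    if [ -s pid ]; then
        for pid in $(cat pid)
        do
            echo "Checking $pid"
            # kill -0 sends no signal; it only tests whether the PID exists.
            # Remove the PID from the file once the process is gone.
            kill -0 "$pid" 2>/dev/null || sed -i "/^$pid$/d" pid
        done
        sleep 1  # avoid a busy loop while processes are still running
    else
        echo "All your processes completed"  # every PID in the file has finished; continue here
        break
    fi
done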

1
+1. I would only add an explanation of how to store the child PIDs: cat $items_list_file | { # start the child process ( some_command )& echo $! >> pid } - Yordan Georgiev
5
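In general, you cannot be sure that the PID you are testing still identifies your process. Because of PID reuse, some other process may have been given the PID you are testing. More on PID reuse in Linux: http://goo.gl/eZceq2 - smbear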
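
When the background processes are started by the script itself, the PID-reuse concern can largely be sidestepped by recording $! at launch and passing the recorded PIDs to wait rather than kill -0, because wait only accepts children of the current shell. A minimal sketch, assuming a temporary pid file and placeholder jobs (pidfile, waited, and the sleep commands are illustrative, not from the answer above):

```shell
#!/bin/sh
# Sketch: record each child's PID with $! and wait on the recorded PIDs.
# The pid file location and the placeholder jobs are assumptions.

pidfile=$(mktemp)

for n in 1 2 3; do
    ( sleep 1; echo "job $n finished" ) &
    echo "$!" >> "$pidfile"   # $! is the PID of the most recent background job
done

# wait rejects PIDs that are not children of this shell, so an unrelated
# process holding a recycled PID is not silently treated as one of our jobs.
waited=0
while read -r pid; do
    wait "$pid"
    waited=$((waited + 1))
done < "$pidfile"

rm -f "$pidfile"
echo "waited for $waited jobs"
```

The loop reads the file through a redirection rather than a pipe so that it runs in the shell that started the jobs; wait would not see them from a subshell.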

3

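Warning: long code ahead.

Some time ago I faced a similar problem: launching several processes from a Tcl script and then waiting for all of them to finish. Here is the demo script I wrote to solve it.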

main.tcl

#!/usr/bin/env tclsh

# Launches many processes and waits for them to finish.
# This script works on systems that have the ps command, such as
# BSD, Linux, and OS X

package require Tclx; # For process-management utilities

proc updatePidList {stat} {
    global pidList
    global allFinished

    # Parse the process ID of the just-finished process
    lassign $stat processId howProcessEnded exitCode

    # Remove this process ID from the list of process IDs
    set pidList [lindex [intersect3 $pidList $processId] 0]
    set processCount [llength $pidList]

    # Occasionally, a child process quits but the signal is lost. This
    # block of code goes through the list of remaining process IDs
    # and removes those that have finished
    set updatedPidList {}
    foreach pid $pidList {
        if {![catch {exec ps $pid} errmsg]} {
            lappend updatedPidList $pid
        }
    }

    set pidList $updatedPidList

    # Show the remaining processes
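    # Test the pruned list; the count taken before pruning can be stale,
    # which would leave allFinished unset and vwait waiting forever
    if {[llength $pidList] > 0} {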
        puts "Waiting for [llength $pidList] processes"
    } else {
        set allFinished 1
        puts "All finished"
    }
}

# A signal handler that gets called when a child process finishes.
# This handler needs to exit quickly, so it delegates the real work to
# the proc updatePidList
proc childTerminated {} {
    # Restart the handler
    signal -restart trap SIGCHLD childTerminated

    # Update the list of process IDs
    while {![catch {wait -nohang} stat] && $stat ne {}} {
        after idle [list updatePidList $stat]
    }
}

#
# Main starts here
#

puts "Main begins"
set NUMBER_OF_PROCESSES_TO_LAUNCH 10
set pidList {}
set allFinished 0

# When a child process exits, call proc childTerminated
signal -restart trap SIGCHLD childTerminated

# Spawn many processes
for {set i 0} {$i < $NUMBER_OF_PROCESSES_TO_LAUNCH} {incr i} {
    set childId [exec tclsh child.tcl $i &]
    puts "child #$i, pid=$childId"
    lappend pidList $childId
    after 1000
}

# Do some processing
puts "list of processes: $pidList"
puts "Waiting for child processes to finish"
# Do some more processing if required

# After all done, wait for all to finish before exiting
vwait allFinished

puts "Main ends"

child.tcl

#!/usr/bin/env tclsh
# child script: simulate some lengthy operations

proc randomInteger {min max} {
    return [expr {int(rand() * ($max - $min + 1) * 1000 + $min)}]
}

set duration [randomInteger 10 30]
puts "  child #$argv runs for $duration miliseconds"
after $duration
puts "  child #$argv ends"

Sample output from running main.tcl

Main begins
child #0, pid=64525
  child #0 runs for 17466 miliseconds
child #1, pid=64526
  child #1 runs for 14181 miliseconds
child #2, pid=64527
  child #2 runs for 10856 miliseconds
child #3, pid=64528
  child #3 runs for 7464 miliseconds
child #4, pid=64529
  child #4 runs for 4034 miliseconds
child #5, pid=64531
  child #5 runs for 1068 miliseconds
child #6, pid=64532
  child #6 runs for 18571 miliseconds
  child #5 ends
child #7, pid=64534
  child #7 runs for 15374 miliseconds
child #8, pid=64535
  child #8 runs for 11996 miliseconds
  child #4 ends
child #9, pid=64536
  child #9 runs for 8694 miliseconds
list of processes: 64525 64526 64527 64528 64529 64531 64532 64534 64535 64536
Waiting for child processes to finish
Waiting for 8 processes
Waiting for 8 processes
  child #3 ends
Waiting for 7 processes
  child #2 ends
Waiting for 6 processes
  child #1 ends
Waiting for 5 processes
  child #0 ends
Waiting for 4 processes
  child #9 ends
Waiting for 3 processes
  child #8 ends
Waiting for 2 processes
  child #7 ends
Waiting for 1 processes
  child #6 ends
All finished
Main ends

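I tried this, but got the following error: can't wait for variable "allFinished": would wait forever while executing "vwait allFinished" - egorulz
I tested this solution on my Mac but not on other platforms. If I have time, I will test it on Linux. However, I don't have a Windows machine to find out there. Are you running it on Windows? - Hai Vu
Actually, I am running this on a FreeBSD machine. - egorulz
I have tested it on Linux and it works. I don't understand why it doesn't work on FreeBSD. - Hai Vu
Could it be a timing issue? In my test I had 3 background processes, the last of which ran for about 20 seconds, and that was the run in which I hit the error. - egorulz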

2

GNU parallel and xargs

These two tools can make scripts simpler and also cap the number of processes running at once (a process pool). For example:

seq 10 | xargs -P4 -I'{}' echo '{}'

Or:

seq 10 | parallel -j4  echo '{}'

See also: How to write a process-pool bash shell
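
To connect this back to the question: xargs itself returns only after every command it launched has exited, so the script needs no separate wait. A minimal sketch, assuming an xargs that supports -P (a GNU and BSD extension, not POSIX); the inputs a to d and the sleep are placeholders:

```shell
#!/bin/sh
# Sketch: run at most 2 workers at a time; xargs blocks until all are done.
# Each input line is passed to the worker as $1 instead of being pasted
# into the command string, which keeps odd input from being run as code.
output=$(printf '%s\n' a b c d |
    xargs -P 2 -I '{}' sh -c 'sleep 1; echo "done $1"' _ '{}')

echo "$output"
echo "xargs has returned; all workers are finished"
```

The order of the "done" lines can vary between runs because the workers execute concurrently.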

0

Even if you do not have the PIDs, you can still call wait after launching all the background processes. For example, in commandfile.sh:

bteq < input_file1.sql > output_file1.sql &
bteq < input_file2.sql > output_file2.sql &
bteq < input_file3.sql > output_file3.sql &
wait

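When this is invoked, for example from Python like this:

import subprocess

# Returns only after commandfile.sh exits, i.e. after its final wait
subprocess.call(['sh', 'commandfile.sh'])
print('all background processes done.')

the message is printed only after all the background processes have finished.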

