Parallelizing a ping sweep

Question

3

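I'm trying to scan an IP block of roughly 65,000 addresses. We were instructed to use bash and ICMP packets and to find a way to parallelize the scan. Here is what I came up with: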

#!/bin/bash
ping() {
  if ping -c 1 -W 5 131.212.$i.$j >/dev/null
  then
      ((++s))
      echo -n "*"
  else
      ((++f))
      echo -n "."
  fi
  ((++j))
  #if j has reached 255, set it to zero and increment i
  if [ $j -gt 255 ]; then
      j=0
      ((++i))
      echo "Pinging 131.212.$i.xx IP Block...\n"
  fi
}

s=0 #number of responses received
f=0 #number of failures received
i=0 #IP increment 1
j=0 #IP increment 2
curProcs=$(ps | wc -l)
maxProcs=$(getconf OPEN_MAX)
while [ $i -lt 256 ]; do
    curProcs=$(ps | wc -l)
    if [ $curProcs -lt $maxProcs ]; then
      ping &
    else
      sleep 10
    fi
done
echo "Found "$s" responses and "$f" timeouts."
echo /usr/bin/time -l
done

However, on macOS I get the following error:

redirection error: cannot duplicate fd: Too many open files

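My understanding is that I'm exceeding a resource limit. I tried to correct this by starting a new ping process only when the current process count is below the specified maximum, but that did not solve the problem.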

Thank you for your time and suggestions.

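Edit: There are a lot of good suggestions below for doing this with existing tools. Since I was limited by my academic requirements, I ended up splitting the ping loop into a separate process for each 12.34.x.x block. It's ugly, but it got the job done in under 5 minutes. This code has plenty of problems, but it may be a good starting point for someone in the future.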

#!/bin/bash

#############################
#      Ping Subfunction     #
#############################
# blocks with more responses will complete first since worst-case scenario
# is O(n) if no IPs generate a response
pingSubnet() {
  for ((j = 0 ; j <= 255 ; j++)); do
    # send a single ping with a wait time of 1 (-W 1), piping output to the bit bucket
    if ping -c 1 -W 1 131.212."$i"."$j" >/dev/null
    then
        ((++s))
    else
        ((++f))
    fi
  done
  #echo "Received $s responses with $f timeouts in block $i..."
  # output number of success results to the pipe opened at the start
  echo "$s" >"$pipe"
  exit 0
}

#############################
#   Variable Declaration    #
#############################
start=$(date +%s) #start of execution time
startMem=$(vm_stat | awk '/Pages free/ {print $3}' | awk 'BEGIN { FS = "\." }; {print ($1*0.004092)}' | sed 's/\..*$//');
startCPU=$(top -l 1 | grep "CPU usage" | awk '{print 100-$7;}' | sed 's/\..*$//')
s=0 #number of responses received
f=0 #number of failures received
i=0 #IP increment 1
j=0 #IP increment 2

#############################
#    Pipe Initialization    #
#############################
# create a pipe for child procs to write to
# child procs inherit runtime environment of parent proc, but cannot
# write back to it (like passing by value in C, but the whole env)
# hence, they need somewhere else to write back to that the parent
# proc can read back in
pipe=/tmp/pingpipe
trap 'rm -f $pipe' EXIT
if [[ ! -p $pipe ]]; then
    mkfifo $pipe
    exec 3<> $pipe
fi

#############################
#     IP Block Iteration    #
#############################
# adding an ampersand to the end forks the command to a separate, backgrounded
# child process. this allows for parallel computation but adds logistical
# challenges since children can't write the parent's variables
echo "Initiating scan processes..."
while [ $i -lt 256 ]; do
      #echo "Beginning 131.212.$i.x block scan..."
      #ping subnet asynchronously
      pingSubnet &
      ((++i))
done
echo "Waiting for scans to complete (this may take up to 5 minutes)..."
peakMem=$(vm_stat | awk '/Pages free/ {print $3}' | awk 'BEGIN { FS = "\." }; {print ($1*0.004092)}' | sed 's/\..*$//')
peakCPU=$(top -l 1 | grep "CPU usage" | awk '{print 100-$7;}' | sed 's/\..*$//')
wait
echo -e "done" >$pipe

#############################
#    Concat Pipe Outputs    #
#############################
# read each line from the pipe we created earlier, adding the number
# of successes up in a variable
success=0
echo "Tallying responses..."
while read -r line <$pipe; do
    if [[ "$line" == 'done' ]]; then
      break
    fi
    success=$((line+success))
done

#############################
#    Output Statistics      #
#############################
echo "Gathering Statistics..."
fail=$((65536-success))
#output program statistics
averageMem=$((peakMem-startMem))
averageCPU=$((peakCPU-startCPU))
end=$(date +%s) #end of execution time
runtime=$((end-start))
echo "Scan completed in $runtime seconds."
echo "Found $success active servers and $fail nonresponsive addresses with a timeout of 1."
echo "Estimated memory usage was $averageMem MB."
echo "Estimated CPU utilization was $averageCPU %"

- Reticulated Spline

1

Changing echo -n "*" to echo -n "${j} " shows that the value of j never changes inside the loop. - charlesreid1

1

Thanks. I hadn't realized that the execution environment is passed down to child processes but not the other way around. - Reticulated Spline
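The behavior described in these two comments can be reproduced in a few lines. The following is a minimal sketch, not part of the original thread; the names bump, count, and tmpfile are made up for illustration. It shows that a backgrounded function increments only its own copy of a variable, and that one possible way to get a result back is to have the child write it somewhere the parent can read.

```shell
#!/bin/bash
# Sketch: a backgrounded function runs in a child process, so its
# variable updates never reach the parent shell.
count=0

bump() {
  ((++count))                      # increments the child's private copy only
  echo "child sees count=$count"
}

bump &                             # '&' forks a child to run the function
wait                               # block until the child has exited
echo "parent sees count=$count"    # still 0 in the parent

# One way to pass a result back: let the child write to a file
# (tmpfile is a hypothetical name used only for this illustration).
tmpfile=$(mktemp)
bump >"$tmpfile" &
wait
echo "collected: $(cat "$tmpfile")"
rm -f "$tmpfile"
```

This is the same reason the counters s, f, i, and j in the question's first script never advance in the parent loop.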

3 Answers

2

Don't do that.

Use fping. It will probe far more efficiently than your program.

$ brew install fping

Through the magic of Homebrew, that will make it available.

- J_H

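It looks like fping could reduce this project to a single line of code, thank you. For the sake of learning, though, I'd still like to get my script working if possible. - Reticulated Spline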

2

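ICMP is not like TCP. Ping is essentially a host-to-host protocol, not an app-to-app protocol. It demultiplexes on the payload data echoed back to the source. You are planning to fork a large number of /sbin/ping processes, most of which will never receive a reply. That makes little sense, and it will blow through your "max file descriptors per child process" quota. Don't do it. Learn with a limited number of child pings, or use the industrial-strength tool fping for large-scale network probing. - J_H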
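For readers who want to see the limits involved, here is a small sketch (not from the original thread; show_limits is a made-up helper name). It prints the shell's open-file limit next to its per-user process limit. The two are separate limits: getconf OPEN_MAX in the question's script reports an open-file limit, so comparing a process count against it mixes the two.

```shell
#!/bin/bash
# Report the per-process open-file limit and the per-user process limit.
# Both are queried with the bash builtin ulimit; either may print
# "unlimited" depending on the system configuration.
show_limits() {
  echo "open files (ulimit -n):     $(ulimit -n)"
  echo "user processes (ulimit -u): $(ulimit -u)"
}

show_limits
```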

Well, Stack Overflow is a question and answer site for professional and enthusiast programmers. - James Brown

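ICMP is fundamentally a host-to-host protocol, so pinging 100 hosts with 100 /sbin/ping processes is entirely different from fping malloc'ing a data structure to hold 100 addresses. In the latter case, the kernel receives a response packet and hands it to fping, and we're done. In the former, the kernel delivers the response packet to every ping process and schedules each of them, and only then are we done. Given that the OP's requirements are high performance and scalability, this makes no sense. If learning is the goal, re-implement fping, perhaps using short-lived ping processes and a persistent tcpdump to collect the answers. - J_H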

1

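Granted, this is not as optimized as what you're trying to build, but you could start the maximum allowed number of processes in the background, wait for them to finish, and then start the next batch. Something like this (except that I'm using sleep 1):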

for i in {1..20}             # iterate some
do 
    sleep 1 &                # start in the background
    if ! ((i % 5))           # after every 5th (using mod to detect)
    then 
        wait %1 %2 %3 %4 %5  # wait for all jobs to finish
    fi
done
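
As a rough sketch of how the same batching idea might be applied to ping (this is not from the original answer), the functions below start a fixed number of probes, wait for each one by PID, and tally the exit statuses in the parent. The names probe and scan_batched are invented for this example. The helper is deliberately not called ping, because a shell function with that name would shadow the real binary. The unit of -W differs between ping implementations, so check man ping on your system before relying on the value.

```shell
#!/bin/bash
# probe HOST: succeed if HOST answers a single echo request.
# The -W value is an assumption; its unit depends on the ping implementation.
probe() {
  ping -c 1 -W 1 "$1" >/dev/null 2>&1
}

# scan_batched BATCH_SIZE HOST...
# Starts at most BATCH_SIZE probes at a time and waits for the whole
# batch before starting the next. Prints "<up> <down>" when finished.
scan_batched() {
  local batch=$1; shift
  local up=0 down=0 pids=() pid host
  for host in "$@"; do
    probe "$host" &
    pids+=("$!")                      # remember the child's PID
    if (( ${#pids[@]} >= batch )); then
      for pid in "${pids[@]}"; do     # wait returns each child's exit status
        if wait "$pid"; then ((++up)); else ((++down)); fi
      done
      pids=()
    fi
  done
  for pid in "${pids[@]}"; do         # reap the final, possibly partial, batch
    if wait "$pid"; then ((++up)); else ((++down)); fi
  done
  echo "$up $down"
}
```

A call such as scan_batched 64 131.212.0.{0..255} would scan one block, 64 addresses at a time. Because the parent collects results through wait, no pipe or temporary file is needed to pass counts back.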

- James Brown

- Mark Setchell · Accepted Answer

This should give you some ideas using GNU Parallel.

parallel --dry-run -j 64 -k ping 131.212.{1}.{2} ::: $(seq 1 3) ::: $(seq 11 13)

ping 131.212.1.11
ping 131.212.1.12
ping 131.212.1.13
ping 131.212.2.11
ping 131.212.2.12
ping 131.212.2.13
ping 131.212.3.11
ping 131.212.3.12
ping 131.212.3.13

-j 64 runs 64 pings in parallel at a time
--dry-run means do nothing, just show what would be run
-k means keep the output in order (just so you can make sense of it)

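::: introduces the arguments. I have repeated it with different numbers (1 through 3, and then 11 through 13) so that you can tell the two counters apart and see that every combination is generated.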
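
For comparison, the two ::: lists behave like two nested loops. The sketch below is plain bash with no GNU Parallel involved, and list_commands is a made-up name. It prints the same nine lines as the dry run above, with the first list varying slowest and the second fastest.

```shell
#!/bin/bash
# Print one "ping <address>" line per combination of the two lists,
# in the same order as the --dry-run output shown above.
list_commands() {
  local a b
  for a in $(seq 1 3); do        # corresponds to the first ::: list ({1})
    for b in $(seq 11 13); do    # corresponds to the second ::: list ({2})
      echo "ping 131.212.$a.$b"
    done
  done
}

list_commands
```

Unlike this loop, parallel also runs the generated commands concurrently, up to the -j limit, once --dry-run is removed.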