How to remove only the IP addresses from a text file

3
I have two files. One is ips.yaml:
-
  dedicatedip: 1.1.1.11
-
  dedicatedip: 2.2.2.2
-
  dedicatedip: ''
-
  dedicatedip: 3.3.3.3
-
  dedicatedip: 3.3.3.33

This is the content of my result.txt file:
+----------+-----------------------+------------------+--------+-----------------------------+---------------+
| id       | hostname              | some other field | status | networks                    | plan          |
+----------+-----------------------+------------------+--------+-----------------------------+---------------+
| 11111111 | some-hostname         | some-value       | Active | External_Networks=1.1.1.11  | not important |
| 11111111 | some another hostname |                  | Active | External_Networks=1.1.1.111 | not important |
| 11111111 | some.fqdn.com         |                  | Active | External_Networks=1.1.1.112 | not important |
| 11111111 | fourth.hostname.com   |                  | Active | External_Networks=1.1.1.1   | not important |
| 11111111 | the-other.com         | IHaveSomething   | Active | External_Networks=2.2.2.2   | not important |
| 11111111 | not.least.com         |                  | Active | External_Networks=2.2.2.25  | not important |
| 11111111 | last.fqdn.com         | The Last Value   | Active | External_Networks=3.3.3.39  | not important |
+----------+-----------------------+------------------+--------+-----------------------------+---------------+

I want to remove every IP that is listed in ips.yaml from result.txt, so that I get this expected output:
+----------+-----------------------+------------------+--------+-----------------------------+---------------+
| id       | hostname              | some other field | status | networks                    | plan          |
+----------+-----------------------+------------------+--------+-----------------------------+---------------+
| 11111111 | some another hostname |                  | Active | External_Networks=1.1.1.111 | not important |
| 11111111 | some.fqdn.com         |                  | Active | External_Networks=1.1.1.112 | not important |
| 11111111 | fourth.hostname.com   |                  | Active | External_Networks=1.1.1.1   | not important |
| 11111111 | not.least.com         |                  | Active | External_Networks=2.2.2.25  | not important |
| 11111111 | last.fqdn.com         | The Last Value   | Active | External_Networks=3.3.3.39  | not important |
+----------+-----------------------+------------------+--------+-----------------------------+---------------+

This is my current bash script:
while read -r yaml_ips
do
    while read -r result_ips
    do
        if [ "$yaml_ips" == "$result_ips" ]
        then
            sed "/$yaml_ips/d" result.txt
        fi
    done < <(grep -Eo 'External_Networks=[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' result.txt | awk -F '=' '{print $2}')
done < <(awk '/dedicatedip: [0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}/ {print $2}' ips.yaml)

This is my current output:
+----------+-----------------------+------------------+--------+-----------------------------+---------------+
| id       | hostname              | some other field | status | networks                    | plan          |
+----------+-----------------------+------------------+--------+-----------------------------+---------------+
+----------+-----------------------+------------------+--------+-----------------------------+---------------+
+----------+-----------------------+------------------+--------+-----------------------------+---------------+
| id       | hostname              | some other field | status | networks                    | plan          |
+----------+-----------------------+------------------+--------+-----------------------------+---------------+
| 11111111 | some-hostname         | some-value       | Active | External_Networks=1.1.1.11  | not important |
| 11111111 | some another hostname |                  | Active | External_Networks=1.1.1.111 | not important |
| 11111111 | some.fqdn.com         |                  | Active | External_Networks=1.1.1.112 | not important |
| 11111111 | fourth.hostname.com   |                  | Active | External_Networks=1.1.1.1   | not important |
| 11111111 | last.fqdn.com         | The Last Value   | Active | External_Networks=3.3.3.39  | not important |
+----------+-----------------------+------------------+--------+-----------------------------+---------------+

3
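Regarding the data you labeled "not important": yes, it is important. It is easy to end up with solutions that fail for certain values in that field, or to give a better answer depending on what that field can contain. I couldn't write the answer I originally intended because I don't know what those "not important" values might be. When asking a question, never assume anything is unimportant, since you don't know what the answer might be. Provide examples that truly represent all of your data values.
2
I'm sure you've been referred to this before, but your script has nested loops, so please read "Why is using a shell loop to process text considered bad practice?"
@EdMorton Thank you very much for your comment. Regarding the "not important" field, I only wanted to show that "networks" is not $NF, the last field. Also, all plan names have two parts, "A B", which is why I added a space. And yes, you are right: I could have written a single name for all plans just to show that the last field is not "networks".
1
It's obvious that "networks" is not "$NF"; that's not the issue. My point is that different solutions are possible depending on what the "plan" field contains. You are now telling us it contains two space-separated strings. We could not have guessed that from "not important" appearing in every row, but if you had stated it in the question and provided representative values, that would have been useful and would lead to different solutions, which is why it matters.
@EdMorton Yes sir, that was my wrong assumption.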
3 Answers

2
This two-pass awk should do the job in a single command:
awk -F ': ' '
FNR == NR {
   if (NF > 1 && $2 ~ /^[0-9]/)
      ips[$2]
   next
}
FNR > 3 {
   nw = $6
   sub(/^[^=]*=/, "", nw)
   if (nw in ips)
      next
}
1' ips.yaml FS='[[:blank:]]*[|][[:blank:]]*' result.txt

+----------+-----------------------+------------------+--------+-----------------------------+---------------+
| id       | hostname              | some other field | status | networks                    | plan          |
+----------+-----------------------+------------------+--------+-----------------------------+---------------+
| 11111111 | some another hostname |                  | Active | External_Networks=1.1.1.111 | not important |
| 11111111 | some.fqdn.com         |                  | Active | External_Networks=1.1.1.112 | not important |
| 11111111 | fourth.hostname.com   |                  | Active | External_Networks=1.1.1.1   | not important |
| 11111111 | not.least.com         |                  | Active | External_Networks=2.2.2.25  | not important |
| 11111111 | last.fqdn.com         | The Last Value   | Active | External_Networks=3.3.3.39  | not important |
+----------+-----------------------+------------------+--------+-----------------------------+---------------+

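Here:

  • -F ': ' sets the field separator to ": " for the first file, ips.yaml
  • FS='[[:blank:]]*[|][[:blank:]]*' sets the field separator to | with optional blanks on either side for the second file, result.txt
  • NF > 1 && $2 ~ /^[0-9]/ selects the lines of the first file that actually hold an IP address and stores each address as a key of the associative array ips, via ips[$2]
  • FNR > 3 applies the block to the second file from line 4 onward, skipping the 3 header lines
  • sub(/^[^=]*=/, "", nw) removes the text before the equals sign, and the equals sign itself, from the networks column of the second file, which is $6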
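
To see why the networks column ends up in $6, it can help to print the fields that awk produces for a single row. This is only an illustrative sketch: show_fields is a made-up helper name, and the row is copied from the sample result.txt.

```shell
# Hypothetical helper: print every field awk sees for one table row
# when FS is the regex used for result.txt in the answer.
show_fields() {
    printf '%s\n' "$1" |
    awk -F '[[:blank:]]*[|][[:blank:]]*' '{
        for (i = 1; i <= NF; i++)
            printf "$%d=[%s]\n", i, $i
    }'
}

show_fields '| 11111111 | some-hostname         | some-value       | Active | External_Networks=1.1.1.11  | not important |'
```

Because every row starts and ends with |, the first and last fields are empty, so the id lands in $2 and the networks column in $6.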

2
With your shown samples, please try the following GNU awk code.
awk '
FNR==NR && /^ +/{
  ips[$2]
  next
}
(match($0,/External_Networks=(\S+)/,arr) && (arr[1] in ips)){
  next
}
(/^+-/) || (/^\|/){
  print
  next
} ' FS=": " ips.yaml result.txt

Explanation: adding a detailed explanation of the above code.

awk '                              ##starting awk program from here.
FNR==NR && /^ +/{                  ##Checking condition FNR==NR which will be TRUE when ips.yaml file is being read and checking condition if a line starts with 1 or more spaces then do following.
  ips[$2]                          ##Creating array ips with index of $2 here.
  next                             ##next will skip all further statements from here.
}
(match($0,/External_Networks=(\S+)/,arr) && (arr[1] in ips)){ ##Using match function to match External_Networks= followed by all 1 or more non-spaces and storing IPs values into array arr.
  next                             ##next will skip all further statements from here.
}
(/^+-/) || (/^\|/){                ##Checking condition if line starts from +- OR | then do following.
  print                            ##Print that line.
  next                             ##next will skip all further statements from here.
} ' FS=": " ips.yaml result.txt   ##Setting FS to colon space for file ips.yaml and then mentioning result.txt files.
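
The three-argument match() and \S used above are GNU awk extensions. On another awk, the same capture could be sketched with RSTART and RLENGTH, which POSIX awk sets after a successful match(). The helper name extract_ip is made up for this illustration, and the pattern assumes the address is followed by a space or a |.

```shell
# Hypothetical helper: print the address that follows "External_Networks="
# in one table row, using only POSIX awk features.
extract_ip() {
    printf '%s\n' "$1" |
    awk '{
        # match() sets RSTART and RLENGTH to the position and length of the match.
        if (match($0, /External_Networks=[^ |]+/)) {
            m = substr($0, RSTART, RLENGTH)   # e.g. "External_Networks=2.2.2.2"
            sub(/^[^=]*=/, "", m)             # drop everything up to and including "="
            print m
        }
    }'
}

extract_ip '| 11111111 | the-other.com | IHaveSomething | Active | External_Networks=2.2.2.2   | not important |'
```

The extracted value can then be tested with (m in ips) exactly as arr[1] is in the GNU awk version.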

2
Using any awk:
$ awk '
    NR==FNR { if ($1 == "dedicatedip:") ips[$2]; next }
    { o=$0; f=sub(/.*External_Networks=/,""); ip=$1; $0=o }
    !( f && (ip in ips) )
' ips.yaml result.txt
+----------+-----------------------+------------------+--------+-----------------------------+---------------+
| id       | hostname              | some other field | status | networks                    | plan          |
+----------+-----------------------+------------------+--------+-----------------------------+---------------+
| 11111111 | some another hostname |                  | Active | External_Networks=1.1.1.111 | not important |
| 11111111 | some.fqdn.com         |                  | Active | External_Networks=1.1.1.112 | not important |
| 11111111 | fourth.hostname.com   |                  | Active | External_Networks=1.1.1.1   | not important |
| 11111111 | not.least.com         |                  | Active | External_Networks=2.2.2.25  | not important |
| 11111111 | last.fqdn.com         | The Last Value   | Active | External_Networks=3.3.3.39  | not important |
+----------+-----------------------+------------------+--------+-----------------------------+---------------+
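
Note that the command prints the filtered table to standard output and leaves result.txt unchanged. If the file itself should be updated, one possible sketch is to write to a temporary file first and copy it back only when awk succeeds. The function name remove_listed_ips is an assumption made for this example, and the guard for an empty YAML file is my addition, since NR==FNR would otherwise treat the table as the first file.

```shell
# Hypothetical helper: rewrite the table file so that it keeps only rows
# whose External_Networks address is not listed in the YAML file.
remove_listed_ips() {
    ips_file=$1
    table_file=$2

    # With an empty first file, NR==FNR would be true for the table too
    # and nothing would be printed, so there is nothing to do in that case.
    [ -s "$ips_file" ] || return 0

    tmp=$(mktemp) || return 1
    if awk '
        NR==FNR { if ($1 == "dedicatedip:") ips[$2]; next }
        { o=$0; f=sub(/.*External_Networks=/,""); ip=$1; $0=o }
        !( f && (ip in ips) )
    ' "$ips_file" "$table_file" > "$tmp"
    then
        # Copy back instead of mv so the original file keeps its permissions.
        cat "$tmp" > "$table_file"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Usage: remove_listed_ips ips.yaml result.txt
```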

1
Very nice, this avoids the 2 FS patterns
