我们已经找到了与感染有关的一些域名。现在我们有一个.json文件列出了DNS名称列表,我想生成一个总结输出,显示:用户列表,他们访问的唯一域名和总计数。如果我还可以获得每个域名的计数,则额外加分。
以下是文件示例:
以下是文件示例:
{"machine": "possible_victim01", "domain": "evil.com", "timestamp":1435071870}
{"machine": "possible_victim01", "domain": "evil.com", "timestamp":1435071875}
{"machine": "possible_victim01", "domain": "soevil.com", "timestamp":1435071877}
{"machine": "possible_victim02", "domain": "bad.com", "timestamp":1435071877}
{"machine": "possible_victim03", "domain": "soevil.com", "timestamp":1435071879}
理想情况下,我希望输出的结果类似于:
{"possible_victim01": "total": 3, {"evil.com": 2, "soevil.com": 1}}
{"possible_victim02": "total": 1, {"bad.com": 1}}
{"possible_victim03": "total": 1, {"soevil.com": 1}}
我很乐意妥协接受:
{"possible_victim01": "total": 3, ["evil.com", "soevil.com"]}
{"possible_victim02": "total": 1, ["bad.com"]}
{"possible_victim03": "total": 1, ["soevil.com"]}
我可以获取每个用户的总记录数,但会失去域名列表:
cat sample.json | jq -s 'group_by(.machine) | map({machine:.[0].machine,domain:.[0].domain, count:length}) '
[{"machine": "possible_victim01", "domain": "evil.com", "count": 3},
{"machine": "possible_victim02", "domain": "bad.com", "count": 1},
{"machine": "possible_victim03", "domain": "soevil.com", "count": 1}]
本文介绍如何解决问题的后半部分......JQ聚合和交叉表。目前为止,我还没有找到任何描述第一部分的内容,即如何达成以下结果:
{"machine": "possible_victim01", "domain": "evil.com", "count":2}
{"machine": "possible_victim01", "domain": "soevil.com", "count":1}
{"machine": "possible_victim02", "domain": "bad.com", "count":1}
{"machine": "possible_victim03", "domain": "soevil.com", "count":1}