How to loop through jq array results in a shell script

3
I have the following JSON data:
{
  "id": {
    "bioguide": "E000295",
    "thomas": "02283",
    "govtrack": 412667,
    "opensecrets": "N00035483",
    "lis": "S376"
  },
  "bio": {
    "gender": "F",
    "birthday": "1970-07-01"
  },
  "tooldatareports": [
    {
      "name": "A",
      "tooldata": [
        {
          "toolid": 12345,
          "data": [
            {
              "time": "2021-01-01",
              "value": 1
            },
            {
              "time": "2021-01-02",
              "value": 10
            },
            {
              "time": "2021-01-03",
              "value": 5
            }
          ]
        },
        {
          "toolid": 12346,
          "data": [
            {
              "time": "2021-01-01",
              "value": 10
            },
            {
              "time": "2021-01-02",
              "value": 100
            },
            {
              "time": "2021-01-03",
              "value": 50
            }
          ]
        }
      ]
    }
  ]
}

With the following command line I can now get a list of two dictionaries, each with a single key "data" whose value is a list.

cat data.json |jq -n --stream '[fromstream(inputs | (.[0] | index("data")) as $ix | select($ix) | .[0] |= .[$ix:])]'

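I would like to print each dictionary from a loop in a shell script.
The loop body contains a single statement that prints a dictionary, and there are 2 dictionaries to print in total.
But the printed result looks like one string rather than two dictionaries.
Here is my shell script: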
array=$(cat ernst.json | jq -n --stream '[fromstream(inputs | (.[0] | index("data")) as $ix | select($ix) | .[0] |= .[$ix:])]')

for d in $array
do
    echo $d
done

Does anyone have any ideas?

2 Answers

2

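Your output is a JSON list, so you cannot use it directly in a bash loop. The loop knows nothing about the square brackets, and the unquoted $array is split on every space and newline. Use jq once more to pull the individual objects out of the JSON list; jq -c '.[]' does this: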

array=$(cat data.json |jq -n --stream '[fromstream(inputs | (.[0] | index("data")) as $ix | select($ix) | .[0] |= .[$ix:])]' | jq -c '.[]')

Now you have the two objects on two separate lines, with no whitespace between tokens (-c = compact output), and you can loop over them in bash:

for d in $array
do
    echo "$d"
    # whatever else you need to do with them
done

{"data":[{"time":"2021-01-01","value":1},{"time":"2021-01-02","value":10},{"time":"2021-01-03","value":5}]}
{"data":[{"time":"2021-01-01","value":10},{"time":"2021-01-02","value":100},{"time":"2021-01-03","value":50}]}
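One caveat worth illustrating: -c only removes whitespace between JSON tokens, so a string value that contains a space would still be split by the unquoted for loop. The sketch below is an illustration and not part of the original answer. emit_objects is a hypothetical stand-in for the jq -c pipeline (so it runs without jq or data.json), and its sample values are made up. It compares the for loop with reading the output line by line.

```shell
# Hypothetical stand-in for the jq -c pipeline: one compact JSON object per line.
# The space inside "tool A" / "tool B" is deliberate.
emit_objects() {
  printf '%s\n' '{"name":"tool A","value":1}' '{"name":"tool B","value":10}'
}

array=$(emit_objects)

# Unquoted $array is split on every space and newline,
# so each object is cut in two at the space inside its string value.
pieces=0
for d in $array; do
  pieces=$((pieces + 1))
done

# Reading line by line keeps each object whole. The here-document
# keeps the loop in the current shell, so the counter survives it.
lines=0
while IFS= read -r d; do
  lines=$((lines + 1))
  printf 'item: %s\n' "$d"
done <<EOF
$array
EOF

echo "for-in pieces: $pieces, read lines: $lines"
```

With these two sample objects the for loop sees four pieces while the read loop sees two lines, so for real data with free-text values the line-by-line form is probably the safer choice.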

You are calling jq twice. Can it be done with a single call? - Philippe
1
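@Philippe Yes, you can add -c and [] to the first jq call. I don't always like the most compact solution, though; I usually prefer a clear separation of concerns, which is what pipes are great for. In this case I thought it was more important to show what each step does. - cornuz
Very clear explanation of the bash loop, thanks. - CYC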

2
$ jq -cn --stream '[fromstream(inputs | (.[0] | index("data")) as $ix | select($ix) | .[0] |= .[$ix:])][]' data.json | while read d; do echo "item: $d"; done
item: {"data":[{"time":"2021-01-01","value":1},{"time":"2021-01-02","value":10},{"time":"2021-01-03","value":5}]}
item: {"data":[{"time":"2021-01-01","value":10},{"time":"2021-01-02","value":100},{"time":"2021-01-03","value":50}]}

Note that you can get very similar output with a simpler jq filter:
$ jq -c '.tooldatareports[].tooldata[].data' data.json | while read d; do echo "item: $d"; done
item: [{"time":"2021-01-01","value":1},{"time":"2021-01-02","value":10},{"time":"2021-01-03","value":5}]
item: [{"time":"2021-01-01","value":10},{"time":"2021-01-02","value":100},{"time":"2021-01-03","value":50}]

You can get output identical to the first command with the following (which seems unnecessary):

$ jq -c '.tooldatareports[].tooldata[].data | {"data": .}' data.json  | while read d; do echo "item: $d"; done
item: {"data":[{"time":"2021-01-01","value":1},{"time":"2021-01-02","value":10},{"time":"2021-01-03","value":5}]}
item: {"data":[{"time":"2021-01-01","value":10},{"time":"2021-01-02","value":100},{"time":"2021-01-03","value":50}]}
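A side note on the while read d loops above: read without -r treats backslashes as escape characters, so JSON escapes such as \" would be altered before the loop body sees them. The sample data here has no backslashes, so the loops above work as shown. The sketch below is an illustration under that assumption; the sample line is made up and stands in for one line of jq -c output.

```shell
# Hypothetical compact JSON line containing escaped quotes.
line='{"note":"say \"hi\""}'

# Plain read interprets backslashes and drops them, corrupting the JSON.
plain=$(printf '%s\n' "$line" | { read d; printf '%s' "$d"; })

# IFS= read -r keeps the line exactly as jq printed it.
raw=$(printf '%s\n' "$line" | { IFS= read -r d; printf '%s' "$d"; })

printf 'plain: %s\n' "$plain"
printf 'raw:   %s\n' "$raw"
```

Writing the loops as while IFS= read -r d; do ...; done should therefore be a drop-in replacement that is safe for arbitrary string values.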

Nice approach, but I need the --stream flag because my real target file is very large. - CYC
