将扁平化的对象数组转换为嵌套树形结构(位置数据)

3

嗨,我想把一个给定的平面对象数组(即数据集)转换成一个具有重复键的树形结构。

注意:我的实际输入数据包含约140K个对象元素,其完全相同的4个键如下所列。

输入示例:

[
   {
      "continent_name":"Asia",
      "country_name":"Iran",
      "subdivision_1_name":"Chaharmahal and Bakhtiari Province",
      "city_name":"Lir Abi"
   },
   {
      "continent_name":"Europe",
      "country_name":"Cyprus",
      "subdivision_1_name":"Ammochostos",
      "city_name":"Protaras"
   },
   {
      "continent_name":"Asia",
      "country_name":"Iran",
      "subdivision_1_name":"West
Azerbaijan Province",
      "city_name":"Post"
   },
   {
      "continent_name":"Africa",
      "country_name":"Somalia",
      "subdivision_1_name":"Bakool",
      "city_name":"Oddur"
   }
]

输出样例:

[
        {
           label: "Asia",
           children: [
               {
                   label: 'Iran',
                   children: [
                       {
                           label: 'Chaharmahal and Bakhtiari Province',
                           children: [
                               {
                                  label: 'Lir Abi',
                                  children: []
                               }
                           ]
                       },
                       {
                          label: 'West Azerbaijan Province',
                          children: [
                              {
                                 label: 'Post',
                                 children: []
                              }
                          ]
                       }
                   ]
               }
           ]
        },
        {
          label: "Africa",
          children: [
              {
                  label: 'Somalia',
                  children: [
                      {
                          label: 'Bakool',
                          children: [
                              {
                                  label: 'Oddur',
                                  children: []
                              }
                          ]
                      }
                  ]
              }
          ]
        },
        {
          label: "Europe",
          children: [
              {
                  label: 'Cyprus',
                  children: [
                      {
                          label: 'Ammochostos',
                          children: [
                              {
                                  label: 'Protaras',
                                  children: []
                              }
                          ]
                      }
                  ]
              }
          ]
        }
    ]

我试图使用的代码如下:

    const returnTree = []
    function unflatten(data, property, returnArr) {
        for (let i = 0; i < data.length; i++) {
            const currObj = data[i];
            const currContinent = data[i][property]
            let continentIdx = returnArr.findIndex(obj => obj.label === currContinent)
            if (continentIdx === -1) {
                continentIdx = returnArr.length
                returnArr.push({
                    'label': currContinent,
                    'children': [currObj]
                })
            } else {
                returnArr[continentIdx].children.push(currObj)
            }
            // exceeed max call stack if I continue even one more level in
            unflatten(returnArr[continentIdx].children, 'country_name', returnTree)
        }
        console.log(returnArr)
        return returnArr
    }
    unflatten(inputData, 'continent_name', returnTree)

我遇到的问题是使用这种递归方法时,超过了最大调用堆栈限制,我想知道是否有更好的方法来处理这个问题,也许可以采用迭代的方式?

非常感谢帮助!谢谢!

3个回答

2

另一种方法是使用对象作为哈希表,并为每个子数组设置结果集。

const
    data = [{ continent_name: "Asia", country_name: "Iran", subdivision_1_name: "Chaharmahal and Bakhtiari Province", city_name: "Lir Abi" }, { continent_name: "Europe", country_name: "Cyprus", subdivision_1_name: "Ammochostos", city_name: "Protaras" }, { continent_name: "Asia", country_name: "Iran", subdivision_1_name: "West Azerbaijan Province", city_name: "Post" }, { continent_name: "Africa", country_name: "Somalia", subdivision_1_name: "Bakool", city_name: "Oddur" }],
    keys = ["continent_name", "country_name", "subdivision_1_name", "city_name"],
    result = data
        .reduce((r, o) => {
            keys.reduce(function (q, k) {
                const label = o[k];
                if (!q[label]) q._.push({ label, children: (q[label] = { _: [] })._ });
                return q[label];
            }, r);
            return r;
        }, { _: [] })
        ._;

console.log(result);
.as-console-wrapper { max-height: 100% !important; top: 0; }


0
我认为递归不是这里的解决方法,因为对于每个属性值(用于创建对象),在生成的结构中只有一个可以放置。在迭代属性时,将最后一个外部数组保留在一个外部变量中,您可以使用.find查找是否已经插入匹配的对象 - 如果没有,则创建一个。

const input=[{continent_name:"Asia",country_name:"Iran",subdivision_1_name:"Chaharmahal and Bakhtiari Province",city_name:"Lir Abi"},{continent_name:"Europe",country_name:"Cyprus",subdivision_1_name:"Ammochostos",city_name:"Protaras"},{continent_name:"Asia",country_name:"Iran",subdivision_1_name:"West Azerbaijan Province",city_name:"Post"},{continent_name:"Africa",country_name:"Somalia",subdivision_1_name:"Bakool",city_name:"Oddur"}];

const output = [];
for (const obj of input) {
  let nestedArr = output;
  for (const [key, value] of Object.entries(obj)) {
    const existingInnerObj = nestedArr.find(({ label }) => label === value);
    if (existingInnerObj) {
      nestedArr = existingInnerObj.children;
    } else {
      const newObj = { label: value, children: [] };
      nestedArr.push(newObj);
      nestedArr = newObj.children;
    }
  }
}
console.log(output);

您可以通过将创建的对象保存在由标签索引的映射或对象中,或者首先在对象中创建它们,然后稍后将它们转换为数组来降低复杂度(这将把.findO(n)操作转换为属性查找的O(1))。


哇,这太棒了,使用预定义映射的恒定时间访问的额外提示也是一个好主意,而不是从数组中搜索。非常感谢您的帮助,感激不尽。 - TKTR

0

这个版本配置了您想要嵌套的键列表,并且每个键都会进行递归。它的递归仅限于生成树的级别。如果这导致递归深度问题,那么就有更多重要的问题需要处理。

const omitting = (prop) => ({[prop]: p, ...rest}) => rest

const group = (fn) => (xs) => Object .values (xs .reduce 
  ((a, x, _, __, k = fn (x)) => ((a [k] = a [k] || []), (a [k] .push (x)), a), 
  {}
))

const labelGroup = ([label, ...labels]) => (xs) => 
  label == undefined
    ? xs
    : group (x => x [label]) (xs) .map (ys => ({
        label: ys [0] [label], 
        children: labelGroup (labels) (
          ys .map (omitting (label)) .filter (x => Object .keys (x) .length)
        )
      }))

const divisions = [{continent_name: "Asia", country_name: "Iran", subdivision_1_name: "Chaharmahal and Bakhtiari Province", city_name: "Lir Abi"}, {continent_name: "Europe", country_name: "Cyprus", subdivision_1_name: "Ammochostos", city_name: "Protaras"}, {continent_name: "Asia", country_name: "Iran", subdivision_1_name: "West Azerbaijan Province", city_name: "Post"}, {continent_name: "Africa", country_name: "Somalia", subdivision_1_name: "Bakool", city_name: "Oddur"}]

const labels = ['continent_name', 'country_name', 'city_name', 'subdivision_1_name']

console .log (
  labelGroup (labels) (divisions)
)
.as-console-wrapper {max-height: 100% !important; top: 0}

我们从两个辅助函数开始:

  • omitting 返回省略指定键后的对象副本。

  • group 接受一个键生成函数,并返回一个函数,该函数将值数组分组成每个子数组,每个子数组都映射到单个键。

主函数labelGroup接受要分组的标签列表,并返回一个函数,该函数接受一个值数组,并递归地将这些值分组为共享标签的组,通过提取通用标签、省略每个键并返回应用剩余键的labelGroup的递归结果来映射结果。当没有标签可以提取时,递归停止。

唯一复杂的部分是在递归应用之前的filter调用。 我们使用它来完全删除空对象。这使您在对对象中的每个标签进行分组时获得所需的结果,就像这里所做的那样,但也允许您对较小的标签列表进行分组。因此,如果您只传递'continent_name''countryName',则会得到类似以下内容的内容:

    {
        "label": "Asia",
        "children": [
            {
                "label": "Iran",
                "children": [
                    {
                        "subdivision_1_name": "Chaharmahal and Bakhtiari Province",
                        "city_name": "Lir Abi"
                    },
                    {
                        "subdivision_1_name": "West Azerbaijan Province",
                        "city_name": "Post"
                    }
                ]
            }
        ]
    },
    /* ... */
}

我认为这是一种相当灵活的技术。我们可以通过允许我们对复合键进行分组或选择在每个级别显示的多个字段来使其更加灵活,我们还可以将字段“label”和“children”配置化。但这是留待以后再说的事情。


网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接