Most efficient method to group by on an array of objects

915

What is the most efficient way to group by objects in an array?

For example, given this array of objects:

[ 
    { Phase: "Phase 1", Step: "Step 1", Task: "Task 1", Value: "5" },
    { Phase: "Phase 1", Step: "Step 1", Task: "Task 2", Value: "10" },
    { Phase: "Phase 1", Step: "Step 2", Task: "Task 1", Value: "15" },
    { Phase: "Phase 1", Step: "Step 2", Task: "Task 2", Value: "20" },
    { Phase: "Phase 2", Step: "Step 1", Task: "Task 1", Value: "25" },
    { Phase: "Phase 2", Step: "Step 1", Task: "Task 2", Value: "30" },
    { Phase: "Phase 2", Step: "Step 2", Task: "Task 1", Value: "35" },
    { Phase: "Phase 2", Step: "Step 2", Task: "Task 2", Value: "40" }
]

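I'm displaying this information in a table. I'd like to group by different methods, but I want to sum the values.

I'm using Underscore.js for its groupBy function, which is helpful but doesn't do the whole trick, because I don't want the rows "split up" but "merged", more like the SQL group by method.

What I'm looking for would be able to total specific values (if requested).

So if I grouped by Phase, I'd want to receive: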

[
    { Phase: "Phase 1", Value: 50 },
    { Phase: "Phase 2", Value: 130 }
]

And if I grouped by Phase / Step, I'd receive:

[
    { Phase: "Phase 1", Step: "Step 1", Value: 15 },
    { Phase: "Phase 1", Step: "Step 2", Value: 35 },
    { Phase: "Phase 2", Step: "Step 1", Value: 55 },
    { Phase: "Phase 2", Step: "Step 2", Value: 75 }
]

Is there a helpful script for this, or should I stick to using Underscore.js and then loop through the resulting object to do the totals myself?


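Although _.groupBy by itself doesn't do the job, it can be combined with other Underscore functions to do what is asked, with no manual loop required. See this answer: https://dev59.com/uGYq5IYBdhLWcg3wfgnd#66112210. - Julian
A more readable version of the answer: function groupBy(data, key){ return data.reduce( (acc, cur) => { acc[cur[key]] = acc[cur[key]] || []; /* if the key is new, initialize its value to an array; otherwise keep its existing array */ acc[cur[key]].push(cur); return acc; } , {}) } - aderchox
62 Answers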

2

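To add to Scott Sauyet's answer: some people asked in the comments how to use his function to aggregate value1, value2, etc., instead of just one value.

All it takes is to edit his sum function: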

DataGrouper.register("sum", function(item) {
    return _.extend({}, item.key,
        {VALUE1: _.reduce(item.vals, function(memo, node) {
        return memo + Number(node.VALUE1);}, 0)},
        {VALUE2: _.reduce(item.vals, function(memo, node) {
        return memo + Number(node.VALUE2);}, 0)}
    );
});

leaving the main one (DataGrouper) unchanged:

var DataGrouper = (function() {
    var has = function(obj, target) {
        return _.any(obj, function(value) {
            return _.isEqual(value, target);
        });
    };

    var keys = function(data, names) {
        return _.reduce(data, function(memo, item) {
            var key = _.pick(item, names);
            if (!has(memo, key)) {
                memo.push(key);
            }
            return memo;
        }, []);
    };

    var group = function(data, names) {
        var stems = keys(data, names);
        return _.map(stems, function(stem) {
            return {
                key: stem,
                vals:_.map(_.where(data, stem), function(item) {
                    return _.omit(item, names);
                })
            };
        });
    };

    group.register = function(name, converter) {
        return group[name] = function(data, names) {
            return _.map(group(data, names), converter);
        };
    };

    return group;
}());

2
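Ceasar's answer is good, but it works only for properties of the elements inside the array (length, in the case of strings).
This implementation works more like: this link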
const groupBy = function (arr, f) {
    return arr.reduce((out, val) => {
        let by = typeof f === 'function' ? '' + f(val) : val[f];
        (out[by] = out[by] || []).push(val);
        return out;
    }, {});
};
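
A brief usage sketch of the function above, repeated here so the snippet runs on its own. The sample arrays and the composite Phase/Step key are assumptions chosen for illustration.

```javascript
// groupBy as given in the answer above: `f` is either a property name
// or a function that computes the group key from each element.
const groupBy = function (arr, f) {
    return arr.reduce((out, val) => {
        const by = typeof f === 'function' ? '' + f(val) : val[f];
        (out[by] = out[by] || []).push(val);
        return out;
    }, {});
};

// Grouping by a property name ('length' works for strings).
const byLength = groupBy(['one', 'two', 'three'], 'length');
console.log(byLength); // { '3': [ 'one', 'two' ], '5': [ 'three' ] }

// Grouping by a computed key (hypothetical composite of two fields).
const rows = [
    { Phase: 'Phase 1', Step: 'Step 1', Value: '5' },
    { Phase: 'Phase 1', Step: 'Step 2', Value: '15' },
    { Phase: 'Phase 1', Step: 'Step 1', Value: '10' }
];
const byPhaseStep = groupBy(rows, r => r.Phase + '/' + r.Step);
console.log(Object.keys(byPhaseStep)); // [ 'Phase 1/Step 1', 'Phase 1/Step 2' ]
```

Passing a function is what makes multi-column grouping possible, since the key can combine any number of fields.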

Hope this helps...

2

Here is a TS-based function. It is not the most performant, but it is easy to read and follow!

function groupBy<T>(array: T[], key: keyof T): Record<string, T[]> {
  const groupedObject: Record<string, T[]> = {}
  for (const item of array) {
    // Property keys are strings, so normalize the grouping value
    const value = String(item[key])
    if (groupedObject[value] === undefined) {
      groupedObject[value] = []
    }
    groupedObject[value].push(item)
  }
  return groupedObject
}

We end up with something like ->
const data = [
{ Phase: "Phase 1", Step: "Step 1", Task: "Task 1", Value: "5" },
{ Phase: "Phase 1", Step: "Step 1", Task: "Task 2", Value: "10" },
{ Phase: "Phase 1", Step: "Step 2", Task: "Task 1", Value: "15" },
{ Phase: "Phase 1", Step: "Step 2", Task: "Task 2", Value: "20" },
];
console.log(groupBy(data, 'Step'))
{
'Step 1': [
    {
      Phase: 'Phase 1',
      Step: 'Step 1',
      Task: 'Task 1',
      Value: '5'
    },
    {
      Phase: 'Phase 1',
      Step: 'Step 1',
      Task: 'Task 2',
      Value: '10'
    }
  ],
  'Step 2': [
    {
      Phase: 'Phase 1',
      Step: 'Step 2',
      Task: 'Task 1',
      Value: '15'
    },
    {
      Phase: 'Phase 1',
      Step: 'Step 2',
      Task: 'Task 2',
      Value: '20'
    }
  ]
}
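
The grouped object can then be collapsed into the summed rows the question asks for. A minimal sketch in plain JavaScript, assuming the same grouping logic as above; sumGroups is a hypothetical helper name, not part of the answer.

```javascript
// Plain-JavaScript equivalent of the groupBy above.
function groupBy(array, key) {
  const grouped = {};
  for (const item of array) {
    const value = item[key];
    if (grouped[value] === undefined) {
      grouped[value] = [];
    }
    grouped[value].push(item);
  }
  return grouped;
}

// Hypothetical helper: one output row per group, with `valueKey` summed.
// Number() is needed because the sample data stores Value as a string.
function sumGroups(grouped, key, valueKey) {
  return Object.entries(grouped).map(([groupName, items]) => ({
    [key]: groupName,
    [valueKey]: items.reduce((sum, item) => sum + Number(item[valueKey]), 0),
  }));
}

const data = [
  { Phase: "Phase 1", Step: "Step 1", Task: "Task 1", Value: "5" },
  { Phase: "Phase 1", Step: "Step 1", Task: "Task 2", Value: "10" },
  { Phase: "Phase 1", Step: "Step 2", Task: "Task 1", Value: "15" },
  { Phase: "Phase 1", Step: "Step 2", Task: "Task 2", Value: "20" },
];

const summed = sumGroups(groupBy(data, 'Step'), 'Step', 'Value');
console.log(summed);
// [ { Step: 'Step 1', Value: 15 }, { Step: 'Step 2', Value: 35 } ]
```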

2

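Based on the answers from @mortb and @jmarceli, and on this post, I use JSON.stringify() as the identity for grouping by multiple columns of primitive values.

Without third-party tools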

function groupBy(list, keyGetter) {
    const map = new Map();
    list.forEach((item) => {
        const key = keyGetter(item);
        if (!map.has(key)) {
            map.set(key, [item]);
        } else {
            map.get(key).push(item);
        }
    });
    return map;
}

const pets = [
    {type:"Dog", age: 3, name:"Spot"},
    {type:"Cat", age: 3, name:"Tiger"},
    {type:"Dog", age: 4, name:"Rover"}, 
    {type:"Cat", age: 3, name:"Leo"}
];

const grouped = groupBy(pets,
pet => JSON.stringify({ type: pet.type, age: pet.age }));

console.log(grouped);

With Lodash (third-party)

const pets = [
    {type:"Dog", age: 3, name:"Spot"},
    {type:"Cat", age: 3, name:"Tiger"},
    {type:"Dog", age: 4, name:"Rover"}, 
    {type:"Cat", age: 3, name:"Leo"}
];

let rslt = _.groupBy(pets, pet => JSON.stringify(
 { type: pet.type, age: pet.age }));

console.log(rslt);
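
Because each key is a JSON string, the grouped fields can be recovered with JSON.parse when building result rows. A sketch under the assumption that every key is built with the same property order (JSON.stringify is order-sensitive); the count column is added purely for illustration.

```javascript
// Same Map-based grouping as the first snippet above.
function groupBy(list, keyGetter) {
    const map = new Map();
    for (const item of list) {
        const key = keyGetter(item);
        const group = map.get(key);
        if (group) {
            group.push(item);
        } else {
            map.set(key, [item]);
        }
    }
    return map;
}

const pets = [
    { type: "Dog", age: 3, name: "Spot" },
    { type: "Cat", age: 3, name: "Tiger" },
    { type: "Dog", age: 4, name: "Rover" },
    { type: "Cat", age: 3, name: "Leo" }
];

const grouped = groupBy(pets, pet => JSON.stringify({ type: pet.type, age: pet.age }));

// Decode each composite key back into its fields and attach a count.
const summary = Array.from(grouped, ([key, items]) => ({
    ...JSON.parse(key),
    count: items.length
}));
console.log(summary);
// [ { type: 'Dog', age: 3, count: 1 },
//   { type: 'Cat', age: 3, count: 2 },
//   { type: 'Dog', age: 4, count: 1 } ]
```

A Map keeps groups in first-seen order, which is why the rows come out in the order the combinations first appear in the input.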

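keyGetter returns undefined - Asbar Ali
@AsbarAli I tested my snippet in Chrome's console, Version 66.0.3359.139 (Official Build) (64-bit), and everything runs fine. Could you set a debugging breakpoint and see why keyGetter is undefined? It may be caused by the browser version. - Pranithan T.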

2

I usually use the Lodash JavaScript utility library, which has a pre-built groupBy() method. It is very easy to use; see more details here.


2

ES6 reduce-based version with support for an iteratee function

Works as expected if the iteratee function is not provided:

const data = [{id: 1, score: 2},{id: 1, score: 3},{id: 2, score: 2},{id: 2, score: 4}]

const group = (arr, k) => arr.reduce((r, c) => (r[c[k]] = [...r[c[k]] || [], c], r), {});

const groupBy = (arr, k, fn = () => true) => 
  arr.reduce((r, c) => (fn(c[k]) ? r[c[k]] = [...r[c[k]] || [], c] : null, r), {});

console.log(group(data, 'id'))     // grouping via `reduce`
console.log(groupBy(data, 'id'))   // same result if `fn` is omitted
console.log(groupBy(data, 'score', x => x > 2 )) // group with the iteratee

In the context of the OP's question:

const data = [ { Phase: "Phase 1", Step: "Step 1", Task: "Task 1", Value: "5" }, { Phase: "Phase 1", Step: "Step 1", Task: "Task 2", Value: "10" }, { Phase: "Phase 1", Step: "Step 2", Task: "Task 1", Value: "15" }, { Phase: "Phase 1", Step: "Step 2", Task: "Task 2", Value: "20" }, { Phase: "Phase 2", Step: "Step 1", Task: "Task 1", Value: "25" }, { Phase: "Phase 2", Step: "Step 1", Task: "Task 2", Value: "30" }, { Phase: "Phase 2", Step: "Step 2", Task: "Task 1", Value: "35" }, { Phase: "Phase 2", Step: "Step 2", Task: "Task 2", Value: "40" } ]

const groupBy = (arr, k) => arr.reduce((r, c) => (r[c[k]] = [...r[c[k]] || [], c], r), {});
const groupWith = (arr, k, fn = () => true) => 
  arr.reduce((r, c) => (fn(c[k]) ? r[c[k]] = [...r[c[k]] || [], c] : null, r), {});

console.log(groupBy(data, 'Phase'))
console.log(groupWith(data, 'Value', x => x > 30 ))  // group by `Value` > 30

Another ES6 version, which reverses the grouping: it uses the values as keys and the keys as the grouped values:

const data = [{A: "1"}, {B: "10"}, {C: "10"}]

const groupKeys = arr => 
  arr.reduce((r,c) => (Object.keys(c).map(x => r[c[x]] = [...r[c[x]] || [], x]),r),{});

console.log(groupKeys(data))

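Note: the functions are posted in their short (one-line) form for brevity and to convey just the idea. You can expand them and add additional error checking, etc.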
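
One possible expansion is sketched here; it is not the answer's own code. It pushes into the existing array instead of spreading a new array for every element, and the argument check is an added assumption about what "error checking" might look like.

```javascript
// Expanded form of the one-line groupBy/groupWith above.
// `fn` is an optional predicate on the key value; elements failing it are skipped.
function groupWith(arr, k, fn = () => true) {
  if (!Array.isArray(arr)) {
    throw new TypeError('groupWith expects an array');
  }
  const result = {};
  for (const item of arr) {
    const keyValue = item[k];
    if (!fn(keyValue)) continue;
    // Own-property check so keys such as "constructor" do not hit Object.prototype.
    if (!Object.prototype.hasOwnProperty.call(result, keyValue)) {
      result[keyValue] = [];
    }
    result[keyValue].push(item);
  }
  return result;
}

const scores = [{id: 1, score: 2}, {id: 1, score: 3}, {id: 2, score: 2}, {id: 2, score: 4}];

console.log(groupWith(scores, 'id'));
// { '1': [ { id: 1, score: 2 }, { id: 1, score: 3 } ],
//   '2': [ { id: 2, score: 2 }, { id: 2, score: 4 } ] }

console.log(groupWith(scores, 'score', x => x > 2));
// { '3': [ { id: 1, score: 3 } ], '4': [ { id: 2, score: 4 } ] }
```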


2

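Let's generate a generic Array.prototype.groupBy() tool. Just for variety, let's use the ES6 spread operator for some Haskellesque pattern matching in a recursive approach. Let's also make our Array.prototype.groupBy() accept a callback that takes the item (e), the index (i) and the applied array (a) as arguments.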

Array.prototype.groupBy = function(cb){
                            return function iterate([x,...xs], i = 0, r = [[],[]]){
                                     cb(x,i,[x,...xs]) ? (r[0].push(x), r)
                                                       : (r[1].push(x), r);
                                     return xs.length ? iterate(xs, ++i, r) : r;
                                   }(this);
                          };

var arr = [0,1,2,3,4,5,6,7,8,9],
    res = arr.groupBy(e => e < 5);
console.log(res);
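
Since the callback returns a boolean, the result is a two-way partition rather than a set of keyed groups. The same partition can be sketched iteratively as a standalone function, which avoids extending Array.prototype, returns two empty arrays for empty input, and does not recurse once per element. The name partition is hypothetical and used only for this sketch.

```javascript
// Splits `arr` into [matching, nonMatching] according to the callback,
// which receives (element, index, array) like Array.prototype.filter.
function partition(arr, cb) {
  const result = [[], []];
  arr.forEach((element, index, array) => {
    result[cb(element, index, array) ? 0 : 1].push(element);
  });
  return result;
}

const numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
console.log(partition(numbers, e => e < 5));
// [ [ 0, 1, 2, 3, 4 ], [ 5, 6, 7, 8, 9 ] ]

console.log(partition([], e => e < 5));
// [ [], [] ]
```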


2
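I have improved the answers. This function takes an array of group fields and returns a grouped object whose keys are the JSON-serialized objects of those group fields.
function groupBy(xs, groupFields) {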
        groupFields = [].concat(groupFields);
        return xs.reduce(function(rv, x) {
            let groupKey = groupFields.reduce((keyObject, field) => {
                keyObject[field] = x[field];
                return keyObject;
            }, {});
            (rv[JSON.stringify(groupKey)] = rv[JSON.stringify(groupKey)] || []).push(x);
            return rv;
        }, {});
    }



let x = [
{
    "id":1,
    "multimedia":false,
    "language":["tr"]
},
{
    "id":2,
    "multimedia":false,
    "language":["fr"]
},
{
    "id":3,
    "multimedia":true,
    "language":["tr"]
},
{
    "id":4,
    "multimedia":false,
    "language":[]
},
{
    "id":5,
    "multimedia":false,
    "language":["tr"]
},
{
    "id":6,
    "multimedia":false,
    "language":["tr"]
},
{
    "id":7,
    "multimedia":false,
    "language":["tr","fr"]
}
]

groupBy(x, ['multimedia','language'])

//{
//{"multimedia":false,"language":["tr"]}: Array(3), 
//{"multimedia":false,"language":["fr"]}: Array(1), 
//{"multimedia":true,"language":["tr"]}: Array(1), 
//{"multimedia":false,"language":[]}: Array(1), 
//{"multimedia":false,"language":["tr","fr"]}: Array(1)
//}

2

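Posting because even though this question is 7 years old, I have yet to see an answer that satisfies the original criteria:

I don't want them "split up" but "merged", more like the SQL group by method.

I originally came to this post because I wanted to find a way to reduce an array of objects (i.e., the data structure created when you read from a csv, for example) and aggregate by given indices to produce the same data structure. The return value I was looking for was another array of objects, not a nested object or map like the ones proposed here.

The following function takes a dataset (array of objects), a list of indices (array), and a reducer function, and returns the result of applying the reducer function on the indices as an array of objects.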

function agg(data, indices, reducer) {

  // helper to build a single string key from the index values
  function getUniqueIndexHash(row, indices) {
    return indices.reduce((acc, curr) => acc + row[curr], "");
  }

  // reduce data to single object, whose values will be each of the new rows
  // structure is an object whose values are arrays
  // [{}] -> {{}}
  // no operation performed, simply grouping
  let groupedObj = data.reduce((acc, curr) => {
    let currIndex = getUniqueIndexHash(curr, indices);

    // if key does not exist, create array with current row
    if (!Object.keys(acc).includes(currIndex)) {
      acc = {...acc, [currIndex]: [curr]}
    // otherwise, extend the array at currIndex
    } else {
      acc = {...acc, [currIndex]: acc[currIndex].concat(curr)};
    }

    return acc;
  }, {})

  // reduce the array into a single object by applying the reducer
  let reduced = Object.values(groupedObj).map(arr => {
    // for each sub-array, reduce into single object using the reducer function
    let reduceValues = arr.reduce(reducer, {});

    // reducer returns simply the aggregates - add in the indices here
    // each of the objects in "arr" has the same indices, so we take the first
    let indexObj = indices.reduce((acc, curr) => {
      acc = {...acc, [curr]: arr[0][curr]};
      return acc;
    }, {});

    reduceValues = {...indexObj, ...reduceValues};


    return reduceValues;
  });


  return reduced;
}

I'll create a reducer that returns count(*) and sum(Value):
const reducer = (acc, curr) => {
  acc.count = 1 + (acc.count || 0);
  acc.value = +curr.Value + (acc.value|| 0);
  return acc;
}

Finally, applying the agg function with our reducer to the original dataset yields an array of objects with the appropriate aggregations applied:

agg(tasks, ["Phase"], reducer);
// yields:
Array(2) [
  0: Object {Phase: "Phase 1", count: 4, value: 50}
  1: Object {Phase: "Phase 2", count: 4, value: 130}
]

agg(tasks, ["Phase", "Step"], reducer);
// yields:
Array(4) [
  0: Object {Phase: "Phase 1", Step: "Step 1", count: 2, value: 15}
  1: Object {Phase: "Phase 1", Step: "Step 2", count: 2, value: 35}
  2: Object {Phase: "Phase 2", Step: "Step 1", count: 2, value: 55}
  3: Object {Phase: "Phase 2", Step: "Step 2", count: 2, value: 75}
]
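
One caveat with the string-concatenation key in getUniqueIndexHash: different rows can produce the same key (for example 'ab' + 'c' and 'a' + 'bc'). A possible variant, sketched below, keys a Map on JSON.stringify of the index values and appends to each group in place instead of copying the accumulator on every row. The name aggregate and the inline tasks array are assumptions for this example; the reducer is the one defined above.

```javascript
function aggregate(data, indices, reducer) {
  const groups = new Map();
  for (const row of data) {
    // JSON.stringify of the index values keeps ['ab', 'c'] and ['a', 'bc'] distinct.
    const key = JSON.stringify(indices.map(index => row[index]));
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row);
  }
  // One output row per group: index columns first, then the reduced aggregates.
  return Array.from(groups.values(), rows => {
    const indexObj = Object.fromEntries(indices.map(index => [index, rows[0][index]]));
    return { ...indexObj, ...rows.reduce(reducer, {}) };
  });
}

const reducer = (acc, curr) => {
  acc.count = 1 + (acc.count || 0);
  acc.value = +curr.Value + (acc.value || 0);
  return acc;
};

const tasks = [
  { Phase: "Phase 1", Step: "Step 1", Task: "Task 1", Value: "5" },
  { Phase: "Phase 1", Step: "Step 1", Task: "Task 2", Value: "10" },
  { Phase: "Phase 1", Step: "Step 2", Task: "Task 1", Value: "15" },
  { Phase: "Phase 1", Step: "Step 2", Task: "Task 2", Value: "20" },
  { Phase: "Phase 2", Step: "Step 1", Task: "Task 1", Value: "25" },
  { Phase: "Phase 2", Step: "Step 1", Task: "Task 2", Value: "30" },
  { Phase: "Phase 2", Step: "Step 2", Task: "Task 1", Value: "35" },
  { Phase: "Phase 2", Step: "Step 2", Task: "Task 2", Value: "40" }
];

console.log(aggregate(tasks, ["Phase"], reducer));
// [ { Phase: 'Phase 1', count: 4, value: 50 },
//   { Phase: 'Phase 2', count: 4, value: 130 } ]
```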

2
The most efficient way to group the elements of an array of objects in JavaScript is to use the built-in method:

Object.groupBy()

const input = [
    { Phase: "Phase 1", Step: "Step 1", Task: "Task 1", Value: "5" },
    { Phase: "Phase 2", Step: "Step 1", Task: "Task 2", Value: "30" },
    { Phase: "Phase 2", Step: "Step 2", Task: "Task 1", Value: "35" },
    { Phase: "Phase 2", Step: "Step 2", Task: "Task 2", Value: "40" }
];

const output = Object.groupBy(input, ({ Phase }) => Phase);

console.log(JSON.stringify(output, null, 4));

Result:

{
    "Phase 1": [
        {
            "Phase": "Phase 1",
            "Step": "Step 1",
            "Task": "Task 1",
            "Value": "5"
        }
    ],
    "Phase 2": [
        {
            "Phase": "Phase 2",
            "Step": "Step 1",
            "Task": "Task 2",
            "Value": "30"
        },
        {
            "Phase": "Phase 2",
            "Step": "Step 2",
            "Task": "Task 1",
            "Value": "35"
        },
        {
            "Phase": "Phase 2",
            "Step": "Step 2",
            "Task": "Task 2",
            "Value": "40"
        }
    ]
}

Note: Object.groupBy() is now supported in Chrome 117, and other browsers are starting to implement it. See the browser compatibility table and the Object.groupBy polyfill in the core-js library.
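
Since Object.groupBy() only groups, the sums from the question still need a second step. The sketch below uses the built-in when it is available and otherwise a small stand-in with the same call shape; the stand-in is an assumption for older runtimes, not a complete polyfill.

```javascript
// Use the built-in where it exists; otherwise a minimal stand-in.
const groupBy = typeof Object.groupBy === 'function'
    ? Object.groupBy
    : (items, keyFn) => {
        // The built-in also returns a null-prototype object.
        const groups = Object.create(null);
        let index = 0;
        for (const item of items) {
            const key = keyFn(item, index++);
            (groups[key] = groups[key] || []).push(item);
        }
        return groups;
    };

const input = [
    { Phase: "Phase 1", Step: "Step 1", Task: "Task 1", Value: "5" },
    { Phase: "Phase 2", Step: "Step 1", Task: "Task 2", Value: "30" },
    { Phase: "Phase 2", Step: "Step 2", Task: "Task 1", Value: "35" },
    { Phase: "Phase 2", Step: "Step 2", Task: "Task 2", Value: "40" }
];

// Second step: collapse each group into one row with Value summed.
const totals = Object.entries(groupBy(input, ({ Phase }) => Phase))
    .map(([Phase, items]) => ({
        Phase,
        Value: items.reduce((sum, item) => sum + Number(item.Value), 0)
    }));

console.log(totals);
// [ { Phase: 'Phase 1', Value: 5 }, { Phase: 'Phase 2', Value: 105 } ]
```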
