Get all non-unique values (i.e. duplicate/more than one occurrence) in an array

Question

javascript arrays

542

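I need to check a JavaScript array to see if there are any duplicate values. What's the easiest way to do this? I just need to find what the duplicated values are - I don't need their indexes or how many times they are duplicated.

I know I can loop through the array and check all the other values for a match, but it seems like there should be an easier way.

Similar question:

Get all unique values in a JavaScript array (remove duplicates)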

- Scott Saunders

29

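There seem to have been years of confusion about what this question asks. I needed to know which elements in the array are duplicated: "I just need to find what the duplicated values are". The correct answer should NOT remove duplicates from the array. That's the opposite of what I wanted: I need a list of the duplicates, not a list of the unique elements. - Scott Saunders

I would post an answer, but given the length of the answers here, it would probably never be seen. - lonewarrior556

98 Answers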

215

If you want to eliminate the duplicates, try this great solution:

function eliminateDuplicates(arr) {
  var i,
      len = arr.length,
      out = [],
      obj = {};

  for (i = 0; i < len; i++) {
    obj[arr[i]] = 0;
  }
  for (i in obj) {
    out.push(i);
  }
  return out;
}

console.log(eliminateDuplicates([1,6,7,3,6,8,1,3,4,5,1,7,2,6]))

Source: http://dreaminginjavascript.wordpress.com/2008/08/22/eliminating-duplicates/

- rapfaria

27

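This is good code, but unfortunately it doesn't do what I'm asking for. - Scott Saunders

71

The code above (which is mine, that's my blog) gets you pretty close. A small tweak and you're there. First of all, you can see that if arr.length and out.length are equal, there are no duplicated elements. But maybe you want a little more. If you want to "catch" the duplicates as they happen, check whether obj gained a new key after the obj[arr[i]] = 0 line; if it did not, the value is a duplicate. Nifty, eh? :-) Thanks for the kind words, Raphael Montanaro. - Nosredna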
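
One way to read the tweak described in the comment above is sketched below: record each value as an object key while looping, and treat a value as a duplicate when its key is already present. The name collectDuplicates is hypothetical and does not come from the blog post; like eliminateDuplicates, this sketch coerces values to strings when it uses them as keys.

```javascript
// Hypothetical variation of eliminateDuplicates that collects the duplicates.
function collectDuplicates(arr) {
  var hasOwn = Object.prototype.hasOwnProperty;
  var seen = {};     // keys: values met at least once
  var reported = {}; // keys: values already pushed to out
  var out = [];

  for (var i = 0; i < arr.length; i++) {
    var key = arr[i];
    if (hasOwn.call(seen, key)) {
      // The key already exists, so this value is a duplicate.
      if (!hasOwn.call(reported, key)) {
        reported[key] = true;
        out.push(arr[i]);
      }
    } else {
      seen[key] = true;
    }
  }
  return out;
}

console.log(collectDuplicates([1,6,7,3,6,8,1,3,4,5,1,7,2,6])); // [ 6, 1, 3, 7 ]
```

Each duplicate is reported once, in the order in which its second occurrence is met.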

6

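@MarcoDemaio: Uh, no, why would the code not work with spaces? You can put whatever you like in a property name - you just can't use the dot syntax to access properties with spaces (or with various other characters that would break parsing). - Gijs

4

@Gijs: +1, you are right. I didn't know that. But it still doesn't work when it's an array of objects. - Marco Demaio

3

This algorithm also returns a sorted array, which might not be what you want. - asymmetric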

207

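This is my answer from the duplicate thread:

When this entry was written (2014), all the examples were for-loops or jQuery. JavaScript has the perfect tools for this: sort, map and reduce.

Find duplicate items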

var names = ['Mike', 'Matt', 'Nancy', 'Adam', 'Jenny', 'Nancy', 'Carl']

const uniq = names
  .map((name) => {
    return {
      count: 1,
      name: name
    };
  })
  .reduce((result, b) => {
    result[b.name] = (result[b.name] || 0) + b.count;

    return result;
  }, {});
const duplicates = Object.keys(uniq).filter((a) => uniq[a] > 1);

console.log(duplicates); // [ 'Nancy' ]

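More functional syntax:

@Dmytro-Laptin pointed out some code that could be removed. This is a more compact version of the same code, using some ES6 tricks and higher-order functions: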

const names = ['Mike', 'Matt', 'Nancy', 'Adam', 'Jenny', 'Nancy', 'Carl'];
const count = names =>
  names.reduce((result, value) => ({ ...result,
    [value]: (result[value] || 0) + 1
  }), {}); // don't forget to initialize the accumulator
const duplicates = dict =>
  Object.keys(dict).filter((a) => dict[a] > 1);

console.log(count(names)); // { Mike: 1, Matt: 1, Nancy: 2, Adam: 1, Jenny: 1, Carl: 1 }
console.log(duplicates(count(names))); // [ 'Nancy' ]
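
A possible variation on the counting approach above, sketched with a Map instead of a plain object. Object keys are always strings, so the number 1 and the string '1' would share one counter; a Map keeps them apart. The names countWithMap and duplicatesFromMap are hypothetical and are not part of the original answer.

```javascript
// Count occurrences with a Map, which does not coerce keys to strings.
const countWithMap = (values) => {
  const counts = new Map();
  for (const value of values) {
    counts.set(value, (counts.get(value) || 0) + 1);
  }
  return counts;
};

// Keep only the values whose count is greater than 1.
const duplicatesFromMap = (counts) =>
  [...counts].filter(([, n]) => n > 1).map(([value]) => value);

console.log(duplicatesFromMap(countWithMap([1, '1', 2, 2, 'a', 'a', 'a']))); // [ 2, 'a' ]
```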

- Christian Landgren

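@ChristianLandgren, where is the 'dict' variable declared? Maybe 'count' should be used instead? - Dmytro Laptin

The dict variable is the parameter of the arrow function. It is shorthand for function(dict) { return Object.keys(dict) ... }. - Christian Landgren

Note that this code is not compatible with older versions of IE because of the => syntax. - J0ANMM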

159

UPDATE: a short one-liner to get the duplicates:

[1, 2, 2, 4, 3, 4].filter((e, i, a) => a.indexOf(e) !== i) // [2, 4]

To get the array without duplicates, simply invert the condition:

[1, 2, 2, 4, 3, 4].filter((e, i, a) => a.indexOf(e) === i) // [1, 2, 3, 4]

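Note that this answer's main goal is to be short. If you need something performant for a big array, one possible solution is to sort the array first (if it is sortable) and then do the following to get the same kind of result as above: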

myHugeSortedArray.filter((e, i, a) => a[i-1] === e)

Here is an example with an array of 1,000,000 integers:

const myHugeIntArrayWithDuplicates =
  [...Array(1_000_000).keys()]
  // adding two 0 and four 9 duplicates
  .fill(0, 2, 4).fill(9, 10, 14)

console.time("time")
console.log(
  myHugeIntArrayWithDuplicates
  // a possible sorting method for integers
  .sort((a, b) => a > b ? 1 : -1)
  .filter((e, i, a) => a[i-1] === e)
)
console.timeEnd("time")

On my AMD Ryzen 7 5700G dev machine it outputs:

[ 0, 0, 9, 9, 9, 9 ]
time: 22.738ms

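As pointed out in the comments, both the short solution and the performant solution return the same duplicate several times if it occurs more than twice in the original array: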

[1, 1, 1, 2, 2, 2, 2].filter((e, i, a) => a.indexOf(e) !== i) // [1, 1, 2, 2, 2]

If unique duplicates are wanted, a function like this can be used:

function duplicates(arr) {
  return [...new Set(arr.filter((e, i, a) => a.indexOf(e) !== i))]
}

With this, duplicates([1, 1, 1, 2, 2, 2, 2]) returns [1, 2].

If you only need to check whether there are duplicates, as asked in this question, you can use the every() method:

[1, 2, 3].every((e, i, a) => a.indexOf(e) === i) // true

[1, 2, 1].every((e, i, a) => a.indexOf(e) === i) // false

Note that every() is not available in IE 8 and below.
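
As an alternative to every() for the yes/no check, a Set can be compared with the array by size. This is a minimal sketch rather than part of the original answer, and the name hasDuplicates is hypothetical. Note that a Set treats NaN as equal to NaN, whereas indexOf() never finds NaN, so the two checks can disagree on arrays containing NaN.

```javascript
// A Set drops repeated values, so a smaller size means at least one duplicate.
const hasDuplicates = (arr) => new Set(arr).size !== arr.length;

console.log(hasDuplicates([1, 2, 3])); // false
console.log(hasDuplicates([1, 2, 1])); // true
```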

- Laurent Payot

6

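Remember: [2,2,2,2].filter((e, i, a) => a.indexOf(e) !== i) returns [2, 2, 2]. - Wajahath

6

@Wajahath That's true, thanks for pointing it out. If unique duplicates are wanted, a function like f = arr => [...new Set(arr.filter((e, i, a) => a.indexOf(e) !== i))] can be used, so that f([1, 1, 1, 2, 2, 2, 2]) returns [1, 2]. - Laurent Payot

Performance-wise, this is really bad if you have 1,000,000 entries. - strix25

1

@strix25 True, I added a similar solution that performs much better on a 1,000,000-element _sorted_ array. - Laurent Payot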

77

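Find duplicate values in an array

This should be one of the shortest ways to actually find duplicate values in an array. As specifically asked for by the OP, this does not remove duplicates but finds them.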

var input = [1, 2, 3, 1, 3, 1];

var duplicates = input.reduce(function(acc, el, i, arr) {
  // keep el if this is not its first occurrence and it is not collected yet
  if (arr.indexOf(el) !== i && acc.indexOf(el) < 0) acc.push(el);
  return acc;
}, []);

document.write(duplicates); // = 1,3 (actual array == [1, 3])

// Or, using Sets (about 4 times faster)

var duplicates = Array.from(input.reduce((acc, v, i, arr) => {
  return arr.indexOf(v) !== i ? acc.add(v) : acc;
}, new Set()));

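This doesn't need sorting or any third-party framework. It also doesn't need manual loops. It works with every value that indexOf() (or, to be clearer, the strict comparison operator) supports.

Because of reduce() and indexOf(), it needs at least IE 9.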
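
For larger inputs, the repeated indexOf() scan can be avoided by tracking what has been seen so far. The sketch below is a different technique from the reduce()/indexOf() code above, not the author's method: it walks the array once with two Sets. The name findDuplicatesOnce is hypothetical.

```javascript
// Single pass: the first sighting goes into seen, any later one into dupes.
function findDuplicatesOnce(values) {
  const seen = new Set();
  const dupes = new Set();
  for (const v of values) {
    if (seen.has(v)) {
      dupes.add(v); // a Set ignores repeated adds, so each duplicate is kept once
    } else {
      seen.add(v);
    }
  }
  return [...dupes];
}

console.log(findDuplicatesOnce([1, 2, 3, 1, 3, 1])); // [ 1, 3 ]
```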

- flu

11

ES6 arrow/simple/pure version: const dupes = items.reduce((acc, v, i, arr) => arr.indexOf(v) !== i && acc.indexOf(v) === -1 ? acc.concat(v) : acc, []) - ZephDavies

if (arr.indexOf(el) !== i && !acc.includes(el)) acc.push(el); return acc; also works - Kopi Bryant

30

You can add this function, or tweak it and add it to JavaScript's Array prototype:

Array.prototype.unique = function () {
    var r = new Array();
    o:for(var i = 0, n = this.length; i < n; i++)
    {
        for(var x = 0, y = r.length; x < y; x++)
        {
            if(r[x]==this[i])
            {
                alert('this is a DUPE!');
                continue o;
            }
        }
        r[r.length] = this[i];
    }
    return r;
}

var arr = [1,2,2,3,3,4,5,6,2,3,7,8,5,9];
var unique = arr.unique();
alert(unique);

- karim79

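This is the best solution, but be careful about adding it to the Array prototype, as that will break IE when looping through the values. - Sampsa Suoninen

@RoyTinker Perl supports them too, but I had no idea JavaScript did. - Luke H

3

Doesn't do what the OP asked for: return the duplicates. - RWC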

28

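UPDATED: The following uses an optimized combined strategy. It optimizes primitive lookups to benefit from hash O(1) lookup time (running unique on an array of primitives is O(n)). Object lookups are optimized by tagging objects with a unique id while iterating, so identifying duplicate objects is also O(1) per item and O(n) for the whole list. The only exception is frozen items, but those are rare, and a fallback using an array and indexOf is provided for them.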

var unique = function(){
  var hasOwn = {}.hasOwnProperty,
      toString = {}.toString,
      uids = {};

  function uid(){
    var key = Math.random().toString(36).slice(2);
    return key in uids ? uid() : uids[key] = key;
  }

  function unique(array){
    var strings = {}, numbers = {}, others = {},
        tagged = [], failed = [],
        count = 0, i = array.length,
        item, type;

    var id = uid();

    while (i--) {
      item = array[i];
      type = typeof item;
      if (item == null || type !== 'object' && type !== 'function') {
        // primitive
        switch (type) {
          case 'string': strings[item] = true; break;
          case 'number': numbers[item] = true; break;
          default: others[item] = item; break;
        }
      } else {
        // object
        if (!hasOwn.call(item, id)) {
          try {
            item[id] = true;
            tagged[count++] = item;
          } catch (e){
            if (failed.indexOf(item) === -1)
              failed[failed.length] = item;
          }
        }
      }
    }

    // remove the tags
    while (count--)
      delete tagged[count][id];

    tagged = tagged.concat(failed);
    count = tagged.length;

    // append primitives to results
    for (i in strings)
      if (hasOwn.call(strings, i))
        tagged[count++] = i;

    for (i in numbers)
      if (hasOwn.call(numbers, i))
        tagged[count++] = +i;

    for (i in others)
      if (hasOwn.call(others, i))
        tagged[count++] = others[i];

    return tagged;
  }

  return unique;
}();

If you have ES6 collections available, there is a much simpler and significantly faster version. (A shim for IE9+ and other browsers is here: https://github.com/Benvie/ES6-Harmony-Collections-Shim)

function unique(array){
  var seen = new Set;
  return array.filter(function(item){
    if (!seen.has(item)) {
      seen.add(item);
      return true;
    }
  });
}

- user748221

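Really? Why answer a question that was solved two years ago? - Rene Pot

3

I was answering another question and apparently clicked by accident on a link someone had posted to this one, calling it a duplicate. I ended up copying my answer here and confusing myself. I edit my answers a lot. - user748221

http://stackoverflow.com/questions/7683845/removing-duplicates-from-an-array-in-javascript - user748221

17

I think it's nice to have different solutions. It doesn't matter that the topic is old and solved, since it's still possible to come up with different ways of doing this. It's a typical problem in computer science. - Emil Vikström

You might want to mention that this relies on ES5 Array methods that aren't implemented in IE < 9. - Tim Down

Why not use a simple counter for the "uid" function? - Bergi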

24

var a = ["a","a","b","c","c"];

a.filter(function(value,index,self){ return (self.indexOf(value) !== index )})

- Angel David Calderaro Pacciott

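This seems to work, but you should probably include some text describing how it works. - The DIMM Reaper

4

Doesn't work if a duplicated value occurs more than twice. - vasa

1

This is elegant and simple. I love it. For those wanting to figure out how it works, I've created a gist showing how to show duplicates and how to eliminate them. See here: https://gist.github.com/jbcoder/f1c616a32ee4d642691792eebdc4257b - Josh

@TheDIMMReaper For the second 'a' in the array, index == 1 inside the filter function, whereas self.indexOf('a') == 0. - Sergiy Ostrovsky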
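
The explanation in the last comment can be made visible by logging, for every element, its own index next to the index of its first occurrence. This is only an illustrative trace; the names letters, trace and kept are hypothetical.

```javascript
// Trace what the filter callback compares for each element.
const letters = ["a", "a", "b", "c", "c"];

const trace = letters.map((value, index, self) => {
  const firstIndex = self.indexOf(value); // position of the first occurrence
  return { value, index, firstIndex, kept: firstIndex !== index };
});
console.log(trace);

// Elements that are not their own first occurrence are the ones filter keeps.
const kept = trace.filter((row) => row.kept).map((row) => row.value);
console.log(kept); // [ 'a', 'c' ]
```

The same trace also shows why a value that occurs three times is returned twice: every occurrence after the first one has index !== firstIndex.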

23

Find non-unique values from 3 arrays (or more):

ES2015

var arr =  [1,2,2,3,3,4,5,6,2,3,7,8,5,22],
    arr2 = [1,2,511,12,50],
    arr3 = [22,0],
    merged,
    nonUnique;

// Combine all the arrays to a single one
merged = arr.concat(arr2, arr3)

// create a new (dirty) Array with only the non-unique items
nonUnique = merged.filter((item,i) => merged.includes(item, i+1))

// Cleanup - remove duplicate items
nonUnique = [...new Set(nonUnique)]

console.log(nonUnique)

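Before ES2015:

In the example below I chose to put a unique method on the Array prototype, allowing access from anywhere and giving a more "declarative" syntax. I do not recommend this approach on large projects, because it may collide with another method that has the same custom name.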

Array.prototype.unique = function () {
    var arr = this.sort(), i=arr.length; // input must be sorted for this to work
    while(i--)
      arr[i] === arr[i-1] && arr.splice(i,1) // remove duplicate item
    return arr
}

Array.prototype.nonunique = function () {
    var arr = this.sort(), i=arr.length, res = []; // input must be sorted for this to work
    while(i--)
      arr[i] === arr[i-1] && (res.indexOf(arr[i]) == -1) && res.push(arr[i]) 
    return res
}

var arr =  [1,2,2,3,3,4,5,6,2,3,7,8,5,22],
    arr2 = [1,2,511,12,50],
    arr3 = [22,0],
    // merge all arrays & call custom Array Prototype - "unique"
    unique = arr.concat(arr2, arr3).unique(),
    nonunique = arr.concat(arr2, arr3).nonunique()

console.log(unique)     // [0,1,12,2,22,3,4,5,50,511,6,7,8] (default sort compares as strings)
console.log(nonunique)  // [5,3,22,2,1]

- vsync

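@shekhardesigner - updated the answer. "r" is the array you search in. - vsync

@vsync, I had to initialize "var r = [];" to get your code working. And it worked like a charm. - absqueued

@shekhardesigner - I'm sorry for the mix-up; for the Array prototype solution you don't need an r variable. - vsync

2

Doesn't do what the OP asked for: return the duplicates. - RWC

@RWC - So it seems. I will update the answer in the next few days to best fit the OP's question. - vsync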

22

This should get you what you want: just the duplicates.

function find_duplicates(arr) {
  var len=arr.length,
      out=[],
      counts={};

  for (var i=0;i<len;i++) {
    var item = arr[i];
    counts[item] = counts[item] >= 1 ? counts[item] + 1 : 1;
    if (counts[item] === 2) {
      out.push(item);
    }
  }

  return out;
}

find_duplicates(['one',2,3,4,4,4,5,6,7,7,7,'pig','one']); // -> ['one',4,7] in no particular order.

- Daniel Beardsley

- swilliams · Accepted Answer

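You could sort the array and then run through it, checking whether the next (or previous) element is the same as the current one. Assuming your sort algorithm is good, this should be less than O(n²):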

const findDuplicates = (arr) => {
  let sorted_arr = arr.slice().sort(); // You can define the comparing function here. 
  // JS by default uses a crappy string compare.
  // (we use slice to clone the array so the
  // original array won't be modified)
  let results = [];
  for (let i = 0; i < sorted_arr.length - 1; i++) {
    if (sorted_arr[i + 1] == sorted_arr[i]) {
      results.push(sorted_arr[i]);
    }
  }
  return results;
}

let duplicatedArray = [9, 9, 111, 2, 3, 4, 4, 5, 7];
console.log(`The duplicates in ${duplicatedArray} are ${findDuplicates(duplicatedArray)}`);

In case you need to return the duplicates from a function, a similar approach can be used. Reference: https://stackoverflow.com/a/57532964/8119511
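
The code comment in the answer above notes that sort() compares values as strings by default and that a compare function can be supplied. A minimal sketch of the difference for numbers follows; the array sample and the variable names are made up for illustration.

```javascript
const sample = [9, 111, 2, 10, 9];

// Default sort: elements are converted to strings and compared as text.
const byString = sample.slice().sort();
console.log(byString); // [ 10, 111, 2, 9, 9 ]

// Numeric compare function: a negative result places a before b.
const byNumber = sample.slice().sort((a, b) => a - b);
console.log(byNumber); // [ 2, 9, 9, 10, 111 ]
```

Equal numbers end up next to each other in both orders, so the adjacent-element check still finds them; the compare function mainly matters when the duplicates should come back in numeric order.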