Merge arrays and keep ordering

Question

Merge arrays and keep ordering

javascript arrays topological-sort

6

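Note

Following @Kaddath's suggestion, I have edited the question to make clear that the ordering does not have to be alphabetical; it depends on the position of the items within the arrays.

I have an array of arrays. Each inner array follows a given base order, but they may differ slightly from one another.

For example, the base order is X -> D -> H -> B, and these are my arrays: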

const arrays = [
  ['X', 'D', 'H', 'B'],
  ['X', 'D', 'K', 'Z', 'H', 'B', 'A'],
  ['X', 'M', 'D', 'H', 'B'],
  ['X', 'H', 'T'],
  ['X', 'D', 'H', 'B']
]

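I want to merge all the arrays into a single one and remove duplicates while keeping the order. In my example the result should be ['X', 'M', 'D', 'K', 'Z', 'H', 'T', 'B', 'A'].

In this example, M sits between X and D in the third array, and it is placed the same way in the final output.

I know conflicts may occur, but these are the rules:

Every item should appear in the final output.
If an item appears at different positions in several arrays, the first appearance is the right one (skip the others).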
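The requirement above can be made testable. The following is only a sketch: the helper name respectsAllOrders is hypothetical, and it checks relative order only (it assumes no item repeats inside a single array and does not model the "first appearance wins" conflict rule).

```javascript
// Hypothetical helper: true when `result` contains every input item, has no
// duplicates, and never reverses two neighbouring items of any input array.
function respectsAllOrders(arrays, result) {
  const position = new Map(result.map((item, index) => [item, index]));
  const everyItemPresent = arrays.flat().every((item) => position.has(item));
  const noDuplicates = position.size === result.length;
  const ordersKept = arrays.every((array) =>
    array.every(
      (item, i) => i === 0 || position.get(array[i - 1]) < position.get(item),
    ),
  );
  return everyItemPresent && noDuplicates && ordersKept;
}

const sampleArrays = [
  ['X', 'D', 'H', 'B'],
  ['X', 'D', 'K', 'Z', 'H', 'B', 'A'],
  ['X', 'M', 'D', 'H', 'B'],
  ['X', 'H', 'T'],
  ['X', 'D', 'H', 'B'],
];

// The expected output from the question satisfies every array.
console.log(respectsAllOrders(sampleArrays, ['X', 'M', 'D', 'K', 'Z', 'H', 'T', 'B', 'A'])); // true
// A plain "flatten and dedupe" output does not: K and Z land after H.
console.log(respectsAllOrders(sampleArrays, ['X', 'D', 'H', 'B', 'K', 'Z', 'A', 'M', 'T'])); // false
```

Checking only neighbouring pairs is enough here, because "comes before" is transitive within one result array.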

What I have done so far is flatten the arrays with

const merged = [].concat.apply([], arrays);

Reference: https://dev59.com/cmgv5IYBdhLWcg3wD8yY#10865042.

Then I get the unique values with this snippet from https://dev59.com/BHI-5IYBdhLWcg3w-9wK#1584377:

Array.prototype.unique = function() {
    var a = this.concat();
    for(var i=0; i<a.length; ++i) {
        for(var j=i+1; j<a.length; ++j) {
            if(a[i] === a[j])
                a.splice(j--, 1);
        }
    }

    return a;
}; 
const finalArray = merged.unique();

But my result is this:

[
  "X",
  "D",
  "H",
  "B",
  "K",
  "Z",
  "A",
  "M",
  "T"
]
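That output is simply "first seen" order over the flattened list. As a sketch (variable names are illustrative, and Array.prototype.flat needs a reasonably modern runtime), the same result can be reproduced without patching Array.prototype:

```javascript
const inputArrays = [
  ['X', 'D', 'H', 'B'],
  ['X', 'D', 'K', 'Z', 'H', 'B', 'A'],
  ['X', 'M', 'D', 'H', 'B'],
  ['X', 'H', 'T'],
  ['X', 'D', 'H', 'B'],
];

// Flatten one level, then let Set drop later duplicates.
// A Set iterates in insertion order, so the first occurrence of each item wins.
const flattened = inputArrays.flat();
const firstSeenOrder = [...new Set(flattened)];

console.log(firstSeenOrder.join(','));
// prints "X,D,H,B,K,Z,A,M,T"
```

K, Z, A, M and T end up at the tail because flattening discards which neighbours they had inside their own arrays.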

Any help is welcome! Thanks.

- MHogge

1

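Can't you just sort the array after merging? - Jerodev

2

You can sort it, e.g. finalArray.sort(). - Satpal

I don't think you have any option other than sorting afterwards. If you think about it, "keeping the order" leads to conflicts in your case: do you want to keep the order of the first array or of the second (if they differ)? What about the third? Which criteria must apply? - Kaddath

The data is not "sortable". In the example the base array is ['A', 'B', 'C', 'D'], but it could just as well be ['X', '1', 'D', 'EE'] (anything). The result should keep the order of the base array while adding items between the existing ones (e.g. A-bis is added between A and B not because of alphabetical order, but because it appears between those two items in one of the following arrays). - MHogge

2

Then I think you should edit your post so that it doesn't look alphabetical, and make clear that the order to apply depends on the order of the arrays (the order of the first array is applied first, then the second, and so on), if that is the case. You also have to be aware that the ordering of a later array may conflict with an ordering already applied, and state whether in that case it should be ignored or should override the existing one. - Kaddath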

11 Answers

5

const arrays = [
  ['X', 'D', 'H', 'B'],
  ['X', 'D', 'K', 'Z', 'H', 'B', 'A'],
  ['X', 'M', 'D', 'H', 'B'],
  ['X', 'H', 'T'],
  ['X', 'D', 'H', 'B']
];
const result = [];
arrays.forEach(array => {
  array.forEach((item, idx) => {
    // check if the item has already been added, if not, try to add
    if(!~result.indexOf(item)) {
      // if item is not first item, find position of its left sibling in result array
      if(idx) {
        const result_idx = result.indexOf(array[idx - 1]);
        // add item after left sibling position
        result.splice(result_idx + 1, 0, item);
        return;
      }
      result.push(item);
    }
  });
});
console.log('expected result', ['X', 'M', 'D', 'K', 'Z', 'H', 'T', 'B', 'A'].join(','));
console.log(' current result',result.join(','));

- ponury-kostek

1

Sorry, but this doesn't work. With arrays = [['c','a'], ['c','b'], ['c','a','b']]; the output is: expected result c,a,b, current result c,b,a - Anton

Nice solution, but to my mind the condition 'if(idx)' should be replaced with 'if(idx>0)'. - Stan
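To make Anton's counterexample easy to reproduce, here is the same loop restated as a function. This is a sketch: the name insertAfterLeftSibling is made up for illustration, and the logic is meant to behave exactly like the answer's code, not to fix it.

```javascript
// Restatement of the answer's loop as a reusable function (hypothetical name).
function insertAfterLeftSibling(arrays) {
  const result = [];
  for (const array of arrays) {
    array.forEach((item, idx) => {
      if (result.includes(item)) return; // already placed: first appearance wins
      if (idx === 0) {
        result.push(item); // no left sibling to anchor on, append at the end
        return;
      }
      // The left sibling is always present: it was placed earlier or just now.
      const leftSiblingIdx = result.indexOf(array[idx - 1]);
      result.splice(leftSiblingIdx + 1, 0, item);
    });
  }
  return result;
}

const fromQuestion = insertAfterLeftSibling([
  ['X', 'D', 'H', 'B'],
  ['X', 'D', 'K', 'Z', 'H', 'B', 'A'],
  ['X', 'M', 'D', 'H', 'B'],
  ['X', 'H', 'T'],
  ['X', 'D', 'H', 'B'],
]);
console.log(fromQuestion.join(',')); // prints "X,M,D,K,Z,H,T,B,A"

const fromComment = insertAfterLeftSibling([['c', 'a'], ['c', 'b'], ['c', 'a', 'b']]);
console.log(fromComment.join(',')); // prints "c,b,a"
```

In the counterexample, b is inserted directly after c, ahead of a, because the first two arrays say nothing about how a and b relate. The third array's "a before b" arrives after both items are already placed, so it is ignored.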

3

A simplified "flatten, deduplicate and sort" approach:

const arrays = [
  ['A', 'B', 'C', 'D'],
  ['A', 'B', 'B-bis', 'B-ter', 'C', 'D', 'D-bis'],
  ['A', 'A-bis', 'B', 'C', 'D'],
  ['A', 'C', 'E'],
  ['A', 'B', 'C', 'D'],
];
console.log(
  arrays
    .flat()
    .filter((u, i, all) => all.indexOf(u) === i)
    .sort((a, b) => a.localeCompare(b)),
);

Based on Mohammad Usman's now-deleted post, even simpler:

const arrays = [
  ['A', 'B', 'C', 'D'],
  ['A', 'B', 'B-bis', 'B-ter', 'C', 'D', 'D-bis'],
  ['A', 'A-bis', 'B', 'C', 'D'],
  ['A', 'C', 'E'],
  ['A', 'B', 'C', 'D'],
];
console.log(
  [...new Set([].concat(...arrays))].sort((a, b) =>
    a.localeCompare(b),
  ),
);

- HMR

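This looks like exactly what I was looking for, except that my ordering should not be alphabetical but more like "you have a base array ['A', 'B', 'C', 'D']; keep that order and add items between the existing ones depending on the following arrays." Not sure I'm making myself clear. But I guess I need to change the sort function to reach my goal. - MHogge

@MHogge, take the first element from arrays[0]; if it is 'A', take all elements starting with 'A' from the other arrays, then take the next element from arrays[0] and repeat. - HMR

It does not rely on alphabetical order: "A-bis" could be renamed "XXX" and it would still have to be placed between "A" and "B", because that is where it was found in one of the following arrays. Please see my comment on the original post, it may be clearer. - MHogge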
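A quick illustration of MHogge's point, using the arrays from the edited question rather than the A/B/C sample data (variable names are illustrative): once the labels stop being alphabetical, a default sort produces an order that has nothing to do with the arrays.

```javascript
const questionArrays = [
  ['X', 'D', 'H', 'B'],
  ['X', 'D', 'K', 'Z', 'H', 'B', 'A'],
  ['X', 'M', 'D', 'H', 'B'],
  ['X', 'H', 'T'],
  ['X', 'D', 'H', 'B'],
];

// Deduplicate, then sort with the default comparator (UTF-16 code unit order).
const alphabetical = [...new Set(questionArrays.flat())].sort();

console.log(alphabetical.join(','));
// prints "A,B,D,H,K,M,T,X,Z"
// The question asks for X,M,D,K,Z,H,T,B,A instead.
```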

2

You can combine .concat() with Set to get a result array of unique values:

const data = [
  ['A', 'B', 'C', 'D'],
  ['A', 'B', 'B-bis', 'B-ter', 'C', 'D', 'D-bis'],
  ['A', 'A-bis', 'B', 'C', 'D'],
  ['A', 'C', 'E'],
  ['A', 'B', 'C', 'D']
];

const result = [...new Set([].concat(...data))].sort((a, b) => a.localeCompare(b));

console.log(result);


- Mohammad Usman

The result is not what I expect; see my question: I want ['A', 'A-bis', 'B', 'B-bis', 'B-ter', 'C', 'D', 'D-bis', 'E'] as the result. - MHogge

2

Use array#concat to create a single array, then use Set to get the unique values of that array, and finally sort it.

const arrays = [ ['A', 'B', 'C', 'D'], ['A', 'B', 'B-bis', 'B-ter', 'C', 'D', 'D-bis'], ['A', 'A-bis', 'B', 'C', 'D'], ['A', 'C', 'E'], ['A', 'B', 'C', 'D'] ],
      result = [...new Set([].concat(...arrays))].sort();
console.log(result);

- Hassan Imam

1

Merge: [].concat.apply([], arrays)
Find unique values: [...new Set(merged)]
Sort: .sort()

const arrays = [
  ['A', 'B', 'C', 'D'],
  ['A', 'B', 'B-bis', 'B-ter', 'C', 'D', 'D-bis'],
  ['A', 'A-bis', 'B', 'C', 'D'],
  ['A', 'C', 'E'],
  ['A', 'B', 'C', 'D']
];


let merged = [].concat.apply([], arrays);  // merge array

let sort = [...new Set(merged)].sort(); // find uniq then sort

console.log(sort);

- Shiv Kumar Baghel

1

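An interesting problem to solve; I think I only partly succeeded.

I ignored the "underspecified" cases, e.g. B -> A -> T vs. T -> B -> A
It is very inefficient

Still posting it because I think it might help you get things right. Here is my approach:

Step 1: create a naive index

We create an object that, for every unique element in the nested arrays, keeps track of the elements that precede it (prev) and the elements that follow it (next):

{
  "X": { prev: Set({}), next: Set({ "D", "H", "B", "K", "Z", "A", "M", "T" }) },
  "M": { prev: Set({ "X" }), next: Set({ "D", "H", "B" }) },
  // etc.
}

I call it "naive" because the sets only contain information one level deep.

They only report relations between elements that occur in the same array. They cannot see that M comes before K, because the two never appear in the same array.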
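A minimal sketch of such a naive index, written as a plain loop rather than the reduce chain used in the full listing further down. The function name buildNaiveIndex is hypothetical; the prev/next shape follows the object shown above.

```javascript
// For every item, record everything that appears before it (prev) and
// after it (next) within the same array. Nothing is inferred across arrays.
function buildNaiveIndex(arrays) {
  const index = {};
  for (const array of arrays) {
    array.forEach((item, i) => {
      if (!index[item]) {
        index[item] = { prev: new Set(), next: new Set() };
      }
      array.slice(0, i).forEach((p) => index[item].prev.add(p));
      array.slice(i + 1).forEach((n) => index[item].next.add(n));
    });
  }
  return index;
}

const naiveIndex = buildNaiveIndex([
  ['X', 'D', 'K', 'Z', 'H', 'B', 'A'],
  ['X', 'M', 'D', 'H', 'B'],
]);

console.log(naiveIndex.M.next.has('D')); // true: M and D share an array
console.log(naiveIndex.D.next.has('K')); // true: D and K share an array
console.log(naiveIndex.M.next.has('K')); // false: M and K never share an array
```

The last line is the gap that the next step has to close: M precedes D and D precedes K, but the naive index does not connect M to K.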

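Step 2: recursively join the indexes

This is where I ignored all the big-O concerns one might have. I merge the index recursively: the next of M is the union of the next of D, H and B. Recurse until you reach an element that has no next, i.e. T or A.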
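One way to keep this step cheaper, shown only as a sketch and not as what the answer's code does: store just the immediate successors and walk them with a visited set, so each element is expanded at most once per starting point. The names buildSuccessorGraph and reachableFrom are hypothetical.

```javascript
// Direct-successor graph: item -> Set of items that immediately follow it.
function buildSuccessorGraph(arrays) {
  const graph = new Map();
  for (const array of arrays) {
    array.forEach((item, i) => {
      if (!graph.has(item)) graph.set(item, new Set());
      if (i + 1 < array.length) graph.get(item).add(array[i + 1]);
    });
  }
  return graph;
}

// Every item that must come after `start`, found with an iterative
// depth-first walk. The `seen` set prevents re-expanding shared successors.
function reachableFrom(graph, start) {
  const seen = new Set();
  const stack = [...graph.get(start)];
  while (stack.length > 0) {
    const item = stack.pop();
    if (seen.has(item)) continue;
    seen.add(item);
    stack.push(...graph.get(item));
  }
  return seen;
}

const successorGraph = buildSuccessorGraph([
  ['X', 'D', 'K', 'Z', 'H', 'B', 'A'],
  ['X', 'M', 'D', 'H', 'B'],
]);

console.log(reachableFrom(successorGraph, 'M').has('K')); // true: M -> D -> K
console.log(reachableFrom(successorGraph, 'K').has('M')); // false
```

Unlike the naive index, this does connect M to K, because the walk follows D's successors after reaching D.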

Step 3: create a sorter that respects the sort index:

const indexSorter = idx => (a, b) => 
    idx[a].next.has(b) || idx[b].prev.has(a) ? -1 :
    idx[a].prev.has(b) || idx[b].next.has(a) ?  1 :
                                                0 ;

This function creates a sort comparator that uses the generated index to look up the order between any two elements.

Bringing everything together:

(function() {


  const naiveSortIndex = xss => xss
    .map(xs =>
      // [ prev, cur, next ]
      xs.map((x, i, xs) => [
        xs.slice(0, i), x, xs.slice(i + 1)
      ])
    )

    // flatten
    .reduce((xs, ys) => xs.concat(ys), [])

    // add to index
    .reduce(
      (idx, [prev, cur, next]) => {
        if (!idx[cur])
          idx[cur] = {
            prev: new Set(),
            next: new Set()
          };

        prev.forEach(p => {
          idx[cur].prev.add(p);
        });

        next.forEach(n => {
          idx[cur].next.add(n);
        });

        return idx;
      }, {}
    );

  const expensiveSortIndex = xss => {
    const naive = naiveSortIndex(xss);

    return Object
      .keys(naive)
      .reduce(
        (idx, k) => Object.assign(idx, {
          [k]: {
            prev: mergeDir("prev", naive, k),
            next: mergeDir("next", naive, k)
          }
        }), {}
      )
  }

  const mergeDir = (dir, idx, k, s = new Set()) =>
    idx[k][dir].size === 0 
      ? s 
      : Array.from(idx[k][dir])
          .reduce(
            (s, k2) => mergeDir(dir, idx, k2, s),
            new Set([...s, ...idx[k][dir]])
          );

  // Generate a recursive sort method based on an index of { key: { prev, next } }
  const indexSorter = idx => (a, b) =>
    idx[a].next.has(b) || idx[b].prev.has(a) ? -1 :
    idx[a].prev.has(b) || idx[b].next.has(a) ? 1 :
    0;

  const uniques = xs => Array.from(new Set(xs));


  // App:
  const arrays = [
    ['X', 'D', 'H', 'B'],
    ['X', 'D', 'K', 'Z', 'H', 'B', 'A'],
    ['X', 'M', 'D', 'H', 'B'],
    ['X', 'H', 'T'],
    ['X', 'D', 'H', 'B']
  ];

  const sortIndex = expensiveSortIndex(arrays);
  const sorter = indexSorter(sortIndex);

  console.log(JSON.stringify(
    uniques(arrays.flat()).sort(sorter)
  ))

}())

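Suggestions

I feel an elegant solution to the problem might be able to skip all the merging of Sets by using a linked list or tree-like structure and injecting elements at the correct index by traversing until an element's prev/next is found.

- user3297291

This is actually nice and a good start. The "underspecified" cases are not really a problem for me, but efficiency might be. I may have to apply this mechanism to large arrays, and it already takes ~2 seconds on small ones. Unfortunately I am not really familiar with JavaScript, but I will look at your suggestions and see whether I can get something more efficient. Thanks for your help anyway! - MHogge

Let me know if you manage to improve it! Very curious. If I have more time later today or this week, I might dig deeper. - user3297291

You can check @ponury-kostek's solution. It works fine for me. - MHogge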

0

I would flatten the arrays, map their entries as keys into an object (thereby removing duplicates), and then sort the final result.

const arrays = [
  ['A', 'B', 'C', 'D'],
  ['A', 'B', 'B-bis', 'B-ter', 'C', 'D', 'D-bis'],
  ['A', 'A-bis', 'B', 'C', 'D'],
  ['A', 'C', 'E'],
  ['A', 'B', 'C', 'D']
];

const final = Object.keys( arrays.flat().reduce( (aggregate, entry) => {
  aggregate[entry] = '';
  return aggregate;
}, {} ) ).sort( (x1, x2) => x1.localeCompare(x2) );

console.log( final );

- Icepickle

0

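In your code, you need to remove the duplicates after merging; that gives you the array of unique values.

Use array.sort to sort the array.

Hope this solves the problem.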

const arrays = [
  ['A', 'B', 'C', 'D'],
  ['A', 'B', 'B-bis', 'B-ter', 'C', 'D', 'D-bis'],
  ['A', 'A-bis', 'B', 'C', 'D'],
  ['A', 'C', 'E'],
  ['A', 'B', 'C', 'D']
]

const merged = [].concat.apply([], arrays);

const unique = Array.from(new Set(merged))


const sorted = unique.sort()

console.log("sorted Array", sorted)

// Single Line
      const result = [...new Set([].concat(...arrays))].sort();
      
 console.log("sorted Array single line", result)

- Learner

0

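My solution does not focus on efficiency, so I would not try it on large arrays. It works fine for me, though.

The idea is to walk through all elements repeatedly and insert an element into the sorted array only in one of three cases:

The current element is among the first elements of its array, and one of its successors is the first element of the sorted array.
The current element is among the last elements of its array, and one of its predecessors is the last element of the sorted array.
The preceding element is in the sorted array, and one of the current element's successors directly follows that preceding element in the sorted array.

For the problem at hand, as noted above, the order between T and B, A is not uniquely determined. To handle this I use a flag, force, which allows any legal option to be taken when an iteration produces no new insertions.

The following rule from the question is not implemented in my solution: "If an item appears at different positions in several arrays, the first appearance is the right one (skip the others)." There is no hierarchy between the arrays. It should be easy to implement, though, by adding the required check and a continue when it is not satisfied.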

let merge = (arrays) => {
  let sorted = [...arrays[0]];
  const unused_rules = arrays.slice(1);
  let not_inserted = unused_rules.flat().filter((v) => !sorted.includes(v));
  let last_length = -1;
  let force = false;

  // avoids lint warning
  const sortedIndex = (sorted) => (v) => sorted.indexOf(v);

  // loop until all elements are inserted, or until not even force works
  while (not_inserted.length !== 0 && !force) {
    force = not_inserted.length === last_length; //if last iteration didn't add elements, our arrays lack complete information and we must add something using what little we know
    last_length = not_inserted.length;
    for (let j = 0; j < unused_rules.length; j += 1) {
      const array = unused_rules[j];
      for (let i = 0; i < array.length; i += 1) {
        // check if element is already inserted
        if (sorted.indexOf(array[i]) === -1) {
          if (i === 0) {
            // if element is first in its array, check if it can be prepended to sorted array
            const index = array.indexOf(sorted[0]);
            if (index !== -1 || force) {
              const insert = array.slice(0, force ? 1 : index);
              sorted = [...insert, ...sorted];
              not_inserted = not_inserted.filter((v) => !insert.includes(v));
              force = false;
            }
          } else if (i === array.length - 1) {
            // if element is last in its array, check if it can be appended to sorted array
            const index = array.indexOf(sorted[sorted.length - 1]);
            if (index !== -1 || force) {
              const insert = array.slice(force ? array.length - 1 : index + 1);
              sorted = [...sorted, ...insert];
              not_inserted = not_inserted.filter((v) => !insert.includes(v));
              force = false;
            }
          } else {
            const indices = array.map(sortedIndex(sorted)); // map all elements to its index in sorted
            const predecessorIndexSorted = indices[i - 1]; // index in the sorted array of the element preceding current element
            let successorIndexArray;
            if (force) {
              successorIndexArray = i + 1;
            } else {
              successorIndexArray = indices.indexOf(predecessorIndexSorted + 1); // index in the current array of the element succeeding the current elements predecessor in the sorted array
            }
            if (predecessorIndexSorted !== -1 && successorIndexArray !== -1) {
              // insert all elements between predecessor and successor
              const insert = array.slice(i, successorIndexArray);
              sorted.splice(i, 0, ...insert);
              not_inserted = not_inserted.filter((v) => !insert.includes(v));
              force = false;
            }
          }
        }
      }
    }
  }
  return sorted;
};

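Actually, the rule "If an item appears at different positions in several arrays, the first appearance is the right one (skip the others)" is a bit vague. For example, with the arrays below, is it okay to end up with arrays[3] as the sorted array, since it does not violate the first appearance of any element, or should arrays[2] take precedence?

const arrays = [['a', 'b', 'd'],
                ['a', 'c', 'd'],
                ['a', 'b', 'c', 'd'],
                ['a', 'c', 'b', 'd']]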

- Anton


- dugokontov · Accepted Answer

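Every array is in fact a set of rules that tells us the relative order between elements. The final list should return all elements while respecting the relative order defined by all the rules.

Some solutions solved the initial request, and some did not even solve that (all those that suggest sorting missed the point of the question). Nevertheless, none proposed a generic solution.

The problem

If we look at the problem asked in the OP, this is how the rules define the relative positions between elements: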

   M    K -> Z    T
  ^ \  ^      \  ^
 /   v/        v/
X -> D ------> H -> B -> A

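So it is easy to see that our array starts with X. The next element could be either D or M. But D requires M to be in the array already, which is why we put M next and then D. Next, D points to both K and H. But H has another predecessor that has not been collected yet (Z), whereas K has none (it does have D, but D is already in the list), so we place K and Z, and only then H.

H points to both T and B. It does not actually matter which one we put first, so the last three elements can appear in any of the following three orders:

- T, B, A
- B, A, T
- B, T, A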
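The claim that all three tails are acceptable can be checked mechanically. This is a sketch with a hypothetical helper, keepsRelativeOrder, which only verifies that neighbouring items of each rule keep their relative order in a candidate list.

```javascript
// True when no rule has two neighbouring items reversed in `candidate`.
function keepsRelativeOrder(rules, candidate) {
  const position = new Map(candidate.map((item, index) => [item, index]));
  return rules.every((rule) =>
    rule.every(
      (item, i) => i === 0 || position.get(rule[i - 1]) < position.get(item),
    ),
  );
}

const rules = [
  ['X', 'D', 'H', 'B'],
  ['X', 'D', 'K', 'Z', 'H', 'B', 'A'],
  ['X', 'M', 'D', 'H', 'B'],
  ['X', 'H', 'T'],
  ['X', 'D', 'H', 'B'],
];

const prefix = ['X', 'M', 'D', 'K', 'Z', 'H'];
const tails = [
  ['T', 'B', 'A'],
  ['B', 'A', 'T'],
  ['B', 'T', 'A'],
  ['A', 'B', 'T'], // included as a negative control: A may not precede B
];

for (const tail of tails) {
  console.log(tail.join(','), keepsRelativeOrder(rules, [...prefix, ...tail]));
}
// prints:
// T,B,A true
// B,A,T true
// B,T,A true
// A,B,T false
```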

Let's also consider a slightly more complex case. Here are the rules:

['10', '11', '12', '1', '2'],
['11', '12', '13', '2'],
['9', '13'],
['9', '10'],

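If we draw the graph from these rules, we get the following:

   --------------> 13 -----
  /                ^       \
 /                /         v
9 -> 10 -> 11 -> 12 -> 1 -> 2

What is special about this case? Two things:

Only in the last rule do we "find out" that the number 9 is the start of the array
There are two non-direct paths from 12 to 2 (one through the number 1, the other through the number 13).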

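The solution

The idea is to create a node for every element, and then use that node to keep track of all immediate successors and immediate predecessors. After that, we find all elements that have no predecessors and start "collecting" results from there. If we reach a node that has multiple predecessors, some of which have not been collected yet, we stop the recursion there. It can also happen that some successor has already been collected through another path; we skip that successor.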

function mergeAndMaintainRelativeOrder(arrays/*: string[][]*/)/*: string[]*/ {
    /*
    interface NodeElement {
        value: string;
        predecessor: Set<NodeElement>;
        successor: Set<NodeElement>;
        collected: boolean;
    }
    */
    const elements/*: { [key: string]: NodeElement }*/ = {};
    // For every element in all rules create NodeElement that will
    // be used to keep track of immediate predecessors and successors
    arrays.flat().forEach(
        (value) =>
            (elements[value] = {
                value,
                predecessor: new Set/*<NodeElement>*/(),
                successor: new Set/*<NodeElement>*/(),
                // Used when we form final array of results to indicate
                // that this node has already been collected in final array
                collected: false,
            }),
    );

    arrays.forEach((list) => {
        for (let i = 0; i < list.length - 1; i += 1) {
            const node = elements[list[i]];
            const nextNode = elements[list[i + 1]];

            node.successor.add(nextNode);
            nextNode.predecessor.add(node);
        }
    });

    function addElementsInArray(head/*: NodeElement*/, array/*: string[]*/) {
        let areAllPredecessorsCollected = true;
        head.predecessor.forEach((element) => {
            if (!element.collected) {
                areAllPredecessorsCollected = false;
            }
        });
        if (!areAllPredecessorsCollected) {
            return;
        }
        array.push(head.value);
        head.collected = true;
        head.successor.forEach((element) => {
            if (!element.collected) {
                addElementsInArray(element, array);
            }
        });
    }

    const results/*: string[]*/ = [];
    Object.values(elements)
        .filter((element) => element.predecessor.size === 0)
        .forEach((head) => {
            addElementsInArray(head, results);
        });
    return results;
}

console.log(mergeAndMaintainRelativeOrder([
    ['X', 'D', 'H', 'B'],
    ['X', 'D', 'K', 'Z', 'H', 'B', 'A'],
    ['X', 'M', 'D', 'H', 'B'],
    ['X', 'H', 'T'],
    ['X', 'D', 'H', 'B'],
]));


console.log(mergeAndMaintainRelativeOrder([
    ['10', '11', '12', '1', '2'],
    ['11', '12', '13', '2'],
    ['9', '13'],
    ['9', '10'],
]));

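Big O notation

If we assume that n is the number of rules and m is the number of elements in each rule, the complexity of this algorithm is O(n*m). This takes into account that the JS Set implementation is close to O(1).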
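For comparison, here is an iterative variant based on Kahn's algorithm for topological sorting. It is a different technique from the recursive collection above, offered only as a sketch: the function name mergeWithKahn and the choice to throw on contradictory rules are assumptions, not part of the answer. It does the same amount of work per rule element, and it reports a cycle instead of silently leaving elements out.

```javascript
// Kahn's algorithm: repeatedly emit an item whose predecessors are all emitted.
function mergeWithKahn(arrays) {
  const successors = new Map(); // item -> Set of immediate successors
  const pending = new Map(); // item -> number of predecessors not yet emitted

  for (const array of arrays) {
    for (const item of array) {
      if (!successors.has(item)) {
        successors.set(item, new Set());
        pending.set(item, 0);
      }
    }
  }

  for (const array of arrays) {
    for (let i = 0; i + 1 < array.length; i += 1) {
      const from = array[i];
      const to = array[i + 1];
      if (!successors.get(from).has(to)) {
        successors.get(from).add(to); // count each distinct edge once
        pending.set(to, pending.get(to) + 1);
      }
    }
  }

  // Start with the items that have no predecessors, in first-seen order.
  const ready = [...pending.keys()].filter((item) => pending.get(item) === 0);
  const result = [];
  for (let head = 0; head < ready.length; head += 1) {
    const item = ready[head];
    result.push(item);
    for (const next of successors.get(item)) {
      pending.set(next, pending.get(next) - 1);
      if (pending.get(next) === 0) ready.push(next);
    }
  }

  // Items stuck with pending predecessors mean the rules form a cycle.
  if (result.length !== successors.size) {
    throw new Error('The rules contradict each other (cycle detected)');
  }
  return result;
}

const mergedLetters = mergeWithKahn([
  ['X', 'D', 'H', 'B'],
  ['X', 'D', 'K', 'Z', 'H', 'B', 'A'],
  ['X', 'M', 'D', 'H', 'B'],
  ['X', 'H', 'T'],
  ['X', 'D', 'H', 'B'],
]);
console.log(mergedLetters.join(',')); // one valid order of the nine letters

const mergedNumbers = mergeWithKahn([
  ['10', '11', '12', '1', '2'],
  ['11', '12', '13', '2'],
  ['9', '13'],
  ['9', '10'],
]);
console.log(mergedNumbers.join(',')); // starts with 9 and ends with 2
```

Where several orders are valid (such as T versus B, A), this sketch picks one of them; it makes no attempt to apply the question's "first appearance wins" rule to contradictory inputs and rejects them instead.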