遍历二维布尔数组，只保留最大的连续的“由1组成的二维块”。

Question

遍历二维布尔数组，只保留最大的连续的“由1组成的二维块”。

4

好的，这个问题的表述有点奇怪，但我希望这能给您澄清事情。

我有一个示例2D数组。

$array = array(
array(1, 0, 0, 0, 1, 0, 0, 1),
array(0, 0, 1, 1, 1, 1, 0, 1),
array(0, 1, 1, 0, 1, 0, 0, 0),
array(0, 1, 1, 0, 0, 0, 1, 0),
array(1, 0, 0, 0, 1, 1, 1, 1),
array(0, 1, 1, 0, 1, 0, 1, 0),
array(0, 0, 0, 0, 0, 0, 0, 1)
);

当按行迭代（并以 \n 结束每一行），然后对每一行进行按列迭代时，它会输出类似于以下内容：（░░=0，▓▓=1）

    ▓▓░░░░░░▓▓░░░░▓▓
    ░░░░▓▓▓▓▓▓▓▓░░▓▓
    ░░▓▓▓▓░░▓▓░░░░░░
    ░░▓▓▓▓░░░░░░▓▓░░
    ▓▓░░░░░░▓▓▓▓▓▓▓▓
    ░░▓▓▓▓░░▓▓░░▓▓░░
    ░░░░░░░░░░░░░░▓▓

但是我想做的是“分析”数组，只保留一个连续的形状（具有最多“单元格”），在这个例子中，结果将是：

    ░░░░░░░░▓▓░░░░░░
    ░░░░▓▓▓▓▓▓▓▓░░░░
    ░░▓▓▓▓░░▓▓░░░░░░
    ░░▓▓▓▓░░░░░░░░░░
    ▓▓░░░░░░░░░░░░░░
    ░░▓▓▓▓░░░░░░░░░░
    ░░░░░░░░░░░░░░░░

我的初始方法是：

Assign each ▓▓ cell a unique number (be it completely random, or the current iteration number):

01      02    03
    04050607  08
  0910  11      
  1213      14  
15      16171819
  2021  22  23  
              24

Iterate through the array many, MANY times: every iteration, each ▓▓ cell assumes the largest unique number among his neighbours. The loop would go on indefinitely until there's no change detected between the current state and the previous state. After the last iteration, the result would be this:
```
01      21    08
    21212121  08
  2121  21      
  2121      24  
21      24242424
  2121  24  24  
              24
```
Now it all comes down to counting the value that occurs the most. Then, iterating once again, to turn all the cells whose value is not the most popular one, to 0, giving me the desired result.

然而，我觉得这种方式对于如此简单的任务来说有些绕路和计算量大，必须有更好的方法。如果您有任何想法，将不胜感激，谢谢！

额外加分：将所有blob分成一个2D数组的数组，按单元格数排序，以便我们也可以处理最小的blob。

- Nikedemos

这是一种常见的算法，用于获取图的连接组件。它最多需要与最大组件大小相同的迭代次数。你不能指望得到比这更好的结果。 - MrSmith42

我认为你应该把它看作一个矩阵。通过查看选定点周围的邻居，您可以找到最大的 blob。每次记住使用了哪些点。我会使用面向对象编程来实现这个。 - Universus

阅读有关https://en.wikipedia.org/wiki/Flood_fill和https://en.wikipedia.org/wiki/Connected-component_labeling的内容。 - גלעד ברקן

3个回答

2

这些问题总是很有趣。我曾经解决过类似的问题，所以我将我的代码放在这里，希望对你有所帮助。它基本上通过查看一个单元格及其周围的八个单元格来跟踪每个形状，并且如果它们相连则前往连接单元格，再次查看并如此往复...

<?php 
$shape_nr=1;
$ln_max=count($array);
$cl_max=count($array[0]);
$done=[];

//LOOP ALL CELLS, GIVE 1's unique number
for($ln=0;$ln<$ln_max;++$ln){
for($cl=0;$cl<$cl_max;++$cl){
    if($array[$ln][$cl]===0)continue;
    $array[$ln][$cl] = ++$shape_nr;
}}

//DETECT SHAPES
for($ln=0;$ln<$ln_max;++$ln){
for($cl=0;$cl<$cl_max;++$cl){
    if($array[$ln][$cl]===0)continue;

    $shape_nr=$array[$ln][$cl];
    if(in_array($shape_nr,$done))continue;

    look_around($ln,$cl,$ln_max,$cl_max,$shape_nr,$array);
    //SET SHAPE_NR to DONE, no need to look at that number again
    $done[]=$shape_nr;
}}  

//LOOP THE ARRAY and COUNT SHAPENUMBERS
$res=array();
for($ln=0;$ln<$ln_max;++$ln){
for($cl=0;$cl<$cl_max;++$cl){
    if($array[$ln][$cl]===0)continue;
    if(!isset($res[$array[$ln][$cl]]))$res[$array[$ln][$cl]]=1;
    else $res[$array[$ln][$cl]]++;
}}

//get largest shape
$max = max($res);
$shape_value_max = array_search ($max, $res);

//get smallest shape
$min = min($res);
$shape_value_min = array_search ($min, $res);

// recursive function: detect connecting cells  
function look_around($ln,$cl,$ln_max,$cl_max,$nr,&$array){
    //create mini array
    $mini=mini($ln,$cl,$ln_max,$cl_max);
    if($mini===false)return false;

    //loop surrounding cells
    foreach($mini as $v){
        if($array[$v[0]][$v[1]]===0){continue;}
        if($array[$v[0]][$v[1]]!==$nr){
            // set shape_nr of connecting cell
            $array[$v[0]][$v[1]]=$nr;

            // follow the shape
            look_around($v[0],$v[1],$ln_max,$cl_max,$nr,$array);
            }
        }
    return $nr;
    }

// CREATE ARRAY WITH THE 9 SURROUNDING CELLS
function mini($ln,$cl,$ln_max,$cl_max){
    $look=[];   
    $mini=[[-1,-1],[-1,0],[-1,1],[0,-1],[0,1],[1,-1],[1,0],[1,1]];
    foreach($mini as $v){
        if( $ln + $v[0] >= 0 &&
            $ln + $v[0] < $ln_max &&
            $cl + $v[1] >= 0 &&
            $cl + $v[1] < $cl_max
            ){
            $look[]=[$ln + $v[0], $cl + $v[1]];
            }
        }

    if(count($look)===0){return false;}
    return $look;
    }

这里是一个代码示例

- Michel

1

如果您的数组大小不是很大，内存也不是问题，那么递归解决方案可能会更快。我在这里找到了一个C++算法，可以实现此功能： https://www.geeksforgeeks.org/find-length-largest-region-boolean-matrix/

- Mark D

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- MrSmith42 · Accepted Answer

我只能想到一些小的改进：

保持非空字段的链表。在第二步中，您不需要触及n²矩阵元素，您只需要触及链表中的元素。这取决于矩阵的稀疏程度，可能会减少很多。
你只需要比较右边，右下，左下和下方的方向。否则，其他方向已经从前一行/列检查过了。我的意思是：当我大于我的右邻居时，我可以改变右邻居的数字（同样适用于下方和右下角）。这将减少一半的比较次数。