确定哪些对象与主根对象链接或未链接。

10
我试图通过一堆链接到其他对象的对象导航。我想从最低的ID号(根对象)开始, 根据连接的链接遍历每个对象。一些对象链接将循环回先前的对象,因此我要确保每个只看一次,否则会陷入无限循环。我还希望能够告诉哪些对象不能通过从第一个链接开始通过链接进行访问。
我的数据库表格如下:
对象表(Object Table):
+----+---------+
| id | title   |
+----+---------+
|  1 | Apple   |
|  3 | Carrot  |
|  4 | Dill    |
|  5 | Egg     |
|  6 | Fred    |
|  7 | Goat    |
|  8 | Harry   |
|  9 | Igloo   |
| 10 | Jason   |
| 11 | Klaus   |
| 12 | Banana  |
| 15 | Oyster1 |
| 16 | Oyster2 |
+----+---------+

对象链接表:

+----+---------+--------------+
| id |  obj_id |  obj_link_id |
+----+---------+--------------+
|  1 |       1 |           12 |
|  2 |       1 |            5 |
|  3 |       3 |            1 |
|  4 |       3 |           12 |
|  5 |       3 |            3 |
|  6 |       4 |            1 |
|  7 |       4 |            5 |
|  8 |       5 |            6 |
|  9 |       6 |            7 |
| 10 |       7 |            7 |
| 11 |       7 |            8 |
| 12 |       9 |           12 |
| 13 |       9 |            5 |
| 14 |      10 |            1 |
| 15 |      10 |            5 |
| 16 |      10 |            8 |
| 17 |      11 |            1 |
| 18 |      11 |            5 |
| 19 |      11 |           10 |
| 20 |      12 |            3 |
| 21 |      15 |           16 |
| 22 |      16 |           15 |
+----+---------+--------------+

那么,从表格中可以看到对象1与对象12和5都有链接。

我的MySQL查询代码如下:

select  object.id, title, obj_link_id
    from  object
    left join  object_links  ON object.id = object_links.object_id
    order by  object.id 

生成以下表格:

+----+---------+--------------+
| id | title   |  obj_link_id |
+----+---------+--------------+
|  1 | Apple   |           12 |
|  1 | Apple   |            5 |
|  3 | Carrot  |            1 |
|  3 | Carrot  |           12 |
|  3 | Carrot  |            3 |
|  4 | Dill    |            1 |
|  4 | Dill    |            5 |
|  5 | Egg     |            6 |
|  6 | Fred    |            7 |
|  7 | Goat    |            7 |
|  7 | Goat    |            8 |
|  8 | Harry   |         NULL |
|  9 | Igloo   |           12 |
|  9 | Igloo   |            5 |
| 10 | Jason   |            1 |
| 10 | Jason   |            5 |
| 10 | Jason   |            8 |
| 11 | Klaus   |            1 |
| 11 | Klaus   |            5 |
| 11 | Klaus   |           10 |
| 12 | Banana  |            3 |
| 15 | Oyster1 |           16 |
| 16 | Oyster2 |           15 |
+----+---------+--------------+

我在 PHP 中使用:

$objects = $stmt->fetchAll(PDO::FETCH_CLASS);

我不确定是否有更好的方法来获取这些内容,所以非常欢迎提出建议。

print_r($objects) 的输出结果为:

Array
(
    [0] => stdClass Object
        (
            [id] => 1
            [title] => Apple
            [obj_link_id] => 12
        )

    [1] => stdClass Object
        (
            [id] => 1
            [title] => Apple
            [obj_link_id] => 5
        )

    [2] => stdClass Object
        (
            [id] => 3
            [title] => Carrot
            [obj_link_id] => 1
        )

    [3] => stdClass Object
        (
            [id] => 3
            [title] => Carrot
            [obj_link_id] => 12
        )

    [4] => stdClass Object
        (
            [id] => 3
            [title] => Carrot
            [obj_link_id] => 3
        )

    [5] => stdClass Object
        (
            [id] => 4
            [title] => Dill
            [obj_link_id] => 1
        )

    [6] => stdClass Object
        (
            [id] => 4
            [title] => Dill
            [obj_link_id] => 5
        )

    [7] => stdClass Object
        (
            [id] => 5
            [title] => Egg
            [obj_link_id] => 6
        )

    [8] => stdClass Object
        (
            [id] => 6
            [title] => Fred
            [obj_link_id] => 7
        )

    [9] => stdClass Object
        (
            [id] => 7
            [title] => Goat
            [obj_link_id] => 7
        )

    [10] => stdClass Object
        (
            [id] => 7
            [title] => Goat
            [obj_link_id] => 8
        )

    [11] => stdClass Object
        (
            [id] => 8
            [title] => Harry
            [obj_link_id] =>
        )

    [12] => stdClass Object
        (
            [id] => 9
            [title] => Igloo
            [obj_link_id] => 12
        )

    [13] => stdClass Object
        (
            [id] => 9
            [title] => Igloo
            [obj_link_id] => 5
        )

    [14] => stdClass Object
        (
            [id] => 10
            [title] => Jason
            [obj_link_id] => 1
        )

    [15] => stdClass Object
        (
            [id] => 10
            [title] => Jason
            [obj_link_id] => 5
        )

    [16] => stdClass Object
        (
            [id] => 10
            [title] => Jason
            [obj_link_id] => 8
        )

    [17] => stdClass Object
        (
            [id] => 11
            [title] => Klaus
            [obj_link_id] => 1
        )

    [18] => stdClass Object
        (
            [id] => 11
            [title] => Klaus
            [obj_link_id] => 5
        )

    [19] => stdClass Object
        (
            [id] => 11
            [title] => Klaus
            [obj_link_id] => 10
        )

    [20] => stdClass Object
        (
            [id] => 12
            [title] => Banana
            [obj_link_id] => 3
        )

    [21] => stdClass Object
        (
            [id] => 15
            [title] => Oyster1
            [obj_link_id] => 16
        )

    [22] => stdClass Object
        (
            [id] => 16
            [title] => Oyster2
            [obj_link_id] => 15
        )

)
请注意,括号中的数字仅为数组索引,而非对象 ID 编号,请不要被索引所迷惑。
我正在尝试找到一种确定哪些是已链接对象和未链接对象的方法。根据上述情况,对象应该分别如下:
**Linked:**

    Apple
    Banana
    Carrot
    Egg
    Fred
    Goat
    Harry

**Not Linked:**

    Dill
    Igloo
    Jason
    Klaus
    Oyster1
    Oyster2

我主要的问题是:

在 PHP 中,当每个对象都可能有多个链接时,如何创建一个循环以遍历这样的结构?最终,我想要产生两个对象集合,一个包含已链接的对象,另一个包含未链接的对象。示例集合可能如下所示:

stdClass Object
(
  [LinkedElements] => stdClass Object
    (
      [1] => stdClass Object
        (
          [id] => 1
          [name] => Apple
          [link] => Array
            (
              [0] => 14
              [1] => 5
            )

        )

      [14] => stdClass Object
        (
          [id] => 14
          [name] => Banana
          [link] => Array
            (
              [0] => 3
            )

        )

      [3] => stdClass Object
        (
          [id] => 3
          [name] => Carrot
          [link] => Array
            (
              [0] => 1
              [1] => 14
              [2] => 3
            )

        )

      [5] => stdClass Object
        (
          [id] => 5
          [name] => Egg
          [link] => Array
            (
              [0] => 6
            )

        )

      [6] => stdClass Object
        (
          [id] => 6
          [name] => Fred
          [link] => Array
            (
              [0] => 7
            )

        )

      [7] => stdClass Object
        (
          [id] => 7
          [name] => Goat
          [link] => Array
            (
              [0] => 7
              [1] => 8
            )

        )

      [8] => stdClass Object
        (
          [id] => 8
          [name] => Harry
        )

    )

  [UnLinkedElements] => stdClass Object
    (
      [4] => stdClass Object
        (
          [id] => 4
          [name] => Dill
          [link] => Array
            (
              [0] => 1
              [1] => 5
            )

        )

      [9] => stdClass Object
        (
          [id] => 9
          [name] => Igloo
          [link] => Array
            (
              [0] => 14
              [1] => 5
            )

        )

      [10] => stdClass Object
        (
          [id] => 10
          [name] => Jason
          [link] => Array
            (
              [0] => 1
              [1] => 5
              [2] => 8
            )

        )

      [11] => stdClass Object
        (
          [id] => 11
          [name] => Klaus
          [link] => Array
            (
              [0] => 1
              [1] => 5
              [2] => 10
            )

        )

      [15] => stdClass Object
        (
          [id] => 15
          [name] => Oyster1
          [link] => Array
            (
              [0] => 16
            )

        )

      [16] => stdClass Object
        (
          [id] => 16
          [name] => Oyster2
          [link] => Array
            (
              [0] => 15
            )

        )

    )

)

请注意:

4个回答

1

这是一个图遍历问题。从一个节点(根节点)开始,您希望遍历图并跟踪沿途访问的每个节点。遍历结束后,访问过的节点将被连接起来。可以使用广度优先搜索来完成:

//To form a graph fetch all objects from the database (sorted by id) and 
//index them in a hash map
$objects = $stmt->fetchAll(PDO::FETCH_OBJ);
$nodes = [];
foreach ($objects as $object) {
    $nodes[$object->id] = new Node($object);
}

//fetch all connections from the database and link the objects
$links = $stmt->fetchAll(PDO::FETCH_OBJ);
foreach ($links as $link) {
    $nodes[$link->obj_id]->addLink($nodes[$link->obj_link_id]);
}

//let's assume root is the first node (sorted by id), 
//traverse the graph starting from root
$root = reset($nodes);
$root->traverse();

//now if a node can be reached by the root it is marked as visited
$linked = [];
$notLinked = [];
foreach ($nodes as $node) {
    if ($node->isVisited()) {
        $linked[] = $node;
    } else {
        $notLinked[] = $node;
    }
}

并且节点类:

class Node
{

    /**
     * List of neighbor nodes.
     *
     * @var Node[]
     */
    private $links = [];

    /**
     * id, title info
     *
     * @var array
     */
    private $data = [];

    /**
     * To track visited nodes.
     *
     * @var bool
     */
    private $visited = false;

    /**
     * Node constructor.
     * @param array $data
     */
    public function __construct($data)
    {
        $this->data = $data;
    }


    /**
     * Add a link to this node.
     *
     * @param Node $node
     * @return void
     */
    public function addLink(Node $node)
    {
        $this->links[] = $node;
    }

    /**
     * Traverse the graph in a Breadth-First-Search fashion marking
     * every visited node.
     * @return void
     */
    public function traverse()
    {
        //initialize queue
        $q = new SplQueue();

        //add root to queue and mark as visited
        $q->enqueue($this);
        $this->visited = true;

        while (!$q->isEmpty()) {
            /** @var Node $cur */
            $cur = $q->dequeue();

            foreach ($cur->links as $link) {

                //if link not visited already add it to queue and mark visited
                if (!$link->visited) {
                    $link->visited = true;
                    $q->enqueue($link);
                }
            }
        }
    }

    /**
     * Checks if node has been visited.
     *
     * @return bool
     */
    public function isVisited()
    {
        return $this->visited;
    }

}

当我运行此代码并尝试执行print_r($linked)或print_r($notLinked)时,我会得到一个巨大的嵌套对象列表,其中包含打印出的信息“* RECURSION *”。这是否是我应该期望的结果? - kojow7
$linked$notLinked 是节点对象的集合。根据您的应用程序,您希望它们(节点对象)执行有用的操作(使用构造函数中提供的数据)。您还可以使用类似于 getData() 的方法仅返回数据。由您决定。重点是,$linked 应该包含可以从根节点访问的节点,而 $notLinked 包含图中其余的节点。 - Weltschmerz

1

在我看来,将数据处理为两个单独的数组会更容易。一个对象集合和它们之间的链接。另外,作为第一部分,我将对象转换为以ID为键,这允许我直接使用它,而不必每次都搜索ID。

此外,为了使解决方案更简单,我在访问对象数组时创建了一个标志,这样当我尝试再次引用它时,我可以检查它是否已被访问过。

<?php 
error_reporting ( E_ALL );
ini_set ( 'display_errors', 1 );

$objects =[[1,'apple'],
        [3, 'Carrot'],
        [4, 'Dill'],
        [5, 'Egg '],
        [6, 'Fred'],
        [7, 'Goat'],
        [8, 'Harry'],
        [9, 'Igloo'],
        [10, 'Jason'],
        [11, 'Klaus'],
        [12, 'Banana'],
        [15, 'Oyster1'],
        [16, 'Oyster2' ]];
$links =[[1,12],
        [1,5],
        [3,1],
        [3,12],
        [3,3],
        [4,1],
        [4,5],
        [5,6],
        [6,7],
        [7,7],
        [7,8],
        [8,null],
        [9,12],
        [9,5],
        [10,1],
        [10,5],
        [10,8],
        [11,1],
        [11,5],
        [11,10],
        [12,3],
        [15,16],
        [16,15 ]];

function buildTree ( $id, &$objects, $links )   {
    foreach ( $links as $testNode )    {
        if ( $testNode[0] == $id && 
                $testNode[1] != $id &&
                $testNode[1] != null &&
                !isset($objects[$testNode[1]]['visited']) )    {
            $objects[$testNode[1]]['visited'] = true;
            buildTree ( $testNode[1], $objects, $links);
        }
    }
}

// Convert array to have the ID as key
$objects = array_combine(array_column($objects, 0), $objects);
// Fetch ID of first item
reset($objects);
$startID = key($objects);
// Mark as visited
$objects[$startID]['visited'] = true;
// Process
buildTree ( $startID, $objects, $links);

$linked = [];
$notLinked = [];
foreach ( $objects as $object)  {
    if ( isset($object['visited']) )    {
        $linked[] = $object[1];
    }
    else    {
        $notLinked[] = $object[1];
    }
}
echo "Linked...".PHP_EOL;
print_r($linked);

echo "Not linked...".PHP_EOL;
print_r($notLinked);

正如您所看到的,核心是递归的buildTree函数。这使用&$objects,因为这意味着对函数的所有调用都将使用相同的数组。由于我想建立所有项目的用途,这一点非常重要。

buildTree中的条件检查是否为我们想要的节点,它不是引用同一节点(继续查找浪费时间),不是null(不确定为什么链接到null,但也不值得再次查找),并且该节点尚未被访问。如果这些条件都正确,则将下一个节点标记为已访问,并进入下一级。

输出结果是...

Linked...
Array
(
    [0] => apple
    [1] => Carrot
    [2] => Egg 
    [3] => Fred
    [4] => Goat
    [5] => Harry
    [6] => Banana
)
Not linked...
Array
(
    [0] => Dill
    [1] => Igloo
    [2] => Jason
    [3] => Klaus
    [4] => Oyster1
    [5] => Oyster2
)

从技术上讲,对象8没有链接,这就是为什么SQL查询的结果表明它链接到NULL的原因。我最初是将这两个表连接在一起,但根据您和Weltschmerz的答案,似乎不连接它们会是更好的方法。 - kojow7
更多的是要意识到某些事物的含义,如果你包含它,那么就要适当地处理它,而排除它可能更容易。只要你能应对每种可能性,那么就没问题了,但当我没有将其排除时,我的代码最初创建了一个指向“null”新条目的链接,看起来很奇怪。 - Nigel Ren
我发现可以通过使用“PDO::FETCH_GROUP|PDO::FETCH_UNIQUE”从数据库中获取数据,使您的代码更加高效。这样一来,我就不需要转换数组,并且可以消除array_combine行。 - kojow7
1
知道如何改进你所编写的任何代码总是很有用的,感谢提供这些信息 :) - Nigel Ren

0
假设“根”是obj_id 1
这里有一个有点蛮力的算法,但它利用了SQL对“set”操作的喜好。
Insert into table1 the root (1)
Loop
    Create table2 with all nodes linked to any node in table1
    Exit if number of rows in table2 = num rows in table1
    table1 := table2

更接近 SQL:

# Initialize:
CREATE TABLE table1 (
    obj_id ...
    PRIMARY KEY(obj_id)
)
    SELECT 1;   # assuming root is 1

start of loop:
CREATE TABLE table2 (
    obj_id ...
    PRIMARY KEY(obj_id)
)
    SELECT DISTINCT obj_link_id
        FROM table1
        JOIN object_links USING(obj_id);

SELECT @fini := ( SELECT COUNT(*) FROM table1 ) = 
                ( SELECT COUNT(*) FROM table2 )   # will give true/false
DROP TABLE table1;
RENAME TABLE table2 TO table1;
loop if @fini=0

输出的是所有“已连接”的ID。如果您想要未连接的ID:

SELECT obj_id
    FROM object_links
    LEFT JOIN table1 USING(obj_id)
    WHERE table1.obj_id IS NULL;   # that is, missing from table1

1
我认为我更喜欢用PHP来实现这个,而不是在数据库中创建新表。不过还是谢谢你的帮助! - kojow7
PHP的array_merge可以用于将数组“tables2”合并到“table1”中。因此,问题实际上是关于算法,而不是PHP或MySQL。 - Rick James
我也有点困惑,不确定你指的是哪两个表。我已经在我的SQL选择语句中合并了Object表和Object Links表。 - kojow7
我的错。我应该使用object_links - Rick James

0

以下是一个非常简短的方法来获取所有链接ID:

$pdo = new PDO('mysql:host=localhost;dbname=test_obj_link', 'testread', 'testread');

$links = $pdo
    ->query('select obj_id, obj_link_id from object_links')
    ->fetchAll(PDO::FETCH_GROUP|PDO::FETCH_COLUMN);

function getLinks($objId, $links,  $indirectNodes = []) {
    $directNodes = isset($links[$objId]) ? $links[$objId] : [];
    foreach($directNodes as $linkedNode) {
        if (!in_array($linkedNode, $indirectNodes)) {
            $indirectNodes[] = $linkedNode;
            $indirectNodes = array_unique(array_merge(
                $indirectNodes,
                getLinks($linkedNode, $links, $indirectNodes)
            ));
        }
    }
    return $indirectNodes;
}

$linkedObjectIds = getLinks(1, $links);

fetchAll(PDO::FETCH_GROUP|PDO::FETCH_COLUMN) 将返回一个结构化数组,每个对象的链接都由对象ID索引,看起来像这样:

$links = [
    1  => ['5', '12'],
    3  => ['1', '3', '12'],
    4  => ['1', '5'],
    5  => ['6'],
    6  => ['7'],
    7  => ['7', '8'],
    9  => ['5', '12'],
    10 => ['1', '5', '8'],
    11 => ['1', '5', '10'],
    12 => ['3'],
    15 => ['16'],
    16 => ['15'],
];

getLinks函数将递归“遍历”$links数组,并合并在途中找到的所有子数组。由于PHP似乎没有array_union函数,因此使用array_unique(array_merge(..))代替。

结果:

$linkedObjectIds = array (
  0 => '5',
  1 => '6',
  2 => '7',
  3 => '8',
  4 => '12',
  10 => '3',
  11 => '1',
)

请注意这里的索引没有意义。
要获取相应的对象,您可以执行以下操作:
$objects = $pdo
    ->query('select id, title from object')
    ->fetchAll(PDO::FETCH_KEY_PAIR);
$linkedObjects = array_intersect_key($objects, array_flip($linkedObjectIds));
$notLinkedObjects = array_diff_key($objects, $linkedObjects);

这些变量将包含:

$objects = array (
  1 => 'Apple',
  3 => 'Carrot',
  4 => 'Dill',
  5 => 'Egg',
  6 => 'Fred',
  7 => 'Goat',
  8 => 'Harry',
  9 => 'Igloo',
  10 => 'Jason',
  11 => 'Klaus',
  12 => 'Banana',
  15 => 'Oyster1',
  16 => 'Oyster2',
);
$linkedObjects = array (
  1 => 'Apple',
  3 => 'Carrot',
  5 => 'Egg',
  6 => 'Fred',
  7 => 'Goat',
  8 => 'Harry',
  12 => 'Banana',
);
$notLinkedObjects = array (
  4 => 'Dill',
  9 => 'Igloo',
  10 => 'Jason',
  11 => 'Klaus',
  15 => 'Oyster1',
  16 => 'Oyster2',
);

演示:http://rextester.com/ZQQGE35352

请注意,in_array()array_unique()可能很慢,因为它们必须在未排序的值中搜索。这可能会导致某些数据集的性能问题。假设PHP可以更快地搜索键,我们可以使用array_key_exists()代替in_array(),并使用数组运算符+(按键合并)代替array_unique(array_merge())。而且代码甚至会更短:

function getLinks($objId, $links,  $indirectNodes = []) {
    $directNodes = isset($links[$objId]) ? $links[$objId] : [];
    foreach($directNodes as $linkedNode) {
        if (!array_key_exists($linkedNode, $indirectNodes)) {
            $indirectNodes[$linkedNode] = 0;
            $indirectNodes = $indirectNodes + getLinks($linkedNode, $links, $indirectNodes);
        }
    }
    return $indirectNodes;
}

然而,由于该函数现在返回所需结果作为键,我们需要使用array_keys()来提取它们:
$linkedObjectIds = array_keys(getLinks(1, $links));

演示:http://rextester.com/GHO7179

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接