在Golang中高效地将一对多、多对多数据库映射到结构体

Question

在Golang中高效地将一对多、多对多数据库映射到结构体

sqlgostructsqlx

44

问题

在处理Golang中的一对多或多对多SQL关系时，将行映射到结构体的最佳（高效，推荐，“Go-like”）方法是什么？

在下面的示例设置中，我已经尝试详细说明了一些方法以及每种方法的优缺点，但想知道社区的建议。

要求

与PostgreSQL配合使用（可以是通用的，但不包括MySQL / Oracle特定功能）
效率-不要蛮力每个组合
没有ORM-理想情况下仅使用database/sql和jmoiron/sqlx

示例

为了清晰起见，我已删除了错误处理

模型

type Tag struct {
  ID int
  Name string
}

type Item struct {
  ID int
  Tags []Tag
}

数据库

CREATE TABLE item (
  id                      INT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
);

CREATE TABLE tag (
  id                      INT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
  name                    VARCHAR(160),
  item_id                 INT REFERENCES item(id)
);

方法1 - 选择所有项目，然后为每个项目选择标签

var items []Item
sqlxdb.Select(&items, "SELECT * FROM item")

for i, item := range items {
  var tags []Tag
  sqlxdb.Select(&tags, "SELECT * FROM tag WHERE item_id = $1", item.ID)
  items[i].Tags = tags
}

优点

简单易懂
容易理解

缺点

随着项目数量的增加，数据库查询次数呈比例增长，效率低下

方法2 - 构建SQL连接并手动循环行

var itemTags = make(map[int][]Tag)

var items = []Item{}
rows, _ := sqlxdb.Queryx("SELECT i.id, t.id, t.name FROM item AS i JOIN tag AS t ON t.item_id = i.id")
for rows.Next() {
  var (
    itemID  int
    tagID   int
    tagName string
  )
  rows.Scan(&itemID, &tagID, &tagName)
  if tags, ok := itemTags[itemID]; ok {
    itemTags[itemID] = append(tags, Tag{ID: tagID, Name: tagName,})
  } else {
    itemTags[itemID] = []Tag{Tag{ID: tagID, Name: tagName,}}
  }
}
for itemID, tags := range itemTags {
  items = append(Item{
    ID: itemID,
    Tags: tags,
  })
}

优点

只需进行一次数据库调用和游标循环，不会占用太多内存。

缺点

多个连接和许多结构属性使它复杂且难以开发。
性能不佳，相比于更多的网络调用，会使用更多的内存和处理时间。

失败的尝试3 - sqlx struct scanning

虽然失败，但我仍希望包括这种方法，因为我认为这是目前在开发简单性与效率方面的最佳选择。我的希望是通过显式设置每个结构字段上的db标签，sqlx可以进行一些高级结构扫描。

var items []Item
sqlxdb.Select(&items, "SELECT i.id AS item_id, t.id AS tag_id, t.name AS tag_name FROM item AS i JOIN tag AS t ON t.item_id = i.id")

很不幸，这会报错，错误信息为在*[]Item中缺少目标名称tag_id，这让我认为StructScan不足以递归地循环遍历行（没有批评之意——这是一个复杂的场景）。

可能的解决方案4 - PostgreSQL数组聚合器和GROUP BY 虽然我确定这不会起作用，但我已经包含了这个未经测试的选项，看看它是否可以改进，以便它能够工作。

var items = []Item{}
sqlxdb.Select(&items, "SELECT i.id as item_id, array_agg(t.*) as tags FROM item AS i JOIN tag AS t ON t.item_id = i.id GROUP BY i.id")

我有时间的话，会在这里尝试运行一些实验。

- Ewan

关于您的 sqlx 尝试：您的“Item”结构类型如何定义？ - LeGEC

@LeGEC - 在我上面的问题的“模型”部分有详细说明 - 在这个简单的例子中只有一个ID int和一个Tags []Tag数组。实际上，这要复杂得多。 - Ewan

5个回答

7

我可以提供另一种方法，我以前用过。

你可以在查询中将标签制作成json并返回。

优点：只需要调用一次数据库来汇总数据，然后将json解析成数组即可。

缺点：有点丑陋。如果有问题请随时指出。

type jointItem struct {
  Item 
  ParsedTags string
  Tags []Tag `gorm:"-"`
}

var jointItems []*jointItem
db.Raw(`SELECT 
  items.*, 
  (SELECT CONCAT(
            '[', 
             GROUP_CONCAT(
                  JSON_OBJECT('id', id,
                             'name', name 
                  )
             ), 
            ']'
         )) as parsed_tags 
   FROM items`).Scan(&jointItems)

for _, o := range jointItems {
var tempTags []Tag
   if err := json.Unmarshall(o.ParsedTags, &tempTags) ; err != nil {
      // do something
   }
  o.Tags = tempTags
}

编辑：由于代码的行为可能会变得奇怪，因此我发现在移动时使用临时标记数组比使用相同的结构体更好。

- Muhamed Keta

感谢提供另一种方法。我已经点赞，因为我认为这绝对是一个选项，尽管你说得对 - 它不太美观 - 并且将其添加到具有20多个字段的结构体中会增加更多复杂性和另一个需要维护的映射。然而，我不会接受它，因为我希望有一个更“优雅”的解决方案。 - Ewan

这真的很聪明！！丑陋的外观可以修复，但这完全可以使用常规的group by和自定义的json unmarshal来实现。 - Timo Huovinen

2

你可以使用来自https://github.com/jackskj/carta的carta.Map()，它会自动跟踪has-many关系。

- jackskj

1

检查下面的代码，使用自定义扫描和array_to_json(array_agg(tags))可以正常工作。模型

type Tag struct {
    ID     int
    ItemID int
    Name   string
}

type Item struct {
    ID   int
    Tags TagList
}

type TagList []Tag

func (t *TagList) Scan(src any) error {
    return json.Unmarshal(src.([]byte), t)
}

方法

rows, _ := db.Query(`
SELECT i.id, array_to_json(array_agg(t)) FROM items i
LEFT JOIN tags t ON t.item_id=i.id
GROUP BY i.id
`)
var items = []Item{}
for rows.Next() {
    var item = Item{
        Tags: []Tag{},
    }
    rows.Scan(&item.ID, &item.Tags)
    items = append(items, item)
}

也许使用jsonb_agg()比使用array_to_json(array_agg(tags))更高效。

- Saba

这是一个非常好的解决方案，使用了Go标准库，谢谢！ - undefined

是的，可能上面的array_to_json是不必要的，但我不确定在上面的Scan函数中如何解析它而不使用json解析器。 - undefined

-1

我写了一个库，试图改进一些我在sqlx库中发现的问题，其中一个改进正是你的使用案例：

库：github.com/vingarcia/ksql 用法：

(为了简洁起见，我也省略了错误处理)

type Tag struct {
  ID   int    `ksql:"id"`
  Name string `ksql:"name"`
}

type Item struct {
  ID   int   `ksql:"id"`
  Tags []Tag
}

// This is the target variable where we'll load the DB results:
var rows []struct{
    Item Item `tablename:"i"` // i is the alias for item on the query
    Tag  Tag  `tablename:"t"` // t is the alias for tag on the query
}
// When using the `tablename` above you need to start your query from `FROM`
// so KSQL can build the SELECT part based on the struct tags for you:
_ = ksqldb.Query(ctx, &rows, "FROM item AS i JOIN tag AS t ON t.item_id = i.id")

这仍然无法在Item.Tags属性上插入标签，所以您必须自己完成，这可能会很复杂，不是因为KSQL，而是因为您将有多行具有相同的ItemID，这将强制您使用映射来去重。

就性能而言，这种解决方案还存在另一个问题，即您实际上是将整个数据库加载到内存中，这可能会导致应用程序出现内存不足错误。

因此，更好的替代方案实际上很大程度上取决于您的具体用例。

我将提出两个解决方案：

解决方案1

如果项目数量实际上并不是整个数据库，并且具有尽可能小的内存占用量不是要求，那么只需接受更多的查询数量。

此外，拥有较短的查询并且不必多次返回相同的ID对于您的数据库来说实际上是有益的，因为数据库是共享资源，可能是单点故障或瓶颈，将负载从数据库中移除并转移到微服务中通常是一个不错的选择：

// Defining a smaller struct so we don't use more memory than necessary:
var items []struct{
    ID int `ksql:"id"`
}
_ = ksqldb.Query(ctx, &items, "SELECT id FROM item WHERE some_criteria = $1", someCriteria)

for _, item := range items {
    var tags []Tags
    _ = ksqldb.Query(ctx, "FROM tags WHERE tags.id = $1", item.ID)
    
    // Do something with it as soon as possible,
    // so you don't have to keep it in memory:
    DoSomethingWithItem(Item{
        ID: item.ID,
        Tags: tags,
    })

    // Alternatively you can add it to a slice of items:
    completedItems = append(completedItems, Item{ID: item.ID, Tags: tags})
}

这个解决方案的总时间较慢，主要是因为微服务和数据库之间的往返次数较多。就网络负载和数据库负载而言，这几乎与效率相当，也就是说它具有良好的可扩展性。不过，如果你只加载少量项目，那么这里的总时间应该不会太显著。

第二种解决方案

如果你需要在微服务端尽可能快速，并且需要非常小的内存占用，即一次性不会加载大量项目到内存中，那么你应该按数据块进行处理，而KSQL也支持这种方式：

type row struct{
    Item Item `tablename:"i"`
    Tag  Tag  `tablename:"t"`
}

// Here we are building each item one at a time, with all its tags, and
// then doing something with it as soon as we get to the next item.
//
// Note that for this to work we added a ORDER BY clause to the query.

var currentItem Item
_ = ksqldb.QueryChunks(ctx, ksql.ChunkParser{
    Query: "FROM item AS i JOIN tag AS t ON t.item_id = i.id ORDER BY i.id",
    ChunkSize: 100, // Load 100 rows at a time
    ForEachChunk: func(rows []row) error {
        for _, row := range rows {
            if currentItem.ID == 0 {
                currentItem = row.Item
            } else if row.Item.ID != currentItem.ID {
                // If we finished receiving one item:
                DoSomethingWithCurrentItem(currentItem)
                // Set the current item variable to the new Item:
                currentItem = row.Item
            }
            
            // Collect the tags of that item, one by one:
            currentItem.Tags = append(currentItem.Tags, rows.Tag)
        }
    },
})

// Do something with the last item you were parsing:
DoSomethingWithCurrentItem(currentItem)

我通常不会采用这种方法，因为一次性加载这么多数据到内存中很少是一个要求，并且这段代码比仅仅执行多个查询要复杂得多。但如果那是一个要求，我会这样做。如果您没有使用KSQL，您还可以在database/sql或sqlx库上使用类似的rows.Next()实现。

- VinGarcia

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Tsingson Qin · Accepted Answer

PostgreSQL 中的 SQL：

create schema temp;
set search_path = temp;
create table item
(
  id INT generated by default as identity primary key
);

create table tag
(
  id      INT generated by default as identity primary key,
  name    VARCHAR(160),
  item_id INT references item (id)
);

create view item_tags as
select id,
  (
          select
            array_to_json(array_agg(row_to_json(taglist.*))) as array_to_json
          from (
                select tag.name, tag.id
                 from tag
                         where item_id = item.id
               ) taglist ) as tags
from item ;


-- golang query this maybe 
select  row_to_json(row)
from (
    select * from item_tags
) row;

然后Golang查询这个SQL：

select  row_to_json(row)
from (
    select * from item_tags
) row;

并将数据反序列化为go结构体:

优点:

PostgreSQL管理数据之间的关系。使用SQL函数添加/更新数据。
Golang管理业务模型和逻辑。

这是一个简单的方法。

.