考虑在BigQuery中使用的表结构:
Table User
{
user_id: STRING (REQUIRED)
user_name: STRING (REQUIRED)
actions: RECORD (REPEATED)
{
action_id: STRING (REQUIRED)
action_type: INTEGER (REQUIRED)
action_date: TIMESTAMP (REQUIRED)
}
}
我希望找到所有创建过某种类型动作超过一次的用户(user_id和user_name),并且这些动作之间的最短时间小于X天。
每个用户存储的动作数量没有定义(可以是1、2或n)。这些动作没有按任何标准排序(但我认为使用ORDER BY
可以解决这个问题)。
例如,对于以下用户:
{
user_id: "u1",
user_name: "User 1",
actions:
{action_id: "a1", action_type: 1, action_date: "2016-02-22"},
{action_id: "a2", action_type: 1, action_date: "2016-01-22"},
{action_id: "a3", action_type: 1, action_date: "2015-12-22"}
},
{
user_id: "u2",
user_name: "User 2",
actions:
{action_id: "a4", action_type: 1, action_date: "2016-02-22"},
{action_id: "a5", action_type: 2, action_date: "2016-01-22"},
{action_id: "a6", action_type: 1, action_date: "2015-12-22"}
},
{
user_id: "u3",
user_name: "User 3",
actions:
{action_id: "a7", action_type: 1, action_date: "2016-02-22"}
},
{
user_id: "u4",
user_name: "User 4",
actions:
{action_id: "a8", action_type: 1, action_date: "2016-02-22"},
{action_id: "a9", action_type: 1, action_date: "2015-02-22"},
{action_id: "a10", action_type: 1, action_date: "2015-01-22"}
},
如何在BigQuery上执行查询:“选择执行了类型为1
动作超过一次,并且每次执行之间最小时间少于45
天的用户”,应该返回User 1
和User 4
。
有什么想法可以实现吗?