MySQL: calculating team rank from different rows

Question

4

I am trying to build a quiz-like game. For it, I have the following tables:

teams

CREATE TABLE `teams` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `creator_id` int(11) NOT NULL,
  `friend_id` int(11) DEFAULT NULL,
  `team_name` varchar(128) NOT NULL,
  PRIMARY KEY (`id`)
);

team_log

CREATE TABLE IF NOT EXISTS `progress_tracker` (
  `id` int(8) NOT NULL AUTO_INCREMENT,
  `user_id` int(8) NOT NULL,
  `team_id` int(11) NOT NULL,
  `date` date NOT NULL,
  `clues_found` int(11) NOT NULL,
  `clues_to_find` int(11) NOT NULL,
  PRIMARY KEY (`id`)
);

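Each team consists of two users. Each user starts with a certain number of clues, and that number can go up or down; there is no guarantee that the highest number is the latest one. I need to get a ranking of the teams (as a percentage) based on the average, over the two users in the team, of the clues each user has found since joining: clues_found on the row with the maximum date minus clues_found on the row with the minimum date.

For example, if I have the following data in each table:

teams table data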

+--------+------------+------------+---------------+
|     id | creator_id | friend_id  |   team_name   |
+--------+------------+------------+---------------+
|      1 |         25 |         28 |         Test1 |
|      2 |         31 |          5 |         Test2 |
+--------+------------+------------+---------------+

team_log table data

+--------+---------+---------+------------+-------------+---------------+
|     id | user_id | team_id |    date    | clues_found | clues_to_find |
+--------+---------+---------+------------+-------------+---------------+
|      1 |      25 |       1 | 2013-01-6  |           3 |            24 |
|      2 |      25 |       1 | 2013-01-8  |           7 |            24 |
|      3 |      25 |       1 | 2013-01-10 |          10 |            24 |
|      4 |      28 |       1 | 2013-01-8  |           5 |            30 |
|      5 |      28 |       1 | 2013-01-14 |          20 |            30 |
|      6 |      31 |       2 | 2013-01-11 |           6 |            14 |
|      7 |       5 |       2 | 2013-01-9  |           2 |            20 |
|      8 |       5 |       2 | 2013-01-10 |          10 |            20 |
|      9 |       5 |       2 | 2013-01-12 |          14 |            20 |
+--------+---------+---------+------------+-------------+---------------+

Expected result

+-------------+---------------------+
|     team_id |   team_percentage   |
+-------------+---------------------+
|           1 |         39,58333333 |
|           2 |         30          |
+-------------+---------------------+

For reference, here is an intermediate representation that may help in understanding it:

+-------------+---------+---------------------+
|     user_id | team_id | percentage_per_user |
+-------------+---------+---------------------+
|          25 |       1 | 29,16666667         |
|          28 |       1 | 50                  |
|          31 |       2 | 0                   |
|           5 |       2 | 60                  |
+-------------+---------+---------------------+
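As a rough cross-check of the figures above, the arithmetic can be sketched in plain Python. This is only an illustration under stated assumptions: the function names `user_percentages` and `team_percentages` are hypothetical, each user's first and last rows are chosen by date, and the divisor is the `clues_to_find` value on the user's latest row (it is constant per user in the sample data).

```python
from collections import defaultdict
from datetime import date

# Sample rows from the team_log data above:
# (user_id, team_id, date, clues_found, clues_to_find)
ROWS = [
    (25, 1, date(2013, 1, 6), 3, 24),
    (25, 1, date(2013, 1, 8), 7, 24),
    (25, 1, date(2013, 1, 10), 10, 24),
    (28, 1, date(2013, 1, 8), 5, 30),
    (28, 1, date(2013, 1, 14), 20, 30),
    (31, 2, date(2013, 1, 11), 6, 14),
    (5, 2, date(2013, 1, 9), 2, 20),
    (5, 2, date(2013, 1, 10), 10, 20),
    (5, 2, date(2013, 1, 12), 14, 20),
]


def user_percentages(rows):
    """Return {(team_id, user_id): percentage} from each user's first and last row by date."""
    by_user = defaultdict(list)
    for user_id, team_id, day, found, to_find in rows:
        by_user[(team_id, user_id)].append((day, found, to_find))

    result = {}
    for key, entries in by_user.items():
        entries.sort(key=lambda entry: entry[0])  # order by date, not by clues_found
        first, last = entries[0], entries[-1]
        # Assumption: divide by clues_to_find from the latest row.
        result[key] = (last[1] - first[1]) * 100 / last[2]
    return result


def team_percentages(rows):
    """Return {team_id: average of its users' percentages}."""
    per_team = defaultdict(list)
    for (team_id, _user_id), pct in user_percentages(rows).items():
        per_team[team_id].append(pct)
    return {team_id: sum(pcts) / len(pcts) for team_id, pcts in per_team.items()}
```

Calling `team_percentages(ROWS)` on the sample data should give roughly 39.58 for team 1 and 30 for team 2, in line with the expected result.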

So far I have the following SQL statement:

SELECT STRAIGHT_JOIN
      tl2.team_id, (tl2.weight - tl1.weight)*100/tl2.clues_to_find
   from 
       ( select
               team_id,user_id,clues_found
            FROM 
               `team_log` 
            where 1

            group by
               team_id, user_id
            order by
               `date` ) base
       join (select team_id, user_id, clues_found, clues_to_find from `team_log` where user_id = base.user_id and team_id = base.team_id group by team_id, user_id order by `date` desc) tl2

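But this returns an error, because I am not allowed to use base.user_id inside the second query. I am also not sure whether I am heading in the right direction. Can anyone help?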

- jribeiro

1

Please add a line break after the "team_log table data" heading so that we can see it properly. - Roy

2

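Can you explain this statement: "I need to get a ranking of the teams (as a percentage) based on the average number of clues found by users since joining (for all users in the team), using clues_found on the maximum date minus clues_found on the minimum date"? When I apply it to the second team, I get -12: the total number of clues is 32 and the average is 8. Then I subtract 14 and 6. - Gordon Linoff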

1

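@jribeiro That line leaves an ambiguous impression... :) Sorry. Do you mean the average of the clues found on the first day and the last day, per user per team? It doesn't seem to give as large a number as the one shown in your expected result.. - bonCodigo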

@GordonLinoff First of all, each user has their own target, so this is calculated per user. For user 31 in the second team: (6-6)*100/14 = 0%. For user 5: (14-2)*100/20 = 60%. - jribeiro

@bonCodigo Yes, exactly. I made a mistake with user 31: the result should be 0%, not 57%, because the initial and final values are the same. - jribeiro

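@jribeiro Please check the answer I posted. Sorry for mixing up your logic; I am not sure the answer I posted is right, so perhaps you can comment after trying it :) Either that, or the data I took from your question differs from your updated data... please clarify. - bonCodigo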

4 Answers

1

SQLFiddle

SELECT `team_id`,
  (SUM(CASE WHEN b.`date` IS NULL THEN 0 ELSE `clues_found` * 100 / `clues_to_find` END) -
  SUM(CASE WHEN c.`date` IS NULL THEN 0 ELSE `clues_found` * 100 / `clues_to_find` END)) / 2
FROM `team_log` a
  LEFT JOIN (
    SELECT `team_id`, `user_id`, MAX(date) AS `date`
    FROM `team_log`
    GROUP BY `team_id`, `user_id`) b
  USING (`team_id`, `user_id`, `date`)
  LEFT JOIN (
    SELECT `team_id`, `user_id`, MIN(date) AS `date`
    FROM `team_log`
    GROUP BY `team_id`, `user_id`) c
  USING (`team_id`, `user_id`, `date`)
  GROUP BY `team_id`

Since you said a team always has two members, I used /2. For teams of variable size this would get slightly more complicated.

- ic3b3rg

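Thanks for your answer. The problem is that clues_found can go up or down; there is no guarantee that the highest number is the latest one. Also, wouldn't SUM(clues_found) add up the values of every row? For example, wouldn't sum(clues_found) return 20 for user_id = 25? - jribeiro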

OK... I completely misunderstood the question... I get it now... I'll try again. - ic3b3rg

@jribeiro "clues_found can go up or down." How do you un-find a clue? - user645280

An answer may be judged invalid after an administrator verifies it. Since this is a daily record, that is reflected later. - jribeiro

This approach should be faster than @ebyrob's. - ic3b3rg

I think @Marty McVry's performs better than either of ours. It's hard to beat a single min/max group by user_id plus two primary-key lookups. - user645280

1

Please take a look and comment:

SQLFIDDLE DEMO

Team PCT:

select z.team_id, avg(z.pct) as teampct
from (
select x.user_id, y.team_id, x.mndate,
y.mxdate, x.mnclues_found,
y.mxclues_found, 
(((y.mxclues_found - x.mnclues_found)*100)
/y.mxclues_tofind) pct
from 
(select user_id, team_id, min(date) mndate, 
min(clues_found) as mnclues_found
from team_log
group by user_id, team_id) x
left join 
(select user_id, team_id, max(date) mxdate, 
max(clues_found) as mxclues_found, 
 max(clues_to_find) as mxclues_tofind
from team_log
group by user_id, team_id) y
on x.user_id = y.user_id and
x.team_id = y.team_id) z
group by z.team_id
;

Results 1:

| USER_ID | TEAM_ID |   MNDATE |   MXDATE | MNCLUES_FOUND | MXCLUES_FOUND |     PCT |
-------------------------------------------------------------------------------------
|       5 |       2 | 13-01-09 | 13-01-12 |             2 |            14 |      60 |
|      25 |       1 | 13-01-06 | 13-01-10 |             3 |            10 | 29.1667 |
|      28 |       1 | 13-01-08 | 13-01-14 |             5 |            20 |      50 |
|      31 |       2 | 13-01-11 | 13-01-11 |             6 |             6 |       0 |

Final results:

| TEAM_ID |  TEAMPCT |
----------------------
|       1 | 39.58335 |
|       2 |       30 |

- bonCodigo

I think the values differ because you used the wrong formula. It should be something like (maxDateRow.clues_found - minDateRow.clues_found)*100/clues_to_find. But this sqlfiddle is really valuable. ;) Thanks a lot for your help. - jribeiro

1

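@jribeiro I'm glad you got an answer :) and glad you made use of the sqlfiddle. It's a good question, clearly stated, +1. I updated the query with the correct formula, but couldn't match the results in your question ;) The "explain plan" shown in sqlfiddle will help you decide on the best answer. - bonCodigo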

1

@jribeiro You're quite right, the formula did need to change. I hadn't noticed that you were dividing by the "max number of clues" 😉 Good luck! - bonCodigo

@jribeiro Actually, the query can be shorter. You don't need the outer query; it can be done in the join of x and y. Take a look: http://sqlfiddle.com/#!2/bc0f6/1 It's also quite fast. - bonCodigo

@ebyrob, where does the answer come from? I mean, where do the numbers here come from? - bonCodigo

1

This looks a bit ugly, but it should work:

select
   team_id,
   AVG(percentage_per_user) as team_percentage
from (select
  team_id,
  user_id,
  ((select clues_found from progress_tracker as x
      where x.user_id = m.user_id order by x.date desc limit 0, 1)
    - (select clues_found from progress_tracker as y
      where y.user_id = m.user_id order by y.date asc limit 0, 1))
  * 100 / MAX(clues_to_find)
  as percentage_per_user
from progress_tracker as m
group by team_id, user_id
) as userScore
group by team_id
order by team_percentage desc;

Note that the inner query, run on its own, produces your intermediate "per user" result.
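To illustrate that note, here is a minimal sketch that runs a per-user query of the same shape against an in-memory SQLite database using Python's standard `sqlite3` module. Several things here are assumptions rather than part of the answer: SQLite stands in for MySQL, the dates are stored as zero-padded text so that they sort correctly, `* 100.0` is used so that the division is not truncated to an integer and the result is a percentage, and the helper name `per_user_percentages` is hypothetical.

```python
import sqlite3

# Sample data from the question: (user_id, team_id, date, clues_found, clues_to_find).
# Dates are zero-padded ISO strings so text ordering matches date ordering in SQLite.
ROWS = [
    (25, 1, "2013-01-06", 3, 24),
    (25, 1, "2013-01-08", 7, 24),
    (25, 1, "2013-01-10", 10, 24),
    (28, 1, "2013-01-08", 5, 30),
    (28, 1, "2013-01-14", 20, 30),
    (31, 2, "2013-01-11", 6, 14),
    (5, 2, "2013-01-09", 2, 20),
    (5, 2, "2013-01-10", 10, 20),
    (5, 2, "2013-01-12", 14, 20),
]

# Per-user query: latest clues_found minus earliest clues_found,
# scaled by the user's clues_to_find.
PER_USER_SQL = """
SELECT m.team_id,
       m.user_id,
       ((SELECT x.clues_found FROM progress_tracker AS x
          WHERE x.user_id = m.user_id ORDER BY x.date DESC LIMIT 1)
      - (SELECT y.clues_found FROM progress_tracker AS y
          WHERE y.user_id = m.user_id ORDER BY y.date ASC LIMIT 1))
       * 100.0 / MAX(m.clues_to_find) AS percentage_per_user
FROM progress_tracker AS m
GROUP BY m.team_id, m.user_id
ORDER BY m.team_id, m.user_id
"""


def per_user_percentages():
    """Load the sample rows into SQLite and return (team_id, user_id, percentage) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE progress_tracker ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, team_id INTEGER, "
        "date TEXT, clues_found INTEGER, clues_to_find INTEGER)"
    )
    conn.executemany(
        "INSERT INTO progress_tracker "
        "(user_id, team_id, date, clues_found, clues_to_find) VALUES (?, ?, ?, ?, ?)",
        ROWS,
    )
    rows = conn.execute(PER_USER_SQL).fetchall()
    conn.close()
    return rows
```

On the sample data, `per_user_percentages()` should return one row per user, with percentages that line up with the intermediate table in the question.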

- user645280

Thanks @ebyrob, it is a bit ugly indeed, but the results are very accurate. Thanks a lot for your help! - jribeiro

- Marty McVry · Accepted Answer

Here is another query that produces the correct result:

SELECT calc.team_id,
       AVG((calc.end_clues - calc.start_clues) / calc.total_clues * 100) AS team_percentage
FROM
    (SELECT log1.user_id,
            log1.team_id,
            log1.clues_found AS start_clues,
            log2.clues_found AS end_clues,
            log2.clues_to_find AS total_clues
     FROM team_log log1
     JOIN
         (SELECT MIN(id) AS start_id, MAX(id) AS end_id
          FROM team_log
          GROUP BY user_id) ids
         ON ids.start_id = log1.id
     JOIN team_log log2 ON ids.end_id = log2.id) calc
GROUP BY team_id
ORDER BY team_id;

And the SQL Fiddle-link...
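The query above picks each user's first and last rows by MIN(id) and MAX(id), which works when ids increase along with the dates. As a hedged variant, not part of the original answer, the sketch below selects those rows by date instead and runs the query against an in-memory SQLite database through Python's standard `sqlite3` module. It assumes at most one row per user, team and date, stores dates as zero-padded text, and uses `100.0` to avoid integer division in SQLite; the name `team_percentages_by_date` is hypothetical.

```python
import sqlite3

# Sample data from the question: (user_id, team_id, date, clues_found, clues_to_find).
ROWS = [
    (25, 1, "2013-01-06", 3, 24),
    (25, 1, "2013-01-08", 7, 24),
    (25, 1, "2013-01-10", 10, 24),
    (28, 1, "2013-01-08", 5, 30),
    (28, 1, "2013-01-14", 20, 30),
    (31, 2, "2013-01-11", 6, 14),
    (5, 2, "2013-01-09", 2, 20),
    (5, 2, "2013-01-10", 10, 20),
    (5, 2, "2013-01-12", 14, 20),
]

# Find each user's first and last date, join back to fetch the rows on those
# dates, then average the per-user percentages for each team.
TEAM_SQL = """
SELECT calc.team_id,
       AVG((calc.end_clues - calc.start_clues) * 100.0 / calc.total_clues)
           AS team_percentage
FROM (
    SELECT span.team_id,
           first_log.clues_found AS start_clues,
           last_log.clues_found AS end_clues,
           last_log.clues_to_find AS total_clues
    FROM (SELECT team_id, user_id,
                 MIN(date) AS start_date, MAX(date) AS end_date
          FROM team_log
          GROUP BY team_id, user_id) AS span
    JOIN team_log AS first_log
      ON first_log.team_id = span.team_id
     AND first_log.user_id = span.user_id
     AND first_log.date = span.start_date
    JOIN team_log AS last_log
      ON last_log.team_id = span.team_id
     AND last_log.user_id = span.user_id
     AND last_log.date = span.end_date
) AS calc
GROUP BY calc.team_id
ORDER BY calc.team_id
"""


def team_percentages_by_date():
    """Load the sample rows into SQLite and return (team_id, team_percentage) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE team_log ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, team_id INTEGER, "
        "date TEXT, clues_found INTEGER, clues_to_find INTEGER)"
    )
    conn.executemany(
        "INSERT INTO team_log "
        "(user_id, team_id, date, clues_found, clues_to_find) VALUES (?, ?, ?, ?, ?)",
        ROWS,
    )
    rows = conn.execute(TEAM_SQL).fetchall()
    conn.close()
    return rows
```

Because AVG divides by however many users a team actually has, this form does not depend on teams having exactly two members.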