Calculating the difference between two computed fields

4

I have a query that basically gets each customer's average spend over the past year and over the past three months:

SELECT SQL_CALC_FOUND_ROWS 
       customer_id,
       customer_name,
       AVG(IF(
            DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
                DATE_FORMAT(NOW() - INTERVAL 1 YEAR, "%Y-%m-01"),
            spend_amount,
            NULL
       )) AS 1_year_average_spend,
       AVG(IF(
            DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
                DATE_FORMAT(NOW() - INTERVAL 3 MONTH, "%Y-%m-01"),
            spend_amount,
            NULL
       )) AS 3_month_average_spend

FROM   customer_spends
GROUP  BY customer_id

But I also need the percentage difference between the two averages:

For example (pseudocode):

if (1_year_average_spend = 0)
    change = N/A
else 
    change = 3_month_average_spend / 1_year_average_spend - 1

Any suggestions on how to do this?

The only approach I can think of is awful, because it repeats both AVG expressions:

IF(
    AVG(IF(
            DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
                DATE_FORMAT(NOW() - INTERVAL 1 YEAR, "%Y-%m-01"),
            `spend_amount`,
            NULL
    )) > 0,
    AVG(IF(
            DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
                DATE_FORMAT(NOW() - INTERVAL 3 MONTH, "%Y-%m-01"),
            spend_amount,
            NULL
    )) / AVG(IF(
            DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
                DATE_FORMAT(NOW() - INTERVAL 1 YEAR, "%Y-%m-01"),
            `spend_amount`,
            NULL
    )) - 1,
    "N/A"
) AS 3_month_performance
4 Answers

1
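Use an inner select (similar to a temporary view) and select from it, so the outer query can refer to the averages by their aliases. This should work: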
SELECT
  customer_id,
  customer_name,
  1_year_average_spend,
  3_month_average_spend,
  IF(1_year_average_spend = 0, "N/A", (3_month_average_spend / 1_year_average_spend) - 1) AS 3_month_performance
FROM (SELECT
  customer_id,
  customer_name,
  AVG(IF(DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
    DATE_FORMAT(NOW() - INTERVAL 1 YEAR, "%Y-%m-01"), spend_amount, NULL)) AS 1_year_average_spend,
  AVG(IF(DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
    DATE_FORMAT(NOW() - INTERVAL 3 MONTH, "%Y-%m-01"), spend_amount, NULL)) AS 3_month_average_spend
FROM customer_spends 
GROUP BY customer_id, customer_name ) x
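To see the inner-select pattern in action on a few rows, here is a minimal sketch. It runs the same shape of query through Python's sqlite3 module purely as a stand-in for MySQL. The sample rows, the fixed cutoffs (encoded as year * 100 + month instead of NOW()), and the aliases one_year_avg, three_month_avg and performance are all made up for illustration.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_spends ("
    " customer_id INTEGER, year_of_spend INTEGER,"
    " month_of_spend INTEGER, spend_amount REAL)"
)
# Hypothetical sample data: customer 2 has spent nothing.
conn.executemany(
    "INSERT INTO customer_spends VALUES (?, ?, ?, ?)",
    [
        (1, 2023, 3, 100.0),
        (1, 2023, 11, 200.0),
        (1, 2023, 12, 300.0),
        (2, 2023, 2, 0.0),
        (2, 2023, 12, 0.0),
    ],
)

# The inner select computes both averages once; the outer select
# reuses them by alias instead of repeating the AVG expressions.
sql = """
SELECT customer_id,
       one_year_avg,
       three_month_avg,
       CASE WHEN one_year_avg = 0 THEN 'N/A'
            ELSE three_month_avg / one_year_avg - 1
       END AS performance
FROM (
    SELECT customer_id,
           AVG(CASE WHEN year_of_spend * 100 + month_of_spend >= :year_cutoff
                    THEN spend_amount END) AS one_year_avg,
           AVG(CASE WHEN year_of_spend * 100 + month_of_spend >= :quarter_cutoff
                    THEN spend_amount END) AS three_month_avg
    FROM customer_spends
    GROUP BY customer_id
) x
ORDER BY customer_id
"""
# Fixed cutoffs (assumed): January 2023 and October 2023.
result = conn.execute(
    sql, {"year_cutoff": 202301, "quarter_cutoff": 202310}
).fetchall()
print(result)
```

Customer 1 averages 200 over the year and 250 over the last three months, so the change is 0.25; customer 2 has a yearly average of 0 and gets 'N/A'.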

1

If you are happy to use NULL as the N/A value, you can apply NULLIF() to the denominator like this:

a / NULLIF(b, 0) - 1

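NULLIF returns NULL when its first argument equals its second argument. And if an operand of an arithmetic operator is NULL, the whole expression evaluates to NULL, so a zero denominator yields NULL instead of a division by zero.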
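A quick way to convince yourself of this behaviour is to evaluate the expression on a few values. The sketch below uses Python's sqlite3 module as a stand-in for MySQL (NULLIF is standard SQL and exists in both); the number pairs are hypothetical.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")

# Hypothetical (three_month_avg, one_year_avg) pairs:
# a normal case, a zero denominator, and a missing numerator.
pairs = [(150.0, 100.0), (50.0, 0.0), (None, 80.0)]

# a / NULLIF(b, 0) - 1, evaluated once per pair.
changes = [
    conn.execute("SELECT ? / NULLIF(?, 0) - 1", pair).fetchone()[0]
    for pair in pairs
]
print(changes)  # [0.5, None, None]
```

NULL comes back to Python as None, so both the zero-denominator case and the missing-value case end up as "no result" without any explicit IF.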

Like @Bohemian, I would also suggest using a subquery. Here is the complete query:

SELECT SQL_CALC_FOUND_ROWS
  customer_id,
  customer_name,
  1_year_average_spend,
  3_month_average_spend,
  3_month_average_spend / NULLIF(1_year_average_spend, 0) - 1 AS `change`
FROM (
  SELECT
    customer_id,
    customer_name,
    AVG(IF(
         DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
             DATE_FORMAT(NOW() - INTERVAL 1 YEAR, "%Y-%m-01"),
         spend_amount,
         NULL
    )) AS 1_year_average_spend,
    AVG(IF(
         DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
             DATE_FORMAT(NOW() - INTERVAL 3 MONTH, "%Y-%m-01"),
         spend_amount,
         NULL
    )) AS 3_month_average_spend
  FROM customer_spends
  GROUP BY customer_id
) s

The whole "NULL if a = b, and anything combined with NULL is NULL" thing is a bit confusing, but now that I understand it, it sounds good. - Petah

1

If you are willing to use MySQL-specific code, you can use user-defined variables like this (simplified version):

SELECT @avg1 := ROUND((1 + 2 + 3) / 3, 2) AS avg1,
       @avg2 := ROUND((4 + 5 + 6) / 3, 2) AS avg2,
   IF( @avg1, ROUND(@avg2 / @avg1 - 1, 2), NULL ) AS result;

+------+------+--------+
| avg1 | avg2 | result |
+------+------+--------+
| 2.00 | 5.00 |   1.50 |
+------+------+--------+

Applied to your query, that becomes:

SELECT SQL_CALC_FOUND_ROWS 
       customer_id,
       customer_name,
       @1_year_average_spend := AVG(IF(
            DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
                DATE_FORMAT(NOW() - INTERVAL 1 YEAR, "%Y-%m-01"),
            spend_amount,
            NULL
       )) AS 1_year_average_spend,
       @3_month_average_spend := AVG(IF(
            DATE(CONCAT(year_of_spend, "-", month_of_spend, "-01")) >=
                DATE_FORMAT(NOW() - INTERVAL 3 MONTH, "%Y-%m-01"),
            spend_amount,
            NULL
       )) AS 3_month_average_spend,
       IF( @1_year_average_spend,
           @3_month_average_spend / @1_year_average_spend - 1,
           NULL
       ) AS diff

FROM   customer_spends
GROUP  BY customer_id

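Note 1: I used diff as the column name for the difference because change is a reserved word and may cause problems.

Note 2: You need to be aware of the following caveats from the documentation, as they may affect your results:

"Assignment of decimal and real values does not preserve the precision or scale of the value."

And also:

"As a general rule, you should never assign a value to a user variable and read the value within the same statement. You might get the results you expect, but this is not guaranteed. The order of evaluation for expressions involving user variables is undefined and may change based on the elements contained within a given statement. In SELECT @a, @a:=@a+1, ..., you might think that MySQL will evaluate @a first and then do an assignment second. However, changing the statement (for example, by adding a GROUP BY, HAVING, or ORDER BY clause) may cause MySQL to select an execution plan with a different order of evaluation."

So use this with caution, and test it properly!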


Strange, I edited my post a few hours ago because I realised "change" is a reserved word. Other than that, all good. +1 - Petah
1
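@Petah: I'm not sure I want to answer that ;-) Variables can work, and they can speed up some queries significantly. If it works once for a given query, it will keep working (although I would guess that different MySQL versions might handle the same query differently). However, variables are non-standard SQL syntax, so the query is not portable. The subquery route is probably more robust in the long run, but execution speed may suffer. So in the end it depends on your requirements. Sorry I can't come down firmly on one side! - Mike
No, thank you for an objective and detailed answer. But for the sake of argument, suppose I'm using MySQL 5.1.49 and will never change. Would you still stay neutral? - Petah
@Petah: In that case, I would use variables. As for their data type or precision, I don't know. The documentation doesn't seem to reveal much, but I'd like to know too. - Mike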
3
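@Petah: I'd suggest you test all the answers against your data (and against larger test tables) and measure them before deciding which one to use. You can always keep all of them as alternatives. - ypercubeᵀᴹ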

0

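Get rid of the IF(), DATE() and CONCAT() function calls. Your current query has to scan the whole customer_spends table and evaluate that complex condition for every row, even rows that are 10 years old.

The following can also use an index on (year_of_spend, month_of_spend) or (customer_id, year_of_spend, month_of_spend) to speed up the query: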

SELECT c.customer_id
     , c.customer_name
     , 1_year_average_spend
     , 3_month_average_spend
     , CASE WHEN 1_year_average_spend = 0
              THEN 'N/A'
              ELSE (3_month_average_spend / 1_year_average_spend) - 1
       END AS percent_difference
FROM
    customer AS c
  JOIN
    ( SELECT customer_id
           , AVG(spend_amount) AS 1_year_average_spend 
      FROM customer_spends 
      WHERE (year_of_spend, month_of_spend) >=
               ( YEAR(CURDATE() - INTERVAL 1 YEAR)
               , MONTH(CURDATE() - INTERVAL 1 YEAR)
               )
      GROUP BY customer_id
    ) AS grp1year
    ON grp1year.customer_id = c.customer_id
  LEFT JOIN
    ( SELECT customer_id
           , AVG(spend_amount) AS 3_month_average_spend
      FROM customer_spends 
      WHERE (year_of_spend, month_of_spend) >=
               ( YEAR(CURDATE() - INTERVAL 3 MONTH)
               , MONTH(CURDATE() - INTERVAL 3 MONTH)
               )
      GROUP BY customer_id
    ) AS grp3month
    ON grp3month.customer_id = c.customer_id    
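The WHERE clauses above compare (year_of_spend, month_of_spend) as a pair, which orders rows by year first and then by month. The sketch below illustrates that on made-up rows, again using Python's sqlite3 module as a stand-in for MySQL (it assumes SQLite 3.15 or later, which added row values). The cutoff of November 2023 is hypothetical, and the sketch says nothing about index usage.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_spends ("
    " year_of_spend INTEGER, month_of_spend INTEGER, spend_amount REAL)"
)
conn.executemany(
    "INSERT INTO customer_spends VALUES (?, ?, ?)",
    [
        (2022, 12, 10.0),
        (2023, 10, 20.0),
        (2023, 11, 30.0),
        (2023, 12, 40.0),
        (2024, 1, 50.0),
    ],
)

cutoff = (2023, 11)  # hypothetical cutoff: November 2023

# Pair comparison: keeps every month from the cutoff onwards.
row_value = conn.execute(
    "SELECT year_of_spend, month_of_spend FROM customer_spends "
    "WHERE (year_of_spend, month_of_spend) >= (?, ?) "
    "ORDER BY year_of_spend, month_of_spend",
    cutoff,
).fetchall()

# Naive column-by-column comparison: wrongly drops January 2024,
# because its month number (1) is smaller than the cutoff month (11).
naive = conn.execute(
    "SELECT year_of_spend, month_of_spend FROM customer_spends "
    "WHERE year_of_spend >= ? AND month_of_spend >= ? "
    "ORDER BY year_of_spend, month_of_spend",
    cutoff,
).fetchall()

print(row_value)  # [(2023, 11), (2023, 12), (2024, 1)]
print(naive)      # [(2023, 11), (2023, 12)]
```

So if you ever rewrite the pair comparison by hand, it has to be "year > y OR (year = y AND month >= m)", not two independent >= tests.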
