I want to select one row per user, and I don't care which image I get. This query works in MySQL, but it doesn't work in SQL Server:
SELECT user.id, (images.path + images.name) as 'image_path'
FROM users
JOIN images ON images.user_id = users.id
GROUP BY users.id
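Solutions that use MIN/MAX aggregates or ROW_NUMBER may not be the most efficient (depending on data distribution), because they typically have to examine all matching rows before choosing one per group. The following queries each pick a single TransactionType and ReferenceOrderID for each ProductID:
MIN/MAX aggregate:
SELECT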
p.ProductID,
MIN(th.TransactionType + STR(th.ReferenceOrderID, 11))
FROM Production.Product AS p
INNER JOIN Production.TransactionHistory AS th ON
th.ProductID = p.ProductID
GROUP BY
p.ProductID;
ROW_NUMBER:
WITH x AS
(
SELECT
th.ProductID,
th.TransactionType,
th.ReferenceOrderID,
rn = ROW_NUMBER() OVER (PARTITION BY th.ProductID ORDER BY (SELECT NULL))
FROM Production.TransactionHistory AS th
)
SELECT
p.ProductID,
x.TransactionType,
x.ReferenceOrderID
FROM Production.Product AS p
INNER JOIN x ON x.ProductID = p.ProductID
WHERE
x.rn = 1
OPTION (MAXDOP 1);
ANY aggregate:
SELECT
q.ProductID,
q.TransactionType,
q.ReferenceOrderID
FROM
(
SELECT
p.ProductID,
th.TransactionType,
th.ReferenceOrderID,
rn = ROW_NUMBER() OVER (
PARTITION BY p.ProductID
ORDER BY p.ProductID)
FROM Production.Product AS p
JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID
) AS q
WHERE
q.rn = 1;
For details on the ANY aggregate, see this blog post.
Correlated subquery with TOP:
SELECT p.ProductID,
(
-- No ORDER BY, so could be any row
SELECT TOP (1)
th.TransactionType + STR( th.ReferenceOrderID, 11)
FROM Production.TransactionHistory AS th WITH (FORCESEEK)
WHERE
th.ProductID = p.ProductID
)
FROM Production.Product AS p;
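CROSS APPLY with TOP (1):
The previous query requires string concatenation, and it returns NULL for products that have no transaction history. Using CROSS APPLY with TOP solves both problems: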
SELECT
p.Name,
ca.TransactionType,
ca.ReferenceOrderID
FROM Production.Product AS p
CROSS APPLY
(
SELECT TOP (1)
th.TransactionType,
th.ReferenceOrderID
FROM Production.TransactionHistory AS th WITH (FORCESEEK)
WHERE
th.ProductID = p.ProductID
) AS ca;
With optimal indexing, and if each user typically has many images, APPLY may be the most efficient option.
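To make the difference between the last two queries concrete (NULL for a parent with no child rows versus no row at all), here is a small sketch. It is not SQL Server: it uses Python's built-in sqlite3 module as a stand-in, so `||` replaces `+` for concatenation, `LIMIT 1` replaces `TOP (1)`, and a join on a correlated subquery only approximates CROSS APPLY, which SQLite does not have. The users/images tables, the images.id column, the index name, and the sample rows are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE images (id INTEGER PRIMARY KEY, user_id INTEGER, path TEXT, name TEXT);
    -- Hypothetical index so each per-user lookup can seek instead of scan.
    CREATE INDEX ix_images_user ON images (user_id);
    INSERT INTO users (id) VALUES (1), (2);            -- user 2 has no images
    INSERT INTO images (user_id, path, name) VALUES
        (1, '/a/', 'one.png'),
        (1, '/a/', 'two.png');
""")

# Correlated scalar subquery: every user comes back, and a user without
# images gets NULL (None in Python). The columns must be concatenated
# because a scalar subquery can return only one value.
scalar = conn.execute("""
    SELECT u.id,
           (SELECT i.path || i.name
            FROM images AS i
            WHERE i.user_id = u.id
            ORDER BY i.id          -- added only to make this demo deterministic
            LIMIT 1)
    FROM users AS u
    ORDER BY u.id
""").fetchall()

# Rough analogue of CROSS APPLY ... TOP (1): join each user to exactly one
# image row. Columns stay separate, and users without images drop out.
applied = conn.execute("""
    SELECT u.id, i.path, i.name
    FROM users AS u
    JOIN images AS i
      ON i.id = (SELECT i2.id
                 FROM images AS i2
                 WHERE i2.user_id = u.id
                 ORDER BY i2.id
                 LIMIT 1)
    ORDER BY u.id
""").fetchall()

print(scalar)
print(applied)
```

If the rows for users without images are wanted after all, the SQL Server counterpart would be OUTER APPLY rather than CROSS APPLY.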
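If a user has multiple images and you only want one of them, which one do you want? MySQL's syntax is lax and doesn't force you to choose; it just hands you an arbitrary value. SQL Server makes you choose. One way is to use MIN: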
SELECT u.id, MIN(i.path + i.name) AS image_path
FROM dbo.users AS u
INNER JOIN dbo.images AS i
ON u.id = i.user_id
GROUP BY u.id;
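You could also replace MIN with MAX. Depending on the version of SQL Server, and on whether you actually need more columns, there may be other, more efficient ways to avoid some of the sort/group work. For example, if you want the path and the name returned separately, this approach may not be appropriate: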
SELECT u.id, MIN(i.path), MIN(i.name)
FROM dbo.users AS u
INNER JOIN dbo.images AS i
ON u.id = i.user_id
GROUP BY u.id;
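Since you could theoretically get the path and the name from two different rows, the result would no longer be meaningful. So instead you could do this: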
;WITH x AS
(
SELECT user_id, path, name, rn = ROW_NUMBER() OVER
(PARTITION BY user_id ORDER BY (SELECT NULL))
FROM dbo.images
)
SELECT u.id, x.path, x.name
FROM dbo.users AS u
INNER JOIN x
ON u.id = x.user_id
WHERE x.rn = 1;
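A quick way to see the "two different rows" problem is to run both shapes of query against a tiny table. The sketch below uses Python's built-in sqlite3 module rather than SQL Server, and assumes an SQLite build with window-function support (3.25 or later). The sample rows and the id column are invented for the demo, and the window is ordered by id instead of (SELECT NULL) purely so that the output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE images (id INTEGER PRIMARY KEY, user_id INTEGER, path TEXT, name TEXT);
    INSERT INTO images (user_id, path, name) VALUES
        (1, '/a/', 'zebra.png'),
        (1, '/b/', 'apple.png');
""")

stored = conn.execute("SELECT user_id, path, name FROM images").fetchall()

# Separate aggregates: each MIN is evaluated independently per column,
# so the path and the name can come from different rows.
mixed = conn.execute(
    "SELECT user_id, MIN(path), MIN(name) FROM images GROUP BY user_id"
).fetchall()

# ROW_NUMBER: number the rows within each user and keep row 1 as a whole.
whole = conn.execute("""
    WITH x AS (
        SELECT user_id, path, name,
               ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY id) AS rn
        FROM images
    )
    SELECT user_id, path, name FROM x WHERE rn = 1
""").fetchall()

print(mixed)   # path taken from one row, name from the other
print(whole)   # a row that actually exists in the table
```

Here the MIN/MIN query pairs '/a/' with 'apple.png', a combination that is not stored anywhere, while the ROW_NUMBER query returns an intact row.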
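Whether you can use this variation for your existing case depends on how the two tables are indexed, but you could try it as well and compare the plans/performance: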
;WITH x AS
(
SELECT user_id, path + name AS image_path, rn = ROW_NUMBER() OVER
(PARTITION BY user_id ORDER BY (SELECT NULL))
FROM dbo.images
)
SELECT u.id, x.image_path
FROM dbo.users AS u
INNER JOIN x
ON u.id = x.user_id
WHERE x.rn = 1;
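(You could also try replacing SELECT NULL with the leading column of a narrow index on dbo.images.)
Also, please avoid the AS 'alias' syntax. That form is deprecated, and it makes the alias look like a string literal. In addition, always use the schema prefix, and use aliases so that you don't have to repeat the full table names throughout the query...
SELECT users.id, max((images.path + images.name)) as 'image_path'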
FROM users
JOIN images ON images.user_id = users.id
GROUP BY users.id
MySQL's handling of the GROUP BY clause is widely considered poor.
Use MAX or MIN, as needed:
SELECT users.id, max(images.path + images.name) as image_path
FROM users
JOIN images ON images.user_id = users.id
GROUP BY users.id
SELECT users.id, min(images.path + images.name) as image_path
FROM users
JOIN images ON images.user_id = users.id
GROUP BY users.id
When you use GROUP BY, the other columns can only appear inside aggregate functions.
SELECT users.id, (MAX(images.path) + MAX(images.name)) as 'image_path'
FROM users
JOIN images ON images.user_id = users.id
GROUP BY users.id
SELECT users.id, MAX(images.path + images.name) as 'image_path'
FROM users
JOIN images ON images.user_id = users.id
GROUP BY users.id
There's no guarantee that MAX(images.path) + MAX(images.name) is a member of images.path + images.name. - Mike Sherrill 'Cat Recall'
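The point in that comment can be checked with a few rows. This is only a sketch: it uses Python's built-in sqlite3 module instead of SQL Server, so `||` stands in for `+` as the concatenation operator, and the table contents are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE images (user_id INTEGER, path TEXT, name TEXT);
    INSERT INTO images VALUES
        (1, '/a/', 'zebra.png'),
        (1, '/b/', 'apple.png');
""")

# separate: largest path glued to largest name (two independent aggregates).
# combined: largest of the real concatenated values (one aggregate).
separate, combined = conn.execute("""
    SELECT MAX(path) || MAX(name),
           MAX(path || name)
    FROM images
    GROUP BY user_id
""").fetchone()

# Every full path that actually exists in the table.
real_paths = {row[0] for row in conn.execute("SELECT path || name FROM images")}

print(separate, separate in real_paths)
print(combined, combined in real_paths)
```

With this data the separately aggregated value is '/b/zebra.png', which matches no stored image, whereas MAX over the concatenation returns '/b/apple.png', which does.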