MySQL的Match...Against查询非常缓慢

6

我正在处理一个使用Propel ORM的网站,有如下查询:

if(isset($args["QueryText"]) && $args["QueryText"] != "") {
  $query = $query
    ->withColumn("(MATCH (Request.Subject, Request.Detail) AGAINST ('" . $args["QueryText"] . "' IN BOOLEAN MODE) + MATCH (Response.Response) AGAINST ('" . $args["QueryText"] . "' IN BOOLEAN MODE))", "RequestRelevance")
    ->condition('cond1', "(MATCH (Request.Subject, Request.Detail) AGAINST ('" . $args["QueryText"] . "' IN BOOLEAN MODE) + MATCH (Response.Response) AGAINST ('" . $args["QueryText"] . "' IN BOOLEAN MODE)) > 0.2")
    ->condition('cond2', 'Request.Id = ?', $args["QueryText"])
    ->where(array('cond1', 'cond2'), 'or')
    ->orderBy("RequestRelevance", Criteria::DESC);
}

这在SQL中的翻译如下:

SELECT DISTINCT 
(MATCH (requests.subject, requests.detail) AGAINST ('46104' IN BOOLEAN MODE) + 
    MATCH (Response.response) AGAINST ('46104' IN BOOLEAN MODE)) 
  AS RequestRelevance, requests.requestID AS "Id", requests.subject AS "Subject", requests.detail AS "Detail", requests.created AS "CreatedDate", requests.lastresponsedate AS "LastResponseDate", SupportStatus.supportstatusID AS "SupportStatus.Id", SupportStatus.supportstatus AS "SupportStatus.Name", SupportStatus.isnew AS "SupportStatus.IsNew", SupportStatus.isclosed AS "SupportStatus.IsClosed", CustomerGroup.customergroupID AS "CustomerGroup.Id", CustomerGroup.customergroup AS "CustomerGroup.Name", Site.siteID AS "Site.Id", Site.site AS "Site.Name", InternalUser.userID AS "InternalUser.Id", InternalUser.username AS "InternalUser.Username", User.userID AS "User.Id", User.username AS "User.Username", Customer.customerID AS "Customer.Id", Customer.customer AS "Customer.Name", Customer.customergroupID AS "Customer.CustomerGroupId", Customer.rate AS "Customer.Rate" 
FROM requests 
  LEFT JOIN responses Response ON (requests.requestID=Response.requestID) 
  INNER JOIN supportstatus SupportStatus ON (requests.supportstatusID=SupportStatus.supportstatusID) 
  INNER JOIN customergroups CustomerGroup ON (requests.customergroupID=CustomerGroup.customergroupID)
  INNER JOIN customers Customer ON (requests.customerID=Customer.customerID) 
  INNER JOIN sites Site ON (requests.siteID=Site.siteID) 
  LEFT JOIN users InternalUser ON (requests.internaluserID=InternalUser.userID) 
  LEFT JOIN users User ON (requests.userID=User.userID)
WHERE ((MATCH (requests.subject, requests.detail) AGAINST ('46104' IN BOOLEAN MODE) + 
  MATCH (Response.response) AGAINST ('46104' IN BOOLEAN MODE)) > 0.2 OR requests.requestID = '46104') 
ORDER BY requests.created ASC,RequestRelevance DESC

使用 Propel 在网站加载该内容需要大约 20 秒,使用 SQL 查询则需要 7.020 秒。
我尝试使用以下方法代替:
SELECT DISTINCT 
  requests.requestID AS "Id", requests.subject AS "Subject", requests.detail AS "Detail", requests.created AS "CreatedDate", requests.lastresponsedate AS "LastResponseDate", SupportStatus.supportstatusID AS "SupportStatus.Id", SupportStatus.supportstatus AS "SupportStatus.Name", SupportStatus.isnew AS "SupportStatus.IsNew", SupportStatus.isclosed AS "SupportStatus.IsClosed", CustomerGroup.customergroupID AS "CustomerGroup.Id", CustomerGroup.customergroup AS "CustomerGroup.Name", Site.siteID AS "Site.Id", Site.site AS "Site.Name", InternalUser.userID AS "InternalUser.Id", InternalUser.username AS "InternalUser.Username", User.userID AS "User.Id", User.username AS "User.Username", Customer.customerID AS "Customer.Id", Customer.customer AS "Customer.Name", Customer.customergroupID AS "Customer.CustomerGroupId", Customer.rate AS "Customer.Rate" 
FROM requests 
  LEFT JOIN responses Response ON (requests.requestID=Response.requestID) 
  INNER JOIN supportstatus SupportStatus ON (requests.supportstatusID=SupportStatus.supportstatusID) 
  INNER JOIN customergroups CustomerGroup ON (requests.customergroupID=CustomerGroup.customergroupID)
  INNER JOIN customers Customer ON (requests.customerID=Customer.customerID) 
  INNER JOIN sites Site ON (requests.siteID=Site.siteID) 
  LEFT JOIN users InternalUser ON (requests.internaluserID=InternalUser.userID) 
  LEFT JOIN users User ON (requests.userID=User.userID) 
WHERE (requests.subject LIKE '%46104%' OR requests.detail LIKE '%46104%' OR  Response.response LIKE '%46104%' OR requests.requestID = '46104') 
ORDER BY requests.created

这段代码执行时间为3.308秒,如果将其中的OR Response.response LIKE '%46104%'删除,则执行时间降至0.140秒。responses表中包含了288317行数据,responses.responses列是一个带有全文索引的TEXT()类型。

那么如何才能更好地减少这个搜索的执行时间呢?我尝试使用了https://dba.stackexchange.com/questions/15214/why-is-like-more-than-4x-faster-than-match-against-on-a-fulltext-index-in-mysq中提供的方法,但当我执行以下代码时:

SELECT responseID FROM
(
 SELECT * FROM responses
 WHERE requestID = 15000
 AND responseID != 84056
) A
WHERE MATCH(response) AGAINST('twisted');

我遇到了这个错误:
错误代码:1191。找不到与列列表匹配的FULLTEXT索引。
非常感谢您的帮助!
编辑1:
尝试了@Richard EB的查询:
ALTER TABLE responses ADD FULLTEXT(response)
288317 row(s) affected Records: 288317  Duplicates: 0  Warnings: 0  78.967 sec

然而:
SELECT responseID FROM (     SELECT * FROM responses     WHERE requestID = 15000     AND responseID != 84056 ) A WHERE MATCH(response) AGAINST('twisted') LIMIT 0, 2000 
Error Code: 1191. Can't find FULLTEXT index matching the column list    0.000 sec

去掉DISTINCT可以将执行时间减少到0.952秒,但是它不能检索我需要的结果。

编辑2:

执行此查询:

SELECT DISTINCT 
responses.requestID AS "Id" FROM responses WHERE MATCH(response) AGAINST('twisted')

执行这个操作需要0.062秒。

执行以下操作:

SELECT DISTINCT responses.requestID
  FROM responses 
WHERE (
  MATCH (Responses.response) AGAINST ('twisted' IN BOOLEAN MODE)
) 
ORDER BY responses.requestID ASC

仅需0.046秒。但是,从Requests中选择并加入Responses是导致查询变慢的原因。我不确定是否意味着必须完全重写整个查询以从Responses中选择并加入Requests?

编辑3:

这是我在 Requests Responses 表上拥有的索引:

    # Table, Non_unique, Key_name, Seq_in_index, Column_name, Collation, Cardinality, Sub_part, Packed, Null, Index_type, Comment, Index_comment
    responses, 0, PRIMARY, 1, responseID, A, 288317, , , , BTREE, , 
    responses, 1, requestID, 1, requestID, A, 48052, , , YES, BTREE, , 
    responses, 1, response, 1, response, , 1, , , YES, FULLTEXT, , 
    responses, 1, response_2, 1, response, , 1, , , YES, FULLTEXT, , 
    responses, 1, response_3, 1, response, , 1, , , YES, FULLTEXT, , 

    # Table, Non_unique, Key_name, Seq_in_index, Column_name, Collation, Cardinality, Sub_part, Packed, Null, Index_type, Comment, Index_comment
    requests, 0, PRIMARY, 1, requestID, A, 46205, , , , BTREE, , 
    requests, 1, supportstatusID, 1, supportstatusID, A, 14, , , YES, BTREE, , 
    requests, 1, internaluserID, 1, internaluserID, A, 344, , , YES, BTREE, , 
    requests, 1, customergroupID, 1, customergroupID, A, 198, , , , BTREE, , 
    requests, 1, userID, 1, userID, A, 1848, , , YES, BTREE, , 
    requests, 1, siteID, 1, siteID, A, 381, , , YES, BTREE, , 
    requests, 1, request, 1, subject, , 1, , , YES, FULLTEXT, , 
    requests, 1, request, 2, detail, , 1, , , YES, FULLTEXT, , 
    requests, 1, request, 3, ponumber, , 1, , , YES, FULLTEXT, , 
1个回答

3

LIKE搜索将遍历每条记录并执行非精确字符串比较,这就是为什么它很慢的原因。

你所贴出的mysql错误表明,在MATCH子句中引用的列没有全文索引(或者有,但在MATCH子句中引用的方式不正确)。

假设您正在使用MyISAM或具有MySQL 5.6和InnoDB,请尝试:

ALTER TABLE responses ADD FULLTEXT(response);

编辑:答案(在评论中): “如果MATCH语句中提到了两列,则这两列应该有一个全文索引,而不是每列都有两个全文索引”


谢谢,我已经尝试过了,但是它并没有解决问题。我已经编辑了我的原始帖子。问题是,LIKE搜索所需的时间只有MATCH ... AGAINST搜索的一半,但是我认为通过优化/改进后者,它可以比LIKE搜索更快。我正确吗? - Pawel
谢谢,我已经在我的原帖中编辑了查询和结果。 - Pawel
1
LIKE查询速度较慢。如果你说MATCH查询很快,但是当你将它与另一个使用LIKE的查询连接起来时(就像你在帖子中所说的那样),它就会变慢,那么这并不奇怪。如果它很慢,而你只是连接表格(而没有使用LIKE),那么问题必须出在连接上(ID列是否被索引了?)。 - Richard EB
编辑了帖子。我不是创建表或编写查询的人,但是在“请求”中,我不应该有一个关于responseID的索引吗? - Pawel
1
如果查询时间缩短到可接受的限制,则这才是最重要的。要了解为什么EXPLAIN没有显示使用的索引,您需要从头开始只使用MATCH子句来执行查询,然后添加查询子句,直到看到现在所看到的问题。我已编辑答案,请确认。 - Richard EB
显示剩余7条评论

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接