MySQL Execution Plan -- Optimizing IN Conditions Combined with ORDER BY
Test environment
MySQL version: 5.7.27-30-log Percona Server (GPL), wsrep_31.39
Table definition:
CREATE TABLE `scout_job` (
  `task_id` varchar(22) NOT NULL DEFAULT '' COMMENT 'task id',
  `job_id` int(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'jobId',
  `env_id` varchar(10) NOT NULL DEFAULT '' COMMENT 'environment id',
  `status` int(2) NOT NULL DEFAULT '0' COMMENT '0-task initialized 1-task running 2-succeeded 3-failed -1:task cleaned up',
  `start_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'start time',
  `end_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'end time',
  PRIMARY KEY (`job_id`) USING BTREE,
  KEY `idx_envid` (`env_id`) USING BTREE,
  KEY `idx_id_status_endTime` (`env_id`,`status`,`end_time`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=3416771 DEFAULT CHARSET=utf8mb4 COMMENT='task record table';
SQL involved:
SELECT job_id FROM scout_job WHERE env_id = '393684' and status in (2,3) ORDER by end_time desc limit 2;
Even with no load on the system, this query takes more than 200 ms to execute.
Problem analysis
Check the execution plan for the query:
mysql> DESC SELECT job_id FROM scout_job WHERE env_id = '393684' and status in (2,3) ORDER by end_time desc limit 2 \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: scout_job
partitions: NULL
type: ref
possible_keys: idx_envid,idx_id_status_endTime
key: idx_envid
key_len: 42
ref: const
rows: 152938
filtered: 20.00
Extra: Using index condition; Using where; Using filesort
1 row in set, 1 warning (0.00 sec)
Count the rows that satisfy the WHERE condition:
mysql> SELECT COUNT(1) FROM scout_job WHERE env_id = '393684' and status in (2,3);
+----------+
| COUNT(1) |
+----------+
| 94828 |
+----------+
1 row in set (0.15 sec)
Check where the time goes with profiling:
mysql> SHOW PROFILE CPU,BLOCK IO,SWAPS FOR QUERY 1;
+--------------------------+----------+----------+------------+--------------+---------------+-------+
| Status | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out | Swaps |
+--------------------------+----------+----------+------------+--------------+---------------+-------+
| starting | 0.000065 | NULL | NULL | NULL | NULL | NULL |
| checking permissions | 0.000005 | NULL | NULL | NULL | NULL | NULL |
| Opening tables | 0.000014 | NULL | NULL | NULL | NULL | NULL |
| init | 0.000031 | NULL | NULL | NULL | NULL | NULL |
| System lock | 0.000008 | NULL | NULL | NULL | NULL | NULL |
| optimizing | 0.000011 | NULL | NULL | NULL | NULL | NULL |
| statistics | 0.000156 | NULL | NULL | NULL | NULL | NULL |
| preparing | 0.000019 | NULL | NULL | NULL | NULL | NULL |
| Sorting result | 0.000004 | NULL | NULL | NULL | NULL | NULL |
| executing | 0.000002 | NULL | NULL | NULL | NULL | NULL |
| Sending data | 0.000005 | NULL | NULL | NULL | NULL | NULL |
| Creating sort index | 0.208818 | NULL | NULL | NULL | NULL | NULL |
| innobase_commit_low (-1) | 0.000011 | NULL | NULL | NULL | NULL | NULL |
| end | 0.000005 | NULL | NULL | NULL | NULL | NULL |
| query end | 0.000016 | NULL | NULL | NULL | NULL | NULL |
| innobase_commit_low (-1) | 0.000008 | NULL | NULL | NULL | NULL | NULL |
| closing tables | 0.000011 | NULL | NULL | NULL | NULL | NULL |
| freeing items | 0.000033 | NULL | NULL | NULL | NULL | NULL |
| cleaning up | 0.000017 | NULL | NULL | NULL | NULL | NULL |
+--------------------------+----------+----------+------------+--------------+---------------+-------+
19 rows in set, 1 warning (0.00 sec)
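The profiling result shows that about 99.8% of the time is spent in the Creating sort index stage. The chosen index idx_envid carries no end_time ordering, and because the condition contains an IN with two values, MySQL has to sort the whole result set matching env_id = '393684' and status in (2,3) by end_time (ORDER by end_time desc) before it can take the first 2 rows (limit 2). Even idx_id_status_endTime could not supply the order here: its entries are sorted by end_time only within a single status value, not across status 2 and 3 together. Many rows match (94828 in the count above), so the sort is expensive.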
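Why two status values break the index order can be seen in a small model. The sketch below is only an illustration with made-up rows (the dates and job ids are hypothetical): it treats the entries of idx_id_status_endTime as tuples sorted by (env_id, status, end_time, job_id) and checks whether the end_time sequence read from that order is already sorted.

```python
from datetime import datetime

# Hypothetical rows: (env_id, status, end_time, job_id). Not real data.
rows = [
    ("393684", 2, datetime(2020, 1, 1), 1),
    ("393684", 3, datetime(2020, 1, 2), 2),
    ("393684", 2, datetime(2020, 1, 3), 3),
    ("393684", 3, datetime(2020, 1, 4), 4),
    ("393684", 2, datetime(2020, 1, 5), 5),
]

# A B-tree index on (env_id, status, end_time) keeps its entries in this order;
# InnoDB appends the primary key (job_id) as the last key part.
index_entries = sorted(rows)


def end_times(statuses):
    """end_time values in index order for env_id '393684' and the given statuses."""
    return [r[2] for r in index_entries if r[0] == "393684" and r[1] in statuses]


def is_sorted(values):
    """True if the values are in non-decreasing order."""
    return all(a <= b for a, b in zip(values, values[1:]))


single = end_times({2})      # one status value: one contiguous, ordered run
multi = end_times({2, 3})    # two status values: two ordered runs back to back

print(is_sorted(single))  # True
print(is_sorted(multi))   # False
```

With one status value the index range is a single run ordered by end_time, so it can be read backwards for ORDER BY end_time DESC. With two values the runs are concatenated, not interleaved, so an extra sort is needed.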
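Optimization
The table has the index idx_id_status_endTime (env_id, status, end_time). If the IN list contains only one value, the rows that pass the WHERE filter through this index are already ordered by end_time, so the sort can be avoided. For example: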
mysql> DESC SELECT job_id FROM scout_job WHERE env_id = '393684' and status in (2) ORDER by end_time desc limit 2 \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: scout_job
partitions: NULL
type: ref
possible_keys: idx_envid,idx_id_status_endTime
key: idx_id_status_endTime
key_len: 46
ref: const,const
rows: 34002
filtered: 100.00
Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)
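Compared with the plan for the multi-value IN, the rows estimate for the single-value IN is still large, but the plan now uses idx_id_status_endTime and Using filesort is gone from the Extra column. rows is only an estimate of the matching index entries; since the index already returns them in end_time order, the scan can stop as soon as LIMIT 2 is satisfied.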
Check the time breakdown with profiling:
+--------------------------+----------+----------+------------+--------------+---------------+-------+
| Status | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out | Swaps |
+--------------------------+----------+----------+------------+--------------+---------------+-------+
| starting | 0.000066 | NULL | NULL | NULL | NULL | NULL |
| checking permissions | 0.000005 | NULL | NULL | NULL | NULL | NULL |
| Opening tables | 0.000013 | NULL | NULL | NULL | NULL | NULL |
| init | 0.000028 | NULL | NULL | NULL | NULL | NULL |
| System lock | 0.000007 | NULL | NULL | NULL | NULL | NULL |
| optimizing | 0.000013 | NULL | NULL | NULL | NULL | NULL |
| statistics | 0.000126 | NULL | NULL | NULL | NULL | NULL |
| preparing | 0.000016 | NULL | NULL | NULL | NULL | NULL |
| Sorting result | 0.000003 | NULL | NULL | NULL | NULL | NULL |
| executing | 0.000002 | NULL | NULL | NULL | NULL | NULL |
| Sending data | 0.000039 | NULL | NULL | NULL | NULL | NULL |
| innobase_commit_low (-1) | 0.000004 | NULL | NULL | NULL | NULL | NULL |
| end | 0.000002 | NULL | NULL | NULL | NULL | NULL |
| query end | 0.000009 | NULL | NULL | NULL | NULL | NULL |
| innobase_commit_low (-1) | 0.000005 | NULL | NULL | NULL | NULL | NULL |
| closing tables | 0.000005 | NULL | NULL | NULL | NULL | NULL |
| freeing items | 0.000022 | NULL | NULL | NULL | NULL | NULL |
| cleaning up | 0.000011 | NULL | NULL | NULL | NULL | NULL |
+--------------------------+----------+----------+------------+--------------+---------------+-------+
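The Creating sort index stage, which took 208 ms, has been eliminated, and the query drops from about 208 ms to well under 1 ms (the profiled stages now sum to roughly 0.4 ms).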
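The comparison can be made concrete by summing the Duration columns of the two profiles. A small sketch follows, with the values copied from the two SHOW PROFILE outputs above and converted to microseconds. The variable names are only illustrative, and the sums cover the profiled stages only, not client round-trip time.

```python
# Stage durations in microseconds, in the order listed by SHOW PROFILE.
multi_value_in = [65, 5, 14, 31, 8, 11, 156, 19, 4, 2, 5,
                  208818,  # Creating sort index
                  11, 5, 16, 8, 11, 33, 17]
single_value_in = [66, 5, 13, 28, 7, 13, 126, 16, 3, 2, 39,
                   4, 2, 9, 5, 5, 22, 11]

total_before = sum(multi_value_in)
total_after = sum(single_value_in)
sort_share = round(100 * 208818 / total_before, 1)

print(total_before)  # 209239 microseconds, about 209 ms
print(sort_share)    # 99.8 percent spent in Creating sort index
print(total_after)   # 376 microseconds, about 0.4 ms
```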
When IN contains multiple values, the SQL can be rewritten to get the same effect:
# SQL before the rewrite:
SELECT job_id FROM scout_job WHERE env_id = '393684' and status in (2,3) ORDER by end_time desc limit 2
# SQL after the rewrite:
SELECT job_id FROM (
SELECT * FROM (SELECT job_id, end_time FROM scout_job WHERE env_id = '393684' AND STATUS IN (2) ORDER BY end_time DESC LIMIT 2) AS T2
UNION
SELECT * FROM (SELECT job_id, end_time FROM scout_job WHERE env_id = '393684' AND STATUS IN (3) ORDER BY end_time DESC LIMIT 2) AS T3
) AS T1 ORDER BY end_time DESC LIMIT 2
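MySQL does not allow an ORDER BY ... LIMIT clause to sit directly on an individual SELECT inside a UNION, so each branch is wrapped in a derived table here. (Enclosing each SELECT in parentheses is the other documented way to give a branch its own ORDER BY and LIMIT.)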
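For more than two status values the rewritten statement is tedious to write by hand, so it can be generated. The following is a minimal sketch, not the author's code: build_topk_union_sql is a hypothetical helper, the %s placeholders assume a DB-API driver with that parameter style, and it uses UNION ALL on the assumption that the branches are disjoint (distinct status values, job_id is the primary key), so no de-duplication is needed.

```python
def build_topk_union_sql(env_id, statuses, limit=2):
    """Return (sql, params) for a per-status top-N rewrite of the IN query.

    Hypothetical helper for illustration only.
    """
    statuses = sorted(set(statuses))  # distinct values keep the branches disjoint
    if not statuses:
        raise ValueError("at least one status value is required")
    limit = int(limit)  # inlined into the SQL text, so force an integer

    branches = []
    params = []
    for n, status in enumerate(statuses):
        # Each branch is an equality lookup that can read the composite index in order.
        branches.append(
            "SELECT * FROM (SELECT job_id, end_time FROM scout_job"
            " WHERE env_id = %s AND status = %s"
            f" ORDER BY end_time DESC LIMIT {limit}) AS t{n}"
        )
        params += [env_id, status]

    sql = (
        "SELECT job_id FROM (\n  "
        + "\n  UNION ALL\n  ".join(branches)
        + f"\n) AS merged ORDER BY end_time DESC LIMIT {limit}"
    )
    return sql, params


sql, params = build_topk_union_sql("393684", [2, 3])
print(sql)
print(params)  # ['393684', 2, '393684', 3]
```

The outer sort only ever sees limit rows per status value, so it stays cheap no matter how many rows match each branch.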
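If the IN list holds many values, the rewritten SQL gets rather "complex"; in that case, consider handling it in the application instead, turning the IN into a series of equality lookups.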
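One way the application-side variant might look is sketched below. It assumes a hypothetical callable fetch_latest(status, k) that runs the equality query (status = ? ORDER BY end_time DESC LIMIT k) and returns (job_id, end_time) rows, newest first. The per-status results are then merged and cut to k rows in the application.

```python
import heapq
from datetime import datetime
from itertools import islice


def latest_jobs(fetch_latest, statuses, k=2):
    """Merge per-status top-k results into the overall top-k job ids.

    fetch_latest is a hypothetical callable; each result must already be
    ordered by end_time descending, as the equality query returns it.
    """
    per_status = [fetch_latest(status, k) for status in statuses]
    # heapq.merge with reverse=True expects inputs sorted from largest to smallest.
    merged = heapq.merge(*per_status, key=lambda row: row[1], reverse=True)
    return [job_id for job_id, _end_time in islice(merged, k)]


# Stand-in for the database, using made-up rows for illustration.
fake_rows = {
    2: [(10, datetime(2020, 1, 5)), (7, datetime(2020, 1, 2))],
    3: [(9, datetime(2020, 1, 4)), (8, datetime(2020, 1, 3))],
}


def fake_fetch(status, k):
    return fake_rows[status][:k]


print(latest_jobs(fake_fetch, [2, 3], k=2))  # [10, 9]
```

This costs one round trip per status value, so it trades a simpler SQL statement for more queries. Whether that is worthwhile depends on how many values the IN list has.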