This is an English translation of a Japanese blog. Some content may not be fully translated.
AWS

Trying QPM (Query Plan Management) in Aurora (PostgreSQL)

What is QPM (Query Plan Management)?


A feature that records execution plans for SQL, then either fixes those execution plans or migrates them to another environment for fixation. This is equivalent to Oracle’s SPM (SQL Plan Management). Since Aurora (PostgreSQL) also supports this feature, let’s actually try it.

How to Control Query Plan Management
rds.enable_plan_management = { 0,1} # Set in parameter group

apg_plan_mgmt.capture_plan_baselines = { manual, automatic, off }
# automatic : Automatic plan capture. Captures SQL executed one or more times
# manual    : Manually capture execution plans for individual SQL
# off       : Disable Query Plan capture

apg_plan_mgmt.use_plan_baselines = { true, false }
# true  : Use captured Query Plan to fix execution plan
# false : Do not use captured Query Plan

Environment and Scenario Description


Assuming migration of execution plans from development environment to production environment.

  • Production DB name: prod
  • Development DB name: dev

Migrate the full scan (seq scan) execution plan from the development environment to the production environment to reproduce an inefficient execution plan.

Modify Parameter Group (dev/prod environments)


Set rds.enable_plan_management in the cluster parameter group to 1

Enable Extension (dev/prod environments)


Install apg_plan_mgmt using the create extension command

Command
\dx
create extension apg_plan_mgmt;
\dx
Execution Result
# Execution result
dev=> \dx
                 List of installed extensions
  Name   | Version |   Schema   |         Description
---------+---------+------------+------------------------------
 plpgsql | 1.0     | pg_catalog | PL/pgSQL procedural language
(1 row)
dev=>
dev=> create extension apg_plan_mgmt;    # Enable extension
CREATE EXTENSION
dev=> \dx
                                        List of installed extensions
     Name      | Version |    Schema     |                            Description

---------------+---------+---------------+--------------------------------------------------
-----------------
 apg_plan_mgmt | 1.0.1   | apg_plan_mgmt | Amazon Aurora with PostgreSQL compatibility Query # ★← apg_plan_mgmt added
 Plan Management
 plpgsql       | 1.0     | pg_catalog    | PL/pgSQL procedural language
(2 rows)
dev=>
dev=>

Pre-check dba_plans Table (dev/prod environments)


The apg_plan_mgmt.dba_plans table has nothing stored in it yet.

Command
SELECT sql_hash, plan_hash, status, enabled, sql_text::varchar(100)
FROM   apg_plan_mgmt.dba_plans
ORDER BY sql_text, plan_created;
Execution Result
dev=> SELECT sql_hash, plan_hash, status, enabled, sql_text::varchar(100)
dev-> FROM   apg_plan_mgmt.dba_plans
dev-> ORDER BY sql_text, plan_created;
 sql_hash | plan_hash | status | enabled | sql_text
----------+-----------+--------+---------+----------
(0 rows)

Create Test Table (dev/prod environments)


Create a test table

Command
CREATE TABLE t1 AS
SELECT num                         a -- generated value
      ,'1'                         b -- constant
      ,to_char(num,'FM00000')      c -- use generated value (string like ID)
      ,current_timestamp      d
FROM   generate_series(1,10000) num
;

ALTER TABLE t1 ADD PRIMARY KEY(a);

\d t1
Execution Result
dev=> CREATE TABLE t1 AS
dev-> SELECT num                         a -- generated value
dev->       ,'1'                         b -- constant
dev->       ,to_char(num,'FM00000')      c -- use generated value (string like ID)
dev->       ,current_timestamp      d
dev-> FROM   generate_series(1,10000) num
dev-> ;
SELECT 10000
dev=>
dev=> ALTER TABLE t1 ADD PRIMARY KEY(a);
ALTER TABLE
dev=>
dev=> \d t1
                         Table "public.t1"
 Column |           Type           | Collation | Nullable | Default
--------+--------------------------+-----------+----------+---------
 a      | integer                  |           | not null |
 b      | text                     |           |          |
 c      | text                     |           |          |
 d      | timestamp with time zone |           |          |
Indexes:
    "t1_pkey" PRIMARY KEY, btree (a)
dev=>

Execute SQL and Capture (dev environment)


Execute SQL that results in full scan (seq scan) and SQL that results in Index Scan. At this time, manually capture only the execution plan for the SQL that results in full scan (seq scan).

In this example, you can see that “Plan Hash: -52705300” has been captured in the apg_plan_mgmt.dba_plans table.

Command
SET apg_plan_mgmt.capture_plan_baselines = manual; # Enable manual execution plan capture

explain (hashes true) select count(*) from t1 where a > 1;

SET apg_plan_mgmt.capture_plan_baselines = off;  # Stop manual execution plan capture

explain (hashes true) select count(*) from t1 where a > 9999;

SET apg_plan_mgmt.capture_plan_baselines = off;

SELECT sql_hash, plan_hash, status, enabled, sql_text::varchar(100)
FROM   apg_plan_mgmt.dba_plans
ORDER BY sql_text, plan_created;
Execution Result
# Execution log
dev=> SET apg_plan_mgmt.capture_plan_baselines = manual;
SET
dev=>
dev=> explain (hashes true) select count(*) from t1 where a > 1;
                          QUERY PLAN
--------------------------------------------------------------
 Aggregate  (cost=214.00..214.01 rows=1 width=8)
   ->  Seq Scan on t1  (cost=0.00..189.00 rows=10000 width=0)
         Filter: (a > 1)
 SQL Hash: -1895827505, Plan Hash: -52705300
(4 rows)
dev=>
dev=>
dev=> SET apg_plan_mgmt.capture_plan_baselines = off;
SET
dev=>
dev=>
dev=>
dev=> explain (hashes true) select count(*) from t1 where a > 9999;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Aggregate  (cost=8.30..8.31 rows=1 width=8)
   ->  Index Only Scan using t1_pkey on t1  (cost=0.29..8.30 rows=1 width=0)
         Index Cond: (a > 9999)
 Plan Hash: 1222455040
(4 rows)
dev=>
dev=> SELECT sql_hash, plan_hash, status, enabled, sql_text::varchar(100)
dev-> FROM   apg_plan_mgmt.dba_plans
dev-> ORDER BY sql_text, plan_created;
  sql_hash   | plan_hash |  status  | enabled |               sql_text
-------------+-----------+----------+---------+--------------------------------------
 -1895827505 | -52705300 | Approved | t       | select count(*) from t1 where a > 1; # ★← execution plan has been captured
(1 row)
dev=>
dev=>
dev=>

Create Export Table (dev)


CREATE TABLE plans_copy AS SELECT * FROM apg_plan_mgmt.plans;

Export Using pg_dump (dev)


export PGPASSWORD=xxxxxx
pg_dump -U postgres -h aurora-postgres-cluster-dev.cluster-xxxxxxxx.ap-northeast-1.rds.amazonaws.com -t plans_copy -Ft dev > plans_copy.tar

# Delete the intermediate table if no longer needed
--DROP TABLE plans_copy;

Import to Another DB (prod)


pg_restore -U postgres -h aurora-postgres-cluster-prod.cluster-xxxxxxxxx.ap-northeast-1.rds.amazonaws.com -t plans_copy -Ft plans_copy.tar -d prod

Verify Import and Merge (prod)


Perform execution plan merge with reference to this manual section.

Maintaining Execution Plans - Amazon Aurora https://docs.aws.amazon.com/ja_jp/AmazonRDS/latest/AuroraUserGuide/AuroraPostgreSQL.Optimize.Maintenance.html#AuroraPostgreSQL.Optimize.Maintenance.ExportingImporting

Exporting and importing plans

Command
INSERT INTO apg_plan_mgmt.plans SELECT * FROM plans_copy
 ON CONFLICT ON CONSTRAINT plans_pkey
 DO UPDATE SET
 status = EXCLUDED.status,
 enabled = EXCLUDED.enabled,
 last_used = CASE WHEN EXCLUDED.last_used > plans.last_used
 THEN EXCLUDED.last_used ELSE plans.last_used END,
 estimated_startup_cost = EXCLUDED.estimated_startup_cost,
 estimated_total_cost = EXCLUDED.estimated_total_cost,
 planning_time_ms = EXCLUDED.planning_time_ms,
 execution_time_ms = EXCLUDED.execution_time_ms,
-- estimated_rows = EXCLUDED.estimated_rows,
-- actual_rows = EXCLUDED.actual_rows,
 total_time_benefit_ms = EXCLUDED.total_time_benefit_ms,
 execution_time_benefit_ms = EXCLUDED.execution_time_benefit_ms;

# Reload management plans to shared memory
SELECT apg_plan_mgmt.reload(); -- refresh shared memory
DROP TABLE plans_copy;

-- Verify
SELECT sql_hash, plan_hash, status, enabled, sql_text::varchar(100)
FROM   apg_plan_mgmt.dba_plans
ORDER BY sql_text, plan_created;

Note that estimated_rows and actual_rows need to be commented out. It appears these columns don’t actually exist. Manual error?

ERROR:  column excluded.actual_rows does not exist
LINE 13:  actual_rows = EXCLUDED.actual_rows,
Execution Result
prod=> INSERT INTO apg_plan_mgmt.plans SELECT * FROM plans_copy
prod->  ON CONFLICT ON CONSTRAINT plans_pkey
prod->  DO UPDATE SET
prod->  status = EXCLUDED.status,
prod->  enabled = EXCLUDED.enabled,
prod->  last_used = CASE WHEN EXCLUDED.last_used > plans.last_used
prod->  THEN EXCLUDED.last_used ELSE plans.last_used END,
prod->  estimated_startup_cost = EXCLUDED.estimated_startup_cost,
prod->  estimated_total_cost = EXCLUDED.estimated_total_cost,
prod->  planning_time_ms = EXCLUDED.planning_time_ms,
prod->  execution_time_ms = EXCLUDED.execution_time_ms,
prod-> -- estimated_rows = EXCLUDED.estimated_rows,
prod-> -- actual_rows = EXCLUDED.actual_rows,
prod->  total_time_benefit_ms = EXCLUDED.total_time_benefit_ms,
prod->  execution_time_benefit_ms = EXCLUDED.execution_time_benefit_ms;
INSERT 0 2
prod=>
prod=>
prod=> SELECT apg_plan_mgmt.reload();
 reload
--------
      0
(1 row)
prod=>
prod=>
prod=> DROP TABLE plans_copy;
DROP TABLE
prod=>
prod=>
prod=>
prod=>
prod=> SELECT sql_hash, plan_hash, status, enabled, sql_text::varchar(100)
prod-> FROM   apg_plan_mgmt.dba_plans
prod-> ORDER BY sql_text, plan_created;
  sql_hash   | plan_hash  |   status   | enabled |                sql_text
-------------+------------+------------+---------+-----------------------------------------
 -1895827505 |  -52705300 | Approved   | t       | select count(*) from t1 where a > 1;
(2 rows)
prod=>
prod=>
prod=>

Execute SQL

In this example, we are forcing a fullscan where an index scan would be selected.

Command
-- SQL test
SET apg_plan_mgmt.use_plan_baselines = true;
explain (hashes true) select count(*) from t1 where a > 9999;
SET apg_plan_mgmt.use_plan_baselines = false;
Execution Result
prod=> SET apg_plan_mgmt.use_plan_baselines = true;
prod=>
prod=> explain (hashes true) select count(*) from t1 where a > 9999;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Aggregate  (cost=189.00..189.01 rows=1 width=8)
   ->  Seq Scan on t1  (cost=0.00..189.00 rows=1 width=0)
         Filter: (a > 9999)
 Note: An Approved plan was used instead of the minimum cost plan.
 SQL Hash: -1895827505, Plan Hash: -52705300, Minimum Cost Plan Hash: 1222455040
(5 rows)
prod=> SET apg_plan_mgmt.use_plan_baselines = false;
SET

In this example, the WHERE clause is filtering down the data, so the index scan is the correct execution plan. However, QPM has captured the execution plan for select count(*) from t1 where a > 1;, which results in a Full scan (Seq Scan). This is a pattern where we deliberately selected an inefficient execution plan.

The log also indicates Note: An Approved plan was used instead of the minimum cost plan. - saying that there is a plan with better cost.

prod=> explain (hashes true) select count(*) from t1 where a > 9999;

QUERY PLAN

Aggregate (cost=189.00..189.01 rows=1 width=8) -> Seq Scan on t1 (cost=0.00..189.00 rows=1 width=0) Filter: (a > 9999) Note: An Approved plan was used instead of the minimum cost plan. SQL Hash: -1895827505, Plan Hash: -52705300, Minimum Cost Plan Hash: 1222455040

References


Maintaining Execution Plans - Amazon Aurora https://docs.aws.amazon.com/ja_jp/AmazonRDS/latest/AuroraUserGuide/AuroraPostgreSQL.Optimize.Maintenance.html

Use Cases for Query Plan Management in Amazon Aurora PostgreSQL | Amazon Web Services Blog https://aws.amazon.com/jp/blogs/news/use-cases-for-query-plan-management-in-amazon-aurora-postgresql/

Suggest an edit on GitHub