A planner regression is when a PostgreSQL upgrade — including a minor version patch like 16.1 → 16.4 or a major version upgrade from 15 to 16 — causes a query that previously ran fast to suddenly become slow. The query hasn't changed. The data hasn't changed significantly. But PostgreSQL's query planner is now choosing a different execution plan.
These are among the most disorienting performance problems on RDS because they appear suddenly, correlate with maintenance events, and are invisible to standard monitoring until queries are already timing out in production.
Why planner regressions happen on RDS minor version upgrades
RDS applies minor version patches during maintenance windows. Between minor versions, PostgreSQL ships planner cost model adjustments, statistics collection improvements, and occasionally new join strategy heuristics. Any of these can cause the planner to re-evaluate a query and choose a different plan than before.
Common regression triggers on PostgreSQL 16, 17, and 18:
- Statistics reset after upgrade: RDS minor version upgrades may clear the pg_statistic table. The planner falls back to default row estimates until ANALYZE runs — which may choose suboptimal plans on tables with skewed data distributions.
- enable_memoize (PG14+): Memoisation of inner-loop results was introduced
in PostgreSQL 14. On certain join patterns, memoize can be slower than the planner
predicts. It can be disabled per-session with
SET enable_memoize = off. - Parallel query threshold changes: PostgreSQL 16 and 17 adjusted the cost thresholds for parallel query initiation. A query that previously ran in a single backend process may now spawn parallel workers — adding overhead on small result sets.
- JIT compilation defaults: JIT was enabled by default from PostgreSQL 12.
On short-running OLTP queries, JIT compilation overhead exceeds execution time savings.
Check
SHOW jitand considerSET jit = offfor OLTP workloads.
Step 1 — Identify regressed queries
Compare pg_stat_statements before and after a maintenance event to surface regressions:
-- Queries where average execution time is significantly higher than stddev suggests -- (indicating a plan change rather than data volume change): SELECT left(query, 80) AS query_snippet, calls, round(mean_exec_time::numeric, 2) AS avg_ms, round(stddev_exec_time::numeric, 2) AS stddev_ms, round(max_exec_time::numeric, 2) AS max_ms FROM pg_stat_statements WHERE calls > 50 AND mean_exec_time > 100 -- queries averaging over 100ms ORDER BY mean_exec_time DESC LIMIT 25;
Use CloudWatch QueryExecutionTime via Performance Insights to spot queries that changed cost profile around the time of the maintenance event. A step-change in average execution time for a specific query fingerprint is the signature of a plan change.
Step 2 — Confirm a plan change with EXPLAIN
Run EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) on the suspect query and compare
the plan to what you'd expect. Look for:
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) SELECT ... -- your suspect query here;
Key signals of a regression:
Seq Scanon a large table where you'd expect an index scan- Estimated rows wildly different from actual rows (planner is guessing wrong)
Hash Joinreplaced byNested Loopon large tables- High
Buffers: shared read(disk reads) where previous plan was all-cache
Step 3 — Fix without rewriting the query
Before rewriting queries or adding hints (PostgreSQL doesn't have native hints), try these in order:
Run ANALYZE to refresh statistics:
ANALYZE VERBOSE schema_name.table_name; -- Or analyze the whole database (takes time on large schemas): ANALYZE VERBOSE;
Disable JIT for the regressed query pattern:
-- At session level (useful for testing): SET jit = off; -- In the parameter group for all connections: jit = off -- set in RDS parameter group
Disable memoize if the regression is on a nested-loop join:
SET enable_memoize = off;
Force a specific join strategy for a session:
SET enable_hashjoin = off; -- force nested loop or merge join SET enable_nestloop = off; -- force hash or merge join SET enable_seqscan = off; -- force index usage (use with caution)
PostgreSQL 16 and 17 specific changes to be aware of
PostgreSQL 16 introduced incremental sort improvements and changed the cost model for merge joins in certain cases. PostgreSQL 17 refined the planner's handling of parameterised nested loops and adjusted partition pruning logic. Both versions improved statistics for range types and multi-column statistics — but the improvements assume fresh statistics. Running ANALYZE immediately after any minor version upgrade is good practice on RDS.
Planner regressions need a before/after comparison that most teams don't have.
PGFlare's Diagnostic session uses
pg_stat_statements
and Performance Insights data to identify regressed query fingerprints, produce before/after
EXPLAIN
comparisons, and recommend targeted fixes — including parameter group changes, statistics
updates,
and index additions — with the lowest-risk mitigation path first.