EXPLAIN ANALYZE is the definitive tool for understanding why a PostgreSQL query
is slow. Unlike EXPLAIN alone (which shows the planner's estimate), EXPLAIN
ANALYZE actually executes the query and shows both the estimated and actual costs —
making it possible to identify exactly where the planner's model diverges from reality.
On AWS RDS PostgreSQL 16, 17, and 18, the same tool works identically to community PostgreSQL,
with a few RDS-specific considerations.
SELECT statements this
is safe. On INSERT, UPDATE, and DELETE, the statement
runs and modifies data. Wrap in a transaction and rollback if you don't want changes committed:
BEGIN; EXPLAIN ANALYZE UPDATE ...; ROLLBACK;
The full diagnostic form
For production diagnosis, always use all four options:
EXPLAIN ( ANALYZE, -- actually execute and show real timings BUFFERS, -- show shared buffer hits/misses (disk vs cache) VERBOSE, -- show output columns and schema-qualified names FORMAT TEXT -- TEXT is readable; JSON is machine-parseable ) SELECT ... -- your query;
Reading the output — what each line means
A typical plan node looks like:
Index Scan using orders_user_id_idx on orders (cost=0.43..8.45 rows=1 width=72) (actual time=0.032..0.033 rows=1 loops=1) Index Cond: (user_id = 12345) Buffers: shared hit=3
| Field | Meaning |
|---|---|
cost=0.43..8.45 |
Planner's estimated startup cost .. total cost (arbitrary units). Not wall-clock time. |
rows=1 (estimate) |
Planner's estimate of rows returned. Divergence from actual is the key signal. |
actual time=0.032..0.033 |
Real startup ms .. total ms for this node. Cumulative — includes child nodes. |
rows=1 (actual) |
Rows actually returned. Compare to estimate — large gaps mean stale statistics. |
loops=1 |
How many times this node executed. In nested loops, multiply time × loops for true cost. |
shared hit=3 |
Pages served from shared_buffers (fast, no disk I/O). |
shared read=N |
Pages read from disk (slow). High values = table/index not in cache on RDS. |
The most important signal: estimated vs actual rows
When the planner estimates 1 row but the query actually returns 50,000 — or vice versa — it will choose the wrong execution strategy. This is the root cause of most unexplained query slowdowns on RDS.
The fix is almost always ANALYZE on the relevant table(s) to update statistics.
For highly skewed distributions (e.g. a status column where 99% of rows have
status = 'completed'), increase the statistics target on that column:
-- Default statistics target is 100. Increase for skewed columns: ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500; -- Then refresh: ANALYZE orders;
Spotting a Seq Scan that should be an Index Scan
A Seq Scan on a large table (millions of rows) where you'd expect an index
scan is a common finding. Causes:
- No index exists on the filter column(s)
- The planner estimates the query returns a large fraction of the table (sequential is faster than random index I/O above ~10% of table size)
- The index exists but is bloated — check with
pgstattuple - Implicit type cast prevents index use (e.g.
WHERE id = '12345'on an integer column)
Using auto_explain on RDS to capture production plans
auto_explain logs query plans automatically for queries exceeding a time
threshold — without requiring developer intervention. Enable it in the RDS parameter group
(requires a reboot to add to shared_preload_libraries):
# In RDS parameter group: shared_preload_libraries = pg_stat_statements,auto_explain auto_explain.log_min_duration = 1000 # log plans for queries taking >1s auto_explain.log_analyze = on # include actual timings (executes query) auto_explain.log_buffers = on # include buffer statistics auto_explain.log_nested_statements = on # capture plans inside functions
Plans appear in the RDS PostgreSQL log (available in CloudWatch Logs if log export is
enabled). Search for duration: followed by Query Text: and
the QUERY PLAN block.
log_min_duration high enough (1000–5000ms) to avoid logging the majority
of fast queries, or use auto_explain.sample_rate (PG12+) to log a
percentage of all queries rather than only slow ones.
Reading one plan is easy. Reading fifty of them systematically is where the value
is.
PGFlare's Diagnostic session analyses
pg_stat_statements output to identify your highest-impact queries, runs EXPLAIN ANALYZE
on each, and delivers a prioritised remediation plan — query rewrites, index additions,
and statistics fixes — with expected impact estimates per change.