Client Outcomes

What PGFlare delivers in practice

Real results from engineering teams across FinTech, HealthTech, and e-commerce — from emergency incident response to structured cost reduction programmes.

Client details are anonymised by mutual agreement. Instance classes, cost figures, and timelines are accurate as reported at project close. Specific metrics available under NDA on request.

FinTech & Payments

UK Payment Processor Cuts RDS Bill by 65%

Diagnostic + Remediation+ session pair on a db.r5.4xlarge Multi-AZ PostgreSQL 16.9 cluster

65% RDS cost reduction
14× Query speedup (P95)
£3.1k Monthly saving

Challenge

Transaction reporting queries were hitting 8–12 second latency during peak hours. The team had scaled the instance up twice in six months (from r5.2xlarge to r5.4xlarge) without improvement. AWS support recommended moving to r5.8xlarge at an additional £2,400/month.

What PGFlare Found

  • Three sequential scans on a 180 GB transactions table (missing composite index) — see the diagnostic sketch below
  • Autovacuum dead tuple threshold 4× too high — 38 GB of table bloat
  • No connection pooling — 900+ direct connections exhausting connection slots and per-backend memory
  • Two N+1 query patterns in the ORM layer (87,000 queries per report run)
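
Findings like the first two usually surface from PostgreSQL's own statistics views rather than anything client-specific. A minimal sketch of the kind of check involved, using only the standard pg_stat_user_tables view:

    -- Tables ranked by rows read via sequential scans; a large table that is
    -- repeatedly seq-scanned is the classic signature of a missing index
    SELECT relname,
           seq_scan,
           seq_tup_read,
           idx_scan,
           pg_size_pretty(pg_relation_size(relid)) AS table_size
    FROM pg_stat_user_tables
    ORDER BY seq_tup_read DESC
    LIMIT 10;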

Results Timeline

Day 1 — Diagnostic: findings report delivered. 12 priority issues ranked by impact. Immediate: indexes created CONCURRENTLY — query time 8s → 1.4s
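
The exact index definition is not public; a sketch of what an online build of that kind looks like, with hypothetical column names on the transactions table mentioned above:

    -- CONCURRENTLY builds without blocking writes (slower, and it cannot run
    -- inside a transaction block); (account_id, created_at) are illustrative
    -- columns, not the client's actual schema
    CREATE INDEX CONCURRENTLY idx_transactions_account_created
        ON transactions (account_id, created_at);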

Day 2 — Remediation+: autovacuum tuned, bloat cleared, PgBouncer configured. CPU utilisation: 78% → 22%

Week 3 — Downsize to db.r5.xlarge completed. SLA maintained. Monthly bill: £4,820 → £1,690

"We'd been told the instance was too small. PGFlare showed us in one day that the queries were the problem, not the hardware. We saved more in the first month than the entire engagement cost."
— CTO, UK Payment Processor (anonymised)

HealthTech

P1 Database Incident Resolved: 4-Hour Emergency Response

Emergency response + follow-up Diagnostic on a db.m5.2xlarge PostgreSQL 15.6 patient records system

4h Time to resolution
0 Data loss
47% RDS cost reduction

Challenge

At 07:42 on a Tuesday, all write operations began failing. A patient scheduling module serving 12 NHS-connected clinics went dark. The engineering team had no PostgreSQL specialist in-house and couldn't identify the root cause. Escalation to the ICB (Integrated Care Board) was expected within two hours.

What PGFlare Found

  • Long-running VACUUM blocking autovacuum — table bloat at 180% of live data
  • Lock chain: one stale connection holding an AccessExclusiveLock for 4.2 hours (see the blocking-locks query below)
  • pg_toast table grown to 94 GB (large JSONB documents pushed out of line into TOAST storage)
  • Dead connection accumulation exhausting the max_connections limit (set to 100)
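
The lock chain in the second finding is visible directly from the standard catalogue views; a minimal sketch of the kind of query used to walk it (nothing here is client-specific):

    -- Each blocked session paired with the session blocking it, plus how long
    -- the blocking transaction has been open
    SELECT blocked.pid                 AS blocked_pid,
           blocked.query               AS blocked_query,
           blocking.pid                AS blocking_pid,
           blocking.state              AS blocking_state,
           now() - blocking.xact_start AS blocking_xact_age
    FROM pg_stat_activity AS blocked
    JOIN pg_stat_activity AS blocking
      ON blocking.pid = ANY (pg_blocking_pids(blocked.pid))
    WHERE cardinality(pg_blocking_pids(blocked.pid)) > 0;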

Results Timeline

08:15 — PGFlare emergency engaged. IAM access granted. Root cause identified within 22 minutes

09:40 — Writes restored after controlled lock termination + connection limit tuning. JSONB overflow fix deployed without schema change
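
"Controlled lock termination" means ending the one stale backend rather than failing over or restarting the instance. A sketch, assuming the blocking PID came from a query like the one above (12345 is a placeholder, not a PID from the incident):

    -- Ask the backend to cancel its current statement first...
    SELECT pg_cancel_backend(12345);
    -- ...and terminate it outright if the AccessExclusiveLock is still held
    SELECT pg_terminate_backend(12345);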

Following week — Full Diagnostic session: structural fixes implemented. Instance right-sized from m5.2xlarge to m5.large by week three

"We were staring at a regulatory incident. PGFlare was on the call within minutes, had a diagnosis in 22 minutes, and writes were back in under 2 hours. That's exactly what you need at 8am on a Tuesday."
— Lead Platform Engineer, HealthTech SaaS (anonymised)

E-Commerce & Retail

Black Friday Prep: Autovacuum Bloat Crisis Averted

Diagnostic session on a db.r6g.2xlarge PostgreSQL 16.9 order management system — 6 weeks before peak

83% Table bloat reduction
11× Checkout query speedup
£1.8k Monthly saving

Challenge

Six weeks before Black Friday, checkout query P99 latency had grown from 380ms the prior year to 2.1 seconds. A Grafana alert showed autovacuum running continuously but never catching up. Storage had grown by 180 GB in three months with no corresponding increase in data volume.

What PGFlare Found

  • Autovacuum cost_delay set to 20ms (the pre-PostgreSQL 12 default; the current default is 2ms) — far too slow for the write-heavy orders table
  • orders table: 74% dead tuples by row count (2.1 GB live, 8.7 GB dead)
  • Missing partial index on (status, created_at) — all active-order queries doing sequential scans
  • Three FK constraints with no backing indexes — UPDATE cascades triggering seq scans
  • pg_stat_user_tables showed no autovacuum runs completing cleanly in 14 days (see the check below)
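
The dead-tuple and autovacuum findings come straight from the standard statistics views; a minimal sketch of that check:

    -- Dead-tuple ratio and last completed autovacuum per table
    SELECT relname,
           n_live_tup,
           n_dead_tup,
           round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 1) AS dead_pct,
           last_autovacuum,
           autovacuum_count
    FROM pg_stat_user_tables
    ORDER BY n_dead_tup DESC
    LIMIT 10;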

Results Timeline

Day 1 — Diagnostic report delivered. Autovacuum tuned: cost_delay 0ms, vacuum_scale_factor reduced to 0.01. Bloat cleared within 6 hours
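
Both settings are per-table storage parameters rather than cluster-wide changes. A sketch of the form the change takes, using the values reported above on the orders table:

    -- Remove vacuum throttling and trigger autovacuum at roughly 1% dead rows,
    -- for this one write-heavy table only
    ALTER TABLE orders SET (
        autovacuum_vacuum_cost_delay   = 0,
        autovacuum_vacuum_scale_factor = 0.01
    );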

Day 1-2 — Partial index and FK indexes added CONCURRENTLY during low-traffic window. Checkout latency: 2.1s → 190ms P99
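
A sketch of the two kinds of index build described, assuming illustrative status values and a hypothetical order_items child table (the actual definitions were not published):

    -- Partial index covering only orders still in flight
    CREATE INDEX CONCURRENTLY idx_orders_active_status_created
        ON orders (status, created_at)
        WHERE status IN ('pending', 'processing');

    -- Backing index for one of the previously unindexed FK constraints
    CREATE INDEX CONCURRENTLY idx_order_items_order_id
        ON order_items (order_id);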

Black Friday — Record transaction volume processed without incident. CPU peaked at 34% vs. 89% the prior year

"PGFlare found things in one day that our team had been chasing for weeks. The autovacuum config alone halved our weekly storage growth. Black Friday went smoothly for the first time in three years."
— VP Engineering, UK E-Commerce Platform (anonymised)

Ready to see what PGFlare finds in your RDS instance?

Every engagement starts with a free 30-minute technical review — real analysis of your actual workload, no sales pitch.

View Pricing & Enquire →

Estimate Your Savings