PostgreSQL Performance for Developers — Indexes, EXPLAIN, and Common Mistakes
Practical PostgreSQL optimization techniques I've learned building production systems — from index strategy to query analysis to connection pooling.
Working on platforms with 70,000+ users taught me that most performance problems aren't application bugs — they're database problems. Here are the techniques I use to diagnose and fix them.
Start With EXPLAIN ANALYZE
Before touching any index, understand what the query planner is actually doing:
```sql
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.id, o.reference, o.status, c.name AS company_name
FROM orders o
JOIN companies c ON c.id = o.company_id
WHERE o.status = 'PENDING'
  AND o.company_id = '550e8400-e29b-41d4-a716-446655440000'
ORDER BY o.created_at DESC
LIMIT 20;
```

The output tells you:
- Seq Scan — reading the entire table. Bad on large tables.
- Index Scan — using an index. Good.
- Hash Join vs Nested Loop — the join strategy. Nested loop is better when one side is small.
- Rows — estimated vs. actual. A large discrepancy means stale statistics; run `ANALYZE`.
- Buffers — how much data was read from disk vs. cache. High disk reads call for caching or better indexes.
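When estimates look off, it helps to see when statistics were last refreshed. A quick check against `pg_stat_user_tables` (using the `orders` table from the example above):

```sql
-- When were this table's statistics last refreshed,
-- by a manual ANALYZE and by autovacuum's analyze?
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'orders';
```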
Index Strategy
The Most Valuable Index: (foreign_key, status)
Most queries filter by both a foreign key and a status column. A single-column index on each is almost never optimal; use a composite index:
```sql
-- Before: two separate indexes (inefficient)
CREATE INDEX idx_orders_company ON orders(company_id);
CREATE INDEX idx_orders_status ON orders(status);

-- After: one composite index (covers both filters)
CREATE INDEX idx_orders_company_status ON orders(company_id, status);

-- Even better if you always sort by created_at
CREATE INDEX idx_orders_company_status_date
  ON orders(company_id, status, created_at DESC);
```

Column order in composite indexes matters: put equality conditions first, range/sort conditions last.
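As an illustration of that ordering rule (a sketch against the hypothetical `orders` table, with `$1` standing for a bound parameter), only queries that constrain the leading columns can use the three-column index efficiently:

```sql
-- Served by idx_orders_company_status_date: equality on the leading
-- columns, and the index's sort order satisfies the ORDER BY
SELECT * FROM orders
WHERE company_id = $1 AND status = 'PENDING'
ORDER BY created_at DESC;

-- Not served efficiently: the leading column (company_id) is unconstrained
SELECT * FROM orders WHERE status = 'PENDING';
```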
Partial Indexes
If 95% of your queries target status = 'ACTIVE' rows:
```sql
-- Index only active records — much smaller, faster
CREATE INDEX idx_users_active ON users(email) WHERE status = 'ACTIVE';
```

This is dramatically more efficient than a full index when inactive records are the majority.
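A related pattern worth knowing (assuming the same hypothetical `users` schema): a partial index can also be unique, enforcing a constraint only on the rows that matter.

```sql
-- Enforce unique emails among active users only; deactivated rows
-- can keep their old email without violating the constraint
CREATE UNIQUE INDEX idx_users_active_email_unique
ON users(email) WHERE status = 'ACTIVE';
```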
GIN Indexes for Arrays and JSONB
If you store skills or tags as an array:
```sql
-- Without an index: Seq Scan on every query
SELECT * FROM jobs WHERE skills @> ARRAY['React'];

-- With a GIN index, the containment query above becomes an index scan
CREATE INDEX idx_jobs_skills ON jobs USING GIN(skills);
```

Note that the equivalent `'React' = ANY(skills)` form cannot use the index; GIN array indexes only support the containment and overlap operators (`@>`, `<@`, `&&`, `=`), so write the filter as `@>`.

For JSONB columns (common when migrating from MongoDB):

```sql
CREATE INDEX idx_profiles_meta ON profiles USING GIN(metadata jsonb_path_ops);
```

The `jsonb_path_ops` operator class produces a smaller, faster index than the default, but it only accelerates containment (`@>`) queries.

Connection Pooling Is Not Optional
PostgreSQL's default behavior: one connection = one backend process. At scale, opening a new connection per request will bring your database to its knees.
Use PgBouncer as a connection pooler:
```ini
# pgbouncer.ini
[databases]
myapp = host=127.0.0.1 port=5432 dbname=myapp

[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
```

With `pool_mode = transaction`, a server connection is only held for the duration of a transaction, so 1,000 concurrent app connections can share 25 real Postgres connections. This is how platforms with millions of users run on small databases. The trade-off: session-level state (`SET` commands, advisory locks, LISTEN/NOTIFY) doesn't carry across transactions, because consecutive transactions may land on different server connections.
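Once PgBouncer is running, its admin console can confirm pooling is actually working (connect to the special `pgbouncer` database with psql, then):

```sql
SHOW POOLS;   -- client connections active/waiting, server connections in use
SHOW STATS;   -- per-database query counts and traffic totals
```

If `cl_waiting` in `SHOW POOLS` stays high, the pool is undersized for your workload.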
With Prisma, point your DATABASE_URL at PgBouncer and add ?pgbouncer=true:
```
DATABASE_URL="postgresql://user:pass@pgbouncer:6432/myapp?pgbouncer=true&connection_limit=1"
```
N+1 Queries: The Silent Killer
The most common performance bug in ORM-heavy code:
```typescript
// BAD — 1 query for orders + N queries for companies
const orders = await prisma.order.findMany({ where: { status: 'PENDING' } })
for (const order of orders) {
  const company = await prisma.company.findUnique({ where: { id: order.companyId } })
  // ...
}

// GOOD — a constant number of queries: Prisma JOINs or batches the relation load
const orders = await prisma.order.findMany({
  where: { status: 'PENDING' },
  include: { company: true },
})
```

Always check your query logs in development. I add this to every Node.js project:
```typescript
import { PrismaClient } from '@prisma/client'

const prisma = new PrismaClient({
  log: process.env.NODE_ENV === 'development'
    ? ['query', 'warn', 'error']
    : ['warn', 'error'],
})
```

Vacuuming and Table Bloat
PostgreSQL uses MVCC — deleted rows aren't immediately removed, they become "dead tuples". Autovacuum handles this, but high-write tables can accumulate bloat.
Check table bloat:
```sql
SELECT
  schemaname,
  relname,
  n_dead_tup,
  n_live_tup,
  round(n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100, 2) AS dead_pct
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```

(Note the column is `relname`, not `tablename`, in `pg_stat_user_tables`.) If `dead_pct` is above 20%, tune autovacuum or run `VACUUM ANALYZE tablename` manually.
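Autovacuum can also be tuned per table. As a sketch, for a hypothetical high-churn `orders` table, these storage parameters trigger vacuum at roughly 5% dead tuples instead of the global 20% default:

```sql
-- Vacuum this table more aggressively than the global default
ALTER TABLE orders SET (
  autovacuum_vacuum_scale_factor = 0.05,  -- default is 0.2 (20%)
  autovacuum_vacuum_threshold = 1000      -- plus a flat floor of dead tuples
);
```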
Quick Wins Checklist
- Run `EXPLAIN ANALYZE` before adding any index
- Add composite indexes for (fk, status) pairs you filter on often
- Use PgBouncer in transaction mode in production
- Log slow queries: `log_min_duration_statement = 500` in postgresql.conf
- Set `work_mem` appropriately for sort-heavy queries (the 4MB default is often too low)
- Run `ANALYZE` after bulk inserts
- Use `RETURNING` to avoid an extra SELECT after INSERT/UPDATE
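That last item in one snippet (hypothetical `orders` columns, `$1` a bound parameter):

```sql
-- One round trip: update and read back the new values together,
-- instead of an UPDATE followed by a separate SELECT
UPDATE orders
SET status = 'SHIPPED', updated_at = now()
WHERE id = $1
RETURNING id, status, updated_at;
```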
The database is the foundation. Neglecting it means hitting invisible walls as you scale.