
PostgreSQL Performance for Developers — Indexes, EXPLAIN, and Common Mistakes

Practical PostgreSQL optimization techniques I've learned building production systems — from index strategy to query analysis to connection pooling.

Working on platforms with 70,000+ users taught me that most performance problems aren't application bugs — they're database problems. Here are the techniques I use to diagnose and fix them.

Start With EXPLAIN ANALYZE

Before touching any index, understand what the query planner is actually doing:

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.id, o.reference, o.status, c.name as company_name
FROM orders o
JOIN companies c ON c.id = o.company_id
WHERE o.status = 'PENDING'
  AND o.company_id = '550e8400-e29b-41d4-a716-446655440000'
ORDER BY o.created_at DESC
LIMIT 20;

The output tells you:

  • Seq Scan — reading the entire table. Bad on large tables.
  • Index Scan — using an index. Good.
  • Hash Join vs Nested Loop — the join strategy. Nested loop is better when one side is small.
  • Rows — estimated vs actual. Large discrepancy = stale statistics, run ANALYZE.
  • Buffers — how much data was read from disk vs cache. High disk reads = needs caching or better indexes.
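For the query above, a healthy plan might look roughly like this (illustrative output, assuming a composite index on (company_id, status, created_at DESC); real costs, row counts, and timings will differ):

```text
Limit  (cost=0.42..25.01 rows=20 width=72) (actual time=0.030..0.210 rows=20 loops=1)
  Buffers: shared hit=112
  ->  Nested Loop  (cost=0.42..610.55 rows=494 width=72) (actual time=0.029..0.204 rows=20 loops=1)
        ->  Index Scan using idx_orders_company_status_date on orders o
                (actual time=0.020..0.110 rows=20 loops=1)
              Index Cond: ((company_id = '550e8400-e29b-41d4-a716-446655440000'::uuid) AND (status = 'PENDING'))
        ->  Index Scan using companies_pkey on companies c
                (actual time=0.004..0.004 rows=1 loops=20)
Planning Time: 0.210 ms
Execution Time: 0.260 ms
```

Two things worth noticing: there is no Sort node, because the index already returns rows in created_at DESC order, and "shared hit" means every page came from cache rather than disk.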

Index Strategy

The Most Valuable Index: (foreign_key, status)

Most queries filter by both a foreign key and a status column. Postgres can combine two single-column indexes with a bitmap AND, but a single composite index is almost always faster:

-- Before: two separate indexes (inefficient)
CREATE INDEX idx_orders_company ON orders(company_id);
CREATE INDEX idx_orders_status ON orders(status);
 
-- After: one composite index (covers both filters)
CREATE INDEX idx_orders_company_status ON orders(company_id, status);
 
-- Even better if you always sort by created_at
CREATE INDEX idx_orders_company_status_date 
ON orders(company_id, status, created_at DESC);

Column order in composite indexes matters: put equality conditions first, range/sort conditions last.
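A sketch of what that ordering buys, using the composite index defined above:

```sql
-- Served entirely by idx_orders_company_status_date: both equality
-- filters match the leading columns, and created_at DESC matches the sort
SELECT id FROM orders
WHERE company_id = '550e8400-e29b-41d4-a716-446655440000'
  AND status = 'PENDING'
ORDER BY created_at DESC
LIMIT 20;

-- Not served by it: the leading column (company_id) is absent, so the
-- planner cannot descend the index and falls back to another plan
SELECT id FROM orders
WHERE status = 'PENDING'
ORDER BY created_at DESC
LIMIT 20;
```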

Partial Indexes

If 95% of your queries target status = 'ACTIVE' rows:

-- Index only active records — much smaller, faster
CREATE INDEX idx_users_active ON users(email) WHERE status = 'ACTIVE';

This is dramatically smaller and faster than a full index when inactive records are the majority. One caveat: the planner only considers a partial index when the query's WHERE clause implies the index predicate, so the query must also filter on status = 'ACTIVE'.
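Whether the planner can use a partial index depends on the query's own WHERE clause. A sketch, reusing the users table from the example (the email value is made up):

```sql
-- Can use idx_users_active: the WHERE clause implies the index predicate
SELECT id FROM users
WHERE email = 'alice@example.com' AND status = 'ACTIVE';

-- Cannot use it: without the status filter, inactive rows might match,
-- and those rows simply aren't in the index
SELECT id FROM users
WHERE email = 'alice@example.com';
```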

GIN Indexes for Arrays and JSONB

If you store skills or tags as an array:

-- Without an index: Seq Scan on every query. Note the operator:
-- 'React' = ANY(skills) cannot use a GIN index at all; use containment
SELECT * FROM jobs WHERE skills @> ARRAY['React'];
 
-- With a GIN index, containment queries become fast
CREATE INDEX idx_jobs_skills ON jobs USING GIN(skills);

For JSONB columns (common when migrating from MongoDB):

CREATE INDEX idx_profiles_meta ON profiles USING GIN(metadata jsonb_path_ops);
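The jsonb_path_ops operator class trades generality for size: it only accelerates containment (@>), while key-existence operators need the default jsonb_ops class. A sketch, assuming a hypothetical "plan" key inside metadata:

```sql
-- Accelerated: containment is the one operator jsonb_path_ops supports
SELECT * FROM profiles
WHERE metadata @> '{"plan": "pro"}';

-- Not accelerated by this index: key existence (?) needs plain jsonb_ops
SELECT * FROM profiles
WHERE metadata ? 'plan';
```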

Connection Pooling Is Not Optional

PostgreSQL's default behavior: one connection = one process. At scale, opening a new connection per request will bring your database to its knees.

Use PgBouncer as a connection pooler:

# pgbouncer.ini
[databases]
myapp = host=127.0.0.1 port=5432 dbname=myapp
 
[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25

With pool_mode = transaction, a database connection is only held during a transaction. 1000 concurrent app connections can share 25 real Postgres connections. This is how platforms with millions of users run on small databases.

With Prisma, point your DATABASE_URL at PgBouncer and add ?pgbouncer=true:

DATABASE_URL="postgresql://user:pass@pgbouncer:6432/myapp?pgbouncer=true&connection_limit=1"

N+1 Queries: The Silent Killer

The most common performance bug in ORM-heavy code:

// BAD — 1 query for orders + N queries for companies
const orders = await prisma.order.findMany({ where: { status: 'PENDING' } })
for (const order of orders) {
  const company = await prisma.company.findUnique({ where: { id: order.companyId } })
  // ...
}
 
// GOOD — 1 query with JOIN
const orders = await prisma.order.findMany({
  where: { status: 'PENDING' },
  include: { company: true },
})

Always check your query logs in development. I add this to every Node.js project:

const prisma = new PrismaClient({
  log: process.env.NODE_ENV === 'development' 
    ? ['query', 'warn', 'error'] 
    : ['warn', 'error'],
})

Vacuuming and Table Bloat

PostgreSQL uses MVCC: deleted and updated rows aren't removed immediately; they become "dead tuples". Autovacuum cleans these up in the background, but high-write tables can accumulate bloat faster than the default settings keep up with.

Check table bloat:

SELECT 
  schemaname,
  tablename,
  n_dead_tup,
  n_live_tup,
  round(n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100, 2) AS dead_pct
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

If dead_pct is above 20%, tune autovacuum or run VACUUM ANALYZE tablename manually.
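Tuning usually means lowering the scale factors for the affected table rather than changing them globally. A sketch of per-table settings (the thresholds here are illustrative, not a recommendation for every workload):

```sql
-- Per-table override: vacuum at 5% dead rows instead of the 20% default,
-- and refresh planner statistics at 2% changed rows
ALTER TABLE orders SET (
  autovacuum_vacuum_scale_factor = 0.05,
  autovacuum_analyze_scale_factor = 0.02
);

-- One-off manual cleanup that also refreshes statistics
VACUUM (ANALYZE) orders;
```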

Quick Wins Checklist

  • Run EXPLAIN ANALYZE before adding any index
  • Add composite indexes for (fk, status) pairs you filter often
  • Use PgBouncer in transaction mode in production
  • Log slow queries: log_min_duration_statement = 500 in postgresql.conf
  • Set work_mem appropriately for sort-heavy queries (default 4MB is often too low)
  • Run ANALYZE after bulk inserts
  • Use RETURNING to avoid extra SELECT after INSERT/UPDATE
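The last item fits in one statement: RETURNING hands back generated values without a second round trip. A sketch against the orders table from earlier, assuming id, reference, and created_at are filled in by column defaults:

```sql
-- One round trip instead of INSERT followed by SELECT
INSERT INTO orders (company_id, status)
VALUES ('550e8400-e29b-41d4-a716-446655440000', 'PENDING')
RETURNING id, reference, created_at;
```

The same clause works on UPDATE and DELETE, which makes it handy for returning the final state of rows you just touched.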

The database is the foundation. Neglecting it means hitting invisible walls as you scale.