Node.js API Performance Playbook

Goal

Maximize throughput and reduce latency without changing infrastructure.

This playbook documents practical patterns that scaled an API from 100 req/s → 50,000 req/s on the same machine and database.

Core Principle

Most Node.js performance problems come from doing unnecessary work.

1. Database Connection Pooling

Problem Pattern

New DB connection per request
Errors like: too many connections
High latency under load

Detection Signals

Connection spikes in DB metrics
Requests failing under burst traffic
Slow response times even with low CPU usage

Implementation

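import { Pool } from 'pg';

// One pool per process, created once at startup. Connection settings come
// from the standard PG* environment variables unless passed explicitly.
export const pool = new Pool({
  max: 20,                       // upper bound on open connections
  idleTimeoutMillis: 30000,      // close clients that sit idle for 30s
  connectionTimeoutMillis: 2000, // give up after 2s waiting for a connection
});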

Use a shared pool across the app:

const result = await pool.query('SELECT * FROM users WHERE id = $1', [id]);

Decision Rules

Always use pooling for relational databases
Recommended pool size:
- min(2 * CPU cores, DB max_connections / 4)
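
The sizing rule above can be worked through as a small function. This is a sketch rather than part of the original playbook: the `processCount` parameter is an assumption added here, on the reasoning that every Node.js process opens its own pool, so the connection budget has to be divided between processes.

```typescript
// Pool size per process, following the rule of thumb above:
//   min(2 * CPU cores, DB max_connections / 4)
// `processCount` is an assumption added for this sketch: each process owns
// its own pool, so the total budget is split between processes.
function recommendedPoolSize(
  cpuCores: number,
  dbMaxConnections: number,
  processCount = 1,
): number {
  const budget = Math.min(2 * cpuCores, Math.floor(dbMaxConnections / 4));
  return Math.max(1, Math.floor(budget / processCount));
}

console.log(recommendedPoolSize(8, 100));    // min(16, 25) = 16
console.log(recommendedPoolSize(8, 40));     // min(16, 10) = 10
console.log(recommendedPoolSize(8, 100, 8)); // 16 split across 8 processes = 2
```

Treat the result as a starting point and adjust it against pool wait times and DB connection metrics under load.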

Expected Impact

~50–70% latency reduction under load, because requests no longer pay for a connection handshake each time
Prevents connection exhaustion

Trade-offs

Too many connections → DB overload
Too few → queueing inside the pool

2. Parallelizing Independent Async Operations

Problem Pattern

Sequential awaits for independent operations

const user = await getUser(id);
const orders = await getOrders(id);
const address = await getAddress(id);

Detection Signals

High latency with low CPU usage
Multiple independent queries per request

Implementation

const [user, orders, address] = await Promise.all([
  getUser(id),
  getOrders(id),
  getAddress(id),
]);

Decision Rules

Use parallel execution when:

Operations are independent
No shared mutation/state dependency

Expected Impact

2–3x faster response time (common case)
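
The speedup follows from simple arithmetic: sequential awaits cost the sum of the call latencies, while `Promise.all` costs roughly as much as the slowest call. A worked sketch, where the three latency values are made up for illustration:

```typescript
// Latencies in milliseconds for three independent calls (hypothetical values).
const latencies = [40, 60, 50];

// Sequential awaits: each call waits for the previous one, so costs add up.
const sequentialMs = latencies.reduce((sum, ms) => sum + ms, 0);

// Promise.all: the calls overlap, so the slowest one sets the total.
const parallelMs = Math.max(...latencies);

console.log(sequentialMs, parallelMs, (sequentialMs / parallelMs).toFixed(1));
// 150 60 2.5
```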

Trade-offs

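Can overload downstream services (DB, APIs)
Promise.all rejects as soon as one operation fails; use Promise.allSettled when partial results are acceptable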

⚠️ If needed, limit concurrency:

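import pLimit from 'p-limit';

// At most 5 tasks run at once. Each task must be a function that returns a
// promise, not a promise that has already started.
const limit = pLimit(5);
await Promise.all(tasks.map(task => limit(task)));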
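
To make clear what a concurrency limit does, here is a dependency-free sketch of the same idea. It is not how `p-limit` is implemented; `runLimited` and the simulated tasks are hypothetical names used only for this illustration.

```typescript
// Runs `tasks` with at most `concurrency` in flight; results keep input order.
async function runLimited<T>(
  tasks: Array<() => Promise<T>>,
  concurrency: number,
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0;

  // Each worker claims the next unstarted task until none remain.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Usage: ten simulated calls, never more than three at once.
let inFlight = 0;
let peak = 0;
const simulatedTasks = Array.from({ length: 10 }, (_, i) => async () => {
  inFlight++;
  peak = Math.max(peak, inFlight);
  await new Promise<void>((resolve) => setTimeout(resolve, 5));
  inFlight--;
  return i * 2;
});

runLimited(simulatedTasks, 3).then((values) => {
  console.log(values.length, 'results, peak concurrency:', peak);
});
```

A fixed set of workers pulling from a shared index keeps the code short; a production limiter would also need a policy for failed tasks.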

3. In-Memory Caching for Hot Data

Problem Pattern

Same data fetched from DB on every request
Examples: configs, permissions, feature flags

Detection Signals

High DB read volume for identical queries
Low data volatility

Implementation

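import { LRUCache } from 'lru-cache'; // named export in v10+; older versions use a default export

const cache = new LRUCache({
  max: 1000,      // evict least-recently-used entries beyond 1000 keys
  ttl: 1000 * 60, // 60s
});

export async function getConfig(key: string) {
  const cached = cache.get(key);
  // Compare against undefined so cached falsy values (false, 0, '') still hit.
  if (cached !== undefined) return cached;

  const value = await fetchFromDB(key);
  cache.set(key, value);
  return value;
}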

Decision Rules

Use cache when:

Data changes infrequently
Same query repeats frequently

Expected Impact

Up to 99% reduction in DB reads for the cached queries

Trade-offs

Stale data (eventual consistency)
Memory usage grows with cache size
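
One further cost is easy to miss: when a hot key expires, every concurrent request misses at once and each one queries the DB. A common mitigation is to share a single in-flight promise per key. The sketch below assumes a loader with the same shape as `fetchFromDB`; `dedupe` and `loadConfig` are hypothetical names for this example.

```typescript
// Hypothetical loader signature; stands in for fetchFromDB above.
type Loader<V> = (key: string) => Promise<V>;

// Wraps a loader so concurrent calls for the same key share one promise.
function dedupe<V>(load: Loader<V>): Loader<V> {
  const inFlight = new Map<string, Promise<V>>();

  return (key: string) => {
    const pending = inFlight.get(key);
    if (pending) return pending;

    // Forget the entry once the load settles, whether it succeeds or fails.
    const promise = load(key).finally(() => inFlight.delete(key));
    inFlight.set(key, promise);
    return promise;
  };
}

// Usage: three concurrent lookups trigger a single simulated DB call.
let dbCalls = 0;
const loadConfig = dedupe(async (key: string) => {
  dbCalls++;
  await new Promise<void>((resolve) => setTimeout(resolve, 5));
  return `value-for-${key}`;
});

Promise.all([loadConfig('flags'), loadConfig('flags'), loadConfig('flags')]).then(
  (values) => console.log(values.length, 'results from', dbCalls, 'DB call'),
);
```

This only collapses overlapping requests; it complements the TTL cache rather than replacing it.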

4. Streaming Large Payloads

Problem Pattern

Loading large datasets into memory
High RAM usage / OOM crashes

Detection Signals

Memory spikes per request
Node process crashes with "out of memory"

Implementation (PostgreSQL example)

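import QueryStream from 'pg-query-stream';
import { pipeline } from 'stream/promises';

// A stream needs a dedicated client; check one out of the shared pool.
const client = await pool.connect();
try {
  const stream = client.query(new QueryStream('SELECT * FROM large_table'));

  await pipeline(
    stream,
    transformStream, // must serialize rows: the query emits objects, res accepts strings/Buffers
    res
  );
} finally {
  client.release(); // always return the client to the pool
}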

Decision Rules

Use streaming when:

Response size > ~10MB
Result set > ~10k rows

Expected Impact

Memory: GB → MB
Stable process under load

Trade-offs

More complex error handling
Harder to paginate or retry
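
The `transformStream` step in the pipeline above is where row objects become bytes. A minimal sketch of one option, newline-delimited JSON, is shown below. The `Row` shape, `toNdjson`, and the in-memory sink standing in for `res` are assumptions made for this example; a real endpoint might need a different wire format.

```typescript
import { Readable, Transform, Writable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// Hypothetical row shape, for illustration only.
interface Row {
  id: number;
  name: string;
}

// Objects in, text out: each row becomes one JSON line (NDJSON).
function toNdjson(): Transform {
  return new Transform({
    writableObjectMode: true,
    transform(row: Row, _encoding, callback) {
      callback(null, JSON.stringify(row) + '\n');
    },
  });
}

// Demo: stream rows into an in-memory sink that stands in for `res`.
async function demo(): Promise<string> {
  const chunks: string[] = [];
  const sink = new Writable({
    write(chunk, _encoding, callback) {
      chunks.push(chunk.toString());
      callback();
    },
  });
  const rows: Row[] = [
    { id: 1, name: 'a' },
    { id: 2, name: 'b' },
  ];
  await pipeline(Readable.from(rows), toNdjson(), sink);
  return chunks.join('');
}

demo().then((body) => console.log(body));
```

Because each line is a complete JSON document, a client can start parsing before the response finishes, which suits the row-at-a-time flow of a query stream.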

5. Multi-Core Utilization (Cluster Mode)

Problem Pattern

Single Node.js process
Only 1 CPU core used

Detection Signals

CPU usage capped at ~100% on multi-core machine
Throughput not scaling with hardware

Implementation (PM2)

pm2 start app.js -i max

Or native cluster:

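import cluster from 'cluster';
import os from 'os';

// cluster.isPrimary requires Node.js 16+.
if (cluster.isPrimary) {
  // Primary process: fork one worker per CPU core.
  const cores = os.cpus().length;
  for (let i = 0; i < cores; i++) cluster.fork();
} else {
  // Worker process: run the application's own server bootstrap.
  startServer();
}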

Decision Rules

Use clustering in production unless the platform already runs one process per core (e.g., container autoscaling)

Expected Impact

Near-linear scaling with CPU cores for CPU-bound work (e.g., up to ~8x on 8 cores); less when the database or network is the bottleneck

Trade-offs

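Requires stateless architecture
In-memory cache is not shared between workers (use Redis if needed)
Each worker opens its own DB connection pool, so total connections = workers × pool max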

6. Optimized JSON Serialization

Problem Pattern

Large responses
High CPU time in JSON.stringify

Detection Signals

High CPU usage during response phase
Profiling shows serialization bottleneck

Implementation

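import fastJson from 'fast-json-stringify';

// Compile the serializer once at startup, not per request.
const stringify = fastJson({
  type: 'object',
  properties: {
    id: { type: 'string' },
    name: { type: 'string' },
  },
});

// The result is already a string, so set the JSON content type explicitly.
res.type('application/json').send(stringify(data));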

Decision Rules

Use optimized serializers when:

Large payloads
Known schema

Expected Impact

Up to 4x faster serialization

Trade-offs

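Requires schema definition
Less flexible for dynamic data; fields not declared in the schema are omitted from the output
Speedup varies with payload shape, so benchmark against JSON.stringify on your own data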
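
To show why a known schema helps, here is a hand-written serializer for one fixed shape. It illustrates the general idea only and is not how `fast-json-stringify` works internally; `User` and `stringifyUser` are hypothetical names.

```typescript
// Hypothetical fixed shape, matching the schema in the example above.
interface User {
  id: string;
  name: string;
}

// Because the shape is known up front, keys are emitted as literals and only
// the values need escaping. Any extra fields on the input are ignored.
function stringifyUser(user: User): string {
  return (
    '{"id":' + JSON.stringify(user.id) +
    ',"name":' + JSON.stringify(user.name) + '}'
  );
}

const sample: User = { id: '42', name: 'Ada "Countess" Lovelace' };
console.log(stringifyUser(sample) === JSON.stringify(sample)); // true
```

A generic serializer has to discover keys and value types at runtime on every call; a schema-specific one skips that work, which is where the savings come from.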

7. Response Compression

Problem Pattern

Large payloads → slow network transfer

Detection Signals

High response size
Slow client response time despite fast backend

Implementation

import compression from 'compression';

app.use(compression());

Decision Rules

Enable compression when:

Response size > ~1KB
JSON-heavy APIs

Expected Impact

~70% reduction in payload size for text formats such as JSON; little gain for already-compressed data (images, archives)

Trade-offs

CPU overhead for compression
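
Before enabling compression globally, it can help to measure what it buys on a representative payload. The sketch below uses Node's built-in `zlib` rather than the `compression` middleware, and the payload is a made-up list of similar records, so the ratio it prints is illustrative only.

```typescript
import { gzipSync, gunzipSync } from 'node:zlib';

// Hypothetical payload: 500 similar records, typical of a list endpoint.
const payload = JSON.stringify(
  Array.from({ length: 500 }, (_, i) => ({ id: i, status: 'active', role: 'member' })),
);

const raw = Buffer.from(payload, 'utf8');
const compressed = gzipSync(raw);

// Ratio below 1 means the compressed body is smaller than the original.
const ratio = compressed.length / raw.length;
console.log(`raw=${raw.length}B gzip=${compressed.length}B ratio=${ratio.toFixed(2)}`);

// Round trip: decompressing yields the original bytes.
console.log(gunzipSync(compressed).toString('utf8') === payload); // true
```

Running the same measurement on real responses shows whether the CPU cost is worth it for a given endpoint.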

Final Outcome

Metric	Before	After
Throughput	100 req/s	50,000 req/s
Infrastructure	Same	Same
Memory	Unstable	Stable
DB Load	High	Optimized

Implementation Checklist

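Before scaling infrastructure:

- Database connections go through a shared pool
- Independent async operations run in parallel
- Hot, rarely changing data is cached in memory
- Large payloads are streamed
- All CPU cores are in use
- JSON serialization is optimized where profiling shows a bottleneck
- Responses are compressed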

Final Note

If your Node.js API is slow, assume misuse before assuming limits of the runtime.

Most gains come from:

Removing redundant work
Reducing I/O
Using hardware efficiently

Not from rewriting the system.

jgcmarins/Node.js-API-Performance-Playbook.md
