Prisma with Azure SQL: Pooling, Retries, and Gotchas
Prisma + Azure SQL is among the most common backend stacks in the Microsoft ecosystem, and also among the most misconfigured. Pool sizes set for Postgres, no retry on transient faults, serverless tiers that pause and wake unpredictably. The defaults that work fine in dev fall over in production. This is the production configuration every team running this combination should ship with.
Azure SQL Pooling Is Different From Postgres
Postgres engineers tend to think in terms of max_connections = 100 and size their pools against it. Azure SQL works differently: connection limits are per service tier, not a single knob, and exceeding them results in hard rejections rather than queuing. The Basic tier caps at 30 concurrent user connections; General Purpose scales with compute. Pool sizes must be chosen with the tier in mind.
Pool Sizing, Per Tier
- Basic / Standard S0–S1. Keep per-instance Prisma connection_limit at 5. Any more and a single bad query spike can starve other instances.
- General Purpose / vCore. Per-instance limit of 10 is a reasonable default; monitor and raise carefully.
- Serverless tier. Watch for auto-pause. See below. Treat cold starts as a first-class concern in your retry logic.
- Hyperscale. Much higher ceilings, but the same discipline applies. Size pools to your container count, not your maximum connection limit.
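The guidance above reduces to simple arithmetic: total demand is instances × connection_limit, and it must stay under the tier's cap with some headroom. A minimal sketch; the function name and the 20% headroom factor are illustrative assumptions, not Azure or Prisma APIs:

```typescript
// Derive a safe per-instance connection_limit from the tier's connection cap.
// Reserves ~20% of the cap for admin tools, migrations, and traffic spikes.
function maxPerInstanceLimit(tierCap: number, instances: number, headroom = 0.8): number {
  return Math.max(1, Math.floor((tierCap * headroom) / instances));
}

// Basic tier (30-connection cap) across 4 app instances:
maxPerInstanceLimit(30, 4); // returns 6, so the recommended connection_limit=5 leaves a margin
```

Run the check again whenever you scale out: ten instances against the same Basic cap leaves room for only 2 connections each.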
The Connection String Parameters That Matter
# .env — Prisma + Azure SQL with production-safe parameters.
# Keep DATABASE_URL on one line: dotenv-style parsers do not support backslash continuations.
DATABASE_URL="sqlserver://server.database.windows.net:1433;database=appdb;user=app-user@server;password=;encrypt=true;trustServerCertificate=false;connection_limit=10;pool_timeout=20;connect_timeout=30;loginTimeout=30"
- connection_limit. Prisma's per-instance pool. Size it to the tier, not to the default.
- pool_timeout. How long Prisma waits for a connection from the pool before erroring. 20 seconds balances latency and recovery.
- connect_timeout / loginTimeout. The network-level handshake budget. Increase for serverless tiers that may cold-start.
- encrypt=true; trustServerCertificate=false. Non-negotiable in production. Azure SQL enforces TLS; do not disable verification.
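If the tier-specific numbers change as you scale, it can help to append them programmatically rather than hand-edit the URL. A sketch, assuming a hypothetical withPoolParams helper; the parameter names follow the connection string above, so verify them against your Prisma version:

```typescript
// Append Prisma pool parameters to a base Azure SQL connection string so the
// tier-specific numbers live in one place. Hypothetical helper, not a Prisma API.
function withPoolParams(
  baseUrl: string,
  opts: { connectionLimit: number; poolTimeout: number; connectTimeout: number },
): string {
  const base = baseUrl.replace(/;?$/, ";"); // ensure exactly one trailing separator
  return (
    base +
    `connection_limit=${opts.connectionLimit};` +
    `pool_timeout=${opts.poolTimeout};` +
    `connect_timeout=${opts.connectTimeout}`
  );
}

const url = withPoolParams(
  "sqlserver://server.database.windows.net:1433;database=appdb;encrypt=true;trustServerCertificate=false",
  { connectionLimit: 10, poolTimeout: 20, connectTimeout: 30 },
);
```

This keeps the sizing decision next to the deployment configuration that determines instance count, rather than buried in a secret store.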
Transient Fault Handling Is Mandatory
Azure SQL will terminate idle connections, recycle compute during platform maintenance, and throttle during burst load. A production client must retry on the transient error codes. Prisma does not retry for you; wrap your queries.
// lib/retry.ts — retry wrapper for Azure SQL transient faults
const TRANSIENT_CODES = new Set([
  40197, 40501, 40613, 49918, 49919, 49920, // service busy / throttling
  4060, 10928, 10929, 10053, 10054, 10060,  // connection-level transient
  233, 64,                                  // resource-level transient
]);

export async function withRetry<T>(fn: () => Promise<T>, attempts = 4): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err: any) {
      // Prisma surfaces the driver error number under err.meta.code; fall back to err.code.
      const code = err?.meta?.code ?? err?.code;
      if (!TRANSIENT_CODES.has(Number(code))) throw err; // permanent failure: rethrow immediately
      lastErr = err;
      // Exponential backoff with jitter, capped at 4 seconds.
      const backoff = Math.min(200 * 2 ** i + Math.random() * 100, 4000);
      await new Promise(r => setTimeout(r, backoff));
    }
  }
  throw lastErr;
}

// usage
const user = await withRetry(() => prisma.user.findUnique({ where: { id } }));
Serverless Tier: Auto-Pause and First-Request Latency
The Azure SQL serverless tier pauses compute after a configurable idle period and wakes on the next connection. The first request after a pause can take 30–90 seconds. Keep serverless for dev and preview environments, and avoid it for user-facing production workloads unless you have a warming strategy. If you keep it, increase connect_timeout, ensure your retry logic tolerates 30-second waits, and accept that some user requests will absorb the wake-up.
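One warming strategy is a scheduled keep-alive query that resets the idle timer during the hours you need the database responsive. A sketch, assuming a timer-triggered function; the keepWarm name and the schedule are illustrative, and note that compute billing continues while the database stays awake:

```typescript
type QueryRunner = {
  $queryRaw: (strings: TemplateStringsArray, ...values: unknown[]) => Promise<unknown>;
};

// Reset Azure SQL's idle timer with a trivial query. Call on a schedule
// (e.g. every 10 minutes from a timer-triggered Azure Function) during
// business hours; let the database pause overnight to keep the cost benefit.
export async function keepWarm(db: QueryRunner): Promise<void> {
  await db.$queryRaw`SELECT 1`;
}
```

Pass your shared PrismaClient as db; SELECT 1 is enough to keep compute awake without adding meaningful load.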
Read Replicas
Azure SQL's read-scale replicas are underused in the Node ecosystem. Route long-running report queries to the read-only endpoint (ApplicationIntent=ReadOnly) using a separate Prisma client, keeping your primary pool reserved for writes and user-facing reads.
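A sketch of the split-client setup: derive the read-only URL from the primary one, then construct a second client for reporting queries. The readOnlyUrl helper is hypothetical, and you should verify that your Prisma/driver version passes ApplicationIntent through to the server:

```typescript
// Derive the read-only connection string by appending ApplicationIntent=ReadOnly,
// which routes the session to a read-scale replica.
function readOnlyUrl(primaryUrl: string): string {
  return primaryUrl.replace(/;?$/, ";") + "ApplicationIntent=ReadOnly";
}

// A second client for reports, keeping the primary pool free for writes:
// const prismaRead = new PrismaClient({
//   datasources: { db: { url: readOnlyUrl(process.env.DATABASE_URL!) } },
// });
```

Remember the replica is eventually consistent: route only queries that tolerate slight staleness.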
Monitoring the Pool
Prisma exposes connection pool metrics via prisma.$metrics. Export these to Application Insights or your chosen backend, and alert on saturation (idle connections at zero for sustained windows) before users notice.
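A sketch of a periodic export. $metrics requires the metrics preview feature (previewFeatures = ["metrics"] in schema.prisma); the gauge keys mentioned below are Prisma's documented pool gauges, but verify them against your version. The emit callback stands in for your Application Insights client:

```typescript
type MetricsClient = {
  $metrics: { json(): Promise<{ gauges: Array<{ key: string; value: number }> }> };
};

// Forward Prisma's pool gauges (e.g. prisma_pool_connections_busy,
// prisma_pool_connections_idle) to a metrics backend. Run on an interval.
export async function reportPoolMetrics(
  prisma: MetricsClient,
  emit: (key: string, value: number) => void,
): Promise<void> {
  const { gauges } = await prisma.$metrics.json();
  for (const g of gauges) {
    emit(g.key, g.value);
  }
}
```

Alert when the idle-connections gauge sits at zero for a sustained window, per the saturation signal above.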
The Minimum Viable Production Checklist
- Pool size tuned to service tier, not defaults.
- encrypt=true; trustServerCertificate=false: non-negotiable.
- Transient-fault retry wrapper around every query.
- Metrics export for pool saturation.
- Serverless tier only where users tolerate cold starts.
- Separate Prisma client pointed at read-only replicas for heavy read workloads.