Azure Managed Identity and Key Vault: Secret Rotation Without Outages
Secret rotation outages have a consistent shape: an engineer rotates a secret, a few services keep using the old value from their process memory, and users see 401s until the next deploy or restart. The fix is not "remember to restart"; it is an architecture where secrets are fetched dynamically, rotations are announced, and overlap windows let both old and new values work during the transition.
Stop Storing Secrets as Environment Variables
The root cause of rotation outages is baking secrets into the process's environment at start time. The value is then effectively immutable for the life of the process. Move secrets behind a client that fetches them from Key Vault on demand (with a caching layer), and rotation becomes a live, zero-restart operation.
Managed Identity: The Foundation
The application authenticates to Key Vault using its Managed Identity (system-assigned or user-assigned). No credentials exist in configuration; the Azure platform issues short-lived tokens. A Key Vault access policy granting get and list on secrets (or, under Azure RBAC, a role such as Key Vault Secrets User) gives that identity read access. Nothing else is needed to access secrets: no PATs, no stored passwords.
// Node: fetch secrets via Managed Identity and cache them for a short TTL
import { DefaultAzureCredential } from '@azure/identity';
import { SecretClient } from '@azure/keyvault-secrets';
// KV_URL is plain config; the vault URL itself is not a secret
const client = new SecretClient(process.env.KV_URL!, new DefaultAzureCredential());
const CACHE_TTL_MS = 60_000;
const cache = new Map<string, { value: string; expiresAt: number }>();
export async function getSecret(name: string): Promise<string> {
  // Serve from the cache while the entry is still fresh
  const cached = cache.get(name);
  if (cached && cached.expiresAt > Date.now()) return cached.value;
  // No version argument, so Key Vault returns the current version
  const s = await client.getSecret(name);
  if (!s.value) throw new Error(`Secret ${name} is empty`);
  cache.set(name, { value: s.value, expiresAt: Date.now() + CACHE_TTL_MS });
  return s.value;
}
A one-minute TTL is a sensible default: fast enough that a rotation propagates quickly, long enough that Key Vault is not on the hot path of every request.
Key Vault Rotation Policies
For secrets Key Vault can generate itself (TLS certificates issued through an integrated certificate authority, storage account keys), Key Vault's built-in rotation handles creation of new versions automatically. For external secrets, you set an expiry date on the secret, and an Azure Function (or similar) responds to the near-expiry Event Grid notification by generating a new value and storing it as a new version.
# Key Vault rotation: the near-expiry event triggers a Function
az eventgrid event-subscription create \
  --name secret-rotation \
  --source-resource-id "$KV_RESOURCE_ID" \
  --endpoint "$FUNCTION_URL" \
  --included-event-types "Microsoft.KeyVault.SecretNearExpiry"
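The receiving side of that subscription might look roughly like the sketch below. It shows only the decision logic, not the Azure Functions wiring. The `RotationDeps` interface and its two methods are hypothetical names introduced for illustration, and the event shape assumes the Event Grid schema for Key Vault events, where `data.ObjectName` carries the secret name.

```typescript
// Assumed event shape: Event Grid schema for Key Vault events (only the fields used here).
interface KeyVaultEvent {
  eventType: string;
  data: { ObjectName: string; VaultName: string };
}

// Hypothetical abstraction over the two systems a rotation touches.
interface RotationDeps {
  // Issues a new credential at the provider while the old one stays valid.
  issueNewCredential(secretName: string): Promise<string>;
  // Writes the value as a new Key Vault version (for example via SecretClient.setSecret).
  storeNewVersion(secretName: string, value: string): Promise<void>;
}

// Returns true if the event triggered a rotation, false if it was ignored.
export async function handleNearExpiry(
  event: KeyVaultEvent,
  deps: RotationDeps,
): Promise<boolean> {
  // Only near-expiry notifications start a rotation; everything else is ignored.
  if (event.eventType !== 'Microsoft.KeyVault.SecretNearExpiry') return false;
  const name = event.data.ObjectName;
  // Create the new credential first, then publish it; the old one is NOT revoked here.
  const value = await deps.issueNewCredential(name);
  await deps.storeNewVersion(name, value);
  return true;
}
```

Keeping the provider and Key Vault calls behind an interface lets the handler be exercised with fakes before it is pointed at real credentials.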
The Overlap Window
The key design principle for zero-outage rotation: during a rotation, both the old and the new secret must authenticate successfully. For symmetric secrets (API keys, database passwords) this means dual-active credentials at the provider. For asymmetric (public-key) setups, rotation is naturally more forgiving, since verifiers can usually trust the old and new public keys side by side. The overlap must be at least 2× the TTL of your secret cache; with the one-minute cache TTL used above, a five-minute overlap leaves comfortable margin.
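The 2× rule can be written down as a small guard, sketched below. The function names are illustrative, not part of any SDK. The reasoning assumed here is that a cached value can outlive the rotation by up to one full TTL, and the second TTL is margin for in-flight requests and instances that refreshed just before the new version landed.

```typescript
// Minimum overlap window under the 2x-TTL rule.
export function minOverlapMs(cacheTtlMs: number): number {
  return 2 * cacheTtlMs;
}

// Earliest timestamp (ms) at which the old credential may be revoked.
// Throws if the chosen overlap is shorter than the rule allows.
export function earliestRevokeAt(
  rotatedAtMs: number,
  cacheTtlMs: number,
  overlapMs: number,
): number {
  const minimum = minOverlapMs(cacheTtlMs);
  if (overlapMs < minimum) {
    throw new Error(`Overlap ${overlapMs}ms is below the minimum of ${minimum}ms`);
  }
  return rotatedAtMs + overlapMs;
}

// With a 60 s cache TTL the minimum overlap is 120 s,
// so a five-minute (300 s) window passes the check.
const revokeAt = earliestRevokeAt(0, 60_000, 300_000);
```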
Versioned References
Key Vault secrets are versioned. Avoid configurations that pin to a specific version. Doing so defeats the live-rotation story. Fetch the current version (the default behaviour of getSecret(name)) and trust the cache TTL to converge after rotation.
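To make the difference concrete, here is a small simulation of a pinned reader and a "latest" reader sitting behind the same TTL cache. `FakeVersionedSecret` is an in-memory stand-in written for this example, not the Key Vault SDK, and the clock is injected so that time can be advanced by hand.

```typescript
// In-memory stand-in for a versioned secret (illustrative only).
class FakeVersionedSecret {
  private versions: string[] = [];
  // Appends a new version and returns its index.
  set(value: string): number {
    this.versions.push(value);
    return this.versions.length - 1;
  }
  latest(): string {
    return this.versions[this.versions.length - 1];
  }
  at(version: number): string {
    return this.versions[version];
  }
}

// Wraps a read function in a single-entry TTL cache with an injectable clock.
function makeCachedReader(
  read: () => string,
  ttlMs: number,
  now: () => number,
): () => string {
  let cached: { value: string; expiresAt: number } | undefined;
  return () => {
    if (cached && cached.expiresAt > now()) return cached.value;
    const value = read();
    cached = { value, expiresAt: now() + ttlMs };
    return value;
  };
}

let clock = 0;
const secret = new FakeVersionedSecret();
const v0 = secret.set('old-password');
const readLatest = makeCachedReader(() => secret.latest(), 60_000, () => clock);
const readPinned = makeCachedReader(() => secret.at(v0), 60_000, () => clock);

readLatest();                      // t=0: caches 'old-password'
readPinned();                      // t=0: caches 'old-password'
secret.set('new-password');        // rotation creates a new version
clock = 30_000;                    // still inside the TTL
const midLatest = readLatest();    // served from cache
clock = 61_000;                    // TTL has elapsed
const afterLatest = readLatest();  // refetches the current version
const afterPinned = readPinned();  // refetches, but the pin still points at v0
```

In this simulation the unpinned reader converges on the new value once its cache entry expires, while the pinned reader keeps returning the old value until its configuration is changed.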
The Rotation Runbook
- Generate the new secret value at the source (database password change, API key re-issue).
- Update Key Vault: setSecret creates a new version; the old version remains accessible.
- Wait for the cache TTL to elapse across all application instances.
- Verify traffic is authenticating with the new value (observability).
- Revoke the old value at the source, ending the overlap window.
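The steps above can be sketched as a small orchestration function. Everything in `RunbookSteps` is a hypothetical hook that a team would implement against its own provider, Key Vault client, and observability stack; the sketch only fixes the ordering and the rule that the old value is never revoked unless verification succeeds.

```typescript
// Hypothetical hooks, one per runbook step.
interface RunbookSteps {
  generateAtSource(): Promise<string>;            // step 1: new value at the provider
  storeInKeyVault(value: string): Promise<void>;  // step 2: new Key Vault version
  sleep(ms: number): Promise<void>;               // step 3: wait out the cache TTL
  newValueInUse(): Promise<boolean>;              // step 4: check observability
  revokeOldAtSource(): Promise<void>;             // step 5: end the overlap window
}

export async function rotate(steps: RunbookSteps, cacheTtlMs: number): Promise<void> {
  const value = await steps.generateAtSource();
  await steps.storeInKeyVault(value);
  // Give every instance's cache time to expire and refetch.
  await steps.sleep(cacheTtlMs);
  // Abort before revocation if traffic has not moved to the new value.
  if (!(await steps.newValueInUse())) {
    throw new Error('New value not observed in traffic; old value left active');
  }
  await steps.revokeOldAtSource();
}
```

Failing closed at the verification step means a botched rotation leaves the system in the overlap state, where both values still work, rather than in an outage.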
What Not to Do
- Bake Key Vault references into appsettings.json at startup. That is the environment-variable pattern wearing a costume.
- Rotate without an overlap window. That guarantees an outage proportional to how long your deploy takes.
- Store Key Vault URLs as secrets. The URL is not the secret; the identity is. Put the URL in plain config.
- Share one Managed Identity across all apps. Scope per application; a compromise of one should not expose the others.
The Operational Payoff
Teams that adopt this pattern rotate secrets monthly without flinching. Teams that do not tend to rotate yearly, or not at all, because every rotation is a potential outage. NIS2, DORA, and every internal security audit increasingly expect rotation cadence evidence; the pattern described here makes "rotate routinely" a boring operational fact rather than a scheduled incident.