Open Playback · Free & MCP-native

Production failure memory
for your AI coding agent

Real incidents from GitHub, Cloudflare, Linear and others, structured and exposed over MCP. Plug it into Claude or Cursor and ask how production actually breaks.

Connect via MCP

Drop this into Claude Desktop or Cursor. Then ask your agent about cache stampedes, BGP failures, or DNS outages.

"open-playback": {
  "type": "http",
  "url": "https://open-mcp.aftermath.sh/"
}
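
The entry above is a fragment, so it needs a parent object to live in. The following is a minimal sketch of merging it into an existing client configuration, assuming the client reads its servers from a top-level `mcpServers` key (an assumption; check your client's documentation for the exact key and file location). The helper name `add_mcp_server` is hypothetical.

```python
import json


def add_mcp_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of `config` with `entry` registered under `name`.

    Assumes servers live under a top-level "mcpServers" key; that key
    is an assumption, not something stated on this page.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


# Stand-in for a config file that has already been loaded with json.load().
existing = {"mcpServers": {}}

updated = add_mcp_server(
    existing,
    "open-playback",
    {"type": "http", "url": "https://open-mcp.aftermath.sh/"},
)

print(json.dumps(updated, indent=2))
```

The function copies rather than mutates, so the original configuration stays available if the merged result needs to be discarded.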

2 encores

Sorted by date

Stripe
Jul 10, 2019
SEV-0
API severely degraded twice in one day: a minor database version upgrade introduced a latent failover bug that triggered under a rare multi-node stall condition, and the rollback intended to fix it interacted with a recent config change to cause a second distinct outage
On July 10, 2019, the Stripe API experienced two separate periods of severe degradation. The first, lasting 27 minutes, was caused by a latent bug introduced three months earlier in a minor database version upgrade: under a rare condition where multiple nodes stall simultaneously, the new version's failover protocol could not elect a primary, leaving an entire shard unable to accept writes. Because this shard underpinned a wide range of API operations, compute resources were rapidly exhausted across the API. After restarting the cluster to force a new election, Stripe recovered and then rolled back the database version as a precaution. That rollback triggered the second outage: the rolled-back version interacted unexpectedly with a recent configuration change to the production shards, causing CPU starvation on all affected shards. The second outage lasted 93 minutes and required engineers to identify the config interaction, apply a corrected configuration, and restart the cluster again.
6h 12m · Affected: not disclosed · All Stripe API users globally; a substantial majority of API requests failed during both degradation windows (16:35–17:02 UTC and 21:14–22:47 UTC)
background degradation · cascading failure · config drift · configuration fix
Stripe
Dec 17, 2015
SEV-1
API fully unavailable for 44 minutes after failures in an internal event queueing system cascaded to the Stripe API, Checkout, and Dashboard
On December 17, 2015, failures in Stripe's internal event queueing system caused a cascade that degraded the Stripe API for 9 minutes and then took it fully offline for an additional 44 minutes. During the outage window, merchants could not process payments via the API, Checkout, or the Dashboard. Stripe published an initial incident summary shortly after recovery while continuing to investigate the deeper root cause.
53m · Affected: not disclosed · All Stripe API users globally; Stripe API, Checkout, and Dashboard were fully unavailable during the complete outage window
background degradation · cascading failure · full outage · queue or stream
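
The duration figures on the two cards can be cross-checked against the times in their summaries. The sketch below assumes the 6h 12m figure for July 10, 2019 measures the span from the start of the first window to the end of the second, rather than the summed outage time; that reading is an inference from the numbers, not something the listing states. The helper names are hypothetical.

```python
from datetime import datetime


def minutes_between(start: str, end: str) -> int:
    """Minutes from `start` to `end`, both HH:MM on the same UTC day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def format_hm(minutes: int) -> str:
    """Render minutes in the listing's style, e.g. '6h 12m' or '53m'."""
    hours, rest = divmod(minutes, 60)
    return f"{hours}h {rest}m" if hours else f"{rest}m"


# July 10, 2019: two degradation windows.
first_window = minutes_between("16:35", "17:02")
second_window = minutes_between("21:14", "22:47")
incident_span = minutes_between("16:35", "22:47")

# December 17, 2015: 9 minutes degraded, then 44 minutes fully offline.
total_2015 = 9 + 44

print(first_window, second_window, format_hm(incident_span))
print(format_hm(total_2015))
```

Under that assumption the two windows come to 27 and 93 minutes, matching the summary, and the time actually spent degraded in 2019 (120 minutes) is well under the 6h 12m shown on the card.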