Open Playback · Free & MCP-native
Production failure memory
for your AI coding agent
Real incidents from GitHub, Cloudflare, Linear and others, structured and exposed over MCP. Plug it into Claude or Cursor and ask how production actually breaks.
3 encores
Sorted by date
- SEV-1
Cloudflare
Feb 20, 2026
BYOIP prefixes withdrawn after Addressing API cleanup task misinterprets empty filter parameter
An automated cleanup sub-task in Cloudflare's Addressing API incorrectly queried the API with an empty pending_delete parameter, which the server interpreted as a request for all BYOIP prefixes. The task then began systematically deleting all matching prefixes and their service bindings, withdrawing about 1,100 BGP prefixes from the Internet. Engineers stopped the runaway sub-task within 50 minutes, but full restoration took over six hours because some customers had service bindings stripped from edge servers and required a global configuration rollout to repair.
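The failure class described above, an empty filter value being read as "no filter", can be illustrated with a minimal sketch. This is not Cloudflare's code: the handler names, the `PREFIXES` table, and the query format are all hypothetical, and the only real API used is Python's standard `urllib.parse.parse_qs`.

```python
from urllib.parse import parse_qs

# Hypothetical inventory: prefix -> pending_delete flag (documentation ranges only).
PREFIXES = {
    "192.0.2.0/24": False,
    "198.51.100.0/24": True,
    "203.0.113.0/24": False,
}


def list_prefixes_lenient(query: str) -> list[str]:
    """Risky pattern: a blank filter value is indistinguishable from no filter."""
    # parse_qs drops blank values by default, so "pending_delete=" vanishes here.
    params = parse_qs(query)
    if "pending_delete" not in params:
        return sorted(PREFIXES)  # no filter seen -> return every prefix
    return sorted(p for p, pending in PREFIXES.items() if pending)


def list_prefixes_strict(query: str) -> list[str]:
    """Safer pattern: a present-but-empty filter is rejected, not ignored."""
    params = parse_qs(query, keep_blank_values=True)
    if "pending_delete" not in params:
        return sorted(PREFIXES)  # filter genuinely absent
    value = params["pending_delete"][0].lower()
    if value not in ("true", "false"):
        raise ValueError(f"invalid pending_delete value: {value!r}")
    want = value == "true"
    return sorted(p for p, pending in PREFIXES.items() if pending == want)


if __name__ == "__main__":
    # A caller that intends "only prefixes pending deletion" but sends a blank value.
    print(list_prefixes_lenient("pending_delete="))
    try:
        list_prefixes_strict("pending_delete=")
    except ValueError as exc:
        print("rejected:", exc)
```

With the lenient handler, a cleanup job that deletes whatever the listing returns would act on every prefix. The strict handler fails the request instead, which turns a silent over-match into a visible error.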
5h 7m · Not disclosed affected · BYOIP customers globally (about 25% of Cloudflare's BYOIP prefixes) · bgp misconfiguration, configuration error, deploy, human error
- SEV-1
Cloudflare
Jan 22, 2026
Overly permissive routing policy causes IPv6 route leak from Miami router
A change pushed via Cloudflare's policy automation platform was meant to stop a Miami router from advertising prefixes for a Bogota data center after recent infrastructure upgrades made that path unnecessary. Removing the prefix-list reference left the export policy matching by route-type internal alone, which JunOS evaluates broadly enough to include all internal BGP routes. As a result, IPv6 prefixes Cloudflare redistributes internally were exported to external peers and providers in Miami, creating a Type 3/Type 4 route leak in the sense of RFC 7908.
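Why removing one match condition widens a policy can be shown with a toy model. The sketch below is a simplified illustration in Python, not JunOS configuration or Cloudflare's actual policy: the `Term` and `Route` types and the prefixes are hypothetical, and the only assumption carried over is that a term's match conditions are combined with AND, so an omitted condition matches everything.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Route:
    prefix: str
    route_type: str  # "internal" or "external"


@dataclass(frozen=True)
class Term:
    """One export-policy term; a condition left as None places no restriction."""
    route_type: Optional[str] = None
    prefix_list: Optional[FrozenSet[str]] = None

    def matches(self, route: Route) -> bool:
        # All configured conditions must hold (logical AND).
        if self.route_type is not None and route.route_type != self.route_type:
            return False
        if self.prefix_list is not None and route.prefix not in self.prefix_list:
            return False
        return True


def exported(term: Term, routes: List[Route]) -> List[str]:
    """Return the prefixes the term would accept for export."""
    return [r.prefix for r in routes if term.matches(r)]


# Hypothetical routing table using IPv6 documentation space.
ROUTES = [
    Route("2001:db8:b0::/48", "internal"),   # stands in for the site-specific prefix
    Route("2001:db8:100::/48", "internal"),  # other internally redistributed prefixes
    Route("2001:db8:200::/48", "internal"),
    Route("2001:db8:f00::/48", "external"),
]

before = Term(route_type="internal", prefix_list=frozenset({"2001:db8:b0::/48"}))
after = Term(route_type="internal")  # prefix-list reference removed

if __name__ == "__main__":
    print("before:", exported(before, ROUTES))
    print("after: ", exported(after, ROUTES))
```

In this model the intended change (stop exporting the one site prefix) would require removing the whole term or replacing the condition with an explicit reject. Deleting only the prefix-list condition leaves a term that accepts every internal route.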
25m · Not disclosed affected · IPv6 traffic transiting Cloudflare's Miami edge and external networks whose prefixes were leaked · bgp misconfiguration, configuration change, configuration error, network
- SEV-1
Cloudflare
Aug 21, 2025
Single-customer traffic surge saturates Cloudflare-AWS us-east-1 peering
At 16:27 UTC, a single Cloudflare customer began pulling cached objects from AWS us-east-1 at a rate that doubled total Cloudflare-to-AWS traffic and saturated all direct peering links into us-east-1. AWS attempted to alleviate the congestion by withdrawing BGP advertisements over the saturated links, which rerouted traffic to an offsite peering switch that promptly saturated as well. Two pre-existing infrastructure conditions made the impact worse: one direct peering link was at half capacity due to a known failure, and the Data Center Interconnect to the offsite switch was due for a capacity upgrade. After three hours of manual traffic engineering between Cloudflare and AWS plus rate-limiting the customer, congestion fully resolved at 20:18 UTC.
3h 51m · Not disclosed affected · Customers with origins in AWS us-east-1, primarily traffic transiting Cloudflare's Ashburn (IAD) edge · bgp misconfiguration, capacity shortfall, customer-facing, network
Your own encores
Turn your incidents into structured post-mortems.
Aftermath runs the whole show: Live Stage captures the incident, Encore writes the post-mortem. Private and structured, like these, but yours.