Teams Integration unable to deliver GitHub notifications during upstream provider outage
GitHub · Source
- Started: Mar 24, 2026
- Duration: 3h 54m
- Users affected: Not disclosed
- Revenue impact: Not disclosed
- Blast radius: Microsoft Teams Integration installs (~19% failed deliveries)
- Services: teams-integration, teams-copilot-integration
Summary
An outage at an upstream dependency caused HTTP 500 errors and connection resets on the path used to deliver GitHub event notifications to Microsoft Teams. During the impact window, about 19 percent of integration installs did not receive notifications. GitHub coordinated with the relevant service teams, and the issue resolved when the upstream incident was mitigated.
Impact
Approximately 19 percent of Microsoft Teams Integration installs failed to receive GitHub-to-Teams notifications during the impact window. The error rate against the integration averaged 37.4 percent and peaked at 90.1 percent. Both the standard Teams Integration and the Teams Copilot Integration were affected.
Root cause
An upstream provider that the Teams Integration relies on for delivery experienced an outage.
The outage manifested as HTTP 500 responses and connection resets on requests from the integration to the upstream service.
GitHub had no automated failover path that could route notifications around the failing upstream during the window.
Observability and runbooks for this dependency did not allow for rapid mitigation, so recovery was bounded by the upstream's own time-to-mitigate.
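To illustrate what an automated failover path could look like, here is a minimal Python sketch. It is not GitHub's implementation, and the source does not say that a secondary delivery path exists. The names `DeliveryError`, `Sender`, and `deliver_with_failover` are hypothetical, and the senders stand in for whatever transport an integration actually uses.

```python
from typing import Callable, List, Sequence, Tuple


class DeliveryError(Exception):
    """Hypothetical error for an HTTP 5xx response or a connection reset."""


# A sender takes a serialized notification and raises DeliveryError on failure.
Sender = Callable[[str], None]


def deliver_with_failover(payload: str, paths: Sequence[Tuple[str, Sender]]) -> str:
    """Try each delivery path in order and return the name of the one that worked."""
    errors: List[str] = []
    for name, send in paths:
        try:
            send(payload)
            return name
        except DeliveryError as exc:
            # Remember why this path failed, then fall through to the next one.
            errors.append(f"{name}: {exc}")
    raise DeliveryError("all delivery paths failed: " + "; ".join(errors))
```

With a single path in the list, this degrades to the behavior described above: every upstream failure surfaces as a failed delivery.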
Resolution
GitHub engineers coordinated with the upstream service team during the impact window. The incident resolved at 19:51 UTC when the upstream provider mitigated its own incident, restoring delivery for the Teams and Teams Copilot integrations.
Lessons
- An integration that has no failover path to an alternative delivery mechanism inherits the upstream's availability as a hard ceiling.
- Time-to-mitigate for vendor-resolved incidents is a function of the vendor's response, not your own; observability should at minimum let you detect and communicate quickly.
- Notification systems are a common surface for third-party dependency exposure because they sit at the edge of integration with external platforms.
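As a sketch of the detection point above, the following Python class tracks a rolling error rate for calls to an upstream and reports when it crosses a threshold. The class name `ErrorRateMonitor`, the 5-minute window, the 20 percent threshold, and the minimum request count are all assumptions chosen for illustration; the source does not describe GitHub's alerting.

```python
from collections import deque
from typing import Deque, Tuple


class ErrorRateMonitor:
    """Rolling error rate over a time window (all defaults are illustrative)."""

    def __init__(self, window_s: float = 300.0, threshold: float = 0.2,
                 min_requests: int = 50) -> None:
        self.window_s = window_s
        self.threshold = threshold
        self.min_requests = min_requests
        self._events: Deque[Tuple[float, bool]] = deque()  # (timestamp, ok)

    def record(self, ts: float, ok: bool) -> None:
        """Record one upstream call outcome at time ts (seconds)."""
        self._events.append((ts, ok))
        self._evict(ts)

    def _evict(self, now: float) -> None:
        # Drop outcomes that have aged out of the window.
        while self._events and self._events[0][0] <= now - self.window_s:
            self._events.popleft()

    def error_rate(self, now: float) -> float:
        """Fraction of failed calls in the window ending at now."""
        self._evict(now)
        if not self._events:
            return 0.0
        failures = sum(1 for _, ok in self._events if not ok)
        return failures / len(self._events)

    def should_alert(self, now: float) -> bool:
        """Alert only with enough traffic to make the rate meaningful."""
        self._evict(now)
        return (len(self._events) >= self.min_requests
                and self.error_rate(now) >= self.threshold)
```

Under these assumed settings, an error rate like the 37.4 percent average reported for this incident would be above the alert threshold. Real thresholds would need tuning against normal traffic.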
Action items
- Update observability and runbooks for the Teams Integration so the team can detect, triage, and communicate faster during similar upstream outages.
- Evaluate alternative delivery paths or queueing strategies that allow notifications to recover after the upstream returns rather than failing during the window.
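One way to picture the queueing strategy in the last action item is a small in-memory sketch that parks notifications while the upstream is failing and replays them in order once it recovers. This is an illustration under stated assumptions, not a description of GitHub's design. `ReplayQueue`, `Notification`, `UpstreamUnavailable`, and the six-hour maximum age are hypothetical, and a production version would need durable storage in place of the in-process deque.

```python
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Optional


class UpstreamUnavailable(Exception):
    """Hypothetical error raised by the sender on HTTP 5xx or a connection reset."""


@dataclass
class Notification:
    install_id: str
    payload: str
    enqueued_at: float = field(default_factory=time.time)


class ReplayQueue:
    """Parks notifications that fail delivery and replays them in order."""

    def __init__(self, send: Callable[[Notification], None],
                 max_age_s: float = 6 * 3600) -> None:
        self._send = send
        self._max_age_s = max_age_s  # assumed cutoff for stale notifications
        self._pending: Deque[Notification] = deque()

    def deliver(self, n: Notification) -> bool:
        """Deliver now if possible; otherwise park the notification."""
        if self._pending:
            # Preserve ordering: queue behind the existing backlog.
            self._pending.append(n)
            return False
        try:
            self._send(n)
            return True
        except UpstreamUnavailable:
            self._pending.append(n)
            return False

    def drain(self, now: Optional[float] = None) -> int:
        """Replay the backlog, stopping at the first failure. Returns the count delivered."""
        now = time.time() if now is None else now
        delivered = 0
        while self._pending:
            head = self._pending[0]
            if now - head.enqueued_at > self._max_age_s:
                self._pending.popleft()  # too old to be useful; drop it
                continue
            try:
                self._send(head)
            except UpstreamUnavailable:
                break  # upstream still failing; try again later
            self._pending.popleft()
            delivered += 1
        return delivered

    def __len__(self) -> int:
        return len(self._pending)
```

A periodic task would call `drain()` on a schedule. The trade-off is that notifications arrive late rather than not at all, so the maximum age should reflect how long a delayed notification is still worth sending.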