1.1.1.1 cache change reorders CNAME records and breaks legacy DNS clients
Cloudflare · Source
- Started: Jan 8, 2026
- Duration: 2h 15m
- Users affected: Not disclosed
- Revenue impact: Not disclosed
- Blast radius: Subset of 1.1.1.1 users running glibc-based stub resolvers and certain Cisco switch DNS clients
- Services: 1.1.1.1, dns-resolver, cache
Summary
A memory-optimization change to the cache merge logic of the 1.1.1.1 resolver reversed the order of records returned for partially expired CNAME chains, so that CNAME records appeared after their target A records instead of before them. Modern resolvers tolerate either order, but glibc's getaddrinfo and the DNS process on certain Cisco Catalyst switches parse answers sequentially and require CNAMEs to come first. These clients saw empty answers, and the Cisco devices entered spontaneous reboot loops. The change was deployed on January 7 and had reached 90 percent of servers before it was reverted on January 8.
Impact
DNS clients that rely on sequential parsing of CNAME chains, including Linux glibc-based applications and three models of Cisco Catalyst switches configured to use 1.1.1.1, failed to resolve names that involved a partially cached CNAME chain. On affected Cisco devices, the DNSC process sent the switch into reboot loops. Most modern resolvers (such as systemd-resolved) were unaffected.
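To make the failure mode concrete, here is a minimal Python sketch of a sequential stub parser next to an order-tolerant one. It is a simplified model written for illustration, not glibc's or Cisco's actual code. The `Record` type, both function names, and the example names and addresses are hypothetical.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    """Hypothetical, simplified resource record: owner name, type, data."""
    name: str
    rtype: str
    data: str


def sequential_resolve(qname: str, answers: List[Record]) -> List[str]:
    """Single forward pass: only accepts records owned by the name it
    currently expects, and updates that name when it meets a CNAME."""
    expected = qname
    addresses = []
    for rr in answers:
        if rr.name != expected:
            continue  # record for a name we are not (yet) looking for
        if rr.rtype == "CNAME":
            expected = rr.data  # follow the alias for later records
        elif rr.rtype == "A":
            addresses.append(rr.data)
    return addresses


def order_tolerant_resolve(qname: str, answers: List[Record]) -> List[str]:
    """Two passes: index every CNAME first, follow the chain, then collect
    the A records owned by the final name. Record order does not matter."""
    cnames = {rr.name: rr.data for rr in answers if rr.rtype == "CNAME"}
    name, seen = qname, set()
    while name in cnames and name not in seen:  # 'seen' guards against loops
        seen.add(name)
        name = cnames[name]
    return [rr.data for rr in answers if rr.rtype == "A" and rr.name == name]


cname = Record("www.example.com.", "CNAME", "cdn.example.net.")
addr = Record("cdn.example.net.", "A", "192.0.2.10")

print(sequential_resolve("www.example.com.", [cname, addr]))      # ['192.0.2.10']
print(sequential_resolve("www.example.com.", [addr, cname]))      # []
print(order_tolerant_resolve("www.example.com.", [addr, cname]))  # ['192.0.2.10']
```

In the A-first case the sequential parser skips the A record because its owner name is not yet the expected one, then learns the alias too late and returns nothing.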
Root cause
The cache merge function was rewritten from a build-new-list pattern to an extend-existing-list pattern; this swapped the order so cached CNAMEs were appended after, rather than prepended before, the freshly resolved answer records.
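The difference between the two merge patterns can be sketched in a few lines of Python. This is an illustration of the general pattern only, not the resolver's actual code. The function names, the `Record` type, and the example data are assumptions made for this sketch.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    """Hypothetical, simplified resource record: owner name, type, data."""
    name: str
    rtype: str
    data: str


def merge_build_new(cached_cnames: List[Record], fresh: List[Record]) -> List[Record]:
    """Build-new-list pattern: allocate a fresh list, CNAMEs go in first."""
    merged: List[Record] = []
    merged.extend(cached_cnames)
    merged.extend(fresh)
    return merged


def merge_extend_existing(cached_cnames: List[Record], fresh: List[Record]) -> List[Record]:
    """Extend-existing-list pattern: reuse the freshly resolved list to save
    an allocation. The cached CNAMEs now land after the answer records."""
    fresh.extend(cached_cnames)
    return fresh


cached = [Record("www.example.com.", "CNAME", "cdn.example.net.")]
fresh = [Record("cdn.example.net.", "A", "192.0.2.10")]

before = merge_build_new(cached, list(fresh))
after = merge_extend_existing(cached, list(fresh))

print([rr.rtype for rr in before])  # ['CNAME', 'A']
print([rr.rtype for rr in after])   # ['A', 'CNAME']
```

Both functions return the same set of records, which is why a test that checks only resolution correctness would pass for either one.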
Tests asserted resolution correctness but did not assert record ordering, because RFC 1034 uses non-normative language ('possibly preface') and never formally requires CNAMEs first.
The 40-year-old DNS specification leaves the relative ordering of different RRsets within a message section unspecified, so widely deployed stub resolvers came to depend on de facto behavior that was never explicitly required.
Some affected DNS clients sit on long-lived hardware or operating systems that rarely receive updates, so the burden of preserving existing wire-format behavior falls disproportionately on the upstream resolver operator.
Resolution
The incident was declared at 18:19 UTC, and the team began rolling back the release at 18:27 UTC. The revert completed across the resolver fleet at 19:55 UTC, restoring the original CNAME-first ordering.
Lessons
- Behavioral compatibility for widely deployed clients is not the same as specification compliance; if a behavior has been stable for decades, removing it without a deprecation path will break someone.
- Performance optimizations that change wire-format output need ordering and shape assertions in tests, not only correctness assertions.
- Ambiguous specs are best treated as conservative requirements when deployed at Internet scale; you inherit the union of every implementation's expectations.
- Stub resolvers in firmware and infrastructure devices may never be patched in a meaningful timeframe, so resolver operators effectively own backwards compatibility forever.
Action items
- Restore the original CNAME-first record ordering in 1.1.1.1 responses and commit to maintaining that ordering going forward.
- Add explicit tests that assert CNAME records appear before other answer types in the merged response.
- Submit an Internet-Draft to the IETF DNSOP working group to formally specify CNAME ordering in DNS responses.
- Review other recent cache and parser optimizations for behavioral changes that may not be covered by existing test assertions.
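An ordering assertion of the kind described in the action items might look like the following Python sketch. It assumes a simplified record model and a hypothetical helper, `cname_order_violations`, that flags any CNAME appearing after a record owned by its target. It is not taken from Cloudflare's test suite.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    """Hypothetical, simplified resource record: owner name, type, data."""
    name: str
    rtype: str
    data: str


def cname_order_violations(answers: List[Record]) -> List[Record]:
    """Return every CNAME that appears after a record owned by its target.
    An empty result means each alias precedes the records it points to."""
    first_seen = {}
    for idx, rr in enumerate(answers):
        first_seen.setdefault(rr.name, idx)  # earliest index per owner name
    return [
        rr
        for idx, rr in enumerate(answers)
        if rr.rtype == "CNAME" and first_seen.get(rr.data, len(answers)) < idx
    ]


chain = [
    Record("www.example.com.", "CNAME", "edge.example.net."),
    Record("edge.example.net.", "CNAME", "cdn.example.net."),
    Record("cdn.example.net.", "A", "192.0.2.10"),
]

# Shape assertion: passes for CNAME-first output.
assert cname_order_violations(chain) == []

# The regressed ordering (A record first) is flagged.
reordered = [chain[2], chain[0], chain[1]]
assert cname_order_violations(reordered) == [chain[1]]
```

Checking each alias against its own target, rather than only checking that all CNAMEs come before all other types, also catches a multi-step chain whose links are emitted out of order.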