- After 06:27 UTC, users began reporting Cloudflare downtime on DownDetector.
- Nineteen Cloudflare data centers went down.
- As a result of the outage, users started receiving 5xx error messages.
- Cloudflare acknowledged the issue and said it was working to resolve it.
- The first data center went online at 06:58 UTC.
- All data centers were online and functioning correctly by 07:42 UTC.
Introduction to the Cloudflare Outage
The Cloudflare outage that took down so many sites on June 21, 2022 was big news, but the real drama came later, when it emerged that the problems reported for Amazon Web Services traced back to the Cloudflare outage. According to Downdetector, a range of websites and services were affected, including Discord, Shopify, Grindr, Fitbit, AWS, and Peloton. How did that happen? Was it really an accident? I’ll tell you how it all went down and what that means for your business!
What is Cloudflare?
Cloudflare provides a free content delivery network and security service that sits between users and hosting providers and protects websites from attacks. It has 150+ data centers around the world. It also offers other services, including web acceleration, domain name registration, DNS management, and analytics, as well as Cloudflare Spectrum, which extends its proxying and protection to non-web TCP and UDP applications.
Cloudflare Outage: A look back at what went wrong
On June 21, 2022, a widespread outage that originated with Cloudflare left multiple websites and services unavailable around the world, including Amazon Web Services (AWS), Twitter, Zerodha, Shopify, Discord, and Canva. At the time, Cloudflare was in the middle of converting all of its busiest locations to a more flexible and resilient architecture.
The cause of the outage was a change to the network configuration in those busiest locations, made as part of this long-term project to increase resilience.
Background of the change in network configuration
As part of Cloudflare’s ongoing effort to build a more flexible, resilient architecture, the company has been converting its busiest locations over the last 18 months. This architecture, which Cloudflare calls Multi-Colo PoP (MCP), has been rolled out in 19 data centers: Amsterdam, Atlanta, Ashburn, Chicago, Frankfurt, London, Los Angeles, Madrid, Manchester, Miami, Milan, Mumbai, Newark, Osaka, São Paulo, San Jose, Singapore, Sydney, and Tokyo.
Using this new architecture has improved reliability and allowed for maintenance to be performed without disrupting customer traffic. Since these locations also carry a significant share of Cloudflare’s traffic, any problem there can have a widespread effect. Unfortunately, this is what happened on Tuesday, June 21.
What sites were affected?
According to the outage reporting website DownDetector, multiple websites were experiencing outages, including Discord, Zerodha, Shopify, Amazon Web Services, Twitter, Canva, and even Valorant, a popular tactical shooter game.
Crypto exchanges such as WazirX, Coinbase, FTX, Bitfinex, and OKX were also unresponsive, as were other websites including Udemy, Splunk, Quora, and Crunchyroll. Most of these websites became accessible again later.
Timeline of the outage
03:56 UTC: The change was deployed to the first location. No issues were reported, because that data center was still running the older architecture.
06:17 UTC: The change was deployed to all Cloudflare locations except those running the MCP architecture.
06:27 UTC: The rollout reached the MCP-enabled locations and the problem started, ultimately affecting 19 locations.
06:51 UTC: An incident was raised and the root cause was verified.
06:58 UTC: The first data center came back online.
07:42 UTC: All data centers were online and functioning correctly.
08:00 UTC: Cloudflare closed the incident.
Although these 19 locations account for just 4% of Cloudflare’s total network, the outage impacted 50% of all requests.
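To put the timeline and the traffic figures into numbers, here is a small Python sketch. It uses only the timestamps and percentages quoted in this article; the variable names and the helper `minutes_between` are made up for illustration, and the "concentration" ratio is just one simple way to express how much more traffic these locations carry than their share of the network would suggest.

```python
from datetime import datetime

# Timestamps (UTC) as quoted in this article for June 21, 2022.
FMT = "%Y-%m-%d %H:%M"
outage_start = datetime.strptime("2022-06-21 06:27", FMT)
first_recovery = datetime.strptime("2022-06-21 06:58", FMT)
full_recovery = datetime.strptime("2022-06-21 07:42", FMT)


def minutes_between(start: datetime, end: datetime) -> int:
    """Return the whole number of minutes from start to end."""
    return int((end - start).total_seconds() // 60)


time_to_first_recovery = minutes_between(outage_start, first_recovery)
total_outage = minutes_between(outage_start, full_recovery)

# Percentages quoted in this article: share of the network vs. share of requests.
network_share_pct = 4
request_share_pct = 50
concentration = request_share_pct / network_share_pct

print(f"First data center back after {time_to_first_recovery} minutes")
print(f"All data centers back after {total_outage} minutes")
print(f"Affected locations carried about {concentration}x their share of requests")
```

In other words, the first location recovered about half an hour after the problem began, full recovery took an hour and a quarter, and a small slice of the network was responsible for a very large slice of the traffic.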
Why did this affect AWS too?
Cloudflare integrates quickly and easily with AWS. Companies host their websites and run their applications on AWS, then put Cloudflare in front to keep them secure, fast, and reliable, using it as a unified control plane for consistent security policies, faster performance, and load balancing for their AWS S3 or EC2 deployments.
As a result, many companies use AWS together with Cloudflare, and their sites were all affected when the Cloudflare locations in front of them went offline.
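The idea can be sketched with a tiny Python model. Everything here is hypothetical and simplified: `Origin` stands in for a site hosted on AWS, `EdgeLocation` stands in for a Cloudflare data center sitting in front of it, and the status code 500 is only an example of the 5xx errors users reported. It is not how Cloudflare or AWS are actually implemented; it only shows why a healthy origin does not help when the layer in front of it is down.

```python
from dataclasses import dataclass


@dataclass
class Origin:
    """Hypothetical site hosted on AWS (for example on EC2 or S3)."""
    healthy: bool = True

    def handle(self, path: str) -> int:
        # A healthy origin answers normally; an unhealthy one returns an error.
        return 200 if self.healthy else 500


@dataclass
class EdgeLocation:
    """Hypothetical Cloudflare data center that proxies requests to the origin."""
    name: str
    online: bool = True

    def handle(self, path: str, origin: Origin) -> int:
        if not self.online:
            # The request never reaches the origin, so the user sees a 5xx
            # error even though the origin itself is fine.
            return 500
        return origin.handle(path)


origin = Origin(healthy=True)
working_edge = EdgeLocation("Example location A", online=True)
failed_edge = EdgeLocation("Example location B", online=False)

print(working_edge.handle("/", origin))  # request is forwarded to the origin
print(failed_edge.handle("/", origin))   # request fails at the edge
```

From the user's point of view both failure modes look the same, which is one reason an outage in the proxy layer can show up in reports as a problem with the service behind it.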
Conclusion
It’s hard to imagine a better time to be an engineer. We have all of these amazing tools at our disposal, and it’s up to us to decide how we will use them. One thing is for sure: things are going to get more interesting as companies continue investing in new ways for us to build things faster and more efficiently. If you are learning cloud computing, please check out our AWS solution architect course, and take a look at our other blogs for more posts like this.