DEVELOPING STORY | AWS outage impacts over 80 Amazon apps and services, recovery underway

By Deborah Grey
As w.media's Global Editor-in-Chief, Grey covers the cloud and data center industry and connectivity ecosystem across APAC and EMEA. In a career spanning over two decades, Grey has dabbled in television, print and online journalism, covering a variety of beats including human rights, health, environment, politics, business and economy.
Screenshot of AWS Health Dashboard updates showing impacted apps and services

Amazon Web Services (AWS) was hit by an outage on Monday that disrupted at least 80 related web applications and services. At the time of publishing, however, AWS reported that the underlying issue had been “fully mitigated” and that “most AWS Service operations are succeeding normally now.”

The outage affected services including AWS Application Migration Service, AWS CloudFormation, AWS CloudTrail, AWS Lambda, AWS Network Firewall, AWS Organizations, Amazon DynamoDB, AWS Transit Gateway, Amazon Location Service and AWS Support Center, among others. Media reports say that Perplexity AI, Snapchat and Alexa were also impacted.

 

Timeline of the outage

Trouble began shortly after midnight Pacific Daylight Time (PDT). In a post on its Health Dashboard at 12:11 AM PDT, AWS said that it was investigating increased error rates and latencies for multiple AWS services in the US-EAST-1 Region.

AWS engineers swung into action, and a subsequent post at 2:01 AM PDT said, “We have identified a potential root cause for error rates for the DynamoDB APIs in the US-EAST-1 Region. Based on our investigation, the issue appears to be related to DNS resolution of the DynamoDB API endpoint in US-EAST-1.”
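For readers who want to see what a DNS resolution check of that endpoint looks like in practice, here is a minimal Python sketch. It uses the standard library's `socket.getaddrinfo` against the public regional hostname `dynamodb.us-east-1.amazonaws.com`. The helper name `resolve_endpoint` and its injectable `resolver` parameter are illustrative assumptions, not anything AWS published, and the check only tells you whether your own resolver returns addresses, not whether the service itself is healthy.

```python
import socket

# Public regional DynamoDB hostname. The helper below is a hypothetical
# illustration, not AWS tooling.
DYNAMODB_ENDPOINT = "dynamodb.us-east-1.amazonaws.com"


def resolve_endpoint(hostname, resolver=socket.getaddrinfo):
    """Return the sorted unique IP addresses for hostname, or [] if the lookup fails."""
    try:
        results = resolver(hostname, 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # The resolver could not translate the name into an address.
        return []
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({entry[4][0] for entry in results})
```

Calling `resolve_endpoint(DYNAMODB_ENDPOINT)` returns a list of addresses when the lookup succeeds and an empty list when it fails, which is roughly the symptom clients would have seen while the endpoint was not resolving.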

AWS reported improvement at 3:35 AM PDT, when its update said, “The underlying DNS issue has been fully mitigated, and most AWS Service operations are succeeding normally now. Some requests may be throttled while we work toward full resolution. Additionally, some services are continuing to work through a backlog of events such as Cloudtrail and Lambda.” It cautioned, “While most operations are recovered, requests to launch new EC2 instances (or services that launch EC2 instances such as ECS) in the US-EAST-1 Region are still experiencing increased error rates. We continue to work toward full resolution. If you are still experiencing an issue resolving the DynamoDB service endpoints in US-EAST-1, we recommend flushing your DNS caches.”
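When a provider warns that requests may be throttled during recovery, a common client-side response is to retry with capped exponential backoff and jitter rather than hammering the service. The Python sketch below shows that general pattern only. `ThrottledError`, `call_with_backoff` and the delay values are hypothetical names and numbers chosen for illustration; they are not part of any AWS SDK or of AWS's guidance for this incident.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for whatever throttling error a client raises."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep, rng=random.random):
    """Call operation(), retrying on ThrottledError with capped exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            # The cap doubles each attempt (0.5s, 1s, 2s, ...) up to max_delay.
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Jitter: wait a random fraction of the cap so clients do not retry in lockstep.
            sleep(cap * rng())
```

The `sleep` and `rng` parameters exist only so the behaviour can be exercised without real waiting; in normal use the defaults apply and the caller simply passes the function to retry.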

About four hours after the initial disruption, the latest update, posted at 4:08 AM PDT, said, “We are continuing to work towards full recovery for EC2 launch errors, which may manifest as an Insufficient Capacity Error. Additionally, we continue to work toward mitigation for elevated polling delays for Lambda, specifically for Lambda Event Source Mappings for SQS.”

Screenshot of Coinbase status updates from the Coinbase website

Cryptocurrency platform Coinbase also reported disruptions. At 1:20 AM PDT it said, “A major Amazon Web Services (AWS) outage is causing widespread internet disruptions, impacting multiple apps and services, including Coinbase.” By 4:06 AM, however, it indicated that services were returning to normal, posting, “Login is now restored, and sending and receiving are back to normal. Our team is continuing to assess any remaining user impacts. Rest assured, your funds are safe.”

We will update this story once further recovery is made and all systems and services are fully restored.
