Oracle NetSuite and Azure Sydney downtime: services returning to normal but questions remain

September 1, 2023 at 8:23 AM GMT+8

Oracle NetSuite and Microsoft Azure cloud services in Sydney appear to have recovered from the outage that hit them on the evening of Wednesday 30 August.

The Azure website posted the following description and explanation of that evening's events:

Impact Statement: Starting at approximately 08:30 UTC on 30 August 2023, a utility power surge in the Australia East region tripped a subset of the cooling units offline in one datacenter, within one of the Availability Zones. While working to restore cooling, temperatures in the datacenter increased so we proactively powered down a small subset of selected compute and storage scale units, to avoid damage to hardware. Multiple downstream services were impacted, with targeted communications being distributed via Azure Service Health.

Current Status: Storage infrastructure has recovered. A subset of services still experiencing residual impact are on the path to mitigation.

Mitigation: We worked on recovering the failed cooling units and reducing the overall temperature within the impacted area. Once temperature levels were within operational thresholds, we began to restore power to the affected infrastructure and started a phased process to bring this infrastructure back online. Once storage infrastructure was fully restored, dependent compute scale units were then also restored to operation. As the underlying compute and storage scale units became healthy, compute and other dependent Azure services recovered.

While we have broadly recovered, a small subset of services are still working on post recovery checks, and we are closely monitoring the datacenter metrics for storage and compute resources to ensure they continue to show as healthy. For any residual customers with services still in the recovery process, we will communicate directly to them through Service Health in the Azure portal, which also triggers Service Health alerts.

The Oracle NetSuite website posted similar statements throughout the incident, including the information that the outage was caused by “an interruption in the chiller plant as a result of a lightning storm”:

All services have been restored in the AP Sydney data center.
Customer Impact: All customers hosted in the AP Sydney data center were unable to log into the NetSuite service.
If you are still experiencing issues, please contact NetSuite Customer Support through your standard method.
Start Time: August 30, 09:06 PM AEST
End Time: August 31, 10:06 AM AEST

Posted August 31, 10:08 AM AEST

Update

Customers in the AP Sydney data center are starting to experience service recovery, and we are continuing to monitor mitigation progress.
Customer Impact: All NetSuite customers hosted in the AP Sydney data center were offline.

Posted August 31, 09:41 AM AEST

At the time of writing, the ‘Current Status’ page indicates that the full range of services is available.

Downtime always raises questions, especially when it affects services provided by market-leading companies, and here some details about the cause still appear to need clearing up. The violent storm, which caused 30,000 homes to lose power, is mentioned in the Oracle NetSuite explanation. But Sydney is the wettest of Australia’s major cities and is increasingly subject to disruptive weather events, so how are its data centers preparing for a wilder climate?

The recovery may also raise questions about how the incident was communicated as its impact began to be felt by these key data centers and their customers. There is also the issue of what can be learned from what happened, and whether there is a need for greater disclosure and discussion of the specific path by which an external event such as a storm causes downtime (if it was indeed the cause). Compared with issues around sustainability, threats to availability appear to be less open to discussion.
