Microsoft’s Singapore data center experienced an outage due to a failure in its cooling units. As a result, Windows Azure services were disrupted.
A power surge in the South-east Asian region on Wednesday caused some cooling units to go offline, resulting in increased temperatures in that data centre, said Microsoft. The company “proactively powered down a number of compute and storage units to avoid damage to hardware and reduce cooling system load”, The Straits Times reported.
The Windows Azure issue has since been addressed, and Microsoft said that power was restored to the affected infrastructure in its data centre after temperatures returned to normal operating limits. This is the second such outage in the past few weeks.
Multiple organizations, including the Central Provident Fund (CPF) Board, EZ-Link, the Esplanade and Nanyang Technological University (NTU), saw disruptions to their Web services on Wednesday (Feb 8) as a result of the Microsoft Azure cloud service outage, The Business Times reported.
Services were subsequently restored. Microsoft Azure said that cooling systems in the impacted areas of the data centre were successfully restored and that temperatures had returned to normal operating thresholds. A structured power-up sequence was being carried out on the affected compute and storage resources, the statement added.
Industry watchers suggest that this outage could be due to a combination of factors that led to the failure of the cooling units. One of the main factors could be an issue with the Tier 3 design.
Many hyperscalers may not have Tier 4 certified data centers, and there could be an issue with the design. There is also a strong possibility that data centers do not pay enough attention to the quality of power and instead stay fixated on the quantity of power supplied to a data center.
Data Center Tier Ratings Explained
|Tier|Description and expected uptime|
|---|---|
|Tier 1|A data center with a single path for power and cooling, and no backup components. This tier has an expected uptime of 99.671% per year.|
|Tier 2|A data center with a single path for power and cooling, and some redundant and backup components. This tier offers an expected uptime of 99.741% per year.|
|Tier 3|A data center with multiple paths for power and cooling, and redundant systems that allow the staff to work on the setup without taking it offline. This tier has an expected uptime of 99.982% per year.|
|Tier 4|A completely fault-tolerant data center with redundancy for every component. This tier comes with an expected uptime of 99.995% per year.|
Source: Uptime Institute
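
To put the uptime percentages above in more concrete terms, each one can be converted into the downtime it allows per year. The short Python sketch below does that arithmetic. It assumes a 365-day year (8,760 hours), and the names `TIER_UPTIME_PERCENT` and `annual_downtime_hours` are illustrative only; they do not come from Microsoft or the Uptime Institute.

```python
# Rough conversion of expected uptime into allowed downtime per year.
# Assumption: a non-leap year of 365 days, i.e. 8,760 hours.
HOURS_PER_YEAR = 365 * 24

# Expected uptime figures as listed in the tier table (percent per year).
TIER_UPTIME_PERCENT = {
    "Tier 1": 99.671,
    "Tier 2": 99.741,
    "Tier 3": 99.982,
    "Tier 4": 99.995,
}


def annual_downtime_hours(uptime_percent: float) -> float:
    """Return the hours of downtime per year implied by an uptime percentage."""
    return (100.0 - uptime_percent) / 100.0 * HOURS_PER_YEAR


if __name__ == "__main__":
    for tier, uptime in TIER_UPTIME_PERCENT.items():
        hours = annual_downtime_hours(uptime)
        print(f"{tier}: about {hours:.1f} hours of downtime per year")
```

Under these assumptions, the figures work out to roughly 28.8 hours a year for Tier 1, 22.7 hours for Tier 2, 1.6 hours for Tier 3 and 0.4 hours (about 26 minutes) for Tier 4.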