The severity of data-center outages appears to be falling, while the cost of outages continues to climb. Power failures are the biggest cause of significant site outages. Network failures and IT system glitches also bring down data centers, and human error often contributes.
Those are some of the problems pinpointed in the most recent Uptime Institute data-center outage report, which analyzes types of outages, their frequency, and what they cost both in money and consequences.
Uptime cautions that data relating to outages should be treated skeptically given the lack of transparency of some outage victims and the quality of reporting mechanisms. "Outage information is opaque and unreliable," said Andy Lawrence, executive director of research at Uptime, during a briefing about Uptime's Annual Outages Analysis 2023.
While some industries, such as airlines, have mandatory reporting requirements, there's limited reporting in other industries, Lawrence said. "So we have to rely on our own means and methods to get the data. And as we all know, not everybody wants to share details about outages for a whole variety of reasons. Sometimes you get a very detailed root-cause analysis, and other times you get pretty well nothing," he said.
The Uptime report culled data from three main sources: Uptime's Abnormal Incident Report (AIRs) database; its own surveys; and public reports, which include news stories, social media, outage trackers, and company statements. The accuracy of each varies. Public reports may lack details and sources might not be trustworthy, for example. Uptime rates its own surveys as producing "fair/good" data, since the respondents are anonymous and their job roles vary. The quality of AIRs data is deemed "very good," since it comprises detailed, facility-level data voluntarily shared by data-center owners and operators among their peers.
There's evidence that outage rates have been gradually falling in recent years, according to Uptime.
That doesn't mean the total number of outages is shrinking. In fact, the number of outages globally increases each year as the data-center industry expands. "This can give the false impression that the rate of outages relative to IT load is growing, whereas the opposite is the case," Uptime reported. "The frequency of outages is not growing as fast as the expansion of IT or the global data-center footprint."
Overall, Uptime has observed a steady decline in the outage rate per site, as tracked through four of its own surveys of data-center managers and operators conducted from 2020 to 2022. In 2022, 60% of survey respondents said they had an outage in the past three years, down from 69% in 2021 and 78% in 2020.
"There seems to be a gently, gently improving picture of the outage rate," Lawrence said.
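As a rough illustration of that trend, the short Python sketch below computes the year-over-year change using only the three survey percentages quoted above. The function name `year_over_year_change` is a hypothetical helper chosen for this example, not something from Uptime's report.

```python
# Share of survey respondents (in percent) who reported an outage in the
# past three years, as quoted from Uptime's surveys in the text above.
outage_share = {2020: 78, 2021: 69, 2022: 60}


def year_over_year_change(series):
    """Return {year: change in percentage points versus the previous year}.

    Hypothetical helper for illustration only.
    """
    years = sorted(series)
    return {curr: series[curr] - series[prev] for prev, curr in zip(years, years[1:])}


changes = year_over_year_change(outage_share)
for year, delta in changes.items():
    print(f"{year}: {delta:+d} percentage points")
```

On these figures, the share of respondents reporting an outage fell by nine percentage points in each of the two most recent surveys. This describes survey responses only and says nothing about the absolute number of outages, which Uptime notes is still rising.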
While 60% of data-center sites have experienced an outage in the past three years, only a small proportion are rated serious or severe.
Uptime measures the severity of outages on a scale of one to five, with five being the most severe. Level 1 outages are negligible and cause no service disruptions. Level 5 mission-critical outages involve major and damaging disruption of services and/or operations and often include large financial losses, safety issues, compliance breaches, customer losses, and reputational damage.
Level 5 and Level 4 (serious) outages historically account for about 20% of all outages. In 2022, outages in the serious/severe categories fell to 14%.
A key reason is that data-center operators are better equipped to handle unexpected events, according to Chris Brown, chief technical officer at Uptime. "We've become much better at designing systems and managing operations to a point where a single fault or failure does not necessarily result in a severe or serious outage," he said.
Today's systems are built with redundancy, and operators are more disciplined about creating systems that are capable of responding to abnormal incidents and averting outages, Brown said.
When outages do occur, they are becoming more expensive, a trend that is likely to continue as dependency on digital services grows.
Looking at the last four years of Uptime's own survey data, the proportion of major outages that cost more than $100,000 in direct and indirect costs is increasing. In 2019, 60% of outages fell under $100,000 in terms of recovery costs. In 2022, just 39% of outages cost less than $100,000.
Also in 2022, 25% of respondents said their most recent outage cost more than $1 million, and 45% said their most recent outage cost between $100,000 and $1 million.
Inflation is part of the reason, Brown said; the costs of replacement equipment and labor are higher.
More significant is the degree to which companies depend on digital services to run their businesses. The loss of a critical IT service can be tied directly to disrupted business and lost revenue. "Any of these outages, especially the serious and severe outages, have the ability to impact multiple organizations, and a larger swath of people," Brown said, "and the cost of having to mitigate that is ever increasing."
As more workloads are outsourced to external service providers, the reliability of third-party digital infrastructure companies is increasingly important to enterprise customers, and these providers tend to suffer the most public outages.
Third-party commercial operators of IT and data centers (cloud providers, digital service providers, telecommunications providers) accounted for 66% of all the public outages tracked since 2016, Uptime reported. Looked at year by year, the percentage has been creeping up. In 2021 the proportion of outages caused by cloud, colocation, telecommunications, and hosting companies was 70%, and in 2022 it was up to 81%.
"The more that companies push their IT services into other people's domain, they're going to have to do their due diligence, and also continue to do their due diligence even after the deal is struck," Brown said.
While it's rarely the single or root cause of an outage, human error plays some role in 66% to 80% of all outages, according to Uptime's estimate based on 25 years of data. But it acknowledges that analyzing human error is challenging. Shortcomings such as improper training, operator fatigue, and a lack of resources can be difficult to pinpoint.
Uptime found that human error-related outages are mostly caused either by staff failing to follow procedures (cited by 47% of respondents) or by the procedures themselves being faulty (40%). Other common causes include in-service issues (27%), installation issues (20%), insufficient staff (14%), preventative maintenance-frequency issues (12%), and data-center design or omissions (12%).
On the positive side, investing in good training and management processes can go a long way toward reducing outages without costing too much.
"You don't need to go to a banker and get a bunch of capital money to solve these problems," Brown said. "People need to make the effort to create the procedures, test them, make sure they're correct, train their staff to follow them, and then have the oversight to ensure that they truly are following them."
"This is the low-hanging fruit to prevent outages, because human error is implicated in so many," Lawrence said.
Uptime said its current survey findings are consistent with previous years and show that on-site power problems remain the biggest cause of significant site outages by a large margin. This holds even though most outages have several causes and the quality of reporting about them varies.
In 2022, 44% of respondents said power was the primary cause of their most recent impactful incident or outage. Power was also the leading cause of significant outages in 2021 (cited by 43%) and 2020 (37%).
Network issues, IT system errors, and cooling failures also stand out as troubling causes, Uptime said.
Uptime used its own data, from its 2023 Uptime resiliency survey, to dig into network outage trends. Among survey respondents, 44% said their organization had experienced a major outage caused by network or connectivity issues over the past three years. Another 45% said no, and 12% didn't know.
The two most common causes of networking- and connectivity-related outages are configuration or change management failure (cited by 45% of respondents) and a third-party network provider's failure (39%).
Uptime attributed the trend to today's network complexity. "In modern, dynamically switched and software-defined environments, programs to manage and optimize networks are constantly revised or reconfigured. Errors become inevitable, and in such a complex and high-throughput environment, frequent small errors can propagate across networks, resulting in cascading failures that can be difficult to stop, diagnose, and fix," Uptime reported.
Other common causes of major network-related outages include:
When Uptime asked respondents to its resiliency survey if their organization experienced a major outage caused by an IT systems or software failure over the past three years, 36% said yes, 50% said no, and 15% didn't know. The most common causes of outages related to IT systems and software are:
Publicly recorded outages, which include outages that are reported in the media, reveal a wide range of causes. The causes can differ from what data-center operators and IT teams report, since the media sources' knowledge and understanding of outages depend on their perspective. "What's really interesting is the sheer variety of causes, and that's partly because this is how the public and the media perceive them," Lawrence said.
Fire is one cause that showed up among publicly reported outages but didn't rank highly among IT-related sources. Specifically, Uptime found that 7% of publicly reported data-center outages were caused by fires. In the web briefing, Uptime researchers related the incidence of data-center fires to increasing use of lithium-ion (Li-ion) batteries.
Li-ion batteries have a smaller footprint, simpler maintenance, and longer lifespan compared to lead-acid batteries. However, Li-ion batteries present a greater fire risk. "A Maxnod data center in France suffered a devastating fire on March 28, 2023, and we believe it's caused by lithium-ion battery fire," Lawrence said. A lithium-ion battery fire is also the reported cause of a major fire on Oct. 15, 2022, at a South Korea colocation facility owned by SK Group and operated by its C&C subsidiary.
"We find, every time we do these surveys, fire doesn't go away," Lawrence said.
Read more here:
10 things to know about data-center outages - Network World