10 Reasons Data Centers Fail

Operators sometimes make common mistakes that can lead to data center outages. Most outages can be avoided through proper maintenance, procedures, and common sense.

{image 1}

An "unplanned data center outage" is a polite way to say that a data center failed. Whether the root cause is a hardware failure, software bug, or human error, most failures can -- and should -- be prevented. With the high level of redundancy built into today's data center architectures, prevention is very much possible.
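To put a rough number on why redundancy makes prevention plausible, here is a minimal Python sketch of the standard parallel-availability formula, 1 - (1 - a)^n. It assumes failures are independent, which real facilities often violate through shared power feeds, shared procedures, and human error. The function names and the 0.99 per-unit availability are illustrative assumptions, not figures from this article.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def parallel_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant units when any single unit suffices.

    Assumes independent failures; correlated failures (shared power,
    shared procedures, operator error) make the real figure worse.
    """
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The system is down only when every redundant unit is down at once.
    return 1.0 - (1.0 - component_availability) ** copies


def expected_downtime_minutes(availability: float) -> float:
    """Expected minutes of downtime per year at a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    unit_availability = 0.99  # hypothetical value, for illustration only
    for copies in (1, 2, 3):
        a = parallel_availability(unit_availability, copies)
        minutes = expected_downtime_minutes(a)
        print(f"{copies} unit(s): availability {a:.6f}, about {minutes:.1f} min/year")
```

On paper, each added unit shrinks expected downtime dramatically. The gap between that figure and real-world experience is largely the correlated and human-caused failures that the independence assumption leaves out.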

What's striking is that data center failures still happen all the time. Given the high cost of every minute lost during a full outage, you'd expect them to be far rarer. If data center managers focused on fixing the most common causes of failure, they would significantly reduce the risk of a catastrophic outage.

The problem is that many data center operators focus heavily on growth instead of the care and feeding of what's already in place. Watch administrators in many public and private data centers these days, and you'll find them focused largely on increasing capacity, boosting server density, and retrofitting aging server rooms into more modern facilities with more efficient cooling systems. All of this is fantastic and reflects the incredible growth of the data center industry, but it also helps explain why outages are so common.

On the following pages, we're going to get back to data center basics. We'll present 10 common reasons why data centers fail. Click through and think about how these failures might one day surface in your data center. Not every scenario will match your data center architecture, but we're confident that at least a few of the topics will hit home and make you think about what you can do to shore up your facility.

And if you have any additional thoughts, tips, or stories that may help your fellow administrators avoid an outage, please share them in the comments below.

(Image: 123Net / Wikimedia Commons)

About the Author(s)

Andrew Froehlich, President, West Gate Networks

A network architect and IT consultant with worldwide contacts, particularly in the United States and Southeast Asia, Andrew Froehlich has nearly two decades of experience and holds multiple industry certifications in enterprise networking. Froehlich has participated in the design and maintenance of networks for State Farm Insurance, United Airlines, Chicago-area schools, and the University of Chicago Medical Center. He is the founder and president of Loveland, Colo.-based West Gate Networks, which specializes in enterprise network architectures and data center build-outs. The author of two Cisco certification study guides published by Sybex, he is a regular contributor to multiple enterprise IT-related websites and trade journals, offering insights into rapidly changing developments in the IT industry.
