Saturday, October 4, 2008

Data Centre Failure

By Amy Nutt

A data centre can fail for many reasons. Failures are frustrating, so it is important to know the causes so that you can prevent them or know what to do when they happen. Many data centres sit on a site separate from the computers that retrieve data from them, so keeping them running is critical. When a data centre goes down, every computer that depends on it, whether elsewhere in the country or around the world, loses access to its information. This is true of customer service jobs in which customer information must be retrieved from a secure data centre located elsewhere. If that data centre is not operating, money is lost because employees cannot do their jobs, and customers become unhappy that their problems cannot be resolved.

Reasons why data centres fail

Some causes of data centre failure can be prevented and others cannot. Either way, it is important to do what is necessary to keep outages to a minimum.

Here are some reasons for failure:

- The "wear-in" phase - This is the period just after the data centre has become operational. It is typical for certain components to fail while the facility is working its way up to full operation. It is like a toddler learning how to walk. The data centre has to learn to walk too, so it is good to start it with minimal use and build up gradually until it has found its legs. This involves comprehensive testing as system usage increases, so that small faults are fixed before they become serious problems.

- The "wear-out" phase - This is when the data centre is reaching the end of its life. Regular maintenance and care will slow this process, but major parts will eventually wear out. It is best to monitor the system consistently in order to predict failure and avoid catastrophe.

- Power failure - Losing power is devastating to anything that relies on it, and especially to a data centre. That is why it is important to have a generator or two ready to take over if the power goes out.

- Generator failure - Generators need maintenance too, and they need to be tested regularly. When the power goes out, generators take over to keep the data centre running, but generators themselves have been known to fail and bring the data centre down with them.

- Metal whiskers - If the data centre hardware is sitting on a metallic surface, that surface could grow zinc whiskers. These have been known to cause short circuits, especially in data centres. Tin whiskers grow out from tin and cause shorts in the same way. Silver whiskers that grow on silver electrical contacts and gold whiskers that develop on gold-plated surfaces are also known to cause short circuits. Large fans can be used to draw the whiskers in, and the components that produce them can be replaced.
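The "wear-in" and "wear-out" phases above match what reliability engineers call the two ends of the bathtub curve: a failure rate that falls early in life and rises late in life. A common way to model either end is a Weibull hazard function, where a shape parameter below 1 gives a falling failure rate and a shape parameter above 1 gives a rising one. The sketch below only illustrates that idea. The function name and every parameter value are assumptions made up for this example, not measurements from any real data centre.

```python
def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate of a Weibull lifetime model at age t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


# Hypothetical ages (in arbitrary time units) at which to compare failure rates.
ages = (1.0, 2.0, 4.0)

# shape < 1: failure rate falls with age, as in the "wear-in" phase.
wear_in = [weibull_hazard(t, shape=0.5, scale=10.0) for t in ages]

# shape > 1: failure rate rises with age, as in the "wear-out" phase.
wear_out = [weibull_hazard(t, shape=3.0, scale=10.0) for t in ages]

for t, early, late in zip(ages, wear_in, wear_out):
    print(f"age {t}: wear-in rate {early:.4f}, wear-out rate {late:.4f}")
```

In practice the shape and scale would have to be estimated from real failure records, which is one reason the consistent monitoring mentioned above matters.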

Prevention is key

Prevention is the key to keeping your data centre up and running. If any of these failures does occur, stay calm and do what needs to be done to keep the problem from happening again. If you have experienced generator failure, you may wish to invest in a backup to your main generator. If you are building a new data centre, be sure to use flooring that does not produce any type of metal whiskers that can short out your hardware. By being vigilant, you can keep your data centre uptime at or near 100%.
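To put "at or near 100%" into concrete terms, it helps to turn an uptime percentage into the hours of outage it allows per year, and to see why a backup generator helps. The sketch below is simple arithmetic only. The function names and the example probabilities are assumptions for illustration, and the second function assumes the utility feed and the generator fail independently, which is not guaranteed in a real facility.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def allowed_downtime_hours(uptime_percent):
    """Hours of outage per year that a given uptime percentage permits."""
    return HOURS_PER_YEAR * (1 - uptime_percent / 100)


def combined_outage_probability(p_utility_out, p_generator_fails):
    """Chance that the utility feed is out AND the generator fails to cover it.

    Assumes the two events are independent (an illustrative simplification).
    """
    return p_utility_out * p_generator_fails


for target in (99.0, 99.9, 99.99):
    print(f"{target}% uptime allows {allowed_downtime_hours(target):.2f} hours down per year")

# Hypothetical figures: utility out 1% of the time, generator fails 5% of the time.
print(combined_outage_probability(0.01, 0.05))
```

Each extra "nine" of uptime cuts the allowed downtime by a factor of ten, which is why a second generator or other backup is usually what it takes to get close to 100%.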


Providing a network of state-of-the-art data centres. We are CICA 5970 and SAS 70 Certified, which are the highest available standards for measuring and improving data centre operations and management.
