A survey has analysed the key factors that cause major SME IT incidents and service failures. The findings show that human error accounted for 47 percent of incidents, followed by server failures at 29 percent and power and communications provider failures at 15 percent. Fire, flood or ‘acts of God’ accounted for 9 percent of outages.
Human error can include anything from placing a server under an air conditioner that then leaks, to classic finger trouble, where operators irretrievably break a server and don't have a backup. Other factors identified included a second disk failing after its mirror had already failed and not been fixed, and issues such as deployment failures or bugs in custom code.
The survey results show that human error causes the most service failures, whilst incidents like fire and flood are understandably less common but do still occur. The survey also found that many incidents that initially appear to be pure hardware or software failures actually involve an element of human error.
Power and communications failures proved to be reasonably common but are often short-lived. Because most companies don’t have a recovery service that can get them working again very quickly, they tend to just tough them out.
A key problem for companies is predicting how long the service is likely to be out of action and then deciding when it’s worth trying to initiate a recovery process.
The key message from the survey results is, perhaps, that prolonged outages do happen and are more often caused by everyday problems than by the rarer fire, flood or acts of God.
Many small and medium-sized companies with limited IT support are less able to respond quickly and effectively to an IT outage. They are therefore advised to consider the risks of a prolonged IT outage carefully, and to look at implementing a fully managed disaster recovery (DR) service from a specialist provider that can guarantee to restore their systems within an acceptable period of time.