To Avoid Network Downtime, Perform a Risk Assessment
By Gail Caros
Today, businesses have become inextricably dependent on business-critical applications that
comprise software, hardware and services. These components are interdependent,
creating delicate links in a chain that has been termed the "Application Delivery Chain".
In practice, many of your applications depend on each other and, most
assuredly, on the servers and networks that support them.
Keeping these mission-critical applications at peak performance is the primary directive
for IT professionals. The demand for new applications, coupled with the burgeoning use of edge
computing and the need to keep current systems operational, may lead teams to overlook the
importance of day-to-day preventative steps and of underlying infrastructure requirements.
Early in my career, while I was a support specialist serving both customers and their end users,
we found that 85% of all tickets were the result of what we refer to as Layer 1 issues: cable, power and connectivity.
With nostalgia, I could tell stories that would make you smile. You know, the ones where you
ask what position the "ON/OFF" button is in, or which lights, if any, are illuminated
on the device. And while we have come a long way since those days, end-user experience and
user interface are still king.
Below is a list of the top causes of communications outages:
1. Faults, errors or discards in network devices
2. Device configuration changes
3. Operational human errors and mismanagement of devices (22%)
4. Link failure caused by fiber cable cuts or network congestion
5. Power outages
6. Server hardware failure (55%)
7. Security attacks such as denial of service (DoS)
8. Failed software and firmware upgrades or patches (18%)
9. Incompatibility between firmware and hardware devices
10. Unprecedented natural disasters and ad hoc mishaps on the network, such as minor accidents
or even something as unrelated as a rodent chewing through a network line
One major vendor, while researching its customer support history, found
five major causes of communications outages:
1. Power outage
2. Lack of routine maintenance
3. Hardware failure (55%)
4. Software bug/corruption
5. Network issue/outage
What's more important is the percentage of cases in which the outage could have been prevented
had standard best practices been followed:
1. Power outage (81%)
2. Lack of routine maintenance (78%)
3. Hardware failure (52%)
4. Software bug/corruption (34%)
5. Network issue/outage (27%)
This downtime is costly, and its cost can be measured in both hard and soft dollars. For the purposes
of this discussion, let's define hard dollars as the expenses incurred directly to bring the
systems back online (hardware, labor, tech support, etc.) and soft dollars as the indirect
costs, such as loss of employee productivity, loss of business, dissatisfied customers, customer
perceptions and customer loss of confidence. With these definitions in mind, consider
the impact to your business in soft dollars in the event of an outage.
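To make the hard-dollar and soft-dollar split concrete, here is a minimal sketch of how one outage might be tallied. Every figure and field name in it is a hypothetical placeholder rather than data from the vendor study, and the soft-dollar side counts only lost productivity and lost business, since items such as customer confidence are hard to price.

```python
from dataclasses import dataclass


@dataclass
class OutageCost:
    """Hypothetical cost model for one outage; every field is an assumption."""
    hardware: float               # replacement parts (hard dollars)
    labor_hours: float            # staff hours spent restoring service
    labor_rate: float             # cost per recovery labor hour
    tech_support: float           # vendor or third-party support fees
    outage_hours: float           # how long the systems were down
    affected_employees: int       # staff unable to work during the outage
    employee_hourly_cost: float   # loaded cost per idle employee hour
    lost_revenue_per_hour: float  # business lost per hour of downtime

    def hard_dollars(self) -> float:
        # Direct expenses to bring the systems back online.
        return self.hardware + self.labor_hours * self.labor_rate + self.tech_support

    def soft_dollars(self) -> float:
        # Indirect costs: idle employees plus business lost while down.
        productivity = self.outage_hours * self.affected_employees * self.employee_hourly_cost
        lost_business = self.outage_hours * self.lost_revenue_per_hour
        return productivity + lost_business

    def total(self) -> float:
        return self.hard_dollars() + self.soft_dollars()


# Illustrative numbers only; substitute your own estimates.
example = OutageCost(
    hardware=2000.0,
    labor_hours=6,
    labor_rate=75.0,
    tech_support=500.0,
    outage_hours=4,
    affected_employees=50,
    employee_hourly_cost=40.0,
    lost_revenue_per_hour=1500.0,
)

print(f"Hard dollars: ${example.hard_dollars():,.2f}")
print(f"Soft dollars: ${example.soft_dollars():,.2f}")
print(f"Total:        ${example.total():,.2f}")
```

With these made-up inputs the soft dollars come to several times the hard dollars, which is the pattern worth testing against your own numbers.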