By Christoph Goldenstern, Kepner-Tregoe
Why do we still struggle with stability?
It is over forty years since IBM announced its Systems Network Architecture, a uniform set of rules and procedures for computer communications intended to free computer users from the technical complexities of communicating through local, national, and international computer networks. It is nearly 30 years since IBM contributed the Information Systems Management Architecture to the initial ITIL project. Why, then, do we still struggle to maintain stability in our IT systems?
A study carried out by Forrester in early 2016 shows that in 57% of organizations surveyed, business-critical applications had issues with performance or availability every week, and half of these struggling organizations had issues every day. Are we any better at IT Service Management (ITSM) than we were in the 70s? With over half of organizations struggling, we still have a long way to go.
Reliance on IT increases frustration
How often have you gone into a business and been told, “sorry, our computers are running very slowly today” or worse still, “our computers are down”? With the total reliance on technology to get virtually any job done, the level of frustration when things do not go well has multiplied. The perception, right or wrong, is that IT is a necessary evil that consistently lets us down.
It is the job of ITSM to improve the reputation of IT so that the technology capability that supports modern life is seen as an enabler rather than a disaster waiting to happen. We may never prevent all IT problems from happening, but we can get things back up and running faster, and we can prevent the same problems from disrupting services time and time again. The key to improving ITSM is rapid root cause analysis.
Not surprisingly, the same Forrester study also reveals that rapid root cause analysis is the capability that is most desired by the organizations surveyed.
The symptoms of IT instability are easy to recognize:
- Recurring incidents – this is related directly to being unable to find the root cause effectively
- Significant productivity losses due to downtime – this is often a result of a lack of structure, processes and standards for resolving issues
- Complaints about inconsistent problem handling – this happens when organizations rely heavily on individual technical experts
- Time-consuming trial and error – when you don’t know what questions to ask, or jump to a cause without analysis, it is easy to go down multiple wrong paths; this can make the problem worse and, ultimately, harder to solve
- Poor collaboration and communication – when there is no accepted, documented, common approach to problem solving, information is lost, uncollected or unavailable
Multi-vendor environments increase the value of RCA
Enterprises rely on multiple vendors for IT capabilities, compounding problem-solving challenges within an increasingly complex IT environment and creating problems never seen before. Experience is not enough when faced with complex new problems. Without a standardized approach to root cause analysis, problems will go unsolved, repeat incidents will continue to grow, and the frustration levels of your IT specialists will rise rapidly.
As customer demands and expectations continue to grow, the modern IT organization must make rapid root cause analysis the cornerstone of its capabilities. Without a way to quickly identify and remove problems, customer satisfaction will plummet, costs will rise and the workload of service desks and technical teams will grow at a rate that is unsustainable.
The greater the world’s reliance on IT, the more important systematic root cause analysis becomes to the success of IT service management.
Kepner-Tregoe’s problem solving approach is used worldwide to improve IT stability
 “Digital Business Requires Application Performance Management,” a commissioned study conducted by Forrester Consulting, January 2016