booklet
   

Stop the boomerang!

End the curse of recurring incidents

   
A boomerang, when thrown correctly, comes back to you. The same happens with a lot of incidents: they come back, sometimes over and over again. The negative impact of such boomerang incidents on user satisfaction and business process availability gets too little attention within IT Service Management. Wake up! There is a real business case for stopping recurring incidents. So: STOP the boomerang!

Recurring incidents defined. Generally, recurring incidents fit one of three conditions: 1) the incident has not been identified properly and happens again, 2) the incident is identified but not resolved properly, or 3) the resolving party has responded to the incident within its means, but the customer receives no additional support or corrective procedures. In all three cases the root cause is not addressed and the incident occurs again (and again). This is pure waste (time, money and business impact) and should therefore be eliminated.

Gear up problem management. ITIL defines a ‘problem’ as the cause of one or more incidents. The process to manage problems is called Problem Management; too often it is treated as ‘something to be done when time allows’. The primary aims of Problem Management are I) to prevent problems, and thus incidents, from occurring, II) to eliminate recurring incidents, and III) to minimize the impact of incidents that cannot be prevented. To achieve this, Problem Management must get to the root cause of incidents and then initiate actions to improve or correct the situation. Without a more mature level of Problem Management, it is difficult to stop the tsunami of recurring incidents.

Get rid of the wrong incentives. To meet an SLA target such as average resolution time, it is tempting to close an incident even when it is not completely solved. The user then has to call back, and a new ticket is opened to handle the recurring incident. In outsourcing, recurring incidents can even be a profitable ‘tactic’ for an external service provider: the customer pays for each incident. In plain talk, more incidents lead to more expenses. The real cost of an incident is not just the cost of handling a call, but the total impact of that incident on the business. How many production hours are lost to recurring incidents? As a rule of thumb, this cost is at least ten times the direct spend on the first attempt to handle the incident. How to solve this? When users can overrule the status ‘ticket closed’ if they are not satisfied, the number of ‘tickets closed’ will match the number of tickets ‘solved according to the user’. At life science multinational DSM, that simple rule reduced the total number of low priority incidents by 40 percent.
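The overrule rule can be illustrated with a minimal sketch. The `Ticket` class, its fields and the `closed_count` helper are hypothetical names chosen for this example; they do not refer to any particular service desk tool. The assumption is that a dissatisfied user reopens the same ticket instead of triggering a new one, so a ticket only stays ‘closed’ when the user has not overruled it.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    """Hypothetical ticket model, for illustration only."""
    ticket_id: str
    status: str = "open"          # either "open" or "closed"
    user_confirmed: bool = False  # True once the user accepts the fix

    def close(self) -> None:
        """The service desk marks the ticket as closed."""
        self.status = "closed"

    def user_feedback(self, satisfied: bool) -> None:
        """The user accepts the closure or overrules it."""
        if self.status != "closed":
            raise ValueError("feedback only applies to closed tickets")
        if satisfied:
            self.user_confirmed = True
        else:
            # Overrule: the same ticket reopens, no new ticket is created.
            self.status = "open"


def closed_count(tickets):
    """Number of tickets that currently have the status 'closed'."""
    return sum(t.status == "closed" for t in tickets)


tickets = [Ticket("T-1"), Ticket("T-2")]
for t in tickets:
    t.close()
tickets[0].user_feedback(satisfied=True)
tickets[1].user_feedback(satisfied=False)  # user overrules the closure
print(closed_count(tickets))
```

In this sketch, closing a ticket early no longer improves the ‘closed’ figure, because the user's overrule puts the ticket straight back into the open queue.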

Construct smarter metrics. The percentage of recurring incidents should be as low as possible: aim for zero. For high priority incidents, effective Problem Management should lead to: a) a minimal percentage of recurring business critical incidents with the same root cause, such as single points of failure, and b) the avoidance of high priority incidents altogether. Smarter use of service management data makes it possible to be more predictive and therefore pre-emptive. Prevention is always better than cure, so at least one key metric in Problem Management has to be prevention-based. With smarter metrics and the right behaviour, you prevent the boomerang from returning.
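One possible way to calculate the percentage of recurring incidents is sketched below. It assumes that every incident record carries a root cause label, and it counts each incident after the first one with the same label as recurring. The function name `recurrence_rate` and the sample labels are illustrative assumptions, not a standard ITIL formula.

```python
from collections import Counter


def recurrence_rate(root_causes):
    """Percentage of incidents that repeat an already-seen root cause.

    `root_causes` holds one root cause label per incident (assumed input).
    The first incident per root cause is not counted as recurring.
    """
    if not root_causes:
        return 0.0
    counts = Counter(root_causes)
    recurring = sum(n - 1 for n in counts.values())
    return 100.0 * recurring / len(root_causes)


# Hypothetical sample: five incidents, three of them sharing one root cause.
incidents = ["disk-full", "disk-full", "expired-cert", "disk-full", "dns"]
print(recurrence_rate(incidents))  # 40.0
```

Tracked per period and per priority level, a figure like this shows whether Problem Management is actually removing root causes or only closing tickets.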