Effective Service Desk and Incident Management Metrics

Under the IT Infrastructure Library (ITIL) v3 Best Practice Framework, the service desk function and the incident management process are closely intertwined within the service operations environment, and the efficient and effective operation of both is vital to the delivery of highly available IT services. Consequently, their performance needs to be monitored and tightly managed to ensure that IT can deliver against critical success factors such as ‘Quickly Resolve Incidents’, ‘Maintain IT Service Quality’, and ‘Improve Business and IT Productivity’. Effective metrics are key to this and should be considered an IT organisation’s navigational compass on the proverbial journey to IT Service Management (ITSM).

Whilst nearly every organisation has, or has access to, a service desk for the reporting of incidents and logging of service requests, how many service desks are viewed as responsive and customer-focused (by the corporate users of IT services)? From an IT provider perspective, how do their service desk and incident management process perform against business requirements? Are incidents consistently resolved within SLA targets and with the required level of priority? Only a well-thought-out and flexible set of performance metrics can ensure that the service desk and incident management process are delivering value to the business.

As with many other corporate functions, IT management often espouses rhetoric such as “if you don’t measure it, you can’t manage it”, “if you don’t measure it, you can’t improve it”, or “a process is not truly implemented until measured”. Whilst the sentiment is right, metrics are often implemented for their own sake rather than to serve a practical purpose such as supporting process assurance, process improvement, and informed decision making. Unfortunately, it is all too easy for IT functions to fall under the misconception that tracking metrics and beating targets is enough, and for inappropriate metrics to adversely affect process and individual performance, causing misalignment with IT, and ultimately business, objectives.

So why does it go so wrong? The list of potential metric pitfalls is long, but common factors include taking the easy option – basing metrics on easily accessible data (“what can we measure?” rather than “what should we measure?”) or simply using the performance metrics that the corporate ITSM tool(s) can readily deliver. The reverse is also common: overcomplicating matters such that deriving metric information costs more than the benefits realised from its use, potentially compounded by not having the right tool(s) to collect, report, and analyse that information. It is also easy to focus on quantity over quality, with too many metrics in play – the average service desk tracks more than twenty metrics – possibly a symptom of the ‘what can we measure’ approach.

Metric suitability and effectiveness are further eroded by being parochial (looking at particular subsets of activities rather than the whole) and by being inwardly focused on IT, rather than business, needs. Such a focus can neglect the fact that an inappropriate mix of metrics can adversely influence employee behaviour. A good, and oft-quoted, example of this is the tension between two common service desk metrics – Average Call Handling Time and First Contact Resolution. Scoring highly against one metric can adversely impact the other, and the use of just one of them (without a balancing measure) for performance measurement can be at best worthless, and at worst dangerous.

With just an Average Call Handling Time focus, service desk agents are encouraged to adopt a ‘quantity rather than quality’ approach – taking as many calls as possible with little emphasis on incident resolution, and passing the majority of calls on to level 2 support. With just a First Contact Resolution focus, service desk operatives can be reluctant to pass a call on to level 2 support and can spend an inappropriate amount of time trying to resolve an incident that is probably beyond their level of knowledge and expertise. In both instances, the metric has inappropriately driven employee behaviour at the expense of the user, Service Level Agreement (SLA) targets, and the continuity of business operations. So, when selecting a metric, an IT function should always consider the negative behaviours it might encourage.

Lack of understanding is another cause of metric misery. This can be a lack of understanding in one or more of the following areas: the need for metrics (at both an organisational and employee level), business needs and expectations, or what measured performance actually means in the context of business impact – potentially resulting in metrics that are not used, are inappropriate to the intended recipients, or are just not understood by the recipients. An IT function should not report metrics that do not contribute to management thinking and decision making.

With the above in mind, whilst there is no silver bullet in terms of a basket of service desk and incident management metrics that fits all IT functions, there are good practices that can be used to focus metric selection and use for business benefit. The first is that metrics should be aligned with business requirements, dovetailing into SLA targets, with the ability to demonstrate both the value that IT adds to the business and the business impact of improvements in IT delivery. Metrics should also be reported in context – a good example being that 99.9% availability looks great until the reader sees that the 0.1% unavailability affected a business-critical process during a period of critical business activity.
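To make the availability example concrete, the following minimal sketch converts a headline availability figure into the downtime it permits and checks how much of an outage fell inside a business-critical window. The 30-day reporting period, the outage times, and the month-end window are illustrative assumptions rather than figures from any real service, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def downtime_allowed(availability_pct: float, period: timedelta) -> timedelta:
    """Downtime implied by an availability percentage over a reporting period."""
    return period * (1 - availability_pct / 100)


def overlap(start_a: datetime, end_a: datetime,
            start_b: datetime, end_b: datetime) -> timedelta:
    """Length of the overlap between two time intervals (zero if they do not meet)."""
    latest_start = max(start_a, start_b)
    earliest_end = min(end_a, end_b)
    return max(earliest_end - latest_start, timedelta(0))


if __name__ == "__main__":
    # Assumed reporting period: a 30-day month.
    budget = downtime_allowed(99.9, timedelta(days=30))
    print("Downtime permitted at 99.9%:", budget)

    # Hypothetical outage that consumes the whole budget during month-end processing.
    outage_start = datetime(2024, 1, 31, 10, 0)
    outage_end = outage_start + budget
    window_start = datetime(2024, 1, 31, 9, 0)   # assumed critical window
    window_end = datetime(2024, 1, 31, 17, 0)

    hit = overlap(outage_start, outage_end, window_start, window_end)
    print("Outage time inside the critical window:", hit)
```

Reporting the second figure alongside the first is one way of giving the headline percentage its business context.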

Chosen metrics should not be viewed in isolation. IT functions should understand the correlation between metrics such as First Contact Resolution and Customer Satisfaction, or First Contact Resolution and Service Desk Operative Utilisation. Metrics should also be viewed across time periods, with trends at least as important as static values: a target that is persistently exceeded today may, when trended, reveal a slow decline in performance that projects to failure within the next six months. The core performance metrics can also be supplemented by more internal, trend-based ‘Top 10s’ that facilitate problem management activity, such as ‘Top 10 used incident classifications’ or ‘Top 10 applications by incident volume’.
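The point about trends can be illustrated with a short sketch that fits a straight line to a series of monthly results and estimates how long it will be before the fitted line falls below the target. A simple least-squares line is an assumption made here for illustration; real performance rarely declines linearly, and the monthly figures and the 90% target are invented.

```python
def linear_trend(values):
    """Least-squares slope and intercept for evenly spaced observations."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until_breach(values, target):
    """Periods after the last observation until the fitted line drops to the target.

    Returns None when the trend is flat or improving, and 0.0 when the
    fitted line is already at or below the target.
    """
    slope, intercept = linear_trend(values)
    if slope >= 0:
        return None
    last_x = len(values) - 1
    breach_x = (target - intercept) / slope
    return max(breach_x - last_x, 0.0)


if __name__ == "__main__":
    # Hypothetical monthly '% of incidents resolved within SLA' figures.
    # Every month beats the assumed 90% target, yet the trend is downward.
    monthly = [96.0, 95.5, 95.0, 94.5, 94.0, 93.5]
    print(periods_until_breach(monthly, target=90.0))  # 7.0 for these figures
```

A static report would show six consecutive months of targets met; the trended view shows the same data heading for a breach.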

Metrics must provide a launch pad for improvement, with the ability to identify both IT and business opportunities such as improving service quality, cost reduction, increased customer satisfaction, people capability enhancement, or technical innovation. The adoption of industry standard metrics, e.g. cost per call, allows for the benchmarking of internal performance against industry standards or the service desks of other organisations. Finally, IT organisations should not underestimate the value of softer measures, and should appreciate that metrics are never a substitute for ongoing service-based conversations with customers.

Traditional service desk metrics include number of calls received (via all channels), number of calls handled per service desk operative, number of service requests and incidents, number of calls handled within and outside SLA targets, number of tickets resolved during first contact, number of tickets escalated (by cause), average caller waiting times, caller abandonment rates, and customer satisfaction. Traditional incident management metrics include number of incidents, number of incidents resolved within SLA targets (for each level of priority), number of incidents escalated (to each level of support), average time to resolve incidents by priority, number of incidents incorrectly recorded, and number of incidents assigned to the wrong resolution group.

Unfortunately, a raw count such as the number of incidents received is not a good indicator of service desk and incident management performance. For example, a service desk might be tasked with lowering the volume of incidents received but, in doing a better job at resolving incidents, it might actually increase volumes as more users choose to contact it. Conversely, incident volumes might drop as the service it provides gets poorer, as users either struggle on or seek resolution through alternative channels.

In Butler Group’s opinion, service desk and incident management metrics should focus on how the service desk and level 2 and 3 support add value to the business – through the minimisation of the impact of user (and business) productivity-affecting incidents, at an acceptable and ideally optimal cost. Metrics should also reflect the entire process, not just a subset of activity and, when it comes to the number of service desk and incident management metrics, less is definitely more.

Butler Group therefore recommends a small basket of weighted metrics that have been agreed with key business stakeholders. As stated earlier, there is no magical out-of-the-box set of metrics that applies to all IT functions. There is, however, a common core that can deliver greater insight, and consequently greater value, to both IT and the parent business. It should be no surprise that these relate to the traditional business goal of achieving the best possible quality at the lowest possible cost.
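One way a small basket of weighted metrics might be combined into a single score is sketched below. The scoring scheme (attainment against target, capped at 100%), the weights, and the targets are all assumptions made for illustration; in practice they would be whatever has been agreed with business stakeholders. The sketch also assumes that all actual and target values are positive.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    name: str
    actual: float
    target: float
    weight: float
    higher_is_better: bool = True

    def score(self) -> float:
        """Attainment against target, capped at 1.0.

        The cap stops over-performance on one metric from masking a
        shortfall on another. Assumes positive actual and target values.
        """
        if self.higher_is_better:
            ratio = self.actual / self.target
        else:
            ratio = self.target / self.actual
        return min(ratio, 1.0)


def weighted_score(metrics) -> float:
    """Weighted average of metric scores, normalised by the total weight."""
    total_weight = sum(m.weight for m in metrics)
    return sum(m.score() * m.weight for m in metrics) / total_weight


if __name__ == "__main__":
    # Hypothetical basket: names follow the article, numbers are invented.
    basket = [
        Metric("Customer Satisfaction (%)", actual=82, target=85, weight=0.4),
        Metric("First Contact Resolution (%)", actual=70, target=65, weight=0.3),
        Metric("Cost Per Call", actual=12.0, target=10.0, weight=0.3,
               higher_is_better=False),
    ]
    for m in basket:
        print(f"{m.name}: {m.score():.2f}")
    print(f"Weighted score: {weighted_score(basket):.2f}")
```

A single weighted score is convenient for reporting, but the individual scores should still be shown so that a weak metric is not hidden by the average.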

The first metric is Customer Satisfaction, which is still probably the best indicator of the quality of IT support – both on a transactional and a periodic review basis. The next metrics are key drivers of Customer Satisfaction – First Contact Resolution and Average Speed of Answer – and these are also good indicators of IT’s ability to maximise employee and business productivity. In terms of Service Desk efficiency, there are two key metrics – Service Desk Operative Utilisation and Cost Per Call – with the former strongly influencing the latter. The final metric – Service Desk Operative Satisfaction – is easily forgotten or neglected by IT functions. Without it, however, the persistent drive to deliver higher levels of Customer Satisfaction and Service Desk efficiency may take a heavy toll on Service Desk staff, resulting in higher levels of sickness, absenteeism, and ultimately turnover. This has a knock-on effect, as new, less experienced staff will probably adversely affect performance against other key metrics such as Customer Satisfaction, First Contact Resolution, and Cost Per Call.
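The four quantitative metrics in this core set can be derived from basic contact records, as the following sketch suggests. The formulas are common working definitions rather than ones given in this article, and the `Contact` record, its field names, and the sample figures are hypothetical; organisations differ, for example, in what they count as staffed time or total cost.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Contact:
    wait_seconds: float            # time the caller waited before being answered
    handle_seconds: float          # time the operative spent on the contact
    resolved_first_contact: bool   # True if no escalation or call-back was needed


def first_contact_resolution(contacts) -> float:
    """Share of contacts resolved at first contact."""
    return sum(c.resolved_first_contact for c in contacts) / len(contacts)


def average_speed_of_answer(contacts) -> float:
    """Mean waiting time in seconds."""
    return mean(c.wait_seconds for c in contacts)


def operative_utilisation(contacts, staffed_seconds: float) -> float:
    """Handling time as a share of the time operatives were staffed."""
    return sum(c.handle_seconds for c in contacts) / staffed_seconds


def cost_per_call(total_cost: float, contacts) -> float:
    """Total service desk cost divided by the number of contacts."""
    return total_cost / len(contacts)


if __name__ == "__main__":
    # Invented sample data for one operative over one staffed hour.
    sample = [
        Contact(10, 300, True),
        Contact(30, 600, False),
        Contact(20, 300, True),
        Contact(20, 600, True),
    ]
    print("First Contact Resolution:", first_contact_resolution(sample))
    print("Average Speed of Answer (s):", average_speed_of_answer(sample))
    print("Operative Utilisation:", operative_utilisation(sample, staffed_seconds=3600))
    print("Cost Per Call:", cost_per_call(100.0, sample))
```

Customer Satisfaction and Service Desk Operative Satisfaction are typically gathered by survey, so they are not derived from contact records here.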

To summarise, in implementing service desk and incident management metrics, an IT function should first identify the users of the metrics and their purpose, then identify and agree the desired metrics, and finally set up an appropriate measurement system that allows performance to be monitored easily. The volume and type of metrics used will vary by organisation, but a focus on quality, rather than quantity, of metrics is recommended. It is then only through the continual review of such metrics that an IT organisation can demonstrate business alignment and value, and continue to improve its operation – tweaking processes, filling human capability gaps, and improving inter-team communications and co-operation. This ongoing review should also encompass the metrics themselves, as there will be occasions where an IT function should change not only its targets but also the metrics it uses.

Republished from http://www.butlergroup.com