Death by Trouble Tickets? Or Death to Trouble Tickets?

Our industry is biased in how we measure and develop Service Level Agreements. We seem to be driven by the lowest common denominator, help desk tickets, to determine how well a service is being delivered. We assign thresholds to a reactive action that is only reported half the time, by customers who are already a bit grumpy because they’re having problems. Tickets should be the final catchall for reporting issues. Yet nearly 100% of the SLAs that cross my field of view include tickets as a measure of how well a service is performing.

Make no mistake about it: modern-day compliance with your Service Levels is determined by how often the customer complains. In reality, it’s not even how often the customer complains, but how often the customer provides verifiable or actionable information. Basically, if they fill out the ticket properly and completely, we may be able to verify that the service is truly broken. To add insult to injury, those same Service Levels treat the moment a ticket is closed or resolved as the verifiable end of the incident for SLA compliance. This is a leap of unverifiable faith that the customer’s problem has in fact been resolved.

As a field consultant, I’m often brought into a customer engagement to design and build out a CMDB or to architect a solution that provides dashboards for real-time monitoring of business-critical applications. Among the first pieces of information that I ask for in these engagements are the Service Level Agreements. Unfortunately, I usually find that these agreements are littered with the lowest common denominator, or are written so poorly that they don’t include Service Dependencies, articulate what the service is, or, least of all, say how to measure it. They talk only of tickets. Tickets are not a service. And I defy you to ask a customer how THEY know the Service is down. Shouldn’t you know first?

Irrespective of how your services are managed, we should always have visibility into Root Cause, or more likely a Root Symptom, from a tool or solution outside of our ticketing systems that can report the issue and act as the trigger for determining compliance with a Service Level. This methodology maintains trust and increases both your visibility and the fidelity of your verifiable data. This IS important! It also shows that IT, as the service provider, is actively watching its services. The goal is to be aware of issues before our customers are. Sure, outages and errors will still occur, but at least IT is informing the customer of problems rather than the other way around.
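
To make the idea concrete, here is a minimal sketch of measuring compliance from monitoring data instead of tickets. Everything in it is an assumption for illustration: the function names, the shape of the outage records (start and end times reported by some monitoring tool), and the 99.9% target are hypothetical and not taken from any particular product or agreement.

```python
from datetime import datetime, timedelta


def availability(outages, window_start, window_end):
    """Return the fraction of the window during which the service was up.

    `outages` is a list of (start, end) datetimes as reported by a
    monitoring tool (hypothetical input format), not by tickets.
    Overlapping outages are merged so downtime is not counted twice.
    """
    total = (window_end - window_start).total_seconds()
    # Keep only outages that touch the window, clipped to its edges.
    clipped = sorted(
        (max(start, window_start), min(end, window_end))
        for start, end in outages
        if end > window_start and start < window_end
    )
    down = timedelta()
    cur_start = cur_end = None
    for start, end in clipped:
        if cur_end is None or start > cur_end:
            # A new, separate outage begins: bank the previous one.
            if cur_end is not None:
                down += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # This outage overlaps the current one: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        down += cur_end - cur_start
    return 1.0 - down.total_seconds() / total


def meets_sla(outages, window_start, window_end, target=0.999):
    """Compare measured availability against an assumed SLA target."""
    return availability(outages, window_start, window_end) >= target
```

The point of the sketch is that the clock starts and stops on what the monitoring tool observed, so compliance no longer depends on whether a customer filed a ticket or on when someone closed it.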

I thought the industry had come a long way since the dawn of Trouble Ticket systems and the advent of Business Service Level Monitoring, but I question the sincerity of IT culture’s willingness to wage war on all those things that cause its end users pain. I still don’t see the kind of progress or productivity gains true Service Level Monitoring can achieve.

