Big news! The community will be moving to a new platform April 21. Read more.

The trouble with Incident vs. Problem Management

Micro Focus Expert
An interesting debate has been raging in a LinkedIn ITIL group (note: you will need to register to see it). A poster asked a simple question: "Does a Major Incident automatically lead to the creation of a problem record?" You would think that ITIL would have a simple answer to such a question, but the eight pages of replies (as of Jan. 29th) make it clear that the ITIL guidance and ISO 20000 need more clarity. That clarity could come from stating explicitly that it is up to each organization to decide, or from prescribing the "right" answer, as is increasingly the trend in the evolution of ITIL.

For those of you who are not ITIL experts, in general terms, Incident Management is a discipline tasked with restoring normal operations as quickly as possible when a service is degraded or down. Problem Management is tasked with preventing incidents from recurring by finding and addressing their underlying causes. In other words, Incident Management wants to "fix it now" by just about any means necessary, including workarounds. Problem Management is a more methodical discipline, looking for patterns that indicate a systemic problem and setting in motion actions that will prevent it from happening again. Given the speed vs. analysis dichotomy, there should be some natural tension between these two disciplines.

The confusion creeps in for several reasons:

1. There is a fundamental misunderstanding of the difference between an incident and a problem. Problems are not "really big incidents," as I once heard a speaker at an itSMF Local Interest Group say. Theoretically, every incident has an underlying cause, and that cause is a problem. Multiple incidents could point to one underlying problem, such as the proverbial network cable that gets pulled in the data center, shutting down several services to end users. Or a problem could simply be a user who needs some training.

2. The people doing Incident Management should not be the same people doing Problem Management. If they are, proper Problem Management will simply not happen, due to the constant firefighting. Problem Management takes time, special technical skills and tools, and is a major investment that few really appreciate. But the payoff in reduced outages and reduced need for Incident Management resources will eventually result in a return on that investment.

3. There is no clear guidance on who handles a Major Incident. If you really want to read the academic debate about what ITIL says or doesn't say on this topic, it's there in the post I linked to earlier. The result is a grey area that leaves it up to each organization to decide how to recover from Major Incidents.
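The distinction in point 1 can be sketched as a small data model. This is only an illustration, not ITIL's or any tool's schema: the class names, the field names, and the `major_always_opens_problem` policy flag are all assumptions made up for the example. The flag stands in for the decision each organization has to make about Major Incidents.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Problem:
    """An underlying cause that may explain one or more incidents."""
    problem_id: str
    description: str
    incident_ids: list[str] = field(default_factory=list)


@dataclass
class Incident:
    """A single service degradation or outage to be restored quickly."""
    incident_id: str
    summary: str
    major: bool = False
    problem_id: Optional[str] = None  # set once an underlying cause is identified


def link(incident: Incident, problem: Problem) -> None:
    """Record that this incident is a symptom of the given problem."""
    incident.problem_id = problem.problem_id
    if incident.incident_id not in problem.incident_ids:
        problem.incident_ids.append(incident.incident_id)


def needs_problem_record(incident: Incident, major_always_opens_problem: bool) -> bool:
    """Hypothetical local policy: should this unlinked incident open a new problem record?"""
    return (
        incident.problem_id is None
        and incident.major
        and major_always_opens_problem
    )


# The pulled network cable: one problem, several incidents.
cable = Problem("PRB-1", "Network cable pulled in the data center")
email_down = Incident("INC-1", "Email unavailable", major=True)
crm_down = Incident("INC-2", "CRM unavailable")
link(email_down, cable)
link(crm_down, cable)
```

Whatever value an organization picks for the policy flag, the point is that it is written down and applied the same way every time.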

So, where does that leave those of us who need to craft functioning Incident and Problem Management processes? My advice is, above all, to make sure that everyone in your organization knows what is expected of them in a time of crisis. Consistency that ensures effectiveness is more important than academic adherence to ITIL.

IT Process Automation (ITPA) has a role to play in driving consistency by ensuring that incidents and events are properly correlated, documented and escalated to the right teams at the right time. At NetIQ we sometimes refer to this as "event enrichment", something our customers have been showing more and more interest in. We will be producing additional content and releasing new workflow templates for event enrichment processes in our ITPA platform, NetIQ Aegis, over the coming months. Stay subscribed to this blog to keep up to date on the topic.
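To make "correlated, documented and escalated" a little more concrete, here is a minimal sketch of the idea in Python. It is not how NetIQ Aegis works and does not use its API. The routing table, the event fields, the `enrich` function and the `escalate_at` threshold are all hypothetical, chosen only to show the shape of such a workflow.

```python
from collections import defaultdict

# Hypothetical routing table: which team owns which component.
ROUTING = {"network": "network-ops", "database": "dba-team"}


def enrich(events, escalate_at=3):
    """Group raw events by component, then attach an owning team and an escalation flag."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["component"]].append(event["message"])

    tickets = []
    for component, messages in grouped.items():
        tickets.append({
            "component": component,
            # Components without a known owner fall back to the service desk.
            "team": ROUTING.get(component, "service-desk"),
            "event_count": len(messages),
            "messages": messages,  # the documented evidence for the ticket
            "escalate": len(messages) >= escalate_at,
        })
    return tickets


events = [
    {"component": "network", "message": "link down on switch A"},
    {"component": "network", "message": "email unreachable"},
    {"component": "network", "message": "CRM unreachable"},
    {"component": "printer", "message": "out of toner"},
]
tickets = enrich(events)
```

In this example the three network events collapse into one ticket routed to the network team and flagged for escalation, while the lone printer event goes to the service desk without escalation.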