Steps and Tips on Implementing ITIL Problem Management

ITIL defines an "Incident" as any unplanned interruption to an IT service or reduction in the quality of an IT service, and a "Problem" as the cause of one or more of those incidents. The primary objectives of Problem Management are to prevent problems and resulting incidents from happening, to eliminate recurring incidents and to minimize the impact of incidents that cannot be prevented. Problem Management depends on a mature Incident Management process.

Although it is possible to start early with Problem Management, the process is highly integrated with Incident Management, so it is best to implement Problem Management after Incident Management is in place. You will need incident data (impact, frequency and trends) to identify relevant and worthwhile Problems to work on.


It is often possible to start Problem Management activities without a formally defined Problem Management process. Rather than getting bogged down in process design, implementation of supporting tools and documentation at the start of the project, consider going for quick wins. You could start with actions like the following:

* Identify the top 5 to 10 incidents

* If needed, provide guidance to incident management/service desk on how to record incidents

* Find some problems and solve them!
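
As a rough illustration of the first quick win, the sketch below ranks incident categories by how often they occur. The record layout, the field names and the sample data are assumptions made for this example and are not taken from any particular service desk tool.

```python
from collections import Counter

# Hypothetical incident records; the field names are assumptions for this sketch.
incidents = [
    {"id": "INC001", "category": "email"},
    {"id": "INC002", "category": "vpn"},
    {"id": "INC003", "category": "email"},
    {"id": "INC004", "category": "printer"},
    {"id": "INC005", "category": "email"},
    {"id": "INC006", "category": "vpn"},
]


def top_incident_categories(records, n=5):
    """Return the n most frequent incident categories as (category, count) pairs."""
    counts = Counter(record["category"] for record in records)
    return counts.most_common(n)


if __name__ == "__main__":
    for category, count in top_incident_categories(incidents, n=2):
        print(f"{category}: {count} incidents")
```

Frequency is only one way to rank incidents. The same approach could weight each incident by business impact if the service desk records it.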

A key activity in Problem Management is to look for the root cause of one or more incidents and recommend a permanent fix. Choosing the right people for the job is crucial. Analytical people with the right technology background are best suited to such roles. This need not be a permanent role. In fact, most organisations do not assign someone to be "THE Problem Manager". Problem Managers are best identified and assigned based on the problem(s) at hand. Sometimes a task force could be appointed instead of a single person. Besides technical skills, the assigned Problem Manager(s) would preferably have problem-solving skills and experience with techniques like Kepner-Tregoe, Pain-Value Analysis and Ishikawa diagrams for fault isolation and problem solving.

At some stage, the process will need to be designed, documented and formally rolled out throughout the organisation. The IT Infrastructure Library (ITIL) provides an excellent framework and guidance for defining the process activities and steps. Roles and responsibilities for Problem Management need to be formally defined, and a process owner needs to be assigned. The process owner is responsible for ensuring that the process is documented, that roles and responsibilities are clear and well communicated, that people are using the process and that there is a focus on continual improvement of the process. Reports and metrics have to be defined. Examples include:

* Number of Problems and Known Errors in a period by status, service or category.

* Percentage of Problems which have been solved per category and period.

* Average time taken to find root cause per category.

* Average resolution time of Problems and Known Errors per category.

* Effort invested in Problems pending resolution and expected effort required for closure per period (as measured by resolution time).

* Number of Problems that re-occur.

Unlike Incident Management metrics like "percentage solved within target time", Problem Management metrics are typically not included explicitly in Service Level Agreements (SLAs).
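
Two of the example metrics above, the percentage of Problems solved per category and the average time taken to find the root cause per category, could be computed along the following lines. This is a minimal sketch: the record fields (`category`, `status`, `opened`, `root_cause_found`) and the sample data are assumptions for illustration, and a real tool would have its own schema.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical problem records; field names and values are assumptions.
problems = [
    {"id": "PRB1", "category": "network", "status": "closed",
     "opened": datetime(2024, 1, 1), "root_cause_found": datetime(2024, 1, 5)},
    {"id": "PRB2", "category": "network", "status": "open",
     "opened": datetime(2024, 1, 10), "root_cause_found": None},
    {"id": "PRB3", "category": "storage", "status": "closed",
     "opened": datetime(2024, 1, 3), "root_cause_found": datetime(2024, 1, 5)},
]


def percent_solved_by_category(records):
    """Percentage of problems with status 'closed', per category."""
    total = defaultdict(int)
    solved = defaultdict(int)
    for record in records:
        total[record["category"]] += 1
        if record["status"] == "closed":
            solved[record["category"]] += 1
    return {cat: 100.0 * solved[cat] / total[cat] for cat in total}


def avg_days_to_root_cause(records):
    """Average days from opening to root cause, per category.

    Problems whose root cause is not yet known are skipped.
    """
    days = defaultdict(list)
    for record in records:
        if record["root_cause_found"] is not None:
            delta = record["root_cause_found"] - record["opened"]
            days[record["category"]].append(delta.total_seconds() / 86400)
    return {cat: sum(values) / len(values) for cat, values in days.items()}


if __name__ == "__main__":
    print(percent_solved_by_category(problems))
    print(avg_days_to_root_cause(problems))
```

Filtering the input records by date range before calling these functions would give the "per period" view mentioned in the list.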

Setting up a Known Error Database (KEDB) is another key activity. A Known Error is a Problem that has a documented root cause and workaround or solution. The KEDB maintains information about problems (i.e., isolation and resolution procedures) and the appropriate workarounds, scripts, references to patches, FAQs and resolutions. The KEDB or knowledge database must facilitate flexible retrieval of information, preferably by keyword search.
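
To make the idea of keyword retrieval concrete, here is a small sketch of a Known Error record and a naive keyword search over an in-memory list. The `KnownError` fields, the `search_kedb` function and the sample entries are all hypothetical. A production KEDB would normally live in the service management tool and use its search facilities.

```python
from dataclasses import dataclass, field


@dataclass
class KnownError:
    """Hypothetical Known Error record; the fields are assumptions for this sketch."""
    error_id: str
    title: str
    root_cause: str
    workaround: str
    keywords: list = field(default_factory=list)


def search_kedb(kedb, query):
    """Return entries whose combined text contains every term in the query.

    Matching is a case-insensitive substring match, which is deliberately simple.
    """
    terms = query.lower().split()
    results = []
    for entry in kedb:
        text = " ".join(
            [entry.title, entry.root_cause, entry.workaround, *entry.keywords]
        ).lower()
        if all(term in text for term in terms):
            results.append(entry)
    return results


# Sample data, invented for illustration only.
kedb = [
    KnownError("KE001", "VPN sessions drop after one hour",
               "Idle timeout set too low on the gateway",
               "Reconnect the VPN client", ["vpn", "timeout"]),
    KnownError("KE002", "Print jobs stuck in queue",
               "Spooler service hangs on large files",
               "Restart the spooler service", ["printer", "spooler"]),
]

if __name__ == "__main__":
    for match in search_kedb(kedb, "vpn timeout"):
        print(match.error_id, "-", match.workaround)
```

Searching across the title, root cause, workaround and keywords together lets first-line staff find an entry even when they only know a symptom.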

However, the KEDB may not add much value if the Incident Management process is too immature to use it efficiently. Many organizations have set up a KEDB system without real success because the Incident Management or Service Desk function was too immature to help capture information and use the system to aid first-line diagnostics. So, setting up a KEDB system in itself is not enough. A knowledge management mindset and culture are needed as well. Incentives and metrics would have to be introduced to motivate the right behaviour in Incident and Problem Management staff.

Implementing a tool to support the creation and tracking of Problem and Known Error records should be considered. Given the close dependency between Incident and Problem Management, integration of the incident and problem management workflows and data records in the tool is important. Most commercially available tools, like BMC's Remedy or HP's Service Manager, come with separately purchasable but integrated modules for Incident Management, Problem Management and Change Management, and a Configuration Management Database (CMDB) to store the system management records as well as Configuration Item (CI) information.

Lastly, like any other ITIL process, the Problem Management process should go through Plan-Do-Check-Act cycles and be improved and refined over time.


Jeffrey Lee is an IT Service Management (ITSM) consultant and ITIL trainer.

Visit his website at http://askme4itsm.blogspot.com for more articles on implementing ITSM and ITIL training
