Incident Management - Incident Management Process, As Defined By ITIL

ITIL V3 defines an incident as an unplanned interruption to an IT service or a reduction in the quality of an IT service. The failure of a configuration item that has not yet impacted service is also an incident, for example the failure of one disk in a mirror set. ITIL V2 defines an incident as an event which is not part of the standard operation of a service and which causes, or may cause, disruption to or a reduction in the quality of services and customer productivity. The objective of incident management is to restore normal operations as quickly as possible with the least possible impact on either the business or the user, at a cost-effective price.

The Incident Manager is a functional role rather than a position of employment, although it may be both, depending on the hiring organisation. Incident management gives the external customer a focal point for leadership and drive during an event by ensuring that commitments are followed up and that information flows adequately. In other words, the customer is presented with an entity that accepts ownership of their problem.

The objective of Incident Management during an incident is to restore service as quickly as possible. The objective is not to make a system perfect. If service can be restored by a temporary workaround more quickly than by correcting the underlying root cause of the issue, then the workaround is acceptable. After service restoration, the Problem Management team identifies and corrects the underlying root cause through a process called Root Cause Analysis (RCA). A well-known example of service restoration by temporary workaround is the one improvised during the Apollo 13 mission.

The primary focus of Incident Management is to ensure prompt recovery of the system by supervising and directing internal or external resources. Prompt system recovery and minimization of any impact on the customer take priority over unreasonably long and intensive data collection for the root cause investigation.

Incidents can be classified into three primary categories: software (applications), hardware, and service requests. (Note that a service request is not always regarded as an incident, but rather as a request for change. However, failures and service requests are handled in similar ways, so both are included in the definition and scope of the incident management process.)

ITIL separates incident management into six basic components:

  • Incident detection and recording
  • Classification and initial support
  • Investigation and diagnosis
  • Resolution and recovery
  • Incident closure
  • Ownership, monitoring, tracking, and communication (monitoring the progress of the resolution of the incident and keeping those who are affected by the incident up to date with the status)
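The components listed above can be pictured as a simple lifecycle. The following is a minimal sketch, not something ITIL itself prescribes: the names Stage, Incident and advance are hypothetical, and the strictly forward-only ordering is a simplifying assumption. The first five components are modelled as stages, while the sixth (ownership, monitoring, tracking, and communication) runs throughout and is represented here by a log that receives an entry at every step.

```python
from enum import Enum


class Stage(Enum):
    # The first five components from the list, in order (names are hypothetical).
    DETECTED = "incident detection and recording"
    CLASSIFIED = "classification and initial support"
    INVESTIGATING = "investigation and diagnosis"
    RESOLVING = "resolution and recovery"
    CLOSED = "incident closure"


# Assumption: stages are visited strictly in the order declared above.
ORDER = list(Stage)


class Incident:
    def __init__(self, summary):
        self.summary = summary
        self.stage = Stage.DETECTED
        # The log stands in for the sixth component: tracking and communication.
        self.log = [f"{Stage.DETECTED.name}: recorded"]

    def advance(self, note):
        """Move to the next stage and record a status note."""
        if self.stage is Stage.CLOSED:
            raise ValueError("incident is already closed")
        self.stage = ORDER[ORDER.index(self.stage) + 1]
        self.log.append(f"{self.stage.name}: {note}")
        return self.stage


incident = Incident("one disk of a mirror set has failed")
incident.advance("classified as hardware")
incident.advance("faulty disk identified")
incident.advance("disk replaced, mirror rebuilding")
incident.advance("user confirmed normal service")
print(incident.stage.name)  # CLOSED
```

A real service desk tool would typically allow stages to be revisited (for example, reopening an incident); the linear ordering is kept here only to make the sequence of components easy to see.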

From the ITIL point of view, the activities of Incident Management are:

  • Take ownership of an incident and act as the primary level of escalation
  • Provide prompt recovery of the business within the specified service level agreement (SLA)
  • Ensure that focus on the incident resolution is not diverted by other activities
  • Escalate incidents: functionally (when the support of higher technical skills is needed to solve the problem) and hierarchically (when a manager with more authority must be consulted to take decisions that are beyond the competencies assigned to this level)
  • Send incident notifications to the customer (documents that contain detailed information)
  • Set up and lead conference calls or bridge communication between all involved parties
  • Keep track of, and record, the timeline
  • Act as an interface towards other technicians, customer technical staff and other groups within the organisation
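Two of the activities above, escalation and recovery within the SLA, lend themselves to a short illustration. This is a hedged sketch only: the function names escalation_paths and sla_remaining, their parameters, and the four-hour SLA window are all assumptions made for the example, not part of ITIL or of any particular tool.

```python
from datetime import datetime, timedelta


def escalation_paths(needs_higher_skill, needs_more_authority):
    """Return the escalation types that apply; both may apply at once."""
    paths = []
    if needs_higher_skill:
        # Functional: higher technical skills are needed to solve the problem.
        paths.append("functional")
    if needs_more_authority:
        # Hierarchical: a decision is beyond the competencies of this level.
        paths.append("hierarchical")
    return paths


def sla_remaining(opened_at, now, sla_window):
    """Time left before the SLA window elapses; negative once it is breached."""
    return opened_at + sla_window - now


opened = datetime(2024, 1, 1, 9, 0)   # hypothetical incident start
window = timedelta(hours=4)           # hypothetical SLA window

print(escalation_paths(True, False))                              # ['functional']
print(sla_remaining(opened, datetime(2024, 1, 1, 10, 30), window))  # 2:30:00
```

Returning a list rather than a single value reflects that the two escalation types are not mutually exclusive: an incident may need both a specialist and a management decision at the same time.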

An Incident Manager should be able to:

  • understand any incident/fault on at least a basic level in order to engage the appropriate competences (resources)
  • drive the restoration team to gather sufficient information to start an analysis
  • maintain a general overview of the incident (keeping the focus on restoration via a workaround)
  • understand the functionality of multiple areas (RAN, Core Network, VAS, BSS/OSS)
  • obtain guidance on priorities for the teams starting the immediate, urgent, unexpected recovery work
