Incident Management Process, As Defined By ITIL
Incident management is defined in terms of the incident itself. ITIL V3 defines an incident as: “An unplanned interruption to an IT Service or a reduction in the Quality of an IT Service. Failure of a Configuration Item that has not yet impacted Service is also an Incident. For example, failure of one disk from a mirror set.” ITIL V2 defined an incident as: “An event which is not part of the standard operation of a service and which causes or may cause disruption to or a reduction in the quality of services and Customer productivity.” The objective of incident management is to restore normal operations as quickly as possible, with the least possible impact on either the business or the user, in a cost-effective way.
The Incident Manager is a functional role rather than a position of employment, although both may be true depending on the hiring organisation. Incident management provides the external customer with a focal point for leadership and drive during an event by ensuring follow-up on commitments and an adequate flow of information. This means presenting to the customer an entity that accepts ownership of their problem.
The objective of Incident Management during an incident is to restore service as quickly as possible. The objective is not to make a system perfect. If service can be restored more quickly by a temporary workaround than by correcting the underlying root cause of the issue, then that is acceptable. After service restoration, the underlying root causes are corrected by the Problem Management team through a process called Root Cause Analysis (RCA). A well-known example of service restoration by temporary workaround is the one carried out on Apollo 13.
The primary focus of Incident Management is to ensure a prompt recovery of the system by supervising and directing the internal or external resources. Prompt system recovery and minimization of any impact on the customer have priority over unreasonably long and intensive data collection for the root cause investigation of the event.
Incidents can be classified into three primary categories: software (applications), hardware, and service requests. (Note that service requests are not always regarded as incidents, but rather as requests for change. However, the handling of failures and the handling of service requests are similar, and both are therefore included in the definition and scope of the incident management process.)
ITIL separates incident management into six basic components:
- Incident detection and recording
- Classification and initial support
- Investigation and diagnosis
- Resolution and recovery
- Incident closure
- Ownership, monitoring, tracking, and communication (monitoring the progress of the resolution of the incident and keeping those who are affected by the incident up to date with the status)
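The components above can be read as a simple lifecycle. The following is a minimal Python sketch of that reading, not an ITIL-defined data model: the `Stage` and `Incident` names, the strictly linear ordering of stages, and the default owner are all assumptions made for illustration. The first five components become stages, and the sixth (ownership, monitoring, tracking, and communication) is represented by the `owner` field and the `history` log.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    """Hypothetical stage names mirroring the first five components."""
    DETECTED_AND_RECORDED = 1
    CLASSIFIED = 2
    UNDER_INVESTIGATION = 3
    RESOLVED = 4
    CLOSED = 5


@dataclass
class Incident:
    summary: str
    stage: Stage = Stage.DETECTED_AND_RECORDED
    owner: str = "service desk"  # assumed default owner, for illustration only
    history: list = field(default_factory=list)

    def advance(self, note: str) -> None:
        """Move to the next stage and record a note for tracking and communication."""
        if self.stage is Stage.CLOSED:
            raise ValueError("incident is already closed")
        self.stage = Stage(self.stage.value + 1)
        self.history.append((self.stage.name, note))


# Walk the mirror-set example from the definition through every stage.
incident = Incident("One disk in a mirror set has failed")
incident.advance("Classified as hardware")
incident.advance("Diagnosis started")
incident.advance("Disk replaced, mirror rebuilt")
incident.advance("User confirmed, ticket closed")
```

In practice an incident may move backwards (for example, reopened after closure); the linear model here is deliberately simplified.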
From the ITIL point of view, the activities of Incident Management are:
- Take ownership of an incident and act as the primary level of escalation
- Provide a prompt recovery of the business within the specified service level agreement (SLA)
- Ensure that the focus on incident resolution is not diverted by other activities
- Escalate incidents: functional (support with higher technical skills is needed to solve the problem) and hierarchical (a manager with more authority must be consulted to take decisions that are beyond the competencies assigned to this level)
- Send incident notifications to the customer (documents that contain detailed information)
- Set up and lead conference calls or bridge communication between all involved parties
- Track and record the timelines
- Act as an interface towards other technicians, customer technical staff and other groups within the organisation.
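The two escalation types in the list can be told apart by what is missing: technical skill or decision authority. The sketch below illustrates that distinction only. The `Escalation` enum, the `escalations_needed` function, and its two boolean inputs are hypothetical names, not part of ITIL.

```python
from enum import Enum


class Escalation(Enum):
    NONE = "none"
    FUNCTIONAL = "functional"      # higher technical skills are needed
    HIERARCHICAL = "hierarchical"  # a manager with more authority is needed


def escalations_needed(skill_gap: bool, authority_gap: bool) -> list:
    """Return the escalation types an incident currently calls for.

    skill_gap: the current support level cannot solve the problem technically.
    authority_gap: a decision is required beyond this level's competencies.
    """
    needed = []
    if skill_gap:
        needed.append(Escalation.FUNCTIONAL)
    if authority_gap:
        needed.append(Escalation.HIERARCHICAL)
    return needed or [Escalation.NONE]
```

Both escalations can apply at once, which is why the function returns a list rather than a single value.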
An Incident Manager should be able to:
- understand any incident or fault at least on a basic level, in order to call on the appropriate competences (resources)
- drive the restoration team to gather sufficient information to start an analysis
- maintain a general overview of the incident (keeping the focus on restoration via a workaround)
- understand the functionality of multiple areas (RAN, Core Network, VAS, BSS/OSS)
- obtain guidance on priorities for the teams starting the immediate, urgent, unexpected recovery work