How To Manage Incidents In IT Environments Using ITIL Best Practices
It was great to hear from several readers who liked and appreciated my previous blog post on risk management. I am now more committed to keeping this blog series going and will continue to share my experience, knowledge, and exposure with all readers. As usual, I am grateful to our content expert, who always helps make these blog posts possible. Without further ado, let’s get into incident management and what it entails.
What Is Incident Management?
Incident management is another important process of ITIL. It is part of the service operation phase of the IT service lifecycle. As the name indicates, incident management is about handling incidents in IT environments.
Incidents are common in today’s IT environments and are considered part and parcel of everyday operations. ITIL covers 26 processes and four functions in its Foundation certification, and some of these processes are used more often than others because of their wide applicability. Incident management is one of the more popular ones. It is also addressed, from a security perspective, in well-known information security certifications such as CISSP and CompTIA Security+. I teach these two certifications as well, so I will add some security-related perspective on incidents here.
Incidents (like risks) occur in many kinds of environments, but we will limit our discussion to incidents in typical IT environments, as we are following the best practices recommended by ITIL.
An incident can be defined as an unplanned interruption to an IT service or a degradation in the quality of an IT service. Examples include an outage of a printing service (users are unable to print) or reduced bandwidth on your internet service due to a damaged subsea cable. It doesn’t matter whether you are an external or an internal customer: incidents result in interruptions, and interruptions should always be taken seriously.
Incident management is closely related to another ITIL process, problem management. The purpose of having incident management measures in place is to ensure that the committed level of service quality is maintained and that services are available at the agreed levels. Effective incident management increases user satisfaction and confidence in the quality of the services provided.
How Does It Work?
The question is: how, and to whom, is an incident reported? Let’s say an IT service, such as a file-transfer service, is disrupted at your workplace. What should your next steps be? As per the ITIL framework, the service desk is responsible for handling these incidents, and it serves as the channel through which customers communicate their service-related matters, including incidents.
The service desk is a functional unit (one of the four functions of ITIL) consisting of a team that deals with everyday matters, concerns, issues, and complaints regarding incidents and service requests. The team records all incidents in a database, then categorizes and prioritizes them. The service desk incorporates an escalation system for proper support and timely resolution, and it can check the status of any incident at any time, whether it is still unresolved or already closed. In most cases, users contact the service desk and notify it of service-specific issues. However, incidents can also be detected by the service desk itself through network management tools and alerts.
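To make the record, categorize, prioritize, escalate, and track cycle described above more concrete, here is a minimal Python sketch. It is only an illustration: the class names, the status values, and the numeric priority scale (1 being the most urgent) are my own assumptions and are not prescribed by ITIL or by any particular service desk tool.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import count


class Status(Enum):
    # Hypothetical status values chosen for this sketch.
    OPEN = "open"
    ESCALATED = "escalated"
    CLOSED = "closed"


@dataclass
class Incident:
    incident_id: int
    service: str
    description: str
    category: str
    priority: int  # assumed scale: 1 = most urgent
    status: Status = Status.OPEN


class ServiceDesk:
    """In-memory stand-in for the service desk's incident database."""

    def __init__(self):
        self._incidents = {}
        self._ids = count(1)

    def record(self, service, description, category, priority):
        # Record the incident together with its category and priority.
        incident = Incident(next(self._ids), service, description, category, priority)
        self._incidents[incident.incident_id] = incident
        return incident

    def escalate(self, incident_id):
        self._incidents[incident_id].status = Status.ESCALATED

    def close(self, incident_id):
        self._incidents[incident_id].status = Status.CLOSED

    def status_of(self, incident_id):
        # Look up the current status of any incident at any time.
        return self._incidents[incident_id].status

    def unresolved(self):
        # Everything not yet closed, most urgent first.
        pending = [i for i in self._incidents.values() if i.status is not Status.CLOSED]
        return sorted(pending, key=lambda i: i.priority)


desk = ServiceDesk()
printer = desk.record("printing", "Unable to print", "hardware", priority=3)
ftp = desk.record("file-transfer", "Transfers time out", "network", priority=1)
desk.escalate(ftp.incident_id)
desk.close(printer.incident_id)
```

After these calls, the file-transfer incident is escalated and still unresolved, while the printing incident is closed. A real service desk tool would persist the records and keep a history of each status change.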
The timeline for resolving an incident varies; it depends on what the customer and the service provider have agreed in the SLA (Service Level Agreement). You may see the service desk treat certain customers’ incidents with absolute urgency because of strict SLA timelines and penalties; these are often referred to as major incidents.
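As a rough sketch of how SLA timelines translate into deadlines, the Python snippet below computes a resolution deadline from the time an incident was reported and checks whether it has been breached. The priority names and the target durations are hypothetical placeholders; real values come from the SLA agreed between the customer and the service provider.

```python
from datetime import datetime, timedelta

# Hypothetical resolution targets; actual figures are defined in the SLA.
SLA_TARGETS = {
    "major": timedelta(hours=1),
    "high": timedelta(hours=4),
    "normal": timedelta(hours=24),
}


def resolution_deadline(reported_at, priority):
    """Return the time by which the incident should be resolved."""
    return reported_at + SLA_TARGETS[priority]


def is_breached(reported_at, priority, now):
    """True if the current time is past the SLA deadline."""
    return now > resolution_deadline(reported_at, priority)


reported = datetime(2024, 1, 15, 9, 0)
deadline = resolution_deadline(reported, "major")
breached = is_breached(reported, "major", datetime(2024, 1, 15, 10, 30))
```

Here an incident reported at 09:00 with the assumed one-hour "major" target has a 10:00 deadline, so checking at 10:30 reports a breach. The same incident under the assumed 24-hour "normal" target would still be within its SLA.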
An incident manager, who takes ownership of and accountability for an incident from notification to resolution, closely monitors the progress of these major incidents. The main objective of incident management is to restore normal service operation as quickly as possible and to minimize the impact of the disruption, so that customers receive the agreed-upon quality of service. Finding the root cause and preventing future disruptions to that service is the job of problem management.
When Does An Incident Turn Into A Problem?
When a similar incident occurs more than once, it is no longer treated as just an incident but as a problem, and it is handled by another process called problem management. Like incident management, problem management belongs to the service operation phase of the service lifecycle; its purpose is to investigate the root cause of recurring incidents and eliminate it. In some cases, it can also help reduce the impact of incidents that are unavoidable.
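One simple way to spot recurring incidents is to count how often the same service and symptom appear in the incident log. The Python sketch below does this; the tuple layout of the log, the function name, and the threshold of two occurrences are assumptions made for illustration, not an ITIL rule.

```python
from collections import Counter


def problem_candidates(incidents, threshold=2):
    """Return (service, symptom) pairs seen at least `threshold` times."""
    counts = Counter(incidents)
    return sorted(key for key, n in counts.items() if n >= threshold)


# Hypothetical incident log: one (service, symptom) tuple per incident.
log = [
    ("printing", "unable to print"),
    ("file-transfer", "timeout"),
    ("printing", "unable to print"),
    ("internet", "reduced bandwidth"),
]
candidates = problem_candidates(log)
```

With this log, only the printing incident repeats, so it is the single candidate to hand over to problem management. In practice, matching "similar" incidents is harder than exact string comparison and usually relies on categories or linked records.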
In network security, incidents are events or actions that may cause harm, loss, or damage to your IT assets, such as a security breach or unauthorized access to resources or information. In those situations, an incident response team takes control of the incident. However, as stated earlier, that is a security-related scenario, whereas ITIL looks at incidents from the more general perspective of service disruption. In short, incidents should be handled as per the best practices of the ITIL framework so that customers not only obtain the agreed-upon service outcomes but also find value in the service(s) they have acquired.
In my next blog post, I will discuss CSI (Continual Service Improvement), which is a phase of the service lifecycle defined by the ITIL framework. I hope you will continue to read my blogs and won’t hesitate to share your feedback with me. Please do visit our community portal itexperts.quickstart.com for more resources, blogs, and discussions about all things IT.