IT incident management by analyzing incident relations (Conference)

Liu, R, Lee, J. (2012). IT incident management by analyzing incident relations. EURO-PAR 2011 PARALLEL PROCESSING, PT 1, 7636 LNCS 631-638. 10.1007/978-3-642-34321-6_49

cited authors

  • Liu, R; Lee, J

abstract

  • IT incident management aims to maintain high levels of service quality and availability by restoring normal service operations as quickly as possible and minimizing business impact. Enterprises often maintain many applications to support their business. It is a significant challenge to diagnose incidents at application level due to complicated causes often aggregated from the shared IT environment, network, hardware, software, and changes. In this paper, we present a new approach to diagnosing application incidents by effectively searching for relevant co-occurring and reoccurring incidents. These relevant incidents reveal patterns of application failures and provide insights into incident resolution and prevention. This paper also provides a case study where we implement this approach and evaluate its performance in terms of search accuracy. © Springer-Verlag Berlin Heidelberg 2012.

publication date

  • January 1, 2012

published in

  • EURO-PAR 2011 PARALLEL PROCESSING, PT 1

Digital Object Identifier (DOI)

  • 10.1007/978-3-642-34321-6_49

start page

  • 631

end page

  • 638

volume

  • 7636 LNCS