Handling data imperfections in classification problems using a belief theory approach Conference

Weerakkody, S, Zhang, J, Roelant, D et al. (2008). Handling data imperfections in classification problems using a belief theory approach. 246-253.

cited authors

  • Weerakkody, S; Zhang, J; Roelant, D; Zhu, H; Yen, K

abstract

  • Imperfections in data collected from real-world applications are an important factor affecting the performance of a classifier. In this paper we identify three major categories of imperfect information that directly affect classifier performance: ambiguous class labels in the training data set, feature values corrupted with noise, and missing feature values in the unclassified feature vectors. Class label ambiguities are common when the training data set has been developed by domain experts, while feature noise and missing features arise when the information is gathered from sensing devices. We propose a novel classification algorithm capable of handling all three categories of imperfect data internally. To address ambiguities in class labeling, we adopt a classifier based on belief-theoretic notions, which has an intrinsic capability for handling ambiguous information. By introducing a new method for mass function calculation, based on the quality of the feature values in the unclassified feature vector, the proposed classifier gains the ability to handle missing and noisy features. Experiments on several databases selected from the UCI data repository demonstrate that the proposed classifier achieves higher classification accuracy than other popular methods in the presence of imperfect information.
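
For readers unfamiliar with belief-theoretic classifiers, the following is a minimal illustrative sketch of the general idea only. It is not the mass function calculation proposed in the paper, which this record does not reproduce. It assumes standard Dempster-Shafer machinery: mass functions over sets of class labels (so an ambiguous label such as {A, B} is a valid focal set), Shafer discounting by a per-feature reliability score, and Dempster's rule of combination. The names `discount`, `combine`, and `reliability`, and all numeric values, are hypothetical.

```python
from itertools import product

def discount(mass, reliability, frame):
    """Shafer discounting: scale each mass by `reliability` and move the
    remainder onto the whole frame (total ignorance). Assumed, not the
    paper's method; reliability 0.0 models a missing feature."""
    out = {focal: reliability * m for focal, m in mass.items() if focal != frame}
    out[frame] = 1.0 - reliability + reliability * mass.get(frame, 0.0)
    return out

def combine(m1, m2):
    """Dempster's rule of combination for two mass functions whose focal
    elements are frozensets of class labels."""
    joint = {}
    conflict = 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            joint[inter] = joint.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb  # mass assigned to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("evidence is totally conflicting")
    return {focal: m / (1.0 - conflict) for focal, m in joint.items()}

# Hypothetical two-class frame and evidence from two features.
frame = frozenset({"A", "B"})
m_feature1 = {frozenset({"A"}): 0.8, frame: 0.2}
m_feature2 = {frozenset({"B"}): 0.9, frame: 0.1}

# A noisy feature is trusted only partially; a missing one not at all.
noisy = combine(m_feature1, discount(m_feature2, 0.5, frame))
missing = combine(m_feature1, discount(m_feature2, 0.0, frame))
```

In this sketch a feature with reliability 0.0 contributes a vacuous mass function, so combining with it leaves the other evidence unchanged, while a partially reliable feature shifts the result less than it would at full trust.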

publication date

  • December 1, 2008

start page

  • 246

end page

  • 253