Medical datamining with a new algorithm for feature selection and Naïve Bayesian classifier

Conference

Abraham, R, Simha, JB, Iyengar, SS. (2007). Medical datamining with a new algorithm for feature selection and Naïve Bayesian classifier. 44-49. 10.1109/ICOIT.2007.4418266

cited authors

  • Abraham, R; Simha, JB; Iyengar, SS

abstract

  • Much research in datamining has gone into improving the predictive accuracy of statistical classifiers by applying discretization and feature selection. As a probability-based statistical classification method, the Naïve Bayesian classifier has gained wide popularity despite its assumption that attributes are conditionally independent of one another given the class label. In this paper we propose a new feature selection algorithm to improve the classification accuracy of Naïve Bayes on medical datasets. Our experimental results with 17 medical datasets suggest that, on average, the new CHI-WSS algorithm gave the best results. The proposed algorithm utilizes discretization and simplifies the 'wrapper' approach to feature selection by reducing the feature dimensionality through the elimination of irrelevant and least relevant features using chi-square statistics. For our experiments we use two established measures for comparing statistical classifiers, namely classification accuracy (or error rate) and the area under the ROC curve, to demonstrate that the proposed algorithm with the generative Naïve Bayesian classifier is, on average, more efficient than the discriminative models Logistic Regression and Support Vector Machine. © 2007 IEEE.
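The abstract gives only the outline of CHI-WSS: discretize, drop weak features by chi-square, then run a wrapper search with Naïve Bayes. The sketch below illustrates that general filter-then-wrapper idea and should not be read as the authors' algorithm. Everything specific in it is an assumption: the function names (`chi_square`, `nb_predict`, `loo_accuracy`, `chi_then_wrapper`), the top-`keep` cutoff, the greedy forward search, Laplace smoothing, and leave-one-out accuracy as the wrapper score. Features are assumed to be already discretized.

```python
from collections import Counter
import math

def chi_square(feature, labels):
    """Pearson chi-square statistic between one discrete feature and the class."""
    n = len(labels)
    f_counts, c_counts = Counter(feature), Counter(labels)
    joint = Counter(zip(feature, labels))
    stat = 0.0
    for f, nf in f_counts.items():
        for c, nc in c_counts.items():
            expected = nf * nc / n
            stat += (joint[(f, c)] - expected) ** 2 / expected
    return stat

def nb_predict(train_X, train_y, row, cols):
    """Categorical Naive Bayes with Laplace smoothing, using only columns `cols`."""
    best_class, best_lp = None, -math.inf
    for c, nc in Counter(train_y).items():
        lp = math.log(nc / len(train_y))  # log prior
        for j in cols:
            n_values = len({r[j] for r in train_X})
            match = sum(1 for r, y in zip(train_X, train_y)
                        if y == c and r[j] == row[j])
            lp += math.log((match + 1) / (nc + n_values))
        if lp > best_lp:
            best_class, best_lp = c, lp
    return best_class

def loo_accuracy(X, y, cols):
    """Leave-one-out accuracy of Naive Bayes restricted to `cols`."""
    hits = 0
    for i in range(len(X)):
        train_X, train_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        hits += nb_predict(train_X, train_y, X[i], cols) == y[i]
    return hits / len(X)

def chi_then_wrapper(X, y, keep):
    """Keep the `keep` highest chi-square features, then greedy forward selection."""
    n_features = len(X[0])
    scores = [chi_square([r[j] for r in X], y) for j in range(n_features)]
    candidates = sorted(range(n_features), key=lambda j: -scores[j])[:keep]
    selected, best = [], loo_accuracy(X, y, [])
    improved = True
    while improved:
        improved, pick = False, None
        for j in candidates:
            if j in selected:
                continue
            acc = loo_accuracy(X, y, selected + [j])
            if acc > best:  # accept only strict improvements
                best, pick, improved = acc, j, True
        if improved:
            selected.append(pick)
    return selected, best
```

The chi-square filter is cheap (one pass per feature), so it shrinks the candidate set before the wrapper, whose cost grows with every classifier evaluation it has to run. Calling `chi_then_wrapper(X, y, keep=2)` on a list of discretized rows `X` and labels `y` returns the chosen column indices and their leave-one-out accuracy.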

publication date

  • December 1, 2007

Digital Object Identifier (DOI)

  • 10.1109/ICOIT.2007.4418266

start page

  • 44

end page

  • 49