Feature selection aims to find the most relevant features of a problem domain. It can improve both computational speed and prediction accuracy. However, identifying useful features among hundreds or even thousands of related features is a nontrivial task. In this paper, we introduce a hybrid feature selection method that combines two families of feature selection methods: filters and wrappers. Candidate features are first selected from the original feature set by computationally efficient filters. The candidate feature set is then refined by wrappers, which are more accurate but computationally costlier. This hybrid mechanism takes advantage of both the efficiency of the filters and the accuracy of the wrappers. The mechanism is evaluated on two bioinformatics problems, namely protein disordered region prediction and gene selection in microarray cancer data. Experimental results show that equal or better prediction accuracy can be achieved with a smaller feature set, and that these feature subsets can be obtained in a reasonable amount of time.
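The two-stage idea described above (a cheap filter that shortlists candidates, followed by a wrapper that refines the shortlist) can be illustrated with a minimal sketch. The specific choices here are assumptions made for illustration, not the filters or wrappers used in the paper: the filter ranks features by absolute Pearson correlation with the class label, the wrapper is a greedy forward search scored by leave-one-out accuracy of a nearest-centroid classifier, and the data are synthetic. The function names `filter_scores`, `loo_accuracy`, and `hybrid_select` are hypothetical.

```python
import random
from statistics import mean


def filter_scores(X, y):
    """Filter stage (assumed criterion): |Pearson correlation| of each feature with the label."""
    y_mean = mean(y)
    var_y = sum((t - y_mean) ** 2 for t in y)
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        c_mean = mean(col)
        cov = sum((c - c_mean) * (t - y_mean) for c, t in zip(col, y))
        var_c = sum((c - c_mean) ** 2 for c in col)
        if var_c > 0 and var_y > 0:
            scores.append(abs(cov) / (var_c * var_y) ** 0.5)
        else:
            scores.append(0.0)  # constant feature or constant label: no signal
    return scores


def loo_accuracy(X, y, features):
    """Wrapper criterion (assumed): leave-one-out accuracy of a nearest-centroid classifier."""
    correct = 0
    for i in range(len(X)):
        # Build class centroids from every sample except the held-out one.
        centroids = {}
        for label in set(y):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            if rows:
                centroids[label] = [mean(r[j] for r in rows) for j in features]
        point = [X[i][j] for j in features]
        pred = min(
            centroids,
            key=lambda lab: sum((p - c) ** 2 for p, c in zip(point, centroids[lab])),
        )
        correct += pred == y[i]
    return correct / len(X)


def hybrid_select(X, y, n_candidates):
    """Shortlist the top-scoring features with the filter, then refine with the wrapper."""
    scores = filter_scores(X, y)
    candidates = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    candidates = candidates[:n_candidates]

    selected, best = [], 0.0
    improved = True
    while improved:
        improved = False
        best_j = None
        # Try adding each remaining candidate; keep the single best addition.
        for j in candidates:
            if j in selected:
                continue
            acc = loo_accuracy(X, y, selected + [j])
            if acc > best:
                best, best_j, improved = acc, j, True
        if improved:
            selected.append(best_j)
    return selected, best


# Synthetic data (an assumption for the demo): feature 0 tracks the label,
# the remaining 19 features are pure noise.
random.seed(0)
y = [i % 2 for i in range(40)]
X = [
    [10.0 * label + random.gauss(0, 1)] + [random.gauss(0, 1) for _ in range(19)]
    for label in y
]

selected, accuracy = hybrid_select(X, y, n_candidates=5)
print("selected features:", selected, "LOO accuracy:", accuracy)
```

In this sketch the wrapper evaluates the classifier only on the `n_candidates` features that survive the filter, rather than on all features, which is where the time saving of a hybrid scheme would come from.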
Expert Systems with Applications 38(7), pp.8144-8150