In DNA microarray research, the growing number of gene expression samples and feature dimensions has become a challenge for feature selection. This calls for a more efficient classification approach that selects optimal features from gene expression data. This study presents a new feature selection algorithm that combines Correlation Feature Selection (CFS) with the Velocity Clamping Particle Swarm Optimization (VCPSO) algorithm. The hybrid model combines the advantages of filter and wrapper methods. The selected feature subsets are then used to classify genes with several classifiers: Support Vector Machine (SVM), Random Forest (RF), Naïve Bayes (NB) and Decision Tree (DT). The hybrid approach is evaluated on two bioinformatics problems: neurodegenerative brain disorder protein data and microarray cancer data. In both, reducing redundancy and identifying optimal gene features are pressing needs. Our experiments show that the CFS-VCPSO-SVM selection method eliminates redundant features and classifies the gene expression data with the highest accuracy.
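As a rough illustration of the two ingredients named above, the sketch below shows a commonly cited form of the CFS merit score (the filter step) and a single particle swarm velocity update with clamping (the wrapper search step). It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the inertia and acceleration coefficients, and the `v_max` value are hypothetical, and whether the paper uses exactly this merit formula or this update rule is an assumption.

```python
import math
import random


def cfs_merit(k, mean_feature_class_corr, mean_feature_feature_corr):
    """Commonly cited CFS merit of a subset of k features (assumed form).

    Rewards subsets whose features correlate with the class and
    penalizes subsets whose features correlate with each other.
    """
    if k <= 0:
        raise ValueError("k must be a positive number of features")
    redundancy = k + k * (k - 1) * mean_feature_feature_corr
    return k * mean_feature_class_corr / math.sqrt(redundancy)


def clamp(value, limit):
    """Restrict value to the interval [-limit, limit]."""
    return max(-limit, min(limit, value))


def update_velocity(velocity, position, personal_best, global_best,
                    v_max, inertia=0.7, c1=1.5, c2=1.5, rng=random):
    """One standard PSO velocity update with per-dimension clamping.

    The coefficients inertia, c1 and c2 are illustrative defaults,
    not values taken from the paper.
    """
    new_velocity = []
    for v, x, p, g in zip(velocity, position, personal_best, global_best):
        r1, r2 = rng.random(), rng.random()
        raw = inertia * v + c1 * r1 * (p - x) + c2 * r2 * (g - x)
        # Velocity clamping: keep each component within [-v_max, v_max]
        new_velocity.append(clamp(raw, v_max))
    return new_velocity
```

Clamping keeps a particle from overshooting promising regions of the search space when the distance to its personal or global best is large. In a hybrid of this kind, a filter score such as the merit above would typically shrink the candidate feature set before the swarm searches it with a classifier as the fitness function.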
Microarray data analysis, Correlation Feature Selection, Velocity Clamping Particle Swarm Optimization, Fusion Feature Selection.
Rajangam ATHILAKSHMI, Ramadoss RAJAVEL, Shomona Gracia JACOB, "Fusion Feature Selection: New Insights into Feature Subset Detection in Biological Data Mining", Studies in Informatics and Control, ISSN 1220-1766, vol. 28(3), pp. 327-336, 2019. https://doi.org/10.24846/v28i3y201909