Rajangam ATHILAKSHMI1*, Ramadoss RAJAVEL1, Shomona Gracia JACOB2
1 SSN College of Engineering, Department of ECE, Kalavakkam, Chennai 603110, India
firstname.lastname@example.org (*Corresponding author), RajavelR@ssn.edu.in
2 Independent Research Advisor, Computers and Biological Applications, Oman
ABSTRACT: In DNA microarray research, the growing number of gene expression samples and feature dimensions makes feature selection a challenge. A more efficient classification approach is therefore needed to select optimal features in gene expression data. This study presents a new feature selection algorithm that combines Correlation Feature Selection (CFS) with the Velocity Clamping Particle Swarm Optimization (VCPSO) algorithm. The hybrid model takes advantage of both filter and wrapper methods. It selects subsets of optimal features and classifies genes using several classifiers: Support Vector Machine (SVM), Random Forest (RF), Naïve Bayes (NB) and Decision Tree (DT). The hybrid mechanisms are evaluated on two bioinformatics problems: neurodegenerative brain disorder protein data and microarray cancer data. Reducing redundancy and finding optimal gene features is a pressing need. Our experiments show that the CFS-VCPSO-SVM selection method eliminates redundant features and classifies the gene expression data with maximum accuracy.
KEYWORDS: Microarray data analysis, Correlated Feature Selection, Velocity Clamping Particle Swarm Optimization, Fusion Feature Selection.
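The combination of a correlation-based filter with a velocity-clamped swarm search, as summarized in the abstract, can be illustrated with a small sketch. This is not the authors' implementation. It assumes the standard CFS merit (Hall's formulation, with absolute Pearson correlation as the correlation measure) and a binary PSO in which each velocity component is clamped to [-v_max, v_max] before a sigmoid maps it to a bit-selection probability. For brevity the CFS merit itself serves as the swarm's fitness, where the paper's wrapper stage would use classifier accuracy (for example SVM). All function names, parameter names and default values below are hypothetical.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def cfs_merit(subset, columns, labels):
    """CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff) for a feature subset."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean absolute feature-class correlation.
    r_cf = sum(abs(pearson(columns[j], labels)) for j in subset) / k
    # Mean absolute feature-feature correlation over distinct pairs.
    pairs = [(i, j) for i in subset for j in subset if i < j]
    r_ff = (sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)
            if pairs else 0.0)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

def clamp(v, v_max):
    """Velocity clamping: restrict v to the interval [-v_max, v_max]."""
    return max(-v_max, min(v_max, v))

def vcpso_select(columns, labels, n_particles=10, n_iter=30,
                 v_max=4.0, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO with velocity clamping; returns (selected indices, merit)."""
    rng = random.Random(seed)
    d = len(columns)

    def fitness(bits):
        # Assumption: CFS merit stands in for the wrapper's classifier accuracy.
        return cfs_merit([j for j in range(d) if bits[j]], columns, labels)

    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_particles)]
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(d):
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = clamp(v, v_max)
                # Sigmoid turns the clamped velocity into a selection probability.
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return [j for j in range(d) if gbest[j]], gbest_fit
```

Each feature is passed as one column (a list of values across samples), and vcpso_select(columns, labels) returns the indices of the best subset found together with its merit. Clamping keeps the sigmoid away from saturation, so every bit retains some chance of flipping and the swarm can keep exploring.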
CITE THIS PAPER AS:
Rajangam ATHILAKSHMI, Ramadoss RAJAVEL, Shomona Gracia JACOB, Fusion Feature Selection: New Insights into Feature Subset Detection in Biological Data Mining, Studies in Informatics and Control, ISSN 1220-1766, vol. 28(3), pp. 327-336, 2019. https://doi.org/10.24846/v28i3y201909