Studies in Informatics and Control
Vol. 16, No. 4, 2007

Simultaneous Feature Selection and Clustering for Gene Expression Data Using Nonnegative Matrix Factorizations with Offset

Liviu Badea

Abstract

In this paper we show that adding offset terms to standard Nonnegative Matrix Factorization can improve clustering even without an explicit feature (gene) selection step. Given that most cancer subtypes are very heterogeneous diseases, we apply our algorithm to a large public colon cancer gene expression dataset to differentiate the main genomic-level subtypes of the disease.
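
The abstract does not spell out the update rules, so the following is only a minimal sketch of one common way to add an offset to NMF, not necessarily the algorithm used in the paper. It assumes a nonnegative genes-by-samples matrix X, a squared-error objective, and a per-gene offset vector o, so that X is approximated by W H + o. The multiplicative updates, the function name nmf_with_offset and all parameter names are illustrative assumptions.

```python
import numpy as np


def nmf_with_offset(X, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate X (genes x samples) by W @ H + o with W, H, o >= 0.

    Hypothetical sketch: squared-error multiplicative updates in which the
    offset o acts like an extra factor whose sample coefficients are fixed at 1.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = X.shape
    W = rng.random((n_genes, k)) + eps      # gene loadings per cluster
    H = rng.random((k, n_samples)) + eps    # cluster memberships per sample
    o = rng.random((n_genes, 1)) + eps      # per-gene baseline (offset)
    errors = []
    for _ in range(n_iter):
        R = W @ H + o                       # current reconstruction
        W *= (X @ H.T) / (R @ H.T + eps)
        R = W @ H + o
        H *= (W.T @ X) / (W.T @ R + eps)
        R = W @ H + o
        o *= X.sum(axis=1, keepdims=True) / (R.sum(axis=1, keepdims=True) + eps)
        errors.append(np.linalg.norm(X - (W @ H + o)))
    return W, H, o, errors


# Synthetic example (made-up data): two latent factors plus a gene baseline.
rng = np.random.default_rng(1)
X_demo = rng.random((30, 2)) @ rng.random((2, 12)) + 5 * rng.random((30, 1))
W, H, o, errors = nmf_with_offset(X_demo, k=2)
labels = H.argmax(axis=0)                   # one cluster label per sample
```

In this sketch, genes whose expression is roughly constant across samples are absorbed mostly by o rather than by W, which is one intuition for why an offset might reduce the need for a separate gene selection step. Samples are assigned to the factor with the largest coefficient in H.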

Keywords

bioinformatics, gene expression data analysis