Validating clustering for gene expression data bioinformatics

Posted by / 12-Feb-2020 06:10


At the moment there do not seem to be any clear-cut guidelines regarding the choice of a clustering algorithm to be used for grouping genes based on their expression profiles.

In contrast to classical clustering techniques such as hierarchical clustering (Sokal and Michener, 1958) and k-means clustering (Hartigan and Wong, 1979), biclustering does not require genes in the same cluster to behave similarly over all experimental conditions.

Instead, a bicluster is defined as a subset of genes that exhibit compatible expression patterns over a subset of conditions.
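To make the definition concrete, here is a minimal sketch in Python with NumPy. It assumes that "compatible" means "highly correlated", which is only one possible reading, and the toy matrix `expr` and the helper `min_pairwise_correlation` are made up for illustration. The three chosen genes move together over the first three conditions but not over all six.

```python
import numpy as np


def min_pairwise_correlation(expr, genes, conditions):
    """Smallest Pearson correlation between any two of the given genes,
    computed only over the given conditions (illustrative helper)."""
    sub = expr[np.ix_(genes, conditions)]   # genes x conditions submatrix
    corr = np.corrcoef(sub)                 # each row is treated as one variable
    upper = np.triu_indices(len(genes), k=1)
    return float(corr[upper].min())


# Hypothetical toy data: 4 genes (rows) x 6 conditions (columns).
expr = np.array([
    [1.0, 2.0, 3.0, 9.0, 1.0, 5.0],
    [2.0, 4.0, 6.0, 1.0, 7.0, 2.0],
    [0.5, 1.0, 1.5, 4.0, 8.0, 0.0],
    [5.0, 1.0, 4.0, 2.0, 3.0, 6.0],
])

genes = [0, 1, 2]             # candidate bicluster rows
subset = [0, 1, 2]            # candidate bicluster columns
all_conditions = list(range(expr.shape[1]))

within = min_pairwise_correlation(expr, genes, subset)
overall = min_pairwise_correlation(expr, genes, all_conditions)

print(f"min correlation over the condition subset: {within:.3f}")
print(f"min correlation over all conditions:       {overall:.3f}")
```

A classical method that compares whole expression profiles would not group these three genes, because at least one pair is negatively correlated across all six conditions. Restricted to the condition subset, they form a tight group, which is the situation biclustering is meant to capture.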

While the ‘best’ method is dependent on the exact validation strategy and the number of clusters to be used, overall appears to be a solid performer.

Interestingly, the performances of correlation-based hierarchical clustering and model-based clustering (another method that has been advocated by a number of researchers) appear to be at opposite extremes, depending on what validation measure one employs.


Results: In this paper, we consider six clustering algorithms (of various flavors!
