KMC: K-Means/K-Medians

Parameter Information


Sample Selection

The sample selection option indicates whether to cluster genes or experiments.

Distance Metric Selection

This area allows selection of the metric used to assess gene-to-gene or sample-to-sample distances. The metric initially displayed (chosen) corresponds to the global setting in the Multiple Array Viewer's 'Metrics' menu. Changing the metric in this dialog affects only the current algorithm run; the global setting in the main 'Metrics' menu remains unchanged.

Euclidean Distance and Pearson Correlation tend to be the most frequently used options. An appendix in the MeV manual describes the distance metrics offered in MeV.
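
As a rough illustration of how these two metrics differ, the sketch below computes both for a pair of expression vectors. It is not MeV's implementation: the function names are hypothetical, and turning the Pearson correlation r into a distance as 1 - r is an assumption made here for illustration. The appendix in the MeV manual gives the definitions MeV actually uses.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pearson_distance(a, b):
    """Distance derived from the Pearson correlation r.

    Assumption: the distance is taken as 1 - r, so perfectly
    correlated vectors are at distance 0.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    r = cov / math.sqrt(var_a * var_b)
    return 1.0 - r

# Two hypothetical genes with the same shape but different magnitude.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]

d_euclidean = euclidean_distance(gene_a, gene_b)
d_pearson = pearson_distance(gene_a, gene_b)
```

In this example the two genes are far apart by Euclidean distance but essentially identical by the correlation-based distance, which is why the choice of metric can change which elements cluster together.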

Means/Medians option

The Means or Medians option indicates whether each cluster's centroid vector should be calculated as a mean or a median of the member expression patterns.
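
The difference between the two choices can be sketched as follows. This is a minimal illustration, not MeV's code; the function name centroid and the use_median flag are hypothetical, and the centroid is assumed to be computed independently for each position of the expression pattern.

```python
from statistics import mean, median

def centroid(members, use_median=False):
    """Centroid of a cluster, computed position by position.

    members: list of expression patterns (equal-length lists).
    use_median: hypothetical flag standing in for the Means/Medians option.
    """
    summarize = median if use_median else mean
    # zip(*members) groups the values found at each position across members.
    return [summarize(column) for column in zip(*members)]

# Hypothetical cluster with one outlying value in the first position.
cluster = [[1.0, 10.0], [2.0, 20.0], [9.0, 30.0]]

mean_centroid = centroid(cluster)
median_centroid = centroid(cluster, use_median=True)
```

Here the outlying value 9.0 pulls the mean centroid's first coordinate up to 4.0, while the median centroid stays at 2.0, so the median is less sensitive to extreme members.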

Number of Clusters

This positive integer value indicates the number of clusters to be created. Note that FOM (Figure of Merit) can be used to estimate an appropriate value.

Number of Iterations

This positive integer value is the maximum number of times that all the elements in the data set will be tested for cluster fit. On each iteration, each element is associated with the cluster that has the closest mean (or median).

Note that the algorithm will terminate either when no elements require migration (reassignment) to new clusters or when the maximum number of iterations has been reached.
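
The iteration and termination behavior described above can be sketched as a short loop. This is a simplified illustration under stated assumptions, not MeV's implementation: the function name kmc is hypothetical, Euclidean distance is used, the first k elements seed the centroids, and centroids are recomputed once per iteration after all elements have been tested.

```python
import math
from statistics import mean, median

def kmc(data, k, max_iterations, use_median=False):
    """Return (assignments, iterations_used) for a basic K-Means/K-Medians run."""
    summarize = median if use_median else mean
    # Assumption: the first k elements serve as the initial centroids.
    centroids = [list(row) for row in data[:k]]
    assignments = [None] * len(data)

    for iteration in range(1, max_iterations + 1):
        moved = 0
        # Test every element for cluster fit against the current centroids.
        for i, row in enumerate(data):
            nearest = min(range(k), key=lambda c: math.dist(row, centroids[c]))
            if nearest != assignments[i]:
                assignments[i] = nearest
                moved += 1
        # Stop early when no element required reassignment.
        if moved == 0:
            return assignments, iteration
        # Recompute each non-empty cluster's centroid from its members.
        for c in range(k):
            members = [data[i] for i in range(len(data)) if assignments[i] == c]
            if members:
                centroids[c] = [summarize(col) for col in zip(*members)]

    # The iteration limit was reached before the assignments stabilized.
    return assignments, max_iterations

# Hypothetical data: two well-separated groups of three elements each.
data = [[0, 0], [10, 10], [0, 1], [10, 11], [1, 0], [11, 10]]
assignments, iterations_used = kmc(data, k=2, max_iterations=50)
```

With this data the assignments settle on the first pass, and the second pass finds nothing to move, so the run ends well before the limit of 50. Setting the limit too low can stop a run before the clusters have stabilized.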

Hierarchical Clustering

This check box selects whether to perform hierarchical clustering on the elements in each cluster created.