OP-SME: Lecture: Cluster Analysis

Cluster Analysis

A common name for a whole collection of computational statistical procedures.
Aim: to decompose the data into several homogeneous groups (clusters), such that
- the objects inside a cluster are as similar as possible;
- the objects from different clusters resemble each other as little as possible.

Definition

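Let \(\mathbf X = \{\mathbf x_1, \mathbf x_2, \dotsc, \mathbf x_n \}\) be a set of objects and let \(D\) be a coefficient of dissimilarity between objects. A cluster is a subset \(C \subseteq \mathbf X\) of objects such that \[\max_{\mathbf x_i, \mathbf x_j \in C} D(\mathbf x_i, \mathbf x_j) < \min_{\mathbf x_k \notin C,\ \mathbf x_l \in C} D(\mathbf x_k, \mathbf x_l),\] that is, \(D(\mathbf x_i, \mathbf x_j) < D(\mathbf x_k, \mathbf x_l)\) for all \(\mathbf x_i, \mathbf x_j, \mathbf x_l \in C\) and all \(\mathbf x_k \notin C\): any two objects inside the cluster are less dissimilar than any object inside the cluster and any object outside it.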

The definition is not constructive:
- it describes the property that a cluster has to satisfy,
- but does not explain how the cluster should be constructed.
Hence there are many clustering methods (see below).
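
The definition can at least be checked for a given candidate subset. The following is a minimal sketch, not part of the lecture: it assumes that the objects are numeric points, that the Euclidean distance plays the role of the dissimilarity \(D\), and that the candidate subset is given as a set of indices. The names `is_cluster`, `euclidean` and `points` are hypothetical and chosen only for illustration.

```python
from itertools import combinations
import math


def euclidean(a, b):
    """Euclidean distance, used here as an assumed dissimilarity D."""
    return math.dist(a, b)


def is_cluster(objects, members, dissimilarity=euclidean):
    """Check the cluster property for the subset given by the indices in `members`.

    Returns True if the largest dissimilarity within the subset is strictly
    smaller than the smallest dissimilarity between an object inside the
    subset and an object outside it.
    """
    inside = [i for i in range(len(objects)) if i in members]
    outside = [k for k in range(len(objects)) if k not in members]

    # Largest dissimilarity between two objects of the candidate subset
    # (0.0 if the subset has fewer than two objects).
    max_within = max(
        (dissimilarity(objects[i], objects[j]) for i, j in combinations(inside, 2)),
        default=0.0,
    )
    # Smallest dissimilarity between an outside object and an inside object
    # (infinity if either side is empty).
    min_between = min(
        (dissimilarity(objects[k], objects[l]) for k in outside for l in inside),
        default=math.inf,
    )
    return max_within < min_between


# Made-up data: three points near the origin and two points far away.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0)]

print(is_cluster(points, {0, 1, 2}))  # True: a compact, well-separated group
print(is_cluster(points, {0, 1, 3}))  # False: the subset mixes both groups
```

The sketch only verifies the property for a subset that is already given; it does not say how to find such a subset, which is exactly what the clustering methods are for.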
