Multidimensional Scaling

Multidimensional scaling (MDS) is widely used as a data visualization and dimension reduction tool in many fields, including psychometrics. The goal of MDS is to locate objects in space according to their pairwise dissimilarities, so that similar objects are close together and less similar objects are far apart. Mathematically, applying MDS to \(n\) objects with dissimilarity matrix \(\boldsymbol D = (d_{ij})\) minimizes the objective function \[ \sum_{i<j} \left(d_{ij} - \|\boldsymbol x_i - \boldsymbol x_j\|\right)^2 \] with respect to \(\boldsymbol X = (\boldsymbol x_1, \ldots, \boldsymbol x_n)^T\), where \(\boldsymbol x_i \in \mathbb{R}^K\) contains the coordinates of the \(i\)th object in \(K\)-dimensional Euclidean space and \(\| \boldsymbol x_i - \boldsymbol x_j\| = \sqrt{(\boldsymbol x_i - \boldsymbol x_j)^T (\boldsymbol x_i - \boldsymbol x_j)}\).
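To make the objective function concrete, the following minimal sketch evaluates it for a given dissimilarity matrix and a candidate configuration. The function name `stress` and the toy 3-4-5 triangle data are assumptions made for illustration only; they do not come from any particular package.

```python
import math

def stress(D, X):
    """Sum over i < j of (d_ij - ||x_i - x_j||)^2."""
    n = len(X)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dist = math.dist(X[i], X[j])  # Euclidean distance (Python 3.8+)
            total += (D[i][j] - dist) ** 2
    return total

# Toy data: three objects whose dissimilarities are the sides of a 3-4-5 triangle.
D = [[0, 3, 5],
     [3, 0, 4],
     [5, 4, 0]]

# With K = 2 the dissimilarities can be reproduced exactly.
X_exact = [[0.0, 0.0], [3.0, 0.0], [3.0, 4.0]]
# With K = 1 one pair (objects 2 and 3) is necessarily misrepresented.
X_line = [[0.0], [3.0], [5.0]]

print(stress(D, X_exact))  # 0.0
print(stress(D, X_line))   # 4.0
```

The one-dimensional configuration places objects 2 and 3 at distance 2 instead of 4, which contributes \((4 - 2)^2 = 4\) to the objective.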

Each dimension of the coordinates is a feature describing the differences among the objects. To extract features from process data, we need to find a dissimilarity measure that appropriately summarizes the differences between two action sequences.

Dissimilarity between sequences

At least three aspects should be taken into account when choosing a dissimilarity measure for action sequences.

  1. The dissimilarity measure should work for sequences whose elements are categorical.

  2. The dissimilarity measure should work for sequences with unequal lengths.

  3. The dissimilarity measure can account for the difference in the order of actions.
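As an illustration of these three aspects, the sketch below implements the Levenshtein (edit) distance, a standard measure that compares categorical elements, handles unequal lengths, and is sensitive to order. It is used here only as a familiar example and is not necessarily one of the measures intended in this document. The action labels are hypothetical.

```python
def edit_distance(s, t):
    """Levenshtein distance between two action sequences (lists of labels)."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, a in enumerate(s, start=1):
        curr = [i]
        for j, b in enumerate(t, start=1):
            cost = 0 if a == b else 1
            curr.append(min(prev[j] + 1,          # delete a from s
                            curr[j - 1] + 1,      # insert b into s
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]

# Hypothetical action sequences.
s1 = ["start", "open_map", "zoom", "submit"]
s2 = ["start", "zoom", "open_map", "submit"]  # same actions, different order
s3 = ["start", "submit"]                      # shorter sequence

print(edit_distance(s1, s2))  # 2
print(edit_distance(s1, s3))  # 2
```

Sequences `s1` and `s2` contain the same actions but in a different order, and the measure still reports a positive dissimilarity, which is the behavior the third aspect asks for.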

Examples of dissimilarity measures that meet these requirements are


The procedure for extracting features from process data by multidimensional scaling is as follows:

  1. Form the dissimilarity matrix \(\boldsymbol D\) of \(n\) action sequences \(\boldsymbol s_1, \boldsymbol s_2, \ldots, \boldsymbol s_n\) by calculating the pairwise dissimilarities \(d_{ij}, 1\leq i, j \leq n\) according to a chosen dissimilarity measure.

  2. Obtain \(K\) raw features \(\tilde{\boldsymbol x}_1, \ldots, \tilde{\boldsymbol x}_K\) by applying multidimensional scaling to \(\boldsymbol{D}\).

  3. Obtain \(K\) principal features \(\boldsymbol x_1, \ldots, \boldsymbol x_K\) by performing principal component analysis (PCA) on the \(K\) raw features.
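The three steps above can be sketched end to end as follows. This is a minimal, dependency-free illustration under several assumptions: the edit distance stands in for the chosen dissimilarity measure, the MDS objective is minimized by plain gradient descent (which may stop at a local minimum), and the PCA step is written for \(K = 2\) only, where the principal axes have a closed form. All function names, parameter values, and action labels are hypothetical.

```python
import math
import random

def edit_distance(s, t):
    """Levenshtein distance; a stand-in for the chosen dissimilarity measure."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, start=1):
        curr = [i]
        for j, b in enumerate(t, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (a != b)))
        prev = curr
    return prev[-1]

def dissimilarity_matrix(seqs, measure):
    """Step 1: pairwise dissimilarities d_ij."""
    return [[measure(s, t) for t in seqs] for s in seqs]

def mds(D, K, n_iter=2000, lr=0.01, seed=0):
    """Step 2: minimize sum_{i<j} (d_ij - ||x_i - x_j||)^2 by gradient descent."""
    rng = random.Random(seed)
    n = len(D)
    X = [[rng.gauss(0, 1) for _ in range(K)] for _ in range(n)]
    for _ in range(n_iter):
        grad = [[0.0] * K for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                dist = math.dist(X[i], X[j])
                if dist == 0.0:
                    continue  # gradient is undefined for coincident points
                coef = -2.0 * (D[i][j] - dist) / dist
                for k in range(K):
                    g = coef * (X[i][k] - X[j][k])
                    grad[i][k] += g
                    grad[j][k] -= g
        X = [[x - lr * g for x, g in zip(xi, gi)] for xi, gi in zip(X, grad)]
    return X

def pca_2d(X):
    """Step 3 (K = 2 only): center the raw features and rotate to principal axes."""
    n = len(X)
    m0 = sum(x[0] for x in X) / n
    m1 = sum(x[1] for x in X) / n
    C = [(x[0] - m0, x[1] - m1) for x in X]
    s00 = sum(a * a for a, _ in C)
    s11 = sum(b * b for _, b in C)
    s01 = sum(a * b for a, b in C)
    theta = 0.5 * math.atan2(2.0 * s01, s00 - s11)  # angle of the major axis
    c, s = math.cos(theta), math.sin(theta)
    return [(c * a + s * b, -s * a + c * b) for a, b in C]

# Hypothetical action sequences.
seqs = [
    ["start", "a", "b", "end"],
    ["start", "b", "a", "end"],
    ["start", "a", "end"],
    ["start", "a", "b", "c", "d", "end"],
    ["start", "end"],
]
D = dissimilarity_matrix(seqs, edit_distance)
raw = mds(D, K=2)
features = pca_2d(raw)
for row in features:
    print(row)
```

Each row of `features` holds the two principal features of one sequence. After the rotation the two features are uncorrelated and ordered by decreasing variance, while the pairwise distances between rows are the same as in the raw MDS solution. For general \(K\), the rotation would instead come from an eigendecomposition of the covariance matrix of the raw features.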