Speaker: Michael Trosset, Professor, Department of Statistics, Indiana University
Title: Continuous Multidimensional Scaling
Abstract:
Multidimensional Scaling (MDS) is a collection of techniques for constructing Euclidean representations of proximity data, i.e., information about the pairwise (dis)similarities of a collection of objects. Traditional MDS is concerned with embedding a fixed set of proximities associated with a fixed set of n objects, and is widely used to visualize the low-dimensional structure of small data sets. In contrast, modern applications such as graph embedding and large language models are typically more concerned with constructing representation spaces suitable for subsequent inference. Developing asymptotic theories for statistical inference on random graphs necessitates studying the limiting behavior of MDS on a sequence of proximities associated with an increasing set of objects.
We study the asymptotic behavior of embedded structures obtained by minimizing Kruskal's (1964) raw stress criterion, which approximates dissimilarities with Euclidean distances. Standard results from the theory of point-to-set maps imply that, if n is fixed, then the limit of the embedded structures is the embedded structure of the limiting proximities. But what if n tends to infinity? It then becomes necessary to reformulate MDS so that the entire sequence of embedding problems can be viewed as a sequence of optimization problems in a common space. We present such a formulation and show that the limit of the embedded structures is again the embedded structure of the limiting proximities. Finally, to encourage smooth behavior and uniform convergence, we introduce approximate Lipschitz constraints and explore the implications of imposing them. Returning to the case of fixed n, we conclude by proposing Approximate Lipschitz Embedding (ALE) and discussing some computational issues associated with it.
This is joint work with Carey E. Priebe.