Spring 2016 Colloquium Series

Thursday, January 14, 2016

4:00 PM



Grigory Yaroslavtsev, Warren Center for Network and Data Sciences, University of Pennsylvania


I will introduce a framework for systematic studies of sublinear algorithms for approximately testing properties of real-valued data with respect to Lp distances for p ≥ 1. Such algorithms distinguish datasets that have (or are close to having) a certain property from datasets that are far from having it with respect to Lp distance. For applications involving noisy real-valued data, using Lp distances allows algorithms to withstand noise of bounded Lp norm. While the classical property-testing framework, developed with respect to Hamming distance, has been studied extensively, testing with respect to Lp distances has received little attention.

We use our framework to design simple and fast algorithms for classic problems, such as testing monotonicity, convexity and the Lipschitz property, and also distance approximation to monotonicity. In particular, for functions over the hypergrid domains [n]^d , the complexity of our algorithms for all these properties does not depend on the linear dimension n. I will also describe applications of our Lp-testing algorithms to testing assumptions used in the analysis of gradient descent methods for convex optimization and suggest multiple directions for future work.
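As a rough illustration of the distances involved (a sketch for intuition, not code from the talk), the normalized Lp distance between two discretized functions, and a function's L2 distance to the nearest monotone function, can be computed directly; the monotone projection here uses the standard Pool Adjacent Violators algorithm:

```python
import numpy as np

def lp_distance(f, g, p=1):
    """Normalized Lp distance between two functions on [n],
    represented as arrays of real values."""
    f, g = np.asarray(f, float), np.asarray(g, float)
    return float(np.mean(np.abs(f - g) ** p) ** (1.0 / p))

def nearest_monotone_l2(f):
    """Pool Adjacent Violators: the L2-nearest nondecreasing function."""
    blocks = []  # each block holds [sum, count] of pooled values
    for v in f:
        blocks.append([float(v), 1])
        # merge while the last two block means violate monotonicity
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    g = []
    for s, c in blocks:
        g.extend([s / c] * c)
    return g

# A function is epsilon-far from monotone in L2 if this distance
# exceeds epsilon:
d = lp_distance([1.0, 0.0], nearest_monotone_l2([1.0, 0.0]), p=2)  # 0.5
```

A tester in this framework would estimate such a distance from few samples rather than reading the whole input, which is where the sublinearity comes from.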

Joint work with Piotr Berman and Sofya Raskhodnikova (STOC'14)

Grigory Yaroslavtsev is a fellow at the Warren Center for Network and Data Sciences at the University of Pennsylvania hosted by the departments of Computer and Information Sciences and Statistics at the Wharton School of Business. He works on theoretical foundations of large-scale algorithm design with applications to graph processing, machine learning and data privacy. Grigory has been continuously supported through various fellowships since he started graduate school in 2010 and has been a research intern and visiting consultant for AT&T Labs (NJ), IBM Research (CA), Microsoft Research (CA & WA) and Google (NYC).



Assistant Professor Andrea Chomistek, Department of Epidemiology and Biostatistics, School of Public Health, Indiana University

Development of a multi-dimensional physical activity profile for cardiovascular health

Although there is agreement that lack of physical activity is a major, modifiable risk factor for cardiovascular disease (CVD), there is lack of consensus on which aspects of physical activity have the most impact on health. Physical activity is a complex behavior with multiple dimensions that may have independent biological effects. Prior research has primarily focused on the effects of moderate-to-vigorous physical activity (MVPA), but a single physical activity component is not an adequate representation of an individual’s overall physical activity level. It is critical to examine multiple aspects of physical activity simultaneously to ascertain whether an individual is exhibiting an overall active lifestyle pattern to benefit cardiovascular health. Furthermore, in addition to recent physical activity, it may be important to consider how long-term activity patterns influence cardiovascular risk. Thus, we aim to use contemporary methods to develop a physical activity profile that incorporates multiple dimensions of recent physical activity in addition to long-term physical activity patterns by examining associations with biomarkers and clinical endpoints.



Robert Zinkov, Department of Computer Science, Indiana University

Composing inference methods for probabilistic models

Probabilistic inference procedures are usually coded painstakingly from scratch, for each target model and each inference algorithm. In this talk, I will show how inference procedures can be posed as program transformations which transform one probabilistic model into another probabilistic model. These transformations allow us to generate programs which express exact and approximate inference, and allow us to compose multiple inference procedures over a single model. The resulting inference procedure runs in time comparable to a handwritten procedure.
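The idea of inference as a model-to-model transformation can be sketched in miniature (an illustration only; the names below are made up and are not the system discussed in the talk). Here a Beta-Bernoulli model is transformed by exact conditioning into another model of the same kind:

```python
from dataclasses import dataclass

@dataclass
class Beta:
    """A tiny probabilistic model: a Beta distribution over a coin's bias."""
    a: float
    b: float

def condition_on_flips(prior: Beta, flips):
    """An 'inference' expressed as a program transformation: the
    conjugate update maps the prior model to the exact posterior
    model, which is again a Beta and can be transformed further."""
    heads = sum(flips)
    return Beta(prior.a + heads, prior.b + len(flips) - heads)

posterior = condition_on_flips(Beta(1, 1), [1, 1, 0, 1])  # Beta(4, 2)
```

Because the output is itself a model, transformations like this compose: the posterior can be fed to another exact or approximate transformation, mirroring the composition of inference methods described above.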



Amanda F. Mejia, Department of Biostatistics, Johns Hopkins University

Statistical Methods for Brain Connectivity Analysis

Outlier detection for high-dimensional data is a popular topic in modern statistical research. However, one source of high-dimensional data that has received relatively little attention is functional magnetic resonance imaging (fMRI), which consists of 100,000 or more measurements, sampled at hundreds to thousands of time points. At a time when the availability of fMRI data is rapidly growing—primarily through large, publicly available grassroots datasets consisting of hundreds of subjects collected at different sites—automated quality control and outlier detection methods are greatly needed. I propose PCA leverage and demonstrate how it can be used to identify outlying time points in an fMRI scan. Furthermore, PCA leverage is a measure of the influence of each observation on the estimation of principal components, which forms the basis of independent component analysis (ICA) and seed connectivity, two of the most widely used methods for analyzing resting-state fMRI data. I also propose an alternative measure, PCA robust distance, which is less sensitive to outliers and has controllable statistical properties. The proposed methods are validated through simulation studies and are shown to be highly accurate. A reliability study using resting-state fMRI data from the Autism Brain Imaging Data Exchange (ABIDE) shows that removal of outliers using the proposed methods results in more reliable estimation of subject-level resting-state networks using ICA. Time permitting, I will also discuss the development and performance of empirical Bayes shrinkage methods for resting-state connectivity, which improve reliability of subject-level estimates by borrowing strength from the large populations of subjects for whom resting-state fMRI data is now available.
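A minimal sketch of the leverage idea (not the talk's exact method): leverage of each time point is the corresponding diagonal entry of the hat matrix built from the top left singular vectors of the centered data. The component count and the median-based threshold below are illustrative choices:

```python
import numpy as np

def pca_leverage(X, var_kept=0.9):
    """Leverage of each time point (row of X, time-by-voxel) on the
    top principal components: row sums of squares of U_k, where U is
    the left singular matrix of the centered data and k retains a
    chosen fraction of variance (an illustrative rule)."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    frac = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(frac, var_kept)) + 1
    return np.sum(U[:, :k] ** 2, axis=1)

def flag_outliers(leverage, mult=3.0):
    """Flag time points whose leverage exceeds a multiple of the
    median leverage (a simple robust threshold)."""
    return np.where(leverage > mult * np.median(leverage))[0]

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 30))
X[10] += 20.0                      # inject an artifact at time point 10
lev = pca_leverage(X)              # time point 10 has the largest leverage
```

An observation with outsized influence on the principal components will likewise distort ICA and seed-connectivity estimates built on them, which is why screening on leverage before those analyses is attractive.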



Professor Jerry Zhu, Department of Computer Sciences, University of Wisconsin-Madison

Machine Teaching

Machine teaching is an inverse problem to machine learning: given a learning algorithm and a target model, it seeks the smallest training set D such that, when given D, the learning algorithm learns the target model.  Machine teaching is useful in applications where the teacher already has a target model but cannot directly hardwire it into the learner, and must instead communicate the target model via training examples.  Example applications include optimal lesson design for education, data-poisoning attacks in computer security, and interactive machine learning.  I will present an overview of machine teaching and highlight several open questions, ranging from theory (quantifying the teaching dimension) and optimization (solving Stackelberg games) to human-computer interaction.
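The inverse-problem flavor can be seen in a toy example (an illustrative sketch, not from the talk): to teach a 1-D threshold classifier to a learner that outputs the midpoint of its version space, two well-chosen examples suffice, whereas a passive learner would need on the order of 1/ε random examples for the same precision:

```python
def learn_threshold(examples):
    """A simple learner: returns the midpoint between the largest
    negative and smallest positive example (version-space midpoint)."""
    neg = max(x for x, y in examples if y == 0)
    pos = min(x for x, y in examples if y == 1)
    return (neg + pos) / 2

def teach_threshold(theta, eps=1e-3):
    """The teacher's inverse problem: the smallest training set for
    this learner is just two examples straddling the target threshold."""
    return [(theta - eps, 0), (theta + eps, 1)]

model = learn_threshold(teach_threshold(0.7))  # recovers 0.7
```

The teaching set size needed for a given learner (here, two) is the "teaching dimension" mentioned above.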

Jerry Zhu is a professor in the Department of Computer Sciences at the University of Wisconsin-Madison.  He received his Ph.D. from Carnegie Mellon University in 2005.  He is a recipient of a National Science Foundation CAREER Award in 2010 and best paper awards from ICML, AAAI, ECML/PKDD, and SIGSOFT.  His research interests include mathematical foundations of human and machine learning, topological methods for machine learning, and mining social media for social good.



Associate Professor James Flegal, Department of Statistics, University of California, Riverside

A practical sequential stopping rule for high-dimensional MCMC

A current challenge for many Bayesian analyses is determining when to terminate high-dimensional Markov chain Monte Carlo simulations. To this end, we propose using an automated sequential stopping procedure that terminates the simulation when the computational uncertainty is small relative to the posterior uncertainty. Further, we show this stopping rule is equivalent to stopping when the effective sample size is sufficiently large. Such a stopping rule has previously been shown to work well in settings with posteriors of moderate dimension. In this talk, we illustrate its utility in high-dimensional simulations while overcoming some current computational issues. As an example, we consider a Bayesian dynamic space-time model on weather station data. Our results show the sequential stopping rule is easy to implement, provides uncertainty estimates, and performs well in high-dimensional settings.
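The stopping rule can be sketched for a scalar chain (a simplified illustration, not the authors' implementation): estimate the effective sample size via batch means and extend the simulation until it crosses a target:

```python
import numpy as np

def ess(chain, n_batches=30):
    """Effective sample size via batch means: n * lambda^2 / sigma^2,
    where sigma^2 estimates the chain's asymptotic variance and
    lambda^2 is the ordinary sample variance."""
    chain = np.asarray(chain, float)
    n = len(chain) - len(chain) % n_batches
    m = n // n_batches
    batch_means = chain[:n].reshape(n_batches, m).mean(axis=1)
    sigma2 = m * batch_means.var(ddof=1)
    return n * chain[:n].var(ddof=1) / sigma2

def run_until_ess(sampler, target_ess=1000, chunk=5000, max_len=200_000):
    """Sequential stopping: extend the simulation until the estimated
    ESS is large enough, i.e. the Monte Carlo error is small relative
    to the posterior uncertainty. `sampler(m)` is assumed to return
    the next m draws of the chain."""
    chain = []
    while len(chain) < max_len:
        chain.extend(sampler(chunk))
        if ess(chain) >= target_ess:
            break
    return np.asarray(chain)

rng = np.random.default_rng(1)
chain = run_until_ess(lambda m: rng.normal(size=m).tolist(), target_ess=500, chunk=2000)
```

For correlated chains sigma² exceeds lambda², so the rule automatically demands longer runs when mixing is slow; the high-dimensional version discussed in the talk replaces these scalar variances with multivariate analogues.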



Professor Jun Zhang, Department of Psychology and Department of Mathematics, University of Michigan-Ann Arbor. 

Information Geometry: From Divergence Functions to Geometric Structures

Information Geometry is the differential geometric study of the manifold of probability density functions. Divergence functions (such as Bregman or KL divergence), as measures of proximity on this manifold, play an important role in machine learning, statistical inference, optimization, etc. This talk will explain the various geometric structures induced from any divergence function. Most importantly, a Riemannian metric (Fisher information) with a family of torsion-free affine connections (alpha-connections) can be induced on the manifold; this is the so-called “statistical structure” in Information Geometry. Divergence functions can induce other important structures/quantities, such as bi-orthogonal coordinates (namely expectation and natural parameters), parallel volume form (in modeling Bayesian priors), symplectic structure (for Hamiltonian dynamics as in the HMC algorithm), and (work in progress) Kähler/para-Kähler structure on the manifold.
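The Bregman divergences mentioned above follow one template, D_F(x, y) = F(x) − F(y) − ⟨∇F(y), x − y⟩, which specializes to familiar distances for particular generators F (a small numerical sketch for intuition):

```python
import numpy as np

def bregman(F, gradF, x, y):
    """Bregman divergence D_F(x, y) = F(x) - F(y) - <grad F(y), x - y>."""
    return F(x) - F(y) - np.dot(gradF(y), x - y)

# F(x) = ||x||^2 recovers the squared Euclidean distance:
sq = lambda x: np.dot(x, x)
grad_sq = lambda x: 2.0 * x

# F(p) = sum p log p (negative entropy) recovers the KL divergence
# on probability vectors:
neg_entropy = lambda p: np.sum(p * np.log(p))
grad_neg_entropy = lambda p: np.log(p) + 1.0

p, q = np.array([0.2, 0.8]), np.array([0.5, 0.5])
kl = bregman(neg_entropy, grad_neg_entropy, p, q)  # equals sum(p*log(p/q))
```

Expanding D_F to second order in x − y around y yields the Hessian of F as a Riemannian metric, which for the negative-entropy generator is exactly the Fisher information, the metric referred to in the abstract.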