Title: Statistical network modeling via exchangeable interaction processes
Abstract: Many modern network datasets arise from processes of interactions in a population, such as phone calls, e-mail exchanges, co-authorships, social network posts, and professional collaborations. In such interaction networks, the interactions are the fundamental statistical units, making a framework for interaction-labeled networks more appropriate for statistical analysis. In this talk, I present exchangeable interaction network models and explore their statistical properties. These models allow for sparsity and power-law degree distributions, both of which are widely observed empirical network properties. I will start by presenting the simple Hollywood model, which is computationally tractable, admits a clear interpretation, exhibits good theoretical properties, and performs reasonably well in estimation and prediction.

In many settings, the series of interactions exhibits additional structure. E-mail exchanges, for example, have a single sender and potentially multiple receivers. User posts on a social network occur over time and potentially exhibit community structure. I will briefly introduce three extensions of the Hollywood model that fall within the edge exchangeable framework: (1) one that partially pools information via a latent, shared population-level distribution to account for hierarchical structure; (2) one that accounts for temporal information; and (3) one that accounts for node-level latent community structure. Simulation studies and supporting theoretical analyses are presented, and computationally tractable MCMC sampling algorithms are derived. Inferences are shown on the Enron e-mail, ArXiv, and TalkLife (peer support network) datasets. I will end with a discussion of how to perform posterior predictive checks on interaction data. Using these proposed checks, I will show that the edge exchangeable framework leads to models that fit interaction datasets well.