14:00 – 15:00 Tom Berrett

Title: On robustness and local differential privacy

Abstract: There is soaring demand for statistical analysis tools that are robust against contamination and that preserve individual data owners’ privacy. Although both topics have a rich body of literature, to the best of our knowledge we are the first to systematically study the connections between optimality under Huber’s contamination model and local differential privacy (LDP) constraints. In this paper, we start with a general minimax lower bound result, which disentangles the costs of being robust against Huber’s contamination and preserving LDP. We further study four concrete examples: a two-point testing problem, a potentially-diverging mean estimation problem, a nonparametric density estimation problem and a univariate median estimation problem. For each problem, we demonstrate procedures that are optimal in the presence of both contamination and LDP constraints, comment on the connections with state-of-the-art methods that have only been studied under either contamination or privacy constraints, and unveil the connections between robustness and LDP by partially answering whether LDP procedures are robust and whether robust procedures can be efficiently privatised. Overall, our work showcases a promising prospect for the joint study of robustness and local differential privacy. This is joint work with Mengchu Li and Yi Yu.

15:30 – 16:30 Ian Gallagher

Title: Adjacency spectral embedding beyond unweighted, undirected networks

Abstract: Many large real-world datasets can be considered as pairwise interactions between objects, from worldwide online connections between computers to microscopic interactions between proteins. These interactions can be represented using graphs, with nodes representing the objects and edges representing the interactions. However, in many scenarios, this simple graph model is not sufficient for complex networks. In cybersecurity applications, we may also be interested in the number of packets being sent between computers, the direction in which they are sent, the ports they are sent to and the times at which they are sent.

Graph embedding techniques produce a representation of the nodes in a network in a low-dimensional space that preserves aspects of the original structure. These embeddings can be used for a range of downstream tasks, such as node clustering, link prediction and anomaly detection. Spectral embedding algorithms based on the singular value decomposition of the adjacency matrix can be computed efficiently, which is desirable for large networks.

This talk describes the properties of the adjacency spectral embedding for networks with weighted edges evolving over time. Based on this statistical behaviour, we offer practical advice for analysing real-world networks, including how to temper edge weights that follow power law distributions with follow-on classification and clustering tasks in mind, and how to produce stable embeddings over time for dynamic networks.

Refreshments available between 15:00 and 15:30

Getting here