Imperial College London

Dr Francesco Sanna Passino

Faculty of Natural SciencesDepartment of Mathematics

Lecturer in Statistics
 
 
 
//

Contact

 

f.sannapassino Website

 
 
//

Location

 

552Huxley BuildingSouth Kensington Campus

//

Summary

 

Publications

Publication Type
Year
to

12 results found

September MAK, Sanna Passino F, Goldmann L, Hinel Aet al., 2024, Extended Deep Adaptive Input Normalization for Preprocessing Time Series Data for Neural Networks, 27th International Conference on Artificial Intelligence and Statistics (AISTATS)

Conference paper

Sanna Passino F, Che Y, Cardoso Correia Perello C, 2024, Graph-based mutually exciting point processes for modelling event times in docked bike-sharing systems, Stat, Vol: 13, ISSN: 2049-1573

This paper introduces graph-based mutually exciting processes (GB-MEP) to model event times in network point processes, focusing on an application to docked bike-sharing systems. GB-MEP incorporates known relationships between nodes in a graph within the intensity function of a node-based multivariate Hawkes process. This approach reduces the number of parameters to a quantity proportional to the number of nodes in the network, resulting in significant advantages for computational scalability when compared with traditional methods. The model is applied on event data observed on the Santander Cycles network in central London, demonstrating that exploiting network-wide information related to geographical location of the stations is beneficial to improve the performance of node-based models for applications in bike-sharing systems. The proposed GB-MEP framework is more generally applicable to any network point process where a distance function between nodes is available, demonstrating wider applicability.

Journal article

Sanna Passino F, Adams N, Cohen E, Evangelou M, Heard NAet al., 2023, Statistical cybersecurity: a brief discussion of challenges, data structures, and future directions, Harvard Data Science Review, Vol: 5, Pages: 1-10, ISSN: 2644-2353

Journal article

Sanna Passino F, Heard NA, 2023, Mutually exciting point process graphs for modelling dynamic networks, Journal of Computational and Graphical Statistics, Vol: 32, Pages: 116-130, ISSN: 1061-8600

A new class of models for dynamic networks is proposed, called mutually exciting point process graphs (MEG). MEG is a scalable network-wide statistical model for point processes with dyadic marks, which can be used for anomaly detection when assessing the significance of future events, including previously unobserved connections between nodes. The model combines mutually exciting point processes to estimate dependencies between events and latent space models to infer relationships between the nodes. The intensity functions for each network edge are characterized exclusively by node-specific parameters, which allows information to be shared across the network. This construction enables estimation of intensities even for unobserved edges, which is particularly important in real world applications, such as computer networks arising in cyber-security. A recursive form of the log-likelihood function for MEG is obtained, which is used to derive fast inferential procedures via modern gradient ascent algorithms. An alternative EM algorithm is also derived. The model and algorithms are tested on simulated graphs and real world datasets, demonstrating excellent performance. Supplementary materials for this article are available online.

Journal article

Sanna Passino F, Heard N, 2022, Latent structure blockmodels for Bayesian spectral graph clustering, Statistics and Computing, Vol: 32, ISSN: 0960-3174

Spectral embedding of network adjacency matrices often produces node representations living approximately around low-dimensional submanifold structures. In particular, hidden substructure is expected to arise when the graph is generated from a latent position model. Furthermore, the presence of communities within the network might generate community-specific submanifold structures in the embedding, but this is not explicitly accounted for in most statistical models for networks. In this article, a class of models called latent structure block models (LSBM) is proposed to address such scenarios, allowing for graph clustering when community-specific one dimensional manifold structure is present. LSBMs focus on a specific class of latent space model, the random dot product graph (RDPG), and assign a latent submanifold to the latent positions of each community. A Bayesian model for the embeddings arising from LSBMs is discussed, and shown to have a good performance on simulated and real world network data. The model is able to correctly recover the underlying communities living in a one-dimensional manifold, even when the parametric form of the underlying curves is unknown, achieving remarkable results on a variety of real data.

Journal article

Sanna Passino F, Heard NA, Rubin-Delanchy P, 2022, Spectral clustering on spherical coordinates under the degree-corrected stochastic blockmodel, Technometrics, Vol: 64, Pages: 346-357, ISSN: 0040-1706

Spectral clustering is a popular method for community detection in networksunder the assumption of the standard stochastic blockmodel. Taking a matrix representation of the graph such as the adjacency matrix, the nodes are clustered on a low dimensional projection obtained from a truncated spectral decomposition of the matrix. Estimating the number of communities and the dimension of the reduced latent space well is crucial for good performance of spectral clustering algorithms. Real-world networks, such as computer networks studied in cyber-security applications, often present heterogeneous within-community degree distributions which are better addressed by the degree-corrected stochastic blockmodel. A novel, model-based method is proposed in this article for simultaneous and automated selection of the number of communities and latent dimension for spectral clustering under the degree-corrected stochastic blockmodel. The method is based on a transformation to spherical coordinates of the spectral embedding, and on a novel modelling assumption in the transformed space, which is then embedded into an existing model selection framework for estimating the number of communities and the latent dimension. Results show improved performance over competing methods on simulated and real-world computer network data.

Journal article

Sanna Passino F, Turcotte MJM, Heard NA, 2021, Graph link prediction in computer networks using Poisson matrix factorisation, Annals of Applied Statistics, Vol: 16, Pages: 1313-1332, ISSN: 1932-6157

Graph link prediction is an important task in cyber-security: relationshipsbetween entities within a computer network, such as users interacting withcomputers, or system libraries and the corresponding processes that use them,can provide key insights into adversary behaviour. Poisson matrix factorisation(PMF) is a popular model for link prediction in large networks, particularlyuseful for its scalability. In this article, PMF is extended to includescenarios that are commonly encountered in cyber-security applications.Specifically, an extension is proposed to explicitly handle binary adjacencymatrices and include known covariates associated with the graph nodes. Aseasonal PMF model is also presented to handle dynamic networks. To allow themethods to scale to large graphs, variational methods are discussed forperforming fast inference. The results show an improved performance over thestandard PMF model and other common link prediction techniques.

Journal article

Sanna Passino F, Bertiger AS, Neil JC, Heard NAet al., 2021, Link prediction in dynamic networks using random dot product graphs, Data Mining and Knowledge Discovery, Vol: 35, Pages: 2168-2199, ISSN: 1384-5810

The problem of predicting links in large networks is a crucial task in avariety of practical applications, including social sciences, biology andcomputer security. In this paper, statistical techniques for link predictionbased on the popular random dot product graph model are carefully presented,analysed and extended to dynamic settings. Motivated by a practical applicationin cyber-security, this paper demonstrates that random dot product graphs notonly represent a powerful tool for inferring differences between multiplenetworks, but are also efficient for prediction purposes and for understandingthe temporal evolution of the network. The probabilities of links are obtainedby fusing information at multiple levels of resolution: time series models areused to score connections at the edge level, and spectral methods provideestimates of latent positions for each node. In this way, traditional linkprediction methods, usually based on decompositions of the entire networkadjacency matrix, are extended using edge-specific information. The methodspresented in this article are applied to a number of simulated and real-worldcomputer network graphs, showing promising results.

Journal article

Sanna Passino F, Maystre L, Moor D, Anderson A, Lalmas Met al., 2021, Where To Next? A Dynamic Model of User Preferences, WWW '21: The Web Conference 2021, Publisher: ACM

Conference paper

Sanna Passino F, Heard N, 2020, Bayesian estimation of the latent dimension and communities in stochastic blockmodels, Statistics and Computing, Vol: 30, Pages: 1291-1307, ISSN: 0960-3174

Spectral embedding of adjacency or Laplacian matrices of undirected graphs is a common technique for representing a network in a lower dimensional latent space, with optimal theoretical guarantees. The embedding can be used to estimate the community structure of the network, with strong consistency results in the stochastic blockmodel framework. One of the main practical limitations of standard algorithms for community detection from spectral embeddings is that the number of communities and the latent dimension of the embedding must be specified in advance. In this article, a novel Bayesian model for simultaneous and automatic selection of the appropriate dimension of the latent space and the number of blocks is proposed. Extensions to directed and bipartite graphs are discussed. The model is tested on simulated and real world network data, showing promising performance for recovering latent community structure.

Journal article

Sanna Passino F, Heard NA, 2020, Classification of periodic arrivals in event time data for filtering computer network traffic, Statistics and Computing, Vol: 30, Pages: 1241-1254, ISSN: 0960-3174

Periodic patterns can often be observed in real-world event time data, possibly mixed with non-periodic arrival times. For modelling purposes, it is necessary to correctly distinguish the two types of events. This task has particularly important implications in computer network security; there, separating automated polling traffic and human-generated activity in a computer network is important for building realistic statistical models for normal activity, which in turn can be used for anomaly detection. Since automated events commonly occur at a fixed periodicity, statistical tests using Fourier analysis can efficiently detect whether the arrival times present an automated component. In this article, sequences of arrival times which contain automated events are further examined, to separate polling and non-periodic activity. This is first achieved using a simple mixture model on the unit circle based on the angular positions of each event time on the p-clock, where p represents the main periodicity associated with the automated activity; this model is then extended by combining a second source of information, the time of day of each event. Efficient implementations exploiting conjugate Bayesian models are discussed, and performance is assessed on real network flow data collected at Imperial College London.

Journal article

Sanna Passino F, Heard NA, 2019, Modelling dynamic network evolution as a Pitman-Yor process, Foundations of Data Science, Vol: 1, Pages: 293-306, ISSN: 2639-8001

Dynamic interaction networks frequently arise in biology, communications technology and the social sciences, representing, for example, neuronal connectivity in the brain, internet connections between computers and human interactions within social networks. The evolution and strengthening of the links in such networks can be observed through sequences of connection events occurring between network nodes over time. In some of these applications, the identity and size of the network may be unknown a priori and may change over time. In this article, a model for the evolution of dynamic networks based on the Pitman-Yor process is proposed. This model explicitly admits power-laws in the number of connections on each edge, often present in real world networks, and, for careful choices of the parameters, power-laws for the degree distribution of the nodes. A novel empirical method for the estimation of the hyperparameters of the Pitman-Yor process is proposed, and some necessary corrections for uniform discrete base distributions are carefully addressed. The methodology is tested on synthetic data and in an anomaly detection study on the enterprise computer network of the Los Alamos National Laboratory, and successfully detects connections from a red-team penetration test.

Journal article

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.

Request URL: http://wlsprd.imperial.ac.uk:80/respub/WEB-INF/jsp/search-html.jsp Request URI: /respub/WEB-INF/jsp/search-html.jsp Query String: respub-action=search.html&id=01244463&limit=30&person=true