Imperial College London

Professor Michael Bronstein

Faculty of Engineering, Department of Computing

Visiting Professor
 
 
 

Contact

 

m.bronstein Website

 
 

Location

 

569 Huxley Building, South Kensington Campus



 

Publications


249 results found

Andreas J, Beguš G, Bronstein MM, Diamant R, Delaney D, Gero S, Goldwasser S, Gruber DF, de Haas S, Malkin P, Pavlov N, Payne R, Petri G, Rus D, Sharma P, Tchernov D, Tønnesen P, Torralba A, Vogt D, Wood RJ et al., 2022, Toward understanding the communication in sperm whales, iScience, Vol: 25

Machine learning has been advancing dramatically over the past decade. Most strides are human-based applications due to the availability of large-scale datasets; however, opportunities are ripe to apply this technology to more deeply understand non-human communication. We detail a scientific roadmap for advancing the understanding of communication of whales that can be built further upon as a template to decipher other forms of animal and non-human communication. Sperm whales, with their highly developed neuroanatomical features, cognitive abilities, social structures, and discrete click-based encoding make for an excellent model for advanced tools that can be applied to other animals in the future. We outline the key elements required for the collection and processing of massive datasets, detecting basic communication units and language-like higher-level structures, and validating models through interactive playback experiments. The technological capabilities developed by such an undertaking hold potential for cross-applications in broader communities investigating non-human communication and behavioral research.

Journal article

Cosmo L, Minello G, Bronstein M, Rodola E, Rossi L, Torsello A et al., 2022, 3D Shape Analysis Through a Quantum Lens: the Average Mixing Kernel Signature, International Journal of Computer Vision, Vol: 130, Pages: 1474-1493, ISSN: 0920-5691

Journal article

Mahdi SS, Nauwelaers N, Joris P, Bouritsas G, Gong S, Walsh S, Shriver MD, Bronstein M, Claes P et al., 2022, Matching 3D Facial Shape to Demographic Properties by Geometric Metric Learning: A Part-Based Approach, IEEE Transactions on Biometrics, Behavior, and Identity Science, Vol: 4, Pages: 163-172

Face recognition is a widely accepted biometric identifier, as the face contains a lot of information about the identity of a person. The goal of this study is to match the 3D face of an individual to a set of demographic properties (sex, age, BMI, and genomic background) that are extracted from unidentified genetic material. We introduce a triplet loss metric learner that compresses facial shape into a lower dimensional embedding while preserving information about the property of interest. The metric learner is trained for multiple facial segments to allow a global-to-local part-based analysis of the face. To learn directly from 3D mesh data, spiral convolutions are used along with a novel mesh-sampling scheme, which retains uniformly sampled points at different resolutions. The capacity of the model for establishing identity from facial shape against a list of probe demographics is evaluated by enrolling the embeddings for all properties into a support vector machine classifier or regressor and then combining them using a naive Bayes score fuser. Results obtained by a 10-fold cross-validation for biometric verification and identification show that part-based learning significantly improves the system's performance for both encoding with our geometric metric learner or with principal component analysis.

Journal article

Zafeiriou S, Bronstein M, Cohen T, Vinyals O, Song L, Leskovec J, Lio P, Bruna J, Gori M et al., 2022, Guest Editorial: Non-Euclidean Machine Learning, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol: 44, Pages: 723-726, ISSN: 0162-8828

Journal article

Cretu A-M, Monti F, Maronne S, Dong X, Bronstein M, de Montjoye Y et al., 2022, Interaction data are identifiable even across long periods of time, Nature Communications, Vol: 13, Pages: 1-11, ISSN: 2041-1723

Fine-grained records of people’s interactions, both offline and online, are collected at large scale. These data contain sensitive information about whom we meet, talk to, and when. We demonstrate here how people’s interaction behavior is stable over long periods of time and can be used to identify individuals in anonymous datasets. Our attack learns the profile of an individual using geometric deep learning and triplet loss optimization. In a mobile phone metadata dataset of more than 40k people, it correctly identifies 52% of individuals based on their 2-hop interaction graph. We further show that the profiles learned by our method are stable over time and that 24% of people are still identifiable after 20 weeks. Our results suggest that people with well-balanced interaction graphs are more identifiable. Applying our attack to Bluetooth close-proximity networks, we show that even 1-hop interaction graphs are enough to identify people more than 26% of the time. Our results provide strong evidence that disconnected and even re-pseudonymized interaction data can be linked together, making them personal data under the European Union’s General Data Protection Regulation.

Journal article

Bouritsas G, Frasca F, Zafeiriou SP, Bronstein M et al., 2022, Improving Graph Neural Network Expressivity via Subgraph Isomorphism Counting, IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN: 0162-8828

While Graph Neural Networks (GNNs) have achieved remarkable results in a variety of applications, recent studies exposed important shortcomings in their ability to capture the structure of the underlying graph. It has been shown that the expressive power of standard GNNs is bounded by the Weisfeiler-Leman (WL) graph isomorphism test, from which they inherit proven limitations such as the inability to detect and count graph substructures. On the other hand, there is significant empirical evidence, e.g. in network science and bioinformatics, that substructures are often intimately related to downstream tasks. To this end, we propose Graph Substructure Networks (GSN), a topologically-aware message passing scheme based on substructure encoding. We theoretically analyse the expressive power of our architecture, showing that it is strictly more expressive than the WL test, and provide sufficient conditions for universality. Importantly, we do not attempt to adhere to the WL hierarchy; this allows us to retain multiple attractive properties of standard GNNs such as locality and linear network complexity, while being able to disambiguate even hard instances of graph isomorphism. We perform an extensive experimental evaluation on graph classification and regression tasks and obtain state-of-the-art results in diverse real-world settings including molecular graphs and social networks.

Journal article

Mahdi SS, Matthews H, Nauwelaers N, Vanneste M, Gong S, Bouritsas G, Baynam GS, Hammond P, Spritz R, Klein OD, Hallgrimsson B, Peeters H, Bronstein M, Claes P et al., 2022, Multi-Scale Part-Based Syndrome Classification of 3D Facial Images, IEEE Access, Vol: 10, Pages: 23450-23462, ISSN: 2169-3536

Journal article

Kazi A, Cosmo L, Ahmadi SA, Navab N, Bronstein M et al., 2022, Differentiable Graph Module (DGM) for Graph Convolutional Networks, IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN: 0162-8828

Graph deep learning has recently emerged as a powerful ML concept allowing to generalize successful deep neural architectures to non-Euclidean structured data. One of the limitations of the majority of current graph neural network architectures is that they are often restricted to the transductive setting and rely on the assumption that the underlying graph is known and fixed. Often, this assumption is not true since the graph may be noisy, or partially and even completely unknown. In such cases, it would be helpful to infer the graph directly from the data, especially in inductive settings where some nodes were not present in the graph at training time. Furthermore, learning a graph may become an end in itself, as the inferred structure may provide complementary insights next to the downstream task. In this paper, we introduce Differentiable Graph Module (DGM), a learnable function that predicts edge probabilities in the graph which are optimal for the downstream task. DGM can be combined with convolutional graph neural network layers and trained in an end-to-end fashion. We provide an extensive evaluation on applications in healthcare, brain imaging, computer graphics, and computer vision showing a significant improvement over baselines both in transductive and inductive settings.

Journal article

Gaudelet T, Day B, Jamasb AR, Soman J, Regep C, Liu G, Hayter JBR, Vickers R, Roberts C, Tang J, Roblin D, Blundell TL, Bronstein MM, Taylor-King JP et al., 2021, Utilizing graph machine learning within drug discovery and development, Brief Bioinform, Vol: 22

Graph machine learning (GML) is receiving growing interest within the pharmaceutical and biotechnology industries for its ability to model biomolecular structures, the functional relationships between them, and integrate multi-omic datasets - amongst other data types. Herein, we present a multidisciplinary academic-industrial review of the topic within the context of drug discovery and development. After introducing key terms and modelling approaches, we move chronologically through the drug development pipeline to identify and summarize work incorporating: target identification, design of small molecules and biologics, and drug repurposing. Whilst the field is still emerging, key milestones including repurposed drugs entering in vivo studies, suggest GML will become a modelling framework of choice within biomedical machine learning.

Journal article

Anelli VW, Kalloori S, Ferwerda B, Belli L, Tejani A, Portman F, Lung-Yut-Fong A, Chamberlain B, Xie Y, Hunt J, Bronstein M, Shi W et al., 2021, RecSys 2021 challenge workshop: Fairness-aware engagement prediction at scale on Twitter's Home Timeline, Pages: 819-824

The workshop features presentations of accepted contributions to the RecSys Challenge 2021, organized by Politecnico di Bari, ETH Zürich, Jönköping University, and the data set is provided by Twitter. The challenge focuses on a real-world task of tweet engagement prediction in a dynamic environment. For 2021, the challenge considers four different engagement types: Likes, Retweet, Quote, and replies. This year's challenge brings the problem even closer to Twitter's real recommender systems by introducing latency constraints. We also increase the data size to encourage novel methods. Also, the data density is increased in terms of the graph where users are considered to be nodes and interactions as edges. The goal is twofold: to predict the probability of different engagement types of a target user for a set of Tweets based on heterogeneous input data while providing fair recommendations. In fact, multi-goal optimization considering accuracy and fairness is particularly challenging. However, we believed that the recommendation community was nowadays mature enough to face the challenge of providing accurate and, at the same time, fair recommendations. To this end, Twitter has released a public dataset of close to 1 billion data points, > 40 million each day over 28 days. Week 1-3 will be used for training and week 4 for evaluation and testing. Each datapoint contains the tweet along with engagement features, user features, and tweet features. A peculiarity of this challenge is related to keeping the dataset updated with the platform: if a user deletes a Tweet, or their data from Twitter, the dataset is promptly updated. Moreover, each change in the dataset implied new evaluations of all submissions and the update of the leaderboard metrics. The challenge was well received with 578 registered users, and 386 submissions.

Conference paper

Nauwelaers N, Matthews H, Fan Y, Croquet B, Hoskens H, Mahdi S, El Sergani A, Gong S, Xu T, Bronstein M, Marazita M, Weinberg S, Claes P et al., 2021, Exploring palatal and dental shape variation with 3D shape analysis and geometric deep learning, Orthodontics & Craniofacial Research, Vol: 24, Pages: 134-143, ISSN: 1601-6335

Journal article

Croquet B, Matthews H, Mertens J, Fan Y, Nauwelaers N, Mahdi S, Hoskens H, El Sergani A, Xu T, Vandermeulen D, Bronstein M, Marazita M, Weinberg S, Claes P et al., 2021, Automated landmarking for palatal shape analysis using geometric deep learning, Orthodontics & Craniofacial Research, Vol: 24, Pages: 144-152, ISSN: 1601-6335

Journal article

Bahri M, O'Sullivan E, Gong S, Liu F, Liu X, Bronstein MM, Zafeiriou S et al., 2021, Shape My Face: Registering 3D Face Scans by Surface-to-Surface Translation, International Journal of Computer Vision, Vol: 129, Pages: 2680-2713, ISSN: 0920-5691

Journal article

Gonzalez G, Gong S, Laponogov I, Bronstein M, Veselkov K et al., 2021, Predicting anticancer hyperfoods with graph convolutional networks, Human Genomics, Vol: 15, ISSN: 1479-7364

Background: Recent efforts in the field of nutritional science have allowed the discovery of disease-beating molecules within foods based on the commonality of bioactive food molecules to FDA-approved drugs. The pioneering work in this field used an unsupervised network propagation algorithm to learn the systemic-wide effect on the human interactome of 1962 FDA-approved drugs and a supervised algorithm to predict anticancer therapeutics using the learned representations. Then, a set of bioactive molecules within foods was fed into the model, which predicted molecules with cancer-beating potential. The employed methodology consisted of disjoint unsupervised feature generation and classification tasks, which can result in sub-optimal learned drug representations with respect to the classification task. Additionally, due to the disjoint nature of the tasks, the employed approach proved cumbersome to optimize, requiring testing of thousands of hyperparameter combinations and significant computational resources. To overcome the technical limitations highlighted above, we represent each drug as a graph (human interactome) with its targets as binary node features on the graph and formulate the problem as a graph classification task. To solve this task, inspired by the success of graph neural networks in graph classification problems, we use an end-to-end graph neural network model operating directly on the graphs, which learns drug representations to optimize model performance in the prediction of anticancer therapeutics. Results: The proposed model outperforms the baseline approach in the anticancer therapeutic prediction task, achieving an F1 score of 67.99%±2.52% and an AUPR of 73.91%±3.49%. It is also shown that the model is able to capture knowledge of biological pathways to predict anticancer molecules based on the molecules’ effects on cancer-related pathways. Conclusions: We introduce an end-to-end graph convolutional model to predict cancer-beating mo

Journal article

Maggioli F, Melzi S, Ovsjanikov M, Bronstein MM, Rodola E et al., 2021, Orthogonalized Fourier Polynomials for Signal Approximation and Transfer, Computer Graphics Forum, Vol: 40, Pages: 435-447, ISSN: 0167-7055

Journal article

Schonsheck SC, Bronstein MM, Lai R, 2021, Nonisometric Surface Registration via Conformal Laplace–Beltrami Basis Pursuit, Journal of Scientific Computing, Vol: 86, ISSN: 0885-7474

Surface registration is one of the most fundamental problems in geometry processing. Many approaches have been developed to tackle this problem in cases where the surfaces are nearly isometric. However, it is much more challenging to compute correspondence between surfaces which are intrinsically less similar. In this paper, we propose a variational model to align the Laplace-Beltrami (LB) eigensystems of two non-isometric genus zero shapes via conformal deformations. This method enables us to compute geometrically meaningful point-to-point maps between non-isometric shapes. Our model is based on a novel basis pursuit scheme whereby we simultaneously compute a conformal deformation of a ’target shape’ and its deformed LB eigensystem. We solve the model using a proximal alternating minimization algorithm hybridized with the augmented Lagrangian method which produces accurate correspondences given only a few landmark points. We also propose a re-initialization scheme to overcome some of the difficulties caused by the non-convexity of the variational problem. Intensive numerical experiments illustrate the effectiveness and robustness of the proposed method to handle non-isometric surfaces with large deformation with respect to both noises on the underlying manifolds and errors within the given landmarks or feature functions.

Journal article

Laponogov I, Gonzalez G, Shepherd M, Qureshi A, Veselkov D, Charkoftaki G, Vasiliou V, Youssef J, Mirnezami R, Bronstein M, Veselkov K et al., 2021, Network machine learning maps phytochemically rich "Hyperfoods" to fight COVID-19, Human Genomics, Vol: 15, Pages: 1-1, ISSN: 1479-7364

In this paper, we introduce a network machine learning method to identify potential bioactive anti-COVID-19 molecules in foods based on their capacity to target the SARS-CoV-2-host gene-gene (protein-protein) interactome. Our analyses were performed using a supercomputing DreamLab App platform, harnessing the idle computational power of thousands of smartphones. Machine learning models were initially calibrated by demonstrating that the proposed method can predict anti-COVID-19 candidates among experimental and clinically approved drugs (5658 in total) targeting COVID-19 interactomics with the balanced classification accuracy of 80-85% in 5-fold cross-validated settings. This identified the most promising drug candidates that can be potentially "repurposed" against COVID-19 including common drugs used to combat cardiovascular and metabolic disorders, such as simvastatin, atorvastatin and metformin. A database of 7694 bioactive food-based molecules was run through the calibrated machine learning algorithm, which identified 52 biologically active molecules, from varied chemical classes, including flavonoids, terpenoids, coumarins and indoles predicted to target SARS-CoV-2-host interactome networks. This in turn was used to construct a "food map" with the theoretical anti-COVID-19 potential of each ingredient estimated based on the diversity and relative levels of candidate compounds with antiviral properties. We expect this in silico predicted food map to play an important role in future clinical studies of precision nutrition interventions against COVID-19 and other viral diseases.

Journal article

Bodnar C, Frasca F, Otter N, Wang YG, Liò P, Montúfar G, Bronstein M et al., 2021, Weisfeiler and Lehman Go Cellular: CW Networks, Pages: 2625-2640, ISSN: 1049-5258

Graph Neural Networks (GNNs) are limited in their expressive power, struggle with long-range interactions and lack a principled way to model higher-order structures. These problems can be attributed to the strong coupling between the computational graph and the input graph structure. The recently proposed Message Passing Simplicial Networks naturally decouple these elements by performing message passing on the clique complex of the graph. Nevertheless, these models can be severely constrained by the rigid combinatorial structure of Simplicial Complexes (SCs). In this work, we extend recent theoretical results on SCs to regular Cell Complexes, topological objects that flexibly subsume SCs and graphs. We show that this generalisation provides a powerful set of graph “lifting” transformations, each leading to a unique hierarchical message passing procedure. The resulting methods, which we collectively call CW Networks (CWNs), are strictly more powerful than the WL test and not less powerful than the 3-WL test. In particular, we demonstrate the effectiveness of one such scheme, based on rings, when applied to molecular graph problems. The proposed architecture benefits from provably larger expressivity than commonly used GNNs, principled modelling of higher-order signals and from compressing the distances between nodes. We demonstrate that our model achieves state-of-the-art results on a variety of molecular datasets.

Conference paper

Bouritsas G, Loukas A, Karalias N, Bronstein MM et al., 2021, Partition and Code: Learning how to compress graphs, Pages: 18603-18619, ISSN: 1049-5258

Can we use machine learning to compress graph data? The absence of ordering in graphs poses a significant challenge to conventional compression algorithms, limiting their attainable gains as well as their ability to discover relevant patterns. On the other hand, most graph compression approaches rely on domain-dependent handcrafted representations and cannot adapt to different underlying graph distributions. This work aims to establish the necessary principles a lossless graph compression method should follow to approach the entropy storage lower bound. Instead of making rigid assumptions about the graph distribution, we formulate the compressor as a probabilistic model that can be learned from data and generalise to unseen instances. Our "Partition and Code" framework entails three steps: first, a partitioning algorithm decomposes the graph into subgraphs, then these are mapped to the elements of a small dictionary on which we learn a probability distribution, and finally, an entropy encoder translates the representation into bits. All the components (partitioning, dictionary and distribution) are parametric and can be trained with gradient descent. We theoretically compare the compression quality of several graph encodings and prove, under mild conditions, that PnC achieves compression gains that grow either linearly or quadratically with the number of vertices. Empirically, PnC yields significant compression improvements on diverse real-world networks.

Conference paper

Chamberlain BP, Rowbottom J, Eynard D, Di Giovanni F, Dong X, Bronstein MM et al., 2021, Beltrami Flow and Neural Diffusion on Graphs, Pages: 1594-1609, ISSN: 1049-5258

We propose a novel class of graph neural networks based on the discretised Beltrami flow, a non-Euclidean diffusion PDE. In our model, node features are supplemented with positional encodings derived from the graph topology and jointly evolved by the Beltrami flow, producing simultaneously continuous feature learning and topology evolution. The resulting model generalises many popular graph neural networks and achieves state-of-the-art results on several benchmarks.

Conference paper

Levie R, Huang W, Bucci L, Bronstein M, Kutyniok G et al., 2021, Transferability of spectral graph convolutional neural networks, Journal of Machine Learning Research, Vol: 22, ISSN: 1532-4435

This paper focuses on spectral graph convolutional neural networks (ConvNets), where filters are defined as elementwise multiplication in the frequency domain of a graph. In machine learning settings where the data set consists of signals defined on many different graphs, the trained ConvNet should generalize to signals on graphs unseen in the training set. It is thus important to transfer ConvNets between graphs. Transferability, which is a certain type of generalization capability, can be loosely defined as follows: if two graphs describe the same phenomenon, then a single filter or ConvNet should have similar repercussions on both graphs. This paper aims at debunking the common misconception that spectral filters are not transferable. We show that if two graphs discretize the same “continuous” space, then a spectral filter or ConvNet has approximately the same repercussion on both graphs. Our analysis is more permissive than the standard analysis. Transferability is typically described as the robustness of the filter to small graph perturbations and re-indexing of the vertices. Our analysis accounts also for large graph perturbations. We prove transferability between graphs that can have completely different dimensions and topologies, only requiring that both graphs discretize the same underlying space in some generic sense.

Journal article

Croquet B, Christiaens D, Weinberg SM, Bronstein M, Vandermeulen D, Claes P et al., 2021, Unsupervised Diffeomorphic Surface Registration and Non-linear Modelling, International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), Publisher: Springer International Publishing AG, Pages: 118-128, ISSN: 0302-9743

Conference paper

Thonet T, Clinchant S, Lassance C, Isufi E, Ma J, Xie Y, Renders J-M, Bronstein M et al., 2021, GReS: Workshop on Graph Neural Networks for Recommendation and Search, 15th ACM Conference on Recommender Systems (RecSys), Publisher: Association for Computing Machinery, Pages: 780-782

Conference paper

Sverrisson F, Feydy J, Correia BE, Bronstein MM et al., 2021, Fast end-to-end learning on protein surfaces, Pages: 15267-15276, ISSN: 1063-6919

Proteins' biological functions are defined by the geometric and chemical structure of their 3D molecular surfaces. Recent works have shown that geometric deep learning can be used on mesh-based representations of proteins to identify potential functional sites, such as binding targets for potential drugs. Unfortunately though, the use of meshes as the underlying representation for protein structure has multiple drawbacks including the need to pre-compute the input features and mesh connectivities. This becomes a bottleneck for many important tasks in protein science. In this paper, we present a new framework for deep learning on protein structures that addresses these limitations. Among the key advantages of our method are the computation and sampling of the molecular surface on-the-fly from the underlying atomic point cloud and a novel efficient geometric convolutional layer. As a result, we are able to process large collections of proteins in an end-to-end fashion, taking as the sole input the raw 3D coordinates and chemical types of their atoms, eliminating the need for any hand-crafted pre-computed features. To showcase the performance of our approach, we test it on two tasks in the field of protein structural bioinformatics: the identification of interaction sites and the prediction of protein-protein interactions. On both tasks, we achieve state-of-the-art performance with much faster run times and fewer parameters than previous models. These results will considerably ease the deployment of deep learning methods in protein science and open the door for end-to-end differentiable approaches in protein modeling tasks such as function prediction and design.

Conference paper

Mahdi SS, Nauwelaers N, Joris P, Bouritsas G, Gong S, Bokhnyak S, Walsh S, Shriver MD, Bronstein M, Claes P et al., 2021, 3D Facial Matching by Spiral Convolutional Metric Learning and a Biometric Fusion-Net of Demographic Properties, 25th International Conference on Pattern Recognition (ICPR), Publisher: IEEE Computer Society, Pages: 1757-1764, ISSN: 1051-4651

Conference paper

Bronstein MV, Pennycook G, Buonomano L, Cannon TD et al., 2021, Belief in fake news, responsiveness to cognitive conflict, and analytic reasoning engagement, Thinking and Reasoning, Vol: 27, Pages: 510-535, ISSN: 1354-6783

Analytic and intuitive reasoning processes have been implicated as important determinants of belief in (or skepticism of) fake news. However, the underlying cognitive mechanisms that encourage endorsement of fake news remain unclear. The present study investigated cognitive decoupling/response inhibition and the potential role of conflict processing in the initiation of analytic thought about fake news as factors that may facilitate skepticism. A base-rate task was used to test the hypotheses that conflict processing deficits and inefficient response inhibition would be related to stronger endorsement of fake news. In support of these hypotheses, increased belief in fake (but not real) news was associated with a smaller decrease in response confidence in the presence (vs. absence) of conflict and with inefficient (in terms of response latency) inhibition of prepotent responses. Through its support for these hypotheses, the present study advances efforts to determine who will fall for fake news, and why.

Journal article

Dong X, Thanou D, Toni L, Bronstein M, Frossard P et al., 2020, Graph Signal Processing for Machine Learning: A Review and New Perspectives, IEEE Signal Processing Magazine, Vol: 37, Pages: 117-127, ISSN: 1053-5888

Journal article

Zabatani A, Surazhsky V, Sperling E, Ben Moshe S, Menashe O, Silver DH, Karni Z, Bronstein AM, Bronstein MM, Kimmel R et al., 2020, Intel® RealSense™ SR300 Coded Light Depth Camera, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol: 42, Pages: 2333-2345, ISSN: 0162-8828

Journal article

Chamberlain BP, Rossi E, Shiebler D, Sedhain S, Bronstein MM et al., 2020, Tuning Word2vec for Large Scale Recommendation Systems, Pages: 732-737

Word2vec is a powerful machine learning tool that emerged from Natural Language Processing (NLP) and is now applied in multiple domains, including recommender systems, forecasting, and network analysis. As Word2vec is often used off the shelf, we address the question of whether the default hyperparameters are suitable for recommender systems. The answer is emphatically no. In this paper, we first elucidate the importance of hyperparameter optimization and show that unconstrained optimization yields an average 221% improvement in hit rate over the default parameters. However, unconstrained optimization leads to hyperparameter settings that are very expensive and not feasible for large scale recommendation tasks. To this end, we demonstrate 138% average improvement in hit rate with a runtime budget-constrained hyperparameter optimization. Furthermore, to make hyperparameter optimization applicable for large scale recommendation problems where the target dataset is too large to search over, we investigate generalizing hyperparameters settings from samples. We show that applying constrained hyperparameter optimization using only a 10% sample of the data still yields a 91% average improvement in hit rate over the default parameters when applied to the full datasets. Finally, we apply hyperparameters learned using our method of constrained optimization on a sample to the Who To Follow recommendation service at Twitter and are able to increase follow rates by 15%.

Conference paper

Anelli VW, Delic A, Sottocornola G, Smith J, Andrade N, Belli L, Bronstein M, Gupta A, Ira Ktena S, Lung-Yut-Fong A, Portman F, Tejani A, Xie Y, Zhu X, Shi W et al., 2020, RecSys 2020 Challenge Workshop: Engagement Prediction on Twitter's Home Timeline, Pages: 623-627

The workshop features presentations of accepted contributions to the RecSys Challenge 2020, organized by Politecnico di Bari, Free University of Bozen-Bolzano, TU Wien, University of Colorado, Boulder, and Universidade Federal de Campina Grande, and sponsored by Twitter. The challenge focuses on a real-world task of Tweet engagement prediction in a dynamic environment. The goal is to predict the probability for different types of engagement (Like, Reply, Retweet, and Retweet with comment) of a target user for a set of Tweets, based on heterogeneous input data. To this end, Twitter has released a large public dataset of ~160M public Tweets, obtained by subsampling within ~2 weeks, that contains engagement features, user features, and Tweet features. A peculiarity of this challenge is related to the recent regulations on data protection and privacy. The challenge data set was compliant: if a user deleted a Tweet, or their data from Twitter, the dataset was promptly updated. Moreover, each change in the dataset implied new evaluations of all submissions and the update of the leaderboard metrics. The challenge was well received with 1,131 registered users. In the final phase, 20 teams were competing for the winning position. These teams had an average size of approximately 4 participants and developed an overall number of 127 different methods.

Conference paper

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
