Taylor-King JP, Bronstein M, Roblin D, 2024, The Future of Machine Learning Within Target Identification: Causality, Reversibility, and Druggability, Clinical Pharmacology and Therapeutics, ISSN: 0009-9236
Khakzad H, Igashov I, Schneuing A, et al., 2023, A new age in protein design empowered by deep learning., Cell Syst, Vol: 14, Pages: 925-939
The rapid progress in the field of deep learning has had a significant impact on protein design. Deep learning methods have recently produced a breakthrough in protein structure prediction, making high-quality models available for millions of proteins. Along with novel architectures for generative modeling and sequence analysis, these methods have transformed the protein design field in the past few years, improving both the accuracy of designs and the ability to identify novel protein sequences and structures. Deep neural networks can now learn and extract the fundamental features of protein structures, predict how proteins interact with other biomolecules, and may ultimately enable the creation of new, effective drugs for treating disease. As their applicability in protein design rapidly grows, we review recent developments in deep learning methods and provide examples of their performance in generating novel functional proteins.
Bertin P, Rector-Brooks J, Sharma D, et al., 2023, RECOVER identifies synergistic drug combinations in vitro through sequential model optimization., Cell Rep Methods, Vol: 3
For large libraries of small molecules, exhaustive combinatorial chemical screens become infeasible to perform when considering a range of disease models, assay conditions, and dose ranges. Deep learning models have achieved state-of-the-art results in silico for the prediction of synergy scores. However, databases of drug combinations are biased toward synergistic agents and results do not generalize out of distribution. During 5 rounds of experimentation, we employ sequential model optimization with a deep learning model to select drug combinations increasingly enriched for synergism and active against a cancer cell line-evaluating only ∼5% of the total search space. Moreover, we find that learned drug embeddings (using structural information) begin to reflect biological mechanisms. In silico benchmarking suggests search queries are ∼5-10× enriched for highly synergistic drug combinations by using sequential rounds of evaluation when compared with random selection or ∼3× when using a pretrained model.
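The sequential model optimization loop the abstract describes can be sketched generically; this is an illustrative toy with a linear surrogate standing in for the RECOVER deep model and hypothetical function names, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)

def sequential_optimization(score_fn, pool, n_rounds=5, batch=10):
    """Sequential model optimization (illustrative sketch): fit a surrogate
    on the candidates tested so far, then query the untested candidates it
    ranks highest, and repeat for a few rounds."""
    tested = list(rng.choice(len(pool), size=batch, replace=False))
    labels = {int(i): score_fn(pool[i]) for i in tested}
    for _ in range(n_rounds):
        x = pool[tested]
        y = np.array([labels[int(i)] for i in tested])
        w, *_ = np.linalg.lstsq(x, y, rcond=None)   # linear surrogate model
        preds = pool @ w
        preds[tested] = -np.inf                     # mask already-tested candidates
        new = np.argsort(preds)[-batch:]            # acquire the top predictions
        for i in new:
            labels[int(i)] = score_fn(pool[i])
        tested.extend(int(i) for i in new)
    return max(labels, key=labels.get)              # best candidate found
```

With 5 rounds of 10 queries on a pool of 200 candidates, the loop evaluates only 30% of the search space, mirroring at small scale the ∼5% figure reported above.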
Veselkov K, Southern J, Gonzalez Pigorini G, et al., 2023, Genomic-driven nutritional interventions for radiotherapy-resistant rectal cancer patient, Scientific Reports, Vol: 13, Pages: 1-9, ISSN: 2045-2322
Radiotherapy response of rectal cancer patients is dependent on a myriad of molecular mechanisms including response to stress, cell death, and cell metabolism. Modulation of lipid metabolism emerges as a unique strategy to improve radiotherapy outcomes due to its accessibility by bioactive molecules within foods. Even though a few radioresponse modulators have been identified using experimental techniques, trying to experimentally identify all potential modulators is intractable. Here we introduce a machine learning (ML) approach to interrogate the space of bioactive molecules within food for potential modulators of radiotherapy response and provide phytochemically-enriched recipes that encapsulate the benefits of discovered radiotherapy modulators. Potential radioresponse modulators were identified using a genomic-driven network ML approach, metric learning and domain knowledge. Then, recipes from the Recipe1M database were optimized to provide ingredient substitutions maximizing the number of predicted modulators whilst preserving the recipe’s culinary attributes. This work provides a pipeline for the design of genomic-driven nutritional interventions to improve outcomes of rectal cancer patients undergoing radiotherapy.
Zaripova K, Cosmo L, Kazi A, et al., 2023, Graph-in-Graph (GiG): Learning interpretable latent graphs in non-Euclidean domain for biological and healthcare applications., Med Image Anal, Vol: 88
Graphs are a powerful tool for representing and analyzing unstructured, non-Euclidean data ubiquitous in the healthcare domain. Two prominent examples are molecule property prediction and brain connectome analysis. Importantly, recent works have shown that considering relationships between input data samples has a positive regularizing effect on the downstream task in healthcare applications. These relationships are naturally modeled by a (possibly unknown) graph structure between input samples. In this work, we propose Graph-in-Graph (GiG), a neural network architecture for protein classification and brain imaging applications that exploits the graph representation of the input data samples and their latent relation. We assume an initially unknown latent-graph structure between graph-valued input data and propose to learn a parametric model for message passing within and across input graph samples, end-to-end along with the latent structure connecting the input graphs. Further, we introduce a Node Degree Distribution Loss (NDDL) that regularizes the predicted latent relationship structure. This regularization can significantly improve performance on the downstream task. Moreover, the obtained latent graph can represent patient population models or networks of molecule clusters, providing a level of interpretability and knowledge discovery in the input domain, which is of particular value in healthcare.
Rutz C, Bronstein M, Raskin A, et al., 2023, Using machine learning to decode animal communication., Science, Vol: 381, Pages: 152-155
New methods promise transformative insights and conservation benefits.
Mutti M, De Santi R, Rossi E, et al., 2023, Provably Efficient Causal Model-Based Reinforcement Learning for Systematic Generalization, Pages: 9251-9259
In the sequential decision making setting, an agent aims to achieve systematic generalization over a large, possibly infinite, set of environments. Such environments are modeled as discrete Markov decision processes with both states and actions represented through a feature vector. The underlying structure of the environments allows the transition dynamics to be factored into two components: one that is environment-specific and another that is shared. Consider, as an example, a set of environments that share the laws of motion. In this setting, the agent can collect a finite number of reward-free interactions from a subset of these environments. The agent must then be able to approximately solve any planning task defined over any environment in the original set, relying on the above interactions only. Can we design a provably efficient algorithm that achieves this ambitious goal of systematic generalization? In this paper, we give a partially positive answer to this question. First, we provide a tractable formulation of systematic generalization by employing a causal viewpoint. Then, under specific structural assumptions, we provide a simple learning algorithm that guarantees any desired planning error up to an unavoidable sub-optimality term, while showcasing a polynomial sample complexity.
Gainza P, Wehrle S, Van Hall-Beauvais A, et al., 2023, De novo design of protein interactions with learned surface fingerprints, NATURE, ISSN: 0028-0836
Kazi A, Cosmo L, Ahmadi S-A, et al., 2023, Differentiable Graph Module (DGM) for Graph Convolutional Networks., IEEE Trans Pattern Anal Mach Intell, Vol: 45, Pages: 1606-1617
Graph deep learning has recently emerged as a powerful ML paradigm for generalizing successful deep neural architectures to non-Euclidean structured data. Such methods have shown promising results on a broad spectrum of applications ranging from social science, biomedicine, and particle physics to computer vision, graphics, and chemistry. One of the limitations of the majority of current graph neural network architectures is that they are often restricted to the transductive setting and rely on the assumption that the underlying graph is known and fixed. Often, this assumption does not hold, since the graph may be noisy, partially observed, or even completely unknown. In such cases, it would be helpful to infer the graph directly from the data, especially in inductive settings where some nodes were not present in the graph at training time. Furthermore, learning a graph may become an end in itself, as the inferred structure may provide complementary insights alongside the downstream task. In this paper, we introduce the Differentiable Graph Module (DGM), a learnable function that predicts edge probabilities in the graph that are optimal for the downstream task. DGM can be combined with convolutional graph neural network layers and trained in an end-to-end fashion. We provide an extensive evaluation of applications from the domains of healthcare (disease prediction), brain imaging (age prediction), computer graphics (3D point cloud segmentation), and computer vision (zero-shot learning). We show that our model provides a significant improvement over baselines both in transductive and inductive settings and achieves state-of-the-art results.
Sripathmanathan B, Dong X, Bronstein M, 2023, On the Impact of Sample Size in Reconstructing Graph Signals
Reconstructing a signal on a graph from observations on a subset of the vertices is a fundamental problem in the field of graph signal processing. It is often assumed that adding additional observations to an observation set will reduce the expected reconstruction error. We show that under the setting of noisy observation and least-squares reconstruction this is not always the case, characterising the behaviour both theoretically and experimentally.
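The sampling-and-reconstruction setting analysed above can be made concrete with a small sketch, assuming the standard k-bandlimited signal model (the function name is illustrative, not from the paper):

```python
import numpy as np

def ls_reconstruct(laplacian, sample_idx, y, k):
    """Least-squares reconstruction of a k-bandlimited graph signal from
    samples y observed on the vertices in sample_idx (illustrative sketch)."""
    _, u = np.linalg.eigh(laplacian)        # eigenvectors, low frequency first
    u_k = u[:, :k]                          # bandlimited (low-frequency) basis
    coeffs, *_ = np.linalg.lstsq(u_k[sample_idx], y, rcond=None)
    return u_k @ coeffs                     # signal estimate on every vertex
```

Under noisy observations, the expected error of this estimator depends on the spectral properties of the sampled rows of the basis, which is why, as the paper shows, adding an observation need not reduce it.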
Di Giovanni F, Giusti L, Barbero F, et al., 2023, On Over-Squashing in Message Passing Neural Networks: The Impact of Width, Depth, and Topology, Pages: 7865-7885
Message Passing Neural Networks (MPNNs) are instances of Graph Neural Networks that leverage the graph to send messages over the edges. This inductive bias leads to a phenomenon known as over-squashing, where a node feature is insensitive to information contained at distant nodes. Despite recent methods introduced to mitigate this issue, an understanding of the causes of over-squashing, and of possible solutions, is still lacking. In this theoretical work, we prove that: (i) neural network width can mitigate over-squashing, but at the cost of making the whole network more sensitive; (ii) conversely, depth cannot help mitigate over-squashing: increasing the number of layers leads to over-squashing being dominated by vanishing gradients; (iii) the graph topology plays the greatest role, since over-squashing occurs between nodes at high commute time. Our analysis provides a unified framework to study different recent methods introduced to cope with over-squashing and serves as a justification for a class of methods that fall under graph rewiring.
Gutteridge B, Dong X, Bronstein M, et al., 2023, DRew: Dynamically Rewired Message Passing with Delay, Pages: 12252-12267
Message passing neural networks (MPNNs) have been shown to suffer from the phenomenon of over-squashing that causes poor performance for tasks relying on long-range interactions. This can be largely attributed to message passing only occurring locally, over a node's immediate neighbours. Rewiring approaches attempting to make graphs 'more connected', and supposedly better suited to long-range tasks, often lose the inductive bias provided by distance on the graph since they make distant nodes communicate instantly at every layer. In this paper we propose a framework, applicable to any MPNN architecture, that performs a layer-dependent rewiring to ensure gradual densification of the graph. We also propose a delay mechanism that permits skip connections between nodes depending on the layer and their mutual distance. We validate our approach on several long-range tasks and show that it outperforms graph Transformers and multi-hop MPNNs.
Bouritsas G, Frasca F, Zafeiriou S, et al., 2023, Improving Graph Neural Network Expressivity via Subgraph Isomorphism Counting, IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, Vol: 45, Pages: 657-668, ISSN: 0162-8828
Eijkelboom F, Bekkers E, Bronstein M, et al., 2023, Can strong structural encoding reduce the importance of Message Passing?, Pages: 278-288
The most prevalent class of neural networks operating on graphs are message passing neural networks (MPNNs), in which the representation of a node is updated iteratively by aggregating information in the 1-hop neighborhood. Since this paradigm for computing node embeddings may prevent the model from learning coarse topological structures, the initial features are often augmented with structural information of the graph, typically in the form of Laplacian eigenvectors or Random Walk transition probabilities. In this work, we explore the contribution of message passing when strong structural encodings are provided. We introduce a novel way of modeling the interaction between feature and structural information based on their tensor product rather than the standard concatenation. The choice of interaction is compared in common scenarios and in settings where the capacity of the message-passing layer is severely reduced and ultimately the message-passing phase is removed altogether. Our results indicate that using tensor-based encodings is always at least on par with the concatenation-based encoding and that it makes the model much more robust when the message passing layers are removed, on some tasks incurring almost no drop in performance. This suggests that the importance of message passing is limited when the model can construct strong structural encodings.
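The tensor-product interaction proposed above, contrasted with plain concatenation, can be sketched as follows (an illustrative simplification of how the two combination schemes differ; the paper's models apply these inside message-passing layers):

```python
import numpy as np

def combine(feat, struct, mode="tensor"):
    """Combine per-node features with structural encodings: either
    concatenate them, or take the flattened tensor (outer) product so
    that every feature-structure pair interacts multiplicatively."""
    if mode == "concat":
        return np.concatenate([feat, struct], axis=-1)
    outer = np.einsum('nd,ns->nds', feat, struct)   # per-node outer product
    return outer.reshape(feat.shape[0], -1)
```

For d-dimensional features and s-dimensional structural encodings, concatenation yields d+s dimensions per node while the tensor product yields d*s, every entry of which mixes one feature with one structural coordinate.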
El-Kishky A, Bronstein M, Xiao Y, et al., 2022, Graph-based Representation Learning for Web-scale Recommender Systems, Pages: 4784-4785
Recommender systems are fundamental building blocks of modern consumer web applications that seek to predict user preferences to better serve relevant items. As such, high-quality user and item representations as inputs to recommender systems are crucial for personalized recommendation. To construct these user and item representations, self-supervised graph embedding has emerged as a principled approach to embed relational data such as user social graphs, user membership graphs, user-item engagements, and other heterogeneous graphs. In this tutorial we discuss different families of approaches to self-supervised graph embedding. Within each family, we outline a variety of techniques, their merits and disadvantages, and expound on latest works. Finally, we demonstrate how to effectively utilize the resultant large embedding tables to improve candidate retrieval and ranking in modern industry-scale deep-learning recommender systems.
Andreas J, Beguš G, Bronstein MM, et al., 2022, Toward understanding the communication in sperm whales, iScience, Vol: 25
Machine learning has been advancing dramatically over the past decade. Most strides have come in human-centered applications, owing to the availability of large-scale datasets; however, opportunities are ripe to apply this technology to more deeply understand non-human communication. We detail a scientific roadmap for advancing the understanding of communication of whales that can be built upon further as a template to decipher other forms of animal and non-human communication. Sperm whales, with their highly developed neuroanatomical features, cognitive abilities, social structures, and discrete click-based encoding, make an excellent model for advanced tools that can be applied to other animals in the future. We outline the key elements required for the collection and processing of massive datasets, detecting basic communication units and language-like higher-level structures, and validating models through interactive playback experiments. The technological capabilities developed by such an undertaking hold potential for cross-applications in broader communities investigating non-human communication and behavioral research.
Cosmo L, Minello G, Bronstein M, et al., 2022, 3D Shape Analysis Through a Quantum Lens: the Average Mixing Kernel Signature, INTERNATIONAL JOURNAL OF COMPUTER VISION, Vol: 130, Pages: 1474-1493, ISSN: 0920-5691
Mahdi SS, Nauwelaers N, Joris P, et al., 2022, Matching 3D Facial Shape to Demographic Properties by Geometric Metric Learning: A Part-Based Approach., IEEE Trans Biom Behav Identity Sci, Vol: 4, Pages: 163-172
Face recognition is a widely accepted biometric identifier, as the face contains a lot of information about the identity of a person. The goal of this study is to match the 3D face of an individual to a set of demographic properties (sex, age, BMI, and genomic background) that are extracted from unidentified genetic material. We introduce a triplet loss metric learner that compresses facial shape into a lower dimensional embedding while preserving information about the property of interest. The metric learner is trained for multiple facial segments to allow a global-to-local part-based analysis of the face. To learn directly from 3D mesh data, spiral convolutions are used along with a novel mesh-sampling scheme, which retains uniformly sampled points at different resolutions. The capacity of the model for establishing identity from facial shape against a list of probe demographics is evaluated by enrolling the embeddings for all properties into a support vector machine classifier or regressor and then combining them using a naive Bayes score fuser. Results obtained by 10-fold cross-validation for biometric verification and identification show that part-based learning significantly improves the system's performance, whether the embeddings are obtained with our geometric metric learner or with principal component analysis.
Zafeiriou S, Bronstein M, Cohen T, et al., 2022, Guest Editorial: Non-Euclidean Machine Learning, IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, Vol: 44, Pages: 723-726, ISSN: 0162-8828
Cretu A-M, Monti F, Marrone S, et al., 2022, Interaction data are identifiable even across long periods of time, Nature Communications, Vol: 13, Pages: 1-11, ISSN: 2041-1723
Fine-grained records of people's interactions, both offline and online, are collected at large scale. These data contain sensitive information about whom we meet, talk to, and when. We demonstrate here how people's interaction behavior is stable over long periods of time and can be used to identify individuals in anonymous datasets. Our attack learns the profile of an individual using geometric deep learning and triplet loss optimization. In a mobile phone metadata dataset of more than 40k people, it correctly identifies 52% of individuals based on their 2-hop interaction graph. We further show that the profiles learned by our method are stable over time and that 24% of people are still identifiable after 20 weeks. Our results suggest that people with well-balanced interaction graphs are more identifiable. Applying our attack to Bluetooth close-proximity networks, we show that even 1-hop interaction graphs are enough to identify people more than 26% of the time. Our results provide strong evidence that disconnected and even re-pseudonymized interaction data can be linked together, making them personal data under the European Union's General Data Protection Regulation.
Ahmadi SA, Kazi A, Papiez B, et al., 2022, Preface GRAIL 2022, ISBN: 9783031210822
Rossi E, Kenlay H, Gorinova MI, et al., 2022, On the Unreasonable Effectiveness of Feature Propagation in Learning on Graphs with Missing Node Features
While Graph Neural Networks (GNNs) have recently become the de facto standard for modeling relational data, they impose a strong assumption on the availability of the node or edge features of the graph. In many real-world applications, however, features are only partially available; for example, in social networks, age and gender are available only for a small subset of users. We present a general approach for handling missing features in graph machine learning applications that is based on minimization of the Dirichlet energy and leads to a diffusion-type differential equation on the graph. The discretization of this equation produces a simple, fast and scalable algorithm which we call Feature Propagation. We experimentally show that the proposed approach outperforms previous methods on seven common node-classification benchmarks and can withstand surprisingly high rates of missing features: on average we observe only around 4% relative accuracy drop when 99% of the features are missing. Moreover, it takes only 10 seconds to run on a graph with ∼2.5M nodes and ∼123M edges on a single GPU. The code is available at https://github.com/twitter-research/feature-propagation.
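The core iteration of Feature Propagation is simple enough to sketch in a few lines (a simplified illustration, not the released implementation; a row-stochastic propagation matrix stands in for the paper's normalization):

```python
import numpy as np

def feature_propagation(adj, x, known_mask, num_iters=40):
    """Diffuse features over the graph, clamping known entries each step
    (simplified sketch; row-stochastic rather than symmetric normalization)."""
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                         # avoid division by zero on isolated nodes
    p = adj / deg                               # row-stochastic propagation matrix
    x_hat = np.where(known_mask, x, 0.0)        # unknown features start at zero
    for _ in range(num_iters):
        x_hat = p @ x_hat                       # one diffusion step
        x_hat = np.where(known_mask, x, x_hat)  # re-impose known (boundary) values
    return x_hat
```

Known features act as Dirichlet boundary values: they are re-imposed after every diffusion step, and the unknown entries converge to an interpolation of their neighbours.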
Bevilacqua B, Frasca F, Lim D, et al., 2022, EQUIVARIANT SUBGRAPH AGGREGATION NETWORKS
Message-passing neural networks (MPNNs) are the leading architecture for deep learning on graph-structured data, in large part due to their simplicity and scalability. Unfortunately, it was shown that these architectures are limited in their expressive power. This paper proposes a novel framework called Equivariant Subgraph Aggregation Networks (ESAN) to address this issue. Our main observation is that while two graphs may not be distinguishable by an MPNN, they often contain distinguishable subgraphs. Thus, we propose to represent each graph as a set of subgraphs derived by some predefined policy, and to process it using a suitable equivariant architecture. We develop novel variants of the 1-dimensional Weisfeiler-Leman (1-WL) test for graph isomorphism, and prove lower bounds on the expressiveness of ESAN in terms of these new WL variants. We further prove that our approach increases the expressive power of both MPNNs and more expressive architectures. Moreover, we provide theoretical results that describe how design choices such as the subgraph selection policy and equivariant neural architecture affect our architecture's expressive power. To deal with the increased computational cost, we propose a subgraph sampling scheme, which can be viewed as a stochastic version of our framework. A comprehensive set of experiments on real and synthetic datasets demonstrates that our framework improves the expressive power and overall performance of popular GNN architectures.
Topping J, Di Giovanni F, Chamberlain BP, et al., 2022, UNDERSTANDING OVER-SQUASHING AND BOTTLENECKS ON GRAPHS VIA CURVATURE
Most graph neural networks (GNNs) use the message passing paradigm, in which node features are propagated on the input graph. Recent works pointed to the distortion of information flowing from distant nodes as a factor limiting the efficiency of message passing for tasks relying on long-distance interactions. This phenomenon, referred to as 'over-squashing', has been heuristically attributed to graph bottlenecks where the number of k-hop neighbors grows rapidly with k. We provide a precise description of the over-squashing phenomenon in GNNs and analyze how it arises from bottlenecks in the graph. For this purpose, we introduce a new edge-based combinatorial curvature and prove that negatively curved edges are responsible for the over-squashing issue. We also propose and experimentally test a curvature-based graph rewiring method to alleviate the over-squashing.
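For intuition about why curvature detects bottlenecks, here is the classical augmented Forman curvature, a simpler relative of the Balanced Forman curvature the paper actually defines (shown only for illustration): bridge-like edges come out negatively curved.

```python
import numpy as np

def forman_curvature(adj):
    """Augmented Forman curvature per edge of an unweighted graph:
    F(i, j) = 4 - deg(i) - deg(j) + 3 * (#triangles through the edge)."""
    deg = adj.sum(axis=1)
    tri = adj @ adj                      # tri[i, j] = common neighbours of i and j
    curv = {}
    for i, j in zip(*np.nonzero(np.triu(adj))):
        curv[(int(i), int(j))] = 4 - deg[i] - deg[j] + 3 * tri[i, j]
    return curv
```

On a "dumbbell" of two stars joined by a bridge, the bridge edge gets curvature -2 while the peripheral edges get 0, matching the intuition above that over-squashing concentrates on negatively curved edges.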
Rossi E, Monti F, Leng Y, et al., 2022, Learning to Infer Structures of Network Games, Pages: 18809-18827
Strategic interactions between a group of individuals or organisations can be modelled as games played on networks, where a player's payoff depends not only on their actions but also on those of their neighbours. Inferring the network structure from observed game outcomes (equilibrium actions) is an important problem with numerous potential applications in economics and social sciences. Existing methods mostly require the knowledge of the utility function associated with the game, which is often unrealistic to obtain in real-world scenarios. We adopt a transformer-like architecture which correctly accounts for the symmetries of the problem and learns a mapping from the equilibrium actions to the network structure of the game without explicit knowledge of the utility function. We test our method on three different types of network games using both synthetic and real-world data, and demonstrate its effectiveness in network structure inference and superior performance over existing methods.
Barbero F, Bodnar C, de Ocáriz Borde HS, et al., 2022, Sheaf Neural Networks with Connection Laplacians, Pages: 28-36
A Sheaf Neural Network (SNN) is a type of Graph Neural Network (GNN) that operates on a sheaf, an object that equips a graph with vector spaces over its nodes and edges and linear maps between these spaces. SNNs have been shown to have useful theoretical properties that help tackle issues arising from heterophily and over-smoothing. One complication intrinsic to these models is finding a good sheaf for the task to be solved. Previous works proposed two diametrically opposed approaches: manually constructing the sheaf based on domain knowledge and learning the sheaf end-to-end using gradient-based methods. However, domain knowledge is often insufficient, while learning a sheaf could lead to overfitting and significant computational overhead. In this work, we propose a novel way of computing sheaves drawing inspiration from Riemannian geometry: we leverage the manifold assumption to compute manifold-and-graph-aware orthogonal maps, which optimally align the tangent spaces of neighbouring data points. We show that this approach achieves promising results with less computational overhead when compared to previous SNN models. Overall, this work provides an interesting connection between algebraic topology and differential geometry, and we hope that it will spark future research in this direction.
Bodnar C, Di Giovanni F, Chamberlain BP, et al., 2022, Neural Sheaf Diffusion: A Topological Perspective on Heterophily and Oversmoothing in GNNs, ISSN: 1049-5258
Cellular sheaves equip graphs with a “geometrical” structure by assigning vector spaces and linear maps to nodes and edges. Graph Neural Networks (GNNs) implicitly assume a graph with a trivial underlying sheaf. This choice is reflected in the structure of the graph Laplacian operator, the properties of the associated diffusion equation, and the characteristics of the convolutional models that discretise this equation. In this paper, we use cellular sheaf theory to show that the underlying geometry of the graph is deeply linked with the performance of GNNs in heterophilic settings and their oversmoothing behaviour. By considering a hierarchy of increasingly general sheaves, we study how the ability of the sheaf diffusion process to achieve linear separation of the classes in the infinite time limit expands. At the same time, we prove that when the sheaf is non-trivial, discretised parametric diffusion processes have greater control than GNNs over their asymptotic behaviour. On the practical side, we study how sheaves can be learned from data. The resulting sheaf diffusion models have many desirable properties that address the limitations of classical graph diffusion equations (and corresponding GNN models) and obtain competitive results in heterophilic settings. Overall, our work provides new connections between GNNs and algebraic topology and would be of interest to both fields.
Rusch TK, Chamberlain BP, Rowbottom J, et al., 2022, Graph-Coupled Oscillator Networks, Pages: 18888-18909
We propose Graph-Coupled Oscillator Networks (GraphCON), a novel framework for deep learning on graphs. It is based on discretizations of a second-order system of ordinary differential equations (ODEs), which model a network of nonlinear controlled and damped oscillators, coupled via the adjacency structure of the underlying graph. The flexibility of our framework permits any basic GNN layer (e.g. convolutional or attentional) as the coupling function, from which a multi-layer deep neural network is built up via the dynamics of the proposed ODEs. We relate the oversmoothing problem, commonly encountered in GNNs, to the stability of steady states of the underlying ODE and show that zero-Dirichlet energy steady states are not stable for our proposed ODEs. This demonstrates that the proposed framework mitigates the oversmoothing problem. Moreover, we prove that GraphCON mitigates the exploding and vanishing gradients problem to facilitate training of deep multi-layer GNNs. Finally, we show that our approach offers competitive performance with respect to the state-of-the-art on a variety of graph-based learning tasks.
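A single GraphCON-style update can be sketched as follows (an assumed explicit discretization of the second-order ODE described above, with a plain graph-convolution coupling; per the paper, any GNN layer could take its place):

```python
import numpy as np

def graphcon_step(x, y, a_norm, w, dt=0.1, gamma=1.0, alpha=1.0):
    """One discretized step of graph-coupled damped oscillators:
    x holds node states ("positions"), y their "velocities"."""
    coupling = np.tanh(a_norm @ x @ w)               # coupling via a simple GNN layer
    y = y + dt * (coupling - gamma * x - alpha * y)  # velocity update (with damping)
    x = x + dt * y                                   # position update
    return x, y
```

Stacking these steps builds the multi-layer network; the oscillatory dynamics keep the node states from collapsing to a zero-Dirichlet-energy steady state, which is the paper's argument against oversmoothing.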
Frasca F, Bevilacqua B, Bronstein MM, et al., 2022, Understanding and Extending Subgraph GNNs by Rethinking Their Symmetries, ISSN: 1049-5258
Subgraph GNNs are a recent class of expressive Graph Neural Networks (GNNs) which model graphs as collections of subgraphs. So far, the design space of possible Subgraph GNN architectures as well as their basic theoretical properties are still largely unexplored. In this paper, we study the most prominent form of subgraph methods, which employs node-based subgraph selection policies such as ego-networks or node marking and deletion. We address two central questions: (1) What is the upper bound of the expressive power of these methods? and (2) What is the family of equivariant message passing layers on these sets of subgraphs? Our first step in answering these questions is a novel symmetry analysis which shows that modelling the symmetries of node-based subgraph collections requires a significantly smaller symmetry group than the one adopted in previous works. This analysis is then used to establish a link between Subgraph GNNs and Invariant Graph Networks (IGNs). We answer the questions above by first bounding the expressive power of subgraph methods by 3-WL, and then proposing a general family of message-passing layers for subgraph methods that generalises all previous node-based Subgraph GNNs. Finally, we design a novel Subgraph GNN dubbed SUN, which theoretically unifies previous architectures while providing better empirical performance on multiple benchmarks.
Mahdi SS, Matthews H, Nauwelaers N, et al., 2022, Multi-Scale Part-Based Syndrome Classification of 3D Facial Images, IEEE ACCESS, Vol: 10, Pages: 23450-23462, ISSN: 2169-3536