Imperial College London

Professor Sophia Yaliraki

Faculty of Natural Sciences, Department of Chemistry

Professor of Theoretical Chemistry
 
 
 

Contact

 

s.yaliraki

 
 

Location

 

Molecular Sciences Research Hub, White City Campus


Summary

 

Publications


71 results found

Wu N, Barahona M, Yaliraki S, 2024, Allosteric communication and signal transduction in proteins, Current Opinion in Structural Biology, ISSN: 0959-440X

Allostery is one of the cornerstones of biological function, as it plays a fundamental role in regulating protein activity. The modelling of allostery has gradually moved from a conformation-based framework, linked to structural changes, to dynamics-based allostery, whereby the effects of ligand binding propagate via signal transduction from the allosteric site to other regions of the protein via inter-residue interactions. Characterising such allosteric signalling pathways, which do not necessarily lead to conformational changes, has been pursued experimentally and complemented by computational analysis of protein networks to detect subtle dynamic propagation paths. Considering allostery from the perspective of signal transduction broadens the understanding of allosteric mechanisms, underscores the importance of protein topology, and can provide insights into allosteric drug design.

Journal article

Greenaway RL, Jelfs KE, Spivey AC, Yaliraki SN et al., 2023, From alchemist to AI chemist, Nature Reviews Chemistry, Vol: 7, Pages: 527-528

Journal article

Sapienza R, Barahona M, Saxena D, alexis A, Yaliraki S et al., 2022, Sensitivity and spectral control of network lasers, Nature Communications, Vol: 13, Pages: 1-7, ISSN: 2041-1723

Recently, random lasing in complex networks has shown efficient lasing over more than 50 localised modes, promoted by multiple scattering over the underlying graph. If controlled, these network lasers can lead to fast-switching multifunctional light sources with synthesised spectrum. Here, we observe both in experiment and theory high sensitivity of the network laser spectrum to the spatial shape of the pump profile, with some modes for example increasing in intensity by 280% when switching off 7% of the pump beam. We solve the nonlinear equations within the steady state ab-initio laser theory (SALT) approximation over a graph and we show selective lasing of around 90% of the strongest intensity modes, effectively programming the spectrum of the lasing networks. In our experiments with polymer networks, this high sensitivity enables control of the lasing spectrum through non-uniform pump patterns. We propose the underlying complexity of the network modes as the key element behind efficient spectral control opening the way for the development of optical devices with wide impact for on-chip photonics for communication, sensing, and computation.

Journal article

Wu N, Yaliraki S, Barahona M, 2022, Prediction of protein allosteric signalling pathways and functional residues through paths of optimised propensity, Journal of Molecular Biology, Vol: 434, Pages: 1-16, ISSN: 0022-2836

Allostery commonly refers to the mechanism that regulates protein activity through the binding of a molecule at a different, usually distal, site from the orthosteric site. The omnipresence of allosteric regulation in nature and its potential for drug design and screening render the study of allostery invaluable. Nevertheless, challenges remain as few computational methods are available to effectively predict allosteric sites, identify signalling pathways involved in allostery, or to aid with the design of suitable molecules targeting such sites. Recently, bond-to-bond propensity analysis has been shown successful at identifying allosteric sites for a large and diverse group of proteins from knowledge of the orthosteric sites and its ligands alone by using network analysis applied to energy-weighted atomistic protein graphs. To address the identification of signalling pathways, we propose here a method to compute and score paths of optimised propensity that link the orthosteric site with the identified allosteric sites, and identifies crucial residues that contribute to those paths. We showcase the approach with three well-studied allosteric proteins: h-Ras, caspase-1, and 3-phosphoinositide-dependent kinase-1 (PDK1). Key residues in both orthosteric and allosteric sites were identified and showed agreement with experimental results, and pivotal signalling residues along the pathway were also revealed, thus providing alternative targets for drug design. By using the computed path scores, we were also able to differentiate the activity of different allosteric modulators.

Journal article

Strömich L, Wu N, Barahona M, Yaliraki S et al., 2022, Allosteric hotspots in the main protease of SARS-CoV-2, Journal of Molecular Biology, Vol: 434, ISSN: 0022-2836

Inhibiting the main protease of SARS-CoV-2 is of great interest in tackling the COVID-19 pandemic caused by the virus. Most efforts have been centred on inhibiting the binding site of the enzyme. However, considering allosteric sites, distant from the active or orthosteric site, broadens the search space for drug candidates and confers the advantages of allosteric drug targeting. Here, we report the allosteric communication pathways in the main protease dimer by using two novel fully atomistic graph-theoretical methods: Bond-to-bond propensity, which has been previously successful in identifying allosteric sites in extensive benchmark data sets without a priori knowledge, and Markov transient analysis, which has previously aided in finding novel drug targets in catalytic protein families. Using statistical bootstrapping, we score the highest ranking sites against random sites at similar distances, and we identify four statistically significant putative allosteric sites as good candidates for alternative drug targeting.

Journal article

Chrysostomou S, Roy R, Prischi F, Thamlikitkul L, Chapman KL, Mufti U, Peach R, Ding L, Hancock D, Moore C, Molina-Arcas M, Mauri F, Pinato DJ, Abrahams JM, Ottaviani S, Castellano L, Giamas G, Pascoe J, Moonamale D, Pirrie S, Gaunt C, Billingham L, Steven NM, Cullen M, Hrouda D, Winkler M, Post J, Cohen P, Salpeter SJ, Bar V, Zundelevich A, Golan S, Leibovici D, Lara R, Klug DR, Yaliraki SN, Barahona M, Wang Y, Downward J, Skehel JM, Ali MMU, Seckl MJ, Pardo E et al., 2022, Re: Repurposed Floxacins Targeting RSK4 Prevent Chemoresistance and Metastasis in Lung and Bladder Cancer, Journal of Urology, Vol: 207, Pages: 919-920, ISSN: 0022-5347

Journal article

Wu N, Stromich L, Yaliraki SN, 2022, Prediction of allosteric sites and signalling: insights from benchmarking datasets, Patterns, Vol: 3, Pages: 1-12, ISSN: 2666-3899

Allostery is a pervasive mechanism that regulates protein activity through ligand binding at a site different from the orthosteric site. The universality of allosteric regulation complemented by the benefits of highly specific and potentially non-toxic allosteric drugs makes uncovering allosteric sites invaluable. However, there are few computational methods to effectively predict them. Bond-to-bond propensity analysis has successfully predicted allosteric sites in 19 of 20 cases using an energy-weighted atomistic graph. We here extended the analysis onto 432 structures of 146 proteins from two benchmarking datasets for allosteric proteins: ASBench and CASBench. We further introduced two statistical measures to account for the cumulative effect of high-propensity residues and the crucial residues in a given site. The allosteric site is recovered for 127 of 146 proteins (407 of 432 structures) knowing only the orthosteric sites or ligands. The quantitative analysis using a range of statistical measures enables better characterization of potential allosteric sites and mechanisms involved.

Journal article

Pardo O, Chrysostomou S, Roy R, Prischi F, Thamlikitkul L, Chapman KL, Mufti U, Peach R, Ding L, Hancock D, Moore C, Molina-Arcas M, Mauri F, Pinato DJ, Abrahams JM, Ottaviani S, Castellano L, Giamas G, Pascoe J, Moonamale D, Pirrie S, Gaunt C, Billingham L, Steven NM, Cullen M, Hrouda D, Winkler M, Post J, Cohen P, Salpeter SJ, Bar V, Zundelevich A, Golan S, Leibovici D, Lara R, Klug DR, Yaliraki SN, Barahona M, Wang Y, Downward J, Skehel JM, Ali MMU, Seckl MJ et al., 2021, Repurposed floxacins targeting RSK4 prevent chemoresistance and metastasis in lung and bladder cancer, Science Translational Medicine, Vol: 13, ISSN: 1946-6234

Lung and bladder cancers are mostly incurable because of the early development of drug resistance and metastatic dissemination. Hence, improved therapies that tackle these two processes are urgently needed to improve clinical outcome. We have identified RSK4 as a promoter of drug resistance and metastasis in lung and bladder cancer cells. Silencing this kinase, through either RNA interference or CRISPR, sensitized tumor cells to chemotherapy and hindered metastasis in vitro and in vivo in a tail vein injection model. Drug screening revealed several floxacin antibiotics as potent RSK4 activation inhibitors, and trovafloxacin reproduced all effects of RSK4 silencing in vitro and in/ex vivo using lung cancer xenograft and genetically engineered mouse models and bladder tumor explants. Through x-ray structure determination and Markov transient and Deuterium exchange analyses, we identified the allosteric binding site and revealed how this compound blocks RSK4 kinase activation through binding to an allosteric site and mimicking a kinase autoinhibitory mechanism involving the RSK4’s hydrophobic motif. Last, we show that patients undergoing chemotherapy and adhering to prophylactic levofloxacin in the large placebo-controlled randomized phase 3 SIGNIFICANT trial had significantly increased (P = 0.048) long-term overall survival times. Hence, we suggest that RSK4 inhibition may represent an effective therapeutic strategy for treating lung and bladder cancer.

Journal article

Mersmann S, Stromich L, Song F, Wu N, Vianello F, Barahona M, Yaliraki S et al., 2021, ProteinLens: a web-based application for the analysis of allosteric signalling on atomistic graphs of biomolecules, Nucleic Acids Research, Vol: 49, Pages: W551-W558, ISSN: 0305-1048

The investigation of allosteric effects in biomolecular structures is of great current interest in diverse areas, from fundamental biological enquiry to drug discovery. Here we present ProteinLens, a user-friendly and interactive web application for the investigation of allosteric signalling based on atomistic graph-theoretical methods. Starting from the PDB file of a biomolecule (or a biomolecular complex) ProteinLens obtains an atomistic, energy-weighted graph description of the structure of the biomolecule, and subsequently provides a systematic analysis of allosteric signalling and communication across the structure using two computationally efficient methods: Markov Transients and bond-to-bond propensities. ProteinLens scores and ranks every bond and residue according to the speed and magnitude of the propagation of fluctuations emanating from any site of choice (e.g. the active site). The results are presented through statistical quantile scores visualised with interactive plots and adjustable 3D structure viewers, which can also be downloaded. ProteinLens thus allows the investigation of signalling in biomolecular structures of interest to aid the detection of allosteric sites and pathways. ProteinLens is implemented in Python/SQL and freely available to use at: www.proteinlens.io.

Journal article

Peach R, Arnaudon A, Schmidt J, Palasciano HA, Bernier NR, Jelfs K, Yaliraki S, Barahona M et al., 2021, HCGA: Highly comparative graph analysis for network phenotyping, Patterns, Vol: 2, ISSN: 2666-3899

Networks are widely used as mathematical models of complex systems across many scientific disciplines. Decades of work have produced a vast corpus of research characterizing the topological, combinatorial, statistical, and spectral properties of graphs. Each graph property can be thought of as a feature that captures important (and sometimes overlapping) characteristics of a network. In this paper, we introduce HCGA, a framework for highly comparative analysis of graph datasets that computes several thousands of graph features from any given network. HCGA also offers a suite of statistical learning and data analysis tools for automated identification and selection of important and interpretable features underpinning the characterization of graph datasets. We show that HCGA outperforms other methodologies on supervised classification tasks on benchmark datasets while retaining the interpretability of network features. We exemplify HCGA by predicting the charge transfer in organic semiconductors and clustering a dataset of neuronal morphology images.

Journal article

Peach R, Greenbury S, Johnston I, Yaliraki S, Lefevre D, Barahona M et al., 2021, Understanding learner behaviour in online courses with Bayesian modelling and time series characterisation, Scientific Reports, Vol: 11, ISSN: 2045-2322

The intrinsic temporality of learning demands the adoption of methodologies capable of exploiting time-series information. In this study we leverage the sequence data framework and show how data-driven analysis of temporal sequences of task completion in online courses can be used to characterise personal and group learners’ behaviors, and to identify critical tasks and course sessions in a given course design. We also introduce a recently developed probabilistic Bayesian model to learn sequential behaviours of students and predict student performance. The application of our data-driven sequence-based analyses to data from learners undertaking an on-line Business Management course reveals distinct behaviors within the cohort of learners, identifying learners or groups of learners that deviate from the nominal order expected in the course. Using course grades a posteriori, we explore differences in behavior between high and low performing learners. We find that high performing learners follow the progression between weekly sessions more regularly than low performing learners, yet within each weekly session high performing learners are less tied to the nominal task order. We then model the sequences of high and low performance students using the probabilistic Bayesian model and show that we can learn engagement behaviors associated with performance. We also show that the data sequence framework can be used for task-centric analysis; we identify critical junctures and differences among types of tasks within the course design. We find that non-rote learning tasks, such as interactive tasks or discussion posts, are correlated with higher performance. We discuss the application of such analytical techniques as an aid to course design, intervention, and student supervision.

Journal article

Altuncu T, Yaliraki S, Barahona M, 2021, Graph-based topic extraction from vector embeddings of text documents: application to a corpus of news articles, Complex Networks & Their Applications IX, Editors: Benito, Cherifi, Cherifi, Moro, Rocha, Sales-Pardo, Publisher: Springer International Publishing, Pages: 154-166, ISBN: 978-3-030-65351-4

Production of news content is growing at an astonishing rate. To help manage and monitor the sheer amount of text, there is an increasing need to develop efficient methods that can provide insights into emerging content areas, and stratify unstructured corpora of text into ‘topics’ that stem intrinsically from content similarity. Here we present an unsupervised framework that brings together powerful vector embeddings from natural language processing with tools from multiscale graph partitioning that can reveal natural partitions at different resolutions without making a priori assumptions about the number of clusters in the corpus. We show the advantages of graph-based clustering through end-to-end comparisons with other popular clustering and topic modelling methods, and also evaluate different text vector embeddings, from classic Bag-of-Words to Doc2Vec to the recent transformers based model Bert. This comparative work is showcased through an analysis of a corpus of US news coverage during the presidential election year of 2016.

Book chapter

Yu YW, Delvenne J-C, Yaliraki SN, Barahona M et al., 2020, Severability of mesoscale components and local time scales in dynamical networks

A major goal of dynamical systems theory is the search for simplified descriptions of the dynamics of a large number of interacting states. For overwhelmingly complex dynamical systems, the derivation of a reduced description on the entire dynamics at once is computationally infeasible. Other complex systems are so expansive that despite the continual onslaught of new data only partial information is available. To address this challenge, we define and optimise for a local quality function severability for measuring the dynamical coherency of a set of states over time. The theoretical underpinnings of severability lie in our local adaptation of the Simon-Ando-Fisher time-scale separation theorem, which formalises the intuition of local wells in the Markov landscape of a dynamical process, or the separation between a microscopic and a macroscopic dynamics. Finally, we demonstrate the practical relevance of severability by applying it to examples drawn from power networks, image segmentation, social networks, metabolic networks, and word association.

Journal article

Hodges M, Yaliraki SN, Barahona M, 2019, Edge-based formulation of elastic network models, Physical Review Research, Pages: 033211-033211

We present an edge-based framework for the study of geometric elastic network models to model mechanical interactions in physical systems. We use a formulation in the edge space, instead of the usual node-centric approach, to characterise edge fluctuations of geometric networks defined in d-dimensional space and define the edge mechanical embeddedness, an edge mechanical susceptibility measuring the force felt on each edge given a force applied on the whole system. We further show that this formulation can be directly related to the infinitesimal rigidity of the network, which additionally permits three- and four-centre forces to be included in the network description. We exemplify the approach in protein systems, at both the residue and atomistic levels of description.

Journal article

Peach RL, Saman D, Yaliraki SN, Klug DR, Ying L, Willison KR, Barahona M et al., 2019, Unsupervised Graph-Based Learning Predicts Mutations That Alter Protein Dynamics

Proteins exhibit complex dynamics across a vast range of time and length scales, from the atomistic to the conformational. Adenylate kinase (ADK) showcases the biological relevance of such inherently coupled dynamics across scales: single mutations can affect large-scale protein motions and enzymatic activity. Here we present a combined computational and experimental study of multiscale structure and dynamics in proteins, using ADK as our system of choice. We show how a computationally efficient method for unsupervised graph partitioning can be applied to atomistic graphs derived from protein structures to reveal intrinsic, biochemically relevant substructures at all scales, without re-parameterisation or a priori coarse-graining. We subsequently perform full alanine and arginine in silico mutagenesis scans of the protein, and score all mutations according to the disruption they induce on the large-scale organisation. We use our calculations to guide Förster Resonance Energy Transfer (FRET) experiments on ADK, and show that mutating residue D152 to alanine or residue V164 to arginine induce a large dynamical shift of the protein structure towards a closed state, in accordance with our predictions. Our computations also predict a graded effect of different mutations at the D152 site as a result of increased coherence between the core and binding domains, an effect confirmed quantitatively through a high correlation (R² = 0.93) with the FRET ratio between closed and open populations measured on six mutants.

Journal article

Peach R, Yaliraki S, Lefevre D, Barahona M et al., 2019, Data-driven unsupervised clustering of online learner behaviour, npj Science of Learning, Vol: 4, ISSN: 2056-7936

The widespread adoption of online courses opens opportunities for analysing learner behaviour and optimising web-based learning adapted to observed usage. Here we introduce a mathematical framework for the analysis of time series of online learner engagement, which allows the identification of clusters of learners with similar online temporal behaviour directly from the raw data without prescribing a priori subjective reference behaviours. The method uses a dynamic time warping kernel to create a pairwise similarity between time series of learner actions, and combines it with an unsupervised multiscale graph clustering algorithm to identify groups of learners with similar temporal behaviour. To showcase our approach, we analyse task completion data from a cohort of learners taking an online post-graduate degree at Imperial Business School. Our analysis reveals clusters of learners with statistically distinct patterns of engagement, from distributed to massed learning, with different levels of regularity, adherence to pre-planned course structure and task completion. The approach also reveals outlier learners with highly sporadic behaviour. A posteriori comparison against student performance shows that, whereas high performing learners are spread across clusters with diverse temporal engagement, low performers are located significantly in the massed learning cluster, and our unsupervised clustering identifies low performers more accurately than common machine learning classification methods trained on temporal statistics of the data. Finally, we test the applicability of the method by analysing two additional datasets: a different cohort of the same course, and time series of different format from another university.

Journal article

Chrysostomou S, Roy R, Prischi F, Chapman K, Mufti U, Mauri F, Bellezza G, Abrahams J, Ottaviani S, Castellano L, Giamas G, Hrouda D, Winkler M, Klug D, Yaliraki S, Barahona M, Wang Y, Ali M, Seckl M, Pardo O et al., 2019, Abstract 1775: Targeting RSK4 prevents both chemoresistance and metastasis in lung cancer, AACR Annual Meeting on Bioinformatics, Convergence Science, and Systems Biology, Publisher: American Association for Cancer Research, Pages: 1-2, ISSN: 0008-5472

Lung cancer is the commonest cause of cancer death worldwide with a five-year survival rate of less than five percent for metastatic tumors. Non-small cell lung cancer (NSCLC) accounts for 80% of lung cancer cases of which adenocarcinoma prevails. Patients almost invariably develop metastatic drug-resistant disease and this is responsible for our failure to provide curative therapy. Hence, a better understanding of the mechanisms underlying these biological processes is urgently required to improve clinical outcome. The 90-kDa ribosomal S6 kinases (RSKs) are downstream effectors of the RAS/MAPK cascade. RSKs are highly conserved serine/threonine protein kinases implicated in diverse cellular processes, including cell survival, proliferation, migration and invasion. Four isoforms exist in humans (RSK1-4) and are uniquely characterized by the presence of two non-identical N- and C-terminal kinase domains. RSK isoforms are 73-80% identical at protein level and this has been thought to suggest overlapping functions. However, through functional genomic kinome screens, we show that RSK4, contrary to RSK1, promotes both drug resistance and metastasis in lung cancer. This kinase is overexpressed in the majority (57%) of NSCLC biopsies and this correlates with poor overall survival in lung adenocarcinoma patients. Genetic silencing of RSK4 sensitizes lung cancer cells to chemotherapy and prevents their migration and invasiveness in vitro and in vivo. RSK4 downregulation decreases the anti-apoptotic proteins Bcl2 and cIAP1/2 which correlates with increased apoptotic signalling, whilst it also induces mesenchymal-epithelial transition (MET) through inhibition of NFκB activity. A small-molecule inhibitor screen identified several floxacins, including trovafloxacin, as potent allosteric inhibitors of RSK4 activation. Trovafloxacin reproduced all biological and molecular effects of RSK4 silencing in vitro and in vivo, and is predicted to bind a novel allosteric site revealed

Conference paper

Prischi F, Chrysostomou S, Roy R, Chapman K, Mufti U, Peach R, Ding L, Mauri F, Bellezza G, Cagini L, Barbareschi M, Ferrero S, Abrahams J, Ottaviani S, Castellano L, Giamas G, Pascoe J, Moonamale D, Billingham L, Cullen M, Hrouda D, Winkler M, Klug D, Yaliraki S, Barahona M, Wang Y, Ali M, Seckl M, Pardo O et al., 2019, Targeting RSK4 prevents both chemoresistance and metastasis in lung and bladder cancer, FEBS Open Bio, Publisher: Wiley, Pages: 330-330, ISSN: 2211-5463

Conference paper

Altuncu MT, Mayer E, Yaliraki SN, Barahona M et al., 2019, From free text to clusters of content in health records: An unsupervised graph partitioning approach, Applied Network Science, Vol: 4, ISSN: 2364-8228

Electronic Healthcare records contain large volumes of unstructured data in different forms. Free text constitutes a large portion of such data, yet this source of richly detailed information often remains under-used in practice because of a lack of suitable methodologies to extract interpretable content in a timely manner. Here we apply network-theoretical tools to the analysis of free text in Hospital Patient Incident reports in the English National Health Service, to find clusters of reports in an unsupervised manner and at different levels of resolution based directly on the free text descriptions contained within them. To do so, we combine recently developed deep neural network text-embedding methodologies based on paragraph vectors with multi-scale Markov Stability community detection applied to a similarity graph of documents obtained from sparsified text vector similarities. We showcase the approach with the analysis of incident reports submitted in Imperial College Healthcare NHS Trust, London. The multiscale community structure reveals levels of meaning with different resolution in the topics of the dataset, as shown by relevant descriptive terms extracted from the groups of records, as well as by comparing a posteriori against hand-coded categories assigned by healthcare personnel. Our content communities exhibit good correspondence with well-defined hand-coded categories, yet our results also provide further medical detail in certain areas as well as revealing complementary descriptors of incidents beyond the external classification. We also discuss how the method can be used to monitor reports over time and across different healthcare providers, and to detect emerging trends that fall outside of pre-existing categories.

Journal article

Zhang H, Salazar JD, Yaliraki SN, 2018, Proteins across scales through graph partitioning: application to the major peanut allergen Ara h 1, Journal of Complex Networks, Vol: 6, Pages: 679-692, ISSN: 2051-1310

The analysis of community structure in complex networks has been given much attention recently, as it is hoped that the communities at various scales can affect or explain the global behaviour of the system. A plethora of community detection algorithms have been proposed, insightful yet often restricted by certain inherent resolutions. Proteins are multi-scale biomolecular machines with coupled structural organization across scales, which is linked to their function. To reveal this organization, we applied a recently developed multi-resolution method, Markov Stability, which is based on atomistic graph partitioning, along with theoretical mutagenesis that further allows for hot spot identification using Gaussian process regression. The methodology finds partitions of a graph without imposing a particular scale a priori and analyses the network in a computationally efficient way. Here, we show an application on peanut allergenicity, which despite extensive experimental studies that focus on epitopes, groups of atoms associated with allergenic reactions, remains poorly understood. We compare our results against available experiment data, and we further predict distal regulatory sites that may significantly alter protein dynamics.

Journal article

Hodges M, Barahona M, Yaliraki S, 2018, Allostery and cooperativity in multimeric proteins: bond-to-bond propensities in ATCase, Scientific Reports, Vol: 8, ISSN: 2045-2322

Aspartate carbamoyltransferase (ATCase) is a large dodecameric enzyme with six active sites that exhibits allostery: its catalytic rate is modulated by the binding of various substrates at distal points from the active sites. A recently developed method, bond-to-bond propensity analysis, has proven capable of predicting allosteric sites in a wide range of proteins using an energy-weighted atomistic graph obtained from the protein structure and given knowledge only of the location of the active site. Bond-to-bond propensity establishes if energy fluctuations at given bonds have significant effects on any other bond in the protein, by considering their propagation through the protein graph. In this work, we use bond-to-bond propensity analysis to study different aspects of ATCase activity using three different protein structures and sources of fluctuations. First, we predict key residues and bonds involved in the transition between inactive (T) and active (R) states of ATCase by analysing allosteric substrate binding as a source of energy perturbations in the protein graph. Our computational results also indicate that the effect of multiple allosteric binding is non linear: a switching effect is observed after a particular number and arrangement of substrates is bound suggesting a form of long range communication between the distantly arranged allosteric sites. Second, cooperativity is explored by considering a bisubstrate analogue as the source of energy fluctuations at the active site, also leading to the identification of highly significant residues to the T ↔ R transition that enhance cooperativity across active sites. Finally, the inactive (T) structure is shown to exhibit a strong, non linear communication between the allosteric sites and the interface between catalytic subunits, rather than the active site. Bond-to-bond propensity thus offers an alternative route to explain allosteric and cooperative effects in terms of detailed atomistic cha

Journal article

Altuncu MT, Mayer E, Yaliraki SN, Barahona M et al., 2018, From Text to Topics in Healthcare Records: An Unsupervised Graph Partitioning Methodology, 2018 KDD Conference Proceedings - MLMH: Machine Learning for Medicine and Healthcare

Electronic Healthcare Records contain large volumes of unstructured data, including extensive free text. Yet this source of detailed information often remains under-used because of a lack of methodologies to extract interpretable content in a timely manner. Here we apply network-theoretical tools to analyse free text in Hospital Patient Incident reports from the National Health Service, to find clusters of documents with similar content in an unsupervised manner at different levels of resolution. We combine deep neural network paragraph vector text-embedding with multiscale Markov Stability community detection applied to a sparsified similarity graph of document vectors, and showcase the approach on incident reports from Imperial College Healthcare NHS Trust, London. The multiscale community structure reveals different levels of meaning in the topics of the dataset, as shown by descriptive terms extracted from the clusters of records. We also compare a posteriori against hand-coded categories assigned by healthcare personnel, and show that our approach outperforms LDA-based models. Our content clusters exhibit good correspondence with two levels of hand-coded categories, yet they also provide further medical detail in certain areas and reveal complementary descriptors of incidents beyond the external classification taxonomy.

Conference paper

Altuncu T, Yaliraki SN, Barahona M, 2018, Content-driven, unsupervised clustering of news articles through multiscale graph partitioning, KDD 2018 - Workshop on Data Science Journalism and Media (DSJM)

The explosion in the amount of news and journalistic content being generated across the globe, coupled with extended and instantaneous access to information through online media, makes it difficult and time-consuming to monitor news developments and opinion formation in real time. There is an increasing need for tools that can pre-process, analyse and classify raw text to extract interpretable content; specifically, identifying topics and content-driven groupings of articles. We present here such a methodology that brings together powerful vector embeddings from Natural Language Processing with tools from Graph Theory that exploit diffusive dynamics on graphs to reveal natural partitions across scales. Our framework uses a recent deep neural network text analysis methodology (Doc2vec) to represent text in vector form and then applies a multi-scale community detection method (Markov Stability) to partition a similarity graph of document vectors. The method allows us to obtain clusters of documents with similar content, at different levels of resolution, in an unsupervised manner. We showcase our approach with the analysis of a corpus of 9,000 news articles published by Vox Media over one year. Our results show consistent groupings of documents according to content without a priori assumptions about the number or type of clusters to be found. The multilevel clustering reveals a quasi-hierarchy of topics and subtopics with increased intelligibility and improved topic coherence as compared to external taxonomy services and standard topic detection methods.

Conference paper

Colijn C, Jones N, Johnston I, Yaliraki SN, Barahona M et al., 2017, Towards precision healthcare: context and mathematical challenges, Frontiers in Physiology, Vol: 8, ISSN: 1664-042X

Precision medicine refers to the idea of delivering the right treatment to the right patient at the right time, usually with a focus on a data-centred approach to this task. In this perspective piece, we use the term "precision healthcare" to describe the development of precision approaches that bridge from the individual to the population, taking advantage of individual-level data, but also taking the social context into account. These problems give rise to a broad spectrum of technical, scientific, policy, ethical and social challenges, and new mathematical techniques will be required to meet them. To ensure that the science underpinning "precision" is robust, interpretable and well-suited to meet the policy, ethical and social questions that such approaches raise, the mathematical methods for data analysis should be transparent, robust and able to adapt to errors and uncertainties. In particular, precision methodologies should capture the complexity of data, yet produce tractable descriptions at the relevant resolution while preserving intelligibility and traceability, so that they can be used by practitioners to aid decision-making. Through several case studies in this domain of precision healthcare, we argue that this vision requires the development of new mathematical frameworks, both in modelling and in data analysis and interpretation.

Journal article

Amor BRC, Schaub MT, Yaliraki S, Barahona M et al., 2016, Prediction of allosteric sites and mediating interactions through bond-to-bond propensities, Nature Communications, Vol: 7, Pages: 1-13, ISSN: 2041-1723

Allostery is a fundamental mechanism of biological regulation, in which binding of a molecule at a distant location affects the active site of a protein. Allosteric sites provide targets to fine-tune protein activity, yet we lack computational methodologies to predict them. Here we present an efficient graph-theoretical framework to reveal allosteric interactions (atoms and communication pathways strongly coupled to the active site) without a priori information of their location. Using an atomistic graph with energy-weighted covalent and weak bonds, we define a bond-to-bond propensity quantifying the non-local effect of instantaneous bond fluctuations propagating through the protein. Significant interactions are then identified using quantile regression. We exemplify our method with three biologically important proteins: caspase-1, CheY, and h-Ras, correctly predicting key allosteric interactions, whose significance is additionally confirmed against a reference set of 100 proteins. The almost-linear scaling of our method renders it suitable for high-throughput searches for candidate allosteric sites.

Journal article

Amor B, Vuik S, Callahan R, Darzi A, Yaliraki SN, Barahona M et al., 2016, Community detection and role identification in directed networks: understanding the Twitter network of the care.data debate, Dynamic Networks and Cyber-Security, Editors: Adams, Heard, Publisher: World Scientific, Pages: 111-136, ISBN: 978-1-60558752-3

With the rise of social media as an important channel for the debate and discussion of public affairs, online social networks such as Twitter have become important platforms for public information and engagement by policy makers. To communicate effectively through Twitter, policy makers need to understand how influence and interest propagate within its network of users. In this chapter we use graph-theoretic methods to analyse the Twitter debate surrounding NHS England's controversial care.data scheme. Directionality is a crucial feature of the Twitter social graph - information flows from the followed to the followers - but is often ignored in social network analyses; our methods are based on the behaviour of dynamic processes on the network and can be applied naturally to directed networks. We uncover robust communities of users and show that these communities reflect how information flows through the Twitter network. We are also able to classify users by their differing roles in directing the flow of information through the network. Our methods and results will be useful to policy makers who would like to use Twitter effectively as a communication medium.

Book chapter

Georgiou PS, Yaliraki SN, Drakakis EM, Barahona M et al., 2016, Window functions and sigmoidal behaviour of memristive systems, International Journal of Circuit Theory and Applications, Vol: 44, Pages: 1685-1696, ISSN: 0098-9886

A common approach to model memristive systems is to include empirical window functions to describe edge effects and nonlinearities in the change of the memristance. We demonstrate that under quite general conditions, each window function can be associated with a sigmoidal curve relating the normalised time-dependent memristance to the time integral of the input. Conversely, this explicit relation allows us to derive window functions suitable for the mesoscopic modelling of memristive systems from a variety of well-known sigmoidals. Such sigmoidal curves are defined in terms of measured variables and can thus be extracted from input and output signals of a device and then transformed to its corresponding window. We also introduce a new generalised window function that allows the flexible modelling of asymmetric edge effects in a simple manner.

Journal article

Sim A, Yaliraki SN, Barahona M, Stumpf MP et al., 2015, Great cities look small, Journal of the Royal Society Interface, Vol: 12, ISSN: 1742-5689

Great cities connect people; failed cities isolate people. Despite the fundamental importance of physical, face-to-face social ties in the functioning of cities, these connectivity networks are not explicitly observed in their entirety. Attempts at estimating them often rely on unrealistic over-simplifications such as the assumption of spatial homogeneity. Here we propose a mathematical model of human interactions in terms of a local strategy of maximizing the number of beneficial connections attainable under the constraint of limited individual travelling-time budgets. By incorporating census and openly available online multi-modal transport data, we are able to characterize the connectivity of geometrically and topologically complex cities. Beyond providing a candidate measure of greatness, this model allows one to quantify and assess the impact of transport developments, population growth, and other infrastructure and demographic changes on a city. Supported by validations of gross domestic product and human immunodeficiency virus infection rates across US metropolitan areas, we illustrate the effect of changes in local and city-wide connectivities by considering the economic impact of two contemporary inter- and intra-city transport developments in the UK: High Speed 2 and London Crossrail. This derivation of the model suggests that the scaling of different urban indicators with population size has an explicitly mechanistic origin.

Journal article

Beguerisse-Díaz M, Garduño-Hernández G, Vangelov B, Yaliraki SN, Barahona M et al., 2014, Interest communities and flow roles in directed networks: the Twitter network of the UK riots, J. R. Soc. Interface 6 December 2014, Vol: 11

Directionality is a crucial ingredient in many complex networks in which information, energy or influence are transmitted. In such directed networks, analysing flows (and not only the strength of connections) is crucial to reveal important features of the network that might go undetected if the orientation of connections is ignored. We showcase here a flow-based approach for community detection in networks through the study of the network of the most influential Twitter users during the 2011 riots in England. Firstly, we use directed Markov Stability to extract descriptions of the network at different levels of coarseness in terms of interest communities, i.e., groups of nodes within which flows of information are contained and reinforced. Such interest communities reveal user groupings according to location, profession, employer, and topic. The study of flows also allows us to generate an interest distance, which affords a personalised view of the attention in the network as viewed from the vantage point of any given user. Secondly, we analyse the profiles of incoming and outgoing long-range flows with a combined approach of role-based similarity and the novel relaxed minimum spanning tree algorithm to reveal that the users in the network can be classified into five roles. These flow roles go beyond the standard leader/follower dichotomy and differ from classifications based on regular/structural equivalence. We then show that the interest communities fall into distinct informational organigrams characterised by a different mix of user roles reflecting the quality of dialogue within them. Our generic framework can be used to provide insight into how flows are generated, distributed, preserved and consumed in directed networks.

Journal article

Georgiou PS, Barahona M, Yaliraki SN, Drakakis EM et al., 2014, On memristor ideality and reciprocity, Microelectronics Journal, Vol: 45, Pages: 1363-1371, ISSN: 0026-2692

Journal article

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
