Publications

Journal article

Luff CE, Peach R, Mallas E-J, Rhodes E, Laumann F, Boyden ES, Sharp DJ, Barahona M, Grossman N et al., 2024,

The neuron mixer and its impact on human brain dynamics

, Cell Reports, Vol: 43, ISSN: 2211-1247

A signal mixer facilitates rich computation, which has been the building block of modern telecommunication. This frequency mixing produces new signals at the sum and difference frequencies of input signals, enabling powerful operations such as heterodyning and multiplexing. Here, we report that a neuron is a signal mixer. We found through ex vivo and in vivo whole-cell measurements that neurons mix exogenous (controlled) and endogenous (spontaneous) subthreshold membrane potential oscillations, producing new oscillation frequencies, and that neural mixing originates in voltage-gated ion channels. Furthermore, we demonstrate that mixing is evident in human brain activity and is associated with cognitive functions. We found that the human electroencephalogram displays distinct clusters of local and inter-region mixing and that conversion of the salient posterior alpha-beta oscillations into gamma-band oscillations regulates visual attention. Signal mixing may enable individual neurons to sculpt the spectrum of neural circuit oscillations and utilize them for computational operations.
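The sum-and-difference behaviour described in this abstract can be illustrated with a minimal sketch. It assumes an idealised multiplicative mixer (the product of two cosines), which is a textbook simplification and not the authors' neuronal model; the function names, sample rate, and threshold are hypothetical choices made for illustration. It relies on the identity cos(a)cos(b) = ½[cos(a − b) + cos(a + b)].

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive discrete Fourier transform; returns normalised magnitudes for bins 0..N/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        mags.append(abs(acc) / n)
    return mags


def mixed_peaks(f1, f2, sample_rate=64, duration=1.0, threshold=0.1):
    """Multiply two cosines (idealised mixer) and list the frequencies present in the product."""
    n = int(sample_rate * duration)
    signal = [
        math.cos(2 * math.pi * f1 * t / sample_rate)
        * math.cos(2 * math.pi * f2 * t / sample_rate)
        for t in range(n)
    ]
    mags = dft_magnitudes(signal)
    # Bin k corresponds to a frequency of k / duration Hz.
    return [k / duration for k, m in enumerate(mags) if m > threshold]


# Mixing 10 Hz and 3 Hz inputs leaves energy only at the difference and sum frequencies.
peaks = mixed_peaks(10, 3)
print(peaks)
```

In this idealised case the input frequencies themselves vanish from the output spectrum; a real nonlinearity, such as the voltage-gated ion channels named in the abstract, would generally retain the inputs alongside the new components.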

Journal article

Beaney T, Clarke J, Salman D, Woodcock T, Majeed F, Aylin P, Barahona M et al., 2024,

Identifying multi-resolution clusters of diseases in ten million patients with multimorbidity in primary care in England

, Communications Medicine, Vol: 4, ISSN: 2730-664X

Background: Identifying clusters of diseases may aid understanding of shared aetiology, management of co-morbidities, and the discovery of new disease associations. Our study aims to identify disease clusters using a large set of long-term conditions and to compare methods that use the co-occurrence of diseases with methods that use the sequence of disease development in a person over time. Methods: We use electronic health records from over ten million people with multimorbidity registered to primary care in England. First, we extract data-driven representations of 212 diseases from patient records employing (i) co-occurrence-based methods and (ii) sequence-based natural language processing methods. Second, we apply the graph-based Markov Multiscale Community Detection (MMCD) to identify clusters based on disease similarity at multiple resolutions. We evaluate the representations and clusters using a clinically curated set of 253 known disease association pairs, and qualitatively assess the interpretability of the clusters. Results: Both co-occurrence and sequence-based algorithms generate interpretable disease representations, with the best performance from the skip-gram algorithm. MMCD outperforms k-means and hierarchical clustering in explaining known disease associations. We find that diseases display an almost-hierarchical structure across resolutions from closely to more loosely similar co-occurrence patterns and identify interpretable clusters corresponding to both established and novel patterns. Conclusions: Our method provides a tool for clustering diseases at different levels of resolution from co-occurrence patterns in high-dimensional electronic health records, which could be used to facilitate discovery of associations between diseases in the future.

Journal article

Shah M, Inacio M, Lu C, Schiratti P-R, Zheng S, Clement A, Simoes Monteiro de Marvao A, Bai W, King A, Ware J, Wilkins M, Mielke J, Elci E, Kryukov I, McGurk K, Bender C, Freitag D, O'Regan D et al., 2023,

Environmental and genetic predictors of human cardiovascular ageing

, Nature Communications, Vol: 14, Pages: 1-15, ISSN: 2041-1723

Cardiovascular ageing is a process that begins early in life and leads to a progressive change in structure and decline in function due to accumulated damage across diverse cell types, tissues and organs contributing to multi-morbidity. Damaging biophysical, metabolic and immunological factors exceed endogenous repair mechanisms resulting in a pro-fibrotic state, cellular senescence and end-organ damage, however the genetic architecture of cardiovascular ageing is not known. Here we use machine learning approaches to quantify cardiovascular age from image-derived traits of vascular function, cardiac motion and myocardial fibrosis, as well as conduction traits from electrocardiograms, in 39,559 participants of UK Biobank. Cardiovascular ageing is found to be significantly associated with common or rare variants in genes regulating sarcomere homeostasis, myocardial immunomodulation, and tissue responses to biophysical stress. Ageing is accelerated by cardiometabolic risk factors and we also identify prescribed medications that are potential modifiers of ageing. Through large-scale modelling of ageing across multiple traits our results reveal insights into the mechanisms driving premature cardiovascular ageing and reveal potential molecular targets to attenuate age-related processes.

Journal article

Thanaj M, Mielke J, McGurk K, Bai W, Savioli N, Simoes Monteiro de Marvao A, Meyer H, Zeng L, Sohler F, Lumbers T, Wilkins M, Ware J, Bender C, Rueckert D, MacNamara A, Freitag D, O'Regan D et al., 2022,

Genetic and environmental determinants of diastolic heart function

, Nature Cardiovascular Research, Vol: 1, Pages: 361-371, ISSN: 2731-0590

Diastole is the sequence of physiological events that occur in the heart during ventricular filling and principally depends on myocardial relaxation and chamber stiffness. Abnormal diastolic function is related to many cardiovascular disease processes and is predictive of health outcomes, but its genetic architecture is largely unknown. Here, we use machine learning cardiac motion analysis to measure diastolic functional traits in 39,559 participants of the UK Biobank and perform a genome-wide association study. We identified 9 significant, independent loci near genes that are associated with maintaining sarcomeric function under biomechanical stress and genes implicated in the development of cardiomyopathy. Age, sex and diabetes were independent predictors of diastolic function and we found a causal relationship between genetically-determined ventricular stiffness and incident heart failure. Our results provide insights into the genetic and environmental factors influencing diastolic function that are relevant for identifying causal relationships and potential tractable targets.

Journal article

Beaney T, Clarke J, Woodcock T, McCarthy R, Saravanakumar K, Barahona M, Blair M, Hargreaves D et al., 2021,

Patterns of healthcare utilisation in children and young people: a retrospective cohort study using routinely collected healthcare data in Northwest London

, BMJ Open, Vol: 11, Pages: 1-14, ISSN: 2044-6055

Objectives: With a growing role for health services in managing population health, there is a need for early identification of populations with high need. Segmentation approaches partition the population based on demographics, long-term conditions (LTCs) or healthcare utilisation but have mostly been applied to adults. Our study uses segmentation methods to distinguish patterns of healthcare utilisation in children and young people (CYP) and to explore predictors of segment membership. Design: Retrospective cohort study. Setting: Routinely collected primary and secondary healthcare data in Northwest London from the Discover database. Participants: 378,309 CYP aged 0-15 years registered to a general practice in Northwest London with one full year of follow-up. Primary and secondary outcome measures: Assignment of each participant to a segment defined by seven healthcare variables representing primary and secondary care attendances, and description of utilisation patterns by segment. Predictors of segment membership described by age, sex, ethnicity, deprivation and LTCs. Results: Participants were grouped into six segments based on healthcare utilisation. Three segments predominantly used primary care; two moderate utilisation segments differed in use of emergency or elective care, and a high utilisation segment, representing 16,632 (4.4%) children accounted for the highest mean presentations across all service types. The two smallest segments, representing 13.3% of the population, accounted for 62.5% of total costs. Younger age, residence in areas of higher deprivation, and presence of one or more LTCs were associated with membership of higher utilisation segments, but 75.0% of those in the highest utilisation segment had no LTC. Conclusions: This article identifies six segments of healthcare utilisation in CYP and predictors of segment membership. Demographics and LTCs may not explain utilisation patterns as strongly as in adults which may limit the use of routine data in predicting ut

Journal article

Liu Z, Peach R, Lawrance E, Noble A, Ungless M, Barahona M et al., 2021,

Listening to mental health crisis needs at scale: using Natural Language Processing to understand and evaluate a mental health crisis text messaging service

, Frontiers in Digital Health, Vol: 3, Pages: 1-14, ISSN: 2673-253X

The current mental health crisis is a growing public health issue requiring a large-scale response that cannot be met with traditional services alone. Digital support tools are proliferating, yet most are not systematically evaluated, and we know little about their users and their needs. Shout is a free mental health text messaging service run by the charity Mental Health Innovations, which provides support for individuals in the UK experiencing mental or emotional distress and seeking help. Here we study a large data set of anonymised text message conversations and post-conversation surveys compiled through Shout. This data provides an opportunity to hear at scale from those experiencing distress; to better understand mental health needs for people not using traditional mental health services; and to evaluate the impact of a novel form of crisis support. We use natural language processing (NLP) to assess the adherence of volunteers to conversation techniques and formats, and to gain insight into demographic user groups and their behavioural expressions of distress. Our textual analyses achieve accurate classification of conversation stages (weighted accuracy = 88%), behaviours (1-hamming loss = 95%) and texter demographics (weighted accuracy = 96%), exemplifying how the application of NLP to frontline mental health data sets can aid with post-hoc analysis and evaluation of quality of service provision in digital mental health services.

Journal article

Ming DK, Myall AC, Hernandez B, Weiße AY, Peach RL, Barahona M, Rawson TM, Holmes AH et al., 2021,

Informing antimicrobial management in the context of COVID-19: understanding the longitudinal dynamics of C-reactive protein and procalcitonin

, BMC Infectious Diseases, Vol: 21

Background: To characterise the longitudinal dynamics of C-reactive protein (CRP) and Procalcitonin (PCT) in a cohort of hospitalised patients with COVID-19 and support antimicrobial decision-making. Methods: Longitudinal CRP and PCT concentrations and trajectories of 237 hospitalised patients with COVID-19 were modelled. The dataset comprised 2,021 data points for CRP and 284 points for PCT. Pairwise comparisons were performed between: (i) those with or without significant bacterial growth from cultures, and (ii) those who survived or died in hospital. Results: CRP concentrations were higher over time in COVID-19 patients with positive microbiology (day 9: 236 vs 123 mg/L, p < 0.0001) and in those who died (day 8: 226 vs 152 mg/L, p < 0.0001) but only after day 7 of COVID-related symptom onset. Failure of CRP to fall in the first week of hospital admission was associated with significantly higher odds of death. PCT concentrations were higher in patients with COVID-19 and positive microbiology or in those who died, although these differences were not statistically significant. Conclusions: Both the absolute CRP concentration and the trajectory during the first week of hospital admission are important factors predicting microbiology culture positivity and outcome in patients hospitalised with COVID-19. Further work is needed to describe the role of PCT for co-infection. Understanding relationships of these biomarkers can support development of risk models and inform optimal antimicrobial strategies.

Conference paper

Liu Z, Barahona M, 2021,

Similarity measure for sparse time course data based on Gaussian processes

, Uncertainty in Artificial Intelligence 2021, Publisher: PMLR, Pages: 1332-1341

We propose a similarity measure for sparsely sampled time course data in the form of a log-likelihood ratio of Gaussian processes (GP). The proposed GP similarity is similar to a Bayes factor and provides enhanced robustness to noise in sparse time series, such as those found in various biological settings, e.g., gene transcriptomics. We show that the GP measure is equivalent to the Euclidean distance when the noise variance in the GP is negligible compared to the noise variance of the signal. Our numerical experiments on both synthetic and real data show improved performance of the GP similarity when used in conjunction with two distance-based clustering methods.

Journal article

Simoes Monteiro de Marvao A, McGurk K, Zheng S, Thanaj M, Bai W, Duan J, Biffi C, Mazzarotto F, Statton B, Dawes T, Savioli N, Halliday B, Xu X, Buchan R, Baksi A, Quinlan M, Tokarczuk P, Tayal U, Francis C, Whiffin N, Theotokis A, Zhang X, Jang M, Berry A, Pantazis A, Barton P, Rueckert D, Prasad S, Walsh R, Ho C, Cook S, Ware J, O'Regan D et al., 2021,

Phenotypic expression and outcomes in individuals with rare genetic variants of hypertrophic cardiomyopathy

, Journal of the American College of Cardiology, Vol: 78, Pages: 1097-1110, ISSN: 0735-1097

Background: Hypertrophic cardiomyopathy (HCM) is caused by rare variants in sarcomere-encoding genes, but little is known about the clinical significance of these variants in the general population. Objectives: To compare lifetime outcomes and cardiovascular phenotypes according to the presence of rare variants in sarcomere-encoding genes amongst middle-aged adults. Methods: We analysed whole exome sequencing and cardiac magnetic resonance (CMR) imaging in UK Biobank participants stratified by sarcomere-encoding variant status. Results: The prevalence of rare variants (allele frequency <0.00004) in HCM-associated sarcomere-encoding genes in 200,584 participants was 2.9% (n=5,712; 1 in 35), and the prevalence of variants pathogenic or likely pathogenic for HCM (SARC-HCM-P/LP) was 0.25% (n=493, 1 in 407). SARC-HCM-P/LP variants were associated with increased risk of death or major adverse cardiac events compared to controls (HR 1.69, 95% CI 1.38 to 2.07, p<0.001), mainly due to heart failure endpoints (HR 4.23, 95% CI 3.07 to 5.83, p<0.001). In 21,322 participants with CMR, SARC-HCM-P/LP were associated with asymmetric increase in left ventricular maximum wall thickness (10.9±2.7 vs 9.4±1.6 mm, p<0.001) but hypertrophy (≥13mm) was only present in 18.4% (n=9/49, 95% CI 9 to 32%). SARC-HCM-P/LP were still associated with heart failure after adjustment for wall thickness (HR 6.74, 95% CI 2.43 to 18.7, p<0.001). Conclusions: In this population of middle-aged adults, SARC-HCM-P/LP variants have low aggregate penetrance for overt HCM but are associated with increased risk of adverse cardiovascular outcomes and an attenuated cardiomyopathic phenotype. Although absolute event rates are low, identification of these variants may enhance risk stratification beyond familial disease.

Journal article

Mersmann S, Stromich L, Song F, Wu N, Vianello F, Barahona M, Yaliraki S et al., 2021,

ProteinLens: a web-based application for the analysis of allosteric signalling on atomistic graphs of biomolecules

, Nucleic Acids Research, Vol: 49, Pages: W551-W558, ISSN: 0305-1048

The investigation of allosteric effects in biomolecular structures is of great current interest in diverse areas, from fundamental biological enquiry to drug discovery. Here we present ProteinLens, a user-friendly and interactive web application for the investigation of allosteric signalling based on atomistic graph-theoretical methods. Starting from the PDB file of a biomolecule (or a biomolecular complex) ProteinLens obtains an atomistic, energy-weighted graph description of the structure of the biomolecule, and subsequently provides a systematic analysis of allosteric signalling and communication across the structure using two computationally efficient methods: Markov Transients and bond-to-bond propensities. ProteinLens scores and ranks every bond and residue according to the speed and magnitude of the propagation of fluctuations emanating from any site of choice (e.g. the active site). The results are presented through statistical quantile scores visualised with interactive plots and adjustable 3D structure viewers, which can also be downloaded. ProteinLens thus allows the investigation of signalling in biomolecular structures of interest to aid the detection of allosteric sites and pathways. ProteinLens is implemented in Python/SQL and freely available to use at: www.proteinlens.io.
