Imperial College London


Faculty of Natural Sciences, Department of Mathematics

Academic Visitor







Burlington Danes, Hammersmith Campus






9 results found

Liu Z, Barahona M, 2021, Similarity measure for sparse time course data based on Gaussian processes, Uncertainty in Artificial Intelligence 2021, Publisher: PMLR, Pages: 1332-1341

We propose a similarity measure for sparsely sampled time course data in the form of a log-likelihood ratio of Gaussian processes (GP). The proposed GP similarity is similar to a Bayes factor and provides enhanced robustness to noise in sparse time series, such as those found in various biological settings, e.g., gene transcriptomics. We show that the GP measure is equivalent to the Euclidean distance when the noise variance in the GP is negligible compared to the noise variance of the signal. Our numerical experiments on both synthetic and real data show improved performance of the GP similarity when used in conjunction with two distance-based clustering methods.

Conference paper

Xue Y, Liu Z, Fang X, Wang F et al., 2021, Multimodal Pre-Training Model for Sequence-based Prediction of Protein-Protein Interaction, Machine Learning in Computational Biology 2021

Conference paper

Liu Z, Ye X, Fang X, Wang F, Wu H, Wang H et al., 2021, Docking-based virtual screening with multi-task learning, IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2021, Publisher: IEEE

Machine learning shows great potential in virtual screening for drug discovery. Current efforts on accelerating docking-based virtual screening do not consider using existing data of other previously developed targets. To make use of the knowledge of the other targets and take advantage of the existing data, in this work, we apply multi-task learning to the problem of docking-based virtual screening. With two large docking datasets, the results of extensive experiments show that multi-task learning can achieve better performances on docking score prediction. By learning knowledge across multiple targets, the model trained by multi-task learning shows a better ability to adapt to a new target. Additional empirical study shows that other problems in drug discovery, such as the experimental drug-target affinity prediction, may also benefit from multi-task learning. Our results demonstrate that multi-task learning is a promising machine learning approach for docking-based virtual screening and accelerating the process of drug discovery.

Conference paper

Ling Y, Liu Z, Xue J-H, 2021, Dimension reduction for data with heterogeneous missingness, Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, Publisher: PMLR, Pages: 1310-1320

Dimension reduction plays a pivotal role in analysing high-dimensional data. However, observations with missing values present serious difficulties in directly applying standard dimension reduction techniques. As a large number of dimension reduction approaches are based on the Gram matrix, we first investigate the effects of missingness on dimension reduction by studying the statistical properties of the Gram matrix with or without missingness, and then we present a bias-corrected Gram matrix with nice statistical properties under heterogeneous missingness. Extensive empirical results, on both simulated and publicly available real datasets, show that the proposed unbiased Gram matrix can significantly improve a broad spectrum of representative dimension reduction approaches.

Conference paper

Saavedra-Garcia P, Roman-Trufero M, Al-Sadah HA, Blighe K, Lopez-Jimenez E, Christoforou M, Penfold L, Capece D, Xiong X, Miao Y, Parzych K, Caputo V, Siskos AP, Encheva V, Liu Z, Thiel D, Kaiser MF, Piazza P, Chaidos A, Karadimitris A, Franzoso G, Snijder AP, Keun HC, Oyarzún DA, Barahona M, Auner H et al., 2021, Systems level profiling of chemotherapy-induced stress resolution in cancer cells reveals druggable trade-offs, Proceedings of the National Academy of Sciences of USA, Vol: 118, ISSN: 0027-8424

Cancer cells can survive chemotherapy-induced stress, but how they recover from it is not known. Using a temporal multiomics approach, we delineate the global mechanisms of proteotoxic stress resolution in multiple myeloma cells recovering from proteasome inhibition. Our observations define layered and protracted programmes for stress resolution that encompass extensive changes across the transcriptome, proteome, and metabolome. Cellular recovery from proteasome inhibition involved protracted and dynamic changes of glucose and lipid metabolism and suppression of mitochondrial function. We demonstrate that recovering cells are more vulnerable to specific insults than acutely stressed cells and identify the general control nonderepressable 2 (GCN2)-driven cellular response to amino acid scarcity as a key recovery-associated vulnerability. Using a transcriptome analysis pipeline, we further show that GCN2 is also a stress-independent bona fide target in transcriptional signature-defined subsets of solid cancers that share molecular characteristics. Thus, identifying cellular trade-offs tied to the resolution of chemotherapy-induced stress in tumour cells may reveal new therapeutic targets and routes for cancer therapy optimisation.

Journal article

Bryois J, Skene NG, Hansen TF, Kogelman LJA, Watson HJ, Liu Z, Eating Disorders Working Group of the Psychiatric Genomics Consortium, International Headache Genetics Consortium, 23andMe Research Team, Brueggeman L, Breen G, Bulik CM, Arenas E, Hjerling-Leffler J, Sullivan PF et al., 2020, Genetic identification of cell types underlying brain complex traits yields insights into the etiology of Parkinson's disease, Nature Genetics, Vol: 52, Pages: 482-493, ISSN: 1061-4036

Genome-wide association studies have discovered hundreds of loci associated with complex brain disorders, but it remains unclear in which cell types these loci are active. Here we integrate genome-wide association study results with single-cell transcriptomic data from the entire mouse nervous system to systematically identify cell types underlying brain complex traits. We show that psychiatric disorders are predominantly associated with projecting excitatory and inhibitory neurons. Neurological diseases were associated with different cell types, which is consistent with other lines of evidence. Notably, Parkinson's disease was genetically associated not only with cholinergic and monoaminergic neurons (which include dopaminergic neurons) but also with enteric neurons and oligodendrocytes. Using post-mortem brain transcriptomic data, we confirmed alterations in these cells, even at the earliest stages of disease progression. Our study provides an important framework for understanding the cellular basis of complex brain maladies, and reveals an unexpected role of oligodendrocytes in Parkinson's disease.

Journal article

Liu Z, Barahona M, 2020, Graph-based data clustering via multiscale community detection, Applied Network Science, Vol: 5, Pages: 1-20, ISSN: 2364-8228

We present a graph-theoretical approach to data clustering, which combines the creation of a graph from the data with Markov Stability, a multiscale community detection framework. We show how the multiscale capabilities of the method allow the estimation of the number of clusters, as well as alleviating the sensitivity to the parameters in graph construction. We use both synthetic and benchmark real datasets to compare and evaluate several graph construction methods and clustering algorithms, and show that multiscale graph-based clustering achieves improved performance compared to popular clustering methods without the need to set externally the number of clusters.

Journal article

Saavedra-Garcia P, Al-Sadah HA, Penfold L, Xiong X, Lopez-Jimenez E, Parzych K, Caputo VS, Blighe K, Kaiser MF, Piazza P, Encheva V, Snijders AP, Keun HC, Oyarzun D, Thiel D, Liu Z, Barahona M, Auner HW et al., 2019, Integrated Systems Level Examination of Proteasome Inhibitor Stress Recovery in Myeloma Cells Reveals Druggable Vulnerabilities Linked to Multiple Metabolic Processes, 61st Annual Meeting and Exposition of the American-Society-of-Hematology (ASH), Publisher: AMER SOC HEMATOLOGY, ISSN: 0006-4971

Conference paper

Liu Z, Barahona M, 2017, Geometric multiscale community detection: Markov stability and vector partitioning, Journal of Complex Networks, Vol: 6, Pages: 157-172, ISSN: 2051-1329

Multiscale community detection can be viewed from a dynamical perspective within the Markov stability framework, which uses the diffusion of a Markov process on the graph to uncover intrinsic network substructures across all scales. Here we reformulate multiscale community detection as a max-sum length vector partitioning problem with respect to the set of time-dependent node vectors expressed in terms of eigenvectors of the transition matrix. This formulation provides a geometric interpretation of Markov stability in terms of a time-dependent spectral embedding, where the Markov time acts as an inhomogeneous geometric resolution factor that zooms the components of the node vectors at different rates. Our geometric formulation encompasses both modularity and the multi-resolution Potts model, which are shown to correspond to vector partitioning in a pseudo-Euclidean space, and is also linked to spectral partitioning methods, where the number of eigenvectors used corresponds to the dimensionality of the underlying embedding vector space. Inspired by the Louvain optimization for community detection, we then propose an algorithm based on a graph-theoretical heuristic for the vector partitioning problem. We apply the algorithm to the spectral optimization of modularity and Markov stability community detection. The spectral embedding based on the transition matrix eigenvectors leads to improved partitions with higher information content and higher modularity than the eigen-decomposition of the modularity matrix. We illustrate the results with random network benchmarks.

Journal article

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
