Metherell LA, Guerra-Assunção JA, Sternberg M, et al., 2016, Three-dimensional model of human Nicotinamide Nucleotide Transhydrogenase (NNT) and sequence-structure analysis of its disease-causing variations, Human Mutation, Vol: 37, Pages: 1074-1084, ISSN: 1098-1004
Defective mitochondrial proteins are emerging as major contributors to human disease. Nicotinamide nucleotide transhydrogenase (NNT), a widely expressed mitochondrial protein, has a crucial role in the defence against oxidative stress. NNT variations have recently been reported in patients with familial glucocorticoid deficiency (FGD) and in patients with heart failure. Moreover, knockout animal models suggest that NNT has a major role in diabetes mellitus and obesity. In this study, we used experimental structures of bacterial transhydrogenases to generate a structural model of human NNT (H-NNT). Structure-based analysis allowed the identification of H-NNT residues forming the NAD binding site, the proton canal and the large interaction site on the H-NNT dimer. In addition, we were able to identify key motifs that allow conformational changes adopted by domain III in relation to its functional status, such as the flexible linker between domains II and III and the salt bridge formed by H-NNT Arg882 and Asp830. Moreover, integration of sequence and structure data allowed us to study the structural and functional effect of deleterious amino acid substitutions causing FGD and left ventricular non-compaction cardiomyopathy. In conclusion, interpretation of the function–structure relationship of H-NNT contributes to our understanding of mitochondrial disorders.
Howard SR, Guasti L, Ruiz-Babot G, et al., 2016, IGSF10 mutations dysregulate gonadotropin-releasing hormone neuronal migration resulting in delayed puberty, EMBO Molecular Medicine, Vol: 8, Pages: 626-642, ISSN: 1757-4676
Early or late pubertal onset affects up to 5% of adolescents and is associated with adverse health and psychosocial outcomes. Self-limited delayed puberty (DP) segregates predominantly in an autosomal dominant pattern, but the underlying genetic background is unknown. Using exome and candidate gene sequencing, we have identified rare mutations in IGSF10 in 6 unrelated families, which resulted in intracellular retention with failure in the secretion of mutant proteins. IGSF10 mRNA was strongly expressed in embryonic nasal mesenchyme, during gonadotropin-releasing hormone (GnRH) neuronal migration to the hypothalamus. IGSF10 knockdown caused a reduced migration of immature GnRH neurons in vitro, and perturbed migration and extension of GnRH neurons in a gnrh3:EGFP zebrafish model. Additionally, loss-of-function mutations in IGSF10 were identified in hypothalamic amenorrhea patients. Our evidence strongly suggests that mutations in IGSF10 cause DP in humans, and points to a common genetic basis for conditions of functional hypogonadotropic hypogonadism (HH). While dysregulation of GnRH neuronal migration is known to cause permanent HH, this is the first time that this has been demonstrated as a causal mechanism in DP.
Sternberg MJE, Ostankovitch MI, 2016, Computation Resources for Molecular Biology: A Special Issue, Journal of Molecular Biology, Vol: 428, Pages: 669-670, ISSN: 1089-8638
Mezulis S, Sternberg MJ, Kelley LA, 2015, PhyreStorm: A web server for fast structural searches against the PDB, Journal of Molecular Biology, Vol: 428, Pages: 702-708, ISSN: 1089-8638
The identification of structurally similar proteins can provide a range of biological insights and accordingly the alignment of a query protein to a database of experimentally-determined protein structures is a technique commonly used in the fields of structural and evolutionary biology. The PhyreStorm web server has been designed to provide comprehensive, up-to-date and rapid structural comparisons against the Protein Data Bank (PDB) combined with a rich and intuitive user interface. It is intended that this facility will give biologists inexpert in bioinformatics access to a powerful tool for exploring protein structure relationships beyond what can be achieved by sequence analysis alone. By partitioning the PDB into similar structures, PhyreStorm is able to quickly discard the majority of structures that cannot possibly align well to a query protein, reducing the number of alignments required by an order of magnitude. PhyreStorm is capable of finding 93±2% of all highly similar (TM-score >0.7) structures in the PDB for each query structure, usually in under 60 seconds. PhyreStorm is available at http://www.sbg.bio.ic.ac.uk/phyrestorm/.
Greener J, Sternberg MJE, 2015, AlloPred: prediction of allosteric pockets on proteins using normal mode perturbation analysis, BMC Bioinformatics, Vol: 16, ISSN: 1471-2105
Background: Despite being hugely important in biological processes, allostery is poorly understood and no universal mechanism has been discovered. Allosteric drugs are a largely unexplored prospect with many potential advantages over orthosteric drugs. Computational methods to predict allosteric sites on proteins are needed to aid the discovery of allosteric drugs, as well as to advance our fundamental understanding of allostery. Results: AlloPred, a novel method to predict allosteric pockets on proteins, was developed. AlloPred uses perturbation of normal modes alongside pocket descriptors in a machine learning approach that ranks the pockets on a protein. AlloPred ranked an allosteric pocket top for 23 out of 40 known allosteric proteins, showing comparable and complementary performance to two existing methods. In 28 of 40 cases an allosteric pocket was ranked first or second. The AlloPred web server, freely available at http://www.sbg.bio.ic.ac.uk/allopred/home, allows visualisation and analysis of predictions. The source code and dataset information are also available from this site. Conclusions: Perturbation of normal modes can enhance our ability to predict allosteric sites on proteins. Computational methods such as AlloPred assist drug discovery efforts by suggesting sites on proteins for further experimental study.
Cornish AJ, Filippis I, David A, et al., 2015, Exploring the cellular basis of human disease through a large-scale mapping of deleterious genes to cell types, Genome Medicine, Vol: 7, ISSN: 1756-994X
David A, Sternberg MJ, 2015, The contribution of missense mutations in core and rim residues of protein-protein interfaces to human disease, Journal of Molecular Biology, Vol: 427, Pages: 2886-2898, ISSN: 1089-8638
Missense mutations at protein-protein interaction (PPI) sites, called interfaces, are important contributors to human disease. Interfaces are non-uniform surface areas characterized by two main regions, 'core' and 'rim', which differ in terms of evolutionary conservation and physico-chemical properties. Moreover, within interfaces, only a small subset of residues ('hot spots') is crucial for the binding free energy of the protein-protein complex. We performed a large-scale structural analysis of human single amino acid variations (SAVs) and demonstrated that disease-causing mutations are preferentially located within the interface core, as opposed to the rim (p < 0.01). In contrast, the interface rim is significantly enriched in polymorphisms, similar to the remaining non-interacting surface. Energetic hot spots tend to be enriched in disease-causing mutations compared to non-hot spots (p = 0.05), regardless of their occurrence in core or rim residues. For individual amino acids, the frequency of substitution into a polymorphism or disease-causing mutation differed from that of other amino acids and was related to structural location, as was the type of physico-chemical change introduced by the SAV. In conclusion, this study demonstrated the different distribution and properties of disease-causing SAVs and polymorphisms within different structural regions and in relation to the energetic contribution of amino acids in protein-protein interfaces, thus highlighting the importance of a structural systems biology approach for predicting the effect of SAVs.
Kelley LA, Sternberg MJ, 2015, Partial protein domains: evolutionary insights and bioinformatics challenges, Genome Biology, Vol: 16, Pages: 100-100, ISSN: 1474-760X
Protein domains are generally thought to correspond to units of evolution. New research raises questions about how such domains are defined with bioinformatics tools and sheds light on how evolution has enabled partial domains to be viable.
Kelley LA, Mezulis S, Yates CM, et al., 2015, The Phyre2 web portal for protein modeling, prediction and analysis, Nature Protocols, Vol: 10, Pages: 845-858, ISSN: 1754-2189
Phyre2 is a suite of tools available on the web to predict and analyze protein structure, function and mutations. The focus of Phyre2 is to provide biologists with a simple and intuitive interface to state-of-the-art protein bioinformatics tools. Phyre2 replaces Phyre, the original version of the server for which we previously published a paper in Nature Protocols. In this updated protocol, we describe Phyre2, which uses advanced remote homology detection methods to build 3D models, predict ligand binding sites and analyze the effect of amino acid variants (e.g., nonsynonymous SNPs (nsSNPs)) for a user's protein sequence. Users are guided through results by a simple interface at a level of detail they determine. This protocol will guide users from submitting a protein sequence to interpreting the secondary and tertiary structure of their models, their domain composition and model quality. A range of additional available tools is described to find a protein structure in a genome, to submit large numbers of sequences at once and to automatically run weekly searches for proteins that are difficult to model. The server is available at http://www.sbg.bio.ic.ac.uk/phyre2. A typical structure prediction will be returned between 30 min and 2 h after submission.
Reynolds CR, Muggleton SH, Sternberg MJE, 2015, Incorporating virtual reactions into a logic-based ligand-based virtual screening method to discover new leads, Molecular Informatics, Vol: 34, Pages: 615-625, ISSN: 1868-1751
The use of virtual screening has become increasingly central to the drug development pipeline, with ligand-based virtual screening used to screen databases of compounds to predict their bioactivity against a target. These databases can only represent a small fraction of chemical space, and this paper describes a method of exploring synthetic space by applying virtual reactions to promising compounds within a database, and generating focussed libraries of predicted derivatives. A ligand-based virtual screening tool Investigational Novel Drug Discovery by Example (INDDEx) is used as the basis for a system of virtual reactions. The use of virtual reactions is estimated to open up a potential space of 1.21×10^12 potential molecules. A de novo design algorithm known as Partial Logical-Rule Reactant Selection (PLoRRS) is introduced and incorporated into the INDDEx methodology. PLoRRS uses logical rules from the INDDEx model to select reactants for the de novo generation of potentially active products. The PLoRRS method is found to increase significantly the likelihood of retrieving molecules similar to known actives with a p-value of 0.016. Case studies demonstrate that the virtual reactions produce molecules highly similar to known actives, including known blockbuster drugs.
Lewis TE, Sillitoe I, Andreeva A, et al., 2015, Genome3D: exploiting structure to help users understand their sequences, Nucleic Acids Research, Vol: 43, Pages: D382-D386, ISSN: 0305-1048
Di Fruscia P, Zacharioudakis E, Liu C, et al., 2015, The Discovery of a Highly Selective 5,6,7,8-Tetrahydrobenzo[4,5]thieno[2,3-d]pyrimidin-4(3H)-one SIRT2 Inhibitor that is Neuroprotective in an in vitro Parkinson's Disease Model, ChemMedChem, Vol: 10, Pages: 69-82, ISSN: 1860-7179
Todd S, Todd P, Fol Leymarie F, et al., 2015, FoldSynth: Interactive 2D/3D visualisation platform for molecular strands, Pages: 41-50
© The Eurographics Association 2015. FoldSynth is an interactive platform designed to help understand the characteristics and commonly used visual abstractions of molecular strands with an emphasis on proteins and DNA. It uses a simple model of molecular forces to give real time interactive animations of the folding and docking processes. The shape of a molecular strand is shown as a 3D visualisation floating above a 2D triangular matrix representing distance constraints, contact maps or other features of residue pairs. As well as more conventional raster plots, contact maps can be shown with vectors representing the grouping of contacts as secondary structures. The 2D visualisation is also interactive and can be used to manipulate a molecule, define constraints, control and view the folding dynamically, or even design new molecules. While the 3D visualisation is more realistic, showing a molecule representation approximating the physical behavior and spatial properties, the 2D visualisation offers greater visibility, in that all molecular positions (and pairings) are always in view; the 3D mode may suffer occlusions and create complex views which are typically hard for humans to understand.
Irimia M, Weatheritt RJ, Ellis JD, et al., 2014, A Highly Conserved Program of Neuronal Microexons Is Misregulated in Autistic Brains, Cell, Vol: 159, Pages: 1511-1523, ISSN: 0092-8674
Alternative splicing (AS) generates vast transcriptomic and proteomic complexity. However, which of the myriad of detected AS events provide important biological functions is not well understood. Here, we define the largest program of functionally coordinated, neural-regulated AS described to date in mammals. Relative to all other types of AS within this program, 3-15 nucleotide "microexons" display the most striking evolutionary conservation and switch-like regulation. These microexons modulate the function of interaction domains of proteins involved in neurogenesis. Most neural microexons are regulated by the neuronal-specific splicing factor nSR100/SRRM4, through its binding to adjacent intronic enhancer motifs. Neural microexons are frequently misregulated in the brains of individuals with autism spectrum disorder, and this misregulation is associated with reduced levels of nSR100. The results thus reveal a highly conserved program of dynamic microexon regulation associated with the remodeling of protein-interaction networks during neurogenesis, the misregulation of which is linked to autism.
Talman AM, Prieto JH, Marques S, et al., 2014, Proteomic analysis of the Plasmodium male gamete reveals the key role for glycolysis in flagellar motility, Malaria Journal, Vol: 13
Yates CM, Filippis I, Kelley LA, et al., 2014, SuSPect: Enhanced Prediction of Single Amino Acid Variant (SAV) Phenotype Using Network Features, Journal of Molecular Biology, Vol: 426, Pages: 2692-2701, ISSN: 0022-2836
Yates CM, Sternberg MJE, 2013, The Effects of Non-Synonymous Single Nucleotide Polymorphisms (nsSNPs) on Protein-Protein Interactions, Journal of Molecular Biology, Vol: 425, Pages: 3949-3963, ISSN: 0022-2836
Alexov E, Sternberg M, 2013, Understanding Molecular Effects of Naturally Occurring Genetic Differences, Journal of Molecular Biology, Vol: 425, Pages: 3911-3913, ISSN: 0022-2836
Adzhubei AA, Sternberg MJE, Makarov AA, 2013, Polyproline-II Helix in Proteins: Structure and Function, Journal of Molecular Biology, Vol: 425, Pages: 2100-2132, ISSN: 0022-2836
Yates CM, Sternberg MJE, 2013, Proteins and Domains Vary in Their Tolerance of Non-Synonymous Single Nucleotide Polymorphisms (nsSNPs), Journal of Molecular Biology, Vol: 425, Pages: 1274-1286, ISSN: 0022-2836
Bryant WA, Sternberg MJE, Pinney JW, 2013, AMBIENT: Active Modules for Bipartite Networks - using high-throughput transcriptomic data to dissect metabolic response, BMC Systems Biology, Vol: 7, ISSN: 1752-0509
Radivojac P, Clark WT, Oron TR, et al., 2013, A large-scale evaluation of computational protein function prediction, Nature Methods, Vol: 10, Pages: 221-227, ISSN: 1548-7091
Mao C, Shukla M, Larrouy-Maumus G, et al., 2013, Functional assignment of Mycobacterium tuberculosis proteome revealed by genome-scale fold-recognition, Tuberculosis, Vol: 93, Pages: 40-46, ISSN: 1472-9792
Janin J, Sternberg MJE, 2013, Protein flexibility, not disorder, is intrinsic to molecular recognition, F1000 Biol Rep, Vol: 5, ISSN: 1757-594X
An 'intrinsically disordered protein' (IDP) is assumed to be unfolded in the cell and perform its biological function in that state. We contend that most intrinsically disordered proteins are in fact proteins waiting for a partner (PWPs), parts of a multi-component complex that do not fold correctly in the absence of other components. Flexibility, not disorder, is an intrinsic property of proteins, exemplified by X-ray structures of many enzymes and protein-protein complexes. Disorder is often observed with purified proteins in vitro and sometimes also in crystals, where it is difficult to distinguish from flexibility. In the crowded environment of the cell, disorder is not compatible with the known mechanisms of protein-protein recognition, and, foremost, with its specificity. The self-assembly of multi-component complexes may, nevertheless, involve the specific recognition of nascent polypeptide chains that are incompletely folded, but then disorder is transient, and it must remain under the control of molecular chaperones and of the quality control apparatus that obviates the toxic effects it can have on the cell.
Lewis TE, Sillitoe I, Andreeva A, et al., 2013, Genome3D: a UK collaborative project to annotate genomic sequences with predicted 3D structures based on SCOP and CATH domains, Nucleic Acids Res, Vol: 41, Pages: D499-D507, ISSN: 1362-4962
Genome3D, available at http://www.genome3d.eu, is a new collaborative project that integrates UK-based structural resources to provide a unique perspective on sequence-structure-function relationships. Leading structure prediction resources (DomSerf, FUGUE, Gene3D, pDomTHREADER, Phyre and SUPERFAMILY) provide annotations for UniProt sequences to indicate the locations of structural domains (structural annotations) and their 3D structures (structural models). Structural annotations and 3D model predictions are currently available for three model genomes (Homo sapiens, E. coli and baker's yeast), and the project will extend to other genomes in the near future. As these resources exploit different strategies for predicting structures, the main aim of Genome3D is to enable comparisons between all the resources so that biologists can see where predictions agree and are therefore more trusted. Furthermore, as these methods differ in whether they build their predictions using CATH or SCOP, Genome3D also contains the first official mapping between these two databases. This has identified pairs of similar superfamilies from the two resources at various degrees of consensus (532 bronze pairs, 527 silver pairs and 370 gold pairs).
David A, Kelley LA, Sternberg MJE, 2012, A new structural model of the acid-labile subunit: pathogenetic mechanisms of short stature-causing mutations, Journal of Molecular Endocrinology, Vol: 49, Pages: 213-220, ISSN: 0952-5041
Sternberg MJE, Tamaddoni-Nezhad A, Lesk VI, et al., 2012, Gene Function Hypotheses for the Campylobacter jejuni Glycome Generated by a Logic-Based Approach, Journal of Molecular Biology, Vol: 425, Pages: 186-197, ISSN: 1089-8638
Increasingly, experimental data on biological systems are obtained from several sources and computational approaches are required to integrate this information and derive models for the function of the system. Here, we demonstrate the power of a logic-based machine learning approach to propose hypotheses for gene function integrating information from two diverse experimental approaches. Specifically, we use inductive logic programming that automatically proposes hypotheses explaining the empirical data with respect to logically encoded background knowledge. We study the capsular polysaccharide biosynthetic pathway of the major human gastrointestinal pathogen Campylobacter jejuni. We consider several key steps in the formation of capsular polysaccharide consisting of 15 genes of which 8 have assigned function, and we explore the extent to which functions can be hypothesised for the remaining 7. Two sources of experimental data provide the information for learning—the results of knockout experiments on the genes involved in capsule formation and the absence/presence of capsule genes in a multitude of strains of different serotypes. The machine learning uses the pathway structure as background knowledge. We propose assignments of specific genes to five previously unassigned reaction steps. For four of these steps, there was an unambiguous optimal assignment of gene to reaction, and to the fifth, there were three candidate genes. Several of these assignments were consistent with additional experimental results. We therefore show that the logic-based methodology provides a robust strategy to integrate results from different experimental approaches and propose hypotheses for the behaviour of a biological system.
Lin D, Chen J, Watanabe H, et al., 2012, Does multi-clause learning help in real-world applications?, Pages: 221-237, ISSN: 0302-9743
The ILP system Progol is incomplete in not being able to generalise a single example to multiple clauses. This limitation is referred to as single-clause learning (SCL) in this paper. However, according to the Blumer bound, incomplete learners such as Progol can have higher predictive accuracy while using less search than more complete learners. This issue is particularly relevant in real-world problems, in which it is unclear whether the unknown target theory or its approximation is within the hypothesis space of the incomplete learner. This paper uses two real-world applications in systems biology to study whether it is necessary to have complete multi-clause learning (MCL) methods, which are computationally expensive but capable of deriving multi-clause hypotheses at the systems level. The experimental results show that in both applications there do exist datasets in which MCL has significantly higher predictive accuracies than SCL. On the other hand, MCL does not outperform SCL all the time, due to the existence of the target hypothesis or its approximations within the hypothesis space of SCL. © 2012 Springer-Verlag Berlin Heidelberg.
Wass MN, Stanway R, Blagborough AM, et al., 2012, Proteomic analysis of Plasmodium in the mosquito: progress and pitfalls, Parasitology, Vol: 139, Pages: 1131-1145, ISSN: 1469-8161
Here we discuss proteomic analyses of whole cell preparations of the mosquito stages of malaria parasite development (i.e. gametocytes, microgamete, ookinete, oocyst and sporozoite) of Plasmodium berghei. We also include critiques of the proteomes of two cell fractions from the purified ookinete, namely the micronemes and cell surface. Whereas we summarise key biological interpretations of the data, we also try to identify key methodological constraints we have met, only some of which we were able to resolve. Recognising the need to translate the potential of current genome sequencing into functional understanding, we report our efforts to develop more powerful combinations of methods for the in silico prediction of protein function and location. We have applied this analysis to the proteome of the male gamete, a cell whose very simple structural organisation facilitated interpretation of data. Some of the in silico predictions made have now been supported by ongoing protein tagging and genetic knockout studies. We hope this discussion may assist future studies.
Santos JCA, Nassif H, Page D, et al., 2012, Automated identification of protein-ligand interaction features using Inductive Logic Programming: a hexose binding case study, BMC Bioinformatics, Vol: 13, ISSN: 1471-2105