Jones DT, Sternberg MJE, Thornton JM, 2006, Introduction. Bioinformatics: from molecules to systems, PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES, Vol: 361, Pages: 389-391, ISSN: 0962-8436
In this paper we explore a topic which is at the intersection of two areas of Machine Learning: namely Support Vector Machines (SVMs) and Inductive Logic Programming (ILP). We propose a general method for constructing kernels for Support Vector Inductive Logic Programming (SVILP). The kernel not only captures the semantic and syntactic relational information contained in the data but also provides the flexibility of using arbitrary forms of structured and non-structured data coded in a relational way. While specialised kernels have been developed for strings, trees and graphs our approach uses declarative background knowledge to provide the learning bias. The use of explicitly encoded background knowledge distinguishes SVILP from existing relational kernels which in ILP-terms work purely at the atomic generalisation level. The SVILP approach is a form of generalisation relative to background knowledge, though the final combining function for the ILP-learned clauses is an SVM rather than a logical conjunction. We evaluate SVILP empirically against related approaches, including an industry-standard toxin predictor called TOPKAT. Evaluation is conducted on a new broad-ranging toxicity dataset (DSSTox). The experimental results demonstrate that our approach significantly outperforms all other approaches in the study. © Springer-Verlag Berlin Heidelberg 2006.
Paszkiewicz KH, Sternberg MJE, Lappe M, 2006, Prediction of viable circular permutants using a graph theoretic approach, BIOINFORMATICS, Vol: 22, Pages: 1353-1358, ISSN: 1367-4803
Carter P, Lesk VI, Islam SA, et al., 2005, Protein-protein docking using 3D-dock in rounds 3, 4, and 5 of CAPRI, PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, Vol: 60, Pages: 281-288, ISSN: 0887-3585
Fernandez-Fuentes N, Querol E, Aviles FX, et al., 2005, Prediction of the conformation and geometry of loops in globular proteins: Testing ArchDB, a structural classification of loops, PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, Vol: 60, Pages: 746-757, ISSN: 0887-3585
Madhusudan S, Smart F, Shrimpton P, et al., 2005, Isolation of a small molecule inhibitor of DNA base excision repair, NUCLEIC ACIDS RESEARCH, Vol: 33, Pages: 4711-4724, ISSN: 0305-1048
Muggleton S, Lodhi H, Amini A, et al., 2005, Support vector inductive logic programming, 8th International Conference on Discovery Science, 8-11 October 2005, Singapore, Publisher: Springer-Verlag, Berlin, Pages: 163-175
Pazos F, Ranea JAG, Juan D, et al., 2005, Assessing protein co-evolution in the context of the tree of life assists in the prediction of the interactome, JOURNAL OF MOLECULAR BIOLOGY, Vol: 352, Pages: 1002-1015, ISSN: 0022-2836
Smith GR, Sternberg MJE, Bates PA, 2005, The relationship between the flexibility of proteins and their conformational states on forming protein-protein complexes with an application to protein-protein docking, JOURNAL OF MOLECULAR BIOLOGY, Vol: 347, Pages: 1077-1101, ISSN: 0022-2836
Aguilar D, Aviles FX, Querol E, et al., 2004, Analysis of phenetic trees based on metabolic capabilities across the three domains of life, JOURNAL OF MOLECULAR BIOLOGY, Vol: 340, Pages: 491-512, ISSN: 0022-2836
Espadaler J, Fernandez-Fuentes N, Hermoso A, et al., 2004, ArchDB: automated protein loop classification as a tool for structural genomics, NUCLEIC ACIDS RESEARCH, Vol: 32, Pages: D185-D188, ISSN: 0305-1048
The annotation of protein function has become a crucial problem with the advent of sequence and structural genomics initiatives. A large body of evidence suggests that protein structural information is frequently encoded in local sequences, and that folds are mainly made up of a number of simple local units of super-secondary structural motifs, consisting of a few secondary structures and their connecting loops. Moreover, protein loops play an important role in protein function. Here we present ArchDB, a classification database of structural motifs, consisting of one loop plus its bracing secondary structures. ArchDB currently contains 12 665 super-secondary elements classified into 1496 motif subclasses. The database provides an easy way to retrieve functional information from protein structures sharing a common motif, to search motifs found in a given SCOP family, superfamily or fold, or to search by keywords on proteins with classified loops. The ArchDB database of loops is located at http://sbi.imim.es/archdb.
Fleming K, Müller A, MacCallum RM, et al., 2004, 3D-GENOMICS: a database to compare structural and functional annotations of proteins between sequenced genomes, NUCLEIC ACIDS RESEARCH, Vol: 32, Pages: D245-D250, ISSN: 0305-1048
The 3D-GENOMICS database (http://www.sbg.bio.ic.ac.uk/3dgenomics/) provides structural annotations for proteins from sequenced genomes. In August 2003 the database included data for 93 proteomes. The annotations stored in the database include homologous sequences from various sequence databases, domains from SCOP and Pfam, patterns from Prosite and other predicted sequence features such as transmembrane regions and coiled coils. In addition to annotations at the sequence level, several precomputed cross-proteome comparative analyses are available based on SCOP domain superfamily composition. Annotations are available to the user via a web interface to the database. Multiple points of entry are available so that a user is able to: (i) directly access annotations for a single protein sequence via keywords or accession codes, (ii) examine a sequence of interest chosen from a summary of annotations for a particular proteome, or (iii) access precomputed frequency-based cross-proteome comparative analyses.
Pazos F, Sternberg MJE, 2004, Automated prediction of protein function and detection of functional sites from structure, PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, Vol: 101, Pages: 14754-14759, ISSN: 0027-8424
Smith GR, Sternberg MJE, Bates PA, 2004, Molecular dynamics study of the flexibility of complex-forming proteins, 48th Annual Meeting of the Biophysical Society, Publisher: BIOPHYSICAL SOCIETY, Pages: 413A-413A, ISSN: 0006-3495
Cootes AP, Muggleton SH, Sternberg MJE, 2003, The automatic discovery of structural principles describing protein fold space, JOURNAL OF MOLECULAR BIOLOGY, Vol: 330, Pages: 839-850, ISSN: 0022-2836
Janin J, Henrick K, Moult J, et al., 2003, CAPRI: A Critical Assessment of PRedicted Interactions, PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, Vol: 52, Pages: 2-9, ISSN: 0887-3585
Smith GR, Sternberg MJE, 2003, Evaluation of the 3D-Dock protein docking suite in rounds 1 and 2 of the CAPRI blind trial, PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, Vol: 52, Pages: 74-79, ISSN: 0887-3585
Sternberg MJE, Muggleton SH, 2003, Structure activity relationships (SAR) and pharmacophore discovery using Inductive Logic Programming (ILP), QSAR & COMBINATORIAL SCIENCE, Vol: 22, Pages: 527-532, ISSN: 1611-020X
Alves R, Chaleil RAG, Sternberg MJE, 2002, Evolution of enzymes in metabolism: A network perspective, JOURNAL OF MOLECULAR BIOLOGY, Vol: 320, Pages: 751-770, ISSN: 0022-2836
Alves R, Chaleil RAG, Sternberg MJE, 2002, Evolution of enzymes in metabolism: A network perspective (vol 320, pg 751, 2002), JOURNAL OF MOLECULAR BIOLOGY, Vol: 324, Pages: 387-387, ISSN: 0022-2836
Muller A, MacCallum RM, Sternberg MJE, 2002, Structural characterization of the human proteome, GENOME RESEARCH, Vol: 12, Pages: 1625-1641, ISSN: 1088-9051
Smith GR, Sternberg MJE, 2002, Prediction of protein-protein interactions by docking methods, CURRENT OPINION IN STRUCTURAL BIOLOGY, Vol: 12, Pages: 28-35, ISSN: 0959-440X
Stapley BJ, Kelley LA, Sternberg MJE, 2002, Predicting the sub-cellular location of proteins from text using support vector machines, Pac Symp Biocomput, Pages: 374-385, ISSN: 2335-6928
We present an automatic method to classify the sub-cellular location of proteins based on the text of relevant MEDLINE abstracts. For each protein, a vector of terms is generated from MEDLINE abstracts in which the protein/gene's name or synonym occurs. A Support Vector Machine (SVM) is used to automatically partition the term space and to thus discriminate the textual features that define sub-cellular location. The method is benchmarked on a set of proteins of known sub-cellular location from S. cerevisiae. No prior knowledge of the problem domain nor any natural language processing is used at any stage. The method out-performs support vector machines trained on amino acid composition and has comparable performance to rule-based text classifiers. Combining text with protein amino-acid composition improves recall for some sub-cellular locations. We discuss the generality of the method and its potential application to a variety of biological classification problems.
Aloy P, Querol E, Aviles FX, et al., 2001, Automated structure-based prediction of functional sites in proteins: Applications to assessing the validity of inheriting protein function from homology in genome annotation and to protein docking, JOURNAL OF MOLECULAR BIOLOGY, Vol: 311, Pages: 395-408, ISSN: 0022-2836
Bates PA, Kelley LA, MacCallum RM, et al., 2001, Enhancement of protein modeling by human intervention in applying the automatic programs 3D-JIGSAW and 3D-PSSM, PROTEINS-STRUCTURE FUNCTION AND GENETICS, Pages: 39-46, ISSN: 0887-3585