Stapley BJ, Kelley LA, Sternberg MJE, 2002, Predicting the sub-cellular location of proteins from text using support vector machines., Pac Symp Biocomput, Pages: 374-385, ISSN: 2335-6928
We present an automatic method to classify the sub-cellular location of proteins based on the text of relevant MEDLINE abstracts. For each protein, a vector of terms is generated from MEDLINE abstracts in which the protein/gene's name or synonym occurs. A Support Vector Machine (SVM) is used to automatically partition the term space and to thus discriminate the textual features that define sub-cellular location. The method is benchmarked on a set of proteins of known sub-cellular location from S. cerevisiae. No prior knowledge of the problem domain nor any natural language processing is used at any stage. The method out-performs support vector machines trained on amino acid composition and has comparable performance to rule-based text classifiers. Combining text with protein amino-acid composition improves recall for some sub-cellular locations. We discuss the generality of the method and its potential application to a variety of biological classification problems.
Turcotte M, Muggleton SH, Sternberg MJE, 2001, Generating protein three-dimensional fold signatures using inductive logic programming, COMPUTERS & CHEMISTRY, Vol: 26, Pages: 57-64, ISSN: 0097-8485
Shiels C, Islam SA, Vatcheva R, et al., 2001, PML bodies associate specifically with the MHC gene cluster in interphase nuclei, Journal of Cell Science, Vol: 114, Pages: 3705-3716, ISSN: 0021-9533
Promyelocytic leukemia (PML) bodies are nuclear multiprotein domains. The observations that viruses transcribe their genomes adjacent to PML bodies and that nascent RNA accumulates at their periphery suggest that PML bodies function in transcription. We have used immuno-FISH in primary human fibroblasts to determine the 3D spatial organisation of gene-rich and gene-poor chromosomal regions relative to PML bodies. We find a highly non-random association of the gene-rich major histocompatibility complex (MHC) on chromosome 6 with PML bodies. This association is specific for the centromeric end of the MHC and extends over a genomic region of at least 1.6 megabases. We also show that PML association is maintained when a subsection of this region is integrated into another chromosomal location. This is the first demonstration that PML bodies have specific chromosomal associations and supports a model for PML bodies as part of a functional nuclear compartment.
Saqi MAS, Sternberg MJE, 2001, A structural census of metabolic networks for E. coli, JOURNAL OF MOLECULAR BIOLOGY, Vol: 313, Pages: 1195-1206, ISSN: 0022-2836
Shiels C, Islam SA, Vatcheva R, et al., 2001, PML bodies associate specifically with the MHC gene cluster in interphase nuclei, JOURNAL OF CELL SCIENCE, Vol: 114, Pages: 3705-3716, ISSN: 0021-9533
Aloy P, Querol E, Aviles FX, et al., 2001, Automated structure-based prediction of functional sites in proteins: Applications to assessing the validity of inheriting protein function from homology in genome annotation and to protein docking, JOURNAL OF MOLECULAR BIOLOGY, Vol: 311, Pages: 395-408, ISSN: 0022-2836
Jennings AJ, Edge CM, Sternberg MJE, 2001, An approach to improving multiple alignments of protein sequences using predicted secondary structure, PROTEIN ENGINEERING, Vol: 14, Pages: 227-231, ISSN: 0269-2139
Turcotte M, Muggleton SH, Sternberg MJE, 2001, The effect of relational background knowledge on learning of protein three-dimensional fold signatures, 8th International Conference on Inductive Logic Programming (ILP98), Publisher: SPRINGER, Pages: 81-95, ISSN: 0885-6125
Turcotte M, Muggleton SH, Sternberg MJE, 2001, Automated discovery of structural signatures of protein fold and function, JOURNAL OF MOLECULAR BIOLOGY, Vol: 306, Pages: 591-605, ISSN: 0022-2836
Bates PA, Kelley LA, MacCallum RM, et al., 2001, Enhancement of protein modeling by human intervention in applying the automatic programs 3D-JIGSAW and 3D-PSSM, PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, Pages: 39-46, ISSN: 0887-3585
Huyton T, Bates PA, Zhang XD, et al., 2000, The BRCA1 C-terminal domain: structure and function, MUTATION RESEARCH-DNA REPAIR, Vol: 460, Pages: 319-332, ISSN: 0921-8777
Kelley LA, MacCallum RM, Sternberg MJE, 2000, Enhanced genome annotation using structural profiles in the program 3D-PSSM, JOURNAL OF MOLECULAR BIOLOGY, Vol: 299, Pages: 499-520, ISSN: 0022-2836
Kelley LA, MacCallum RM, Sternberg MJE, 2000, Enhanced genome annotation using structural profiles in the program 3D-PSSM, Journal of Molecular Biology, Vol: 299, Pages: 501-522, ISSN: 0022-2836
A method (three-dimensional position-specific scoring matrix, 3D-PSSM) to recognise remote protein sequence homologues is described. The method combines the power of multiple sequence profiles with knowledge of protein structure to provide enhanced recognition and thus functional assignment of newly sequenced genomes. The method uses structural alignments of homologous proteins of similar three-dimensional structure in the structural classification of proteins (SCOP) database to obtain a structural equivalence of residues. These equivalences are used to extend multiply aligned sequences obtained by standard sequence searches. The resulting large superfamily-based multiple alignment is converted into a PSSM. Combined with secondary structure matching and solvation potentials, 3D-PSSM can recognise structural and functional relationships beyond state-of-the-art sequence methods. In a cross-validated benchmark on 136 homologous relationships unambiguously undetectable by position-specific iterated basic local alignment search tool (PSI-Blast), 3D-PSSM can confidently assign 18%. The method was applied to the remaining unassigned regions of the Mycoplasma genitalium genome and an additional 13 regions were assigned with 95% confidence. 3D-PSSM is available to the community as a web server: http://www.bmm.icnet.uk/servers/3dpssm.
MacCallum RM, Kelley LA, Sternberg MJE, 2000, SAWTED: Structure Assignment With Text Description - Enhanced detection of remote homologues with automated SWISS-PROT annotation comparisons, BIOINFORMATICS, Vol: 16, Pages: 125-129, ISSN: 1367-4803
Sternberg MJ, Gabb HA, Jackson RM, et al., 2000, Protein-protein docking. Generation and filtering of complexes., Methods Mol Biol, Vol: 143, Pages: 399-415, ISSN: 1064-3745
Muller A, MacCallum RM, Sternberg MJE, 1999, Benchmarking PSI-BLAST in genome annotation, JOURNAL OF MOLECULAR BIOLOGY, Vol: 293, Pages: 1257-1271, ISSN: 0022-2836
Sternberg MJE, Bates PA, Kelley LA, et al., 1999, Progress in protein structure prediction: assessment of CASP3, CURRENT OPINION IN STRUCTURAL BIOLOGY, Vol: 9, Pages: 368-373, ISSN: 0959-440X
Moont G, Gabb HA, Sternberg MJE, 1999, Use of pair potentials across protein interfaces in screening predicted docked complexes, Proteins: Structure, Function and Genetics, Vol: 35, Pages: 364-373, ISSN: 0887-3585
Empirical residue-residue pair potentials are used to screen possible complexes for protein-protein dockings. A correct docking is defined as a complex with not more than 2.5 Å root-mean-square distance from the known experimental structure. The complexes were generated by 'ftdock' (Gabb et al. J Mol Biol 1997;272:106-120) that ranks using shape complementarity. The complexes studied were 5 enzyme-inhibitors and 2 antibody-antigens, starting from the unbound crystallographic coordinates, with a further 2 antibody-antigens where the antibody was from the bound crystallographic complex. The pair potential functions tested were derived both from observed intramolecular pairings in a database of nonhomologous protein domains, and from observed intermolecular pairings across the interfaces in sets of nonhomologous heterodimers and homodimers. Out of various alternate strategies, we found the optimal method used a mole-fraction calculated random model from the intramolecular pairings. For all the systems, a correct docking was placed within the top 12% of the pair potential score ranked complexes. A combined strategy was developed that incorporated 'multidock,' a side-chain refinement algorithm (Jackson et al. J Mol Biol 1998;276:265-285). This placed a correct docking within the top 5 complexes for enzyme-inhibitor systems, and within the top 40 complexes for antibody-antigen systems.
Betts MJ, Sternberg MJE, 1999, An analysis of conformational changes on protein-protein association: implications for predictive docking, PROTEIN ENGINEERING, Vol: 12, Pages: 271-283, ISSN: 0269-2139
Elofsson A, Godzik A, Jones D, et al., 1999, CAFASP-1: Critical assessment of fully automated structure prediction methods, Proteins: Structure, Function and Genetics, Vol: 37, Pages: 209-217, ISSN: 0887-3585
The results of the first Critical Assessment of Fully Automated Structure Prediction (CAFASP-1) are presented. The objective was to evaluate the success rates of fully automatic web servers for fold recognition which are available to the community. This study was based on the targets used in the third meeting on the Critical Assessment of Techniques for Protein Structure Prediction (CASP-3). However, unlike CASP-3, the study was not a blind trial, as it was held after the structures of the targets were known. The aim was to assess the performance of methods without the user intervention that several groups used in their CASP-3 submissions. Although it is clear that 'human plus machine' predictions are superior to automated ones, this CAFASP-1 experiment is extremely valuable for users of our methods; it provides an indication of the performance of the methods alone, and not of the 'human plus machine' performance assessed in CASP. This information may aid users in choosing which programs they wish to use and in evaluating the reliability of the programs when applied to their specific prediction targets. In addition, evaluation of fully automated methods is particularly important to assess their applicability at genomic scales. For each target, groups submitted the top-ranking folds generated from their servers. In CAFASP-1 we concentrated on fold-recognition web servers only and evaluated only recognition of the correct fold, and not, as in CASP-3, alignment accuracy. Although some performance differences appeared within each of the four target categories used here, overall, no single server has proved markedly superior to the others. The results showed that current fully automated fold recognition servers can often identify remote similarities when pairwise sequence search methods fail. Nevertheless, in only a few cases outside the family-level targets has the score of the top-ranking fold been significant enough to allow for a confident fully automated prediction.
Bates PA, Sternberg MJE, 1999, Model building by comparison at CASP3: Using expert knowledge and computer automation, Proteins: Structure, Function and Genetics, Vol: 37, Pages: 47-54, ISSN: 0887-3585
Ten models were constructed for the comparative modeling section of the Critical Assessment of Techniques for Protein Structure Prediction-3 (CASP3). Sequence identity between each target and the best possible parent(s) ranged between 12% and 64%. The modeling protocol is a mixture of automated computer algorithms with human intervention at certain critical stages. In particular, intervention is required to check sequence alignments and the selection of parameters for various computer programs. Seven of the targets were constructed from single-parent templates, and three were constructed from multiple parents. The reasons for such a high ratio of modeling from single parents only are discussed. Models constructed from multiple parents were found to be more accurate than models constructed from single parents only. A novel loop-modeling algorithm is presented that consists of fragment database searches, several fragment libraries, and mean-field calculations on representative fragment candidates.
, 1999, Recognition of remote protein homologies using three-dimensional information to generate a position specific scoring matrix in the program 3D-PSSM, Pages: 218-225
A method (3D-PSSM) to recognize remote protein sequence homologues is described. The method uses homologous proteins of similar three-dimensional structure in the SCOP database to obtain a structural equivalence of residues. These equivalences are used to extend multiply-aligned sequences obtained by standard sequence searches (i.e. 1D-profiles). The resultant 3D profile is converted into a position specific scoring matrix (a 3D-PSSM). The approach is benchmarked on recognizing remote homologues in the SCOP database and comparing the hit and error rates. 3D-PSSMs are compared with 1D-PSSMs and with two widely-used sensitive search approaches - PSI-BLAST and global dynamic programming using the BLOSUM62 matrix. In a cross-validated benchmark, 3D-PSSMs and 1D-PSSMs achieved similar results and both have lower error rates compared to the other two methods when recognizing remote homologues. The combination of 1D- and 3D-PSSMs provides improved performance over either individual method and thus can identify remote homologies that would not be detected by PSI-BLAST. It is envisaged that 3D-PSSM can complement current homology searches in a two-stage approach in which 3D-PSSMs will follow an initial search using PSI-BLAST or dynamic programming.
Bates PA, Dokurno P, Freemont PS, et al., 1998, Conformational analysis of the first observed non-proline cis-peptide bond occurring within the complementarity determining region (CDR) of an antibody, JOURNAL OF MOLECULAR BIOLOGY, Vol: 284, Pages: 549-555, ISSN: 0022-2836
Dokurno P, Bates PA, Band HA, et al., 1998, Crystal structure at 1.95 angstrom resolution of the breast tumour-specific antibody SM3 complexed with its peptide epitope reveals novel hypervariable loop recognition, JOURNAL OF MOLECULAR BIOLOGY, Vol: 284, Pages: 713-728, ISSN: 0022-2836
, 1998, Modelling repressor proteins docking to DNA, Proteins: Structure, Function and Genetics, Vol: 33, Pages: 535-549, ISSN: 0887-3585
The docking of repressor proteins to DNA starting from the unbound protein and model-built DNA coordinates is modeled computationally. The approach was evaluated on eight repressor/DNA complexes that employed different modes for protein/DNA recognition. The global search is based on a protein-protein docking algorithm that evaluates shape and electrostatic complementarity, which was modified to consider the importance of electrostatic features in DNA-protein recognition. Complexes were then ranked by an empirical score for the observed amino acid/nucleotide pairings (i.e., protein-DNA pair potentials) derived from a database of 20 protein/DNA complexes. A good prediction had at least 65% of the correct contacts modeled. This approach was able to identify a good solution at rank four or better for three out of the eight complexes. Predicted complexes were filtered by a distance constraint based on experimental data defining the DNA footprint. This improved coverage to four out of eight complexes having a good model at rank four or better. The additional use of amino acid mutagenesis and phylogenetic data defining residues on the repressor resulted in between 2 and 27 models that would have to be examined to find a good solution for seven of the eight test systems. This study shows that starting with unbound coordinates one can predict three-dimensional models for protein/DNA complexes that do not involve gross conformational changes on association.
Zhang XD, Morera S, Bates PA, et al., 1998, Structure of an XRCC1 BRCT domain: a new protein-protein interaction module, EMBO JOURNAL, Vol: 17, Pages: 6404-6411, ISSN: 0261-4189