Goncalves BP, Procter SR, Clifford S, et al., 2021, Estimation of country-level incidence of early-onset invasive Group B Streptococcus disease in infants using Bayesian methods, PLOS COMPUTATIONAL BIOLOGY, Vol: 17, ISSN: 1553-734X
Bottolo L, Banterle M, Richardson S, et al., 2021, A computationally efficient Bayesian seemingly unrelated regressions model for high-dimensional quantitative trait loci discovery, JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS, Vol: 70, Pages: 886-908, ISSN: 0035-9254
Alves AC, De Silva NMG, Karhunen V, et al., 2019, GWAS on longitudinal growth traits reveals different genetic factors influencing infant, child, and adult BMI, Science Advances, Vol: 5, ISSN: 2375-2548
Early childhood growth patterns are associated with adult health, yet the genetic factors and the developmental stages involved are not fully understood. Here we combine genome-wide association studies with modelling of longitudinal growth traits to study the genetics of infant and child growth, followed by functional, pathway, genetic correlation, risk score and co-localization analyses to determine how developmental timings, molecular pathways and genetic determinants of these traits overlap with those of adult health. We found a robust overlap between the genetics of child and adult BMI, with variants associated with adult BMI acting as early as 4-6 years old. However, we demonstrated a completely distinct genetic makeup for peak BMI during infancy, influenced by variation at the LEPR/LEPROT locus. These findings suggest that different genetic factors control infant and child BMI. In light of the obesity epidemic, these findings are important to inform the timing and targets of prevention strategies.
Kessy A, Lewin A, Strimmer K, 2018, Optimal Whitening and Decorrelation, AMERICAN STATISTICIAN, Vol: 72, Pages: 309-314, ISSN: 0003-1305
Hinney A, Kesselmeier M, Jall S, et al., 2017, Evidence for three genetic loci involved in both anorexia nervosa risk and variation of body mass index, Mol Psychiatry, Vol: 22, Pages: 192-201
The maintenance of normal body weight is disrupted in patients with anorexia nervosa (AN) for prolonged periods of time. Prior to the onset of AN, premorbid body mass index (BMI) spans the entire range from underweight to obese. After recovery, patients have reduced rates of overweight and obesity. As such, loci involved in body weight regulation may also be relevant for AN and vice versa. Our primary analysis comprised a cross-trait analysis of the 1000 single-nucleotide polymorphisms (SNPs) with the lowest P-values in a genome-wide association meta-analysis (GWAMA) of AN (GCAN) for evidence of association in the largest published GWAMA for BMI (GIANT). Subsequently we performed sex-stratified analyses for these 1000 SNPs. Functional ex vivo studies on four genes ensued. Lastly, a look-up of GWAMA-derived BMI-related loci was performed in the AN GWAMA. We detected significant associations (P-values <5 × 10(-5), Bonferroni-corrected P<0.05) for nine SNP alleles at three independent loci. Interestingly, all AN susceptibility alleles were consistently associated with increased BMI. None of the genes (chr. 10: CTBP2, chr. 19: CCNE1, chr. 2: CARF and NBEAL1; the latter is a region with high linkage disequilibrium) nearest to these SNPs has previously been associated with AN or obesity. Sex-stratified analyses revealed that the strongest BMI signal originated predominantly from females (chr. 10 rs1561589; Poverall: 2.47 × 10(-06)/Pfemales: 3.45 × 10(-07)/Pmales: 0.043). Functional ex vivo studies in mice revealed reduced hypothalamic expression of Ctbp2 and Nbeal1 after fasting. Hypothalamic expression of Ctbp2 was increased in diet-induced obese (DIO) mice as compared with age-matched lean controls. We observed no evidence for associations for the look-up of BMI-related loci in the AN GWAMA. A cross-trait analysis of AN and BMI loci revealed variants at three chromosomal loci with potential joint impact. The chromosome 10 locus is particularly pr
Ried JS, Jeff JM, Chu AY, et al., 2016, A principal component meta-analysis on multiple anthropometric traits identifies novel loci for body shape, Nature Communications, Vol: 7, ISSN: 2041-1723
Large consortia have revealed hundreds of genetic loci associated with anthropometric traits, one trait at a time. We examined whether genetic variants affect body shape as a composite phenotype that is represented by a combination of anthropometric traits. We developed an approach that calculates averaged PCs (AvPCs) representing body shape derived from six anthropometric traits (body mass index, height, weight, waist and hip circumference, waist-to-hip ratio). The first four AvPCs explain >99% of the variability, are heritable, and associate with cardiometabolic outcomes. We performed genome-wide association analyses for each body shape composite phenotype across 65 studies and meta-analysed summary statistics. We identify six novel loci: LEMD2 and CD47 for AvPC1, RPS6KA5/C14orf159 and GANAB for AvPC3, and ARL15 and ANP32 for AvPC4. Our findings highlight the value of using multiple traits to define complex phenotypes for discovery, which are not captured by single-trait analyses, and may shed light onto new pathways.
Lewin A, Hamilton S, Witkover A, et al., 2016, Free serum haemoglobin is associated with brain atrophy in secondary progressive multiple sclerosis [version 1; peer review: 1 approved, 2 approved with reservations], Wellcome Open Research, Vol: 1, ISSN: 2398-502X
Background A major cause of disability in secondary progressive multiple sclerosis (SPMS) is progressive brain atrophy, whose pathogenesis is not fully understood. The objective of this study was to identify protein biomarkers of brain atrophy in SPMS. Methods We used surface-enhanced laser desorption-ionization time-of-flight mass spectrometry to carry out an unbiased search for serum proteins whose concentration correlated with the rate of brain atrophy, measured by serial MRI scans over a 2-year period in a well-characterized cohort of 140 patients with SPMS. Protein species were identified by liquid chromatography-electrospray ionization tandem mass spectrometry. Results There was a significant (p<0.004) correlation between the rate of brain atrophy and a rise in the concentration of proteins at 15.1 kDa and 15.9 kDa in the serum. Tandem mass spectrometry identified these proteins as alpha-haemoglobin and beta-haemoglobin, respectively. The abnormal concentration of free serum haemoglobin was confirmed by ELISA (p<0.001). The serum lactate dehydrogenase activity was also highly significantly raised (p<10(-12)) in patients with secondary progressive multiple sclerosis. Conclusions An underlying low-grade chronic intravascular haemolysis is a potential source of the iron whose deposition along blood vessels in multiple sclerosis plaques contributes to the neurodegeneration and consequent brain atrophy seen in progressive disease. Chelators of free serum iron will be ineffective in preventing this neurodegeneration, because the iron (Fe(2+)) is chelated by haemoglobin.
Felix JF, Bradfield JP, Monnereau C, et al., 2015, Genome-wide association analysis identifies three new susceptibility loci for childhood body mass index, Human Molecular Genetics, Vol: 25, Pages: 389-403, ISSN: 1460-2083
A large number of genetic loci are associated with adult body mass index. However, the genetics of childhood body mass index are largely unknown. We performed a meta-analysis of genome-wide association studies of childhood body mass index, using sex- and age-adjusted standard deviation scores. We included 35 668 children from 20 studies in the discovery phase and 11 873 children from 13 studies in the replication phase. In total, 15 loci reached genome-wide significance (P-value < 5 × 10(-8)) in the joint discovery and replication analysis, of which 12 are previously identified loci in or close to ADCY3, GNPDA2, TMEM18, SEC16B, FAIM2, FTO, TFAP2B, TNNI3K, MC4R, GPR61, LMX1B and OLFM4 associated with adult body mass index or childhood obesity. We identified three novel loci: rs13253111 near ELP3, rs8092503 near RAB27B and rs13387838 near ADAM23. Per additional risk allele, body mass index increased 0.04 Standard Deviation Score (SDS) [Standard Error (SE) 0.007], 0.05 SDS (SE 0.008) and 0.14 SDS (SE 0.025), for rs13253111, rs8092503 and rs13387838, respectively. A genetic risk score combining all 15 SNPs showed that each additional average risk allele was associated with a 0.073 SDS (SE 0.011, P-value = 3.12 × 10(-10)) increase in childhood body mass index in a population of 1955 children. This risk score explained 2% of the variance in childhood body mass index. This study highlights the shared genetic background between childhood and adult body mass index and adds three novel loci. These loci likely represent age-related differences in strength of the associations with body mass index.
Jaenes J, Hu F, Lewin A, et al., 2015, A comparative study of RNA-seq analysis strategies, BRIEFINGS IN BIOINFORMATICS, Vol: 16, Pages: 932-940, ISSN: 1467-5463
Lewin A, Saadi H, Peters JE, et al., 2015, MT-HESS: an efficient Bayesian approach for simultaneous association detection in OMICS datasets, with application to eQTL mapping in multiple tissues, Bioinformatics, Vol: 32, Pages: 523-532, ISSN: 1367-4803
Motivation: Analysing the joint association between a large set of responses and predictors is a fundamental statistical task in integrative genomics, exemplified by numerous expression Quantitative Trait Loci (eQTL) studies. Of particular interest are the so-called ‘hotspots’, important genetic variants that regulate the expression of many genes. Recently, attention has focussed on whether eQTLs are common to several tissues, cell-types or, more generally, conditions or whether they are specific to a particular condition. Results: We have implemented MT-HESS, a Bayesian hierarchical model that analyses the association between a large set of predictors, e.g. SNPs, and many responses, e.g. gene expression, in multiple tissues, cells or conditions. Our Bayesian sparse regression algorithm goes beyond ‘one-at-a-time’ association tests between SNPs and responses and uses a fully multivariate model search across all linear combinations of SNPs, coupled with a model of the correlation between condition/tissue-specific responses. In addition, we use a hierarchical structure to leverage shared information across different genes, thus improving the detection of hotspots. We show the increase of power resulting from our new approach in an extensive simulation study. Our analysis of two case studies highlights new hotspots that would remain undetected by standard approaches and shows how greater prediction power can be achieved when several tissues are jointly considered.
van der Valk RJP, Kreiner-Moller E, Kooijman MN, et al., 2014, A novel common variant in DCST2 is associated with length in early life and height in adulthood, Human Molecular Genetics, Vol: 24, Pages: 1155-1168, ISSN: 1460-2083
Common genetic variants have been identified for adult height, but not much is known about the genetics of skeletal growth in early life. To identify common genetic variants that influence fetal skeletal growth, we meta-analyzed 22 genome-wide association studies (Stage 1; N = 28 459). We identified seven independent top single nucleotide polymorphisms (SNPs) (P < 1 × 10(-6)) for birth length, of which three were novel and four were in or near loci known to be associated with adult height (LCORL, PTCH1, GPR126 and HMGA2). The three novel SNPs were followed-up in nine replication studies (Stage 2; N = 11 995), with rs905938 in DC-STAMP domain containing 2 (DCST2) genome-wide significantly associated with birth length in a joint analysis (Stages 1 + 2; β = 0.046, SE = 0.008, P = 2.46 × 10(-8), explained variance = 0.05%). Rs905938 was also associated with infant length (N = 28 228; P = 5.54 × 10(-4)) and adult height (N = 127 513; P = 1.45 × 10(-5)). DCST2 is a DC-STAMP-like protein family member and DC-STAMP is an osteoclast cell-fusion regulator. Polygenic scores based on 180 SNPs previously associated with human adult stature explained 0.13% of variance in birth length. The same SNPs explained 2.95% of the variance of infant length. Of the 180 known adult height loci, 11 were genome-wide significantly associated with infant length (SF3B4, LCORL, SPAG17, C6orf173, PTCH1, GDF5, ZNFX1, HHIP, ACAN, HLA locus and HMGA2). This study highlights that common variation in DCST2 influences variation in early growth and adult height.
The genetic sequence variation of people from the Indian subcontinent, who comprise one-quarter of the world's population, is not well described. We carried out whole genome sequencing of 168 South Asians, along with whole-exome sequencing of 147 South Asians to provide deeper characterisation of coding regions. We identify 12,962,155 autosomal sequence variants, including 2,946,861 new SNPs and 312,738 novel indels. This catalogue of SNPs and indels amongst South Asians provides the first comprehensive map of genetic variation in this major human population, and reveals evidence for selective pressures on genes involved in skin biology, metabolism, infection and immunity. Our results will accelerate the search for the genetic variants underlying susceptibility to disorders such as type-2 diabetes and cardiovascular disease which are highly prevalent amongst South Asians.
Kirk P, Witkover A, Bangham CRM, et al., 2013, Balancing the Robustness and Predictive Performance of Biomarkers, JOURNAL OF COMPUTATIONAL BIOLOGY, Vol: 20, Pages: 979-989, ISSN: 1066-5277
Turro E, Lewin AM, 2013, Statistical analysis of mapped reads from mRNA-seq data, Advances in Statistical Bioinformatics, Editors: Vannucci, ISBN: 9781107244917
Thillai M, Eberhardt C, Lewin AM, et al., 2012, Sarcoidosis and tuberculosis cytokine profiles: indistinguishable in bronchoalveolar lavage but different in blood, PLoS One, Vol: 7, ISSN: 1932-6203
BACKGROUND: The clinical, radiological and pathological similarities between sarcoidosis and tuberculosis can make disease differentiation challenging. A complicating factor is that some cases of sarcoidosis may be initiated by mycobacteria. We hypothesised that immunological profiling might provide insight into a possible relationship between the diseases or allow us to distinguish between them. METHODS: We analysed bronchoalveolar lavage (BAL) fluid in sarcoidosis (n = 18), tuberculosis (n = 12) and healthy volunteers (n = 16). We further investigated serum samples in the same groups; sarcoidosis (n = 40), tuberculosis (n = 15) and healthy volunteers (n = 40). A cross-sectional analysis of multiple cytokine profiles was performed and data used to discriminate between samples. RESULTS: We found that BAL profiles were indistinguishable between both diseases and significantly different from healthy volunteers. In sera, tuberculosis patients had significantly lower levels of the Th2 cytokine interleukin-4 (IL-4) than those with sarcoidosis (p = 0.004). Additional serum differences allowed us to create a linear regression model for disease differentiation (within-sample accuracy 91%, cross-validation accuracy 73%). CONCLUSIONS: These data warrant replication in independent cohorts to further develop and validate a serum cytokine signature that may be able to distinguish sarcoidosis from tuberculosis. Systemic Th2 cytokine differences between sarcoidosis and tuberculosis may also underlie different disease outcomes to similar respiratory stimuli.
Kirk PDW, Witkover A, Courtney A, et al., 2011, Plasma proteome analysis in HTLV-1-associated myelopathy/tropical spastic paraparesis, Retrovirology, Vol: 8, ISSN: 1742-4690
Background: Human T lymphotropic virus Type 1 (HTLV-1) causes a chronic inflammatory disease of the central nervous system known as HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM) which resembles chronic spinal forms of multiple sclerosis (MS). The pathogenesis of HAM remains uncertain. To aid in the differential diagnosis of HAM and to identify pathogenetic mechanisms we analysed the plasma proteome in asymptomatic HTLV-1 carriers (ACs), patients with HAM, uninfected controls and patients with MS. We used surface-enhanced laser desorption-ionization (SELDI) mass spectrometry to analyse the plasma proteome in 68 HTLV-1-infected individuals (in two non-overlapping sets, each comprising 17 patients with HAM and 17 ACs), 16 uninfected controls, and 11 patients with secondary progressive MS. Candidate biomarkers were identified by tandem Q-TOF mass spectrometry. Results: The concentrations of three plasma proteins – high [β2-microglobulin], high [Calgranulin B], and low [apolipoprotein A2] – were specifically associated with HAM, independently of proviral load. The plasma [β2-microglobulin] was positively correlated with disease severity. Conclusions: The results indicate that monocytes are activated by contact with activated endothelium in HAM. Using β2-microglobulin and Calgranulin B alone we derive a diagnostic algorithm that correctly classified the disease status (presence or absence of HAM) in 81% of HTLV-1-infected subjects in the cohort.
Turro E, Su S-Y, Goncalves A, et al., 2011, Haplotype and isoform specific expression estimation using multi-mapping RNA-seq reads, GENOME BIOLOGY, Vol: 12, ISSN: 1474-760X
Turro E, Lewin A, Rose A, et al., 2010, MMBGX: a method for estimating expression at the isoform level and detecting differential splicing using whole-transcript Affymetrix arrays (vol 38, pg e4, 2010), NUCLEIC ACIDS RESEARCH, Vol: 38, Pages: 1413-1413, ISSN: 0305-1048
Turro E, Lewin A, Rose A, et al., 2010, MMBGX: a method for estimating expression at the isoform level and detecting differential splicing using whole-transcript Affymetrix arrays, NUCLEIC ACIDS RESEARCH, Vol: 38, ISSN: 0305-1048
Bochkina N, Lewin A, 2009, Bayesian Modeling in Bioinformatics, Bayesian Modeling in Bioinformatics, Editors: Dey, Ghosh, Mallick, Publisher: Chapman & Hall/CRC, ISBN: 9781420070170
These models have been reviewed in some detail in Lewin & Richardson (2007). In this chapter we review our work using Bayesian hierarchical models to select ...
Kulinskaya E, Lewin A, 2009, On fuzzy familywise error rate and false discovery rate procedures for discrete distributions, BIOMETRIKA, Vol: 96, Pages: 201-211, ISSN: 0006-3444
Kulinskaya E, Lewin A, 2009, Testing for linkage and Hardy-Weinberg disequilibrium, ANNALS OF HUMAN GENETICS, Vol: 73, Pages: 253-262, ISSN: 0003-4800
Lewin A, Richardson S, 2008, Bayesian Methods for Microarray Data, Handbook of Statistical Genetics: Third Edition, Pages: 267-295, ISBN: 9780470058305
In this article, we review the use of Bayesian methods for analyzing gene expression data. We focus on methods that select groups of genes on the basis of their expression in RNA samples derived under different experimental conditions. First, we describe Bayesian methods for estimating gene expression level from the intensity measurements obtained from the analysis of microarray images. Next, we discuss the issues involved in assessing differential gene expression between two conditions at a time, including models for classifying the genes as differentially expressed or not. In the last two sections, we present models for grouping gene expression profiles over different experimental conditions, in order to find coexpressed genes, and multivariate models for finding gene signatures, i.e. for selecting a parsimonious group of genes that discriminate between entities such as subtypes of disease. © 2007 John Wiley & Sons, Ltd.
Lewin A, Richardson S, 2007, Handbook of statistical genetics, Handbook of statistical genetics, Editors: Balding, Publisher: Wiley-Interscience, ISBN: 9780470058305
The Handbook for Statistical Genetics is widely regarded as the reference work in the field.
Lewin A, Bochkina N, Richardson S, 2007, Fully bayesian mixture model for differential gene expression: Simulations and model checks, STATISTICAL APPLICATIONS IN GENETICS AND MOLECULAR BIOLOGY, Vol: 6, ISSN: 2194-6302
Lewin A, Grieve IC, 2006, Grouping Gene Ontology terms to improve the assessment of gene set enrichment in microarray data, BMC BIOINFORMATICS, Vol: 7, ISSN: 1471-2105
We present a Bayesian hierarchical model for detecting differentially expressing genes that includes simultaneous estimation of array effects, and show how to use the output for choosing lists of genes for further investigation. We give empirical evidence that expression-level dependent array effects are needed, and explore different nonlinear functions as part of our model-based approach to normalization. The model includes gene-specific variances but imposes some necessary shrinkage through a hierarchical structure. Model criticism via posterior predictive checks is discussed. Modeling the array effects (normalization) simultaneously with differential expression gives fewer false positive results. To choose a list of genes, we propose to combine various criteria (for instance, fold change and overall expression) into a single indicator variable for each gene. The posterior distribution of these variables is used to pick the list of genes, thereby taking into account uncertainty in parameter estimates. In an application to mouse knockout data, Gene Ontology annotations over- and underrepresented among the genes on the chosen list are consistent with biological expectations. © 2005, The International Biometric Society.
Hein A, Lewin A, Richardson S, 2006, Bayesian inference for gene expression and proteomics, Bayesian inference for gene expression and proteomics, Editors: Do, Müller, Vannucci, Publisher: Cambridge Univ Pr, ISBN: 9780521860925
3 Bayesian Hierarchical Models for Inference in Microarray Data ANNE-METTE K. HEIN, ALEX LEWIN, AND SYLVIA RICHARDSON Imperial College Abstract We review ...
Broët P, Lewin A, Richardson S, et al., 2004, A mixture model based strategy for selecting sets of genes in multiclass response microarray experiments, Bioinformatics
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.