Publications from our Researchers

Several of our current PhD candidates and fellow researchers at the Data Science Institute have published, or are in the process of publishing, papers presenting their research.

Search results

    Curcin V, Ghanem M, Guo Y, 2010,

    The design and implementation of a workflow analysis tool

    Curcin V, Ghanem M, Guo Y, 2010,

    Polymorphic type framework for scientific workflows with relational data model

    , International Journal of Business Process Integration and Management, Vol: 5, Pages: 45-62, ISSN: 1741-8763

    Scientific workflow systems provide languages for representing complex scientific processes as decompositions into lower level tasks, down to the level of atomic, executable units. To support data analysis activities, a wide variety of such languages represent data transformation and processing operations as task nodes within a workflow. Adding data type information to the task inputs and outputs allows workflow authors to perform type checking at design time, search for compatible nodes in public component repositories and define specifications of abstract workflows. Introducing support for strict data typing simplifies the implementation of a workflow system in addressing these issues, but at the expense of losing flexibility. We address this challenge by introducing workflow type signatures suitable for use in registries and for type matching, and developing a polymorphic type inference over compositions of such signatures. The focus is on the relational data model, popular in data analysis workflow systems, and the techniques introduced are validated by applying the inference engine prototype to an adverse drug reaction study implemented in the relational algebra subset of the Discovery Net workflow system. Copyright © 2010 Inderscience Enterprises Ltd.

    Deng X, Guo Y, Ghanem M, 2010,

    Learning Ensemble Models on Categorized Datasets.

    , Publisher: CSREA Press, Pages: 242-248
    Guo L, Guo Y, Tia X, 2010,

    IC cloud: A design space for composable cloud computing

    , Pages: 394-401

    Cloud computing has attracted great interest from both academic and industrial communities. Different paradigms, architectures and applications have emerged. However, to the best of our knowledge, only few efforts have been devoted to study the architecture as well as implementation details for building up a cloud computing system. In this paper, we present our design and implementation of Imperial College Cloud (IC Cloud). The goal of IC Cloud is to provide a generic design space where various cloud computing architectures and implementation strategies can be systematically studied. The IC Cloud design strictly follows the SOA principle and incorporates a highly flexible system design approach. © 2010 IEEE.

    He S, Guo Y, Ghanem M, 2010,

    Incremental Learning of Relations from the Most Frequent Patterns in Conversations for Microblogging Services

    , Pages: 33-44
    Huntley DM, Pandis I, Butcher SA, Ackers JP et al., 2010,

    Bioinformatic analysis of Entamoeba histolytica SINE1 elements

    , BMC GENOMICS, Vol: 11, ISSN: 1471-2164
    Kapushesky M, Emam I, Holloway E, Kurnosov P, Zorin A, Malone J, Rustici G, Williams E, Parkinson H, Brazma A et al., 2010,

    Gene Expression Atlas at the European Bioinformatics Institute

    , NUCLEIC ACIDS RESEARCH, Vol: 38, Pages: D690-D698, ISSN: 0305-1048
    Ma Y, Guo Y, Ghanem M, 2010,

    RECA: Referenced energy-based CDS algorithm in wireless sensor networks

    , Int. J. Commun. Syst., Vol: 23, Pages: 125-138, ISSN: 1074-5351
    Curcin V, Ghanem M, Guo Y, 2009,

    Analysing scientific workflows with Computational Tree Logic

    , Journal of Cluster Computing: Special Issue of Recent Advances in e-Science, ISSN: 1386-7857

    Motivated by the widespread use of workflow systems in e-Science applications, this article introduces a formal analysis framework for the verification and profiling of the control flow aspects of scientific workflows. The framework relies on process algebras that characterise each workflow component with a process behaviour, which is then used to build a CTL state model that can be reasoned about. We demonstrate the benefits of the approach by modelling the control flow behaviour of the Discovery Net system, one of the earliest workflow-based e-Science systems, and present how some key properties of workflows and individual service utilisation can be queried at design time. Our approach is generic and can be applied easily to modelling workflows developed in any other system. It also provides a formal basis for the comparison of control aspects of e-Science workflow systems and a design method for future systems.

    David Birch, 2009,

    Unifying Procedural Graphics

    , Imperial College Dept Computing Distinguished Projects

    Modern graphics scenes are complex requiring huge volumes of content to create compelling visual effects. This volume increasingly exceeds current content creation, storage and delivery mechanisms. One solution is procedural or algorithmic graphics which can be executed to generate content on demand. However these algorithms are hard to create - requiring either the artist knowing how to write code or the programmer to be an artist! A large number of procedural graphics techniques have been developed, each with success in its own domain. Unfortunately each formalism currently has to be implemented in separate environments with no unified system for combining procedural graphics frameworks. We present an easy to use, highly expressive environment for the creation of procedural graphics which draws together several types of procedural formalisms including LSystems, CSG-like trees, math based modelling and graphical pipelines. The system would dovetail well with many other procedural frameworks and would map well to implementation on modern General Purpose GPUs.

    Deng X, Guo Y, Ghanem M, 2009,

    Real-time data mining methodology and a supporting framework.

    , Proceedings of 2nd IEEE International Workshop on Data Mining and Artificial Intelligence DMAI 2009
    Deng X, Guo Y, Ghanem M, 2009,

    Dynamic data mining: A novel data mining process model

    , 5th International Conference on Data Mining, DMIN'09
    Ma Y, Ghanem M, Guo Y, 2009,

    An Experimental Study of the Distributed Clustering for Air Pollution Pattern Recognition in Sensor Networks

    , Proceedings of the 2009 IADIS European Conference on Data Mining.
    Munro RE, Guo Y, 2009,

    Solutions for complex, multi data type and multi tool analysis: principles and applications of using workflow and pipelining methods.

    , Methods in molecular biology (Clifton, N.J.), Vol: 563, Pages: 259-271, ISSN: 1064-3745

    Analytical workflow technology, sometimes also called data pipelining, is the fundamental component that provides the scalable analytical middleware that can be used to enable the rapid building and deployment of an analytical application. Analytical workflows enable researchers, analysts and informaticians to integrate and access data and tools from structured and non-structured data sources so that analytics can bridge different silos of information; compose multiple analytical methods and data transformations without coding; rapidly develop applications and solutions by visually constructing analytical workflows that are easy to revise should the requirements change; access domain-specific extensions for specific projects or areas, for example, text extraction, visualisation, reporting, genetics, cheminformatics, bioinformatics and patient-based analytics; automatically deploy workflows directly into web portals and as web services to be part of a service-oriented architecture (SOA). By performing workflow building, using a middleware layer for data integration, it is a relatively simple exercise to visually design an analytical process for data analysis and then publish this as a service to a web browser. All this is encapsulated into what can be referred to as an ’Embedded Analytics’ methodology which will be described here with examples covering different scientifically focused data analysis problems.

    Parkinson H, Kapushesky M, Kolesnikov N, Rustici G, Shojatalab M, Abeygunawardena N, Berube H, Dylag M, Emam I, Farne A, Holloway E, Lukk M, Malone J, Mani R, Pilicheva E, Rayner TF, Rezwan F, Sharma A, Williams E, Bradley XZ, Adamusiak T, Brandizi M, Burdett T, Coulson R, Krestyaninova M, Kurnosov P, Maguire E, Neogi SG, Rocca-Serra P, Sansone S-A, Sklyar N, Zhao M, Sarkans U, Brazma A et al., 2009,

    ArrayExpress update-from an archive of functional genomics experiments to the atlas of gene expression

    , NUCLEIC ACIDS RESEARCH, Vol: 37, Pages: D868-D872, ISSN: 0305-1048

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
