Publications from our Researchers

Several of our current PhD candidates and fellow researchers at the Data Science Institute have published, or are in the process of publishing, papers presenting their research.

Search results

  • Conference paper
    Baroukh C, Rowe A, Guo Y, 2010,

    Process Calculi for Systems Biology and applications in severe asthma

    , Publisher: IEEE, Pages: 217-222
  • Conference paper
    Deng X, Guo Y, Ghanem M, 2010,

    Learning Ensemble Models on Categorized Datasets

  • Journal article
    Curcin V, Ghanem M, Guo Y, 2010,

    Polymorphic type framework for scientific workflows with relational data model

    , International Journal of Business Process Integration and Management, Vol: 5, Pages: 45+-45+, ISSN: 1741-8763
  • Journal article
    Curcin V, Ghanem M, Guo Y, 2010,

    The design and implementation of a workflow analysis tool

    , Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, Vol: 368, Pages: 4193-4208
  • Conference paper
    Guo L, Guo Y, Tian X, 2010,

    IC Cloud: A Design Space for Composable Cloud Computing

    , Pages: 394-401
  • Conference paper
    Deng X, Guo Y, Ghanem M, 2009,

    Real-time data mining methodology and a supporting framework.

    , Proceedings of 2nd IEEE International Workshop on Data Mining and Artificial Intelligence DMAI 2009
  • Conference paper
    Deng X, Guo Y, Ghanem M, 2009,

    Dynamic data mining: A novel data mining process model

    , 5th International Conference on Data Mining, DMIN'09
  • Conference paper
    Ma Y, Ghanem M, Guo Y, 2009,

    An Experimental Study of the Distributed Clustering for Air Pollution Pattern Recognition in Sensor Networks

    , Proceedings of the 2009 IADIS European Conference on Data Mining.
  • Journal article
    Parkinson H, Kapushesky M, Kolesnikov N, Rustici G, Shojatalab M, Abeygunawardena N, Berube H, Dylag M, Emam I, Farne A, Holloway E, Lukk M, Malone J, Mani R, Pilicheva E, Rayner TF, Rezwan F, Sharma A, Williams E, Bradley XZ, Adamusiak T, Brandizi M, Burdett T, Coulson R, Krestyaninova M, Kurnosov P, Maguire E, Neogi SG, Rocca-Serra P, Sansone S-A, Sklyar N, Zhao M, Sarkans U, Brazma A et al., 2009,

    ArrayExpress update-from an archive of functional genomics experiments to the atlas of gene expression

    , Nucleic Acids Research, Vol: 37, Pages: D868-D872, ISSN: 0305-1048
  • Journal article
    Munro RE, Guo Y, 2009,

    Solutions for complex, multi data type and multi tool analysis: principles and applications of using workflow and pipelining methods.

    , Methods in molecular biology (Clifton, N.J.), Vol: 563, Pages: 259-271, ISSN: 1064-3745

    Analytical workflow technology, sometimes also called data pipelining, is the fundamental component that provides the scalable analytical middleware that can be used to enable the rapid building and deployment of an analytical application. Analytical workflows enable researchers, analysts and informaticians to integrate and access data and tools from structured and non-structured data sources so that analytics can bridge different silos of information; compose multiple analytical methods and data transformations without coding; rapidly develop applications and solutions by visually constructing analytical workflows that are easy to revise should the requirements change; access domain-specific extensions for specific projects or areas, for example, text extraction, visualisation, reporting, genetics, cheminformatics, bioinformatics and patient-based analytics; automatically deploy workflows directly into web portals and as web services to be part of a service-oriented architecture (SOA). By performing workflow building, using a middleware layer for data integration, it is a relatively simple exercise to visually design an analytical process for data analysis and then publish this as a service to a web browser. All this is encapsulated into what can be referred to as an ‘Embedded Analytics’ methodology which will be described here with examples covering different scientifically focused data analysis problems.

  • Journal article
    Curcin V, Ghanem M, Guo Y, 2009,

    Analysing scientific workflows with Computational Tree Logic

    , Journal of Cluster Computing: Special Issue of Recent Advances in e-Science, ISSN: 1386-7857

    Motivated by the widespread use of workflow systems in e-Science applications, this article introduces a formal analysis framework for the verification and profiling of the control flow aspects of scientific workflows. The framework relies on process algebras that characterise each workflow component with a process behaviour, which is then used to build a CTL state model that can be reasoned about. We demonstrate the benefits of the approach by modelling the control flow behaviour of the Discovery Net system, one of the earliest workflow-based e-Science systems, and present how some key properties of workflows and individual service utilisation can be queried at design time. Our approach is generic and can be applied easily to modelling workflows developed in any other system. It also provides a formal basis for the comparison of control aspects of e-Science workflow systems and a design method for future systems.

  • Conference paper
    Curcin V, Ghanem M, Guo Y, Darlington J et al., 2008,

    Mining adverse drug reactions with e-science workflows.

    , Proceedings of the 4th Cairo International Biomedical Engineering Conference, 2008. CIBEC 2008
  • Book chapter
    Ghanem M, Curcin V, Wendel P, Guo Y et al., 2008,

    Building and using analytical workflows in Discovery Net

    , Data Mining Techniques in Grid Environments. Dubitzky, Werner (Ed)., Publisher: Wiley-Blackwell, Pages: 119-140, ISBN: 9780470512586

    The Discovery Net platform is built around a workflow model for integrating distributed data sources and analytical tools. The platform was originally designed to support the design and execution of distributed data mining tasks within a grid-based environment. However, over the years it has evolved into a generic data analysis platform with applications in such diverse areas as bioinformatics, cheminformatics, text mining and business intelligence. In this work we present our experience in designing the platform and map out the evolution paths for a workflow language, and its architecture, that need to address the requirements of different scientific domains.

  • Journal article
    Ma Y, Richards M, Ghanem M, Guo Y, Hassard J et al., 2008,

    Air pollution monitoring and mining based on sensor grid in London

    , Sensors, Vol: 8, Pages: 3601-3623, ISSN: 1424-8220

    In this paper, we present a distributed infrastructure based on wireless sensors network and Grid computing technology for air pollution monitoring and mining, which aims to develop low-cost and ubiquitous sensor networks to collect real-time, large scale and comprehensive environmental data from road traffic emissions for air pollution monitoring in urban environment. The main informatics challenges in respect to constructing the high-throughput sensor Grid are discussed in this paper. We present a two-layer network framework, a P2P e-Science Grid architecture, and the distributed data mining algorithm as the solutions to address the challenges. We simulated the system in TinyOS to examine the operation of each sensor as well as the networking performance. We also present the distributed data mining result to examine the effectiveness of the algorithm.

  • Conference paper
    Ghanem M, Curcin V, Guo Y, 2008,

    GoTag: A Case Study in Using a Shared UK e-Science Infrastructure for the Automatic Annotation of Medline Documents

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
