Publications from our Researchers

Several of our current PhD candidates and fellow researchers at the Data Science Institute have published, or are in the process of publishing, papers presenting their research. To read one of our most recent papers, Visualizing Dynamic Bitcoin Transaction Patterns, please click here. Its abstract appears below.

Soman, R. K., Birch, D. & Whyte, J. K. (2017). “Framework for shared visualization and real-time information flow to the construction site”, Proceedings of the 24th EG-ICE Workshop on Intelligent Computing in Engineering, 10-12 July 2017, Nottingham, UK.

ABSTRACT: The aim of this paper is to develop a framework for shared visualization between the design office and the construction office, using augmented reality as a platform, with a focus on the security of the Building Information Model. The current paper is part of an ongoing study aimed at creating a real-time bi-directional information flow between the construction office and site and focuses on a shared visualisation context. A framework architecture for enabling shared visualisation, with an emphasis on the security of the Building Information Model, is discussed. A prototype was deployed on an Android device in a controlled environment for testing, and the application augmented Building Information objects dynamically onto the real world without any latency. Salient features of the prototype include dynamic loading of Building Information content at runtime, data encapsulation based on user privileges, and deployability on portable low-end computing devices. Using shared visualization would empower construction engineers with real-time model updates and yield many near-optimal management solutions from which to narrow in on the best solution under the given constraints.


 Visualizing Dynamic Bitcoin Transaction Patterns

  This work presents a systemic top-down visualization of Bitcoin transaction activity to explore dynamically generated patterns of algorithmic behavior. Bitcoin dominates the cryptocurrency markets and presents researchers with a rich source of real-time transactional data. The pseudonymous yet public nature of the data presents opportunities for the discovery of human and algorithmic behavioral patterns of interest to many parties such as financial regulators, protocol designers, and security analysts. However, retaining visual fidelity to the underlying data to retain a fuller understanding of activity within the network remains challenging, particularly in real time. We expose an effective force-directed graph visualization employed in our large-scale data observation facility to accelerate this data exploration and derive useful insight among domain experts and the general public alike. The high-fidelity visualizations demonstrated in this article allowed for collaborative discovery of unexpected high frequency transaction patterns, including automated laundering operations, and the evolution of multiple distinct algorithmic denial of service attacks on the Bitcoin network.




Creating a Chemistry of Sciences with Big Data: Building the Data Science Institute at Imperial College London

Y. Guo, D. Johnson
Data Science Institute, Imperial College London


The Data Science Institute at Imperial College London launched in April 2014, and will provide a hub for data-driven research and education. Its mission is to provide a focal point for the College's capabilities in multidisciplinary data-driven research by coordinating advanced data science research for college scientists and partners, and educating the next generation of data scientists. We surveyed the data-driven research needs at Imperial College London to gain an understanding across all disciplines offered by the College, and analysed the responses to gain insights into scientific and engineering needs for data science research. A clear message is that multidisciplinarity is essential for Big Data and data science research to enable a "chemistry of sciences": connecting all disciplines by integrating data. This paper presents our efforts to best understand data-driven research needs in a highly multidisciplinary research-intensive institution and describes our vision for the future of the Data Science Institute at Imperial College London.
You can read the full paper here.




Search results

    Curcin V, Guo Y, Gilardoni F,

    Scientific Workflow Applied to Nano-and Material Sciences

    Bertone G, Calore F, Caron S, Ruiz R, Kim JS, Trotta R, Weniger C et al., 2016,

    Global analysis of the pMSSM in light of the Fermi GeV excess: prospects for the LHC Run-II and astroparticle experiments

    Heinis T, Ailamaki A, 2015,

    Reconsolidating Data Structures.

    Pages: 665-670

    Heinis T, Ham DA, 2015,

    On-the-Fly Data Synopses: Efficient Data Exploration in the Simulation Sciences

    SIGMOD RECORD, Vol: 44, Pages: 23-28, ISSN: 0163-5808

    Karpathiotakis M, Alagiannis I, Heinis T, Branco M, Ailamaki A et al., 2015,

    Just-In-Time Data Virtualization: Lightweight Data Management with ViDa.

    Tauheed F, Heinis T, Ailamaki A, 2015,

    THERMAL-JOIN: A Scalable Spatial Join for Dynamic Workloads.

    Publisher: ACM, Pages: 939-950

    Guo Y, He S, Guo L, 2014,

    Enhancing Cloud Resource Utilisation using Statistical Analysis

    Resource provisioning based on virtual machines (VMs) has been widely accepted and adopted in cloud computing environments. A key problem resulting from using static scheduling approaches for allocating VMs on different physical machines (PMs) is that resources tend not to be fully utilised. Although some existing cloud reconfiguration algorithms have been developed to address the problem, they normally result in high migration costs and low resource utilisation because they ignore the multi-dimensional characteristics of VMs and PMs. In this paper we present and evaluate a new algorithm for improving resource utilisation for cloud providers. By using a multivariate probabilistic model, our algorithm selects suitable PMs for VM re-allocation, which are then used to generate a reconfiguration plan. We also describe two heuristic metrics which can be used in the algorithm to capture the multi-dimensional characteristics of VMs and PMs. By combining these two heuristic metrics in our experiments, we observed that our approach improves the resource utilisation level by around 8% for cloud providers, such as IC Cloud, which accept user-defined VM configurations, and by around 14% for providers, such as Amazon EC2, which only provide limited types of VM configurations.

    Han R, Ghanem MM, Guo L, Guo Y, Osmond M et al., 2014,

    Enabling cost-aware and adaptive elasticity of multi-tier cloud applications

    He S, Guo L, Guo Y, 2014,

    Elastic application container system: Elastic web applications provisioning

    Handbook of Research on Demand-Driven Web Services: Theory, Technologies, and Applications, Pages: 376-398, ISBN: 9781466658851

    © 2014 by IGI Global. All rights reserved. Cloud applications have been gaining popularity in recent years for their flexibility in resource provisioning according to Web application demands. The Elastic Application Container (EAC) system is a technology that delivers a lightweight virtual resource unit for better resource efficiency and more scalable Web applications in the Cloud. It allows multiple application providers to concurrently run their Web applications on this technology without worrying about changes in demand for their Web applications. This is because the EAC system constantly monitors the resource usage of all hosted Web applications and automatically reacts to changes in their resource usage (i.e. it automatically handles resource provisioning of the Web applications, such as scaling them according to demand). In the chapter, the authors first describe the architecture and components of the EAC system, in order to give a brief overview of the technologies involved. They then present, discuss, and evaluate the resource-provisioning algorithms and techniques used in the EAC system for demand-driven Web applications, so as to give readers a clear picture of how they work. Finally, the authors compare the EAC system with other Cloud technologies in terms of flexibility and resource efficiency.

    Heinis T, 2014,

    Data analysis: approximation aids handling of big data.

    Nature, Vol: 515

    Martin J, Ringeval C, Trotta R, Vennin V et al., 2014,

    The best inflationary models after Planck

    Martin J, Ringeval C, Trotta R, Vennin V et al., 2014,

    Compatibility of Planck and BICEP2 results in light of inflation

    PHYSICAL REVIEW D, Vol: 90, ISSN: 2470-0010

    Nie L, Yang X, Adcock I, Xu Z, Guo Y et al., 2014,

    Inferring Cell-Scale Signalling Networks via Compressive Sensing

    PLOS ONE, Vol: 9, ISSN: 1932-6203

    Strege C, Bertone G, Besjes GJ, Caron S, Ruiz de Austri R, Strubig A, Trotta R et al., 2014,

    Profile likelihood maps of a 15-dimensional MSSM

    Wang M, Zhang W, Ding W, Dai D, Zhang H, Xie H, Chen L, Guo Y, Xie J et al., 2014,

    Parallel Clustering Algorithm for Large-Scale Biological Data Sets

    PLOS ONE, Vol: 9, ISSN: 1932-6203

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
