Publications from our Researchers

Several of our current PhD candidates and fellow researchers at the Data Science Institute have published, or are in the process of publishing, papers presenting their research.


  • Journal article
    de Montjoye YKJV,

    Privacy by design in big data: An overview of privacy enhancing technologies in the era of big data analytics

    , arXiv

    The extensive collection and processing of personal information in big data analytics has given rise to serious privacy concerns, related to wide scale electronic surveillance, profiling, and disclosure of private data. To reap the benefits of analytics without invading the individuals' private sphere, it is essential to draw the limits of big data processing and integrate data protection safeguards in the analytics value chain. ENISA, with the current report, supports this approach and the position that the challenges of ...

  • Book chapter
    de Montjoye YKJV, 2015,

    Modeling and Understanding Intrinsic Characteristics of Human Mobility

    , Social Phenomena From Data Analysis to Models, Publisher: Springer, ISBN: 9783319140117

    Humans are intrinsically social creatures and our mobility is central to understanding how our societies grow and function. Movement allows us to congregate with our peers, access things we need, and exchange information. Human mobility has huge impacts on topics like urban and transportation planning, social and biologic spreading, and economic outcomes. So far, modeling these processes has been hindered by a lack of data. This is radically changing with the rise of ubiquitous devices. In this chapter, we discuss recent progress deriving insights from the massive, high resolution data sets collected from mobile phones and other devices. We begin with individual mobility, where empirical evidence and statistical models have shown important intrinsic and universal characteristics about our movement: we, as humans, are fundamentally slow to explore new places, relatively predictable, and mostly unique. We then explore methods of modeling aggregate movement of people from place to place and discuss how these estimates can be used to understand and optimize transportation infrastructure. Finally, we highlight applications of these findings to the dynamics of disease spread, social networks, and economic outcomes.

  • Journal article
    Rivera-Rubio J, Alexiou I, Bharath AA, 2015,

    Appearance-based indoor localization: a comparison of patch descriptor performance

    , Pattern Recognition Letters, Vol: 66, Pages: 109-117, ISSN: 1872-7344

    Vision is one of the most important of the senses, and humans use it extensively during navigation. We evaluated different types of image and video frame descriptors that could be used to determine distinctive visual landmarks for localizing a person based on what is seen by a camera that they carry. To do this, we created a database containing over 3 km of video-sequences with ground-truth in the form of distance travelled along different corridors. Using this database, the accuracy of localization—both in terms of knowing which route a user is on—and in terms of position along a certain route, can be evaluated. For each type of descriptor, we also tested different techniques to encode visual structure and to search between journeys to estimate a user’s position. The techniques include single-frame descriptors, those using sequences of frames, and both color and achromatic descriptors. We found that single-frame indexing worked better within this particular dataset. This might be because the motion of the person holding the camera makes the video too dependent on individual steps and motions of one particular journey. Our results suggest that appearance-based information could be an additional source of navigational data indoors, augmenting that provided by, say, radio signal strength indicators (RSSIs). Such visual information could be collected by crowdsourcing low-resolution video feeds, allowing journeys made by different users to be associated with each other, and location to be inferred without requiring explicit mapping. This offers a complementary approach to methods based on simultaneous localization and mapping (SLAM) algorithms.

  • Conference paper
    Rivera-Rubio J, Alexiou I, Bharath AA, 2015,

    Associating Locations Between Indoor Journeys from Wearable Cameras

    , 13th European Conference on Computer Vision (ECCV), Publisher: SPRINGER-VERLAG BERLIN, Pages: 29-44, ISSN: 0302-9743
  • Conference paper
    Tauheed F, Heinis T, Ailamaki A, 2015,

    THERMAL-JOIN: A Scalable Spatial Join for Dynamic Workloads

    , Pages: 939-950
  • Conference paper
    Heinis T, Ailamaki A, 2015,

    Reconsolidating Data Structures

    , Pages: 665-670
  • Conference paper
    Karpathiotakis M, Alagiannis I, Heinis T, Branco M, Ailamaki A et al., 2015,

    Just-In-Time Data Virtualization: Lightweight Data Management with ViDa

  • Conference paper
    Rivera-Rubio J, Alexiou I, Bharath AA, 2015,

    Indoor Localisation with Regression Networks and Place Cell Models.

    , Publisher: BMVA Press, Pages: 147.1-147.1
  • Journal article
    Heinis T, Ham DA, 2015,

    On-the-Fly Data Synopses: Efficient Data Exploration in the Simulation Sciences

    , SIGMOD Record, Vol: 44, Pages: 23-28
  • Journal article
    Wang S, Pandis I, Johnson D, Emam I, Guitton F, Oehmichen A, Guo Y et al., 2014,

    Optimising Correlation Matrix Calculations on Gene Expression Data

    , BMC Bioinformatics, Vol: 15, ISSN: 1471-2105
  • Journal article
    Strege C, Bertone G, Besjes GJ, Caron S, Ruiz de Austri R, Strubig A, Trotta R et al., 2014,

    Profile likelihood maps of a 15-dimensional MSSM

    , Journal of High Energy Physics, Vol: 2014, ISSN: 1126-6708

    We present statistically convergent profile likelihood maps obtained via global fits of a phenomenological Minimal Supersymmetric Standard Model with 15 free parameters (the MSSM-15), based on over 250M points. We derive constraints on the model parameters from direct detection limits on dark matter, the Planck relic density measurement and data from accelerator searches. We provide a detailed analysis of the rich phenomenology of this model, and determine the SUSY mass spectrum and dark matter properties that are preferred by current experimental constraints. We evaluate the impact of the measurement of the anomalous magnetic moment of the muon (g − 2) on our results, and provide an analysis of scenarios in which the lightest neutralino is a subdominant component of the dark matter. The MSSM-15 parameters are relatively weakly constrained by current data sets, with the exception of the parameters related to dark matter phenomenology (M1, M2, µ), which are restricted to the sub-TeV regime, mainly due to the relic density constraint. The mass of the lightest neutralino is found to be < 1.5 TeV at 99% C.L., but can extend up to 3 TeV when excluding the g − 2 constraint from the analysis. Low-mass bino-like neutralinos are strongly favoured, with spin-independent scattering cross-sections extending to very small values, ∼ 10^−20 pb. ATLAS SUSY null searches strongly impact on this mass range, and thus rule out a region of parameter space that is outside the reach of any current or future direct detection experiment. The best-fit point obtained after inclusion of all data corresponds to a squark mass of 2.3 TeV, a gluino mass of 2.1 TeV and a 130 GeV neutralino with a spin-independent cross-section of 2.4×10^−10 pb, which is within the reach of future multi-ton scale direct detection experiments and of the upcoming LHC run at increased centre-of-mass energy.

  • Journal article
    Martin J, Ringeval C, Trotta R, Vennin V et al., 2014,

    Compatibility of Planck and BICEP2 results in light of inflation

    , PHYSICAL REVIEW D, Vol: 90, ISSN: 1550-7998
  • Book chapter
    Guo Y, He S, Guo L, 2014,

    Elastic Application Container System: Elastic Web Applications Provisioning

    Cloud applications have been gaining popularity in recent years for their flexibility in resource provisioning according to Web application demands. The Elastic Application Container (EAC) system is a technology that delivers a lightweight virtual resource unit for better resource efficiency and more scalable Web applications in the Cloud. It allows multiple application providers to concurrently run their Web applications on this technology without worrying about changes in demand for their Web applications. This is because the EAC system constantly monitors the resource usage of all hosted Web applications and automatically reacts to changes in their resource usage (i.e. it automatically handles resource provisioning of the Web applications, such as scaling them according to demand). In the chapter, the authors first describe the architecture and components of the EAC system, in order to give a brief overview of the technologies involved in the system. They then present and explain resource-provisioning algorithms and techniques used in the EAC system for demand-driven Web applications. The resource-provisioning algorithms are presented, discussed, and evaluated so as to give readers a clear picture of resource-provisioning algorithms in the EAC system. Finally, the authors compare this EAC system technology with other Cloud technologies in terms of flexibility and resource efficiency.

  • Conference paper
    Guo Y, He S, Guo L, 2014,

    Cloud Resource Monitoring for Intrusion Detection

    We present a novel security monitoring framework for intrusion detection in IaaS cloud infrastructures. The framework uses statistical anomaly detection techniques over data monitored both inside and outside each Virtual Machine instance. We present the architecture of our monitoring framework and describe the implementation of the real-time monitors and detectors. We also describe how the framework is used in three different attack scenarios. For each of the three attack scenarios, we describe how the attack itself works and how it could be detected. We describe what data is monitored in our framework and how the detection is conducted using anomaly detection methods. We also present evaluation of the detection using synthetic and real data sets. Our experimental evaluation across all three scenarios shows that our tools perform well in practical situations and provide a promising direction for future research.

  • Journal article
    Guo Y, He S, Guo L, 2014,

    Enhancing Cloud Resource Utilisation using Statistical Analysis

    Resource provisioning based on virtual machines (VMs) has been widely accepted and adopted in cloud computing environments. A key problem resulting from using static scheduling approaches for allocating VMs on different physical machines (PMs) is that resources tend not to be fully utilised. Although some existing cloud reconfiguration algorithms have been developed to address the problem, they normally result in high migration costs and low resource utilisation due to ignoring the multi-dimensional characteristics of VMs and PMs. In this paper we present and evaluate a new algorithm for improving resource utilisation for cloud providers. By using a multivariate probabilistic model, our algorithm selects suitable PMs for VM re-allocation which are then used to generate a reconfiguration plan. We also describe two heuristic metrics which can be used in the algorithm to capture the multi-dimensional characteristics of VMs and PMs. By combining these two heuristic metrics in our experiments, we observed that our approach improves the resource utilisation level by around 8% for cloud providers, such as IC Cloud, which accept user-defined VM configurations and 14% for providers, such as Amazon EC2, which only provide limited types of VM configurations.

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
