Imperial College London

Dr Rossella Arcucci

Faculty of Engineering, Department of Earth Science & Engineering

Senior Lecturer in Data Science and Machine Learning
 
 
 

Contact

 

r.arcucci Website

 
 

Location

 

Royal School of Mines, South Kensington Campus



 

Publications


83 results found

Dmitrewski A, Molina-Solana M, Arcucci R, 2022, CNTRLDA: A building energy management control system with real-time adjustments. Application to indoor temperature, BUILDING AND ENVIRONMENT, Vol: 215, ISSN: 0360-1323

Journal article

Buizza C, Casas CQ, Nadler P, Mack J, Marrone S, Titus Z, Le Cornec C, Heylen E, Dur T, Ruiz LB, Heaney C, Lopez JAD, Kumar KSS, Arcucci R et al., 2022, Data Learning: Integrating Data Assimilation and Machine Learning, JOURNAL OF COMPUTATIONAL SCIENCE, Vol: 58, ISSN: 1877-7503

Journal article

Lever J, Arcucci R, Cai J, 2022, Social Data Assimilation of Human Sensor Networks for Wildfires, 15th ACM International Conference on Pervasive Technologies Related to Assistive Environments (PETRA), Publisher: ASSOC COMPUTING MACHINERY, Pages: 455-462

Conference paper

Lever J, Arcucci R, 2022, Towards Social Machine Learning for Natural Disasters, 22nd Annual International Conference on Computational Science (ICCS), Publisher: SPRINGER INTERNATIONAL PUBLISHING AG, Pages: 756-769, ISSN: 0302-9743

Conference paper

Cheng S, Quilodran-Casas C, Arcucci R, 2022, Reduced Order Surrogate Modelling and Latent Assimilation for Dynamical Systems, 22nd Annual International Conference on Computational Science (ICCS), Publisher: SPRINGER INTERNATIONAL PUBLISHING AG, Pages: 31-44, ISSN: 0302-9743

Conference paper

Arcucci R, Casas CQ, Joshi A, Obeysekara A, Mottet L, Guo Y-K, Pain C et al., 2022, Merging Real Images with Physics Simulations via Data Assimilation, 27th International European Conference on Parallel and Distributed Computing (Euro-Par), Publisher: SPRINGER INTERNATIONAL PUBLISHING AG, Pages: 255-266, ISSN: 0302-9743

Conference paper

Tajnafoi G, Arcucci R, Mottet L, Vouriot C, Molina-Solana M, Pain C, Guo Y-K et al., 2021, Variational Gaussian process for optimal sensor placement, Applications of Mathematics, Vol: 66, Pages: 287-317, ISSN: 0373-6725

Sensor placement is an optimisation problem that has recently gained great relevance. In order to achieve accurate online updates of a predictive model, sensors are used to provide observations. When sensor location is optimally selected, the predictive model can greatly reduce its internal errors. A greedy-selection algorithm is used for locating these optimal spatial locations from a numerical embedded space. A novel architecture for solving this big data problem is proposed, relying on a variational Gaussian process. The generalisation of the model is further improved via the preconditioning of its inputs: Masked Autoregressive Flows are implemented to learn nonlinear, invertible transformations of the conditionally modelled spatial features. Finally, a global optimisation strategy extending the Mutual Information-based optimisation and fine-tuning of the selected optimal location is proposed. The methodology is parallelised to speed up the computational time, making these tools very fast despite the high complexity associated with both spatial modelling and placement tasks. The model is applied to a real three-dimensional test case considering a room within the Clarence Centre building located in Elephant and Castle, London, UK.

Journal article

Wu P, Chang X, Yuan W, Sun J, Zhang W, Arcucci R, Guo Y et al., 2021, Fast data assimilation (FDA): Data assimilation by machine learning for faster optimize model state, JOURNAL OF COMPUTATIONAL SCIENCE, Vol: 51, ISSN: 1877-7503

Journal article

Bonavita M, Arcucci R, Carrassi A, Dueben P, Geer AJ, Le Saux B, Longepe N, Mathieu P-P, Raynaud L et al., 2021, Machine Learning for Earth System Observation and Prediction, Publisher: AMER METEOROLOGICAL SOC, Pages: E710-E716, ISSN: 0003-0007

Conference paper

Cheng S, Pain CC, Guo Y-K, Arcucci R et al., 2021, Real-time Updating of Dynamic Social Networks for COVID-19 Vaccination Strategies

Vaccination strategy is crucial in fighting the COVID-19 pandemic. Since the supply is still limited in many countries, contact network-based interventions can be most powerful to set an efficient strategy by identifying high-risk individuals or communities. However, due to the high dimension, only partial and noisy network information can be available in practice, especially for dynamic systems where contact networks are highly time-variant. Furthermore, the numerous mutations of SARS-CoV-2 have a significant impact on the infectious probability, requiring real-time network updating algorithms. In this study, we propose a sequential network updating approach based on data assimilation techniques to combine different sources of temporal information. We then prioritise the individuals with high-degree or high-centrality, obtained from assimilated networks, for vaccination. The assimilation-based approach is compared with the standard method (based on partially observed networks) and a random selection strategy in terms of vaccination effectiveness in a SIR model. The numerical comparison is first carried out using real-world face-to-face dynamic networks collected in a high school, followed by sequential multi-layer networks generated relying on the Barabasi-Albert model emulating large-scale social networks with several communities.

Journal article

D'Amore L, Murano A, Sorrentino L, Arcucci R, Laccetti G et al., 2021, Toward a multilevel scalable parallel Zielonka's algorithm for solving parity games, CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, Vol: 33, ISSN: 1532-0626

Journal article

Kumar P, Kalaiarasan G, Porter AE, Pinna A, Kłosowski MM, Demokritou P, Chung KF, Pain C, Arvind DK, Arcucci R, Adcock IM, Dilliway C et al., 2021, An overview of methods of fine and ultrafine particle collection for physicochemical characterisation and toxicity assessments, Science of the Total Environment, Vol: 756, Pages: 1-22, ISSN: 0048-9697

Particulate matter (PM) is a crucial health risk factor for respiratory and cardiovascular diseases. The smaller size fractions, ≤2.5 μm (PM2.5; fine particles) and ≤0.1 μm (PM0.1; ultrafine particles), show the highest bioactivity but acquiring sufficient mass for in vitro and in vivo toxicological studies is challenging. We review the suitability of available instrumentation to collect the PM mass required for these assessments. Five different microenvironments representing the diverse exposure conditions in urban environments are considered in order to establish the typical PM concentrations present. The highest concentrations of PM2.5 and PM0.1 were found near traffic (i.e. roadsides and traffic intersections), followed by indoor environments, parks and behind roadside vegetation. We identify key factors to consider when selecting sampling instrumentation. These include PM concentration on-site (low concentrations increase sampling time), nature of sampling sites (e.g. indoors; noise and space will be an issue), equipment handling and power supply. Physicochemical characterisation requires micro- to milli-gram quantities of PM and it may increase according to the processing methods (e.g. digestion or sonication). Toxicological assessments of PM involve numerous mechanisms (e.g. inflammatory processes and oxidative stress) requiring significant amounts of PM to obtain accurate results. Optimising air sampling techniques are therefore important for the appropriate collection medium/filter which have innate physical properties and the potential to interact with samples. An evaluation of methods and instrumentation used for airborne virus collection concludes that samplers operating cyclone sampling techniques (using centrifugal forces) are effective in collecting airborne viruses. We highlight that predictive modelling can help to identify pollution hotspots in an urban environment for the efficient collection of PM mass. This review provides

Journal article

Quilodrán-Casas C, Silva VS, Arcucci R, Heaney CE, Guo Y, Pain CC et al., 2021, Digital twins based on bidirectional LSTM and GAN for modelling COVID-19

The outbreak of the coronavirus disease 2019 (COVID-19) has now spread throughout the globe infecting over 100 million people and causing the death of over 2.2 million people. Thus, there is an urgent need to study the dynamics of epidemiological models to gain a better understanding of how such diseases spread. While epidemiological models can be computationally expensive, recent advances in machine learning techniques have given rise to neural networks with the ability to learn and predict complex dynamics at reduced computational costs. Here we introduce two digital twins of a SEIRS model applied to an idealised town. The SEIRS model has been modified to take account of spatial variation and, where possible, the model parameters are based on official virus spreading data from the UK. We compare predictions from a data-corrected Bidirectional Long Short-Term Memory network and a predictive Generative Adversarial Network. The predictions given by these two frameworks are accurate when compared to the original SEIRS model data. Additionally, these frameworks are data-agnostic and could be applied to towns, idealised or real, in the UK or in other countries. Also, more compartments could be included in the SEIRS model, in order to study more realistic epidemiological behaviour.

Journal article

Afzali J, Casas CQ, Arcucci R, 2021, Latent GAN: Using a Latent Space-Based GAN for Rapid Forecasting of CFD Models, Pages: 360-372, ISSN: 0302-9743

The focus of this study is to simulate realistic fluid flow, through Machine Learning techniques that could be utilised in real-time forecasting of urban air pollution. We propose a novel Latent GAN architecture which looks at combining an AutoEncoder with a Generative Adversarial Network to predict fluid flow at the proceeding timestep of a given input, whilst keeping computational costs low. This architecture is applied to tracer flows and velocity fields around an urban city. We present a pair of AutoEncoders capable of dimensionality reduction of 3 orders of magnitude. Further, we present a pair of Generator models capable of performing real-time forecasting of tracer flows and velocity fields. We demonstrate that the models, as well as the latent spaces generated, learn and retain meaningful physical features of the domain. Despite the domain of this project being that of computational fluid dynamics, the Latent GAN architecture is designed to be generalisable such that it can be applied to other dynamical systems.

Conference paper

Amendola M, Arcucci R, Mottet L, Casas CQ, Fan S, Pain C, Linden P, Guo YK et al., 2021, Data Assimilation in the Latent Space of a Convolutional Autoencoder, Pages: 373-386, ISSN: 0302-9743

Data Assimilation (DA) is a Bayesian inference that combines the state of a dynamical system with real data collected by instruments at a given time. The goal of DA is to improve the accuracy of the dynamic system, making its result as real as possible. One of the most popular techniques for DA is the Kalman Filter (KF). When the dynamic system refers to a real-world application, the representation of the state of a physical system usually leads to a big data problem. For these problems, the KF is computationally too expensive, which mandates the use of reduced order modeling techniques. In this paper we propose a new methodology that we call Latent Assimilation (LA). It consists of performing the KF in the latent space obtained by an Autoencoder with non-linear encoder functions and non-linear decoder functions. In the latent space, the dynamic system is represented by a surrogate model built by a Recurrent Neural Network. In particular, a Long Short Term Memory (LSTM) network is used to train a function which emulates the dynamic system in the latent space. The data from the dynamic model and the real data coming from the instruments are both processed through the Autoencoder. We apply the methodology to a real test case and we show that the LA has a good performance both in accuracy and in efficiency.

Conference paper

Arcucci R, Zhu J, Hu S, Guo Y-K et al., 2021, Deep Data Assimilation: Integrating Deep Learning with Data Assimilation, APPLIED SCIENCES-BASEL, Vol: 11

Journal article

Mack J, Arcucci R, Molina-Solana M, Guo Y-K et al., 2020, Attention-based Convolutional Autoencoders for 3D-Variational Data Assimilation, COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING, Vol: 372, ISSN: 0045-7825

Journal article

Ruiz LGB, Pegalajar MC, Arcucci R, Molina-Solana M et al., 2020, A time-series clustering methodology for knowledge extraction in energy consumption data, Expert Systems with Applications, Vol: 160, ISSN: 0957-4174

In the Energy Efficiency field, the incorporation of intelligent systems in cities and buildings is motivated by the energy savings and pollution reduction that can be attained. To achieve this goal, energy modelling and a better understanding of how energy is consumed are fundamental factors. As a result, this study proposes a methodology for knowledge acquisition in energy-related data through Time-Series Clustering (TSC) techniques. In our experimentation, we utilize data from the buildings at the University of Granada (Spain) and compare several clustering methods to get the optimum model, in particular, we tested k-Means, k-Medoids, Hierarchical clustering and Gaussian Mixtures; as well as several algorithms to obtain the best grouping, such as PAM, CLARA, and two variants of Lloyd’s method, Small and Large. Thus, our methodology can provide non-trivial knowledge from raw energy data. In contrast to previous studies in this field, not only do we propose a clustering methodology to group time series straightforwardly, but we also present an automatic strategy to search and analyse energy periodicity in these series recursively so that we can deepen granularity and extract information at different levels of detail. The results show that k-Medoids with PAM is the best approach in virtually all cases, and the Squared Euclidean distance outperforms the rest of the metrics.

Journal article

Casas CQ, Arcucci R, Wu P, Pain C, Guo Y-K et al., 2020, A Reduced Order Deep Data Assimilation model, PHYSICA D-NONLINEAR PHENOMENA, Vol: 412, ISSN: 0167-2789

Journal article

Wang S, Nadler P, Arcucci R, Yang X, Li L, Huang Y, Teng Z, Guo Y et al., 2020, A Bayesian Updating Scheme for Pandemics: Estimating the Infection Dynamics of COVID-19, IEEE COMPUTATIONAL INTELLIGENCE MAGAZINE, Vol: 15, Pages: 23-33, ISSN: 1556-603X

Journal article

Nadler P, Wang S, Arcucci R, Yang X, Guo Y et al., 2020, An epidemiological modelling approach for Covid19 via data assimilation, Publisher: arXiv

The global pandemic of the 2019-nCov requires the evaluation of policy interventions to mitigate future social and economic costs of quarantine measures worldwide. We propose an epidemiological model for forecasting and policy evaluation which incorporates new data in real-time through variational data assimilation. We analyze and discuss infection rates in China, the US and Italy. In particular, we develop a custom compartmental SIR model fit to variables related to the epidemic in Chinese cities, named SITR model. We compare and discuss model results which conducts updates as new observations become available. A hybrid data assimilation approach is applied to make results robust to initial conditions. We use the model to do inference on infection numbers as well as parameters such as the disease transmissibility rate or the rate of recovery. The parameterisation of the model is parsimonious and extendable, allowing for the incorporation of additional data and parameters of interest. This allows for scalability and the extension of the model to other locations or the adaption of novel data sources.

Working paper

Dur TH, Arcucci R, Mottet L, Molina Solana M, Pain C, Guo Y-K et al., 2020, Weak Constraint Gaussian Processes for optimal sensor placement, JOURNAL OF COMPUTATIONAL SCIENCE, Vol: 42, ISSN: 1877-7503

Journal article

Wu P, Sun J, Chang X, Zhang W, Arcucci R, Guo Y, Pain CC et al., 2020, Data-driven reduced order model with temporal convolutional neural network, COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING, Vol: 360, ISSN: 0045-7825

Journal article

Nadler P, Arcucci R, Guo Y, 2020, An Econophysical Analysis of the Blockchain Ecosystem, Pages: 27-42, ISSN: 2198-7246

We propose a novel modelling approach for the cryptocurrency ecosystem. We model on-chain and off-chain interactions as econophysical systems and employ methods from physical sciences to conduct interpretation of latent parameters describing the cryptocurrency ecosystem as well as to generate predictions. We work with an extracted dataset from the Ethereum blockchain which we combine with off-chain data from exchanges. This allows us to study a large part of the transaction flows related to the cryptocurrency ecosystem. From this aggregate system view we deduce that movements on the blockchain and price and trading action on exchanges are interrelated. The relationship is one-directional: On-chain token flows towards exchanges have little effect on prices and trading volume, but changes in price and volume affect the flow of tokens towards the exchange.

Conference paper

Nadler P, Arcucci R, Guo Y, 2020, A Neural SIR Model for Global Forecasting, Pages: 254-266

Being able to understand and forecast epidemic developments is crucial for policymakers. We develop a predictive model combining epidemiological dynamics of compartmental models with highly non-linear interactions learned by an LSTM Network. A novel dynamic SIR model is fit to variables related to the population transmission of Covid-19. This is embedded in a Bayesian recursive updating framework which is then coupled with an LSTM network to forecast cases of Covid-19. The model significantly improves forecasts over simple univariate LSTM or SIR models. We apply the model to developed and developing countries and forecast confirmed infections and analyze future trajectories.

Conference paper

Arcucci R, Moutiq L, Guo Y-K, 2020, Neural Assimilation, Editors: Krzhizhanovskaya, Zavodszky, Lees, Dongarra, Sloot, Brissos, Teixeira, Publisher: SPRINGER INTERNATIONAL PUBLISHING AG, Pages: 155-168, ISBN: 978-3-030-50432-8

Book chapter

Nadler P, Arcucci R, Guo Y-K, 2020, A Scalable Approach to Econometric Inference, Conference on Parallel Computing - Technology Trends (ParCo), Publisher: IOS PRESS, Pages: 59-68, ISSN: 0927-5452

Conference paper

Arcucci R, Casas CQ, Xiao D, Mottet L, Fang F, Wu P, Pain C, Guo Y-K et al., 2020, A Domain Decomposition Reduced Order Model with Data Assimilation (DD-RODA), Conference on Parallel Computing - Technology Trends (ParCo), Publisher: IOS PRESS, Pages: 189-198, ISSN: 0927-5452

Conference paper

Arcucci R, Mottet L, Casas CAQ, Guitton F, Pain C, Guo Y-K et al., 2020, Adaptive Domain Decomposition for Effective Data Assimilation, 25th International Conference on Parallel and Distributed Computing (Euro-Par), Publisher: SPRINGER INTERNATIONAL PUBLISHING AG, Pages: 583-595, ISSN: 0302-9743

Conference paper

Aristodemou E, Arcucci R, Mottet L, Robins A, Pain C, Guo Y-K et al., 2019, Enhancing CFD-LES air pollution prediction accuracy using data assimilation, Building and Environment, Vol: 165, ISSN: 0007-3628

It is recognised worldwide that air pollution is the cause of premature deaths daily, thus necessitating the development of more reliable and accurate numerical tools. The present study implements a three-dimensional Variational (3DVar) data assimilation (DA) approach to reduce the discrepancy between predicted pollution concentrations based on Computational Fluid Dynamics (CFD) and the ones measured in a wind tunnel experiment. The methodology is implemented on a wind tunnel test case which represents a localised neighbourhood environment. The improved accuracy of the CFD simulation using DA is discussed in terms of absolute error, mean squared error and scatter plots for the pollution concentration. It is shown that the difference between CFD results and wind tunnel data, computed by the mean squared error, can be reduced by up to three orders of magnitude when using DA. This reduction in error is preserved in the CFD results and its benefit can be seen through several time steps after re-running the CFD simulation. Subsequently, an optimal sensor positioning is proposed. There is a trade-off between the accuracy and the number of sensors. It was found that the accuracy was improved when placing/considering the sensors which were near the pollution source or in regions where pollution concentrations were high. This demonstrated that only 14% of the wind tunnel data was needed, reducing the mean squared error by one order of magnitude.

Journal article

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
