Wu P, Chang X, Yuan W, et al., 2021, Fast data assimilation (FDA): Data assimilation by machine learning for faster optimize model state, Journal of Computational Science, Vol: 51, ISSN: 1877-7503
Data assimilation (DA) can provide a more accurate initial state for numerical forecasting models, but traditional DA algorithms suffer from long computation times. This paper proposes fast data assimilation (FDA) based on machine learning. For model training, FDA uses 4DVAR, iForest, and an MLP, and also includes a modified model that does not require observations. The method is applied to the Lorenz63 dynamical system. The experimental results show that a single analysis with FDA is almost 524 times faster than with 4DVAR, greatly reducing the time of the DA process.
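The Lorenz63 system named in the abstract is a standard three-variable chaotic test bed for DA methods. A minimal sketch of the forward model used to generate such trajectories is below; the RK4 integrator, step size, and the classical parameter values (sigma=10, rho=28, beta=8/3) are standard choices, not details taken from the paper.

```python
import numpy as np

def lorenz63_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One RK4 step of the Lorenz-63 system (classical parameters assumed)."""
    def f(s):
        x, y, z = s
        return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def trajectory(x0, n_steps, dt=0.01):
    """Integrate n_steps from x0; trajectories like this supply the
    forecast states that a learned assimilation model is trained on."""
    states = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        states.append(lorenz63_step(states[-1], dt))
    return np.array(states)
```

In an FDA-style workflow, pairs of (background state, analysis state) generated from runs like this would form the training set for the machine-learning surrogate.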
Tajnafoi G, Arcucci R, Mottet L, et al., 2021, Variational Gaussian process for optimal sensor placement, Applications of Mathematics, Vol: 66, Pages: 287-317, ISSN: 0373-6725
Sensor placement is an optimisation problem that has recently gained great relevance. In order to achieve accurate online updates of a predictive model, sensors are used to provide observations. When sensor location is optimally selected, the predictive model can greatly reduce its internal errors. A greedy-selection algorithm is used for locating these optimal spatial locations from a numerical embedded space. A novel architecture for solving this big data problem is proposed, relying on a variational Gaussian process. The generalisation of the model is further improved via the preconditioning of its inputs: Masked Autoregressive Flows are implemented to learn nonlinear, invertible transformations of the conditionally modelled spatial features. Finally, a global optimisation strategy extending the Mutual Information-based optimisation and fine-tuning of the selected optimal location is proposed. The methodology is parallelised to speed up the computational time, making these tools very fast despite the high complexity associated with both spatial modelling and placement tasks. The model is applied to a real three-dimensional test case considering a room within the Clarence Centre building located in Elephant and Castle, London, UK.
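Greedy selection over candidate sensor locations can be sketched with a simple information-based score. The snippet below uses a log-determinant (D-optimal-style) gain over a covariance matrix of candidate locations as an illustration of greedy selection only; it is not the paper's Mutual Information criterion or its variational Gaussian process, and the RBF kernel in the test is an assumed stand-in for the spatial model.

```python
import numpy as np

def greedy_sensor_placement(K, k):
    """Greedily select k sensor locations from a candidate covariance K:
    at each step add the candidate that maximises the log-determinant of
    the selected covariance block. Illustrative criterion, not the paper's."""
    n = K.shape[0]
    selected = []
    for _ in range(k):
        best, best_val = None, -np.inf
        for j in range(n):
            if j in selected:
                continue
            idx = selected + [j]
            sign, logdet = np.linalg.slogdet(K[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_val:
                best, best_val = j, logdet
        selected.append(best)
    return selected
```

Exchanging the log-determinant score for a mutual-information gain (information about the unselected locations) recovers the family of criteria the paper extends.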
Kumar P, Kalaiarasan G, Porter AE, et al., 2021, An overview of methods of fine and ultrafine particle collection for physicochemical characterisation and toxicity assessments., Science of the Total Environment, Vol: 756, Pages: 1-22, ISSN: 0048-9697
Particulate matter (PM) is a crucial health risk factor for respiratory and cardiovascular diseases. The smaller size fractions, ≤2.5 μm (PM2.5; fine particles) and ≤0.1 μm (PM0.1; ultrafine particles), show the highest bioactivity, but acquiring sufficient mass for in vitro and in vivo toxicological studies is challenging. We review the suitability of available instrumentation to collect the PM mass required for these assessments. Five different microenvironments representing the diverse exposure conditions in urban environments are considered in order to establish the typical PM concentrations present. The highest concentrations of PM2.5 and PM0.1 were found near traffic (i.e. roadsides and traffic intersections), followed by indoor environments, parks and behind roadside vegetation. We identify key factors to consider when selecting sampling instrumentation. These include PM concentration on-site (low concentrations increase sampling time), nature of sampling sites (e.g. indoors, where noise and space will be an issue), equipment handling and power supply. Physicochemical characterisation requires micro- to milli-gram quantities of PM, and the required amount may increase according to the processing methods (e.g. digestion or sonication). Toxicological assessments of PM involve numerous mechanisms (e.g. inflammatory processes and oxidative stress) requiring significant amounts of PM to obtain accurate results. Optimising air sampling techniques is therefore important, as is selecting an appropriate collection medium/filter, since these have innate physical properties and the potential to interact with samples. An evaluation of methods and instrumentation used for airborne virus collection concludes that samplers operating cyclone sampling techniques (using centrifugal forces) are effective in collecting airborne viruses. We highlight that predictive modelling can help to identify pollution hotspots in an urban environment for the efficient collection of PM mass. This review provides
Quilodrán-Casas C, Silva VS, Arcucci R, et al., 2021, Digital twins based on bidirectional LSTM and GAN for modelling COVID-19, Publisher: arXiv
The outbreak of the coronavirus disease 2019 (COVID-19) has now spread throughout the globe, infecting over 100 million people and causing the death of over 2.2 million people. Thus, there is an urgent need to study the dynamics of epidemiological models to gain a better understanding of how such diseases spread. While epidemiological models can be computationally expensive, recent advances in machine learning techniques have given rise to neural networks with the ability to learn and predict complex dynamics at reduced computational costs. Here we introduce two digital twins of a SEIRS model applied to an idealised town. The SEIRS model has been modified to take account of spatial variation and, where possible, the model parameters are based on official virus spreading data from the UK. We compare predictions from a data-corrected Bidirectional Long Short-Term Memory network and a predictive Generative Adversarial Network. The predictions given by these two frameworks are accurate when compared to the original SEIRS model data. Additionally, these frameworks are data-agnostic and could be applied to towns, idealised or real, in the UK or in other countries. Also, more compartments could be included in the SEIRS model, in order to study more realistic epidemiological behaviour.
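The non-spatial core of a SEIRS model can be sketched in a few lines. The forward-Euler step and the rate values in the test are illustrative assumptions; the paper's model adds spatial variation and UK-derived parameters on top of dynamics of this form.

```python
import numpy as np

def seirs_step(S, E, I, R, beta, sigma, gamma, xi, dt=0.1):
    """One forward-Euler step of a basic (non-spatial) SEIRS model.
    beta: transmission rate, sigma: incubation rate, gamma: recovery rate,
    xi: rate of immunity loss (R flowing back to S)."""
    N = S + E + I + R
    dS = -beta * S * I / N + xi * R
    dE = beta * S * I / N - sigma * E
    dI = sigma * E - gamma * I
    dR = gamma * I - xi * R
    return S + dt * dS, E + dt * dE, I + dt * dI, R + dt * dR
```

Trajectories produced by stepping this model are exactly the kind of sequence data a Bidirectional LSTM or GAN digital twin would be trained to reproduce.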
Arcucci R, Zhu J, Hu S, et al., 2021, Deep Data Assimilation: Integrating Deep Learning with Data Assimilation, APPLIED SCIENCES-BASEL, Vol: 11
Ruiz LGB, Pegalajar MC, Arcucci R, et al., 2020, A time-series clustering methodology for knowledge extraction in energy consumption data, Expert Systems with Applications, Vol: 160, ISSN: 0957-4174
In the Energy Efficiency field, the incorporation of intelligent systems in cities and buildings is motivated by the energy savings and pollution reduction that can be attained. To achieve this goal, energy modelling and a better understanding of how energy is consumed are fundamental factors. As a result, this study proposes a methodology for knowledge acquisition in energy-related data through Time-Series Clustering (TSC) techniques. In our experimentation, we utilize data from the buildings at the University of Granada (Spain) and compare several clustering methods to get the optimum model, in particular, we tested k-Means, k-Medoids, Hierarchical clustering and Gaussian Mixtures; as well as several algorithms to obtain the best grouping, such as PAM, CLARA, and two variants of Lloyd’s method, Small and Large. Thus, our methodology can provide non-trivial knowledge from raw energy data. In contrast to previous studies in this field, not only do we propose a clustering methodology to group time series straightforwardly, but we also present an automatic strategy to search and analyse energy periodicity in these series recursively so that we can deepen granularity and extract information at different levels of detail. The results show that k-Medoids with PAM is the best approach in virtually all cases, and the Squared Euclidean distance outperforms the rest of the metrics.
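The k-Medoids idea the study found best (with PAM and squared Euclidean distance) can be sketched with a simple alternating scheme on a precomputed distance matrix. This is a minimal stand-in, not PAM's full swap search, and the deterministic initialisation is an assumption made for reproducibility.

```python
import numpy as np

def k_medoids(D, k, n_iter=50):
    """Alternating k-medoids on a precomputed distance matrix D (n x n):
    assign each point to its nearest medoid, then move each medoid to the
    member minimising total within-cluster distance. Simplified vs PAM."""
    n = D.shape[0]
    medoids = np.arange(k)  # deterministic init (assumption, for reproducibility)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.where(labels == c)[0]
            if len(members) == 0:
                continue
            costs = D[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(costs)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return medoids, np.argmin(D[:, medoids], axis=1)
```

For time series, D would be built from squared Euclidean distances between whole series (or between per-period aggregates, when searching for periodicity at different granularities).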
Mack J, Arcucci R, Molina-Solana M, et al., 2020, Attention-based Convolutional Autoencoders for 3D-Variational Data Assimilation, COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING, Vol: 372, ISSN: 0045-7825
Wang S, Nadler P, Arcucci R, et al., 2020, A Bayesian Updating Scheme for Pandemics: Estimating the Infection Dynamics of COVID-19, IEEE COMPUTATIONAL INTELLIGENCE MAGAZINE, Vol: 15, Pages: 23-33, ISSN: 1556-603X
D'Amore L, Murano A, Sorrentino L, et al., 2020, Toward a multilevel scalable parallel Zielonka's algorithm for solving parity games, CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, Vol: 33, ISSN: 1532-0626
Nadler P, Wang S, Arcucci R, et al., 2020, An epidemiological modelling approach for Covid19 via data assimilation, Publisher: arXiv
The global pandemic of the 2019-nCov requires the evaluation of policy interventions to mitigate future social and economic costs of quarantine measures worldwide. We propose an epidemiological model for forecasting and policy evaluation which incorporates new data in real-time through variational data assimilation. We analyze and discuss infection rates in China, the US and Italy. In particular, we develop a custom compartmental SIR model fit to variables related to the epidemic in Chinese cities, named the SITR model. We compare and discuss results from the model, which is updated as new observations become available. A hybrid data assimilation approach is applied to make results robust to initial conditions. We use the model to do inference on infection numbers as well as parameters such as the disease transmissibility rate or the rate of recovery. The parameterisation of the model is parsimonious and extendable, allowing for the incorporation of additional data and parameters of interest. This allows for scalability and the extension of the model to other locations or the adaptation of novel data sources.
Dur TH, Arcucci R, Mottet L, et al., 2020, Weak Constraint Gaussian Processes for optimal sensor placement, JOURNAL OF COMPUTATIONAL SCIENCE, Vol: 42, ISSN: 1877-7503
Wu P, Sun J, Chang X, et al., 2020, Data-driven reduced order model with temporal convolutional neural network, COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING, Vol: 360, ISSN: 0045-7825
Arcucci R, Moutiq L, Guo YK, 2020, Neural assimilation, Pages: 155-168, ISBN: 9783030504328
We introduce a new neural network for Data Assimilation (DA). DA is the approximation of the true state of some physical system at a given time, obtained by combining time-distributed observations with a dynamic model in an optimal way. The typical assimilation scheme is made up of two major steps: a prediction, and a correction of the prediction by including information provided by observed data. This is the so-called prediction-correction cycle. Classical methods for DA include the Kalman filter (KF). The KF can provide a rich information structure about the solution, but it is often complex and time-consuming. In operational forecasting there is insufficient time to restart a run from the beginning with new data; therefore, data assimilation should enable real-time utilization of data to improve predictions. This mandates the choice of an efficient data assimilation algorithm. Due to this necessity, we introduce, in this paper, Neural Assimilation (NA), a coupled neural network made of two Recurrent Neural Networks trained on forecasting data and observed data respectively. We prove that the solution of NA is the same as that of the KF. As NA is trained on both forecasting and observed data, after the training phase NA is used for prediction without the need for a correction given by the observations. This avoids the prediction-correction cycle, making the whole process very fast. Experimental results are provided, and NA is tested to improve the prediction of oxygen diffusion across the Blood-Brain Barrier (BBB).
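The prediction-correction cycle that NA is designed to avoid can be written down directly for the linear Kalman filter. The sketch below shows one full cycle under the standard linear-Gaussian assumptions; the matrices F, Q, H, R are the usual KF ingredients, not values from the paper.

```python
import numpy as np

def kalman_cycle(x, P, F, Q, H, R, y):
    """One prediction-correction cycle of a linear Kalman filter.
    x, P: current state estimate and covariance; F, Q: dynamics and model
    noise; H, R: observation operator and noise; y: new observation."""
    # Prediction step (forecast)
    x_f = F @ x
    P_f = F @ P @ F.T + Q
    # Correction step (analysis): blend forecast with the observation
    S = H @ P_f @ H.T + R
    K = P_f @ H.T @ np.linalg.inv(S)
    x_a = x_f + K @ (y - H @ x_f)
    P_a = (np.eye(len(x)) - K @ H) @ P_f
    return x_a, P_a
```

NA replaces the correction step at run time: after training on both forecast and observed data, only the (cheap) forward pass is needed, which is what removes the cycle's cost.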
Nadler P, Arcucci R, Guo Y-K, 2020, A Scalable Approach to Econometric Inference, Conference on Parallel Computing - Technology Trends (ParCo), Publisher: IOS PRESS, Pages: 59-68, ISSN: 0927-5452
Arcucci R, Casas CQ, Xiao D, et al., 2020, A Domain Decomposition Reduced Order Model with Data Assimilation (DD-RODA), Conference on Parallel Computing - Technology Trends (ParCo), Publisher: IOS PRESS, Pages: 189-198, ISSN: 0927-5452
Arcucci R, Mottet L, Casas CAQ, et al., 2020, Adaptive Domain Decomposition for Effective Data Assimilation, Pages: 583-595, ISSN: 0302-9743
We present a parallel Data Assimilation model based on an Adaptive Domain Decomposition (ADD-DA) coupled with the open-source, finite-element, fluid dynamics model Fluidity. The model we present is defined on a partition of the domain into sub-domains without overlapping regions. This choice avoids communication among the processes during the Data Assimilation phase. During the balance phase, however, the model exploits the domain decomposition implemented in Fluidity, which balances the results among the processes using overlapping regions. The model also exploits mesh adaptivity to generate an optimal mesh, which we name the supermesh; the supermesh is the one used in the ADD-DA process. We prove that the ADD-DA model provides the same numerical solution as the corresponding sequential DA model. We also show that the ADD approach reduces the execution time even when the implementation is not run in a parallel computing environment. Experimental results are provided for pollutant dispersion within an urban environment.
Nadler P, Arcucci R, Guo YK, 2019, Data assimilation for parameter estimation in economic modelling, Pages: 649-656
We propose a data assimilation approach for latent parameter estimation in economic models. We describe a dynamic model of an economic system with latent state variables describing the relationship of economic entities over time as well as a stochastic volatility component. We show and discuss the model's relationship with data assimilation and how it is derived. We apply it to conduct a multivariate analysis of the cryptocurrency ecosystem. Combining these approaches opens a new dimension of analysis to economic modelling. Keywords: Economics, Multivariate Analysis, Dynamical System, Bitcoin, Data Assimilation.
Lim EM, Molina Solana M, Pain C, et al., 2019, Hybrid data assimilation: An ensemble-variational approach, Pages: 633-640
Data Assimilation (DA) is a technique used to quantify and manage uncertainty in numerical models by incorporating observations into the model. Variational Data Assimilation (VarDA) accomplishes this by minimising a cost function which weighs the errors in both the numerical results and the observations. However, large-scale domains pose issues with the optimisation and execution of the DA model. In this paper, ensemble methods are explored as a means of sampling the background error at a reduced rank to condition the problem. The impact of ensemble size on the error is evaluated and benchmarked against other preconditioning methods explored in previous work, such as truncated singular value decomposition (TSVD). Localisation is also investigated as a means of reducing the long-range spurious errors in the background error covariance matrix. Both the mean squared error (MSE) and execution time are used as measures of performance. Experimental results for a 3D case of pollutant dispersion within an urban environment are presented, with promise for future work using dynamic ensembles and 4D state vectors.
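The TSVD baseline mentioned above amounts to keeping only the leading modes of the background error covariance matrix B. A minimal sketch, assuming B is symmetric positive semi-definite (e.g. built from ensemble anomalies as B ≈ XXᵀ):

```python
import numpy as np

def tsvd_reduce(B, rank):
    """Truncated SVD of a background error covariance matrix B:
    keep the leading `rank` modes U_r and singular values s_r, so that
    B ~= U_r diag(s_r) U_r^T. Reduces the rank (and conditioning) of the
    VarDA problem."""
    U, s, _ = np.linalg.svd(B, hermitian=True)  # B assumed symmetric PSD
    return U[:, :rank], s[:rank]
```

An ensemble method plays the same role by forming the anomaly matrix X directly from ensemble members, so the rank is set by the ensemble size rather than by an explicit truncation.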
Aristodemou E, Arcucci R, Mottet L, et al., 2019, Enhancing CFD-LES air pollution prediction accuracy using data assimilation, Building and Environment, Vol: 165, ISSN: 0007-3628
It is recognised worldwide that air pollution is a cause of premature deaths daily, thus necessitating the development of more reliable and accurate numerical tools. The present study implements a three-dimensional Variational (3DVar) data assimilation (DA) approach to reduce the discrepancy between pollution concentrations predicted by Computational Fluid Dynamics (CFD) and those measured in a wind tunnel experiment. The methodology is implemented on a wind tunnel test case which represents a localised neighbourhood environment. The improved accuracy of the CFD simulation using DA is discussed in terms of absolute error, mean squared error and scatter plots for the pollution concentration. It is shown that the difference between CFD results and wind tunnel data, computed by the mean squared error, can be reduced by up to three orders of magnitude when using DA. This reduction in error is preserved in the CFD results, and its benefit can be seen through several time steps after re-running the CFD simulation. Subsequently, an optimal sensor positioning is proposed. There is a trade-off between the accuracy and the number of sensors. It was found that the accuracy was improved when placing/considering the sensors which were near the pollution source or in regions where pollution concentrations were high. This demonstrated that only 14% of the wind tunnel data was needed, reducing the mean squared error by one order of magnitude.
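The 3DVar approach used here minimises a cost balancing the background (CFD) state against the observations (wind tunnel measurements). A minimal sketch of the standard formulation, with the closed-form minimiser available in the linear case; the matrices below are generic 3DVar ingredients, not quantities from the paper:

```python
import numpy as np

def cost_3dvar(x, xb, y, H, Binv, Rinv):
    """Standard 3DVar cost:
    J(x) = 0.5 (x-xb)^T B^-1 (x-xb) + 0.5 (y-Hx)^T R^-1 (y-Hx),
    with xb the background (CFD) state, y the observations, H the
    observation operator, B and R the error covariances."""
    db = x - xb
    do = y - H @ x
    return 0.5 * db @ Binv @ db + 0.5 * do @ Rinv @ do

def analysis_3dvar(xb, y, H, B, R):
    """Closed-form minimiser for the linear case: xa = xb + K (y - H xb)."""
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
    return xb + K @ (y - H @ xb)
```

Re-running the CFD simulation from the analysis xa instead of xb is what propagates the error reduction through subsequent time steps.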
Zhu J, Hu S, Arcucci R, et al., 2019, Model error correction in data assimilation by integrating neural networks, Big Data Mining and Analytics, Vol: 2, Pages: 83-91, ISSN: 2096-0654
In this paper, we propose a new methodology which combines Neural Networks (NN) with Data Assimilation (DA). Focusing on structural model uncertainty, we propose a framework for integrating NNs with physical models via DA algorithms, to improve both the assimilation process and the forecasting results. The NNs are iteratively trained as observational data is updated. The main DA models used here are the Kalman filter and the variational approaches. The effectiveness of the proposed algorithm is validated by examples and by a sensitivity study.
Arcucci R, Mottet L, Pain C, et al., 2019, Optimal reduced space for Variational Data Assimilation, JOURNAL OF COMPUTATIONAL PHYSICS, Vol: 379, Pages: 51-69, ISSN: 0021-9991
Arcucci R, Mcllwraith D, Guo Y-K, 2019, Scalable Weak Constraint Gaussian Processes, 19th Annual International Conference on Computational Science (ICCS), Publisher: SPRINGER INTERNATIONAL PUBLISHING AG, Pages: 111-125, ISSN: 0302-9743
Arcucci R, Pain C, Guo Y-K, 2018, Effective variational data assimilation in air-pollution prediction, Big Data Mining and Analytics, Vol: 1, Pages: 297-307, ISSN: 2096-0654
Numerical simulations are widely used as a predictive tool to better understand complex air flows and pollution transport on the scale of individual buildings, city blocks, and entire cities. To improve prediction of air flows and pollution transport, we propose a Variational Data Assimilation (VarDA) model which assimilates data from sensors into the open-source, finite-element, fluid dynamics model Fluidity. VarDA is based on the minimization of a function which estimates the discrepancy between numerical results and observations, assuming that the two sources of information, forecast and observations, have errors that are adequately described by error covariance matrices. The conditioning of the numerical problem is dominated by the condition number of the background error covariance matrix, which is ill-conditioned. In this paper, a preconditioned VarDA model is presented; it is based on a reduced background error covariance matrix. The Empirical Orthogonal Functions (EOFs) method is used to alleviate the computational cost and reduce the space dimension. Experimental results are provided assuming observed values provided by sensors positioned mainly on the roofs of buildings.
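The EOF reduction named above is, computationally, an SVD of mean-subtracted state snapshots: the leading left singular vectors give the reduced basis in which the background covariance is represented. A minimal sketch under that standard construction (snapshot layout and mode count are illustrative assumptions):

```python
import numpy as np

def eof_basis(snapshots, n_modes):
    """EOF basis from a snapshot matrix (state_dim x n_snapshots):
    subtract the temporal mean and keep the leading left singular
    vectors of the anomalies."""
    mean = snapshots.mean(axis=1, keepdims=True)
    anomalies = snapshots - mean
    U, s, _ = np.linalg.svd(anomalies, full_matrices=False)
    return U[:, :n_modes], s[:n_modes], mean

def project(U, mean, x):
    """Map a full state to reduced EOF coordinates and reconstruct it."""
    z = U.T @ (x - mean[:, 0])
    return z, U @ z + mean[:, 0]
```

Working in the reduced coordinates z (dimension n_modes) instead of the full state is what both shrinks the VarDA problem and improves its conditioning.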
Song J, Fan S, Lin W, et al., 2018, Natural ventilation in cities: the implications of fluid mechanics, BUILDING RESEARCH AND INFORMATION, Vol: 46, Pages: 809-828, ISSN: 0961-3218
Arcucci R, Basciano D, Cilardo A, et al., 2018, Energy Analysis of a 4D Variational Data Assimilation Algorithm and Evaluation on ARM-Based HPC Systems, 12th International Conference on Parallel Processing and Applied Mathematics (PPAM), Publisher: SPRINGER INTERNATIONAL PUBLISHING AG, Pages: 37-47, ISSN: 0302-9743
D'Amore L, Arcucci R, Li Y, et al., 2018, Performance Assessment of the Incremental Strong Constraints 4DVAR Algorithm in ROMS, 12th International Conference on Parallel Processing and Applied Mathematics (PPAM), Publisher: SPRINGER INTERNATIONAL PUBLISHING AG, Pages: 48-57, ISSN: 0302-9743
Arcucci R, Carracciuolo L, Toumi R, 2018, Toward a preconditioned scalable 3DVAR for assimilating Sea Surface Temperature collected into the Caspian Sea, Journal of Numerical Analysis, Industrial and Applied Mathematics, Vol: 12, Pages: 9-28, ISSN: 1790-8140
Data Assimilation (DA) is an uncertainty quantification technique used to incorporate observed data into a prediction model in order to improve numerical forecasted results. Since a crucial issue in DA models is the ill-conditioning of the covariance matrices involved, it is mandatory to introduce preconditioning methods into DA software. Here we present first results obtained by introducing two different preconditioning methods into a DA software we are developing (named S3DVAR), which implements a Scalable Three-Dimensional Variational Data Assimilation model for assimilating sea surface temperature (SST) values collected in the Caspian Sea, using the Regional Ocean Modeling System (ROMS) with observations provided by the Group for High Resolution Sea Surface Temperature (GHRSST). We present the algorithmic strategies we employ and the numerical issues on data collected in two of the months which present the most significant variability in water temperature: August and March.
Arcucci R, D'Amore L, Carracciuolo L, et al., 2017, A Decomposition of the Tikhonov Regularization Functional Oriented to Exploit Hybrid Multilevel Parallelism, INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, Vol: 45, Pages: 1214-1235, ISSN: 0885-7458
Arcucci R, Celestino S, Toumi R, et al., 2017, Toward the S3DVAR data assimilation software for the Caspian Sea, International Conference on Numerical Analysis and Applied Mathematics (ICNAAM), Publisher: AIP Publishing, ISSN: 1551-7616
Data Assimilation (DA) is an uncertainty quantification technique used to incorporate observed data into a prediction model in order to improve numerical forecasted results. The forecasting model used for producing oceanographic predictions in the Caspian Sea is the Regional Ocean Modeling System (ROMS). Here we present the computational issues we face in a DA software we are developing (named S3DVAR), which implements a Scalable Three-Dimensional Variational Data Assimilation model for assimilating sea surface temperature (SST) values collected in the Caspian Sea, with observations provided by the Group for High Resolution Sea Surface Temperature (GHRSST). We present the algorithmic strategies we employ and the numerical issues on data collected in two of the months which present the most significant variability in water temperature: August and March.
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.