Imperial College London


Faculty of Engineering, Department of Aeronautics

Honorary Senior Lecturer



+44 (0)20 7594 5129, d.moxey




363, Roderic Hill Building, South Kensington Campus






57 results found

Eichstadt J, Peiro J, Moxey D, 2023, Efficient vectorised kernels for unstructured high-order finite element fluid solvers on GPU architectures in two dimensions, Computer Physics Communications, Vol: 284, ISSN: 0010-4655

Journal article

Laughton E, Zala V, Narayan A, Kirby RM, Moxey D et al., 2022, Fast Barycentric-Based Evaluation Over Spectral/hp Elements, Journal of Scientific Computing, Vol: 90, ISSN: 0885-7474

Journal article

Mengaldo G, Moxey D, Turner M, Moura RC, Jassim A, Taylor M, Peiro J, Sherwin S et al., 2021, Industry-Relevant Implicit Large-Eddy Simulation of a High-Performance Road Car via Spectral/hp Element Methods, SIAM Review, Vol: 63, Pages: 723-755, ISSN: 0036-1445

Journal article

Lykkegaard MB, Dodwell TJ, Moxey D, 2021, Accelerating uncertainty quantification of groundwater flow modelling using a deep neural network proxy, Computer Methods in Applied Mechanics and Engineering, Vol: 383, ISSN: 0045-7825

Journal article

Laughton E, Tabor G, Moxey D, 2021, A comparison of interpolation techniques for non-conformal high-order discontinuous Galerkin methods, Computer Methods in Applied Mechanics and Engineering, Vol: 381, ISSN: 0045-7825

Journal article

Yan Z-G, Pan Y, Castiglioni G, Hillewaert K, Peiró J, Moxey D, Sherwin SJ et al., 2021, Nektar++: Design and implementation of an implicit, spectral/hp element, compressible flow solver using a Jacobian-free Newton Krylov approach, Computers & Mathematics with Applications, Vol: 81, Pages: 351-372, ISSN: 0898-1221

At high Reynolds numbers the use of explicit in time compressible flow simulations with spectral/hp element discretization can become significantly limited by time step. To alleviate this limitation we extend the capability of the spectral/hp element open-source software framework, Nektar++, to include an implicit discontinuous Galerkin compressible flow solver. The integration in time is carried out by a singly diagonally implicit Runge–Kutta method. The non-linear system arising from the implicit time integration is iteratively solved by the Jacobian-free Newton Krylov (JFNK) method. A favorable feature of the JFNK approach is its extensive use of the explicit operators available from the previous explicit in time implementation. The functionalities of different building blocks of the implicit solver are analyzed from the point of view of software design and placed in appropriate hierarchical levels in the C++ libraries. In the detailed implementation, the contributions of different parts of the solver to computational cost, memory consumption and programming complexity are also analyzed. A combination of analytical and numerical methods is adopted to simplify the programming complexity in forming the preconditioning matrix. The solver is verified and tested using cases such as manufactured compressible Poiseuille flow, Taylor–Green vortex, turbulent flow over a circular cylinder, and shock wave boundary-layer interaction. The results show that the implicit solver can speed up the simulations while maintaining good simulation accuracy.

Journal article

Marcon J, Castiglioni G, Moxey D, Sherwin SJ, Peiro J et al., 2020, rp-adaptation for compressible flows, International Journal for Numerical Methods in Engineering, Vol: 121, Pages: 5405-5425, ISSN: 0029-5981

We present an rp-adaptation strategy for high-fidelity simulation of compressible inviscid flows with shocks. The mesh resolution in regions of flow discontinuities is increased by using a variational optimiser to r-adapt the mesh and cluster degrees of freedom there. In regions of smooth flow, we locally increase or decrease the local resolution through increasing or decreasing the polynomial order of the elements, respectively. This dual approach allows us to take advantage of the strengths of both methods for best computational performance, thereby reducing the overall cost of the simulation. The adaptation workflow uses a sensor for both discontinuities and smooth regions that is cheap to calculate, but the framework is general and could be used in conjunction with other feature-based sensors or error estimators. We demonstrate this proof-of-concept using two geometries at transonic and supersonic flow regimes. The method has been implemented in the open-source spectral/hp element framework Nektar++, and its dedicated high-order mesh generation tool NekMesh. The results show that the proposed rp-adaptation methodology is a reasonably cost-effective way of improving accuracy.

Journal article

Eichstaedt J, Vymazal M, Moxey D, Peiro J et al., 2020, A comparison of the shared-memory parallel programming models OpenMP, OpenACC and Kokkos in the context of implicit solvers for high-order FEM, Computer Physics Communications, Vol: 255, Pages: 1-15, ISSN: 0010-4655

We consider the application of three performance-portable programming models in the context of a high-order spectral element, implicit time-stepping solver for the Navier–Stokes equations. We aim to evaluate whether the use of these models allows code developers to deliver high-performance solvers for computational fluid dynamics simulations that are capable of effectively utilising both many-core CPU and GPU architectures. Using the core elliptic solver for the Navier–Stokes equations as a benchmarking guide, we evaluate the performance of these models on a range of unstructured meshes and give guidelines for the translation of existing codebases and their data structures to these models.

Journal article

Moxey D, Cantwell CD, Bao Y, Cassinelli A, Castiglioni G, Chun S, Juda E, Kazemi E, Lackhove K, Marcon J, Mengaldo G, Serson D, Turner M, Xu H, Peiro J, Kirby RM, Sherwin SJ et al., 2020, Nektar++: enhancing the capability and application of high-fidelity spectral/hp element methods, Computer Physics Communications, Vol: 249, Pages: 1-18, ISSN: 0010-4655

Nektar++ is an open-source framework that provides a flexible, high-performance and scalable platform for the development of solvers for partial differential equations using the high-order spectral/hp element method. In particular, Nektar++ aims to overcome the complex implementation challenges that are often associated with high-order methods, thereby allowing them to be more readily used in a wide range of application areas. In this paper, we present the algorithmic, implementation and application developments associated with our Nektar++ version 5.0 release. We describe some of the key software and performance developments, including our strategies on parallel I/O, on in situ processing, the use of collective operations for exploiting current and emerging hardware, and interfaces to enable multi-solver coupling. Furthermore, we provide details on a newly developed Python interface that enables a more rapid introduction for new users unfamiliar with spectral/hp element methods, C++ and/or Nektar++. This release also incorporates a number of numerical method developments – in particular: the method of moving frames (MMF), which provides an additional approach for the simulation of equations on embedded curvilinear manifolds and domains; a means of handling spatially variable polynomial order; and a novel technique for quasi-3D simulations (which combine a 2D spectral element and 1D Fourier spectral method) to permit spatially-varying perturbations to the geometry in the homogeneous direction. Finally, we demonstrate the new application-level features provided in this release, namely: a facility for generating high-order curvilinear meshes called NekMesh; a novel new AcousticSolver for aeroacoustic problems; our development of a ‘thick’ strip model for the modelling of fluid–structure interaction (FSI) problems in the context of vortex-induced vibrations (VIV). We conclude by commenting on some lessons learned and by discussing some directions fo

Journal article



Sherwin SJ, Moxey D, Peiró J, Vincent PE, Schwab C et al., 2020, Preface, ISBN: 9783030396466


Cohen J, Nowell J, Mortari F, Moxey D, Cantwell C et al., 2019, london-escience/tempss: v0.5



Vymazal M, Moxey D, Cantwell CD, Sherwin SJ, Kirby RM et al., 2019, On weak Dirichlet boundary conditions for elliptic problems in the continuous Galerkin method, Journal of Computational Physics, Vol: 394, Pages: 732-744, ISSN: 0021-9991

We combine continuous and discontinuous Galerkin methods in the setting of a model diffusion problem. Starting from a hybrid discontinuous formulation, we replace element interiors by more general subsets of the computational domain – groups of elements that support a piecewise-polynomial continuous expansion. This step allows us to identify a new weak formulation of Dirichlet boundary condition in the continuous framework. We show that the boundary condition leads to a stable discretization with a single parameter insensitive to mesh size and polynomial order of the expansion. The robustness of the approach is demonstrated on several numerical examples.

Journal article

Buscariolo FF, Hoessler J, Moxey D, Jassim A, Gouder K, Basley J, Murai Y, Assi GRS, Sherwin SJ et al., 2019, Spectral/hp element simulation of flow past a Formula One front wing: validation against experiments, Publisher: arXiv

Emerging commercial and academic tools are regularly being applied to the design of road and race cars, but there currently are no well-established benchmark cases to study the aerodynamics of race car wings in ground effect. In this paper we propose a new test case, with a relatively complex geometry, supported by the availability of CAD model and experimental results. We refer to the test case as the Imperial Front Wing, originally based on the front wing and endplate design of the McLaren 17D race car. A comparison of different resolutions of a high fidelity spectral/hp element simulation using under-resolved DNS/implicit LES approach with fourth and fifth polynomial order is presented. The results demonstrate good correlation to both the wall-bounded streaklines obtained by oil flow visualization and experimental PIV results, correctly predicting key characteristics of the time-averaged flow structures, namely intensity, contours and locations. This study highlights the resolution requirements in capturing salient flow features arising from this type of challenging geometry, providing an interesting test case for both traditional and emerging high-fidelity simulations.

Working paper

Moxey D, Sastry SP, Kirby RM, 2019, Interpolation Error Bounds for Curvilinear Finite Elements and Their Implications on Adaptive Mesh Refinement, Journal of Scientific Computing, Vol: 78, Pages: 1045-1062, ISSN: 0885-7474

Journal article

Marcon J, Peiro J, Moxey D, Bergemann N, Bucklow H, Gammon M et al., 2019, A semi-structured approach to curvilinear mesh generation around streamlined bodies, AIAA Scitech 2019 Forum, Publisher: AIAA

We present an approach for robust high-order mesh generation specially tailored to streamlined bodies. The method is based on a semi-structured approach which combines the high quality of structured meshes in the near-field with the flexibility of unstructured meshes in the far-field. We utilise medial axis technology to robustly partition the near-field into blocks which can be meshed coarsely with a linear swept mesher. A high-order mesh of the near-field is then generated and split using an isoparametric approach which allows us to obtain highly stretched elements aligned with the flow field. Special treatment of the partition is performed on the wing root junction and the trailing edge (into the wake) to obtain an H-type mesh configuration with anisotropic hexahedra ideal for the strong shear of high Reynolds number simulations. We then proceed to discretise the far-field using traditional robust tetrahedral meshing tools. This workflow is made possible by two sets of tools: CADfix, focused on the CAD system, the block partitioning of the near-field and the generation of a linear mesh; and NekMesh, focused on the curving of the high-order mesh and the generation of highly-stretched boundary layer elements. We demonstrate this approach on a NACA0012 wing attached to a wall and show that a gap between the wake partition and the wall can be inserted to remove the dependency of the partitioning procedure on the local geometry.

Conference paper

Yakhot A, Feldman Y, Moxey D, Sherwin S, Karniadakis GE et al., 2019, Near-Wall Turbulence in a Localized Puff in a Pipe, 8th iTi Conference and Workshop on Turbulent Aspects in Wind Energy, Publisher: Springer International Publishing AG, Pages: 15-20, ISSN: 0930-8989

Conference paper

Eichstädt J, Moxey D, Peiró J, 2019, Towards performance-portable high-order implicit flow solvers

We discuss the steps required to adapt legacy flow, or structural, solvers to modern CPU and GPU architectures using a portable programming model. These steps are illustrated using a high-order mesh optimiser and an implicit Helmholtz solver as examples. We show that satisfactory performance can be achieved in both architectures using such a framework, and highlight the importance of developing efficient data structures.

Conference paper

Yakhot A, Feldman Y, Moxey D, Sherwin S, Karniadakis GE et al., 2019, Turbulence in a Localized Puff in a Pipe, Flow, Turbulence and Combustion, ISSN: 1386-6184

We have performed direct numerical simulations of a spatio-temporally intermittent flow in a pipe for Rem = 2250. From previous experiments and simulations of pipe flow, this value has been estimated as a threshold when the average speeds of upstream and downstream fronts of a puff are identical (Barkley et al., Nature 526, 550–553, 2015; Barkley et al., 2015). We investigated the structure of an individual puff by considering three-dimensional snapshots over a long time period. To assimilate the velocity data, we applied a conditional sampling based on the location of the maximum energy of the transverse (turbulent) motion. Specifically, at each time instance, we followed a turbulent puff by a three-dimensional moving window centered at that location. We collected a snapshot-ensemble (10000 time instances, snapshots) of the velocity fields acquired over T = 2000D/U time interval inside the moving window. The cross-plane velocity field inside the puff showed the dynamics of a developing turbulence. In particular, the analysis of the cross-plane radial motion yielded the illustration of the production of turbulent kinetic energy directly from the mean flow. A snapshot-ensemble averaging over 10000 snapshots revealed azimuthally arranged large-scale (coherent) structures indicating near-wall sweep and ejection activity. The localized puff is about 15-17 pipe diameters long and the flow regime upstream of its upstream edge and downstream of its leading edge is almost laminar. In the near-wall region, despite the low Reynolds number, the turbulence statistics, in particular, the distribution of turbulence intensities, Reynolds shear stress, skewness and flatness factors, become similar to a fully-developed turbulent pipe flow in the vicinity of the puff upstream edge. In the puff core, the velocity profile becomes flat and logarithmic. It is shown that this “fully-developed turbulent flash” is very narrow being about t

Journal article

Turner M, Peiro J, Moxey D, 2018, Curvilinear mesh generation using a variational framework, Computer Aided Design, Vol: 103, Pages: 73-91, ISSN: 0010-4485

We aim to tackle the challenge of generating unstructured high-order meshes of complex three-dimensional bodies, which remains a significant bottleneck in the wider adoption of high-order methods. In particular we show that by adopting a variational approach to the generation process, many of the current popular high-order generation methods can be encompassed under a single unifying framework. This allows us to compare the effectiveness of these methods and to assess the quality of the meshes they produce in a systematic fashion. We present a detailed overview of the theory and formulation of the variational framework, and we highlight how such formulation can be effectively exploited to yield a highly-efficient parallel implementation. The effectiveness of this approach is examined by considering a number of two- and three-dimensional examples, where we show how the proposed approach can be used for both mesh quality optimisation and untangling of invalid high-order meshes.

Journal article

Eichstaedt JR, Green M, Turner M, Peiro J, Moxey D et al., 2018, Accelerating high-order mesh optimisation with an architecture-independent programming model, Computer Physics Communications, Vol: 229, Pages: 36-53, ISSN: 0010-4655

Heterogeneous manycore performance-portable programming models and libraries, such as Kokkos, have been developed to facilitate portability and maintainability of high-performance computing codes and enhance their resilience to architectural changes. Here we investigate the suitability of the Kokkos programming model for optimizing the performance of the high-order mesh generator NekMesh, which has been developed to efficiently generate meshes containing millions of elements for industrial problems involving complex geometries. We describe the variational approach for a posteriori high-order mesh optimisation employed within NekMesh and its parallel implementation. We discuss its implementation for modern manycore massively parallel shared-memory CPU and GPU platforms using Kokkos and demonstrate that we achieve increased performance on multicore CPUs and accelerators compared with a native Pthreads implementation. Further, we show that we achieve additional speedup and cost reduction by running on GPUs without any hardware-specific code optimisation.

Journal article

Cohen J, Marcon J, Turner M, Cantwell C, Sherwin SJ, Peiro J, Moxey D et al., 2018, Simplifying high-order mesh generation for computational scientists, 10th International Workshop on Science Gateways, Publisher: CEUR Workshop Proceedings, ISSN: 1613-0073

Computational modelling is now tightly integrated into many fields of research in science and industry. Computational fluid dynamics software, for example, gives engineers the ability to model fluid flow around complex geometries defined in Computer-Aided Design (CAD) packages, without the expense of constructing large wind tunnel experiments. However, such modelling requires translation from an initial CAD geometry to a mesh of many small elements that modelling software uses to represent the approximate solution in the numerical method. Generating sufficiently high-quality meshes for simulation is a time-consuming, iterative and error-prone process that is often complicated by the need to interact with multiple command-line tools to generate and visualise the mesh data. In this paper we describe our approach to overcoming this complexity through the addition of a meshing console to Nekkloud, a science gateway for simplifying access to the functionality of the Nektar++ spectral/hp element framework. The meshing console makes use of the NekMesh tool in Nektar++ to help reduce the complexity of the mesh generation process. It offers a web-based interface for specifying parameters, undertaking meshing and visualising results. The meshing console enables Nekkloud to offer support for a full, end-to-end simulation pipeline from initial CAD geometry to simulation results.

Conference paper

De Grazia D, Moxey D, Sherwin SJ, Kravtsova MA, Ruban AI et al., 2018, Direct numerical simulation of a compressible boundary-layer flow past an isolated three-dimensional hump in a high-speed subsonic regime, Physical Review Fluids, Vol: 3, ISSN: 2469-990X

In this paper we study the boundary-layer separation produced in a high-speed subsonic boundary layer by a small wall roughness. Specifically, we present a direct numerical simulation (DNS) of a two-dimensional boundary-layer flow over a flat plate encountering a three-dimensional Gaussian-shaped hump. This work was motivated by the lack of DNS data of boundary-layer flows past roughness elements in a similar regime which is typical of civil aviation. The Mach and Reynolds numbers are chosen to be relevant for aeronautical applications when considering small imperfections at the leading edge of wings. We analyze different heights of the hump: The smaller heights result in a weakly nonlinear regime, while the larger result in a fully nonlinear regime with an increasing laminar separation bubble arising downstream of the roughness element and the formation of a pair of streamwise counterrotating vortices which appear to support themselves.

Journal article

Marcon J, Turner M, Peiro J, Moxey D, Pollard CR, Bucklow H, Gammon M et al., 2018, High-order curvilinear hybrid mesh generation for CFD simulations, AIAA Aerospace Sciences Meeting

We describe a semi-structured method for the generation of high-order hybrid meshes suited for the simulation of high Reynolds number flows. This is achieved through the use of highly stretched elements in the viscous boundary layers near the wall surfaces. CADfix is used to first repair any possible defects in the CAD geometry and then generate a medial object based decomposition of the domain that wraps the wall boundaries with partitions suitable for the generation of either prismatic or hexahedral elements. The latter is a novel distinctive feature of the method that makes it possible to obtain well-shaped hexahedral meshes at corners or junctions in the boundary layer. The medial object approach allows greater control over the “thickness” of the boundary-layer mesh than is generally achievable with advancing layer techniques. CADfix subsequently generates a hybrid straight-sided mesh of prismatic and hexahedral elements in the near-field region modelling the boundary layer, and tetrahedral elements in the far-field region covering the rest of the domain. The mesh in the near-field region provides a framework that facilitates the generation, via an isoparametric technique, of layers of highly stretched elements with a distribution of points in the direction normal to the wall tailored to efficiently and accurately capture the flow in the boundary layer. The final step is the generation of a high-order mesh using NekMesh, a high-order mesh generator within the Nektar++ framework. NekMesh uses the CADfix API as a geometry engine that handles all the geometrical queries to the CAD geometry required during the high-order mesh generation process. We will describe in some detail the methodology using a simple geometry, a NACA wing tip, for illustrative purposes. Finally, we will present two examples of application to reasonably complex geometries proposed by NASA as CFD validation cases: the Common Research Model and the Rotor 67.

Conference paper

Turner M, Moxey D, Peiro J, Gammon M, Pollard CR, Bucklow H et al., 2017, A framework for the generation of high-order curvilinear hybrid meshes for CFD simulations, 26th International Meshing Roundtable (IMR), Publisher: Elsevier Science BV, Pages: 206-218, ISSN: 1877-7058

We present a pipeline of state-of-the-art techniques for the generation of high-order meshes that contain highly stretched elements in viscous boundary layers, and are suitable for flow simulations at high Reynolds numbers. The pipeline uses CADfix to generate a medial object based decomposition of the domain, which wraps the wall boundaries with prismatic partitions. The use of medial object allows the prism height to be larger than is generally possible with advancing layer techniques. CADfix subsequently generates a hybrid straight-sided (or linear) mesh. A high-order mesh is then generated a posteriori using NekMesh, a high-order mesh generator within the Nektar++ framework. During the high-order mesh generation process, the CAD definition of the domain is interrogated; we describe the process for integrating the CADfix API as an alternative backend geometry engine for NekMesh, and discuss some of the implementation issues encountered. Finally, we illustrate the methodology using three geometries of increasing complexity: a wing tip, a simplified landing gear and an aircraft in cruise configuration.

Conference paper

Marcon J, Turner M, Moxey D, Sherwin SJ, Peiro J et al., 2017, A variational approach to high-order r-adaptation, 26th International Meshing Roundtable

A variational framework, initially developed for high-order mesh optimisation, is being extended for r-adaptation. The method is based on the minimisation of a functional of the mesh deformation. To achieve adaptation, elements of the initial mesh are manipulated using metric tensors to obtain target elements. The nonlinear optimisation in turn adapts the final high-order mesh to best fit the description of the target elements by minimising the element distortion. Encouraging preliminary results prove that the method behaves well and can be used in the future for more extensive work which shall include the use of error indicators from CFD simulations.

Conference paper

Ekelschot D, Moxey D, Sherwin SJ, Peiro J et al., 2017, A p-adaptation method for compressible flow problems using a goal-based error indicator, Computers and Structures, Vol: 181, Pages: 55-69, ISSN: 0045-7949

An accurate calculation of aerodynamic force coefficients for a given geometry is of fundamental importance for aircraft design. High-order spectral/hp element methods, which use a discontinuous Galerkin discretisation of the compressible Navier–Stokes equations, are now increasingly being used to improve the accuracy of flow simulations and thus the force coefficients. To reduce error in the calculated force coefficients whilst keeping computational cost minimal, we propose a p-adaptation method where the degree of the approximating polynomial is locally increased in the regions of the flow where low resolution is identified using a goal-based error estimator as follows. Given an objective functional such as the aerodynamic force coefficients, we use control theory to derive an adjoint problem which provides the sensitivity of the functional with respect to changes in the flow variables, and assume that these changes are represented by the local truncation error. In its final form, the goal-based error indicator represents the effect of truncation error on the objective functional, suitably weighted by the adjoint solution. Both flow governing and adjoint equations are solved by the same high-order method, where we allow the degree of the polynomial within an element to vary across the mesh. We initially calculate a steady-state solution to the governing equations using a low polynomial order and use the goal-based error indicator to identify parts of the computational domain that require improved solution accuracy, which is achieved by increasing the approximation order. We demonstrate the cost-effectiveness of our method across a range of polynomial orders by considering a number of examples in two and three dimensions and in subsonic and transonic flow regimes. Reductions in both the number of degrees of freedom required to resolve the force coefficients to a given error and the computational cost are observed when using the p-adaptive technique.

Journal article

Moxey D, Cantwell CD, Mengaldo G, Serson D, Ekelschot D, Peiró J, Sherwin SJ, Kirby RM et al., 2017, Towards p-adaptive spectral/hp element methods for modelling industrial flows, ICOSAHOM-2016 - International Conference on Spectral and High-order Methods, Publisher: Springer International Publishing AG, Pages: 63-79, ISSN: 1439-7358

There is an increasing requirement from both academia and industry for high-fidelity flow simulations that are able to accurately capture complicated and transient flow dynamics in complex geometries. Coupled with the growing availability of high-performance, highly parallel computing resources, there is therefore a demand for scalable numerical methods and corresponding software frameworks which can deliver the next generation of complex and detailed fluid simulations to scientists and engineers in an efficient way. In this article we discuss recent and upcoming advances in the use of the spectral/hp element method for addressing these modelling challenges. To use these methods efficiently for such applications, it is critical that computational resolution is placed in the regions of the flow where it is needed most, which is often not known a priori. We propose the use of spatially and temporally varying polynomial order, coupled with appropriate error estimators, as key requirements in permitting these methods to achieve computationally efficient high-fidelity solutions to complex flow problems in the fluid dynamics community.

Conference paper

Turner M, Peiro J, Moxey D, 2016, A variational framework for high-order mesh generation, Procedia Engineering, Vol: 163, Pages: 340-352, ISSN: 1877-7058

The generation of sufficiently high quality unstructured high-order meshes remains a significant obstacle in the adoption of high-order methods. However, there is little consensus on which approach is the most robust, fastest and produces the ‘best’ meshes. We aim to provide a route to investigate this question, by examining popular high-order mesh generation methods in the context of an efficient variational framework for the generation of curvilinear meshes. By considering previous works in a variational form, we are able to compare their characteristics and study their robustness. Alongside a description of the theory and practical implementation details, including an efficient multi-threading parallelisation strategy, we demonstrate the effectiveness of the framework, showing how it can be used for both mesh quality optimisation and untangling of invalid meshes.

Journal article

Moxey D, Cantwell C, Kirby RM, Sherwin S et al., 2016, Optimizing the performance of the spectral/hp element method with collective linear algebra operations, Computer Methods in Applied Mechanics and Engineering, Vol: 310, Pages: 628-645, ISSN: 0045-7825

As computing hardware evolves, increasing core counts mean that memory bandwidth is becoming the deciding factor in attaining peak performance of numerical methods. High-order finite element methods, such as those implemented in the spectral/hp framework Nektar++, are particularly well-suited to this environment. Unlike low-order methods that typically utilize sparse storage, matrices representing high-order operators have greater density and richer structure. In this paper, we show how these qualities can be exploited to increase runtime performance on nodes that comprise a typical high-performance computing system, by amalgamating the action of key operators on multiple elements into a single, memory-efficient block. We investigate different strategies for achieving optimal performance across a range of polynomial orders and element types. As these strategies all depend on external factors such as BLAS implementation and the geometry of interest, we present a technique for automatically selecting the most efficient strategy at runtime.

Journal article
