42 results found
Farrell PE, Knepley MG, Mitchell L, et al., 2021, PCPATCH: Software for the Topological Construction of Multigrid Relaxation Methods, ACM Transactions on Mathematical Software, Vol: 47, ISSN: 0098-3500
Effective relaxation methods are necessary for good multigrid convergence. For many equations, standard Jacobi and Gauß-Seidel are inadequate, and more sophisticated space decompositions are required; examples include problems with semidefinite terms or saddle point structure. In this article, we present a unifying software abstraction, PCPATCH, for the topological construction of space decompositions for multigrid relaxation methods. Space decompositions are specified by collecting topological entities in a mesh (such as all vertices or faces) and applying a construction rule (such as taking all degrees of freedom in the cells around each entity). The software is implemented in PETSc and facilitates the elegant expression of a wide range of schemes merely by varying solver options at runtime. In turn, this allows for the very rapid development of fast solvers for difficult problems.
Farrell PE, Mitchell L, Scott R, et al., 2021, A Reynolds-robust preconditioner for the Scott-Vogelius discretization of the stationary incompressible Navier-Stokes equations, SMAI Journal of Computational Mathematics, Vol: 7, Pages: 75-96
Augmented Lagrangian preconditioners have successfully yielded Reynolds-robust preconditioners for the stationary incompressible Navier-Stokes equations, but only for specific discretizations. The discretizations for which these preconditioners have been designed possess error estimates which depend on the Reynolds number, with the discretization error deteriorating as the Reynolds number is increased. In this paper we present an augmented Lagrangian preconditioner for the Scott-Vogelius discretization on barycentrically-refined meshes. This achieves both Reynolds-robust performance and Reynolds-robust error estimates. A key consideration is the design of a suitable space decomposition that captures the kernel of the grad-div term added to control the Schur complement; the same barycentric refinement that guarantees inf-sup stability also provides a local decomposition of the kernel of the divergence. The robustness of the scheme is confirmed by numerical experiments in two and three dimensions.
Sun T, Mitchell L, Kulkarni K, et al., 2020, A study of vectorization for matrix-free finite element methods, International Journal of High Performance Computing Applications, Vol: 34, Pages: 629-644, ISSN: 1094-3420
Vectorization is increasingly important to achieve high performance on modern hardware with SIMD instructions. Assembly of matrices and vectors in the finite element method, which is characterized by iterating a local assembly kernel over unstructured meshes, poses difficulties to effective vectorization. Maintaining a user-friendly high-level interface with a suitable degree of abstraction while generating efficient, vectorized code for the finite element method is a challenge for numerical software systems and libraries. In this work, we study cross-element vectorization in the finite element framework Firedrake via code transformation and demonstrate the efficacy of such an approach by evaluating a wide range of matrix-free operators spanning different polynomial degrees and discretizations on two recent CPUs using three mainstream compilers. Our experiments show that our approaches for cross-element vectorization achieve 30% of theoretical peak performance for many examples of practical significance, and exceed 50% for cases with high arithmetic intensities, with consistent speed-up over (intra-element) vectorization restricted to the local assembly kernels.
Gibson T, Mitchell L, Ham D, et al., 2020, Slate: extending Firedrake's domain-specific abstraction to hybridized solvers for geoscience and beyond, Geoscientific Model Development, Vol: 13, Pages: 735-761, ISSN: 1991-959X
Within the finite element community, discontinuous Galerkin (DG) and mixed finite element methods have becomeincreasingly popular in simulating geophysical flows. However, robust and efficient solvers for the resulting saddle-point andelliptic systems arising from these discretizations continue to be an on-going challenge. One possible approach for addressingthis issue is to employ a method known as hybridization, where the discrete equations are transformed such that classic staticcondensation and local post-processing methods can be employed. However, it is challenging to implement hybridization as performant parallel code within complex models, whilst maintaining separation of concerns between applications scientistsand software experts. In this paper, we introduce a domain-specific abstraction within the Firedrake finite element library thatpermits the rapid execution of these hybridization techniques within a code-generating framework. The resulting frameworkcomposes naturally with Firedrake’s solver environment, allowing for the implementation of hybridization and static condensa-tion as runtime-configurable preconditioners via the Python interface to PETSc, petsc4py. We provide examples derived from second order elliptic problems and geophysical fluid dynamics. In addition, we demonstrate that hybridization shows greatpromise for improving the performance of solvers for mixed finite element discretizations of equations related to large-scalegeophysical flows.
Kirby RC, Mitchell L, 2019, Code generation for generally mapped finite elements, ACM Transactions on Mathematical Software, Vol: 45, ISSN: 0098-3500
Many classical finite elements such as the Argyris and Bell elements have long been absent from high-level PDE software. Building on recent theoretical work, we describe how to implement very general finite-element transformations in FInAT and hence into the Firedrake finite-element system. Numerical results evaluate the new elements, comparing them to existing methods for classical problems. For a second-order model problem, we find that new elements give smooth solutions at a mild increase in cost over standard Lagrange elements. For fourth-order problems, however, the newly enabled methods significantly outperform interior penalty formulations. We also give some advanced use cases, solving the nonlinear Cahn-Hilliard equation and some biharmonic eigenvalue problems (including Chladni plates) using C1 discretizations.
Ham DA, Mitchell L, Paganini A, et al., 2019, Automated shape differentiation in the Unified Form Language, STRUCTURAL AND MULTIDISCIPLINARY OPTIMIZATION, Vol: 60, Pages: 1813-1820, ISSN: 1615-147X
Farrell PE, Mitchell L, Wechsung F, 2019, An augmented lagrangian preconditioner for the 3D stationary incompressible Navier-Stokes equations at high Reynolds number, SIAM Journal on Scientific Computing, Vol: 41, Pages: A3073-A3096, ISSN: 1064-8275
In [M. Benzi and M. A. Olshanskii, SIAM J. Sci. Comput., 28 (2006), pp. 2095-2113] a preconditioner of augmented Lagrangian type was presented for the two-dimensional stationary incompressible Navier-Stokes equations that exhibits convergence almost independent of Reynolds number. The algorithm relies on a highly specialized multigrid method involving a custom prolongation operator and for robustness requires the use of piecewise constant finite elements for the pressure. However, the prolongation operator and velocity element used do not directly extend to three dimensions: the local solves necessary in the prolongation operator do not satisfy the inf-sup condition. In this work we generalize the preconditioner to three dimensions, proposing alternative finite elements for the velocity and prolongation operators for which the preconditioner works robustly. The solver is effective at high Reynolds number: on a three-dimensional lid-driven cavity problem with approximately one billion degrees of freedom, the average number of Krylov iterations per Newton step varies from 4.5 at Re = 10 to 3 at Re = 1000 and 5 at Re = 5000.
Gibson T, McRae ATT, Cotter C, et al., 2019, Compatible finite element methods for geophysical flows: Automation and implementation using Firedrake, Publisher: Springer International Publishing, ISBN: 9783030239565
This book introduces recently developed mixed finite element methods for large-scale geophysical flows that preserve essential numerical properties for accurate simulations. The methods are presented using standard models of atmospheric flows and are implemented using the Firedrake finite element library. Examples guide the reader through problem formulation, discretisation, and automated implementation.The so-called “compatible” finite element methods possess key numerical properties which are crucial for real-world operational weather and climate prediction. The authors summarise the theory and practical implications of these methods for model problems, introducing the reader to the Firedrake package and providing open-source implementations for all the examples covered.Students and researchers with engineering, physics, mathematics, or computer science backgrounds will benefit from this book. Those readers who are less familiar with the topic are provided with an overview of geophysical fluid dynamics.
Kärnä T, Kramer SC, Mitchell L, et al., 2018, Thetis coastal ocean model: discontinuous Galerkin discretization for the three-dimensional hydrostatic equations, Geoscientific Model Development, Vol: 11, Pages: 4359-4382, ISSN: 1991-959X
Unstructured grid ocean models are advantageous for simulating the coastal ocean and river-estuary-plume systems. However, unstructured grid models tend to be diffusive and/or computationally expensive which limits their applicability to real life problems. In this paper, we describe a novel discontinuous Galerkin (DG) finite element discretization for the hydrostatic equations. The formulation is fully conservative and second-order accurate in space and time. Monotonicity of the advection scheme is ensured by using a strong stability preserving time integration method and slope limiters. Compared to previous DG models advantages include a more accurate mode splitting method, revised viscosity formulation, and new second-order time integration scheme. We demonstrate that the model is capable of simulating baroclinic flows in the eddying regime with a suite of test cases. Numerical dissipation is well-controlled, being comparable or lower than in existing state-of-the-art structured grid models.
Kärnä T, Kramer SC, Mitchell L, et al., 2018, Thetis coastal ocean model: discontinuous Galerkin discretization for the three-dimensional hydrostatic equations, Geoscientific Model Development, Vol: 11, Pages: 4359-4382
<jats:p>Abstract. Unstructured grid ocean models are advantageous for simulating the coastal ocean and river–estuary–plume systems. However, unstructured grid models tend to be diffusive and/or computationally expensive, which limits their applicability to real-life problems. In this paper, we describe a novel discontinuous Galerkin (DG) finite element discretization for the hydrostatic equations. The formulation is fully conservative and second-order accurate in space and time. Monotonicity of the advection scheme is ensured by using a strong stability-preserving time integration method and slope limiters. Compared to previous DG models, advantages include a more accurate mode splitting method, revised viscosity formulation, and new second-order time integration scheme. We demonstrate that the model is capable of simulating baroclinic flows in the eddying regime with a suite of test cases. Numerical dissipation is well-controlled, being comparable or lower than in existing state-of-the-art structured grid models. </jats:p>
Homolya M, Mitchell L, Luporini F, et al., 2018, TSFC: a structure-preserving form compiler, SIAM Journal on Scientific Computing, ISSN: 1064-8275
A form compiler takes a high-level description of the weak form of partialdifferential equations and produces low-level code that carries out the finiteelement assembly. In this paper we present the Two-Stage Form Compiler (TSFC),a new form compiler with the main motivation to maintain the structure of theinput expression as long as possible. This facilitates the application ofoptimizations at the highest possible level of abstraction. TSFC features anovel, structure-preserving method for separating the contributions of a formto the subblocks of the local tensor in discontinuous Galerkin problems. Thisenables us to preserve the tensor structure of expressions longer through thecompilation process than other form compilers. This is also achieved in part bya two-stage approach that cleanly separates the lowering of finite elementconstructs to tensor algebra in the first stage, from the scheduling of thosetensor operations in the second stage. TSFC also efficiently traversescomplicated expressions, and experimental evaluation demonstrates goodcompile-time performance even for highly complex forms.
Kirby RC, Mitchell L, 2017, Solver composition across the PDE/linear algebra barrier, SIAM Journal on Scientific Computing, Vol: 40, Pages: C76-C98
The efficient solution of discretisations of coupled systems of partialdifferential equations (PDEs) is at the core of much of numerical simulation.Significant effort has been expended on scalable algorithms to preconditionKrylov iterations for the linear systems that arise. With few exceptions, thereported numerical implementation of such solution strategies is specific to aparticular model setup, and intimately ties the solver strategy to thediscretisation and PDE, especially when the preconditioner requires auxiliaryoperators. In this paper, we present recent improvements in the Firedrakefinite element library that allow for straightforward development of thebuilding blocks of extensible, composable preconditioners that decouple thesolver from the model formulation. Our implementation extends the algebraiccomposability of linear solvers offered by the PETSc library by augmentingoperators, and hence preconditioners, with the ability to provide any necessaryauxiliary operators. Rather than specifying up front the full solverconfiguration, tied to the model, solvers can be developed independently ofmodel formulation and configured at runtime. We illustrate with examples fromincompressible fluids and temperature-driven convection.
Yamazaki H, Shipton J, Cullen MJP, et al., 2017, Vertical slice modelling of nonlinear Eady waves using a compatible finite element method, Journal of Computational Physics, Vol: 343, Pages: 130-149, ISSN: 1090-2716
A vertical slice model is developed for the Euler–Boussinesq equations with a constant temperature gradient in the direction normal to the slice (the Eady–Boussinesq model). The model is a solution of the full three-dimensional equations with no variation normal to the slice, which is an idealised problem used to study the formation and subsequent evolution of weather fronts. A compatible finite element method is used to discretise the governing equations. To extend the Charney–Phillips grid staggering in the compatible finite element framework, we use the same node locations for buoyancy as the vertical part of velocity and apply a transport scheme for a partially continuous finite element space. For the time discretisation, we solve the semi-implicit equations together with an explicit strong-stability-preserving Runge–Kutta scheme to all of the advection terms. The model reproduces several quasi-periodic lifecycles of fronts despite the presence of strong discontinuities. An asymptotic limit analysis based on the semi-geostrophic theory shows that the model solutions are converging to a solution in cross-front geostrophic balance. The results are consistent with the previous results using finite difference methods, indicating that the compatible finite element method is performing as well as finite difference methods for this test problem. We observe dissipation of kinetic energy of the cross-front velocity in the model due to the lack of resolution at the fronts, even though the energy loss is not likely to account for the large gap on the strength of the fronts between the model result and the semi-geostrophic limit solution.
Mitchell L, Ham DA, McRae ATT, et al., 2017, Firedrake: automating the finite element method by composing abstractions, ACM Transactions on Mathematical Software, Vol: 43, Pages: 1-27, ISSN: 1557-7295
Firedrake is a new tool for automating the numerical solution of partial differential equations. Firedrakeadopts the domain-specific language for the finite element method of the FEniCS project, but with a purePython runtime-only implementation centred on the composition of several existing and new abstractions forparticular aspects of scientific computing. The result is a more complete separation of concerns which easesthe incorporation of separate contributions from computer scientists, numerical analysts and applicationspecialists. These contributions may add functionality, or improve performance.Firedrake benefits from automatically applying new optimisations. This includes factorising mixed functionspaces, transforming and vectorising inner loops, and intrinsically supporting block matrix operations.Importantly, Firedrake presents a simple public API for escaping the UFL abstraction. This allows users toimplement common operations that fall outside pure variational formulations, such as flux-limiters.
McRae ATT, Mitchell L, Bercea, et al., 2016, Automated Generation and Symbolic Manipulation of Tensor Product Finite Elements, SIAM Journal on Scientific Computing, Vol: 38, Pages: S25-S47, ISSN: 1064-8275
We describe and implement a symbolic algebra for scalar and vector-valued finite elements, enabling the computer generation of elements with tensor product structure on quadrilateral, hexahedral, and triangular prismatic cells. The algebra is implemented as an extension to the domain-specific language UFL, the Unified Form Language. This allows users to construct many finite element spaces beyond those supported by existing software packages. We have made corresponding extensions to FIAT, the FInite element Automatic Tabulator, to enable numerical tabulation of such spaces. This tabulation is consequently used during the automatic generation of low-level code that carries out local assembly operations, within the wider context of solving finite element problems posed over such function spaces. We have done this work within the code-generation pipeline of the software package Firedrake; we make use of the full Firedrake package to present numerical examples.
Bercea G, McRae ATT, Ham DA, et al., 2016, A structure-exploiting numbering algorithm for finite elements on extruded meshes, and its performance evaluation in Firedrake, Geoscientific Model Development, Vol: 9, Pages: 3803-3815, ISSN: 1991-9603
We present a generic algorithm for numbering and then efﬁciently iterating over the data values attached to an extruded mesh. An extruded mesh is formed by replicating an existing mesh, assumed to be unstructured, to form layers of prismatic cells. Applications of extruded meshes include, but are not limited to, the representation of 3D high aspect ratio domains employed by geophysical ﬁnite element simulations. These meshes are structured in the extruded direction. The algorithm presented here exploits this structure to avoid the performance penalty traditionally associated with unstructured meshes. We evaluate the implementation of this algorithm in the Firedrake ﬁnite element system on a range of low compute intensity operations which constitute worst cases for data layout performance exploration. The experiments show that having structure along the extruded direction enables the cost of the indirect data accesses to be amortized after 10-20 layers as long as the underlying mesh is well-ordered. We characterise the resulting spatial and temporal reuse in a representative set of both continuous-Galerkin and discontinuous-Galerkin discretisations. On meshes with realistic numbers of layers the performance achieved is between 70% and 90% of a theoretical hardware-speciﬁc limit.
Lange M, Mitchell L, Knepley M, et al., 2016, Efficient Mesh Management in Firedrake Using PETSc DMPlex, SIAM Journal on Scientific Computing, Vol: 38, Pages: S143-S155, ISSN: 1095-7197
The use of composable abstractions allows the application of new and established algorithms to a wide range of problems, while automatically inheriting the benefits of well-known performance optimizations. This work highlights the composition of the PETSc DMPlex domain topology abstraction with the Firedrake automated finite element system to create a PDE solving environment that combines expressiveness, flexibility, and high performance. We describe how Firedrake utilizes DMPlex to provide the indirection maps required for finite element assembly, while supporting various mesh input formats and runtime domain decomposition. In particular, we describe how DMPlex and its accompanying data structures allow the generic creation of user-defined discretizations, while utilizing data layout optimizations that improve cache coherency and ensure overlapped communication during assembly computation.
Mitchell L, Mueller EH, 2016, High level implementation of geometric multigrid solvers for finite element problems: Applications in atmospheric modelling, Journal of Computational Physics, Vol: 327, Pages: 1-18, ISSN: 1090-2716
The implementation of efficient multigrid preconditioners for elliptic partial differential equations (PDEs) is a challenge due to the complexity of the resulting algorithms and corresponding computer code. For sophisticated (mixed) finite element discretisations on unstructured grids an efficient implementation can be very time consuming and requires the programmer to have in-depth knowledge of the mathematical theory, parallel computing and optimisation techniques on manycore CPUs.In this paper we show how the development of bespoke multigrid preconditioners can be simplified significantly by using a framework which allows the expression of the each component of the algorithm at the correct abstraction level. Our approach (1) allows the expression of the finite element problem in a language which is close to the mathematical formulation of the problem, (2) guarantees the automatic generation and efficient execution of parallel optimised low-level computer code and (3) is flexible enough to support different abstraction levels and give the programmer control over details of the preconditioner. We use the composable abstractions of the Firedrake/PyOP2 package to demonstrate the efficiency of this approach for the solution of strongly anisotropic PDEs in atmospheric modelling. The weak formulation of the PDE is expressed in Unified Form Language (UFL) and the lower PyOP2 abstraction layer allows the manual design of computational kernels for a bespoke geometric multigrid preconditioner. We compare the performance of this preconditioner to a single-level method and hypre's BoomerAMG algorithm. The Firedrake/PyOP2 code is inherently parallel and we present a detailed performance analysis for a single node (24 cores) on the ARCHER supercomputer. Our implementation utilises a significant fraction of the available memory bandwidth and shows very good weak scaling on up to 6,144 compute cores.
Rathgeber F, Mitchell L, 2016, firedrake-bench: firedrake bench optimality paper release
A repository of Firedrake benchmarks
Guo X, Lange M, Gorman G, et al., 2015, Developing a scalable hybrid MPI/OpenMP unstructured finite element model, COMPUTERS & FLUIDS, Vol: 110, Pages: 227-234, ISSN: 0045-7930
Lange M, Gorman G, Weiland M, et al., 2013, Acieving efficient strong scaling with PETSc using hybrid MPI/OpenMP optimisations, Publisher: Springer Berlin Heidelberg, Pages: 97-108
Markall GR, Rathgeber F, Mitchell L, et al., 2013, Performance-Portable Finite Element Assembly Using PyOP2 and FEniCS, International Supercomputing Conference (ISC), Publisher: Springer, Pages: 279-289, ISSN: 0302-9743
We describe a toolchain that provides a fully automated compilation pathway from a finite element domain-specific language to low-level code for multicore and GPGPU platforms. We demonstrate that the generated code exceeds the performance of the best available alternatives, without requiring manual tuning or modification of the generated code. The toolchain can easily be integrated with existing finite element solvers, providing a means to add performance portable methods without having to rebuild an entire complex implementation from scratch.
Guo X, Gorman G, Lange M, et al., 2013, Exploring the Thread-level Parallelisms for the Next Generation Geophysical Fluid Modelling Framework Fluidity-ICOM, Procedia Engineering, Vol: 61, Pages: 251 - 257-251 - 257, ISSN: 1877-7058
Plank G, Neic A, Liebmann M, et al., 2012, Accelerating cardiac bidomain simulations using Graphics Processing Units, Biomedical Engineering, IEEE Transactions on, Vol: 59, Pages: 2281-2290, ISSN: 0018-9294
Mitchell L, Sloan TM, Mewissen M, et al., 2012, Parallel classification and feature selection in microarray data using SPRINT, Concurrency and Computation: Practice and Experience
Rathgeber F, Markall GR, Mitchell L, et al., 2012, PyOP2: A High-Level Framework for Performance-Portable Simulations on Unstructured Meshes, High Performance Computing, Networking Storage and Analysis, SC Companion, Publisher: IEEE Computer Society, Pages: 1116-1123
Emerging many-core platforms are very difficult to program in a performance portable manner whilst achieving high efficiency on a diverse range of architectures. We present work in progress on PyOP2, a high-level embedded domain-specific language for mesh-based simulation codes that executes numerical kernels in parallel over unstructured meshes. Just-in-time kernel compilation and parallel scheduling are delayed until runtime, when problem-specific parameters are available. Using generative metaprogramming, performance portability is achieved, while details of the parallel implementation are abstracted from the programmer. PyOP2 kernels for finite element computations can be generated automatically from equations given in the domain-specific Unified Form Language. Interfacing to the multi-phase CFD code Fluidity through a very thin layer on top of PyOP2 yields a general purpose finite element solver with an input notation very close to mathematical formulae. Preliminary performance figures show speedups of up to 3.4x compared to Fluidity's built-in solvers when running in parallel.
Weiland M, Mitchell L, Gorman G, et al., 2012, Mixed-mode implementation of PETSc for scalable linear algebra on multi-core processors
Piotrowski M, McGilvary G, Sloan T, et al., 2012, Exploiting Parallel R in the Cloud with SPRINT, Methods of Information in Medicine
Niederer S, Mitchell L, Smith N, et al., 2011, Simulating human cardiac physiology on clinical time-scales, Frontiers in Physiology, Vol: 2, Pages: 1-7, ISSN: 1664-042X
In this study, the feasibility of conducting in silico experiments in near-realtime with anatomically realistic, biophysically detailed models of human cardiac electrophysiology is demonstrated using a current national high-performance computing facility. The required performance is achieved by integrating and optimizing load balancing and parallel I/O, which lead to strongly scalable simulations up to 16,384 compute cores. This degree of parallelization enables computer simulations of human cardiac electrophysiology at 240 times slower than real time and activation times can be simulated in approximately 1 min. This unprecedented speed suffices requirements for introducing in silico experimentation into a clinical workflow.
Piotrowski M, Sloan TM, Mewsissen M, et al., 2011, Optimisation and parallelisation of the partitioning around medoids function in R, Pages: 707-713
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.