Cotter CJ, Ham DA, Pain CC, et al., LBB Stability of a Mixed Discontinuous/Continuous Galerkin Finite Element Pair
We introduce a new mixed discontinuous/continuous Galerkin finite element for solving the 2- and 3-dimensional wave equations and equations of incompressible flow. The element, which we refer to as P1dg-P2, uses discontinuous piecewise linear functions for velocity and continuous piecewise quadratic functions for pressure. The aim of introducing the mixed formulation is to produce a new flexible element choice for triangular and tetrahedral meshes which satisfies the LBB stability condition and hence has no spurious zero-energy modes. We illustrate this property with numerical integrations of the wave equation in two dimensions, an analysis of the resultant discrete Laplace operator in two and three dimensions, and a normal mode analysis of the semi-discrete wave equation in one dimension.
Homolya M, Mitchell L, Luporini F, et al., TSFC: a structure-preserving form compiler
A form compiler takes a high-level description of the weak form of partial differential equations and produces low-level code that carries out the finite element assembly. In this paper we present the Two-Stage Form Compiler (TSFC), a new form compiler with the main motivation to maintain the structure of the input expression as long as possible. This facilitates the application of optimizations at the highest possible level of abstraction. TSFC features a novel, structure-preserving method for separating the contributions of a form to the subblocks of the local tensor in discontinuous Galerkin problems. This enables us to preserve the tensor structure of expressions longer through the compilation process than other form compilers. This is also achieved in part by a two-stage approach that cleanly separates the lowering of finite element constructs to tensor algebra in the first stage, from the scheduling of those tensor operations in the second stage. TSFC also efficiently traverses complicated expressions, and experimental evaluation demonstrates good compile-time performance even for highly complex forms.
Schwedes T, Funke SW, Ham DA, An iteration count estimate for a mesh-dependent steepest descent method based on finite elements and Riesz inner product representation
Existing implementations of gradient-based optimisation methods typically assume that the problem is posed in Euclidean space. When solving optimality problems on function spaces, the functional derivative is then inaccurately represented with respect to $\ell^2$ instead of the inner product induced by the function space. This error manifests as a mesh dependence in the number of iterations required to solve the optimisation problem. In this paper, an analytic estimate is derived for this iteration count in the case of a simple and generic discretised optimisation problem. The system analysed is the steepest descent method applied to a finite element problem. The estimate is based on Kantorovich's inequality and on an upper bound for the condition number of Galerkin mass matrices. Computer simulations validate the iteration number estimate. Similar numerical results are found for a more complex optimisation problem constrained by a partial differential equation. Representing the functional derivative with respect to the inner product induced by the continuous control space leads to mesh independent convergence.
Luporini F, Ham DA, Kelly PHJ, 2017, An Algorithm for the Optimization of Finite Element Integration Loops, ACM Transactions on Mathematical Software, Vol: 44, ISSN: 0098-3500
Mitchell L, Ham DA, McRae ATT, et al., 2017, Firedrake: automating the finite element method by composing abstractions, ACM Transactions on Mathematical Software, Vol: 43, ISSN: 1557-7295
Firedrake is a new tool for automating the numerical solution of partial differential equations. Firedrake adopts the domain-specific language for the finite element method of the FEniCS project, but with a pure Python runtime-only implementation centred on the composition of several existing and new abstractions for particular aspects of scientific computing. The result is a more complete separation of concerns which eases the incorporation of separate contributions from computer scientists, numerical analysts and application specialists. These contributions may add functionality, or improve performance. Firedrake benefits from automatically applying new optimisations. This includes factorising mixed function spaces, transforming and vectorising inner loops, and intrinsically supporting block matrix operations. Importantly, Firedrake presents a simple public API for escaping the UFL abstraction. This allows users to implement common operations that fall outside pure variational formulations, such as flux-limiters.
Schwedes T, Ham DA, Funke SW, et al., 2017, Mesh Dependence in PDE-Constrained Optimisation: An Application in Tidal Turbine Array Layouts, Publisher: Springer, ISBN: 9783319594835
This section verifies the iteration count estimates by solving the optimisation problem (2.2) numerically. The first experiment investigates the number of optimisation iterations required to solve (2.2) under non-uniform mesh refinement.
Bercea G, McRae ATT, Ham DA, et al., 2016, A structure-exploiting numbering algorithm for finite elements on extruded meshes, and its performance evaluation in Firedrake, Geoscientific Model Development, Vol: 9, Pages: 3803-3815, ISSN: 1991-9603
We present a generic algorithm for numbering and then efficiently iterating over the data values attached to an extruded mesh. An extruded mesh is formed by replicating an existing mesh, assumed to be unstructured, to form layers of prismatic cells. Applications of extruded meshes include, but are not limited to, the representation of 3D high aspect ratio domains employed by geophysical finite element simulations. These meshes are structured in the extruded direction. The algorithm presented here exploits this structure to avoid the performance penalty traditionally associated with unstructured meshes. We evaluate the implementation of this algorithm in the Firedrake finite element system on a range of low compute intensity operations which constitute worst cases for data layout performance exploration. The experiments show that having structure along the extruded direction enables the cost of the indirect data accesses to be amortized after 10-20 layers as long as the underlying mesh is well-ordered. We characterise the resulting spatial and temporal reuse in a representative set of both continuous-Galerkin and discontinuous-Galerkin discretisations. On meshes with realistic numbers of layers the performance achieved is between 70% and 90% of a theoretical hardware-specific limit.
Homolya M, Ham DA, 2016, A Parallel Edge Orientation Algorithm for Quadrilateral Meshes, SIAM Journal on Scientific Computing, Vol: 38, Pages: S48-S61, ISSN: 1095-7197
One approach to achieving correct finite element assembly is to ensure that the local orientation of facets relative to each cell in the mesh is consistent with the global orientation of that facet. Rognes et al. have shown how to achieve this for any mesh composed of simplex elements, and deal.II contains a serial algorithm for constructing a consistent orientation of any quadrilateral mesh of an orientable manifold. The core contribution of this paper is the extension of this algorithm for distributed memory parallel computers, which facilitates its seamless application as part of a parallel simulation system. Furthermore, our analysis establishes a link between the well-known Union-Find algorithm and the construction of a consistent orientation of a quadrilateral mesh. As a result, existing work on the parallelization of the Union-Find algorithm can be easily adapted to construct further parallel algorithms for mesh orientations.
McRae ATT, Mitchell L, Bercea, et al., 2016, Automated Generation and Symbolic Manipulation of Tensor Product Finite Elements, SIAM Journal on Scientific Computing, Vol: 38, Pages: S25-S47, ISSN: 1095-7197
We describe and implement a symbolic algebra for scalar and vector-valued finite elements, enabling the computer generation of elements with tensor product structure on quadrilateral, hexahedral, and triangular prismatic cells. The algebra is implemented as an extension to the domain-specific language UFL, the Unified Form Language. This allows users to construct many finite element spaces beyond those supported by existing software packages. We have made corresponding extensions to FIAT, the FInite element Automatic Tabulator, to enable numerical tabulation of such spaces. This tabulation is consequently used during the automatic generation of low-level code that carries out local assembly operations, within the wider context of solving finite element problems posed over such function spaces. We have done this work within the code-generation pipeline of the software package Firedrake; we make use of the full Firedrake package to present numerical examples.
Ham D, 2015, firedrake: an automated finite element system
Automated multiplatform code generation for the finite element method.
Heinis T, Ham DA, 2015, On-the-Fly Data Synopses: Efficient Data Exploration in the Simulation Sciences, SIGMOD Record, Vol: 44, Pages: 23-28, ISSN: 0163-5808
Luporini F, Varbanescu AL, Rathgeber F, et al., 2015, Cross-Loop Optimization of Arithmetic Intensity for Finite Element Local Assembly, ACM Transactions on Architecture and Code Optimization, Vol: 11, Pages: 1-25, ISSN: 1544-3566
Hill J, Popova EE, Ham DA, et al., 2014, Adapting to life: ocean biogeochemical modelling and adaptive remeshing, Ocean Science, Vol: 10, Pages: 323-343, ISSN: 1812-0784
AMCG, 2013, Fluidity/The Imperial College Ocean Model
Bertolli C, Betts A, Loriant N, et al., 2013, Compiler optimizations for industrial unstructured mesh CFD applications on GPUs, Pages: 112-126, ISSN: 0302-9743
Graphical Processing Units (GPUs) have shown acceleration factors over multicores for structured mesh-based Computational Fluid Dynamics (CFD). However, the value remains unclear for dynamic and irregular applications. Our motivating example is HYDRA, an unstructured mesh application used in production at Rolls-Royce for the simulation of turbomachinery components of jet engines. We describe three techniques for GPU optimization of unstructured mesh applications: a technique able to split a highly complex loop into simpler loops, a kernel specific alternative code synthesis, and configuration parameter tuning. Using these optimizations systematically on HYDRA improves the GPU performance relative to the multicore CPU. We show how these optimizations can be automated in a compiler, through user annotations. Performance analysis of a large number of complex loops enables us to study the relationship between optimizations and resource requirements of loops, in terms of registers and shared memory, which directly affect the loop performance. © Springer-Verlag Berlin Heidelberg 2013.
Du J, Fang F, Pain CC, et al., 2013, POD reduced-order unstructured mesh modeling applied to 2D and 3D fluid flow, Computers & Mathematics with Applications, Vol: 65, Pages: 362-379, ISSN: 0898-1221
Farrell PE, Ham DA, Funke SW, et al., 2013, Automated Derivation of the Adjoint of High-Level Transient Finite Element Programs, SIAM Journal on Scientific Computing, Vol: 35, Pages: C369-C393, ISSN: 1064-8275
Ford R, Glover M, Ham D, et al., 2013, GungHo Phase 1 Computational Science Recommendations, Publisher: The Met Office, Forecasting Research Technical Report No: 587
Markall GR, Rathgeber F, Mitchell L, et al., 2013, Performance-portable finite element assembly using PyOP2 and FEniCS, Pages: 279-289, ISSN: 0302-9743
We describe a toolchain that provides a fully automated compilation pathway from a finite element domain-specific language to low-level code for multicore and GPGPU platforms. We demonstrate that the generated code exceeds the performance of the best available alternatives, without requiring manual tuning or modification of the generated code. The toolchain can easily be integrated with existing finite element solvers, providing a means to add performance portable methods without having to rebuild an entire complex implementation from scratch. © 2013 Springer-Verlag.
Markall GR, Slemmer A, Ham DA, et al., 2013, Finite element assembly strategies on multi-core and many-core architectures, International Journal for Numerical Methods in Fluids, Vol: 71, Pages: 80-97, ISSN: 0271-2091
Rognes ME, Ham DA, Cotter CJ, et al., 2013, Automating the solution of PDEs on the sphere and other manifolds in FEniCS 1.2, Geoscientific Model Development, Vol: 6, Pages: 2099-2119, ISSN: 1991-959X
Farrell PE, Funke SW, Ham DA, et al., 2012, dolfin-adjoint
The dolfin-adjoint project automatically derives the discrete adjoint and tangent linear models from a forward finite element model written in the Python interface to Dolfin.
Hill J, Piggott MD, Ham DA, et al., 2012, On the performance of a generic length scale turbulence model within an adaptive finite element ocean model, Ocean Modelling, Vol: 56, Pages: 1-15, ISSN: 1463-5003
Rathgeber F, Markall GR, Mitchell L, et al., 2012, PyOP2: A High-Level Framework for Performance-Portable Simulations on Unstructured Meshes, 25th ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC), Publisher: IEEE, Pages: 1116-1123
Cotter CJ, Ham DA, 2011, Numerical wave propagation for the triangular P1(DG)-P2 finite element pair, Journal of Computational Physics, Vol: 230, Pages: 2806-2820, ISSN: 0021-9991
Ham DA, 2010, On techniques for modelling coastal and ocean flow with unstructured meshes
Markall GR, Ham DA, Kelly PHJ, 2010, Towards generating optimised finite element solvers for GPUs from high-level specifications, International Conference on Computational Science (ICCS), Publisher: ELSEVIER SCIENCE BV, Pages: 1809-1817, ISSN: 1877-0509
Markall GR, Ham DA, Kelly PHJ, 2010, Generating Optimised Finite Element Solvers for GPU Architectures, International Conference on Numerical Analysis and Applied Mathematics, Publisher: AMER INST PHYSICS, Pages: 787-790, ISSN: 0094-243X
We show that optimal implementations of a finite element solver written for a Graphics Processing Unit and a multicore CPU require the use of different algorithms and data formats. This motivates the use of code generation in order to produce efficient, maintainable implementations of the finite element method for GPU architectures.
Cotter CJ, Ham DA, Pain CC, 2009, A mixed discontinuous/continuous finite element pair for shallow-water ocean modelling, Ocean Modelling, Vol: 26, Pages: 86-90, ISSN: 1463-5003
Cotter CJ, Ham DA, Pain CC, et al., 2009, LBB stability of a mixed Galerkin finite element pair for fluid flow simulations, Journal of Computational Physics, Vol: 228, Pages: 336-348, ISSN: 0021-9991
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.