- Discretization, Solution, and Inversion for Large Systems of PDEs
Computer Science Department, University of Colorado Boulder, March 3, 2016
- We are often forced to make important decisions with imperfect and incomplete
data. In model-based inference, our efforts to extract useful information
from data are aided by models of what occurs where we have no observations:
examples range from climate prediction to patient-specific medicine. In many
cases, these models can take the form of systems of PDEs with
critical-yet-unknown parameter fields, such as initial conditions or material
coefficients of heterogeneous media.
A concrete example that I will present is to make predictions about the
Antarctic ice sheet from satellite observations, when we model the ice sheet
using a system of nonlinear Stokes equations with a Robin-type boundary
condition, governed by a critical, spatially varying coefficient. This talk
will present three aspects of the computational stack used to efficiently
estimate statistics for this kind of inference problem.
At the top is an posterior-distribution approximation for Bayesian inference,
that combines Laplace's method with randomized calculations to compute an
optimal low-rank representation.
Below that, the performance of this approach to inference is highly dependent
on the efficient and scalable solution of the underlying model equation, and
its first- and second- adjoint equations. A high-level description of a
problem (in this case, a nonlinear Stokes boundary value problem) may suggest
an approach to designing an optimal solver, but this is just the jumping-off
point: differences in geometry, boundary conditions, and other considerations
will significantly affect performance. I will discuss how the peculiarities
of the ice sheet dynamics problem lead to the development of an anisotropic
multigrid method (available as a plugin to the PETSc library for scientific
computing) that improves on standard approaches.
At the bottom, to increase the accuracy per degree of freedom of discretized
PDEs, I develop adaptive mesh refinement (AMR) techniques for large-scale
problems. I will present my algorithmic contributions to the p4est library
for parallel AMR that enable it to scale to concurrencies of O(10^6), as well
as recent work commoditizing AMR techniques in PETSc.
- Uniting Performance and Extensibility in Adaptive Finite Element Computations
CAAM Colloqium, Rice University, Houston, Texas, September 14, 2015
This talk will begin with a presentation of recent work that applies
p4est—a library for parallel adaptive mesh refinement—to large scale
computations, and ice sheet dynamics with highly anisotropic meshs and
global mantle convection simulations with billions of degrees of freedom.
The former is an example where extension of the datatypes and algorithms in
a library can help it to address unanticipated problems; the latter is an
example of how extensibility is crucial for the same library that performs
well on one process to peform well on a million. Time permitting, this will
lead into a discussion of work to expand the reach of adaptive methods in
PETSc, including work to make DMPlex—already a flexible interface for
unstructured meshes—even for flexible by adding support for the
non-conformal meshes that arise in AMR applications.
- Multilevel Methods for Forward and Inverse Ice Sheet Modeling
SIAM Conference on Computational Science and Engineering, Salt Lake City, Utah, March 14, 2015
We present our work on recovering ice sheet modeling parameter fields
(such as basal slipperiness) from observations, using both
deterministic and Bayesian inversion. The scalability of our adjoint-
and Hessian-based methods is determined by the scalability of two
sub-problems: the solution of the state PDEs (Stokes equations of ice
sheet dynamics), and the approximation and preconditioning of the
parameter-to-observation Hessian. For the former problem, we compare
the effectiveness of geometric and algebraic multigrid within the
solution of the state PDEs; for the latter, we discuss the use of
multilevel approximations to improve on Hessian approximation by
low-rank updates. The scalability of our work is tested on full-scale
models of the Antarctic ice sheet.
- Statistical Inversion for Basal Parameters for the Antarctic Ice Sheet
3rd SIAM Conference on Uncertainty Quantification, Savannah, Georgia, April 3, 2014
We formulate a Bayesian inference problem for the friction field at the
base of the Antarctic ice sheet from distributions for the observed
surface velocities and for the prior knowledge of the basal friction.
The dimension of the parameter space is large, and the map from
parameters to observations requires the solution of a system of
implicit nonlinear 3D PDEs. We approximate the posterior distribution
with a Gaussian centered at the maximum a posteriori point, with
covariance given by the inverse Hessian of the log posterior. By using
a low-rank approximation of the log likelihood, we are able to scale up
to the problem size of interest.
- Hybrid Quadtree/Octree AMR for Anisotropic Domains
16th SIAM Conference on Parallel Processing for Scientific Computing, Portland, Oregon, February 19, 2014
Many problems in geosciences, such as ice sheets, oceans, and the
atmosphere, involve thin, anisotropic domains. Researchers in these
areas are using 3D models where once they used 2D approximations. The
p4est library for parallel adaptive mesh refinement (AMR), which uses a
forest-of-octrees approach to AMR, only natively supports isotropic
mesh refinement, which is often insufficient for such problems. We
present an extension to this library with two refinement modes for
these anisotropic problems.
- Low-Cost Parallel Algorithms for 2:1 Octree Balance
26th IEEE International Parallel and Distributed Processing Symposium, Shanghai, China, May 22, 2012
The logical structure of a forest of octrees can be used to create
scalable algorithms for parallel adaptive mesh refinement (AMR), which
has recently been demonstrated for several petascale applications.
Among various frequently used octree- based mesh operations, including
refinement, coarsening, partitioning, and enumerating nodes, ensuring a
2:1 size balance between neighboring elements has historically been the
most expensive in terms of CPU time and communication volume. The 2:1
balance operation is thus a primary target to optimize. One important
component of a parallel balance algorithm is the ability to determine
whether any two given octants have a consistent distance/size relation.
Based on new logical concepts we propose fast algorithms for making
this decision for all types of 2:1 balance conditions in 2D and 3D.
Since we are able to achieve this without constructing any parent nodes
in the tree that would otherwise need to be sorted and communicated, we
can significantly reduce the required memory and communication volume.
In addition, we propose a lightweight col- lective algorithm for
reversing the asymmetric communication pattern induced by non-local
octant interactions. We have implemented our improvements as part of
the open- source “p4est” software. Benchmarking this code with both
synthetic and simulation-driven adapted meshes we are able to
demonstrate much reduced runtime and excellent weak and strong
scalability. On our largest benchmark problem with 5.13 x
1011 octants the new 2:1 balance algorithm executes in less
than 8 seconds on 112,128 CPU cores of the Jaguar Cray XT5