KozCG

This page describes the numerical method in KozCG method at a high level, for more details, see Papers.

The numerical method in KozCG is an element-based implementation of a finite element method for unstructured meshes, composed of tetrahedral cells. The solver is intended for the computing of compressible flows. A good reference for the finite element method itself is [1] and on flux-corrected transport is [2]. The implementation of the method uses an asynchronous-by-default, distributed-memory, task-parallel programming paradigm using the Charm++ runtime system.

The equations of compressible flow

The governing equations solved are the 3D unsteady Euler equations, governing inviscid compressible flow, written here in the flux-conservative form

\[ \begin{split} \frac{\partial U}{\partial t} + \frac{\partial F_j}{\partial x_j} = S,\qquad U = \begin{Bmatrix} \rho \\ \rho u_i \\ \rho E \end{Bmatrix}, \enskip F_j = \begin{Bmatrix} \rho u_j \\ \rho u_iu_j + p\delta_{ij} \\ u_j(\rho E + p) \end{Bmatrix}, \enskip S = \begin{Bmatrix} S_\rho \\ S_{u,i} \\ S_E \end{Bmatrix}, \end{split} \]

where the summation convention on repeated indices has been applied, and $\rho$ is the density, $u_i$ the velocity vector, $E=u_iu_i/2+e$ is the specific total energy, $e$ is the specific internal energy, and $p$ is the pressure. $S_\rho$ , $S_{u,i}$ , and $S_E$ are source terms that arise from the application of the method of manufactured solutions, used for verification; these source terms are zero when computing practical engineering problems. The system is closed with the ideal gas law equation of state

\[ \begin{split} p = \rho e (\gamma - 1), \end{split} \]

where $\gamma$ is the ratio of specific heats. The exact form of the equation of state is of secondary importance, any arbitrary (i.e. analytic or tabulated) equation of state can be used to relate density and internal energy to pressure.

Discretization in space

To arrive at a continuous Galerkin finite element method we start from the weak form of the governing equations above,

\[ \begin{split} \int_{\Omega} N^v \left(\frac{\partial U}{\partial t} + \frac{\partial F_j}{\partial x_j} - S\right)\mathrm{d}\Omega = 0, \quad v=1,2,\dots,n, \end{split} \]

which requires that the error in the numerical solution, sampled at $n$ discrete points, $v$ , using the weighting functions $N^v$ , vanish on the whole domain, $\Omega$ , in an integral sense. Numerically approximating the solution as $U \approx N^w \hat{U}_w$ , where $\hat{U}_w$ denotes the unknowns at the discrete node $w$ , leads to the Galerkin weak statement

\[ \begin{split} \int_{\Omega} N^v \bigg[N^w\frac{\partial\hat{U}_w}{\partial t} + \frac{\partial}{\partial x_j}F_j(N^w \hat{U}_w) - S(N^w\hat{U}_w)\bigg]\mathrm{d}\Omega = 0. \end{split} \]

Integrating the flux term by parts, neglecting the resulting boundary integral (assuming zero flux on the problem boundary), and applying $F_j(N^w \hat{U}_w) \approx N^w F_j(\hat{U}_w)$ and $S(N^w \hat{U}_w) \approx N^w S(\hat{U}_w)$ , yields the final weak form for the whole domain, $\Omega$ ,

\[ \begin{split} \int_{\Omega} N^v N^w \mathrm{d}\Omega \frac{\partial\hat{U}_w}{\partial t} - \int_{\Omega} N^w \mathrm{d}\Omega \frac{\partial N^v}{\partial x_j} F_j(\hat{U}_w) - \int_{\Omega} N^v N^w \mathrm{d}\Omega \enskip S(\hat{U}_w) = 0. \end{split} \]

All integrals above are evaluated by breaking up the domain, $\Omega$ , into sub-domains as a sum over integrals over discrete elements, $\Omega_e$ ,

\[ \begin{split} \sum_{\Omega_e \in v} \sum_{w \in \Omega_e} \int_{\Omega_e} N^v N^w \mathrm{d}\Omega_e \frac{\partial\hat{U}_w}{\partial t} & = \\ \sum_{\Omega_e \in v} \sum_{w \in \Omega_e} \int_{\Omega_e} N^w & \mathrm{d}\Omega_e \frac{\partial N^v}{\partial x_j} F_j(\hat{U}_w) + \sum_{\Omega_e \in v} \sum_{w \in \Omega_e} \int_{\Omega} N^v N^w \mathrm{d}\Omega \enskip S(\hat{U}_w), \end{split} \]

where the inner summation is over points $w$ forming $\Omega_e$ (gather) and the outer summation is over tetrahedra $\Omega_e$ connected to point $v$ (scatter). The above equations result in the following semi-discrete system of equations

\[ \begin{split} {\mbox{\boldmath$M$}}_c \mbox{\boldmath$\hat{U}$},_{t} = {\mbox{\boldmath$r$}}({\mbox{\boldmath$\hat{U}$}}), \end{split} \]

where the comma denotes a derivative. The size of the consistent mass matrix $M_c$ is $n \times n$ , where $n$ is the number of nodes of the computational mesh. According to the sum above, only those elements contribute to a given row $v$ of ${\mbox{\boldmath$M$}}_c$ which contain node $v$ (scatter), thus ${\mbox{\boldmath$M$}}_c$ is sparse.

Discretization in time

The equation above is discretized in time using a Lax-Wendroff (Taylor-Galerkin) scheme, implemented using two stages:

\[ \begin{split}\begin{aligned} U^{n+1/2} & = U^n + \frac{\Delta t}{2} U^n_{,t} = U^n - \frac{\Delta t}{2} \frac{\partial}{\partial x_j} F_j(U^n) + \frac{\Delta t}{2} S(U^n), \\ \Delta U & = U^{n+1} - U^n = \Delta t U^{n+1/2}_{,t} = -\Delta t \frac{\partial}{\partial x_j} F_j(U^{n+1/2}) + \Delta t S(U^{n+1/2}). \end{aligned}\end{split} \]

The above scheme combined with linear shape functions is identical to a two-stage Runge-Kutta Galerkin finite element method, and thus central differencing without damping. Stabilization is obtained by using constant shape functions for the half step solution where the gather results in element quantities, followed by a scatter step using linear shape functions resulting in nodal quantities. Assuming linear tetrahedra, the combined spatial and temporal discretization that achieves this yields the following staggered scheme:

\[ \begin{split}\begin{aligned} U^{n+1/2}_e & = \frac{1}{4}\sum_{w=1}^4 \hat{U}^n_w - \frac{\Delta t}{2} \sum_{w=1}^4 \frac{\partial N^w}{\partial x_j} F_j(\hat{U}^n_w) + \frac{\Delta t}{2} \frac{1}{4} \sum_{w=1}^4 S(\hat{U}^n_w),\\ \sum_{\Omega_e \in v} \sum_{w \in \Omega_e} \int_{\Omega_e} N^v N^w \mathrm{d}\Omega_e \Delta \hat{U}_w & = \Delta t \int_{\Omega_e} \frac{\partial N^v}{\partial x_j} F_j(U_e^{n+1/2}) \mathrm{d}\Omega_e + \Delta t \int_{\Omega_e} N^v S(U_e^{n+1/2}) \mathrm{d}\Omega_e \\ \end{aligned}\end{split} \]

where $U_e^{n+1/2}$ is the vector of element-centered solutions at the half step. Note that the first step discretizes the flux integral before integration by parts, while the second one after integration by parts, hence the difference in sign.

Flux-corrected transport

Flux-corrected transport (FCT) is a solution to circumvent the consequence of Godunov's theorem, which states that no linear scheme of order greater than 1 will yield monotonic solutions. Accordingly, FCT is a nonlinear scheme that combines a high-, and a low-order scheme using limiting.

In the FCT scheme used, the high-order solution at the new time step can be written as

\[ \begin{split} {\mbox{\boldmath$U$}}^{n+1} = {\mbox{\boldmath$U$}}^n + \Delta{\mbox{\boldmath$U$}}^h = {\mbox{\boldmath$U$}}^n + \Delta{\mbox{\boldmath$U$}}^l + (\Delta{\mbox{\boldmath$U$}}^h - \Delta{\mbox{\boldmath$U$}}^l) = {\mbox{\boldmath$U$}}^l + (\Delta{\mbox{\boldmath$U$}}^h - \Delta{\mbox{\boldmath$U$}}^l), \end{split} \]

where $\Delta {\mbox{\boldmath$U$}}^h$ and $\Delta {\mbox{\boldmath$U$}}^l$ denote the solution increments of the high-, and low-order schemes, respectively. In the equations above it is the last term that is limited in a way to avoid spurious oscillations as

\[ \begin{split} {\mbox{\boldmath$U$}}^{n+1} = {\mbox{\boldmath$U$}}^l + \mathrm{lim}(\Delta{\mbox{\boldmath$U$}}^h - \Delta{\mbox{\boldmath$U$}}^l), \end{split} \]

The high-order scheme scheme, given above is symbolically written as

\[ \begin{split} {\mbox{\boldmath$M$}}_c \Delta{\mbox{\boldmath$U$}}^h = {\mbox{\boldmath$r$}}. \end{split} \]

From the equation above we construct a low order scheme by lumping the mass matrix and adding mass diffusion

\[ \begin{split} {\mbox{\boldmath$M$}}_l \Delta{\mbox{\boldmath$U$}}^l = {\mbox{\boldmath$r$}} + {\mbox{\boldmath$d$}} = {\mbox{\boldmath$r$}} - c_\tau({\mbox{\boldmath$M$}}_l - {\mbox{\boldmath$M$}}_c) {\mbox{\boldmath$U$}} \end{split} \]

where ${\mbox{\boldmath$M$}}_l$ is the lumped mass matrix and $c_\tau$ is a diffusion coefficient. Using $c_\tau=1$ guarantees a monotone low order solution. If we rewrite the above equation as

\[ \begin{split} {\mbox{\boldmath$M$}}_l\Delta {\mbox{\boldmath$U$}}^h = {\mbox{\boldmath$r$}} + ({\mbox{\boldmath$M$}}_l - {\mbox{\boldmath$M$}}_c) \Delta {\mbox{\boldmath$U$}}^h \end{split} \]

the difference between the right hand sides of the high and low order schemes can be recognized as

\[ \begin{split} \mathrm{AEC} = \Delta {\mbox{\boldmath$U$}}^h - \Delta {\mbox{\boldmath$U$}}^l = {\mbox{\boldmath$M$}}_l^{-1} ({\mbox{\boldmath$M$}}_l - {\mbox{\boldmath$M$}}_c)(c_\tau{\mbox{\boldmath$U$}}^n + \Delta{\mbox{\boldmath$U$}}^h), \end{split} \]

also called as the anti-diffusive element contributions (AEC). The AEC is then limited, $C_e\cdot\mathrm{AEC}$ , and applied to advance the solution to the next time step using the equation above, where $0 \le C_e \le 1$ is the limit coefficient for a given element $e$ .

The limiting procedure

The weak sum above shows that the left hand side is the consistent mass matrix. ${\mbox{\boldmath$M$}}_c$ is lumped by summing the rows to the diagonals. Inverting ${\mbox{\boldmath$M$}}_c$ distributed across computers would be costly. Using a lumped (diagonal) matrix instead reduces computational cost and software complexity at the cost of some additional numerical error but does not reduce the order of accuracy of the method. If the mesh does not move and its topology does not change, the left hand side needs no update between time steps.

The high-order right hand side is computed by the two-step procedure given above. Since the two steps are staggered, the gather takes information from nodes to cell centers, followed by a scatter, moving information from cells to nodes, both steps are contained within a single right hand side calculation within a time step. Within the two steps there is no need for parallel communication as an element always resides on a given mesh partition and only mesh nodes are shared between processing elements (PEs).

A step of the limiting procedure is to compute the mass diffusion term. Using linear tetrahedra, this is given for each element by

\[ \big[{\mbox{\boldmath$M$}}_l - {\mbox{\boldmath$M$}}_c\big]_e = \frac{J_e}{120} \left\{ \begin{array}{cc} 3.0 & i = j \\ -1.0 & i \ne j \end{array} \right. \]

where $J_e=\overrightarrow{BA} \cdot (\overrightarrow{CA} \times \overrightarrow{DA})$ is the element Jacobian, computed from the triple product of the edge vectors of the tetrahedron given by vertices A,B,C, and D. The above equation is also used to compute the anti-diffusive element contributions. Once the AECs are computed for each element, the next step is to sum all positive (negative) anti-diffusive element contributions to node i

\[ \begin{split} P_i^{\pm} = \sum_e \begin{Bmatrix} \max \\ \min \end{Bmatrix} (0,\mathrm{AEC}_e) \end{split} \]

The limiting procedure requires the maximum and minimum nodal values of the low-order solution and the previous solution,

\[ \begin{split} {\mbox{\boldmath$U$}}_i^* = \begin{Bmatrix} \max \\ \min \end{Bmatrix} ({\mbox{\boldmath$U$}}^l_i,{\mbox{\boldmath$U$}}^n). \end{split} \]

Another alternative is to only consider the low order solution, when computing the allowed solution bounds, which leads to the so-called 'clipping limiter' in place of the equation above. This is followed by computing the maximum and minimum nodal values of all elements,

\[ \begin{split} {\mbox{\boldmath$U$}}_e^* = \begin{Bmatrix} \max \\ \min \end{Bmatrix} ({\mbox{\boldmath$U$}}^*_A,{\mbox{\boldmath$U$}}^*_B,{\mbox{\boldmath$U$}}^*_C,{\mbox{\boldmath$U$}}^*_D), \end{split} \]

then computing the maximum and minimum unknowns of the elements surrounding each node,

\[ \begin{split} {\mbox{\boldmath$U$}}_i^{\tiny\begin{matrix} \max \\ \min \end{matrix}} = \begin{Bmatrix} \max \\ \min \end{Bmatrix} ({\mbox{\boldmath$U$}}^*_1,{\mbox{\boldmath$U$}}^*_2,\dots,{\mbox{\boldmath$U$}}^*_m). \end{split} \]

The limit coefficients will be computed (see below) based on $P_i^{\pm}$ and the maximum and minimum increments and decrements the nodal solution values are allowed to achieve,

\[ \begin{split} Q_i^{\pm} = {\mbox{\boldmath$U$}}_i^{\tiny\begin{matrix} \max \\ \min \end{matrix}} - {\mbox{\boldmath$U$}}^l. \end{split} \]

Defining the ratios of positive and negative element contributions for each node $i$ that ensure monotonicity as

\[ \begin{split} R_i^{\pm} = \left\{\begin{matrix} \min(1,Q_i^{\pm}/P_i^{\pm}) & P_i^+ > 0 > P_i^- \\ 1 & \mathrm{otherwise} \end{matrix} \right., \end{split} \]

the limit coefficient for each element is taken as the most conservative ratio

\[ \begin{split} C_e = \min_{i \in \Omega_e} \left\{\begin{matrix} R_i^+ & \mathrm{AEC}>0 \\ R_i^- & \mathrm{AEC}<0 \end{matrix}\right.. \end{split} \]

The limited AEC is then scatter-added to nodes of the low-order solution:

\[ \begin{split} U_i^{n+1} = U_i^l + \sum_{i \in \Omega_e} C_e \cdot \mathrm{AEC}. \end{split} \]

The above procedure is general, works on the numerical solution (instead on fluxes or slopes), and does not require a Riemann solver. The same procedure can be used for each scalar in a system of equations. This works well for independent scalars, but for coupled system of equations additional techniques have been developed to reflect the coupled nature of the equations in the limiting procedure.

References

[2] Lohner, Morgan, Peraire, Vahdati, Finite element flux-corrected transport (FEM-FCT) for the Euler and Navier–Stokes equations, International Journal for Numerical Methods in Fluids, 7, 1093-1109, 1987.