Low-Mach number fluid scheme (variable-density, ideal-gas EOS) [WIP]#2571
Low-Mach number fluid scheme (variable-density, ideal-gas EOS) [WIP]#2571adigh23 wants to merge 15 commits into
Conversation
Registers a new "lowmach" fluid scheme via the factory. The type extends fluid_pnpn_t with no overrides, so the scheme is selectable from the .case file but behaves identically to the standard Pn-Pn solver. Physics (variable density, Q_T source, full-stress viscous) will be added incrementally by overriding init/step. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The new step() body is a verbatim copy of fluid_pnpn_step. Output is bit-identical to the standard Pn-Pn solver, confirming the override dispatches correctly. This is the empty canvas onto which low-Mach physics will be layered in subsequent commits.
Override init() to read case.fluid.low_mach.{enabled,P0,R_specific,
temperature_field} from the case file, with sensible defaults.
At the top of step(), when low-Mach mode is enabled, resolve the
temperature field from the registry lazily and update the density field
pointwise via rho = P0 / (R_specific * T). The existing Pn-Pn residuals
still read rho as a point value, so this spatial update does not yet
affect the solution — it is infrastructure for the subsequent residual
replacement.
Output is bit-identical to the standard Pn-Pn solver when low-Mach is
disabled, and produces identical residuals when enabled with no scalar
present (lazy T lookup safely falls through).
Adds an abstract lowmach_residual module with pressure and velocity residual types, a CPU backend (lowmach_res_cpu), and a factory submodule. The CPU implementation accepts a Q_T (thermal divergence) field argument but currently ignores it; the body reproduces pnpn_res_cpu exactly. fluid_lowmach_pnpn_t now owns lm_prs_res, lm_vel_res allocatable members plus a zero-initialised Q_T field. When low_mach_enabled is true, step() routes pressure and velocity residuals through these new operators instead of the inherited pnpn ones. When false, the inherited path is used verbatim. Verification: tgv example produces bit-identical output with both low_mach disabled (old pnpn path) and enabled (new lowmach_residual path). The scaffolding is now in place for the Q_T source contribution and variable-property stress formulation in subsequent commits. Emits a neko_error at init time if low_mach is requested with NEKO_BCKND_DEVICE=1, since the device backend is not yet implemented.
Injects the thermal divergence source into the low-Mach pressure residual: p_res += (bd/dt) * Q_T * B Matches Nek5000 plan4.f convention (admcol3 with dtbd and bm1). Q_T is still held at zero by the scheme, so numerics are unchanged. Output remains bit-identical to pnpn. The signed contribution is now active in the residual and ready to be exercised when Q_T is computed from the energy equation in a subsequent commit.
Adds lowmach_update_Q_T which populates the thermal-divergence field as
Q_T = div(k grad T) / (rho * cp * T)
The formula uses the energy equation without heat release
(rho cp D_t T = div(k grad T)) so the advective term cancels and Q_T
reduces to a pure spatial expression. The weak-form gradient, strong
conversion via gather-scatter + B^-1, multiplication by k, weak
divergence via cdtp, reassembly, and pointwise division by (rho cp T)
mirror the sequence in the earlier lowmach-attempt-1 branch (which had
the algebra correct).
Reads k_conductivity and cp from case.fluid.low_mach.{k_conductivity,cp}
with defaults of 1.0 each.
Called from step() immediately after the density update. When the
temperature field is not present in the registry (current tgv example),
the routine returns early and Q_T stays at zero, so output remains
bit-identical to pnpn. A test case with a scalar will be needed to
exercise the computed Q_T path.
Replaces the point-value reads (rho%x(1,1,1,1), mu%x(1,1,1,1)) with pointwise field reads in both the pressure and velocity residuals. The Helmholtz coefficients c_Xh%h1 and c_Xh%h2 now vary spatially, which is the pre-requisite for variable-density / variable-viscosity low-Mach physics. When rho and mu are spatially uniform — as in the tgv example and whenever low-Mach is disabled or the temperature field is unavailable — the field and point-value paths give identical results, so output remains bit-identical to pnpn. This is the last pure constant-viscosity refinement. The next step is to switch to the full-stress viscous formulation so the ∇·(2μS − (2/3)μ(∇·u)I) tensor is evaluated correctly when μ varies across the domain.
Configures the TGV example with a scalar initialised to a spatially uniform temperature T=1 and enables the low-Mach path with P0=R_specific=k=cp=1. This exercises the full low-Mach pipeline: lazy T_ptr resolution from the registry, density update via EOS, Q_T computation, and residuals that consume rho and mu as spatial fields. With T uniform, Q_T vanishes and rho stays at 1, so fluid residuals are bit-identical to the standard pnpn run of the same configuration — the strongest check that the infrastructure is wired correctly. A follow-up case with a spatially varying T is the natural next step to exercise the non-trivial Q_T path.
Adaptation of the existing Boussinesq Rayleigh-Benard example to drive
fluid_lowmach_pnpn instead of fluid_pnpn. Two adjustments:
* Scalar temperature is shifted to the range [1, 2] (cold top at 1,
hot bottom at 2, DeltaT = 1 preserved) so that the ideal-gas EOS
rho = P0 / (R * T) used by the low-Mach scheme stays finite.
* The Boussinesq buoyancy source in the user file is evaluated as
rhs_w = Ra * Pr * (T - T_ref) with T_ref = 1, so the zero-buoyancy
state at the cold wall is preserved.
low_mach is enabled with P0 = R_specific = k_conductivity = cp = 1
and the temperature_field pointing at the scalar "temperature". rho
now varies across the domain (roughly 1 at the cold wall to 0.5 at the
hot wall) and Q_T = div(k grad T) / (rho cp T) is non-zero wherever
there is a thermal gradient. This is the first configuration that
exercises the full low-Mach pipeline with a meaningful spatial
variation in temperature.
Short 50-step run converges cleanly: stable CFL ~ 0.07, pressure
residuals in the 1e-4 to 1e-6 range, velocity and scalar residuals
small. Full 100-time-unit run left to the user.
Previously the BDF/EXT/OIFS rhs_maker calls passed rho%x(1,1,1,1)
(a single point value) while the velocity Helmholtz LHS used the full
rho field. This made the momentum equation internally inconsistent
whenever rho varied spatially.
Introduces three local subroutines that mirror rhs_maker_ext_cpu,
rhs_maker_bdf_cpu and rhs_maker_oifs_cpu but take rho as a field and
weight each node by rho(x). step() now branches to these when
low_mach_enabled is true; the inherited pnpn rhs_makers are still used
otherwise.
Verification:
* Disabled path: bit-identical to pnpn on the TGV smoke test.
* Enabled path with uniform rho (T=1 uniform): bit-identical to pnpn,
confirming the field-rho variant reduces to the scalar-rho one when
rho is uniform.
* Rayleigh-Benard case with variable rho: runs 50 steps to Normal end
with residuals in the expected range. Numerics differ slightly from
the previous run because the momentum equation is now self-consistent.
Developer guide and implementation notes for the low-Mach (variable-density, ideal-gas EOS) fluid scheme on this branch.
| **Energy** — solved by the Neko **scalar** named `temperature`: | ||
|
|
||
| ρ cp (∂T/∂t + u·∇T) = ∇·(λ∇T) |
There was a problem hiding this comment.
I guess this is a design decision you want to think about. Whether the fluid type itself should handle the scalar solve or whether the user should setup the scalar for it. The latter is probably easier programming-wise1
|
|
||
| | member | meaning | | ||
| |---|---| | ||
| | `low_mach_enabled` | master switch; if false, `step` runs vanilla pnpn | |
There was a problem hiding this comment.
Probably not needed outside the intial debugging phase.
| The decisive detail: `λ` and `cp` are the scalar's **spatially varying fields**, | ||
| resolved lazily from the registry as `"<temperature>_lambda_tot"` (fallback | ||
| `"_lambda"`) and `"_cp"` — i.e. Nek5000's `vdiff`/`vtrans` of the temperature | ||
| field. They are filled by the user `material_properties` hook (e.g. Sutherland | ||
| `λ(T)`). Using a **constant** `k`/`cp` instead (the original code) makes `Q_T` | ||
| inconsistent with the energy balance and was the cause of the heated-channel | ||
| blow-up. If the scalar fields are absent, it falls back to `k_cond`/`cp`. |
There was a problem hiding this comment.
Not sure I fully understand this, I guess we can discuss!
| `lm_rhs_ext_field_rho`, `lm_rhs_bdf_field_rho`, `lm_rhs_oifs_field_rho` are | ||
| variable-ρ counterparts of `rhs_maker_*_cpu`: they assemble the AB-extrapolated | ||
| forcing history and the BDF lagged-velocity term weighting **per node by ρ(x)** |
There was a problem hiding this comment.
I don't think you need new types here. The rhs makers we have should just treat variable density properly. Density is always an array in the fluid anyway,
| ```jsonc | ||
| "fluid": { | ||
| "scheme": "lowmach", | ||
| "full_stress_formulation": true, // REQUIRED (coupled stress operator) |
There was a problem hiding this comment.
I think you should just force this in the code, since it is required, there is no reason to let the user configure this.
| ## 10. Material properties / coupling (user side) | ||
|
|
||
| Variable `μ(T)`, `λ(T)`, `cp` are supplied through the Neko `material_properties` |
There was a problem hiding this comment.
It would maybe be good to provide some good default options, i.e. Sutherland's, so that one does not always need to code this in the user file. @tuananhdao Does the compressible solver use these kind of relations? I am thinking that maybe some code can be reused across the two schemes here.
| ### The fix — defect correction | ||
| Solve the true variable-ρ system `A_var p = b` by an outer iteration whose inner | ||
| solve uses a **constant reference operator** `A_ref` (hsmg's home turf): |
There was a problem hiding this comment.
Is this stuff something you invented, or does this follow the Nek5000 formulation? My thinking is that since we are basically porting the Nek5000 scheme here, we could/should, in the first instance, just match exactly what Nek5000 does and achieve the same solution in both codes for some validation flow problem. That would be a good milestone!
| procedure, pass(this) :: init => fluid_lowmach_pnpn_init | ||
| procedure, pass(this) :: step => fluid_lowmach_pnpn_step |
There was a problem hiding this comment.
You are clearly missing a free routine because you have allocatables and pointers inside the type.
| do concurrent (i = 1:n) | ||
| this%Q_T_field%x(i,1,1,1) = this%Q_T_field%x(i,1,1,1) & | ||
| + w1%x(i,1,1,1) | ||
| end do |
There was a problem hiding this comment.
Prefer calling field_math.f90 / math.f90 routines where possible, unless you are fusing a big kernel into one loop. This will make GPU porting easier later.
There was a problem hiding this comment.
Since this is the CPU path, I would rather say keep it as is. And in the near future replace do concurrent with do + directives + omp do (see #2335)
…ting
Two structural bugs in the variable-density (low-Mach) path, found and verified
while building an MMS validation against Nek5000 short_tests/lowMach_test:
* Q_T thermal divergence (fluid_lowmach_pnpn.f90): the conduction term
div(lambda grad T) was built with cdtp (the weak/transpose operator D^T),
which does NOT compose with Binv into a strong divergence -- it produced
O(1e2) oscillatory values even for an exactly-resolved T=2+x^2 (Laplacian=2).
Replaced with the dudxyz collocation-derivative chain used by Neko's div().
Q_T_field is now registered ('Q_T') so user hooks / field writers can read it,
and an OPTIONAL volumetric heat source <temperature>_qdot is added to Q_T
(matching Nek5000 userqtl_scig); off by default, so existing cases are
bit-for-bit unchanged.
* energy equation (scalar_pnpn.f90): in the low_mach rho*cp path the user
volumetric source q was bundled into the RHS with the advection and then the
whole RHS multiplied by rho*cp(x)=1/T, so the solver enforced ...+rho*cp*q
instead of +q, leaving a spurious (rho*cp-1)*q steady residual everywhere
T!=1. The source (and weak-BC flux) is now snapshotted raw BEFORE the rho*cp
advection weighting and re-added un-weighted, mirroring Nek5000's split of
makeuq (raw source) from convab (vtrans-weighted advection). NB this affects
every low-Mach case with a scalar source (heat release, species reactions).
Also carries the variable-density momentum/pressure residual operating on
spatial rho/mu fields (lowmach_res_cpu) and the field-valued rho RHS makers
(rhs_maker*, rhs_maker_cpu/sx/device), plus the laminar mu_tot refresh
(fluid_scheme_incompressible) and minor fluid_pnpn/scalar_scheme hooks.
Result: Neko reproduces and holds the lowMach_test manufactured solution to
discretization accuracy -- max err u 2.0e-5, T 3.0e-7, QTL 7.2e-3, flat over
2000 steps (t=2, dt=1e-3), Normal end, 0 non-convergences.
Method-of-manufactured-solutions validation of the low-Mach solver against
Nek5000 short_tests/lowMach_test (Tomboulides et al., JCP 146, 1998):
u = T = 0.5(3+tanh(x/0.2)), rho = 1/T, with the manufactured heat source q so
the thermal divergence Q_T = QTL = 0.5/d (1-tanh^2) holds exactly.
- lowMach_validation.case / .f90: low-Mach + temperature scalar, analytic
IC/BC (tanh), heat source q (via the <temperature>_qdot registry field),
and a compute hook reporting MAX errors of u, T, Q_T vs the closed forms.
- lowMach_nek.nmsh: Nek5000's own mesh (lowMach_test.re2 -> rea2nbin).
- render_validation.py + viz_profiles.png / viz_contour.png: visual overlay
of every Neko GLL point on the analytic curve.
- README.md: physics, how to run, and the three bugs this case fixed.
Holds the manufactured solution to discretization accuracy (u 2e-5, T 3e-7,
QTL 7e-3) over 2000 steps. Generated outputs (.fld, logs, test meshes) are
gitignored.
…wall, gravity) + HPC kit
Flow over a wall-mounted hump (gmsh hump.geo -> gmsh2nek -> hump.nmsh, 53k hexes,
z-periodic), low-Mach, with:
- SUSTAINED turbulent inflow via recycling (global_interpolation, recycle plane
upstream of the hump) seeded by a turbulent IC + a small near-inlet trip;
- lower wall HEATED for x>11 (smoothstep ramp to T_hot=2), cold upstream;
- gravity / buoyancy (reduced-gravity -y body force from the EOS density);
- GJP stabilization. All knobs at the top of heated_hump.f90.
Validated locally on a coarse mesh (recycling + low-Mach + heating + gravity run
to t=15, bounded, with hump separation and a developing heated thermal layer).
The full hump.nmsh + order 7 is HPC-scale (CPU-only low-Mach). Ships with
SHIP_TO_HPC.md, build_neko_hpc.sh, and a SLURM job.slurm template.
…notes hump_fine.geo: NxA40+NxB140+NxC170 streamwise, Ny48, Nz60, rGrow 1.08 -> ~1.0M hexes (vs 53,760). The .nmsh is ~240MB (gitignored; generate via gmsh2nek/rea2nbin or scp). SHIP_TO_HPC.md documents order/resource sizing (use order 5 not 7 for 1M elems) and the Reynolds number (Re_in on hump height; 500 stable-coarse, ~5000-10000 for the fine run).
Summary
Draft / work-in-progress. Adds a low-Mach-number formulation of the Pn/Pn
fluid scheme for variable-density flows at negligible acoustic effects, coupling
the density to temperature through an ideal-gas equation of state.
Opening early to share the approach and gather feedback. The branch is based on
an older
develop(merge base 2026-04-02, ~28 commits behind) and needs arebase, the remaining accelerator backends, and quantitative validation before
it is merge-ready — hence draft.
What it adds
A new fluid scheme
lowmach(fluid_lowmach_pnpn), registered through the fluidfactory, extending the incompressible Pn/Pn solver with:
rho = P0 / (R_specific * T); the density field is updatedeach step from the scalar temperature (
P0,R_specific,T_eos_min/max,temperature_fieldconfig keys).Q_Tin the pressure-Poisson residual — the velocity isno longer solenoidal;
div(u) = Q_Tis computed from the temperature field andfed into the pressure solve.
lowmach_residualmodule (+ CPU backendlowmach_res_cpu, factory) thatassembles the momentum and pressure residuals with field-valued
rhoandmu(spatially varying properties) instead of scalar constants.Files (relative to the merge base)
src/fluid/fluid_lowmach_pnpn.f90— new scheme (EOS plumbing,Q_TfromT,field-valued
rhoin the momentum RHS)src/fluid/lowmach_residual.f90,src/fluid/lowmach_res_fctry.f90,src/fluid/bcknd/cpu/lowmach_res_cpu.f90— residual module + CPU backendsrc/fluid/fluid_base_fctry.f90,src/Makefile.am,src/.depends— wiringexamples/rayleigh_benard/rayleigh_lowmach.{case,f90},examples/tgv/tgv_lowmach_scalar.case— validation/example casesTODO before merge-ready
develop🤖 Generated with Claude Code