feature/vector_filter_basis by mcgibbon · Pull Request #1018 · ai2cm/ace

mcgibbon · 2026-03-26T18:21:47Z

Opening this draft PR to discuss with Claude.

Changes:

symbol (e.g. fme.core.my_function) or script and concise description of changes or added feature
Can group multiple related symbols on a single bullet
Tests added
If dependencies changed, "deps only" image rebuilt and "latest_deps_only_image.txt" file updated

Resolves # (delete if none)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

mcgibbon · 2026-03-26T18:26:20Z

+
+## Overview
+
+This document specifies an approach for handling directional (vector-valued) data in DISCO convolution on the sphere. Rather than constructing a continuous directional filter basis (as in the cube-gauged approach), this design uses **isotropic radial filters** combined with a **frame-rotation matrix** that correctly transforms vector inputs from each input point's local meridian frame into the output point's meridian frame.


The filters the design uses aren't isotropic, right? They will have an angular dependence. Or are they rotationally invariant, just they properly handle vector inputs and outputs?

I think it is a good idea that the physics be rotationally invariant, in the sense that for example if there is a convergence term for advection in x depending on u and the x-oriented filter, there must also be a corresponding convergence term in y depending on v and the y-oriented filter. Similarly for rotation-like filters. Is there a way to capture this? Describe how such filters would be represented and how it results in this desired behavior. Would it mean we store one set of filters that gets applied in some coherent way to both the x- and y- axis, using a rotation operator?

Claude: You're right — the effective filters are not isotropic. The radial basis functions ψ_k(r) are isotropic (depend only on geodesic distance), but the full filter tensors include angular dependence through cos(γ) and sin(γ), where γ is the frame rotation angle. So the filter tensor ψ_k(r)·cos(γ) has a dipole-like angular pattern — it picks out the component of the input aligned with one direction, while ψ_k(r)·sin(γ) picks out the perpendicular component.

The rotational invariance comes from how these filters are used in matched pairs. Yes — we store one set of radial basis functions ψ_k(r), and from it derive two filter tensors (psi_cos and psi_sin) that are always applied together via the rotation matrix structure. For example:

Scalar→vector (gradient-like): Applying psi_cos to a scalar field s gives the u-component, and psi_sin gives the v-component. Because both use the same ψ_k(r) and the angular factors cos(γ)/sin(γ) are geometrically determined (not learned), this automatically ensures that d/dx and d/dy are locked together — you can't have convergence in x without the corresponding convergence in y.

Vector→vector (frame rotation): The same psi_cos/psi_sin pair forms the 2×2 rotation matrix [cos γ, −sin γ; sin γ, cos γ] that transforms input vectors into the output frame. Again, one set of filters handles both components coherently.

Vector→scalar (divergence/curl-like): The cos/sin filters contract vector components into rotationally invariant scalars (div-like: cos·u + sin·v, curl-like: cos·v − sin·u).

The learned weights are shared between the u and v pathways — the weight can't independently scale the x vs y contribution, so the network can't break the rotational symmetry. I'll revise the document to clarify this and stop calling the filters "isotropic."

mcgibbon · 2026-03-26T18:27:42Z

+- **Scalar channels** `s(lat, lon)`: Frame-independent quantities (temperature, pressure, geopotential, etc.). No special treatment needed in convolution.
+- **u channels** `u(lat, lon)`: East component of a vector in the local meridian frame (eastward wind, etc.).
+- **v channels** `v(lat, lon)`: North component of a vector in the local meridian frame (northward wind, etc.).


Note the number of u and v channels are the same, but the number of scalar channels is independent of the number of vector channels. The vectors should be stored as a single tensor with a final [2] dimension.

Claude: Agreed. Forward signature will be forward(x_scalar, x_vector) where x_scalar: (B, N_s, H, W) and x_vector: (B, N_v, H, W, 2) with the last dimension being (u, v). Output follows the same convention. Will update the document.

mcgibbon · 2026-03-26T18:29:13Z

+[v_rotated] = [sin γ   cos γ] [v_in]
+```
+
+where `(u_in, v_in)` are the input vector components in the input's meridian frame, and `(u_rotated, v_rotated)` are the same vector expressed in the output's meridian frame.


Note that this rotation is represented by being stored within the pre-computed filter itself.

Claude: Yes — cos(γ) and sin(γ) are multiplied into ψ_k(r) during precomputation to produce psi_cos_fft and psi_sin_fft. No runtime rotation needed.

mcgibbon · 2026-03-26T18:54:41Z

+- **Vector norm → scalar:** `n = sqrt(u² + v²)` is frame-invariant and produces a true scalar.
+- **Norm-gated vectors:** `σ(n) · (u, v) / n` applies a nonlinearity to the vector magnitude while preserving direction.
+- **Scalar × vector → vector:** `s · (u, v)` multiplies a scalar channel with a vector pair to produce a new vector pair. This is how the network creates new vector features from scalar-vector interactions.
+- **Vector dot product → scalar:** `u₁·u₂ + v₁·v₂` produces a frame-invariant scalar from two vector pairs.
+- **2D cross product → scalar:** `u₁·v₂ − u₂·v₁` produces a pseudo-scalar (changes sign under reflection but is frame-rotation invariant).
+
+### Unsafe Nonlinearities
+
+- **Independent nonlinearities on u and v:** Applying `ReLU(u)`, `ReLU(v)` independently breaks rotational structure — the result depends on the frame orientation.
+- **Adding scalar to vector component:** `u + s` breaks vector transformation rules.


Let's only allow nonlinearities to be applied to the scalars. Don't apply nonlinearities to the vector channels. This should still be fine, as the vectors can have nonlinear behavior by virtue of their dependence on the scalar channels, right?

Claude: Correct. Vectors gain nonlinear behavior through the scalar pathway: scalars get nonlinearities applied, then the next layer's scalar→vector filters (psi_cos/psi_sin) produce new vector features that depend nonlinearly on the original inputs. This avoids frame-dependent nonlinearity issues entirely. I'll replace the nonlinearity section with a note about this constraint.

mcgibbon · 2026-03-26T18:56:51Z

+
+This approach is different in character:
+
+- **No angular filter modes.** The filters remain isotropic (radial only). Directional information enters through the vector channels, not through the filter shape.


Let's discuss this some more. The filters won't be purely isotropic, right? We need to take d/dx of scalar fields, it's just that this operation will be handled in a way that aligns it with the u component, while d/dy would be aligned with the v component of the vectors.

Claude: Right — as discussed in the thread on line 5, the effective filters have angular dependence via cos(γ)/sin(γ). For scalar→vector specifically: psi_cos applied to a scalar gives the d/dx-like component (→ u), psi_sin gives d/dy-like (→ v). Same ψ_k(r) underlies both, geometrically locked together. I'll replace "isotropic" throughout.

mcgibbon · 2026-03-26T19:13:23Z

+- **N_s, N_v (channel counts):** Number of scalar and vector channel pairs in the hidden representation. The network architecture determines these. A reasonable starting point is to match the current total channel count, reserving some for vector pairs.
+- **Nonlinearity design:** The choice of how to combine scalar and vector channels between layers (norm-gating, scalar-vector products, etc.) is an architectural decision. Simple norm-gating plus scalar×vector products provide the essential interactions.
+
+## Summary of Implementation Steps


Can we also outline the testing plan for this new feature? Especially testing that various properties of certain operations are satisfied using user-story level tests of that behavior. It is hard to fully understand how the mathematical details lead to these properties, so it would help to have this also proven out by tests.

Claude: Proposed testing plan (user-story-level behavioral tests):

Uniform vector field is preserved. Create a constant vector field (e.g., everywhere pointing east). After vector→vector convolution, the output should still be uniform and pointing east — frame rotations cancel for a constant field.

Longitude shift equivariance. Shift scalar and vector inputs by N longitude points. Verify outputs shift by N points (both scalar and vector channels). Tests that the FFT-based implementation preserves translational symmetry.

Scalar gradient produces correct vector. Apply scalar→vector to a scalar field with a known gradient (e.g., linear ramp in longitude → expect u output, zero v). Verify the output vector direction matches the gradient direction.

Divergence recovery. Create a radially diverging vector field from a point. Apply vector→scalar with divergence-like weights. Verify the output scalar is positive at the source.

Curl recovery. Create a rotating vector field. Apply vector→scalar with curl-like weights. Verify the result reflects the rotation.

Rotational invariance. Rotate input fields (shift longitude), verify that scalar outputs are identical and vector outputs rotate consistently.

Scalar-only consistency. When N_v=0, verify the module produces identical results to the existing DiscreteContinuousConvS2.

Regression baselines. Save reference outputs from known inputs/weights, verify they don't change across refactors (using validate_tensor_dict).

Your "longitude shift" and "rotational" tests seem to be the same test? Consider also doing a test that involves rotating the north pole somewhere else. The output field should be rotated the same way.

Claude: You're right, those were redundant. I'll merge them and add a proper full-sphere rotation test. Updated plan in the document:

Longitude shift equivariance: shift inputs by N longitude points, verify outputs shift identically.

Arbitrary rotation equivariance: rotate the entire input field (move the north pole elsewhere), verify the output rotates consistently — scalar outputs match under the rotation, vector outputs transform correctly into the new frame.

The arbitrary rotation test is stronger because longitude shifts are already guaranteed by the FFT structure, while a pole-moving rotation exercises the frame rotation geometry at every latitude.

mcgibbon · 2026-03-26T19:15:14Z

+- **Scalar output from scalar input:** Standard — weight contracts over `(c_in, k)`.
+- **Scalar output from vector input:** Weight contracts over `(c_in, k)` separately for `conv_u` and `conv_v`, producing a scalar from vector components (analogous to divergence or a directional projection).
+- **Vector output from vector input:** Weight contracts over `(c_in, k)` for `conv_u` and `conv_v` to produce output `u` and `v` components. The weight can independently scale and mix the u/v components.
+- **Vector output from scalar input:** The weight assigns a scalar intermediate feature to a vector output channel. This implicitly creates a preferred direction in the meridian frame, and is therefore not rotationally equivariant. But if `u_out` and `v_out` channels are created from the same scalar feature with appropriate weights, the network can learn latitude-dependent directional structures (e.g., patterns aligned with Coriolis deflection).


How is this weight defined? It should correspond to rotationally-invariant (but not isotropic) physics.

Claude: The weight has separate blocks per type interaction, with u/v sharing weights to enforce rotational invariance. Each interaction that involves vectors has two independent weight components (analogous to a complex weight: one "real" and one "imaginary" or equivalently one "stretch" and one "90°-rotation"):

W_ss (N_s_out, N_s_in, K): scalar→scalar via psi_scalar. Standard, 1 weight.

W_sv (N_v_out, N_s_in, K, 2): scalar→vector. Component 0 ("gradient"): u = w·contraction(psi_cos, s), v = w·contraction(psi_sin, s). Component 1 ("perp gradient"): u = w·(-contraction(psi_sin, s)), v = w·contraction(psi_cos, s). The perpendicular gradient is physically meaningful — e.g. geostrophic wind from pressure.

W_vs (N_s_out, N_v_in, K, 2): vector→scalar. Component 0 ("divergence"): s = w·(psi_cos·u + psi_sin·v). Component 1 ("curl"): s = w·(psi_cos·v − psi_sin·u).

W_vv (N_v_out, N_v_in, K, 2): vector→vector. Component 0 ("stretch"): same weight on both components of frame-rotated vector. Component 1 ("rotate"): applies 90° rotation (-conv_v, conv_u) with shared weight.

Key constraint: within each block, u and v pathways share the same scalar weight, so the network can't break rotational symmetry. The 2-component structure for vector interactions captures the full space of rotationally-invariant linear maps (divergence+curl, gradient+perp-gradient, stretch+rotation).

mcgibbon · 2026-03-26T19:18:11Z

+- **Scalar output from scalar input:** Standard — weight contracts over `(c_in, k)`.
+- **Scalar output from vector input:** Weight contracts over `(c_in, k)` separately for `conv_u` and `conv_v`, producing a scalar from vector components (analogous to divergence or a directional projection).
+- **Vector output from vector input:** Weight contracts over `(c_in, k)` for `conv_u` and `conv_v` to produce output `u` and `v` components. The weight can independently scale and mix the u/v components.
+- **Vector output from scalar input:** The weight assigns a scalar intermediate feature to a vector output channel. This implicitly creates a preferred direction in the meridian frame, and is therefore not rotationally equivariant. But if `u_out` and `v_out` channels are created from the same scalar feature with appropriate weights, the network can learn latitude-dependent directional structures (e.g., patterns aligned with Coriolis deflection).


The vectors should not be assigned values directly from scalar inputs, rather they should only be assigned values that correspond to specific filters (which may have an orientation or be isotropic) being applied to the scalars, and then that value being mapped into the vectors. I think this is what you mean, but it's not entirely clear. Requiring this mapping be rotationally invariant means that the isotropic filter would be assigned into the u and v components with exactly the same contribution - there's no directional dependence.

Claude: Agreed. In the updated design, scalar→vector exclusively uses the oriented filters: u = W_sv · contraction(psi_cos, s), v = W_sv · contraction(psi_sin, s). The directional structure comes from the geometry (cos γ, sin γ), not the weights. There is no path from psi_scalar to vector outputs — an isotropic filter applied to a scalar carries no directional information and can't meaningfully populate a vector channel.

Key changes to the design document: - Replace "isotropic filter" framing: filters have angular dependence via cos(γ)/sin(γ), making them rotationally invariant but not isotropic - Vector channels stored as (B, N_v, H, W, 2) tensor with (u,v) last dim - Weight structure enforces rotational invariance: four separate blocks (W_ss, W_sv, W_vs, W_vv) with shared u/v weights; vector interactions have 2 components each (gradient/perp-grad, div/curl, stretch/rotate) - No direct psi_scalar→vector path; scalar→vector uses oriented filters - Nonlinearities only on scalar channels; vectors gain nonlinearity indirectly through the scalar pathway - Add module interface (VectorDiscoConvS2) with explicit forward signature - Add testing plan with 8 behavioral tests including arbitrary rotation equivariance - Update implementation steps for new file organization Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

New file fme/core/disco/_vector_convolution.py with: - _precompute_vector_convolution_tensor_s2: extends the existing precomputation to compute cos(γ) and sin(γ) (frame rotation angle) at each filter support point, producing three sets of filter values (scalar, cos-modulated, sin-modulated) with shared normalization - build_vector_psi_fft: orchestrates the full pipeline from precomputation through banded FFT construction, reusing the scalar banding parameters for cos/sin tensors Tests verify: - Scalar values and indices match existing _precompute_convolution_tensor_s2 - Pythagorean identity cos²(γ) + sin²(γ) = 1 at all support points - Zero frame rotation on the same meridian (Δlon = 0) - sin(γ) antisymmetry under Δlon → -Δlon - Scalar FFT tensor matches the existing pipeline end-to-end - Correct shapes and compatibility with multiple filter basis types Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

New nn.Module in fme/core/disco/_vector_convolution.py: - Forward pass: 5 FFT contraction calls (3 on scalar inputs, 2 on vector inputs), assembles rotationally invariant intermediates (frame-rotated vector, divergence, curl), then applies 4 weight blocks - Weight structure: W_ss (scalar→scalar), W_sv (scalar→vector with gradient + perpendicular gradient), W_vs (vector→scalar with divergence + curl), W_vv (vector→vector with stretch + 90° rotation) - Each vector weight block has 2 components sharing the same scalar weight for u and v pathways, enforcing rotational invariance - Bias on scalar outputs only (vector bias would break symmetry) - Handles zero-channel edge cases (N_s=0 or N_v=0) - Exported as VectorDiscoConvS2 from fme.core.disco Tests added: - Forward shape for various channel configurations - Zero-channel edge cases (scalar-only, vector-only, mixed) - Scalar-only output matches existing DiscreteContinuousConvS2 - Longitude shift equivariance for both scalar and vector channels - R_x(pi) rotation equivariance for scalar→scalar and vector→vector - Gradient flow through all four weight blocks - Scalar gradient direction at equator - Divergence and curl detection from vector fields - Bias correctness (affects only scalar outputs) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Extend the vector filter precomputation tests to verify that the two components of the vector filter (cos(γ) and sin(γ)) are perpendicular for every supported filter basis type: - Pythagorean identity (cos²+sin²=1): parametrized over piecewise linear, morlet, and zernike bases. Verifies pointwise perpendicularity. - Weighted orthogonality: for each (kernel, output latitude), verifies the inner product of cos(γ) and sin(γ) patterns over the filter support is near zero. Tests that the two components are orthogonal as functions, not just pointwise unit vectors. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

New abstraction that pairs two FilterBasis instances: one for isotropic operations (scalar->scalar, vector->vector) and one for oriented operations (scalar->vector, vector->scalar). The two bases can have different kernel shapes, allowing independent radial resolution. Includes factory function get_vector_filter_basis and exports from fme.core.disco. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The module now uses independent scalar and vector filter bases with potentially different kernel sizes (K_s, K_v): - build_vector_psi_fft evaluates both bases, producing 5 FFT tensors (psi_scalar, psi_s_cos, psi_s_sin from scalar basis; psi_v_cos, psi_v_sin from vector basis) with separate gather indices - Forward pass uses 7 contraction calls when K_s != K_v (collapses to 5 when bases are identical, via the precomputation cache) - Weight shapes: W_ss (K_s), W_vv (K_s, 2), W_sv (K_v, 2), W_vs (K_v, 2) - Weight init accounts for mixed K_s/K_v fan-in per output type - Constructor accepts VectorFilterBasis or kernel_shape (backward compat) Tests updated: - build_vector_psi_fft tests use VectorFilterBasis - New test for different K_s/K_v shapes and weight dimensions - New test for FFT shapes with different bases - All existing behavioral tests pass unchanged via kernel_shape compat Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Rewrite the design document as a standalone reference for the VectorDiscoConvS2 module: - Lead with scalar vs vector filter type distinction and the input×filter→output type table - Describe VectorFilterBasis with independent K_s and K_v - Document all 5 precomputed FFT tensors (3 scalar-basis, 2 vector-basis) with separate banding and gather indices - Document 7 contraction calls (collapsing to 5 when K_s = K_v) - Document symmetry properties: exact longitude equivariance, R_x(pi) equivariance for ss/vv but not cross-type paths - Update module interface to show VectorFilterBasis constructor - Update testing plan to match implemented tests - Remove references to "existing" code or previous document versions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…iance The cross-type paths (sv, vs) now use the bearing angle φ and frame correction β instead of the frame rotation γ for their angular modulation. This makes all four paths equivariant under R_x(π): - sv path: uses cos(φ)/sin(φ) (bearing at output) instead of cos(γ)/sin(γ) - vs path: uses cos(β)/sin(β), equivalent to frame-rotating the input vector to the output frame then projecting onto the bearing direction - vv path: unchanged, still uses cos(γ)/sin(γ) for frame rotation - ss path: unchanged, isotropic The precomputation now stores all three angular pairs (φ, β, γ) at each support point. At θ≈0 (input coincides with output), the bearing direction is undefined, so cos(φ)/sin(φ) and cos(β)/sin(β) are zeroed — the frame rotation γ remains valid there. 7 FFT tensors are precomputed (3 from scalar basis, 4 from vector basis) instead of the previous 5. The R_x(π) equivariance test now runs with all four paths active in a single module, verifying full equivariance. Design document updated with the three-angle framework (φ, β, γ=φ−β), which angle is used where, and updated symmetry properties. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

logs/extract_transcript.py converts Claude Code JSONL session files into readable transcripts. Takes a session ID and log folder name, finds the source file under ~/.claude/projects/, and writes both JSONL (one turn per line) and Markdown (GitHub-renderable with blockquoted user messages and collapsed tool calls). Output filenames include the session start timestamp and session ID prefix for traceability. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Session 9c55f925-6cb6-48f1-813b-a63ec61d5191, started 2026-03-26. Covers design discussion, PR review responses, implementation of VectorDiscoConvS2 with bearing-angle equivariance, and VectorFilterBasis refactor. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

This session covers the evolution from cube-gauged directional filters to the simpler vector-typed approach using meridian-frame convolution with per-input-point frame rotation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ShallowWaterStepper uses a VectorDiscoConvS2 layer with fixed weights to approximate gradient and divergence operators for the linearized shallow water equations on the sphere: dh'/dt = -H0 div(V) dV/dt = -g grad(h') - f x V Uses nondimensional parameters (g=1, H0=1, omega=0.5) with RK4 time integration. Coriolis forcing applied pointwise after convolution. Tests verify: - Resting state stability - Mass conservation (<1% drift over 200 steps) - Energy conservation (ratio ~1.0 over 500 steps) - Perturbation generates motion - No NaN blowup over 1000 steps - Output shape preservation Video script (scripts/run_shallow_water.py) generates GIF/MP4 of a Gaussian height perturbation radiating gravity waves. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The bearing angle convention (φ=0 → south) means component 0 in the weight matrix produces the perpendicular gradient (flow along contours) rather than the true gradient. Switch to component 1 for both the sv (gradient) and vs (divergence) paths. Also: - Video script uses fme.core.device.get_device() for GPU acceleration - Add --lon0 argument for bump center longitude (default 90°) - Move default bump longitude to 90° for better visualization Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…to feature/vector_filter_basis

ScalarVectorProduct performs pointwise scaling and 90-degree rotation of vector features by scalar features. Weight shape is (N_s, N_v, 2) where the last dimension selects scale vs rotate. This is the minimal frame-consistent bilinear interaction between scalar and vector fields. The shallow water stepper now uses this module for the Coriolis term (f × V) instead of manual torch.stack operations. 9 tests for the module cover: shape, zeros, pure scale, pure rotation, Coriolis pattern, multi-channel summation, magnitude preservation, and gradient flow. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

mcgibbon · 2026-03-27T20:37:39Z

The shallow water model produces plausible videos of evolution. For no-rotation they look exactly like what I'd expect. For with-rotation, I have much less intuition on what the solution should look like, but the solution looks plausible.

Add section to vector_filter_basis.md describing: - ScalarVectorProduct motivation (convolution + scalar nonlinearity alone cannot represent pointwise scalar×vector products) - Block structure: conv → activation → MLP → ScalarVectorProduct → residual - Ordering rationale: MLP before ScalarVectorProduct so enriched scalar features drive vector modulation - What each block can represent (gradient, divergence, Coriolis, etc.) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

VectorDiscoBlock (block.py) is the repeating unit for scalar-vector networks: conv → activation → residual MLP → residual ScalarVectorProduct → residual. Key design decisions: - MLP is residual (s + MLP(s)) so zero weights = identity passthrough - ScalarVectorProduct is residual (v + sv_product(s, v)) so conv vector output passes through and scalars add corrections - activation="none" option for physics-based models ScalarVectorProduct moved to its own file (scalar_vector_product.py), test renamed accordingly. Old modules.py removed. Stepper refactored to use a single VectorDiscoBlock with 2 scalar channels (h, f) and 1 vector channel (V). Coriolis f is passed through the conv via W_ss and rotates V via the ScalarVectorProduct. Energy tests split: exact conservation without rotation (omega=0), bounded drift with rotation (smoothed f introduces small energy non-conservation since the Coriolis orthogonality is approximate). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Block changes: - Add pointwise (delta function) scalar skip: 1x1 conv that adds unfiltered scalar input to conv output. Lets features like the Coriolis parameter pass through without spatial smoothing. - Add PointwiseVectorTransform: per-channel scale+rotate for vectors, analogous to 1x1 conv for scalars. - ScalarVectorProduct now acts on INPUT vectors (x_vector), not conv output. This ensures exact Coriolis no-work property (f x V · V = 0). Stepper uses the pointwise scalar skip for f, giving exact energy conservation with rotation (ratio = 1.000000 over 200 steps). Tests sped up: smaller grid (16x32 vs 32x64), fewer steps. Suite runs in ~15s vs ~60s. Energy test now expects exact conservation with rotation (< 5% drift tolerance). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Transcript of the session implementing HybridCoordinateStepper in hybrid sigma-pressure coordinates, including the sigma_dot bug fix and all tests. https://claude.ai/code/session_01PvzyiGXcVTKQAnjBVKa1ta

- 2026-03-29T0422-e5d70b83: 2 turns, 50 responses - 2026-03-29T0422-f2dcec29: 4 turns, 114 responses - 2026-03-29T1355-122a6ced: 17 turns, 258 responses https://claude.ai/code/session_01PvzyiGXcVTKQAnjBVKa1ta

W_vs2 adds a diagonal scalar-grad-dot-vector → scalar operator to VectorDiscoConvS2, enabling exact V·∇s computation without extra DISCO passes. This is needed for temperature and humidity advection in the multi-level primitive equations. VectorDotProduct computes weighted |V|² → scalar (frame-invariant), used for kinetic energy KE_k = ½|V_k|² in block steppers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Implements the hydrostatic primitive equations on K fixed pressure levels with frozen VectorDiscoConvS2 weights. Evolves velocity (via Lamb-form momentum with hydrostatic PGF), temperature (full dry adiabatic equation with omega vertical velocity), and humidity (passive tracer advection). Uses RK4 time integration. Mean-subtraction of geopotential before differentiation eliminates the O(|phi|*eps) discretization error from grad(const). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…Block Encodes the multi-level primitive equations in a single VectorDiscoBlock pass by treating levels as channels: scalars are [T, KE, zeta, delta] per level plus Coriolis f; vectors are V per level. A separate adv_conv using the W_vs2 pathway handles T and q horizontal advection. Uses VectorDotProduct for frame-invariant KE computation and frozen lower-triangular W_sv weights for the hydrostatic PGF. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Extends the primitive equations to sigma coordinates (sigma = p/p_s) with prognostic surface pressure. Adds p_s tendency from column- integrated divergence, sigma-dot vertical velocity, and vertical advection terms. Supports optional hyperdiffusion. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Implements the most general vertical coordinate system p_k = a_k + b_k * p_s, encompassing both pure sigma and pure pressure as special cases. Adds hybrid vertical velocity C_k, modified PGF correction, and vertical advection with spatially-varying layer thicknesses. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Extends the block stepper approach to hybrid coordinates. Replaces Python for-loops with frozen Conv2d layers for vertical operations (hydrostatic recurrence, vertical velocity integration, vertical finite differences). Only 2 DISCO passes total. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Validates PrimitiveEquationsStepper with a classical warm-dome test: a temperature anomaly at 40N creates a geopotential gradient that drives anticyclonic circulation via Coriolis deflection. Demonstrates hydrostatic vertical coupling (upper levels show stronger response). Updates shallow_water __init__.py to export all new stepper classes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Jablonowski-Williamson 2006 baroclinic instability test using the SigmaCoordinateStepper. The analytical JW balance does not hold exactly under the discrete DISCO operators, so the simulation shows a gravity-wave adjustment phase rather than clean baroclinic wave development. Kept as a reference for future work on iterative initialization or spectral filtering. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Describes the progression from single-level shallow water through isobaric primitive equations, sigma coordinates, and hybrid coordinates, along with the block stepper variants that encode multi-level dynamics in a single VectorDiscoBlock pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…to feature/vector_filter_basis

Consolidate all horizontal operations (dynamics, advection, diffusion) into one DISCO forward pass instead of three, bringing the block stepper closer to a pure NN architecture. - Add residual=False option to VectorDiscoBlock - Expand block scalar channels from 4K+2 to 6K+2 (add T, q per level) - Encode horizontal advection (-V·∇T, -V·∇q) via W_vs2 in the block - Encode diffusion (ν∇²) via W_ss (scalars) and W_vv (vectors) using Laplacian weights derived from Legendre polynomial eigenvalues - Remove standalone adv_conv, grad_conv, div_conv and diffusion helpers - Remove diffusion_order parameter (only order-1 was used) - Add regression and diffusion-comparison tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Same pattern as the hybrid equations block: merge advection and diffusion into the single VectorDiscoBlock pass. - Expand from 4K+1 to 6K+1 scalar channels (add T, q per level) - Move advection from adv_conv to W_vs2 in the block - Add diffusion via W_ss/W_vv Laplacian weights (reuses helper) - Use residual=False, remove adv_conv/grad_conv/div_conv/diffusion helpers - Remove diffusion_order parameter - Add regression and diffusion-comparison tests - Update divergence extraction test for new channel layout Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add isobaric flag to HybridCoordinateBlockStepper that skips vertical advection of V/q, PGF correction, and surface pressure tendency. PrimitiveEquationsBlockStepper now composes a HybridCoordinateBlockStepper (isobaric=True) and wraps its interface to hide the constant p_s. - Add isobaric_coefficients() to convert pressure levels to hybrid b=0 - Add isobaric branch in compute_tendencies (simplified omega from C_conv) - Reduce primitive_equations_block.py from ~290 lines to ~100 lines - All 78 shallow water tests pass Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…Blocks Architecture: scalar encoder (Conv2d) -> vector encoder (PointwiseVectorTransform) -> N stacked VectorDiscoBlocks -> scalar/vector decoders. The network is level-agnostic; levels are packed into channels by the caller. Zero-initializes decoder, sv_product, MLP output layers, and pointwise_vv for well-behaved initial forward pass. With 2 residual blocks, scalar variance stays within ~2x of input; vector variance runs ~3x due to the DISCO bearing filter contractions not being variance-preserving (will be addressed in a follow-up commit). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…nuation The DISCO basis is normalized to preserve the mean of constant fields (basis_norm_mode="mean"), which is correct for encoding physics operators. But spatial averaging inherently reduces variance (by Cauchy-Schwarz, filter weights summing to 1 implies sum of squares < 1). The bearing filters (psi*cos(phi), psi*sin(phi)) attenuate further because the zero-mean angular modulation causes additional cancellation. The old init assumed unit-variance filter outputs, under-scaling the weights. The fix probes each filter contraction with white noise at construction time to measure the actual attenuation (~0.12x for bearing, ~0.67x for isotropic), then incorporates these factors into the fan-in computation so the weight magnitude compensates. Raw conv output is now ~1.35x input variance (was asymmetric before). Through 2 residual blocks: scalars ~3x, vectors ~6x (cross-talk from scalar->vector path). Layer norm would address the residual accumulation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Apply ConditionalLayerNorm to scalars after the residual connection in each block. Only scalars are normalized — vectors pass through unnormalized so their magnitudes remain physically meaningful. Post-norm controls total scalar variance by normalizing the accumulated state (initial + residual), keeping scalars at ~2x regardless of depth. Vector variance grows linearly (~1x per block from W_sv), which will be addressed by W_vv damping in a follow-up. The norm supports conditioning on noise, positional embeddings, and labels via ContextConfig, matching the existing ConditionalLayerNorm infrastructure. Physics stepper blocks pass post_norm=False. With 12 blocks: scalar ~2x, vector ~13x (linear, trainable). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Instead of zero-initializing W_vv, set the diagonal to -0.2 on the identity-like basis function (k=0, d=0 stretch). With the residual connection this gives v_new = 0.8*v_old + W_sv_contribution, a contractive map that bounds vector variance regardless of depth while preserving ~7% of input signal through 12 blocks. Combined with scalar post-norm, both paths are well-behaved through deep stacks: scalars ~2x, vectors ~4x at 12 blocks. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Scale W_sv to 0.7x its fan-in-scaled value and initialize W_vv as a mild diagonal contraction (alpha=0.1 on k=0, d=0 basis). With the residual connection this gives v_new = 0.9*v_old + 0.7*W_sv_contribution, bounding vector variance to ~4x at 12 blocks while retaining ~28% of input signal (vs 7% at alpha=0.2). Scalar post-norm keeps scalars at ~2x. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Implements the StepConfigABC/StepABC interface wrapping VectorDiscoNetwork. Winds are treated as vectors, normalized by a single speed scale (no bias) derived from the normalizer's per-variable stds. Coriolis and KE are pre-computed and injected as additional scalar inputs. Key design: - Wind variables identified by prefix match (eastward_wind_0, ...) - wind_scale = mean_over_levels(sqrt(std_u² + std_v²)) - Winds always use residual prediction (input + network output) - Prognostic scalars use residual when residual_prediction=True - Diagnostic scalars are always full-field - Registered as "vector_disco" in StepSelector Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add initial draft of vector filter basis notes

a8fd671

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

mcgibbon commented Mar 26, 2026

View reviewed changes

mcgibbon and others added 15 commits March 26, 2026 19:36

Merge branch 'feature/vector_filter_basis' of github.com:ai2cm/ace in…

c5159ec

…to feature/vector_filter_basis

mcgibbon and others added 11 commits March 27, 2026 20:52

Add session log for multi-level shallow water / hybrid coordinates work

657396f

Transcript of the session implementing HybridCoordinateStepper in hybrid sigma-pressure coordinates, including the sigma_dot bug fix and all tests. https://claude.ai/code/session_01PvzyiGXcVTKQAnjBVKa1ta

Add session logs for three additional sessions

d6f6820

- 2026-03-29T0422-e5d70b83: 2 turns, 50 responses - 2026-03-29T0422-f2dcec29: 4 turns, 114 responses - 2026-03-29T1355-122a6ced: 17 turns, 258 responses https://claude.ai/code/session_01PvzyiGXcVTKQAnjBVKa1ta

mcgibbon and others added 13 commits April 15, 2026 17:12

Merge branch 'feature/vector_filter_basis' of github.com:ai2cm/ace in…

b947455

…to feature/vector_filter_basis


		## Overview

		This document specifies an approach for handling directional (vector-valued) data in DISCO convolution on the sphere. Rather than constructing a continuous directional filter basis (as in the cube-gauged approach), this design uses isotropic radial filters combined with a frame-rotation matrix that correctly transforms vector inputs from each input point's local meridian frame into the output point's meridian frame.


		This approach is different in character:

		- No angular filter modes. The filters remain isotropic (radial only). Directional information enters through the vector channels, not through the filter shape.

Conversation

mcgibbon commented Mar 26, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mcgibbon commented Mar 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants