GPU/CUDA acceleration via CUDA.jl #3

@jc-macdonald

Description

Add GPU acceleration to OpEngine.jl using CUDA.jl. This is a priority feature: GPU support is the main motivation for the Julia port over Python.

Motivation

TRIDENT-scale simulations (300 trait bins × 100 depth levels × 1000s of timesteps) are compute-bound on the reaction term evaluation and diffusion matrix solves. GPU parallelism maps naturally onto:

  • Trait-axis parallelism: each trait bin's reaction term is independent
  • Spatial-axis parallelism: each depth level's reaction term is independent
  • Batch parallelism: ensemble runs over parameter sweeps (model-criticism Pareto studies)
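The first two axes above can be sketched in one broadcast. This is illustration only, assuming a hypothetical `reaction!` with a logistic-style kernel (the issue does not show OpEngine.jl's actual reaction API); the point is that with state stored as a trait-bins × depth-levels matrix, every element is independent, so a single fused broadcast covers both parallel axes and runs unchanged on `Array` or `CuArray`:

```julia
# Hypothetical reaction-term evaluation; not OpEngine.jl's real API.
# u is (trait bins × depth levels); every element is independent, so one
# fused broadcast exploits both the trait axis and the spatial axis.
function reaction!(du, u, r, K)
    # r, K: per-trait-bin parameters (length 300), broadcast across depth
    @. du = r * u * (1 - u / K)
    return du
end

u  = rand(300, 100)    # 300 trait bins × 100 depth levels
du = similar(u)        # allocate without committing to a backend
r  = rand(300)         # per-trait growth rates (placeholder values)
K  = fill(1.0, 300)    # per-trait carrying capacities (placeholder)
reaction!(du, u, r, K)

# On the GPU the same call compiles to a single fused kernel:
#   using CUDA
#   reaction!(cu(du), cu(u), cu(r), cu(K))
```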

Tasks

  • Abstract array backend: AbstractArray throughout so CuArray drops in
  • GPU-compatible reaction term evaluation (avoid scalar indexing)
  • GPU-compatible diffusion operator (tridiagonal solve on GPU — use CUSOLVER or batched Thomas algorithm)
  • GPU-compatible IMEX time-stepping
  • Benchmark: CPU vs GPU for TRIDENT-scale problem sizes
  • Batch solver: run N parameter sets simultaneously on GPU (one kernel launch per ensemble)
  • Optional dependency: CUDA.jl as an extension package (OpEngineCUDAExt)
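A minimal sketch of the backend-agnostic IMEX pattern the tasks above describe. All names here (`imex_step!`, the inline reaction closure) are placeholders, not OpEngine.jl API: the rule is to allocate with `similar`, mutate via broadcasts and library solves, and never scalar-index, so that `Array` and `CuArray` share one code path.

```julia
using LinearAlgebra

# Placeholder IMEX step: explicit reaction, implicit diffusion.
# F is a prefactored implicit operator for (I - dt*D).
function imex_step!(u, du, F, react!, dt)
    react!(du, u)       # explicit reaction term: a fused broadcast
    @. u += dt * du     # explicit update, no scalar indexing
    ldiv!(F, u)         # implicit diffusion solve, in place
    return u
end

# CPU usage, with u oriented as (depth levels × trait bins) so the
# diffusion operator acts down each column:
n = 100
D = Tridiagonal(ones(n - 1), -2.0 * ones(n), ones(n - 1))  # 1-D Laplacian stencil
F = lu(I - 0.01 * D)        # prefactored tridiagonal implicit operator
u  = rand(n, 300)
du = similar(u)
imex_step!(u, du, F, (du, u) -> (@. du = u * (1 - u)), 0.01)
```

On the GPU, `F` would instead come from a CUSOLVER/CUSPARSE routine or a batched Thomas solve behind the same `ldiv!` call; picking the right wrapper is exactly the tridiagonal task above.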
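For the extension-package task, a sketch of what `OpEngineCUDAExt` could look like under Julia's weak-dependency mechanism (Julia ≥ 1.9). The module contents are illustrative, and the CUDA UUID in the comment should be verified against CUDA.jl's entry in the General registry:

```julia
# ext/OpEngineCUDAExt.jl; loaded automatically once both OpEngine and CUDA
# are in the session. Project.toml needs (verify the UUID against the
# General registry):
#
#   [weakdeps]
#   CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"
#
#   [extensions]
#   OpEngineCUDAExt = "CUDA"

module OpEngineCUDAExt

using OpEngine, CUDA

# GPU-specific method overloads live here, e.g. dispatching the tridiagonal
# diffusion solve to a CUSPARSE/CUSOLVER routine for CuArray state, while
# the generic AbstractArray code path stays in OpEngine itself.

end # module
```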

Dependencies

Cross-references

  • trident: primary consumer — GPU enables full resolution sweeps for the convergence studies
  • model-criticism / ModelCriticism.jl: Pareto front computation over solver settings benefits from batch GPU execution
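The batch-execution win for these consumers can be sketched with a hypothetical layout: stack N parameter sets along a third array dimension so one fused broadcast evaluates the whole ensemble, which on a GPU is a single kernel launch rather than N sequential runs:

```julia
# Hypothetical ensemble layout for the batch solver task; placeholder values.
N  = 64
U  = rand(300, 100, N)                  # trait bins × depth levels × ensemble
dU = similar(U)
R  = reshape(rand(300, N), 300, 1, N)   # per-ensemble, per-trait-bin rates
@. dU = R * U * (1 - U)                 # whole ensemble in one broadcast
```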

Boundary

This is solver acceleration — no changes to OpSystem.jl or the model specification layer.
