
QuEST #747 — Qureg checkpointing (save/load to file via ADIOS2) [unitaryHACK] #793

Open
mk0dz wants to merge 1 commit into QuEST-Kit:devel from mk0dz:feat/747-checkpointing

mk0dz commented Jun 13, 2026

/claim #747

The ask

Two new API functions in qureg.cpp:

void  saveQuregToFile(Qureg qureg, const char* fn);
Qureg createQuregFromFile(const char* fn);

Save numQubits, statevector/density type, and all amplitudes — but not
deployment details or derivable quantities. Handle GPU + distributed deployments
without excessive memory. Implement via ADIOS2, behind a CMake flag.

Design

| Concern | Approach |
| --- | --- |
| What's stored | Only numQubits, isDensityMatrix, and the global amplitude array (plus sizeof(qcomp) for a load-time precision check). No deployment flags, no numAmps. |
| Deployment-agnostic | ADIOS2 global array: each node writes its slice at offset util_getGlobalIndexOfFirstLocalAmp(qureg) with count numAmpsPerNode. A file saved on N ranks/GPU loads correctly on M ranks/CPU; the restored Qureg auto-deploys via createQureg/createDensityQureg. |
| GPU | syncQuregFromGpu before save, syncQuregToGpu after load: one host↔device copy of the already-resident local amps. |
| Distribution | adios2::ADIOS(comm_getMpiComm()) + adios2::cxx_mpi when QUEST_ENABLE_MPI; serial ADIOS2 otherwise. |
| All precisions | Amplitudes are stored as a raw int8 byte array, because ADIOS2 cannot represent long double (fp80) as a native type. fp32/fp64/fp80 all work uniformly; the recorded sizeof(qcomp) guards against loading a file into a mismatched-precision build. |
| Memory | Writes the already-in-memory local array directly; no full-state duplication. |
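
The deployment-agnostic row rests on simple index arithmetic, which the following minimal sketch illustrates without touching ADIOS2 or QuEST. `Slice`, `totalAmps`, `sliceForRank` and `slicesTileGlobalArray` are hypothetical names invented for this illustration (they are not the PR's code), and the sketch assumes the amplitudes are split evenly across a power-of-two number of ranks.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for illustration; not part of QuEST or ADIOS2.
struct Slice {
    std::uint64_t offset; // global index of this rank's first amplitude
    std::uint64_t count;  // number of amplitudes this rank holds
};

// Global amplitude count: 2^n for a statevector, 2^(2n) for a density matrix.
std::uint64_t totalAmps(int numQubits, bool isDensityMatrix) {
    return std::uint64_t{1} << (isDensityMatrix ? 2 * numQubits : numQubits);
}

// Assumption: the global array divides evenly between the ranks.
Slice sliceForRank(int numQubits, bool isDensityMatrix, int rank, int numRanks) {
    std::uint64_t total = totalAmps(numQubits, isDensityMatrix);
    std::uint64_t ranks = static_cast<std::uint64_t>(numRanks);
    assert(numRanks > 0 && rank >= 0 && rank < numRanks && total % ranks == 0);
    std::uint64_t perRank = total / ranks;
    return { perRank * static_cast<std::uint64_t>(rank), perRank };
}

// True when the per-rank slices cover [0, total) with no gaps or overlaps.
bool slicesTileGlobalArray(int numQubits, bool isDensityMatrix, int numRanks) {
    std::uint64_t next = 0;
    for (int r = 0; r < numRanks; ++r) {
        Slice s = sliceForRank(numQubits, isDensityMatrix, r, numRanks);
        if (s.offset != next) return false;
        next += s.count;
    }
    return next == totalAmps(numQubits, isDensityMatrix);
}
```

Because every rank count tiles the same global index range, a file written as N slices can be read back as M slices: only the (offset, count) pairs differ, not the array they address.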

Files changed (+207 lines, 6 files)

 CMakeLists.txt                  QUEST_ENABLE_CHECKPOINTING option, find_package(ADIOS2), link adios2::cxx[_mpi]
 quest/include/config.h.in       QUEST_COMPILE_CHECKPOINTING macro
 quest/include/qureg.h           saveQuregToFile / createQuregFromFile declarations
 quest/src/api/qureg.cpp         the two implementations (guarded by QUEST_COMPILE_CHECKPOINTING)
 quest/src/core/validation.cpp   not-compiled + precision-mismatch validators + report messages
 quest/src/core/validation.hpp   their declarations

When checkpointing isn't compiled, both functions report a clear user error
(validate_quregCheckpointingIsCompiled).

Build & verify

cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release -DQUEST_ENABLE_CHECKPOINTING=ON
cmake --build build --target QuEST
# round-trip test (scratch/test_747.cpp): save -> load -> assert exact amps
* ADIOS2 must be installed to build with the flag.

Results (this machine)

| case | params match | max amplitude error |
| --- | --- | --- |
| statevector (8q, CPU) | | 0.000e+00 |
| density matrix (5q, CPU) | | 0.000e+00 |
| statevector (8q, GPU) | | 0.000e+00 |
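
For readers wondering what "max amplitude error" measures, one plausible definition is the largest absolute difference between corresponding saved and loaded amplitudes. The sketch below assumes that definition; `maxAmplitudeError` is a hypothetical helper written for this note and is not taken from scratch/test_747.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Largest |saved[i] - loaded[i]| over all amplitudes.
// Hypothetical helper for illustration only.
template <typename Real>
Real maxAmplitudeError(const std::vector<std::complex<Real>>& saved,
                       const std::vector<std::complex<Real>>& loaded) {
    assert(saved.size() == loaded.size());
    Real worst = 0;
    for (std::size_t i = 0; i < saved.size(); ++i)
        worst = std::max(worst, std::abs(saved[i] - loaded[i]));
    return worst;
}
```

A byte-for-byte round trip should give exactly zero under this metric, which is consistent with the 0.000e+00 entries reported above.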

Built and verified with ADIOS2 2.12.1, and additionally in a combined
-DQUEST_ENABLE_CUDA=ON -DQUEST_ENABLE_CHECKPOINTING=ON build (GPU path).
Regression: existing QuEST test cases pass against the checkpoint build (the
changes are purely additive) — createQureg, createDensityQureg,
getQuregAmps, initRandomPureState: All tests passed (555 assertions, 4 cases).

Notes for review

  • The amplitude byte-blob is intentional (ADIOS2 has no long double); if you'd
    prefer typed std::complex storage for fp32/fp64 with a byte fallback only for
    fp80, that's an easy follow-up — say the word.
  • Distributed (MPI) path is written but not exercised here (no local MPI launcher);
    it follows ADIOS2's standard global-array idiom and the existing QuEST MPI plumbing.
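
To make the byte-blob trade-off in the first note concrete, here is a minimal in-memory sketch of storing amplitudes as opaque bytes alongside a recorded element size, then refusing to load when the size differs. `DemoCheckpoint`, `demoSave` and `demoLoad` are hypothetical stand-ins invented for this illustration; they use no ADIOS2 calls and are not the PR's implementation.

```cpp
#include <cassert>
#include <complex>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical in-memory stand-in for the checkpoint file's contents.
struct DemoCheckpoint {
    int numQubits;
    bool isDensityMatrix;
    std::uint64_t bytesPerAmp;      // plays the role of the recorded sizeof(qcomp)
    std::vector<std::int8_t> bytes; // amplitudes as an opaque byte array
};

// Copies the amplitudes into a byte array, whatever their precision.
template <typename Real>
DemoCheckpoint demoSave(int numQubits, bool isDensityMatrix,
                        const std::vector<std::complex<Real>>& amps) {
    DemoCheckpoint cp{numQubits, isDensityMatrix, sizeof(std::complex<Real>), {}};
    cp.bytes.resize(amps.size() * sizeof(std::complex<Real>));
    if (!amps.empty())
        std::memcpy(cp.bytes.data(), amps.data(), cp.bytes.size());
    return cp;
}

// Reinterprets the bytes as amplitudes, but only if the precision matches.
template <typename Real>
std::vector<std::complex<Real>> demoLoad(const DemoCheckpoint& cp) {
    if (cp.bytesPerAmp != sizeof(std::complex<Real>))
        throw std::runtime_error("checkpoint precision does not match this build");
    assert(cp.bytes.size() % sizeof(std::complex<Real>) == 0);
    std::vector<std::complex<Real>> amps(cp.bytes.size() / sizeof(std::complex<Real>));
    if (!amps.empty())
        std::memcpy(amps.data(), cp.bytes.data(), cp.bytes.size());
    return amps;
}
```

The design cost of this approach is that the stored array carries no type information of its own, so the element-size check is the only thing standing between a mismatched build and silently misread amplitudes.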

…OS2 (QuEST-Kit#747)

Optionally (-DQUEST_ENABLE_CHECKPOINTING=ON) persist a Qureg to disk and
restore it, for HPC job resilience. Stores only numQubits, isDensityMatrix and
the amplitudes (as a raw byte global-array, since ADIOS2 lacks long double),
plus sizeof(qcomp) for a load-time precision check. Deployment-agnostic via
ADIOS2's global array + util_getGlobalIndexOfFirstLocalAmp offsets, so a file
saved under one GPU/distribution scheme loads under another; one host<->device
copy handles GPU. Round-trip verified exact for CPU statevector, CPU density
matrix and GPU statevector.