Skip to content

[Misc] Reject ndarray struct passed to template-annotated kernel param#608

Draft
hughperkins wants to merge 3 commits intomainfrom
hp/reject-ndarray-in-template
Draft

[Misc] Reject ndarray struct passed to template-annotated kernel param#608
hughperkins wants to merge 3 commits intomainfrom
hp/reject-ndarray-in-template

Conversation

@hughperkins
Copy link
Copy Markdown
Collaborator

Issue: #

Brief Summary

copilot:summary

Walkthrough

copilot:walkthrough

@hughperkins hughperkins force-pushed the hp/reject-ndarray-in-template branch from ae92ccc to 2003fbb Compare May 1, 2026 18:37
Passing structs containing ndarrays through qd.template() bypasses
argument pruning — every ndarray field gets registered regardless of
whether the kernel uses it — and inflates the cached launch context,
causing a measured 42% launch overhead on real workloads.  Raise a
clear QuadrantsCompilationError guiding users to use a concrete
dataclass type annotation instead.

Co-authored-by: Cursor <cursoragent@cursor.com>
@hughperkins hughperkins marked this pull request as draft May 1, 2026 18:40
@hughperkins hughperkins force-pushed the hp/reject-ndarray-in-template branch from 2003fbb to 337795f Compare May 1, 2026 18:40
Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional findings (outside current diff — PR may have been updated during review):

  • 🟡 python/quadrants/lang/ast/ast_transformers/function_def_transformer.py:192-198 — This PR removes the only writer of ctx.global_context.struct_ndarray_launch_info (the inner _register_ndarray helper) and the only writer of ctx.global_context.ndarray_to_any_array (the cache[key] = arr line in the same helper), leaving roughly half a dozen named pieces of infrastructure as dead/unreachable code: _struct_ndarray_launch_info_by_key (kernel.py:324), the launch-time read at kernel.py:435-437, the entire stale-cache guard at kernel.py:466-481, the _set_struct_ndarray_args call at kernel.py:511-513, the static helpers _resolve_struct_ndarray (kernel.py:596) and _set_struct_ndarray_args (kernel.py:606), the struct_ndarray_launch_info field on ASTTransformerGlobalContext (ast_transformer_utils.py:185), _promote_ndarray_if_declared (ast_transformer.py:649), and the ndarray_to_any_array lookup branches in impl.py:212, function_def_transformer.py:276, and function_def_transformer.py:298. Not a functional bug — failing paths still fail and succeeding paths still succeed — but a meaningful chunk of code is now unreachable, and the comment at impl.py:207-208 ("AnyArray cache populated by _predeclare_struct_ndarrays during kernel compilation") is now actively misleading. Suggest removing this infrastructure as part of this PR or a follow-up before it lands.

    Extended reasoning...

    What is dead and why

    This PR replaces the body of _predeclare_struct_ndarrays with code that only raises QuadrantsCompilationError. The two helpers it removed — the inner _register_ndarray and the implicit cache[key] = arr / launch_info.append(...) writes — were the sole producers of two pieces of compile-context state:

    1. ctx.global_context.struct_ndarray_launch_info (initialized to [] in ast_transformer_utils.py:185)
    2. ctx.global_context.ndarray_to_any_array (initialized to {} in ast_transformer_utils.py:184)

    After this PR, grep confirms there are zero remaining writers to either. So both are permanently empty for the lifetime of any kernel compile.

    The infrastructure left dangling

    struct_ndarray_launch_info was consumed by an entire stale-cache-guard subsystem in Kernel:

    • kernel.py:324self._struct_ndarray_launch_info_by_key: dict[CompiledKernelKeyType, list] = {} is still initialized.
    • kernel.py:435-437 — at materialize-time, self._struct_ndarray_launch_info_by_key[key] = getattr(ctx.global_context, "struct_ndarray_launch_info", []). The getattr always returns [] now.
    • kernel.py:466-481 — the mutable-ndarray stale-cache guard. The outer if self._struct_ndarray_launch_info_by_key: (dict-truthy) is still True after at least one kernel materializes, but the inner if struct_nd_info: (list-truthy) at 468 is always False because the value is []. So self._mutable_nd_cached_val is always assigned [], and the if self._mutable_nd_cached_val: block at 477-481 is unreachable.
    • kernel.py:511-513struct_nd_info = self._struct_ndarray_launch_info_by_key.get(key) always returns [] (falsy), so _set_struct_ndarray_args is never called.
    • kernel.py:596 _resolve_struct_ndarray and kernel.py:606 _set_struct_ndarray_args — only invoked from kernel.py:480 and kernel.py:513 respectively, both unreachable per above. The static methods themselves are dead.
    • ast_transformer_utils.py:185 — the field initializer is now write-only-by-default and read-but-always-empty.

    ndarray_to_any_array was consumed at three sites:

    • impl.py:209-220subscript checks the cache to promote a bare Ndarray to an AnyArray proxy. The gc.ndarray_to_any_array.get(id(value)) lookup at 212 always returns None, so the only outcome for a bare Ndarray reaching this code path is the Ndarray ... not registered as a kernel parameter raise at 216-220.
    • ast_transformer.py:649-658_promote_ndarray_if_declared is now a no-op for every input (it always returns value unchanged).
    • function_def_transformer.py:276, 298_transform_func_arg looks up promoted ndarrays for qd.Tensor / dataclass-field cases. promoted is always None so the ternary always falls through to the original data/data_child.

    The misleading comment

    impl.py:207-208 still says:

    # Ndarray from struct template field: resolve via the AnyArray cache populated by _predeclare_struct_ndarrays
    # during kernel compilation.

    This comment is now actively wrong: _predeclare_struct_ndarrays no longer populates that cache (it raises). A future reader stepping through subscript will follow the comment expecting a cache hit, then be confused that get(...) returns None for every input.

    Step-by-step proof for one path

    Take the launch hot path for any kernel that compiles successfully under this PR:

    1. materialize runs _predeclare_struct_ndarrays, which either raises (template struct contains an ndarray) or returns silently without writing to either context dict.
    2. After successful materialization, kernel.py:435-437 runs: self._struct_ndarray_launch_info_by_key[key] = getattr(ctx.global_context, "struct_ndarray_launch_info", []) → assigns [].
    3. On launch_kernel, kernel.py:466 if self._struct_ndarray_launch_info_by_key: is True (dict has the key), so we enter the branch.
    4. kernel.py:467 struct_nd_info = self._struct_ndarray_launch_info_by_key.get(key)[].
    5. kernel.py:468 if struct_nd_info: is False (empty list), so we hit the else at 472 → self._mutable_nd_cached_val = [].
    6. kernel.py:477 if self._mutable_nd_cached_val: is False, so kernel.py:478-481 (the _resolve_struct_ndarray call site) is skipped.
    7. kernel.py:511 struct_nd_info = self._struct_ndarray_launch_info_by_key.get(key)[] again.
    8. kernel.py:512 if struct_nd_info: is False, so kernel.py:513 (the _set_struct_ndarray_args call site) is skipped.
    9. _resolve_struct_ndarray and _set_struct_ndarray_args therefore never execute. They are unreachable.

    Severity & fix

    This is not a behavioral bug — the kernel still launches correctly, and the new raise covers the path that this infrastructure used to serve. It is, however, a meaningful amount of unreachable code (one instance dict, one global-context field, one static helper, one set-args helper, one lookup helper, three call sites, and a misleading docstring comment) that AGENTS.md asks PRs to keep tidy. Suggested cleanup:

    • Delete _struct_ndarray_launch_info_by_key, _mutable_nd_cached_key, _mutable_nd_cached_val, _resolve_struct_ndarray, _set_struct_ndarray_args, and the kernel.py:435-437 / 466-481 / 511-513 read sites.
    • Delete ASTTransformerGlobalContext.struct_ndarray_launch_info and ndarray_to_any_array (or keep the latter and add a # always empty after PR #608 comment if you want to defer).
    • Delete _promote_ndarray_if_declared and its ndarray_to_any_array-using callers in function_def_transformer.py and the subscript lookup in impl.py — replace impl.py:209-220 with the bare raise since that is the only outcome now.
    • Either fix or delete the impl.py:207-208 comment.

    Either do this in this PR or open an immediate follow-up so the cleanup does not get lost.

  • 🟡 python/quadrants/lang/ast/ast_transformers/function_def_transformer.py:209-216 — Error message in _predeclare_struct_ndarrays hard-codes "annotated as qd.template()" and advises "Use a concrete struct annotation instead of qd.template()", but the entry guard at lines 238-240 also accepts qd.Tensor annotations. When a user writes x: qd.Tensor and passes a struct/dataclass containing a bare Ndarray attribute, the raised message references an annotation the user never wrote. Fix by formatting the wording from the actual arg_meta.annotation (e.g. emit "qd.Tensor" vs "qd.template()").

    Extended reasoning...

    What's wrong

    _predeclare_struct_ndarrays builds a list of parameters to walk using:

    is_template = anno is annotations.template or isinstance(anno, annotations.template)
    is_tensor_anno = anno is _TensorClass
    if not (is_template or is_tensor_anno):
        continue

    So both qd.template()-annotated and qd.Tensor-annotated parameters can reach _walk_obj. But the two new raise QuadrantsCompilationError(...) sites inside _walk_obj (the dataclass branch at L209-216 and the __dict__ branch at L226-233) hard-code:

    Kernel parameter '{param_name}' is annotated as qd.template(), but ...

    …and the remediation hint says:

    Use a concrete struct annotation (e.g. a @dataclass type hint) instead of qd.template() for struct parameters that contain ndarrays.

    When the user actually wrote x: qd.Tensor, both the diagnosis ("annotated as qd.template()") and the advice ("instead of qd.template()") are factually wrong and will confuse anyone debugging.

    Reachability — step-by-step proof

    1. User writes a kernel def k(x: qd.Tensor): ... and passes a dataclass instance whose field is a bare qd.ndarray (i.e. not a qd.Tensor wrapper).
    2. _template_mapper_hotpath._extract_arg routes non-Ndarray values for qd.Tensor annotations through _TENSOR_T_FIELD_MARKER, so the kernel proceeds to _predeclare_struct_ndarrays.
    3. In the loop, anno is _TensorClassis_tensor_anno=True. The guard passes.
    4. val = ctx.py_args[i] is the dataclass; it is not a _TensorClass (so no unwrap), not an _ndarray.Ndarray (so no continue), is a dataclass ⇒ _walk_obj(val, i, ()) runs.
    5. _walk_obj finds the Ndarray field and raises with the hard-coded "annotated as qd.template()" message — even though the user wrote qd.Tensor.

    Addressing the refutation

    One verifier argued the qd.Tensor branch is unreachable in practice, claiming a non-Tensor struct passed to a qd.Tensor-annotated arg would error elsewhere first. That objection only covers the Tensor → Field._unwrap() sub-path (where Field's __dict__ indeed contains no Ndarrays). It does not cover the more direct path above, where a plain dataclass-with-ndarray value is passed to x: qd.Tensor and routes through _TENSOR_T_FIELD_MARKER. Two other verifiers traced exactly this path and confirmed reachability. Even if reachability were marginal, the is_tensor_anno branch was deliberately added to the guard — emitting a wrong-annotation message in any branch the guard explicitly admits is a self-inconsistency.

    Impact

    Error-message correctness only; no behavioral or correctness regression. Just user confusion in the qd.Tensor case.

    Fix

    Compute an annotation string once per param (e.g. anno_str = "qd.Tensor" if is_tensor_anno else "qd.template()") and pass it into _walk_obj, then format both raise sites with that string. The remediation hint should likewise be conditional (or simply drop the "instead of qd.template()" clause).

    🔬 also observed by verify-runtime

  • 🔴 python/quadrants/lang/ast/ast_transformers/function_def_transformer.py:207-216 — The new rejection in _predeclare_struct_ndarrays fires for qd.Tensor-annotated dataclass fields whose runtime value is NDARRAY-backed, regressing the previously supported pattern exercised by tests/python/test_tensor_annotation_in_func.py::test_tensor_struct_field_kernel_data_oriented[ndarray] and ::test_tensor_struct_vector_field_roundtrip[ndarray] (both call qd.tensor(..., backend=qd.Backend.NDARRAY) and stash the wrapper on a frozen dataclass passed via qd.template()). The check unwraps the Tensor wrapper before the isinstance(_ndarray.Ndarray) test, so the wrapper is indistinguishable from a raw qd.ndarray. Fix: skip when the value was a qd.Tensor wrapper (capture wrap state before _unwrap()), or only reject when the field is a bare qd.ndarray. The same flaw exists in the non-dataclass __dict__ branch.

    Extended reasoning...

    What the bug is

    FunctionDefTransformer._predeclare_struct_ndarrays now rejects any template-annotated struct that contains an Ndarray impl. Inside _walk_obj (lines 207-216), the code does:

    child = getattr(obj, field.name)
    if isinstance(child, _TensorClass):
        child = child._unwrap()
    if isinstance(child, _ndarray.Ndarray):
        raise QuadrantsCompilationError(...)

    The order is the problem: _unwrap is called before the isinstance(Ndarray) test, so a qd.Tensor wrapper holding an Ndarray impl is treated identically to a bare qd.ndarray.

    Why qd.Tensor was previously fine

    qd.tensor(dtype, shape=..., backend=Backend.NDARRAY) returns _wrap_impl(impl.ndarray(...)) (python/quadrants/_tensor.py:228), i.e. a Tensor wrapper whose _impl is a ScalarNdarray / VectorNdarray / MatrixNdarray (all subclasses of _ndarray.Ndarray). Pre-PR, this path went through _register_ndarray and worked via ndarray_to_any_array resolution. The FIELD-backend variant accidentally still works post-PR only because Tensor._unwrap() of a FIELD-backed tensor returns a Field (not an Ndarray), so the new isinstance check skips it. The new rejection therefore silently changes the meaning of a qd.Tensor-typed dataclass field based on the runtime backend the caller picked — clearly unintended.

    Step-by-step proof using an existing test

    tests/python/test_tensor_annotation_in_func.py:56 (test_tensor_struct_field_kernel_data_oriented parametrized with backend=Backend.NDARRAY):

    1. t = qd.tensor(qd.i32, shape=(6,), backend=qd.Backend.NDARRAY)_wrap_impl(impl.ndarray(...)) → a Tensor wrapper, t._impl is a ScalarNdarray.
    2. @dataclasses.dataclass(frozen=True) class S: vals: qd.Tensor; s = S(vals=t).
    3. @qd.kernel def fill(st: qd.template()): ...; fill(s).
    4. In _predeclare_struct_ndarrays: anno = qd.template()is_template = True; val = s is a dataclass → _walk_obj(s, 0, ()) runs.
    5. Field valschild = t (_TensorClass); child = child._unwrap() returns the ScalarNdarray impl.
    6. isinstance(child, _ndarray.Ndarray) is True → raises QuadrantsCompilationError.

    The same chain breaks test_tensor_struct_vector_field_roundtrip[ndarray] (tests/python/test_tensor_annotation_in_func.py:226) and any other [ndarray] parametrization that stuffs an NDARRAY-backed qd.Tensor into a dataclass / data_oriented struct passed via qd.template(). The PR does not modify the test file, so CI on the NDARRAY parametrizations regresses.

    Inconsistency that confirms the bug is unintended

    At the top level in this same function (lines 242-246) the code explicitly handles a Tensor wrapper holding an Ndarray and continues without raising — the rejection only fires inside structs. There is no good reason for the same wrapper-around-Ndarray to be allowed at the top level but rejected one level of nesting deeper.

    Impact and fix

    Impact: blocks the PR, since two existing tests fail on the [ndarray] parametrization, and any user code following the documented qd.Tensor-as-struct-field pattern (pinned as a stork-23 invariant in the test module's docstring) breaks based on the runtime backend choice.

    Fix: distinguish qd.Tensor wrappers from raw qd.ndarray by capturing the wrap state before unwrapping, e.g.:

    child = getattr(obj, field.name)
    was_tensor_wrapper = isinstance(child, _TensorClass)
    if was_tensor_wrapper:
        child = child._unwrap()
    if not was_tensor_wrapper and isinstance(child, _ndarray.Ndarray):
        raise QuadrantsCompilationError(...)

    The same change is needed in the non-dataclass __dict__ branch (lines 226-233). Equivalently, you could test field.type is qd.Tensor and only reject when the annotation is qd.ndarray. Either preserves the user's annotation-level intent rather than gating the check on the runtime backend.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 1, 2026

Coverage Report (a44badcdd)

File Coverage Missing
🟢 python/quadrants/lang/ast/ast_transformers/function_def_transformer.py 100%
🟢 tests/python/test_tensor_annotation_in_func.py 100%

Diff coverage: 100% · Overall: 67% · 31 lines, 0 missing

Full annotated report

The 5 ndarray-backend variants of struct-via-template tests now assert
that passing a struct containing ndarrays through qd.template() raises
QuadrantsCompilationError.  Field-backend variants are unchanged.

Co-authored-by: Cursor <cursoragent@cursor.com>
@hughperkins hughperkins force-pushed the hp/reject-ndarray-in-template branch from a44badc to 733a160 Compare May 2, 2026 05:21
@hughperkins
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 22d4341ceb

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +236 to +241
raise QuadrantsCompilationError(
f"Kernel parameter '{param_name}' is annotated as qd.template(), but "
f"'{param_name}.{attr_path}' is a qd.ndarray. Passing ndarrays through "
f"template structs is not supported because it bypasses argument pruning "
f"and degrades launch performance. Use a concrete struct annotation "
f"(e.g. a @dataclass type hint) instead of qd.template() for struct "
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Document new template-struct ndarray compile failure

This change introduces a user-visible API behavior change by turning previously valid qd.template() struct usage with qd.ndarray fields into a hard QuadrantsCompilationError in _predeclare_struct_ndarrays, but the commit does not update any user docs under docs/. Per AGENTS.md ("Keep user-facing docs in sync with public API changes"), this should be documented so users understand the new restriction and migration path.

Useful? React with 👍 / 👎.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 2, 2026

Coverage Report (22d4341ce)

File Coverage Missing
🟢 python/quadrants/lang/ast/ast_transformers/function_def_transformer.py 100%
🟢 tests/python/test_tensor_annotation_in_func.py 100%

Diff coverage: 100% · Overall: 67% · 31 lines, 0 missing

Full annotated report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant