Skip to content

refactor(trace): pass scalar Trace internal args by value and unify naming#861

Closed
IvanaGyro wants to merge 1 commit into
claude/stride-viewfrom
claude/trace-sig-refactor
Closed

refactor(trace): pass scalar Trace internal args by value and unify naming#861
IvanaGyro wants to merge 1 commit into
claude/stride-viewfrom
claude/trace-sig-refactor

Conversation

@IvanaGyro
Copy link
Copy Markdown
Collaborator

Stacked on #849 → stride_view. Base is claude/stride-view (the new stride_view PR). This PR's diff is only the Trace signature refactor. Review/merge #849 and stride_view first, then this PR retargets to master.

Summary

The Trace internal dispatchers (Trace_internal_*, cuTrace_internal_*) and the Tracefunc_oii function-pointer typedef took their scalar arguments (is_2d, Ndiag, Nelem, ax1, ax2) by const reference. Passing scalars by const reference adds an indirection without benefit; the convention in this codebase is to take small parameters by value. Vector and Tensor arguments stay by const reference.

Several dispatchers also named the same parameters inconsistently (tn, ndiag, nelem in the integer-typed overloads versus Tn, Ndiag, Nelem elsewhere). Unify on Tn / Ndiag / Nelem across every overload so the signatures read uniformly.

Scope is strictly the Trace API surface: only Tracefunc_oii and the eleven Trace dispatcher signatures change. Other linalg_internal typedefs are left untouched.

Why split

This is a pure mechanical refactor that's easy to review on its own; landing it first keeps the follow-up "compute Trace in the container layer" PR focused on the algorithm rather than mixing in cosmetic signature churn.

Test plan

  • openblas-cpu: full suite 955/955 passed (includes fix(sum): use pairwise summation for floating-point linalg::Sum #849's re-enabled Sum-accuracy tests + new StrideView tests from the stride_view PR; no Trace behavior change so the existing DenseUniTensorTest.Trace and friends still pass).
  • mkl-cpu: full suite 955/955 passed.
  • pre-commit (clang-format-14 + EOF + whitespace) clean.

Draft — opened for review.


Generated by Claude Code

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the CPU and GPU trace internal functions by changing the parameter passing of cheap-to-copy types (such as bool and cytnx_uint64) from const references to pass-by-value, as well as standardizing parameter casing. The review feedback highlights several instances of unused parameters in functions like _trace_nd, _trace_nd_gpu, and Trace_internal_b, and suggests commenting out or omitting their names to prevent potential compiler warnings.

Comment on lines +24 to +27
void _trace_nd(Tensor &out, const Tensor &Tn, cytnx_uint64 Ndiag, cytnx_uint64 Nelem,
const std::vector<cytnx_uint64> &accu,
const std::vector<cytnx_uint64> &remain_rank_id,
const std::vector<cytnx_int64> &shape, const cytnx_uint64 &ax1,
const cytnx_uint64 &ax2) {
const std::vector<cytnx_int64> &shape, cytnx_uint64 ax1, cytnx_uint64 ax2) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The parameters Nelem, accu, remain_rank_id, and shape are completely unused in the _trace_nd function (since the direct indexing loop is commented out). To prevent compiler warnings (such as -Wunused-parameter), we can comment out or omit the parameter names in the function definition while keeping the signature intact for the dispatchers.

Suggested change
void _trace_nd(Tensor &out, const Tensor &Tn, cytnx_uint64 Ndiag, cytnx_uint64 Nelem,
const std::vector<cytnx_uint64> &accu,
const std::vector<cytnx_uint64> &remain_rank_id,
const std::vector<cytnx_int64> &shape, const cytnx_uint64 &ax1,
const cytnx_uint64 &ax2) {
const std::vector<cytnx_int64> &shape, cytnx_uint64 ax1, cytnx_uint64 ax2) {
void _trace_nd(Tensor &out, const Tensor &Tn, cytnx_uint64 Ndiag, cytnx_uint64 /*Nelem*/,
const std::vector<cytnx_uint64> &/*accu*/,
const std::vector<cytnx_uint64> &/*remain_rank_id*/,
const std::vector<cytnx_int64> &/*shape*/, cytnx_uint64 ax1, cytnx_uint64 ax2) {

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is resolved by the stacked design rather than this commit. In #862, _trace_nd is replaced entirely by a single TraceImpl<T> that uses every parameter it derives (it computes the strided base offset and reduces the diagonal in place), so there is no commented-out body and no unused parameter to silence. This PR (#861) is only the signature/naming refactor of the pre-existing Contract-based bodies; on its own branch I omitted the unused names in the stub/2d paths (see f894e02), and the ND body that still had the commented-out loop is rewritten in the next PR.


Generated by Claude Code

Comment on lines +172 to +176
void Trace_internal_b(bool is_2d, Tensor &out, const Tensor &Tn, cytnx_uint64 Ndiag,
cytnx_uint64 Nelem, const std::vector<cytnx_uint64> &accu,
const std::vector<cytnx_uint64> &remain_rank_id,
const std::vector<cytnx_int64> &shape, const cytnx_uint64 &ax1,
const cytnx_uint64 &ax2) {
const std::vector<cytnx_int64> &shape, cytnx_uint64 ax1,
cytnx_uint64 ax2) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The Trace_internal_b function only throws an error and does not use any of its parameters. Commenting out or omitting the parameter names prevents potential -Wunused-parameter compiler warnings.

Suggested change
void Trace_internal_b(bool is_2d, Tensor &out, const Tensor &Tn, cytnx_uint64 Ndiag,
cytnx_uint64 Nelem, const std::vector<cytnx_uint64> &accu,
const std::vector<cytnx_uint64> &remain_rank_id,
const std::vector<cytnx_int64> &shape, const cytnx_uint64 &ax1,
const cytnx_uint64 &ax2) {
const std::vector<cytnx_int64> &shape, cytnx_uint64 ax1,
cytnx_uint64 ax2) {
void Trace_internal_b(bool /*is_2d*/, Tensor &/*out*/, const Tensor &/*Tn*/, cytnx_uint64 /*Ndiag*/,
cytnx_uint64 /*Nelem*/, const std::vector<cytnx_uint64> &/*accu*/,
const std::vector<cytnx_uint64> &/*remain_rank_id*/,
const std::vector<cytnx_int64> &/*shape*/, cytnx_uint64 /*ax1*/,
cytnx_uint64 /*ax2*/) {

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in f894e02Trace_internal_b (the bool stub that only raises an unsupported-type error) now has all parameter names commented out: void Trace_internal_b(Tensor & /*out*/, const Tensor & /*Tn*/, const bool /*is_2d*/, ...), so -Wunused-parameter stays quiet while the shared dispatch signature is preserved.


Generated by Claude Code

Comment on lines +25 to +28
void _trace_nd_gpu(Tensor &out, const Tensor &Tn, cytnx_uint64 Ndiag, cytnx_uint64 Nelem,
const std::vector<cytnx_uint64> &accu,
const std::vector<cytnx_uint64> &remain_rank_id,
const std::vector<cytnx_int64> &shape, const cytnx_uint64 &ax1,
const cytnx_uint64 &ax2) {
const std::vector<cytnx_int64> &shape, cytnx_uint64 ax1, cytnx_uint64 ax2) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The parameters Nelem, accu, remain_rank_id, and shape are completely unused in the _trace_nd_gpu function. Commenting out or omitting the parameter names in the function definition prevents potential -Wunused-parameter compiler warnings while keeping the signature intact for the dispatchers.

    void _trace_nd_gpu(Tensor &out, const Tensor &Tn, cytnx_uint64 Ndiag, cytnx_uint64 /*Nelem*/,
                       const std::vector<cytnx_uint64> &/*accu*/,
                       const std::vector<cytnx_uint64> &/*remain_rank_id*/,
                       const std::vector<cytnx_int64> &/*shape*/, cytnx_uint64 ax1, cytnx_uint64 ax2) {

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the CPU case: in this PR (#861) the GPU _trace_nd_gpu is still the Contract-based body and does read its parameters, so there's nothing unused here yet. The next PR (#850) replaces it with the TraceNdKernel launcher, where the host helper now decodes the layout on the device and uses every derived value — no unused parameter remains. The stub cuTrace_internal_b does get its names commented out in f894e02.


Generated by Claude Code

Copy link
Copy Markdown
Collaborator Author

@IvanaGyro IvanaGyro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is so trivial. It should be merged into #862 and #850

Copy link
Copy Markdown
Collaborator Author

@IvanaGyro IvanaGyro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Review written by Claude on behalf of @IvanaGyro.)

LGTM. This is a behavior-preserving mechanical refactor: scalar/bool args go from const& to by-value (the right call for cytnx_uint64/bool — passing a primitive by const-ref is pure overhead), and the tn/ndiag/nelem names are unified to Tn/Ndiag/Nelem to match the rest of the family. The CPU .cpp/.hpp, GPU .cu/.hpp, and the Tracefunc_oii typedef all move in lockstep, so the dispatch table stays type-consistent.

No new tests needed — the trace bodies are unchanged and the existing *UniTensorTest.Trace / gpu_Trace suites cover the surface. Verified the cumulative stack (through #850) builds and all Trace tests pass on both openblas-cuda and mkl-cuda.

One nit, non-blocking: the PR title here is the sig-refactor, but note the GPU body is still the Contract-based implementation at this point in the stack — the actual CUDA kernels arrive in #850. That's fine for a stacked series; just flagging so the commit message isn't read as "GPU kernels landed here."

@IvanaGyro IvanaGyro force-pushed the claude/trace-sig-refactor branch from dd601ed to 92ae99d Compare May 31, 2026 08:37
@IvanaGyro IvanaGyro force-pushed the claude/stride-view branch from d85fd93 to 1e86067 Compare May 31, 2026 12:18
…aming

The Trace internal dispatchers (Trace_internal_*, cuTrace_internal_*) and the
Tracefunc_oii function-pointer typedef took their scalar arguments (is_2d,
Ndiag, Nelem, ax1, ax2) by const reference. Passing scalars by const reference
adds an indirection without benefit; the convention in this codebase is to
take small parameters by value. Vector and Tensor arguments remain by const
reference.

Several dispatchers also named the same parameters inconsistently
(tn, ndiag, nelem in the integer-typed overloads versus Tn, Ndiag, Nelem
elsewhere). Unify on Tn/Ndiag/Nelem across every overload so the signatures
read uniformly.

This commit is scoped to the Trace API surface; other linalg_internal
typedefs are left untouched.

Co-authored-by: Claude <noreply@anthropic.com>
@pcchen pcchen added this to the v1.1.0 milestone Jun 1, 2026
@IvanaGyro IvanaGyro force-pushed the claude/trace-sig-refactor branch from 92ae99d to f894e02 Compare June 2, 2026 02:11
Copy link
Copy Markdown
Collaborator Author

Acting on your "merge into #862 and #850" — done.

Rebased the chain so #861 is dropped: stride-viewtrace-cpu (#862) → gallant-franklin-4g4rY (#850). The signature-and-naming refactor that was #861 is now absorbed into #862's first commit on the CPU side (Tracefunc_oii typedef + dispatcher signatures end at the simplified Tensor(const Tensor&, cytnx_uint64, cytnx_uint64) shape directly), and into #850 on the GPU side (cuTrace_internal.cu/.hpp move straight from the old Contract-based dispatchers to the new kernel-based ones with the simplified signature). Verified openblas-cpu 965/965 on both rebased branches; cuTrace_internal.cu re-verified with nvcc -c (zero errors).

Closing #861 now.


Generated by Claude Code

@IvanaGyro IvanaGyro closed this Jun 2, 2026
@IvanaGyro IvanaGyro deleted the claude/trace-sig-refactor branch June 2, 2026 17:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants