Skip to content

fix(graph): classify multi-assigned instance field as field-reassign, not uid-collision#525

Merged
coseto6125 merged 1 commit into
mainfrom
fix/field-reassign-not-collision
May 31, 2026
Merged

fix(graph): classify multi-assigned instance field as field-reassign, not uid-collision#525
coseto6125 merged 1 commit into
mainfrom
fix/field-reassign-not-collision

Conversation

@coseto6125
Copy link
Copy Markdown
Owner

Problem

An instance field assigned in more than one place — self.x = … in __init__ then again in another method (Python), this.x = … (JS/TS), @x = … (Ruby), a Rust struct field re-set, etc. — emits two RawNodes with identical (kind=Property, path, owner, name), i.e. a uid collision. The second is correctly deduped to one node.

But this is the most common OO idiom, not a parser bug. It was classified as uid-collision ("true parser ambiguity worth investigating") and recorded a BlindSpot, which made ecp inspect / ecp impact report traversal may be incomplete for ordinary code — a false "no data" signal that misleads the consuming LLM.

Found via dogfooding

On a real repo, ecp inspect --name warmup_bot reported:

uid-collision: first=Property…CacheManager_refresh_task second=Property…CacheManager_refresh_task
note: traversal may be incomplete

_refresh_task is a single field set in __init__ (line 89) and reassigned in warmup_bot (line 235). Nothing is ambiguous; the graph has exactly one node for it. The BlindSpot was pure noise that flagged the traversal as unreliable.

Fix

classify_collision (the shared collision classifier in resolution/builder.rs) now returns field-reassign for a non-C-family Property self-collision, and the build dedup path records no BlindSpot for that kind — the dedup loses no information (both nodes ARE the same field). The tombstone still runs, so node-position alignment is unchanged.

Untouched: real collisions still report uid-collision (Variable/Impl/Class duplicates), C-family Property stays ifdef-redef (header redefinition, not an OO field), method overloads stay method-overload.

Why not "add line to the uid"

Including the assignment line in the uid would split one field into N distinct nodes — scattering its references across duplicates and making ecp impact less complete, not more. The uid (kind, path, owner, name) correctly expresses "one class's one field = one identity"; the bug was in how the resulting self-collision was classified, not in the identity.

Tests

  • New field_reassign_not_collision.rs: reassigned field dedups to one node, emits no uid-collision BlindSpot, and a non-field duplicate Variable still reports uid-collision (guards against over-suppression).
  • Unit classify_non_c_property_is_field_reassign (py/js/Rust → field-reassign; C-family → ifdef-redef); updated the existing fall-through test (its old assumption that Property falls through to uid-collision was the bug).
  • Full ecp-analyzer suite green.
  • E2E: re-indexed a real repo; warmup_bot no longer emits the collision BlindSpot and get_for_search impact no longer flags traversal-incomplete.

Language-neutral: the fix is in the shared classifier, covering every OO language's field-reassignment at once — the 14-language dimension is satisfied at the classification layer rather than per-parser.

… not uid-collision

An instance field assigned in more than one place — `self.x = …` in
`__init__` then again in another method (Python), `this.x = …` (JS/TS),
`@x = …` (Ruby), Rust struct field re-set, etc. — emits two RawNodes
with identical (kind=Property, path, owner, name), i.e. a uid collision.
The second is correctly deduped to one node, but this is the most common
OO idiom, NOT a parser bug.

Previously it classified as `uid-collision` ("true parser ambiguity worth
investigating") and recorded a BlindSpot. That BlindSpot made
`ecp inspect`/`ecp impact` report `traversal may be incomplete` for
ordinary code (e.g. CacheManager._refresh_task, set in __init__ and
warmup_bot) — false "no data" signal that misleads the consuming LLM.

`classify_collision` now returns `field-reassign` for a non-C-family
`Property` self-collision, and the build dedup path records NO BlindSpot
for it (the dedup loses no information — both nodes ARE the same field).
The tombstone still runs, so node-position alignment is unchanged.
Real collisions (Variable/Impl/Class duplicates, C-family ifdef-redef,
method overloads) are untouched.

Language-neutral: the fix lives in the shared collision classifier, so it
covers every OO language's field-reassignment at once. Tests drive
classify_collision + GraphBuilder directly (py/js/Rust fixtures) and
assert real non-field duplicates still report uid-collision.
@coseto6125 coseto6125 enabled auto-merge (squash) May 31, 2026 01:52
@coseto6125 coseto6125 added the merge-queue Opt-in to Mergify merge queue label May 31, 2026
@github-actions
Copy link
Copy Markdown
Contributor

ecp impact cache (0 symbols) — internal, used by ecp dev pr-analyze

[]

@github-actions github-actions Bot added the ecp:risk-low ecp signal label May 31, 2026
@coseto6125 coseto6125 merged commit 9911cef into main May 31, 2026
18 checks passed
@coseto6125 coseto6125 deleted the fix/field-reassign-not-collision branch May 31, 2026 02:17
coseto6125 added a commit that referenced this pull request May 31, 2026
The v0.6.2 section was generated by the release-PR (#524) bump before
#523/#525/#526 merged, so it listed only #520/#521/#522. Backfill the
three before tagging v0.6.2 so the release notes match what the tag
actually ships:
- #525 field-reassign collision fix (Bug Fixes)
- #523 cypher prop-filter refactor (Refactor)
- #526 `--file` flag rename (Chore)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ecp:risk-low ecp signal merge-queue Opt-in to Mergify merge queue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant