fix(graph): classify multi-assigned instance field as field-reassign, not uid-collision#525
Merged
Merged
Conversation
… not uid-collision
An instance field assigned in more than one place — `self.x = …` in
`__init__` then again in another method (Python), `this.x = …` (JS/TS),
`@x = …` (Ruby), Rust struct field re-set, etc. — emits two RawNodes
with identical (kind=Property, path, owner, name), i.e. a uid collision.
The second is correctly deduped to one node, but this is the most common
OO idiom, NOT a parser bug.
Previously it classified as `uid-collision` ("true parser ambiguity worth
investigating") and recorded a BlindSpot. That BlindSpot made
`ecp inspect`/`ecp impact` report `traversal may be incomplete` for
ordinary code (e.g. CacheManager._refresh_task, set in __init__ and
warmup_bot) — false "no data" signal that misleads the consuming LLM.
`classify_collision` now returns `field-reassign` for a non-C-family
`Property` self-collision, and the build dedup path records NO BlindSpot
for it (the dedup loses no information — both nodes ARE the same field).
The tombstone still runs, so node-position alignment is unchanged.
Real collisions (Variable/Impl/Class duplicates, C-family ifdef-redef,
method overloads) are untouched.
Language-neutral: the fix lives in the shared collision classifier, so it
covers every OO language's field-reassignment at once. Tests drive
classify_collision + GraphBuilder directly (py/js/Rust fixtures) and
assert real non-field duplicates still report uid-collision.
Contributor
ecp impact cache (0 symbols) — internal, used by
|
coseto6125
added a commit
that referenced
this pull request
May 31, 2026
The v0.6.2 section was generated by the release-PR (#524) bump before #523/#525/#526 merged, so it listed only #520/#521/#522. Backfill the three before tagging v0.6.2 so the release notes match what the tag actually ships: - #525 field-reassign collision fix (Bug Fixes) - #523 cypher prop-filter refactor (Refactor) - #526 `--file` flag rename (Chore)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
An instance field assigned in more than one place —
self.x = …in__init__then again in another method (Python),this.x = …(JS/TS),@x = …(Ruby), a Rust struct field re-set, etc. — emits two RawNodes with identical(kind=Property, path, owner, name), i.e. a uid collision. The second is correctly deduped to one node.But this is the most common OO idiom, not a parser bug. It was classified as
uid-collision("true parser ambiguity worth investigating") and recorded a BlindSpot, which madeecp inspect/ecp impactreporttraversal may be incompletefor ordinary code — a false "no data" signal that misleads the consuming LLM.Found via dogfooding
On a real repo,
ecp inspect --name warmup_botreported:_refresh_taskis a single field set in__init__(line 89) and reassigned inwarmup_bot(line 235). Nothing is ambiguous; the graph has exactly one node for it. The BlindSpot was pure noise that flagged the traversal as unreliable.Fix
classify_collision(the shared collision classifier inresolution/builder.rs) now returnsfield-reassignfor a non-C-familyPropertyself-collision, and the build dedup path records no BlindSpot for that kind — the dedup loses no information (both nodes ARE the same field). The tombstone still runs, so node-position alignment is unchanged.Untouched: real collisions still report
uid-collision(Variable/Impl/Class duplicates), C-familyPropertystaysifdef-redef(header redefinition, not an OO field), method overloads staymethod-overload.Why not "add line to the uid"
Including the assignment line in the uid would split one field into N distinct nodes — scattering its references across duplicates and making
ecp impactless complete, not more. The uid(kind, path, owner, name)correctly expresses "one class's one field = one identity"; the bug was in how the resulting self-collision was classified, not in the identity.Tests
field_reassign_not_collision.rs: reassigned field dedups to one node, emits nouid-collisionBlindSpot, and a non-field duplicate Variable still reportsuid-collision(guards against over-suppression).classify_non_c_property_is_field_reassign(py/js/Rust → field-reassign; C-family → ifdef-redef); updated the existing fall-through test (its old assumption that Property falls through touid-collisionwas the bug).ecp-analyzersuite green.warmup_botno longer emits the collision BlindSpot andget_for_searchimpact no longer flags traversal-incomplete.Language-neutral: the fix is in the shared classifier, covering every OO language's field-reassignment at once — the 14-language dimension is satisfied at the classification layer rather than per-parser.