Summary
In C++, a method call whose receiver is another call's result loses the
receiver's type during extraction. The call is recorded as a bare method name,
so resolution can't tell which class it belongs to. When two classes share a
method name, the call silently resolves to whichever class was indexed first
— or doesn't resolve at all. This corrupts callers, callees, impact, and
trace, and (worse than a miss) does so silently, with a plausible-looking
wrong edge.
This hits some of the most common C++ idioms: singletons (Foo::instance()),
factories, and chained getters.
Reproduction
Two classes share the method name writeLog. Logger sorts first, so it wins
any name-only tie. Every call below targets Metrics.
logger.hpp / metrics.hpp:
class Logger { public: static Logger& instance(); void writeLog(const std::string&); };
class Metrics { public: static Metrics& instance(); void writeLog(const std::string&); };
(+ out-of-line definitions in a .cpp)
app.cpp:
void a() { Metrics::instance().writeLog("x"); } // chained singleton
void b() { auto& m = Metrics::instance(); m.writeLog("x"); } // stored in auto
Index, then query callers of each writeLog.
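For convenience, the snippets above can be folded into a single translation unit. This is only a sketch: the counters() helper and the inline method bodies are additions for illustration (they are not in the original headers), included so the runtime target of each call is observable.

```cpp
#include <string>

// Illustration-only call counters; not part of the reported code.
struct Counters { int logger = 0; int metrics = 0; };
inline Counters& counters() { static Counters c; return c; }

class Logger {
public:
    static Logger& instance() { static Logger l; return l; }
    void writeLog(const std::string&) { ++counters().logger; }
};

class Metrics {
public:
    static Metrics& instance() { static Metrics m; return m; }
    void writeLog(const std::string&) { ++counters().metrics; }
};

void a() { Metrics::instance().writeLog("x"); }               // chained singleton
void b() { auto& m = Metrics::instance(); m.writeLog("x"); }  // stored in auto
```

Calling a() and b() once each increments only the Metrics counter, which is the ground truth the call graph should reflect.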
Expected
Both a and b are callers of Metrics::writeLog.
Actual (0.9.8 / main)
a resolves to Logger::writeLog ❌ (mis-attributed to the first-indexed
same-named method).
b produces no calls edge to any writeLog ❌ (the auto receiver type
is never recovered).
Only the explicitly-typed form Metrics& m = Metrics::instance(); m.writeLog()
resolves correctly today.
Why it matters
CodeGraph's value is letting an agent answer flow/impact questions without
reading source. A silently wrong caller edge is worse than a missing one: the
agent trusts a plausible-but-incorrect relationship. And the trigger is
universal — any codebase with two same-named methods (extremely common) hits it
the moment a singleton/factory call is involved.
Root cause
For Foo::instance().bar(), tree-sitter produces a field_expression whose
receiver is the inner Foo::instance() call. The call extractor only keeps a
receiver when it's a plain identifier, so a call-expression receiver is dropped
and the callee degrades to the bare name bar. Resolution then falls back to
name matching, which ties between same-named methods and picks the first-indexed
one. C++ also captures no return-type information today (method signatures are
empty), so there's nothing to recover the receiver's type from.
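The two steps described above can be modeled in a few lines. This is a simplified illustration, not the actual extractor or resolver: Expr, recordedCallee, and resolveByName are hypothetical names, and the index is reduced to a list of qualified method names in indexing order.

```cpp
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for a parsed receiver expression.
struct Expr {
    enum class Kind { Identifier, Call } kind;
    std::string text;  // identifier name, or callee text for a call
};

// Models the reported extraction rule: the receiver survives only when it is
// a plain identifier; any other receiver kind is dropped.
std::string recordedCallee(const Expr& receiver, const std::string& method) {
    if (receiver.kind == Expr::Kind::Identifier) {
        return receiver.text + "." + method;  // receiver kept
    }
    return method;  // call-expression receiver dropped: bare name only
}

// Models the name-only fallback: the first indexed method whose name matches
// wins, regardless of which class the call actually targets.
std::string resolveByName(const std::vector<std::string>& indexed,
                          const std::string& bare) {
    const std::string suffix = "::" + bare;
    for (const std::string& qualified : indexed) {
        if (qualified.size() >= suffix.size() &&
            qualified.compare(qualified.size() - suffix.size(),
                              suffix.size(), suffix) == 0) {
            return qualified;
        }
    }
    return "";  // no candidate at all
}
```

With the index ordered as Logger::writeLog, Metrics::writeLog, a receiver of kind Call yields the bare name writeLog, and the fallback then returns Logger::writeLog, which matches the mis-attribution seen for a.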
Scope / proposed direction
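Resolve the receiver by what the inner call returns, which requires first
capturing C++ return types during extraction. In order of value:
- Foo::instance().bar(), Foo::getInstance()->bar() (any accessor name, not just instance).
- WidgetFactory::create().draw() resolves on Widget, not WidgetFactory.
- openSession()->run().
- auto local first, plus new / std::make_unique / std::make_shared / casts / direct construction.
- manager.view().render().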
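Resolving the receiver from the inner call's return type might look roughly like the following. Everything here is an assumption made for illustration: the ReturnTypes table, stripToClassName, and resolveChained are hypothetical names, and the type stripping is deliberately naive (no templates, aliases, or namespaces).

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical table filled during extraction:
// qualified function name -> declared return type as written in source.
using ReturnTypes = std::map<std::string, std::string>;

// Reduce a declared return type such as "const Metrics&" or "Session*"
// to a bare class name by dropping a leading const and trailing &, *, spaces.
std::string stripToClassName(std::string type) {
    const std::string constPrefix = "const ";
    if (type.compare(0, constPrefix.size(), constPrefix) == 0) {
        type.erase(0, constPrefix.size());
    }
    while (!type.empty() &&
           (type.back() == '&' || type.back() == '*' || type.back() == ' ')) {
        type.pop_back();
    }
    return type;
}

// Qualify `method` with the class returned by `innerCall`, if that is known.
// Returns nothing when the inner call has no recorded return type.
std::optional<std::string> resolveChained(const ReturnTypes& returns,
                                          const std::string& innerCall,
                                          const std::string& method) {
    auto it = returns.find(innerCall);
    if (it == returns.end()) return std::nullopt;
    const std::string cls = stripToClassName(it->second);
    if (cls.empty()) return std::nullopt;
    return cls + "::" + method;
}
```

Given an entry mapping Metrics::instance to Metrics&, the chained call resolves to Metrics::writeLog instead of a bare writeLog. The same lookup could in principle type an auto local from its initializer, which would cover the b case from the reproduction.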
Deliberately out of scope (need a real type environment — symbol tables +
overload resolution by argument types — not a heuristic; can be tracked
separately):
- Deep chains: a().b().c().
- Multi-level member access: h.mgr.view().render().
- Overload-correct selection, typedef/using alias resolution, templated return types, inherited methods.
Safety: every inferred type should be validated against the graph (the class
must actually have the method) before an edge is created, so a wrong guess falls
through silently rather than producing a wrong edge.
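The validation step could be sketched as a membership check before any edge is emitted. ClassMethods and validatedCallee are hypothetical names, and the graph is reduced here to a map from class name to the method names it declares.

```cpp
#include <map>
#include <optional>
#include <set>
#include <string>

// Hypothetical view of the graph: class name -> names of methods it declares.
using ClassMethods = std::map<std::string, std::set<std::string>>;

// Return the qualified callee only if the graph confirms that the inferred
// class declares the method; otherwise return nothing so no edge is created.
std::optional<std::string> validatedCallee(const ClassMethods& graph,
                                           const std::string& inferredClass,
                                           const std::string& method) {
    auto cls = graph.find(inferredClass);
    if (cls == graph.end()) return std::nullopt;             // unknown class
    if (cls->second.count(method) == 0) return std::nullopt; // no such method
    return inferredClass + "::" + method;
}
```

An empty result means the inference is discarded and the call is left unresolved, which is the preferred failure mode over a wrong edge.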
Environment
- Reproduced on 0.9.8 and main.
- Language: C++ (tree-sitter).