C++: method calls through singletons, factories, and chained getters resolve to the wrong class (or not at all) #645

@stabey

Summary

In C++, a method call whose receiver is another call's result loses the
receiver's type during extraction. The call is recorded as a bare method name,
so resolution can't tell which class it belongs to. When two classes share a
method name, the call silently resolves to whichever class was indexed first
— or doesn't resolve at all. This corrupts callers, callees, impact, and
trace, and (worse than a miss) does so silently, with a plausible-looking
wrong edge.

This hits some of the most common C++ idioms: singletons (Foo::instance()),
factories, and chained getters.

Reproduction

Two classes sharing a method name writeLog. Logger sorts first, so it wins
any name-only tie. Every call below targets Metrics.

logger.hpp / metrics.hpp:

#include <string>

class Logger  { public: static Logger&  instance(); void writeLog(const std::string&); };
class Metrics { public: static Metrics& instance(); void writeLog(const std::string&); };

(+ out-of-line definitions in a .cpp)

app.cpp:

void a() { Metrics::instance().writeLog("x"); }              // chained singleton
void b() { auto& m = Metrics::instance(); m.writeLog("x"); } // stored in auto

Index, then query callers of each writeLog.
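
For convenience, here is a single-file sketch of the reproduction that compiles on its own. The out-of-line bodies, the `lines` member, the explicitly-typed caller `c`, and `checkRuntimeTargets` are illustrative assumptions added here; only the declarations and the callers `a` and `b` come from the snippets above. The runtime check is there to pin down the ground truth the index should reflect: every call lands on Metrics.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

class Logger {
public:
    static Logger& instance();
    void writeLog(const std::string& msg);
    std::vector<std::string> lines;  // illustrative storage, not part of the issue
};

class Metrics {
public:
    static Metrics& instance();
    void writeLog(const std::string& msg);
    std::vector<std::string> lines;  // illustrative storage, not part of the issue
};

// Out-of-line definitions (bodies are assumptions; any bodies would do).
Logger& Logger::instance() { static Logger self; return self; }
void Logger::writeLog(const std::string& msg) { lines.push_back(msg); }
Metrics& Metrics::instance() { static Metrics self; return self; }
void Metrics::writeLog(const std::string& msg) { lines.push_back(msg); }

void a() { Metrics::instance().writeLog("x"); }                 // chained singleton
void b() { auto& m = Metrics::instance(); m.writeLog("x"); }    // stored in auto
void c() { Metrics& m = Metrics::instance(); m.writeLog("x"); } // explicitly typed

// Ground truth at runtime: all three callers reach Metrics, none reach Logger.
void checkRuntimeTargets() {
    const std::size_t before = Metrics::instance().lines.size();
    a();
    b();
    c();
    assert(Metrics::instance().lines.size() == before + 3);
    assert(Logger::instance().lines.empty());
}
```

Indexing this file and querying callers of each writeLog should exercise the same three receiver forms discussed in this issue.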

Expected

Both a and b are callers of Metrics::writeLog.

Actual (0.9.8 / main)

  • a resolves to Logger::writeLog ❌ (mis-attributed to the first-indexed
    same-named method).
  • b produces no calls edge to any writeLog ❌ (the auto receiver type
    is never recovered).

Only the explicitly-typed form Metrics& m = Metrics::instance(); m.writeLog()
resolves correctly today.

Why it matters

CodeGraph's value is letting an agent answer flow/impact questions without
reading source. A silently wrong caller edge is worse than a missing one: the
agent trusts a plausible-but-incorrect relationship. And the trigger is
universal — any codebase with two same-named methods (extremely common) hits it
the moment a singleton/factory call is involved.

Root cause

For Foo::instance().bar(), tree-sitter produces a field_expression whose
receiver is the inner Foo::instance() call. The call extractor only keeps a
receiver when it's a plain identifier, so a call-expression receiver is dropped
and the callee degrades to the bare name bar. Resolution then falls back to
name matching, which ties between same-named methods and picks the first-indexed
one. C++ extraction also captures no return-type information today (method
signatures are empty), so there's nothing to recover the receiver's type from.
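
The failure mode described above can be modeled in a few lines. This is a toy sketch, not CodeGraph's actual extractor or tree-sitter's API: `CallSite`, `ExtractedCall`, `MethodNode`, `extractCall`, and `resolveByName` are hypothetical names made up for illustration. It only shows how dropping a non-identifier receiver, followed by a first-match name lookup, yields the wrong edge.

```cpp
#include <optional>
#include <string>
#include <vector>

// Toy stand-in for a parsed `receiver.method(...)` call site.
enum class ReceiverKind { Identifier, CallExpression };

struct CallSite {
    ReceiverKind receiverKind;
    std::string receiverText;  // e.g. "m" or "Metrics::instance()"
    std::string method;        // e.g. "writeLog"
};

struct ExtractedCall {
    std::optional<std::string> receiver;  // empty when the receiver was dropped
    std::string callee;                   // bare method name
};

// Mirrors the described behavior: keep the receiver only for plain identifiers.
ExtractedCall extractCall(const CallSite& site) {
    ExtractedCall out{std::nullopt, site.method};
    if (site.receiverKind == ReceiverKind::Identifier) {
        out.receiver = site.receiverText;
    }
    return out;
}

struct MethodNode {
    std::string className;
    std::string method;
};

// Name-only fallback: returns the first indexed method with a matching name.
std::optional<std::string> resolveByName(const std::vector<MethodNode>& index,
                                         const std::string& callee) {
    for (const MethodNode& node : index) {
        if (node.method == callee) {
            return node.className + "::" + node.method;
        }
    }
    return std::nullopt;
}
```

With an index ordered Logger then Metrics, a call site for `Metrics::instance().writeLog("x")` loses its receiver in `extractCall`, and `resolveByName` then returns `Logger::writeLog`, which matches the mis-attribution seen for `a`.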

Scope / proposed direction

Resolve the receiver by what the inner call returns, which requires first
capturing C++ return types during extraction. In order of value:

  • Singletons / self-returning accessors: Foo::instance().bar(),
    Foo::getInstance()->bar() (any accessor name, not just instance).
  • Factories returning a different type: WidgetFactory::create().draw()
    resolves on Widget, not WidgetFactory.
  • Free-function factories: openSession()->run().
  • The same patterns stored in an auto local first, plus new /
    std::make_unique / std::make_shared / casts / direct construction.
  • Single-level member chains: manager.view().render().
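
One possible shape for the return-type lookup behind the cases above, as a minimal sketch. It assumes extraction has already recorded a return-type string per qualified function name; the `ReturnTypes` map, `baseTypeName`, and `receiverTypeFromCall` are hypothetical names, and the normalization only strips a leading `const` and trailing `&` / `*`. Smart pointers, aliases, and templated return types are not handled, in line with the out-of-scope list below.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical table filled during extraction: qualified callee -> return type text.
using ReturnTypes = std::unordered_map<std::string, std::string>;

// Reduce a return type to its class name: "const Metrics&" -> "Metrics".
std::string baseTypeName(std::string type) {
    const std::string kConst = "const ";
    if (type.rfind(kConst, 0) == 0) {  // type starts with "const "
        type.erase(0, kConst.size());
    }
    while (!type.empty() &&
           (type.back() == '&' || type.back() == '*' || type.back() == ' ')) {
        type.pop_back();
    }
    return type;
}

// Receiver type for `inner().method()`: whatever the inner call returns.
std::optional<std::string> receiverTypeFromCall(const ReturnTypes& returns,
                                                const std::string& innerCallee) {
    const auto it = returns.find(innerCallee);
    if (it == returns.end()) {
        return std::nullopt;  // unknown callee: make no guess
    }
    std::string base = baseTypeName(it->second);
    if (base.empty()) {
        return std::nullopt;
    }
    return base;
}
```

Under these assumptions, an entry mapping `Metrics::instance` to `Metrics&` gives receiver type `Metrics`, and an entry mapping `WidgetFactory::create` to `Widget` gives `Widget` rather than `WidgetFactory`. The same lookup could serve the auto-local case by recording the initializer's inferred type for the variable.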

Deliberately out of scope (need a real type environment — symbol tables +
overload resolution by argument types — not a heuristic; can be tracked
separately):

  • Deep chains a().b().c().
  • Multi-level member access h.mgr.view().render().
  • Overload-correct selection, typedef/using alias resolution, templated
    return types, inherited methods.

Safety: every inferred type should be validated against the graph (the class
must actually have the method) before an edge is created, so a wrong guess
produces no edge rather than a wrong one.
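
The validation step could look roughly like the following sketch. `MethodIndex` and `validatedTarget` are hypothetical names; the real graph lookup would go through whatever the resolver already uses. The point is only that an inferred class is accepted when the graph confirms it declares the method, and otherwise no edge is emitted.

```cpp
#include <optional>
#include <set>
#include <string>
#include <utility>

// Hypothetical view of the graph: the (class, method) pairs known to the index.
using MethodIndex = std::set<std::pair<std::string, std::string>>;

// Returns the qualified target only if the inferred class really has the method.
std::optional<std::string> validatedTarget(const MethodIndex& graph,
                                           const std::string& inferredClass,
                                           const std::string& method) {
    if (graph.count(std::make_pair(inferredClass, method)) == 0) {
        return std::nullopt;  // wrong or unverifiable guess: emit no edge
    }
    return inferredClass + "::" + method;
}
```

For example, if inference wrongly produced `WidgetFactory` as the receiver of `draw()`, the lookup would fail and the call would stay unresolved instead of gaining a wrong edge.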

Environment

  • Reproduced on 0.9.8 and main.
  • Language: C++ (tree-sitter).
