New ReAct tool: compare_services(s1, s2, window) #84

@VarunGitGood

Description

Why

A frequent investigation move is "compare what these two services emitted in the same window": for example, did the auth service start erroring before or after the gateway? Today the LLM has to make 3–4 separate tool calls (get_service_summary for each service, search_logs for each service) and then diff the results itself. The comparison happens implicitly in the model's reasoning, which is slow and error-prone.

A dedicated tool returns a side-by-side diff in one call: per-service error/warning counts, top-N repeating signatures, first/last timestamp inside the window, and the signatures that appear in one service but not the other (the actual interesting signal).

Scope (in)

  • New tool compare_services(s1: str, s2: str, time_from: str, time_to: str, project_id: uuid|None) -> dict in repi/investigation/tools.py. The window from the title is passed as the time_from/time_to pair.
  • Returns:
    {
      "s1": { "service": "...", "error_count": ..., "warning_count": ...,
              "first_ts": "...", "last_ts": "...",
              "top_signatures": [{ "signature": "...", "count": ... }, ...] },
      "s2": { ... same shape ... },
      "only_in_s1": [ "<signature>", ... ],
      "only_in_s2": [ "<signature>", ... ],
      "shared": [ "<signature>", ... ]
    }
  • Signatures extracted with repi/retrieval/cluster_view.extract_signature (same util build_timeline uses).
  • One aggregate SQL query per service; no per-row looping in Python.
  • Schema entry in TOOL_SCHEMAS so the LLM discovers it.
  • Mentioned in the ReAct system prompt as "prefer this over two separate get_service_summary calls when comparing services".
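
A minimal sketch of the per-service half of the tool, to make the "one aggregate query" point concrete. Everything about storage here is an assumption: the `logs` table, its columns, the lowercase `error`/`warning` level values, and ISO-8601 string timestamps are hypothetical, and `extract_signature` below is a stand-in for the real util in repi/retrieval/cluster_view, not its actual behavior. The helper name `summarize_service` is also made up for illustration. The database groups identical messages, so Python only folds one row per distinct (level, message) pair into signatures.

```python
import re
import sqlite3
from collections import Counter


def extract_signature(message: str) -> str:
    """Stand-in for repi's extract_signature (assumption): mask digit runs."""
    return re.sub(r"\d+", "<n>", message)


def summarize_service(conn, service, time_from, time_to, top_n=5):
    # One aggregate query per service. Timestamps are assumed to be ISO-8601
    # strings in a single format, so string comparison matches time order.
    rows = conn.execute(
        """
        SELECT level, message, COUNT(*), MIN(ts), MAX(ts)
        FROM logs
        WHERE service = ? AND ts >= ? AND ts < ?
        GROUP BY level, message
        """,
        (service, time_from, time_to),
    ).fetchall()

    level_counts = Counter()
    signatures = Counter()
    first_ts = last_ts = None
    # Loops over grouped rows (distinct messages), not over raw log rows.
    for level, message, count, lo, hi in rows:
        level_counts[level] += count
        signatures[extract_signature(message)] += count
        first_ts = lo if first_ts is None else min(first_ts, lo)
        last_ts = hi if last_ts is None else max(last_ts, hi)

    return {
        "service": service,
        "error_count": level_counts["error"],
        "warning_count": level_counts["warning"],
        "first_ts": first_ts,
        "last_ts": last_ts,
        "top_signatures": [
            {"signature": sig, "count": n}
            for sig, n in signatures.most_common(top_n)
        ],
    }


# Made-up rows in an in-memory database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (service TEXT, level TEXT, ts TEXT, message TEXT)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?, ?)",
    [
        ("auth", "error", "2024-01-01T10:00:05Z", "token 4821 expired"),
        ("auth", "error", "2024-01-01T10:00:09Z", "token 77 expired"),
        ("auth", "warning", "2024-01-01T10:00:02Z", "slow lookup 130 ms"),
        ("gateway", "error", "2024-01-01T10:00:07Z", "upstream 502"),
    ],
)
summary = summarize_service(
    conn, "auth", "2024-01-01T10:00:00Z", "2024-01-01T10:05:00Z"
)
```

This sketch counts signatures across all levels; whether top_signatures should be restricted to errors and warnings is not stated above and would need deciding.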

Scope (out)

  • More than two services (a future compare_many could come later; two-arg covers the most common case).
  • Cross-project comparison.

Acceptance

  • Tool returns a populated diff against a seeded eval dataset that has overlapping and divergent signatures.
  • A scripted scenario that previously took 4+ tool calls completes in a single call.
  • Unit test in tests/investigation/test_tools.py covers only_in_s1 / only_in_s2 / shared partitioning.
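
The partitioning test could look roughly like the sketch below. `partition_signatures` is a hypothetical helper name (the real code may do this inline inside compare_services), and sorting the output for deterministic results is an assumption rather than a stated requirement. The test checks that the three buckets are pairwise disjoint and together cover every input signature.

```python
def partition_signatures(s1_signatures, s2_signatures):
    """Split signatures into only-in-s1, only-in-s2, and shared buckets.

    Hypothetical helper; output is sorted so results are deterministic.
    """
    a, b = set(s1_signatures), set(s2_signatures)
    return {
        "only_in_s1": sorted(a - b),
        "only_in_s2": sorted(b - a),
        "shared": sorted(a & b),
    }


def test_partition_splits_overlapping_and_divergent_signatures():
    s1 = ["token <n> expired", "db timeout", "db timeout"]
    s2 = ["upstream <n>", "db timeout"]

    diff = partition_signatures(s1, s2)

    assert diff["only_in_s1"] == ["token <n> expired"]
    assert diff["only_in_s2"] == ["upstream <n>"]
    assert diff["shared"] == ["db timeout"]


def test_partition_is_disjoint_and_complete():
    s1 = ["a", "b", "c"]
    s2 = ["b", "c", "d"]

    diff = partition_signatures(s1, s2)
    buckets = [set(diff[k]) for k in ("only_in_s1", "only_in_s2", "shared")]

    # No signature lands in two buckets, and none is dropped.
    assert sum(len(bucket) for bucket in buckets) == len(set().union(*buckets))
    assert set().union(*buckets) == set(s1) | set(s2)


def test_partition_handles_empty_service():
    diff = partition_signatures([], ["x"])
    assert diff == {"only_in_s1": [], "only_in_s2": ["x"], "shared": []}
```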

Files

  • repi/investigation/tools.py — add compare_services + register in TOOL_SCHEMAS.
  • repi/investigation/react_loop.py — dispatch table.
  • tests/investigation/test_tools.py
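
For the registration and dispatch pieces, a rough sketch follows. The actual layout of TOOL_SCHEMAS and of the dispatch table in react_loop.py is not shown in this issue, so the JSON-Schema-style entry, the `TOOL_DISPATCH` name, and the `dispatch` function are all assumptions. The `compare_services` body is a stub that only returns the empty shape of the diff.

```python
from typing import Any, Callable


def compare_services(s1, s2, time_from, time_to, project_id=None) -> dict:
    """Stub for illustration: returns the diff shape with no data."""
    return {
        "s1": {"service": s1},
        "s2": {"service": s2},
        "only_in_s1": [],
        "only_in_s2": [],
        "shared": [],
    }


# Hypothetical schema layout; the real TOOL_SCHEMAS format may differ.
TOOL_SCHEMAS: dict[str, dict[str, Any]] = {
    "compare_services": {
        "description": (
            "Side-by-side diff of two services in one time window. "
            "Prefer this over two separate get_service_summary calls "
            "when comparing services."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "s1": {"type": "string"},
                "s2": {"type": "string"},
                "time_from": {"type": "string"},
                "time_to": {"type": "string"},
                "project_id": {"type": ["string", "null"]},
            },
            "required": ["s1", "s2", "time_from", "time_to"],
        },
    },
}

# Hypothetical dispatch table mapping tool names to callables.
TOOL_DISPATCH: dict[str, Callable[..., dict]] = {
    "compare_services": compare_services,
}


def dispatch(name: str, arguments: dict[str, Any]) -> dict:
    """Check arguments against the schema, then call the tool."""
    params = TOOL_SCHEMAS[name]["parameters"]
    missing = [key for key in params["required"] if key not in arguments]
    unknown = [key for key in arguments if key not in params["properties"]]
    if missing or unknown:
        raise ValueError(f"{name}: missing={missing} unknown={unknown}")
    return TOOL_DISPATCH[name](**arguments)


result = dispatch(
    "compare_services",
    {
        "s1": "auth",
        "s2": "gateway",
        "time_from": "2024-01-01T10:00:00Z",
        "time_to": "2024-01-01T10:05:00Z",
    },
)
```

Validating against the schema before the call keeps a malformed LLM tool call from surfacing as a TypeError deep inside the tool; whether the existing loop already does this is unknown.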

Labels

feature (New feature), react-quality (ReAct loop reasoning improvements)
