…ndsurf, replit (WEB-4771) (#205)

* fix(linux/discovery): drop residue/collision fallbacks for cursor, windsurf, replit (WEB-4771)

Three Linux-only detectors reported a tool that wasn't really installed:

- Cursor: fell back to ~/.cursor dir existence (survives uninstall, shared with Cursor CLI/rules) -> phantom row. Drop it; binary gates only.
- Windsurf: fell back to ~/.windsurf dir existence (~475 MB, survives uninstall) -> phantom row. Drop it; binary gates only.
- Replit: `which replit` backstop name-collided with the PyPI `replit` package's console script -> phantom Replit Desktop for any dev who `pip install replit`'d. Drop the backstop; the /usr/lib/replit resource-tree gate is authoritative. Remove the now-orphaned _check_replit_command helper.

The macOS/Windows variants already gate on the app/binary only, so this just brings Linux to parity. Pure detection-accuracy fix.

Tests: new test_cursor_residue_detection + test_windsurf_residue_detection (residue-only -> not detected; real binary -> still detected); reworked the replit backstop test into test_which_replit_pypi_collision_not_detected (`which replit` resolves but no resource tree -> not detected).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(linux/replit): drop replit --version fallback; scrub tracking ids from comments

Address review on the version side of the same name-collision: get_version's `replit --version` fallback (_version_via_command) is subject to the identical PyPI `replit` collision: an asar-only Desktop install on a machine with the PyPI package would report the PyPI version instead of "Unknown", contradicting the docstring. Drop the fallback (and the now-unused run_command/VERSION_TIMEOUT imports); an asar-only install yields None -> detect() reports "Unknown" as documented. Update the affected version/residue tests (their run_command stubs are now obsolete).

Also remove ticket/PR identifiers from the in-code comments and docstrings added by this change, keeping the explanations professional.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Fetch the backend's unresolved-UUID worklist, match each against the local Claude session files (Claude Code + CoWork folders), and report the resolved name + tools back via the single-server scan endpoint with the originating UUID. Only UUIDs the backend asked for are sent; HTTP via curl per the Zscaler constraint.
Call run_sweep at the end of the discovery run so connector UUIDs self-heal on the tool's existing periodic cadence. Best-effort — never affects the discovery outcome.
Device sweep: resolve Claude connector UUIDs
    def _normalize_url(url):
        return (url or "").rstrip("/")
Scheme-Less Domains Skip Sweep
When discovery is run with a domain like api.example.com, the main reporting path accepts it by adding https://, but this new sweep builds api.example.com/api/v1/... directly. Curl then fails before fetching unresolved UUIDs, and the caller only logs that at debug level, so connector resolution silently never runs for that accepted input.
Suggested change:

    -def _normalize_url(url):
    -    return (url or "").rstrip("/")
    +def _normalize_url(url):
    +    url = (url or "").strip()
    +    if url and "://" not in url:
    +        url = f"https://{url}"
    +    return url.rstrip("/")
    for entry in (data.get("remoteMcpServersConfig") or []):
        if not isinstance(entry, dict):
            continue
        uuid = (entry.get("uuid") or "").strip().lower()
The backend worklist is an opaque list of connector UUIDs, but the sweep lowercases both the local key and the value later sent back as connector_uuid. If the backend stored a mixed-case opaque ID and compares it exactly, the local session entry matches but the POST uses a different ID, so the resolver can reject it or update the wrong lookup key.
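One possible way to address this, sketched under assumptions: compare on a case-folded key but echo back the exact string the backend supplied. The function name `match_worklist` and the shape of the inputs (a list of UUID strings and a list of dicts with a `uuid` field) are hypothetical; whether case-insensitive matching is appropriate at all depends on how the backend treats these IDs.

```python
def match_worklist(worklist, local_connectors):
    """Pair each requested UUID with its local entry, if one exists.

    Lookup is case-insensitive, but the first element of each pair is the
    backend's original string, so it can be sent back unchanged.
    """
    # Index local entries by a case-folded key that is used only for lookup.
    local_by_key = {}
    for entry in local_connectors:
        if not isinstance(entry, dict):
            continue
        key = (entry.get("uuid") or "").strip().casefold()
        if key:
            local_by_key.setdefault(key, entry)

    matches = []
    for backend_uuid in worklist:
        entry = local_by_key.get(str(backend_uuid).strip().casefold())
        if entry is not None:
            matches.append((backend_uuid, entry))
    return matches
```

Because only UUIDs present in `worklist` are ever returned, this also keeps the property that nothing is reported unless the backend asked for it.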
Cursor Bugbot has reviewed your changes using high effort and found 3 potential issues.
Reviewed by Cursor Bugbot for commit 99e8c54. Configure here.
    def _normalize_url(url):
        return (url or "").rstrip("/")
Sweep skips https scheme normalization
High Severity
The _normalize_url function in sweep_connectors.py only strips trailing slashes, unlike the utils.normalize_url used by other discovery components which also adds an https:// scheme. This difference means the connector sweep can fail to reach the backend when provided with a bare domain, even if other discovery operations succeed.
    files = []
    for sub in SESSION_SUBDIRS:
        folder = base / sub
Sweep reads one user’s Claude dir
Medium Severity
The read_local_connectors function only checks the current user's Claude session files. On Linux, when the sweep runs as root or a service account, this means it misses connectors located in other users' home directories, preventing their resolution. This contrasts with how other discovery processes find user-specific data.
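A minimal sketch of how the sweep could look across several home directories, assuming a POSIX host. Both helper names, the `min_uid` cutoff of 1000, and the `base_rel` parameter (the Claude directory relative to a home) are hypothetical; the actual paths and the existing multi-user helper used elsewhere in the tool are not shown in this PR.

```python
import os
from pathlib import Path


def linux_home_dirs(min_uid=1000):
    """List existing home directories of regular accounts (POSIX only).

    The min_uid cutoff is a common convention, not something this PR defines.
    """
    import pwd  # imported lazily so the module still loads on non-POSIX hosts

    homes = []
    for pw in pwd.getpwall():
        if pw.pw_uid >= min_uid and pw.pw_dir and os.path.isdir(pw.pw_dir):
            home = Path(pw.pw_dir)
            if home not in homes:
                homes.append(home)
    return homes


def session_folders(home_dirs, base_rel, subdirs):
    """Return every existing <home>/<base_rel>/<sub> folder, in input order."""
    folders = []
    for home in home_dirs:
        base = Path(home) / base_rel
        for sub in subdirs:
            folder = base / sub
            if folder.is_dir():
                folders.append(folder)
    return folders
```

When the process runs as an ordinary user, passing only that user's home to `session_folders` reproduces the current single-user behavior.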
        sent, failed, matched = run_sweep(args.domain, args.api_key)
        logger.info(f"Connector UUID sweep: resolved {sent}, failed {failed}, matched {matched}")
    except Exception as sweep_err:
        logger.debug(f"Connector UUID sweep failed: {sweep_err}")
Sweep runs under watchdog
Medium Severity
run_sweep runs at the end of the main try, after the scan completed event, but before finally sets _finished and cancels the watchdog timer. Each unresolved connector can spend up to ~90s in subprocess timeouts, so a long sweep can still trigger _abort for exceeding args.timeout, sending a failed scan event and os._exit(1) after the backend already recorded completion.
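One way to keep the sweep inside the remaining watchdog window is to give it an explicit time budget. The sketch below is illustrative only: `run_with_budget`, its parameters, and the injected `clock` are hypothetical names, not part of the PR. An alternative would be to cancel the watchdog before starting the best-effort sweep.

```python
import time


def run_with_budget(items, handle, budget_seconds, clock=time.monotonic):
    """Process items in order until all are done or the budget is spent.

    Returns (processed, skipped). The deadline is checked before each item,
    so a single slow item can still overrun by its own timeout.
    """
    items = list(items)
    deadline = clock() + budget_seconds
    processed = 0
    for index, item in enumerate(items):
        if clock() >= deadline:
            # Out of time: leave the rest for the next periodic run.
            return processed, len(items) - index
        handle(item)
        processed += 1
    return processed, 0
```

Skipped UUIDs remain on the backend worklist, so the next periodic discovery run would pick them up again.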


Note
Medium Risk
Changes inventory accuracy on Linux and adds authenticated backend calls reading local Claude session data; sweep failures are isolated from scan success but wrong detection still affects reported tools.
Overview
Adds a best-effort connector UUID sweep at the end of each discovery run: the agent fetches unresolved UUIDs from the backend, maps them from local Claude Code / CoWork session JSON, and POSTs display names and tools via `sweep_connectors.run_sweep` (also runnable standalone with curl).

Linux detection false positives are tightened to match macOS/Windows: Cursor and Windsurf no longer treat leftover `~/.cursor` / `~/.windsurf` as installs; only real binaries count. Replit Desktop drops the `which replit` and `replit --version` paths that collided with the PyPI `replit` package; detection stays on the Electron resource tree (`app.asar` or `package.json`).

Tests cover residue-only homes, real binaries, and the Replit PyPI collision case.
Reviewed by Cursor Bugbot for commit 99e8c54. Bugbot is set up for automated code reviews on this repo. Configure here.
Greptile Summary
This PR tightens Linux AI-tool discovery and adds a Claude connector sweep. The main changes are:
`replit` command.

Confidence Score: 1/5
The connector sweep can silently skip valid discovery runs that use a scheme-less domain.
scripts/coding_discovery_tools/sweep_connectors.py
Important Files Changed
- `~/.cursor` residue fallback while preserving binary-based detection.
- `replit` command.
- `~/.windsurf` residue fallback while preserving binary-based detection.

Reviews (1): Last reviewed commit: "Merge pull request #220 from websentry-a..."
Context used:
Learned From
websentry-ai/ai-gateway-data#448