From 930026d526d3e9b7de49444d2dd4180039b1ad01 Mon Sep 17 00:00:00 2001
From: Vishnu <79318686+zeus-12@users.noreply.github.com>
Date: Tue, 2 Jun 2026 22:00:15 +0530
Subject: [PATCH 1/5] Release Staging to Main (#154)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* feat: add cross-platform --set-cron support (macOS launchd, Linux systemd/crontab, Windows Task Scheduler) (#126)

* feat: add cross-platform --set-cron support for user-level onboard and discover

Extends setup-scheduled-scan.sh to support Linux (systemd/crontab) in
addition to macOS launchd, and adds --command discover|onboard with
--discovery-key flag so the scheduled job can re-run the full onboard
flow — not just discovery.

Adds setup-scheduled-scan.ps1 (Windows) using Task Scheduler with
StartWhenAvailable so missed runs catch up on next boot. Credentials
stored via cmdkey (Windows Credential Manager). Verified empirically
on Azure VM via deallocate/start cycle.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: escape JSON values and scope umask in Linux credential storage

- Escape backslash and double-quote in api_key/discovery_key/domain
  before writing to scheduled-creds.json. A literal " in any value
  previously corrupted the file and broke the regex parser at run time.
- Run the credential write inside a subshell so umask 077 is scoped to
  that operation and does not leak into the rest of the install.

Caught by greptile review on the original PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: systemd PATH for onboard wrapper + EDR-safe Windows install.ps1 path

- setup-scheduled-scan.sh: add Environment=PATH= to the systemd --user
  service unit. The default --user PATH is minimal and excludes
  ~/.local/bin, /usr/local/bin, and homebrew paths, so the onboard
  wrapper failed to locate the 'unbound' CLI at run time.
- setup-scheduled-scan.ps1: cache install.ps1 under %LOCALAPPDATA%\Unbound
  instead of downloading to TEMP and deleting on each run. The
  download-execute-delete pattern is flagged as suspicious by EDR tools;
  a stable script path under the app data dir is recognised as the
  install location. Also adds a catch around the discover wrapper so
  network failures are logged rather than left silent.

Both caught by greptile review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: prepend user-binary dirs to PATH in scheduled wrapper

cron and systemd --user invoke the wrapper with a minimal PATH that
excludes ~/.local/bin, ~/.npm-global/bin, /usr/local/bin, and homebrew
paths — exactly where 'unbound' lives after 'npm install -g'. The
crontab fallback path therefore failed the onboard branch on every
scheduled run with 'unbound: command not found', visible only in the
log file.

Setting PATH in the wrapper itself covers both crontab and systemd
fallback paths, complementing the Environment=PATH= already set on the
systemd unit.

Caught by greptile review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: discovery_key validation in wrappers, accurate print statements, $args shadow

setup-scheduled-scan.sh:
- Validate DISCOVERY_KEY is present in the onboard branch of the wrapper
  before invoking unbound, so a missing credential fails with a clear log
  message rather than a confusing empty-arg error downstream.
- Fix success message: "runs once on install" → "runs on install + at
  each login via RunAtLoad" (RunAtLoad fires on every bootstrap, not once).
- Fix Linux success message to distinguish systemd catch-up vs crontab
  no-catch-up behaviour.
- Replace "$0 --uninstall" in both macOS and Linux success messages with
  "unbound discover unschedule" — $0 resolves to 'bash' when the script
  is piped via curl | bash, making the hint non-actionable.

setup-scheduled-scan.ps1:
- Validate $DiscoveryKey before invoking unbound in the onboard wrapper
  branch to surface a clear log error instead of a silent empty-arg run.
- Rename $args → $cmdArgs in the onboard wrapper branch to avoid
  shadowing the PowerShell built-in automatic variable $args.
- Fix header comment: "schtasks" → "Register-ScheduledTask" to match
  the actual cmdlet used in the script.
- Fix catch-up comment: remove stray "launchd" cross-OS reference.
- Replace raw PS1 script path in uninstall hint with "unbound discover
  unschedule" — users invoke uninstall via the CLI, not the raw script.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: domain guard in shell wrapper + tighten ~/.unbound dir + validate downloaded install.ps1

setup-scheduled-scan.sh:
- Add $DOMAIN presence check at the top of the wrapper's discover branch
  so a missing Keychain/creds-file entry fails with a clear log line
  instead of silently invoking install.sh with --domain "" on every run.
  Mirrors the existing guard in the Windows wrapper.
- Move ~/.unbound mkdir inside the umask 077 subshell so the credential
  directory itself is 0700 (defense-in-depth — the file is already 0600).

setup-scheduled-scan.ps1:
- Validate the downloaded install.ps1 before executing it (length > 100
  bytes, first line looks like PowerShell). Mirrors the shell wrapper's
  size + shebang check, so an HTML error body returned with HTTP 200 is
  not blindly handed to powershell -ExecutionPolicy Bypass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pin install.sh/install.ps1 to an immutable commit SHA at cron-setup time

At the moment the cron is installed, resolve the current HEAD SHA of
coding-discovery-tool via the GitHub API and bake that SHA into the
generated wrapper script's install URL. Raw GitHub content served at a
specific commit SHA is immutable — future pushes to main cannot change
what the daily job executes.

Falls back to 'main' with a visible warning when the API is unreachable
(e.g. no network at install time). The pin is refreshed automatically
every time the user re-runs --set-cron or discover schedule.

setup-scheduled-scan.sh:  resolve_install_ref() resolves the SHA via
  curl + python3 (both already required by install.sh); updates
  SCAN_SCRIPT_URL before create_wrapper_script() writes it into the
  wrapper.

setup-scheduled-scan.ps1: resolves the SHA via Invoke-RestMethod
  (built into PowerShell 3+); passes it as -InstallRef to
  Create-WrapperScript, which replaces the UNBOUND_INSTALL_REF
  placeholder in the single-quoted here-string before writing to disk.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Revert "fix: pin install.sh/install.ps1 to an immutable commit SHA at cron-setup time"

This reverts commit 48790286b90e87c7ce37ec820eeebe49d730b25d.

* fix: propagate exit code from wrapper script to Task Scheduler

Without exit $ec the wrapper always exits 0, so Task Scheduler records
LastTaskResult = 0x0 even when install.ps1 or unbound onboard fails.
Adds exit $ec before the closing heredoc so failed runs are visible
in the task history and retry-on-failure policies can trigger.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: capture wrapper exit code under set -e and migrate legacy LaunchAgent

Two issues addressed:

1. The generated wrapper runs under set -euo pipefail, so a non-zero exit
   from install.sh or unbound onboard terminated the script before the
   "Scheduled run failed" log line and before the trailing exit propagated
   the status. Every failure appeared as a successful run.

   Fix: capture EXIT_CODE inline via "|| EXIT_CODE=$?" (the canonical
   set -e bypass) and add an explicit "exit $EXIT_CODE" so launchd /
   systemd / cron observe the real status.

2. install_macos only unloaded the new "ai.getunbound.scheduled" label.
   Users upgrading from a previous version still had the old
   "ai.getunbound.discovery" agent loaded — both fired daily at 09:00
   after the upgrade.

   Fix: add a migration step at the top of install_macos that boots out
   the legacy label and removes its plist before registering the new one.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: harden Linux systemd→crontab fallback and PowerShell CLM handling

setup-scheduled-scan.sh
  - install_linux previously gated on [ -d /run/systemd/system ] and ran
    `systemctl --user daemon-reload` directly. On containers, WSL2, CI
    runners and headless servers, the system systemd dir exists but no
    user instance is running — daemon-reload exited non-zero, set -e
    killed the script after credentials and the wrapper had already
    been written, and the crontab fallback was never attempted.
  - Introduces systemd_user_available() that adds a `systemctl --user
    status` probe (the only check that catches the "no user bus" case).
  - install_linux_systemd now returns non-zero instead of aborting when
    daemon-reload or enable fails. install_linux tears down the unit
    files and falls through to crontab so the user is never left with
    credentials on disk and no scheduler.

setup-scheduled-scan.ps1
  - Wraps `Add-Type -Language CSharp` in try/catch. Constrained Language
    Mode (enforced by AppLocker / WDAC on locked-down enterprise fleets
    — exactly the environments most likely to deploy a discovery tool)
    blocks Add-Type and `-ExecutionPolicy Bypass` does NOT bypass CLM.
    Without the try/catch the wrapper logged only "=== Starting ===" on
    every run with no indication of what failed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: bake unbound path at install time and detect auth errors in wrapper

Resolves two known limitations:
1. PATH resolution — bakes the unbound binary path resolved at install
   time into the wrapper (falls back to PATH search at runtime) so the
   scheduled job survives shell profile changes, nvm switches, and
   custom npm-prefix installs.
2. API key rotation — after a non-zero exit, tail the log and emit an
   actionable HINT when a 401/Invalid-key/Unauthorized pattern is found,
   telling the operator which command to re-run with new credentials.

Applies to both the sh (macOS/Linux) and ps1 (Windows) wrappers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: clarify LogonType=Interactive is intentional for personal-device use

Adds a comment explaining the design rationale so reviewers don't flag
it as an oversight: Interactive avoids storing a plaintext password and
is correct for developer laptops. Notes the path forward for headless
deployments if ever required.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: guard grep -vF exit code in Linux crontab uninstall

grep -vF exits 1 when it matches no lines (i.e. the Unbound entry was
the only crontab line). Under set -euo pipefail, pipefail propagates
that exit code through the pipeline and set -e aborts the script before
crontab - executes — leaving the cron entry alive after credentials and
the install dir are already gone.

Wraps the grep in a subshell with || true, matching the identical guard
already present in install_linux_crontab at line 512.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use || true on full pipeline in crontab uninstall path

Previous fix wrapped grep in a subshell which is correct but non-idiomatic.
The cleaner form is || true on the whole pipeline — with pipefail, all pipeline
stages still execute (grep writes nothing, crontab - clears the crontab), and
|| true suppresses the non-zero exit code that grep produces when it selects
no lines.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use systemctl --user show to probe user bus availability

systemctl --user status (no unit arg) exits 1 when the user manager is
in degraded state — a failed xdg-portal, pipewire, or gnome-keyring unit
is enough. This is the normal condition on most desktop sessions, so the
probe was silently falling back to crontab for the majority of Linux
users even when the user bus was fully operational.

systemctl --user show queries the manager's own properties and exits 0
as long as the user bus is reachable, regardless of individual unit
health. It is the correct reachability probe.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass credentials via env vars to avoid Win32_Process.CommandLine exposure

Win32_Process.CommandLine is readable by any authenticated user without
elevation, and Windows Event Log 4688 (process-creation auditing, common
in enterprise SIEMs) captures full command lines. Passing -ApiKey <value>
as a CLI flag would have put raw API keys into security logs.

discover case: set UNBOUND_API_KEY / UNBOUND_DOMAIN before spawning, pass
-Command with backtick-escaped $env: references so the child process
command line contains the variable name, not the value. finally block
clears the env vars in all exit paths.

onboard case: drop --api-key / --discovery-key from cmdArgs entirely, set
UNBOUND_API_KEY / UNBOUND_DISCOVERY_KEY before spawning unbound. Requires
unbound-cli to read these env vars (see unbound-cli PR change).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass credentials via env vars in shell wrapper to avoid ps/cmdline exposure

Mirrors the same fix already applied to the PS1 wrapper. On Linux,
/proc/pid/cmdline is readable by any local user without elevation; on
macOS, ps output is world-readable. Passing --api-key / --discovery-key
as CLI arguments therefore exposes raw keys to every process on the
machine for the lifetime of the subprocess.

setup-scheduled-scan.sh (generated wrapper):
- discover: UNBOUND_API_KEY="$API_KEY" invoke, --api-key dropped from args
- onboard:  UNBOUND_API_KEY + UNBOUND_DISCOVERY_KEY in env, keys dropped from args
- chmod 700 on wrapper (not +x): outer umask is typically 022 so +x
  would yield 755 (world-executable); 700 restricts to owner only

install.sh:
- main() prepends --api-key from $UNBOUND_API_KEY when the flag is
  absent from explicit args, so the scheduled wrapper can invoke it
  without putting the key value on the command line
- $# -eq 0 guard now also passes when UNBOUND_API_KEY is set

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: escape apostrophes in paths embedded in single-quoted PS expressions

On machines where the Windows username contains an apostrophe (e.g.
O'Brien), $env:LOCALAPPDATA and the npm global bin path both contain a
literal apostrophe. Embedding these paths directly inside a single-quoted
PowerShell expression ('$installScript', '$resolvedUnbound') produces an
unbalanced quote that breaks the expression and silently fails every
scheduled discover or onboard run.

Fix: apply -replace "'", "''" before interpolating into single-quoted
contexts. In PowerShell single-quoted strings, '' is the escape sequence
for a literal apostrophe, so O'Brien becomes O''Brien and the expression
remains syntactically valid.

Two sites fixed:
- discover wrapper: $safeInstallScript used in -Command string
- baked-path replacement: $escapedUnbound used in Test-Path / string literals

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: clear all credentials before storing new ones in Install-ScheduledTask

If a user reinstalls with a different command (e.g. switching from onboard
to discover), credentials from the previous install that are not present in
the new invocation (e.g. discovery_key) were left in Windows Credential
Manager. Calling Remove-AllCredentials before storing new values makes
Install-ScheduledTask idempotent — same behaviour as the macOS path which
unconditionally deletes all four credentials before writing new ones.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: guard npm config get prefix against set -e abort when npm absent

`npm config get prefix` exits 127 when npm is not installed; under
`set -euo pipefail` that propagated a fatal error mid-install, leaving
credentials written to disk with no diagnostic message.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: log onboard spawn failure on Windows; remove stale macOS Keychain creds on migration

- PS1 wrapper onboard branch: add catch block so a spawn failure
  (e.g. access denied, corrupted binary) is logged instead of silently
  falling through with an uninitialised exit code. Mirrors the existing
  catch block in the discover branch. $ec pre-set to 1 so the failure
  is recorded even if the catch block itself throws.
- SH install_macos: after removing the legacy ai.getunbound.discovery
  plist/LaunchAgent, also delete the four Keychain entries stored under
  that old service name. Without this, --uninstall (which only cleans
  the current name) leaves stale credentials on upgraded systems.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(security): replace cmdkey /pass: with CredWrite P/Invoke for credential storage

cmdkey /pass:<secret> exposes API keys in the Windows process command
line, which Event Log 4688 (process-creation auditing) captures
persistently — exactly the enterprise environments this tool targets.

Replace Store-Credential with a CredWrite P/Invoke call (mirroring the
CredRead already in the scheduled wrapper) so secrets never appear in
any command line. cmdkey /delete: is retained for removal since
deletion exposes no secret.

Add-Type CLM guard: if PowerShell Constrained Language Mode blocks
Add-Type, exit with a descriptive error rather than silently falling
back to the insecure cmdkey /pass: form.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address anonpran review — credential exposure, json_escape scope, uninstall cleanup

P0-1: Windows discover wrapper used -Command to invoke install.ps1,
causing UNBOUND_API_KEY to be bound as a param and appear in child
process command lines. Switch to -File so the only thing in the
child's command line is the script path. Update install.ps1 to
resolve ApiKey/Domain from UNBOUND_API_KEY/UNBOUND_DOMAIN env vars
(matching the pattern already in install.sh) when not passed as params.

P1-2: json_escape in store_credentials_linux only escapes \ and ".
Add a comment documenting the intentionally limited scope — API keys
and domain URLs never contain control characters so the simpler form
is sufficient.

P3-1: remove_credentials_linux left ~/.unbound/ behind after removing
the creds file. Add rmdir on the parent directory; rmdir is a no-op
when the directory is non-empty, so it is safe when other components
share the directory.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: note pre-existing Python CLI arg exposure in install.sh and install.ps1

The Python subprocess receives --api-key as a CLI argument, making the
key visible in /proc/pid/cmdline, ps, and Event Log 4688 for the
duration of the Python process. This is a pre-existing limitation of
the ai_tools_discovery.py entry point; both install scripts already
avoid exposing the key at the shell/PS level via UNBOUND_API_KEY env var.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: sentinel file to prevent RunAtLoad+StartCalendarInterval double-run

macOS launchd fires RunAtLoad (every boot/login) and StartCalendarInterval
(daily 09:00) as two independent triggers. When the machine boots before
09:00, both fire in the same morning. The wrapper now writes a timestamp to
$LOG_DIR/.last-run-ts after each execution and skips if the last run was
within 8 hours.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Junie support for Windows (#137)

* feat: Junie support for Windows

Junie (JetBrains' AI coding agent) ships on Windows too — as a CLI and as a
JetBrains IDE plugin — but coding-discovery only detected it on macOS. This
adds Windows parity (config lives in %USERPROFILE%\.junie, same layout as
macOS).

- WindowsJunieDetector — detects ~\.junie; multi-user via is_running_as_admin
  scanning C:\Users.
- WindowsJunieRulesExtractor — global ~\.junie\*.md + project-level .junie\*.md;
  project walk uses get_windows_system_directories and the is_user_level_tool_dir
  guard so user-scope rules aren't reported as project-scope.
- WindowsJunieMCPConfigExtractor — reads ~\.junie\mcp\mcp.json per user.
- Wired Windows branches into create_junie_detector and the Junie MCP/rules
  factories.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address Greptile review on Windows Junie

- Global Junie rules were tagged scope="project": _detect_rule_scope did
  not recognize .junie. Added .junie to its user-config dir set, and pass
  explicit scope="user" when extracting global rules so user-level rules
  are classified correctly regardless of path resolution.
- Admin scan in the rules extractor skipped the system/default account
  exclusion (Public, Default, etc.) that the detector and MCP extractor
  already applied.
- All three Windows Junie files reinvented the admin/non-admin loop. They
  now use the shared scan_windows_user_directories helper, which centralises
  the branching, excludes system accounts, and handles PermissionError —
  fixing the exclusion bug and removing the duplicated logic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: Junie MCP parent_levels 2 -> 3 so path resolves to home

~\.junie\mcp\mcp.json needs 3 parent levels to resolve to ~ (home),
matching how every other global MCP config keys its `path`. With 2 it
returned ~\.junie, inconsistently keying Junie's entry.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: add 45 unit tests for macOS and Windows Junie extraction

Covers detector, rules extractor, and MCP config extractor for both
platforms - detection with/without .junie dir, version parsing from
config.json/settings.json, global vs project-level rule extraction,
walker skipping of user-level dirs, and multi-user root scanning.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add new flags to rm overdump of logs

* added payload flag

* refactored and added test cases

* Add ~/.unbound/discovery-cache.json local cache + lock + heartbeat

New module that backs per-tool payload-hash dedup and prevents concurrent scans from piling up. Cache file is JSON; lock is an mtime-heartbeat file under ~/.unbound/discovery.lock.

* Integrate per-tool hash dedup into discovery main

Acquire the discovery lock, start the heartbeat, compute payload_hash per (tool, home_user), skip upload when the local cache shows the same hash, and update the cache on a successful send.

* Stabilize payload hash by stripping volatile fields and sorting lists

Expand _strip_ephemeral so cosmetic re-scans dedup correctly: drop plugins[*].installed_at, mcpServers[*].scan.scanned_at/error, mcpServers[*].oauth.{clientId,callbackPort}, and canonicalise filesystem-walk-order-dependent lists (projects, plugins, rules, skills, mcpServers, scan.tools, permissions.{allow,deny,ask}_rules).

* Skip other tools' ~/.<dir>/ from project-scope walkers

Project-scope MCP/rules/skills walkers (filesystem from /) were adopting .mcp.json / .agents / .cursor entries inside ~/.codex, ~/.cursor extensions, etc. as projects of the wrong tool. New is_home_dotdir_descendant predicate wired into walk_for_tool_directories, walk_for_mcp_configs_generic, walk_for_claude_project_mcp_configs, and the Cursor skills _walk_for_skills uniformly — covers any tool that uses the standard ~/.toolname/ convention without per-tool maintenance.

* Accept api_key via UNBOUND_API_KEY env (keeps argv clean)

Hook-triggered invocations can now pass the api_key through the subprocess env so it never appears in argv / /proc/<pid>/cmdline. CLI --api-key remains the default for MDM and direct-script usage.

* Address PR review: tighten dotdir helper, drop dead funcs, lift import

- is_home_dotdir_descendant now matches only the canonical /Users/<u>/.foo and /home/<u>/.foo positions; non-standard mounts like /srv/home/<u>/.config or /data/Users/shared/.cache no longer false-match.
- Remove unused stamp_last_run, is_debounced, _parse_iso, and DEBOUNCE_SECONDS from cache.py — the hook owns the debounce + last_run_at stamp directly, the cache module was never the call site for these.
- Move heartbeat_start() into the try block in ai_tools_discovery main so a heartbeat thread-spawn failure can't leak the discovery lock (release_lock now always runs).
- Lift is_home_dotdir_descendant import in walk_for_tool_directories from function body to module level — no circular import (mcp_extraction_helpers only does lazy imports from macos_extraction_helpers).

* added guardrail to print summary if not --payload, --summary

* fix(macos): skip RunAtLoad when caller already ran the scan (#141)

* chore: mark --no-run-at-load as internal-only in arg parser comment (#148)

The flag is passed automatically by the unbound CLI when a scan already
ran immediately before cron setup; it is not a user-facing option and
is intentionally absent from usage().

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: implement KiloCode version extraction (macOS + Windows) (#149)

* fix: implement KiloCode version extraction (macOS + Windows)

Kilo Code ships standard VS Code extension metadata in
~/.vscode/extensions/kilocode.Kilo-Code-X.Y.Z/package.json (and
~/.cursor/extensions/ for Cursor), but both KiloCode detectors hardcoded
"version": "Unknown" and returned None from get_version(). A stale
docstring claimed "version extraction removed per requirements" — the
sibling Roo Code and Cline detectors had the same stub in early 2025
and were both rewritten to read package.json in commit a31ecfa
(Feb 2026). KiloCode was missed.

Apply the same package.json read pattern KiloCode's siblings use, with
a fall-back to the folder-name version suffix when package.json is
unreadable, and surface the resolved version through both detect() and
get_version() (the latter walks every admin user when running elevated).

Linux equivalent will land via the feat/linux-full-support branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(kilocode): delegate get_version() to detect() (Greptile review)

Reviewer flagged that get_version() walked extension dirs directly while
detect() also checked that the extension settings dir + IDE app were
present. A partially-uninstalled KiloCode (leftover ~/.vscode/extensions
folder but no globalStorage and no IDE) would let get_version() return
a stale version while detect() returned None — they could disagree.

Make get_version() delegate to detect() and read 'version' off the
result so the install-gating logic is the single source of truth.
Matches the Roo Code / Cline pattern.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: address anonpran review + add Replit version detection

KiloCode (PR #149 review by anonpran):
- Scope _get_extension_version_for_user to the IDE that triggered detection.
  Previously looped all SUPPORTED_IDES and returned the first match in any
  extensions dir (.vscode first), so a Cursor install with a leftover
  ~/.vscode/extensions/kilocode.Kilo-Code-* would return the VS Code
  version against the Cursor install_path. Now takes ide_name explicitly,
  same as Roo Code / Cline siblings.
- Drop dead `except IndexError` around rsplit("-", 1)[1] — the `if "-"`
  guard already ensures the split returns 2 elements.
- Add docstring to _get_extension_version_for_user.
- Add tests/test_kilocode_version_extraction.py covering package.json
  reads, folder-suffix fallback, and IDE scoping (must NOT cross-leak
  VS Code version to a Cursor install).

Replit version extraction (macOS + Windows):
- macOS: Read CFBundleShortVersionString from /Applications/Replit.app/
  Contents/Info.plist (same source Cursor/Windsurf use); fall back to
  Contents/Resources/app/package.json.
- Windows: Walk %LOCALAPPDATA%\Programs and Program Files for the
  install dir, read resources\app\package.json; fall back to the .exe
  FileVersion via PowerShell. The earlier "doesn't expose version
  information in a standard way" comment was wrong — Replit Desktop
  is a standard Electron app.

Linux versions for Antigravity + Replit will land via the
feat/linux-full-support branch alongside the matching KiloCode fix.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(kilocode, replit): address the two edge-case flags

1. KiloCode folder-name fallback preserved pre-release suffix.
   rsplit("-", 1)[1] truncated pre-release semver — kilocode.Kilo-Code-
   1.2.3-pre.5 collapsed to "pre.5" instead of "1.2.3-pre.5". Replace
   with a regex that captures full semver including pre-release/build
   metadata from the end of the folder name. Add tests covering -pre.5
   and -beta.1 forms.

2. Replit Windows PowerShell path quoting.
   repr(str(path)) produced 'C:\\Users\\…' (literal double-backslash
   inside PowerShell's single quotes), which doesn't reliably resolve.
   Switch to a hand-built single-quoted string with single-quote
   doubling, plus -LiteralPath so PowerShell doesn't try to glob the
   path. Inside single quotes, backslashes are already literal — no
   double-backslash dance needed.

Both bugs only fire on the fallback branch (after package.json fails
to parse), so they couldn't produce worse output than the pre-PR
"Unknown" — but the fix is cheap and removes the footgun.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: address greptile + anonpran follow-ups on PR #149

Test coverage:
- Drop @unittest.skipUnless from TestMacOSKiloCodeVersion and
  TestWindowsKiloCodeVersion. _get_extension_version_for_user is pure
  pathlib + JSON parsing, so the IDE-scoping regression guard (which
  proves a Cursor install doesn't inherit a leftover VS Code version)
  must execute on every CI runner — previously it ran only on a dev's
  macOS/Windows box and was silently skipped on the Linux runner that
  produced the "553 passed" claim.

Import hygiene / cross-OS consistency:
- Windows Replit imported COMMAND_TIMEOUT while macOS imported
  VERSION_TIMEOUT for the same lookup. Same value (30s), but mixing
  the import names across sibling files is exactly the inconsistency
  Greptile and anonpran flagged. Switch Windows to VERSION_TIMEOUT.
- Replit detect() now wraps get_version() in `or "Unknown"` on both
  macOS and Windows. Without it, a data-dir-only install emitted
  `"version": null` while a KiloCode equivalent emitted `"Unknown"`
  for the same case.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test: bring Windows KiloCode test coverage to macOS parity

Greptile flagged the Windows test class as thin (2 cases vs 7 on macOS).
Add the 3 missing parity tests: folder-suffix fallback, pre-release
suffix preservation, Cursor extensions-dir routing. The Linux runner
will now execute all 12 cases instead of the previous 9.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(kilocode/macos): require single-IDE pairing for globalStorage + .app

The fallback loop in _check_user_for_kilocode (macOS only) updated
ide_installed but never updated ide_with_extension. So when
globalStorage was found in IDE-A but only IDE-B was actually installed,
the version lookup ran against IDE-A's extensions dir — either
returning a stale leftover version or falling through to "Unknown"
even when IDE-B genuinely had Kilo Code installed with a readable
package.json.

The fallback was always wrong-headed: globalStorage in an uninstalled
IDE doesn't mean Kilo Code is currently usable from a different IDE.
Replace it with a single pass that requires BOTH globalStorage AND
matching .app for the same IDE.

Add three regression tests:
- test_rejects_globalstorage_when_matching_ide_not_installed: trap config
  (Code globalStorage + Cursor.app, no Cursor globalStorage) now returns
  None instead of an incoherent dict.
- test_prefers_ide_with_both_globalstorage_and_app: when both IDEs have
  globalStorage but only Cursor.app is installed, version comes from
  Cursor's extensions dir — NOT from a stale Code leftover.
- test_returns_result_when_first_ide_has_both: happy path.

Windows and Linux detectors never had this fallback loop — they
correctly return None when the first IDE check fails — so this fix is
macOS-only.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* perf(replit/macos): avoid redundant _check_application_installation()

When detect() finds an app install, it then calls get_version() which
in turn calls _check_application_installation() again as a safety guard
— two filesystem syscalls where one suffices. Greptile flagged it as
a minor inefficiency.

Add an optional app_path parameter to get_version() so detect() can
pass the already-resolved Path. Falls back to the internal check when
called directly (keeping standalone-call safety). Matches the macOS
Antigravity detector's existing signature.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(kilocode/windows): require live IDE install + globalStorage pairing

Mirror the macOS fix (f87c864) to Windows. AppData survives an IDE
uninstall on Windows, so the previous detector — which trusted
globalStorage as proof of an IDE install — would report stale state.

Concrete scenario flagged in review:
- User installed VS Code, added KiloCode (creates
  %AppData%\Code\User\globalStorage\kilocode.Kilo-Code\ AND
  %USERPROFILE%\.vscode\extensions\kilocode.Kilo-Code-X.Y.Z\)
- User uninstalled VS Code (both AppData and the extensions folder
  persist on disk)
- User installed Cursor + KiloCode in Cursor

The old detector found Code globalStorage first, marked ide_with_extension
= 'Code', then surfaced the *VS Code* leftover version against a 'Code'
install_path even though only Cursor is actually installed. Before this
PR opened, that path returned 'Unknown'; after the version extraction
landed, it would start returning a stale-but-plausible version.

Add a Windows-shaped _check_ide_installation that requires the IDE's
.exe under one of the conventional install roots:
- %LOCALAPPDATA%\Programs\
- %ProgramFiles%, %ProgramFiles(x86)% (with C:\Program Files fallbacks
  when the env vars are unset, e.g. service-account scans)

Three regression tests mirror the macOS suite: trap config (Code
globalStorage + uninstalled Code + live Cursor without globalStorage),
preference for Cursor when both have globalStorage but only Cursor is
installed, and the happy path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(replit/macos) + tests: use app_path, add Replit version coverage

macOS Replit get_version() still referenced self.APPLICATION_PATH for
both the Info.plist and package.json reads even after the redundancy
fix added an app_path parameter — so passing app_path was effectively a
no-op and the function always read from /Applications/Replit.app
regardless of caller. Greptile caught it. Replace both references
with the passed-through app_path so the parameter has real effect.

Add tests/test_replit_version_extraction.py — 10 cases, no platform
skips so the Linux/macOS/Windows CI runners all execute them:

macOS (4 cases):
- returns None when no install
- reads Info.plist first via the explicit app_path AND asserts the
  call doesn't reference /Applications/Replit.app — the regression
  guard for this commit
- falls back to package.json when defaults read returns nothing
- returns None when both sources fail

Windows (6 cases):
- returns None when no install
- reads version from resources/app/package.json
- skips missing candidate dirs and tries the next one
- falls back to PowerShell FileVersion when package.json is broken
- escapes single quotes in paths (asserts user' -> user'' in the PS
  command — the regression guard for the earlier quoting fix)
- candidate_install_paths includes both LOCALAPPDATA\Programs and
  Program Files family roots

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(replit): trim version-extraction tests to regression-guard set

Drop 5 trivial/sanity tests from tests/test_replit_version_extraction.py
that didn't lock in a real regression:
- macOS test_returns_none_when_no_install (trivial)
- macOS test_falls_through_to_none_when_both_sources_missing (edge case)
- Windows test_returns_none_when_no_install_found (trivial)
- Windows test_skips_candidate_when_directory_missing (sanity)
- Windows test_candidate_install_paths_includes_user_and_system_dirs
  (implementation detail)

Keep the 5 that earn their slot:
- macOS test_reads_plist_first_when_passed_explicit_app_path — locks
  in the app_path-vs-self.APPLICATION_PATH fix Greptile flagged.
- macOS test_falls_back_to_package_json_when_plist_unreadable —
  exercises the secondary read path.
- Windows test_reads_version_from_package_json — primary success path.
- Windows test_falls_back_to_powershell_when_package_json_unreadable —
  asserts -LiteralPath is in the command, locks in the quoting fix.
- Windows test_escapes_single_quotes_in_path — asserts ' -> '' so a
  path containing a quote can't escape the PS single-quoted literal.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style(replit/windows): symmetrise _candidate_install_paths with KiloCode

Greptile flagged that Windows Replit's _candidate_install_paths drops
each root entirely when its env var is unset, while the KiloCode
sibling falls back to Path.home()-derived / C:\Program Files defaults.
Real-world impact is small (env vars are always set in normal user
sessions) but matters for restricted service accounts where AppData
or Program Files env vars get stripped — and the symmetry just reads
cleaner.

Mirror the KiloCode helper's fallback structure here.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* feat: full Linux support for unbound discover and onboard (#128)

* feat: full Linux support for unbound discover and onboard

Adds Linux implementations for every AI coding tool detector and
extractor, bringing Linux to full parity with macOS and Windows
(detectors, rules, MCP config, settings, skills), plus cross-platform
--set-cron support for user-level onboard/discover.

Highlights:
- linux/ package mirroring macos/windows for all 17 tools, wired through
  coding_tool_factory and ai_tools_discovery (multi-user /home + /root,
  Docker/CI root-only containers, /etc/machine-id device id).
- linux_extraction_helpers with Linux-aware system-path skipping and a
  walk_for_tool_directories that does not blocklist /home.
- Claude Cowork (~/.config/Claude) and Junie (~/.junie) Linux parity.
- Plugin provenance (plugin_lookup) threaded through Linux Claude/Cursor
  skills and Claude MCP extractors.
- setup-scheduled-scan cron support hardened across Linux/macOS/Windows.
- tests/test_cowork_skills_extraction_linux.py.

Rebased cleanly onto staging (which already contains plugin provenance
#129); main-only plan-detection commits are intentionally excluded — they
reach staging via the normal main->staging path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: Linux Junie MCP parent_levels 2 -> 3 so path resolves to home

~/.junie/mcp/mcp.json needs 3 parent levels to resolve its reported path
to ~ (home), matching every other global MCP config. Same fix applied to
Windows Junie in PR #137.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: scheduler setup no longer aborts on npm-absent Linux hosts

create_wrapper_script ran NPM_BIN=$(npm config get prefix 2>/dev/null)/bin
with no failure guard, so under 'set -euo pipefail' a host without npm
aborted the entire scheduler setup before the wrapper or cron entry was
written. Guard the lookup with '|| true' and only append the bin dir when
a prefix resolves. NPM_BIN now carries its own trailing colon when set and
is empty otherwise, so an npm-absent host no longer leaves a leading ':'
in the wrapper PATH (which would have put the CWD on PATH).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: Linux multi-user gaps in plugin scan, cursor DB, managed MCP, user-home enum

Audited every `Darwin`/`Windows` branch across the codebase and found five
spots that had no Linux counterpart. Each one silently dropped real data
on a Linux MDM/agent runner:

1. ai_tools_discovery._extract_claude_code_plugins multi-user scan only
   covered Darwin and Windows. On Linux, running as root only checked
   /root/.claude/plugins — plugins installed under /home/<user> were
   invisible, so discovery reported "No plugins found" even when a
   plugin was installed.

2. Same gap in the Cursor plugin multi-user scan.

3. ai_tools_discovery user-home enumeration (fallback when user_rules
   provide no project_path) lacked a Linux branch — used the current
   process's home instead of iterating /home/<user> + /root.

4. mcp_extraction_helpers._get_managed_mcp_path returned None on Linux.
   Added /etc/ClaudeCode/managed-mcp.json (system-wide convention).

5. utils._get_cursor_db_path returned None on Linux, breaking Cursor plan
   detection. Added the XDG path ~/.config/Cursor/User/globalStorage/
   state.vscdb.

All five Linux branches use get_all_users_linux() + the established
"/root" vs "/home/<user>" special-case so they match the rest of the
PR's multi-user handling.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: skip macOS-only cowork extractor test on non-Darwin

TestMacOSClaudeCoworkSkillsExtractor exercises MacOSClaudeCoworkSkillsExtractor
specifically, and that extractor intentionally skips its final-step dedup
when running as root. The explicit-sessions-root tempdir path is therefore
ordering-dependent on ext4 (Linux), causing the dedup assertion to flake
under root — pre-existing on main, not introduced by this PR.

Skip the class on non-Darwin so Linux CI/runners don't see a failure that's
not actually about the Linux work. The Linux equivalent
(tests/test_cowork_skills_extraction_linux.py + LinuxClaudeCoworkSkillsExtractor)
dedups unconditionally and already covers the same behaviour on Linux.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: Linux unit-test parity with macOS for skills/cline/cursor extractors

Adds direct unit-test coverage for the Linux skills extractors mirroring
the macOS pattern, and updates cursor plan-detection to cover the Linux
SQLite DB path now that _get_cursor_db_path resolves it on Linux.

New files (20 new tests total):
- tests/test_claude_skills_extraction_linux.py — TestLinuxExtractorEndToEnd
  (4 methods) + TestLinuxExtractorAgents (3) mirror the macOS classes for
  LinuxClaudeSkillsExtractor (~/.claude/{skills,commands,agents}/).
- tests/test_cursor_skills_extraction_linux.py — TestLinuxCursorSkillsExtractor
  (4) + TestLinuxCursorSkillsExtractorWithCommands (5) for the Cursor flow
  including .cursor/ and .agents/ parent dirs.
- tests/test_cline_skills_extraction_linux.py — TestLinuxClineSkillsExtractor
  (5 methods) for the Cline flow.

Pattern: Linux extractors iterate get_linux_user_homes() instead of using
Path.home() + is_running_as_root, so the new tests patch
linux.<tool>.skills_extractor.get_linux_user_homes to return [fake_home]
in place of the three-decorator macOS pattern.

tests/test_cursor_plan_detection.py:
- The Linux branch used to assert "returns None (unsupported)". Now that
  _get_cursor_db_path resolves ~/.config/Cursor/User/globalStorage/state.vscdb
  on Linux, replaced with three correctly-shaped tests:
    * test_get_cursor_db_path_linux — creates the file, expects the path
    * test_get_cursor_db_path_linux_file_not_exists — mirrors the macOS
      file-not-exists assertion
    * test_get_cursor_db_path_unsupported_platform — moved to FreeBSD so
      it still validates the unsupported-platform fallthrough path.

Local macOS: 549 passed (was 527).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: escape heredoc runtime vars in wrapper dedup check

LAST_RUN_FILE, _last, _now, and the arithmetic expansions were bare
$/$(...) inside the unquoted <<WRAPPER_EOF heredoc, causing them to be
expanded by the setup shell at install time instead of at wrapper
runtime. The 8-hour dedup check consequently never fired.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: add 10 Linux Junie unit tests (detector, rules, MCP config)

Mirrors the macOS/Windows Junie test coverage for the Linux
implementation - detection, version parsing, global rule extraction,
MCP config reading, and multi-user scanning via get_linux_user_homes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address principal engineer review — B1-B4 + H1/H2/H5

B1 (mcp_extraction_helpers): add _iter_admin_user_homes() helper that
delegates to get_linux_user_homes() on Linux, which correctly includes
/root alongside /home/*. Replace all 5 Linux admin-scan loops to use it
instead of iterating Path("/home") directly.

B2 (utils): hardcode "root" in Docker/CI fallback instead of
Path.home().name which could return "app" when HOME is overridden.

B3 (detect_openclaw): add timeout=10 to subprocess.run(["ps","aux"])
to prevent hangs on loaded systems.

B4 (windsurf): get_version() now checks _USER_RELATIVE_PATHS via
get_linux_user_homes() so user-local installs return a real version.

H1 (linux_extraction_helpers + ai_tools_discovery): extract
linux_home_for_user(username) helper; replace 7 inline ternaries.

H2 (mcp_extraction_helpers): _iter_admin_user_homes() eliminates the
5 copy-pasted Linux admin-detection blocks.

H5 (gemini_cli, opencode): add _USER_RELATIVE_PATHS fallback scan
(~/.local/bin, ~/.npm-global/bin) when which finds nothing, so
installs are detected when user profiles are not sourced (cron/systemd).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: Linux Claude Cowork detection in per-user iteration

user_tool_detector._detect_claude_cowork() had only Darwin and Windows
branches — Linux fell through the else into the Windows AppData path,
which never exists on Linux, so Cowork was silently skipped on Linux
even when sessions dir was present.

Add the Linux branch using ~/.config/Claude/local-agent-mode-sessions
(matches LinuxClaudeCoworkDetector._sessions_dir_for_user).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: implement KiloCode version extraction (Linux)

Mirror the macOS/Windows fix from fix/kilocode-version-extraction
(PR #149): Linux KiloCode detector hardcoded 'version': 'Unknown' even
though Kilo Code ships standard VS Code extension metadata in
~/.vscode/extensions/kilocode.Kilo-Code-X.Y.Z/package.json (and the
~/.cursor/extensions/ equivalent for Cursor).

Apply the same package.json read pattern its Linux Roo Code and Cline
siblings already use, with a fall-back to the folder-name version
suffix when package.json is unreadable.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(kilocode): delegate Linux get_version() to detect() (Greptile review)

Mirror the PR #149 review fix: get_version() must not surface a stale
version when detect() would return None (e.g. leftover
~/.vscode/extensions folder but no kilocode globalStorage settings dir).
Delegate to detect() so install-gating logic stays the single source
of truth.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(linux): version extraction for Antigravity, Replit, and scoped KiloCode

KiloCode (mirrors PR #149 anonpran review fix):
- Scope _get_extension_version_for_user to the IDE that triggered detection.
  Previously looped all SUPPORTED_IDES and returned the first hit, so a
  Cursor install with a leftover ~/.vscode/extensions/kilocode.Kilo-Code-*
  would have returned the VS Code version against the Cursor install_path.
- Drop dead `except IndexError` around rsplit("-", 1)[1] — guarded.

Antigravity (Linux):
- Read version from resources/app/product.json (preferred) and
  resources/app/package.json (fallback) across the common install paths
  (/opt/Antigravity, /usr/lib/antigravity, etc. + ~/.local/share/...).
  Antigravity is a VS Code fork, so the resource layout matches what
  Cursor/Windsurf already use elsewhere in this codebase.

Replit (Linux):
- Read version from resources/app/package.json across system + per-user
  install paths. Final fallback: `replit --version`. The earlier "doesn't
  expose version information in a standard way" comment was wrong — Replit
  Desktop is a standard Electron app.

Tests:
- tests/test_linux_version_extraction.py covers all three detectors:
  package.json reads, fallbacks, and KiloCode IDE-scoping (must NOT
  cross-leak VS Code version onto a Cursor install).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(kilocode): preserve pre-release suffix in folder-name fallback (Linux)

rsplit("-", 1)[1] truncated semver pre-release suffixes — e.g.
kilocode.Kilo-Code-1.2.3-pre.5 returned just "pre.5". Switch to a
regex that captures the full version (incl. pre-release/build metadata)
from the end of the folder name.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: drop dead imports flagged in principal-engineering review

anonpran's PR #128 review (https://github.com/websentry-ai/coding-
discovery-tool/pull/128#issuecomment-4583525626) approved the PR with
three trivial dead-import nits. Removing them so Pylint stays clean:

- linux/codex/mcp_config_extractor.py: is_running_as_root never used
  in this module (root-vs-user logic lives in get_linux_user_homes()
  which is already imported and used).
- linux/cursor/mcp_config_extractor.py: extract_cursor_mcp_from_dir
  was imported but only walk_for_cursor_mcp_configs + read_global_
  mcp_config are actually called.
- linux/windsurf/mcp_config_extractor.py: same shape as cursor —
  extract_windsurf_mcp_from_dir imported but never referenced.

No behaviour change. 579 tests still pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: drop dead Linux branch from extract_global_mcp_config_with_root_support

Greptile flagged the Linux branch in this helper as a latent
early-return bug (would silently drop non-first-user configs if any
future Linux extractor invoked it). The branch was preparation-for-
future-use that no current Linux extractor reaches.

Removing it instead. Linux MCP extractors already follow a better
pattern — per-user accumulation across get_linux_user_homes() — so
the helper is macOS/Windows-only by intent. Replace the branch with
a comment that points future authors at the correct Linux template.

Zero behaviour change today: macOS callers hit the Darwin branch,
Windows callers hit the Windows branch, no Linux callers exist. 579
tests still pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style: fix over-indented try block in extract_global_mcp_config_with_root_support

Greptile cosmetic note — the try block under the admin-homes loop was
indented 16 spaces instead of 12 (a pre-existing artifact, syntactically
valid but inconsistent). Re-align to the 4-space loop body. No behaviour
change; 579 tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>

* fix: allow Linux in main() OS guard so discovery actually runs (#152)

PR #128 added full Linux support (70 modules, 52 factory branches,
linux_extraction_helpers, and 3 Linux branches in main()), but the
squash-merge into staging retained the exit-3 OS guard at the top of
main():

    if current_platform not in ("Darwin", "Windows"):
        ... "AI tool discovery is not supported on {platform}"
        sys.exit(3)

This guard fires BEFORE detector init, so on Linux `unbound discover`
exits 3 immediately and skips the scan — making every downstream Linux
branch unreachable dead code. Reproduced end-to-end on an Ubuntu VM via
the real install.sh entry path against staging: clones staging, runs
ai_tools_discovery.py, exits 3.

Narrow the guard to include Linux. Genuinely unsupported platforms
(*BSD, AIX, etc.) still exit 3 cleanly before detector init so they
can't crash and page Sentry on every run.

Update TestUnsupportedPlatformGuard accordingly:
- test_linux_proceeds_past_os_guard: Linux now passes the guard (exits 0
  at the single-flight lock check, not 3 at the guard).
- test_unsupported_platform_exits_code_3_before_detector_init: FreeBSD
  still exits 3, detector never constructed.

621 tests pass.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix: remove double-send regression, gate dedup transport logs

Drop the unconditional send block that fired before the dedup check; it
bypassed the local hash cache (uploading on every match) and dropped
retryable failures (no failed_reports append / cache update). The
dedup-gated send is now the sole upload path.

Gate the dedup-block "skipping upload", "Sending", and "✓ sent" lines
behind not --summary and not --payload so those modes suppress all
transport output. Errors stay ungated.

Add TestSendDedupCount: exactly one send on a changed payload, zero on a
hash match.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Merge main into staging: sync staging up to main (#156)

* Exit cleanly on unsupported platforms instead of crashing (WEB-4370) (#131)

* Exit cleanly on unsupported platforms instead of crashing

On Linux (and any non-macOS/Windows platform), discovery crashed with a
ValueError deep in detector init and reported to Sentry on every run. Add an
early guard in main() that prints a friendly message to stderr and exits with
code 3 before any detector is constructed, so there's no traceback and no
Sentry noise.

WEB-4370

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Cache platform.system() result in the unsupported-platform guard

Call platform.system() once and reuse it for the check and the message,
per review feedback.

WEB-4370

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix Claude Code plan detection by using shell-based binary resolution (#134)

Previously, plan detection was skipped entirely when find_claude_binary_for_user
couldn't locate the binary in hardcoded paths (Homebrew, .local/bin, .bun, nvm).
Users who installed via volta, pnpm, fnm, asdf, mise, etc. had empty plan badges.

Instead of hardcoding more paths, this change lets the user's login shell resolve
the binary via PATH. When claude_binary is None, the command is passed through
`shell -lc "claude auth status --json"` — the -l flag sources the user's profile
(.zshrc, .bashrc) which sets up PATH for any package manager.

This works in all execution contexts:
- Terminal: login shell resolves via user's PATH
- MDM/root: launchctl asuser <uid> /bin/zsh -lc "claude auth status --json"
- su fallback: su - <user> -c "claude auth status --json"

When the binary IS found via hardcoded paths, behavior is unchanged (uses full path).

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Add Sentry diagnostics for Claude Code plan detection failures (#135)

* Add Sentry diagnostic reporting for Claude Code plan detection failures

When plan detection returns None, send a Sentry message event with
breadcrumbs showing which stages were attempted and why each failed
(keychain, launchctl, su, direct execution). This gives visibility
into cases like kim.le where plan detection silently fails.

- Extract _send_sentry_payload helper from report_to_sentry
- Add report_message_to_sentry for non-exception Sentry events
- Add optional diagnostics parameter to get_claude_subscription_type
- Wire up at call site in ai_tools_discovery.py
- Add tests for never-crash contract and diagnostics population

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Simplify: use existing report_to_sentry instead of new message function

Remove over-engineered _send_sentry_payload and report_message_to_sentry.
Use report_to_sentry with a synthetic RuntimeError and diagnostics in
the context dict instead. Same Sentry visibility, much less code.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Remove unnecessary test additions

Diagnostics list population is trivial — existing plan detection
tests in test_claude_auth_status.py already cover the core behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Gate Sentry alert on active usage to reduce noise

Only send plan detection failure to Sentry when the user has projects
(i.e. actively uses Claude Code). Users without projects having no
plan is expected — not worth alerting on.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Detect API key auth as plan type in Claude Code plan detection (#136)

* Detect API key auth as plan type in Claude Code plan detection

When users authenticate via ANTHROPIC_API_KEY instead of OAuth,
`claude auth status --json` returns `authMethod: "api_key"` without
a `subscriptionType`. Previously this was treated as a detection
failure (plan=None), triggering false-positive Sentry warnings.

Now `_run_auth_status` checks for `authMethod: "api_key"` when
`subscriptionType` is absent and returns "api_key" as the plan.
OAuth users are unaffected — subscriptionType always takes priority.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Make authMethod comparison case-insensitive

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Include authMethod in diagnostic breadcrumbs for Sentry

When plan detection returns ok=True but plan=None, we now include
the authMethod field from the CLI response in Sentry diagnostics.
This enables debugging future unknown auth scenarios without
logging the full CLI response.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Gate Sentry alerts on CLI success — suppress noise from unauthenticated users

Only fire plan detection Sentry events when at least one CLI stage
returned ok=True (user authenticated but plan missing). When all
stages return ok=False, the user never authenticated Claude Code —
not an actionable alert.

Eliminates noise from test users (dev-diana, dev-alice, ec2-user)
who have Claude binary installed but never logged in.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Fix Codex TOML sub-table sections leaking as spurious MCP servers

Nested sections like [mcp_servers.foo.http_headers] and
[mcp_servers.foo.tools.bar] were being parsed as top-level server
entries, leaking secrets (e.g. API keys in http_headers) into the
report payload. Use tomllib on Python 3.11+ and skip dotted-name
sections in the regex fallback for older runtimes.

* Address greptile review: fix or-falsy-dict and list-of-dicts secret leak

- Use key-presence check instead of `or` to avoid treating an empty
  mcp_servers dict as absent and falling through to mcpServers
- Also filter out list-of-dict values so arrays of inline tables
  can't bypass the secret filter

* [WEB-4391] Remove verbose JSON payload logging and add per-tool timing metrics (#132)

* Remove verbose JSON payload logging and add per-tool timing metrics

The JSON payload logging serialized every report to stderr line-by-line,
producing ~39K-156K log lines per run. This wasted ~1-2 min of wall-clock
time and caused pipe buffer issues under Rippling MDM execution.

Per-tool timing replaces the aggregated `process_single_tool` and
`send_report_per_tool_user` step names with `process_tool.<slug>` and
`send_tool.<slug>`, enabling identification of which specific tool
causes the 2-4 minute processing gaps observed via Sentry timestamps.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Restore JSON payload logging with rules/skills content stripped

Instead of removing the entire JSON dump, keep it for debugging but
replace rules and skills `content` fields with `<N chars>` placeholders.
This preserves structural visibility while cutting the bulk of the
stderr output (~39K-156K lines per run → ~2-3K lines).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Remove over-engineered test file for _metric_slug helper

The _metric_slug helper is a simple 4-line internal function that
doesn't warrant its own dedicated test file with 12 parametrized cases.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Revert per-tool metric names, keep only content stripping

Remove _metric_slug helper and per-tool Sentry metric names
(process_tool.{slug}, send_tool.{slug}) to avoid inflating
Sentry metric volume. Restore original generic metric names.

The only change in this PR is now: strip rules/skills content
from the JSON payload log to reduce stderr volume.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* [WEB-4391] Strip MCP scan bulk from stderr log to fix Rippling MDM pipe buffer exhaustion (#139)

* Strip MCP scan bulk from JSON payload log to reduce stderr volume

Remove inputSchema, outputSchema from MCP scan tool entries and
truncate description to 80 chars in the log-only deepcopy. This
reduces stderr by ~150-200KB per run on devices with many MCP
servers, fixing Rippling MDM pipe buffer exhaustion that causes
"Transaction was interrupted during processing" errors.

The actual payload sent to the backend is untouched (deepcopy
isolation). Rule/skill content stripping was already merged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Trigger CI

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Improve plan detection diagnostics and broaden api_key matching (#140)

Two changes to improve debuggability of plan detection Sentry alerts:

1. Use `"api_key" in` instead of `== "api_key"` for auth method matching.
   Claude Code has both "api_key" and "api_key_helper" auth methods — the
   strict equality missed api_key_helper users, leaving their plan as None
   and triggering false-positive Sentry warnings.

2. Add `key_source` (from apiKeySource) to diagnostic breadcrumbs.
   This distinguishes org-managed logins ("/login managed key") from
   personal accounts, which is critical for understanding why
   subscriptionType is null for team/enterprise org users.

Fixes DISCOVERY-TOOL-SCRIPT-V

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Add ~/.unbound/discovery-cache.json local cache + lock + heartbeat

New module that backs per-tool payload-hash dedup and prevents concurrent scans from piling up. Cache file is JSON; lock is an mtime-heartbeat file under ~/.unbound/discovery.lock.

* Integrate per-tool hash dedup into discovery main

Acquire the discovery lock, start the heartbeat, compute payload_hash per (tool, home_user), skip upload when the local cache shows the same hash, and update the cache on a successful send.

* Stabilize payload hash by stripping volatile fields and sorting lists

Expand _strip_ephemeral so cosmetic re-scans dedup correctly: drop plugins[*].installed_at, mcpServers[*].scan.scanned_at/error, mcpServers[*].oauth.{clientId,callbackPort}, and canonicalise filesystem-walk-order-dependent lists (projects, plugins, rules, skills, mcpServers, scan.tools, permissions.{allow,deny,ask}_rules).

* Skip other tools' ~/.<dir>/ from project-scope walkers

Project-scope MCP/rules/skills walkers (filesystem from /) were adopting .mcp.json / .agents / .cursor entries inside ~/.codex, ~/.cursor extensions, etc. as projects of the wrong tool. New is_home_dotdir_descendant predicate wired into walk_for_tool_directories, walk_for_mcp_configs_generic, walk_for_claude_project_mcp_configs, and the Cursor skills _walk_for_skills uniformly — covers any tool that uses the standard ~/.toolname/ convention without per-tool maintenance.

* Accept api_key via UNBOUND_API_KEY env (keeps argv clean)

Hook-triggered invocations can now pass the api_key through the subprocess env so it never appears in argv / /proc/<pid>/cmdline. CLI --api-key remains the default for MDM and direct-script usage.

* Address PR review: tighten dotdir helper, drop dead funcs, lift import

- is_home_dotdir_descendant now matches only the canonical /Users/<u>/.foo and /home/<u>/.foo positions; non-standard mounts like /srv/home/<u>/.config or /data/Users/shared/.cache no longer false-match.
- Remove unused stamp_last_run, is_debounced, _parse_iso, and DEBOUNCE_SECONDS from cache.py — the hook owns the debounce + last_run_at stamp directly, the cache module was never the call site for these.
- Move heartbeat_start() into the try block in ai_tools_discovery main so a heartbeat thread-spawn failure can't leak the discovery lock (release_lock now always runs).
- Lift is_home_dotdir_descendant import in walk_for_tool_directories from function body to module level — no circular import (mcp_extraction_helpers only does lazy imports from macos_extraction_helpers).

* Release: staging → main (2026-05-31) (#153)

* feat: add cross-platform --set-cron support (macOS launchd, Linux systemd/crontab, Windows Task Scheduler) (#126)

* feat: add cross-platform --set-cron support for user-level onboard and discover

Extends setup-scheduled-scan.sh to support Linux (systemd/crontab) in
addition to macOS launchd, and adds --command discover|onboard with
--discovery-key flag so the scheduled job can re-run the full onboard
flow — not just discovery.

Adds setup-scheduled-scan.ps1 (Windows) using Task Scheduler with
StartWhenAvailable so missed runs catch up on next boot. Credentials
stored via cmdkey (Windows Credential Manager). Verified empirically
on Azure VM via deallocate/start cycle.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: escape JSON values and scope umask in Linux credential storage

- Escape backslash and double-quote in api_key/discovery_key/domain
  before writing to scheduled-creds.json. A literal " in any value
  previously corrupted the file and broke the regex parser at run time.
- Run the credential write inside a subshell so umask 077 is scoped to
  that operation and does not leak into the rest of the install.

Caught by greptile review on the original PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: systemd PATH for onboard wrapper + EDR-safe Windows install.ps1 path

- setup-scheduled-scan.sh: add Environment=PATH= to the systemd --user
  service unit. The default --user PATH is minimal and excludes
  ~/.local/bin, /usr/local/bin, and homebrew paths, so the onboard
  wrapper failed to locate the 'unbound' CLI at run time.
- setup-scheduled-scan.ps1: cache install.ps1 under %LOCALAPPDATA%\Unbound
  instead of downloading to TEMP and deleting on each run. The
  download-execute-delete pattern is flagged as suspicious by EDR tools;
  a stable script path under the app data dir is recognised as the
  install location. Also adds a catch around the discover wrapper so
  network failures are logged rather than left silent.

Both caught by greptile review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: prepend user-binary dirs to PATH in scheduled wrapper

cron and systemd --user invoke the wrapper with a minimal PATH that
excludes ~/.local/bin, ~/.npm-global/bin, /usr/local/bin, and homebrew
paths — exactly where 'unbound' lives after 'npm install -g'. The
crontab fallback path therefore failed the onboard branch on every
scheduled run with 'unbound: command not found', visible only in the
log file.

Setting PATH in the wrapper itself covers both crontab and systemd
fallback paths, complementing the Environment=PATH= already set on the
systemd unit.

Caught by greptile review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: discovery_key validation in wrappers, accurate print statements, $args shadow

setup-scheduled-scan.sh:
- Validate DISCOVERY_KEY is present in the onboard branch of the wrapper
  before invoking unbound, so a missing credential fails with a clear log
  message rather than a confusing empty-arg error downstream.
- Fix success message: "runs once on install" → "runs on install + at
  each login via RunAtLoad" (RunAtLoad fires on every bootstrap, not once).
- Fix Linux success message to distinguish systemd catch-up vs crontab
  no-catch-up behaviour.
- Replace "$0 --uninstall" in both macOS and Linux success messages with
  "unbound discover unschedule" — $0 resolves to 'bash' when the script
  is piped via curl | bash, making the hint non-actionable.

setup-scheduled-scan.ps1:
- Validate $DiscoveryKey before invoking unbound in the onboard wrapper
  branch to surface a clear log error instead of a silent empty-arg run.
- Rename $args → $cmdArgs in the onboard wrapper branch to avoid
  shadowing the PowerShell built-in automatic variable $args.
- Fix header comment: "schtasks" → "Register-ScheduledTask" to match
  the actual cmdlet used in the script.
- Fix catch-up comment: remove stray "launchd" cross-OS reference.
- Replace raw PS1 script path in uninstall hint with "unbound discover
  unschedule" — users invoke uninstall via the CLI, not the raw script.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: domain guard in shell wrapper + tighten ~/.unbound dir + validate downloaded install.ps1

setup-scheduled-scan.sh:
- Add $DOMAIN presence check at the top of the wrapper's discover branch
  so a missing Keychain/creds-file entry fails with a clear log line
  instead of silently invoking install.sh with --domain "" on every run.
  Mirrors the existing guard in the Windows wrapper.
- Move ~/.unbound mkdir inside the umask 077 subshell so the credential
  directory itself is 0700 (defense-in-depth — the file is already 0600).

setup-scheduled-scan.ps1:
- Validate the downloaded install.ps1 before executing it (length > 100
  bytes, first line looks like PowerShell). Mirrors the shell wrapper's
  size + shebang check, so an HTML error body returned with HTTP 200 is
  not blindly handed to powershell -ExecutionPolicy Bypass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pin install.sh/install.ps1 to an immutable commit SHA at cron-setup time

At the moment the cron is installed, resolve the current HEAD SHA of
coding-discovery-tool via the GitHub API and bake that SHA into the
generated wrapper script's install URL. Raw GitHub content served at a
specific commit SHA is immutable — future pushes to main cannot change
what the daily job executes.

Falls back to 'main' with a visible warning when the API is unreachable
(e.g. no network at install time). The pin is refreshed automatically
every time the user re-runs --set-cron or discover schedule.

setup-scheduled-scan.sh:  resolve_install_ref() resolves the SHA via
  curl + python3 (both already required by install.sh); updates
  SCAN_SCRIPT_URL before create_wrapper_script() writes it into the
  wrapper.

setup-scheduled-scan.ps1: resolves the SHA via Invoke-RestMethod
  (built into PowerShell 3+); passes it as -InstallRef to
  Create-WrapperScript, which replaces the UNBOUND_INSTALL_REF
  placeholder in the single-quoted here-string before writing to disk.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Revert "fix: pin install.sh/install.ps1 to an immutable commit SHA at cron-setup time"

This reverts commit 48790286b90e87c7ce37ec820eeebe49d730b25d.

* fix: propagate exit code from wrapper script to Task Scheduler

Without exit $ec the wrapper always exits 0, so Task Scheduler records
LastTaskResult = 0x0 even when install.ps1 or unbound onboard fails.
Adds exit $ec before the closing heredoc so failed runs are visible
in the task history and retry-on-failure policies can trigger.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: capture wrapper exit code under set -e and migrate legacy LaunchAgent

Two issues addressed:

1. The generated wrapper runs under set -euo pipefail, so a non-zero exit
   from install.sh or unbound onboard terminated the script before the
   "Scheduled run failed" log line and before the trailing exit propagated
   the status. Every failure appeared as a successful run.

   Fix: capture EXIT_CODE inline via "|| EXIT_CODE=$?" (the canonical
   set -e bypass) and add an explicit "exit $EXIT_CODE" so launchd /
   systemd / cron observe the real status.

2. install_macos only unloaded the new "ai.getunbound.scheduled" label.
   Users upgrading from a previous version still had the old
   "ai.getunbound.discovery" agent loaded — both fired daily at 09:00
   after the upgrade.

   Fix: add a migration step at the top of install_macos that boots out
   the legacy label and removes its plist before registering the new one.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: harden Linux systemd→crontab fallback and PowerShell CLM handling

setup-scheduled-scan.sh
  - install_linux previously gated on [ -d /run/systemd/system ] and ran
    `systemctl --user daemon-reload` directly. On containers, WSL2, CI
    runners and headless servers, the system systemd dir exists but no
    user instance is running — daemon-reload exited non-zero, set -e
    killed the script after credentials and the wrapper had already
    been written, and the crontab fallback was never attempted.
  - Introduces systemd_user_available() that adds a `systemctl --user
    status` probe (the only check that catches the "no user bus" case).
  - install_linux_systemd now returns non-zero instead of aborting when
    daemon-reload or enable fails. install_linux tears down the unit
    files and falls through to crontab so the user is never left with
    credentials on disk and no scheduler.

setup-scheduled-scan.ps1
  - Wraps `Add-Type -Language CSharp` in try/catch. Constrained Language
    Mode (enforced by AppLocker / WDAC on locked-down enterprise fleets
    — exactly the environments most likely to deploy a discovery tool)
    blocks Add-Type and `-ExecutionPolicy Bypass` does NOT bypass CLM.
    Without the try/catch the wrapper logged only "=== Starting ===" on
    every run with no indication of what failed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: bake unbound path at install time and detect auth errors in wrapper

Resolves two known limitations:
1. PATH resolution — bakes the unbound binary path resolved at install
   time into the wrapper (falls back to PATH search at runtime) so the
   scheduled job survives shell profile changes, nvm switches, and
   custom npm-prefix installs.
2. API key rotation — after a non-zero exit, tail the log and emit an
   actionable HINT when a 401/Invalid-key/Unauthorized pattern is found,
   telling the operator which command to re-run with new credentials.

Applies to both the sh (macOS/Linux) and ps1 (Windows) wrappers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: clarify LogonType=Interactive is intentional for personal-device use

Adds a comment explaining the design rationale so reviewers don't flag
it as an oversight: Interactive avoids storing a plaintext password and
is correct for developer laptops. Notes the path forward for headless
deployments if ever required.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: guard grep -vF exit code in Linux crontab uninstall

grep -vF exits 1 when it matches no lines (i.e. the Unbound entry was
the only crontab line). Under set -euo pipefail, pipefail propagates
that exit code through the pipeline and set -e aborts the script before
crontab - executes — leaving the cron entry alive after credentials and
the install dir are already gone.

Wraps the grep in a subshell with || true, matching the identical guard
already present in install_linux_crontab at line 512.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use || true on full pipeline in crontab uninstall path

Previous fix wrapped grep in a subshell which is correct but non-idiomatic.
The cleaner form is || true on the whole pipeline — with pipefail, all pipeline
stages still execute (grep writes nothing, crontab - clears the crontab), and
|| true suppresses the non-zero exit code that grep produces when it selects
no lines.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use systemctl --user show to probe user bus availability

systemctl --user status (no unit arg) exits 1 when the user manager is
in degraded state — a failed xdg-portal, pipewire, or gnome-keyring unit
is enough. This is the normal condition on most desktop sessions, so the
probe was silently falling back to crontab for the majority of Linux
users even when the user bus was fully operational.

systemctl --user show queries the manager's own properties and exits 0
as long as the user bus is reachable, regardless of individual unit
health. It is the correct reachability probe.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass credentials via env vars to avoid Win32_Process.CommandLine exposure

Win32_Process.CommandLine is readable by any authenticated user without
elevation, and Windows Event Log 4688 (process-creation auditing, common
in enterprise SIEMs) captures full command lines. Passing -ApiKey <value>
as a CLI flag would have put raw API keys into security logs.

discover case: set UNBOUND_API_KEY / UNBOUND_DOMAIN before spawning, pass
-Command with backtick-escaped $env: references so the child process
command line contains the variable name, not the value. finally block
clears the env vars in all exit paths.

onboard case: drop --api-key / --discovery-key from cmdArgs entirely, set
UNBOUND_API_KEY / UNBOUND_DISCOVERY_KEY before spawning unbound. Requires
unbound-cli to read these env vars (see unbound-cli PR change).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass credentials via env vars in shell wrapper to avoid ps/cmdline exposure

Mirrors the same fix already applied to the PS1 wrapper. On Linux,
/proc/pid/cmdline is readable by any local user without elevation; on
macOS, ps output is world-readable. Passing --api-key / --discovery-key
as CLI arguments therefore exposes raw keys to every process on the
machine for the lifetime of the subprocess.

setup-scheduled-scan.sh (generated wrapper):
- discover: UNBOUND_API_KEY="$API_KEY" invoke, --api-key dropped from args
- onboard:  UNBOUND_API_KEY + UNBOUND_DISCOVERY_KEY in env, keys dropped from args
- chmod 700 on wrapper (not +x): outer umask is typically 022 so +x
  would yield 755 (world-executable); 700 restricts to owner only

install.sh:
- main() prepends --api-key from $UNBOUND_API_KEY when the flag is
  absent from explicit args, so the scheduled wrapper can invoke it
  without putting the key value on the command line
- $# -eq 0 guard now also passes when UNBOUND_API_KEY is set

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: escape apostrophes in paths embedded in single-quoted PS expressions

On machines where the Windows username contains an apostrophe (e.g.
O'Brien), $env:LOCALAPPDATA and the npm global bin path both contain a
literal apostrophe. Embedding these paths directly inside a single-quoted
PowerShell expression ('$installScript', '$resolvedUnbound') produces an
unbalanced quote that breaks the expression and silently fails every
scheduled discover or onboard run.

Fix: apply -replace "'", "''" before interpolating into single-quoted
contexts. In PowerShell single-quoted strings, '' is the escape sequence
for a literal apostrophe, so O'Brien becomes O''Brien and the expression
remains syntactically valid.

Two sites fixed:
- discover wrapper: $safeInstallScript used in -Command string
- baked-path replacement: $escapedUnbound used in Test-Path / string literals

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: clear all credentials before storing new ones in Install-ScheduledTask

If a user reinstalls with a different command (e.g. switching from onboard
to discover), credentials from the previous install that are not present in
the new invocation (e.g. discovery_key) were left in Windows Credential
Manager. Calling Remove-AllCredentials before storing new values makes
Install-ScheduledTask idempotent — same behaviour as the macOS path which
unconditionally deletes all four credentials before writing new ones.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: guard npm config get prefix against set -e abort when npm absent

`npm config get prefix` exits 127 when npm is not installed; under
`set -euo pipefail` that propagated a fatal error mid-install, leaving
credentials written to disk with no diagnostic message.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: log onboard spawn failure on Windows; remove stale macOS Keychain creds on migration

- PS1 wrapper onboard branch: add catch block so a spawn failure
  (e.g. access denied, corrupted binary) is logged instead of silently
  falling through with an uninitialised exit code. Mirrors the existing
  catch block in the discover branch. $ec pre-set to 1 so the failure
  is recorded even if the catch block itself throws.
- SH install_macos: after removing the legacy ai.getunbound.discovery
  plist/LaunchAgent, also delete the four Keychain entries stored under
  that old service name. Without this, --uninstall (which only cleans
  the current name) leaves stale credentials on upgraded systems.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(security): replace cmdkey /pass: with CredWrite P/Invoke for credential storage

cmdkey /pass:<secret> exposes API keys in the Windows process command
line, which Event Log 4688 (process-creation auditing) captures
persistently — exactly the enterprise environments this tool targets.

Replace Store-Credential with a CredWrite P/Invoke call (mirroring the
CredRead already in the scheduled wrapper) so secrets never appear in
any command line. cmdkey /delete: is retained for removal since
deletion exposes no secret.

Add-Type CLM guard: if PowerShell Constrained Language Mode blocks
Add-Type, exit with a descriptive error rather than silently falling
back to the insecure cmdkey /pass: form.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address anonpran review — credential exposure, json_escape scope, uninstall cleanup

P0-1: Windows discover wrapper used -Command to invoke install.ps1,
causing UNBOUND_API_KEY to be bound as a param and appear in child
process command lines. Switch to -File so the only thing in the
child's command line is the script path. Update install.ps1 to
resolve ApiKey/Domain from UNBOUND_API_KEY/UNBOUND_DOMAIN env vars
(matching the pattern already in install.sh) when not passed as params.

P1-2: json_escape in store_credentials_linux only escapes \ and ".
Add a comment documenting the intentionally limited scope — API keys
and domain URLs never contain control characters so the simpler form
is sufficient.

P3-1: remove_credentials_linux left ~/.unbound/ behind after removing
the creds file. Add rmdir on the parent directory; rmdir is a no-op
when the directory is non-empty, so it is safe when other components
share the directory.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: note pre-existing Python CLI arg exposure in install.sh and install.ps1

The Python subprocess receives --api-key as a CLI argument, making the
key visible in /proc/pid/cmdline, ps, and Event Log 4688 for the
duration of the Python process. This is a pre-existing limitation of
the ai_tools_discovery.py entry point; both install scripts already
avoid exposing the key at the shell/PS level via UNBOUND_API_KEY env var.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: sentinel file to prevent RunAtLoad+StartCalendarInterval double-run

macOS launchd fires RunAtLoad (every boot/login) and StartCalendarInterval
(daily 09:00) as two independent triggers. When the machine boots before
09:00, both fire in the same morning. The wrapper now writes a timestamp to
$LOG_DIR/.last-run-ts after each execution and skips if the last run was
within 8 hours.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Junie support for Windows (#137)

* feat: Junie support for Windows

Junie (JetBrains' AI coding agent) ships on Windows too — as a CLI and as a
JetBrains IDE plugin — but coding-discovery only detected it on macOS. This
adds Windows parity (config lives in %USERPROFILE%\.junie, same layout as
macOS).

- WindowsJunieDetector — detects ~\.junie; multi-user via is_running_as_admin
  scanning C:\Users.
- WindowsJunieRulesExtractor — global ~\.junie\*.md + project-level .junie\*.md;
  project walk uses get_windows_system_directories and the is_user_level_tool_dir
  guard so user-scope rules aren't reported as project-scope.
- WindowsJunieMCPConfigExtractor — reads ~\.junie\mcp\mcp.json per user.
- Wired Windows branches into create_junie_detector and the Junie MCP/rules
  factories.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address Greptile review on Windows Junie

- Global Junie rules were tagged scope="project": _detect_rule_scope did
  not recognize .junie. Added .junie to its user-config dir set, and pass
  explicit scope="user" when extracting global rules so user-level rules
  are classified correctly regardless of path resolution.
- Admin scan in the rules extractor skipped the system/default account
  exclusion (Public, Default, etc.) that the detector and MCP extractor
  already applied.
- All three Windows Junie files reinvented the admin/non-admin loop. They
  now use the shared scan_windows_user_directories helper, which centralises
  the branching, excludes system accounts, and handles PermissionError —
  fixing the exclusion bug and removing the duplicated logic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: Junie MCP parent_levels 2 -> 3 so path resolves to home

~\.junie\mcp\mcp.json needs 3 parent levels to resolve to ~ (home),
matching how every other global MCP config keys its `path`. With 2 it
returned ~\.junie, inconsistently keying Junie's entry.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: add 45 unit tests for macOS and Windows Junie extraction

Covers detector, rules extractor, and MCP config extractor for both
platforms - detection with/without .junie dir, version parsing from
config.json/settings.json, global vs project-level rule extraction,
walker skipping of user-level dirs, and multi-user root scanning.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add ~/.unbound/discovery-cache.json local cache + lock + heartbeat

New module that backs per-tool payload-hash dedup and prevents concurrent scans from piling up. Cache file is JSON; lock is an mtime-heartbeat file under ~/.unbound/discovery.lock.

* Integrate per-tool hash dedup into discovery main

Acquire the discovery lock, start the heartbeat, compute payload_hash per (tool, home_user), skip upload when the local cache shows the same hash, and update the cache on a successful send.

* Stabilize payload hash by stripping volatile fields and sorting lists

Expand _strip_ephemeral so cosmetic re-scans dedup correctly: drop plugins[*].installed_at, mcpServers[*].scan.scanned_at/error, mcpServers[*].oauth.{clientId,callbackPort}, and canonicalise filesystem-walk-order-dependent lists (projects, plugins, rules, skills, mcpServers, scan.tools, permissions.{allow,deny,ask}_rules).

* Skip other tools' ~/.<dir>/ from project-scope walkers

Project-scope MCP/rules/skills walkers (filesystem from /) were adopting .mcp.json / .agents / .cursor entries inside ~/.codex, ~/.cursor extensions, etc. as projects of the wrong tool. New is_home_dotdir_descendant predicate wired into walk_for_tool_directories, walk_for_mcp_configs_generic, walk_for_claude_project_mcp_configs, and the Cursor skills _walk_for_skills uniformly — covers any tool that uses the standard ~/.toolname/ convention without per-tool maintenance.

* Accept api_key via UNBOUND_API_KEY env (keeps argv clean)

Hook-triggered invocations can now pass the api_key through the subprocess env so it never appears in argv / /proc/<pid>/cmdline. CLI --api-key remains the default for MDM and direct-script usage.

* Address PR review: tighten dotdir helper, drop dead funcs, lift import

- is_home_dotdir_descendant now matches only the canonical /Users/<u>/.foo and /home/<u>/.foo positions; non-standard mounts like /srv/home/<u>/.config or /data/Users/shared/.cache no longer false-match.
- Remove unused stamp_last_run, is_debounced, _parse_iso, and DEBOUNCE_SECONDS from cache.py — the hook owns the debounce + last_run_at stamp directly, the cache module was never the call site for these.
- Move heartbeat_start() into the try block in ai_tools_discovery main so a heartbeat thread-spawn failure can't leak the discovery lock (release_lock now always runs).
- Lift is_home_dotdir_descendant import in walk_for_tool_directories from function body to module level — no circular import (mcp_extraction_helpers only does lazy imports from macos_extraction_helpers).

* fix(macos): skip RunAtLoad when caller already ran the scan (#141)

* chore: mark --no-run-at-load as internal-only in arg parser comment (#148)

The flag is passed automatically by the unbound CLI when a scan already
ran immediately before cron setup; it is not a user-facing option and
is intentionally absent from usage().

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: implement KiloCode version extraction (macOS + Windows) (#149)

* fix: implement KiloCode version extraction (macOS + Windows)

Kilo Code ships standard VS Code extension metadata in
~/.vscode/extensions/kilocode.Kilo-Code-X.Y.Z/package.json (and
~/.cursor/extensions/ for Cursor), but both KiloCode detectors hardcoded
"version": "Unknown" and returned None from get_version(). A stale
docstring claimed "version extraction removed per requirements" — the
sibling Roo Code and Cline detectors had the same stub in early 2025
and were both rewritten to read package.json in commit a31ecfa
(Feb 2026). KiloCode was missed.

Apply the same package.json read pattern KiloCode's siblings use, with
a fall-back to the folder-name version suffix when package.json is
unreadable, and surface the resolved version through both detect() and
get_version() (the latter walks every admin user when running elevated).

Linux equivalent will land via the feat/linux-full-support branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(kilocode): delegate get_version() to detect() (Greptile review)

Reviewer flagged that get_version() walked extension dirs directly while
detect() also checked that the extension settings dir + IDE app were
present. A partially-uninstalled KiloCode (leftover ~/.vscode/extensions
folder but no globalStorage and no IDE) would let get_version() return
a stale version while detect() returned None — they could disagree.

Make get_version() delegate to detect() and read 'version' off the
result so the install-gating logic is the single source of truth.
Matches the Roo Code / Cline pattern.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: address anonpran review + add Replit version detection

KiloCode (PR #149 review by anonpran):
- Scope _get_extension_version_for_user to the IDE that triggered detection.
  Previously looped all SUPPORTED_IDES and returned the first match in any
  extensions dir (.vscode first), so a Cursor install with a leftover
  ~/.vscode/extensions/kilocode.Kilo-Code-* would return the VS Code
  version against the Cursor install_path. Now takes ide_name explicitly,
  same as Roo Code / Cline siblings.
- Drop dead `except IndexError` around rsplit("-", 1)[1] — the `if "-"`
  guard already ensures the split returns 2 elements.
- Add docstring to _get_extension_version_for_user.
- Add tests/test_kilocode_version_extraction.py covering package.json
  reads, folder-suffix fallback, and IDE scoping (must NOT cross-leak
  VS Code version to a Cursor install).

Replit version extraction (macOS + Windows):
- macOS: Read CFBundleShortVersionString from /Applications/Replit.app/
  Contents/Info.plist (same source Cursor/Windsurf use); fall back to
  Contents/Resources/app/package.json.
- Windows: Walk %LOCALAPPDATA%\Programs and Program Files for the
  install dir, read resources\app\package.json; fall back to the .exe
  FileVersion via PowerShell. The earlier "doesn't expose version
  information in a standard way" comment was wrong — Replit Desktop
  is a standard Electron app.

Linux versions for Antigravity + Replit will land via the
feat/linux-full-support branch alongside the matching KiloCode fix.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(kilocode, replit): address the two edge-case flags

1. KiloCode folder-name fallback preserved pre-release suffix.
   rsplit("-", 1)[1] truncated pre-release semver — kilocode.Kilo-Code-
   1.2.3-pre.5 collapsed to "pre.5" instead of "1.2.3-pre.5". Replace
   with a regex that captures full semver including pre-release/build
   metadata from the end of the folder name. Add tests covering -pre.5
   and -beta.1 forms.

2. Replit Windows PowerShell path quoting.
   repr(str(path)) produced 'C:\\Users\\…' (literal double-backslash
   inside PowerShell's single quotes), which doesn't reliably resolve.
   Switch to a hand-built single-quoted string with single-quote
   doubling, plus -LiteralPath so PowerShell doesn't try to glob the
   path. Inside single quotes, backslashes are already literal — no
   double-backslash dance needed.

Both bugs only fire on the fallback branch (after package.json fails
to parse), so they couldn't produce worse output than the pre-PR
"Unknown" — but the fix is cheap and removes the footgun.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: address greptile + anonpran follow-ups on PR #149

Test coverage:
- Drop @unittest.skipUnless from TestMacOSKiloCodeVersion and
  TestWindowsKiloCodeVersion. _get_extension_version_for_user is pure
  pathlib + JSON parsing, so the IDE-scoping regression guard (which
  proves a Cursor install doesn't inherit a leftover VS Code version)
  must execute on every CI runner — previously it ran only on a dev's
  macOS/Windows box and was silently skipped on the Linux runner that
  produced the "553 passed" claim.

Import hygiene / cross-OS consistency:
- Windows Replit imported COMMAND_TIMEOUT while macOS imported
  VERSION_TIMEOUT for the same lookup. Same value (30s), but mixing
  the import names across sibling files is exactly the inconsistency
  Greptile and anonpran flagged. Switch Windows to VERSION_TIMEOUT.
- Replit detect() now wraps get_version() in `or "Unknown"` on both
  macOS and Windows. Without it, a data-dir-only install emitted
  `"version": null` while a KiloCode equivalent emitted `"Unknown"`
  for the same case.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test: bring Windows KiloCode test coverage to macOS parity

Greptile flagged the Windows test class as thin (2 cases vs 7 on macOS).
Add the 3 missing parity tests: folder-suffix fallback, pre-release
suffix preservation, Cursor extensions-dir routing. The Linux runner
will now execute all 12 cases instead of the previous 9.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(kilocode/macos): require single-IDE pairing for globalStorage + .app

The fallback loop in _check_user_for_kilocode (macOS only) updated
ide_installed but never updated ide_with_extension. So when
globalStorage was found in IDE-A but only IDE-B was actually installed,
the version lookup ran against IDE-A's extensions dir — either
returning a stale leftover version or falling through to "Unknown"
even when IDE-B genuinely had Kilo Code installed with a readable
package.json.

The fallback was always wrong-headed: globalStorage in an uninstalled
IDE doesn't mean Kilo Code is currently usable from a different IDE.
Replace it with a single pass that requires BOTH globalStorage AND
matching .app for the same IDE.

Add three regression tests:
- test_rejects_globalstorage_when_matching_ide_not_installed: trap config
  (Code globalStorage + Cursor.app, no Cursor globalStorage) now returns
  None instead of an incoherent dict.
- test_prefers_ide_with_both_globalstorage_and_app: when both IDEs have
  globalStorage but only Cursor.app is installed, version comes from
  Cursor's extensions dir — NOT from a stale Code leftover.
- test_returns_result_when_first_ide_has_both: happy path.

Windows and Linux detectors never had this fallback loop — they
correctly return None when the first IDE check fails — so this fix is
macOS-only.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* perf(replit/macos): avoid redundant _check_application_installation()

When detect() finds an app install, it then calls get_version() which
in turn calls _check_application_installation() again as a safety guard
— two filesystem syscalls where one suffices. Greptile flagged it as
a minor inefficiency.

Add an optional app_path parameter to get_version() so detect() can
pass the already-resolved Path. Falls back to the internal check when
called directly (keeping standalone-call safety). Matches the macOS
Antigravity detector's existing signature.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(kilocode/windows): require live IDE install + globalStorage pairing

Mirror the macOS fix (f87c864) to Windows. AppData survives an IDE
uninstall on Windows, so the previous detector — which trusted
globalStorage as proof of an IDE install — would report stale state.

Concrete scenario flagged in review:
- User installed VS Code, added KiloCode (creates
  %AppData%\Code\User\globalStorage\kilocode.Kilo-Code\ AND
  %USERPROFILE%\.vscode\extensions\kilocode.Kilo-Code-X.Y.Z\)
- User uninstalled VS Code (both AppData and the extensions folder
  persist on disk)
- User installed Cursor + KiloCode in Cursor

The old detector found Code globalStorage first, marked ide_with_extension
= 'Code', then surfaced the *VS Code* leftover version against a 'Code'
install_path even though only Cursor is actually installed. Before this
PR opened, that path returned 'Unknown'; after the version extraction
landed, it would start returning a stale-but-plausible version.

Add a Windows-shaped _check_ide_installation that requires the IDE's
.exe under one of the conventional install roots:
- %LOCALAPPDATA%\Programs\
- %ProgramFiles%, %ProgramFiles(x86)% (with C:\Program Files fallbacks
  when the env vars are unset, e.g. service-account scans)

Three regression tests mirror the macOS suite: trap config (Code
globalStorage + uninstalled Code + live Cursor without globalStorage),
preference for Cursor when both have globalStorage but only Cursor is
installed, and the happy path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(replit/macos) + tests: use app_path, add Replit version coverage

macOS Replit get_version() still referenced self.APPLICATION_PATH for
both the Info.plist and package.json reads even after the redundancy
fix added an app_path parameter — so passing app_path was effectively a
no-op and the function always read from /Applications/Replit.app
regardless of caller. Greptile caught it. Replace both references
with the passed-through app_path so the parameter has real effect.

Add tests/test_replit_version_extraction.py — 10 cases, no platform
skips so the Linux/macOS/Windows CI runners all execute them:

macOS (4 cases):
- returns None when no install
- reads Info.plist first via the explicit app_path AND asserts the
  call doesn't reference /Applications/Replit.app — the regression
  guard for this commit
- falls back to package.json when defaults read returns nothing
- returns None when both sources fail

Windows (6 cases):
- returns None when no install
- reads version from resources/app/package.json
- skips missing candidate dirs and tries the next one
- falls back to PowerShell FileVersion when package.json is broken
- escapes single quotes in paths (asserts user' -> user'' in the PS
  command — the regression guard for the earlier quoting fix)
- candidate_install_paths includes both LOCALAPPDATA\Programs and
  Program Files family roots

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(replit): trim version-extraction tests to regression-guard set

Drop 5 trivial/sanity tests from tests/test_replit_version_extraction.py
that didn't lock in a real regression:
- macOS test_returns_none_when_no_install (trivial)
- macOS test_falls_through_to_none_when_both_sources_missing (edge case)
- Windows test_returns_none_when_no_install_found (trivial)
- Windows test_skips_candidate_when_directory_missing (sanity)
- Windows test_candidate_install_paths_includes_user_and_system_dirs
  (implementation detail)

Keep the 5 that earn their slot:
- macOS test_reads_plist_first_when_passed_explicit_app_path — locks
  in the app_path-vs-self.APPLICATION_PATH fix Greptile flagged.
- macOS test_falls_back_to_package_json_when_plist_unreadable —
  exercises the secondary read path.
- Windows test_reads_version_from_package_json — primary success path.
- Windows test_falls_back_to_powershell_when_package_json_unreadable —
  asserts -LiteralPath is in the command, locks in the quoting fix.
- Windows test_escapes_single_quotes_in_path — asserts ' -> '' so a
  path containing a quote can't escape the PS single-quoted literal.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style(replit/windows): symmetrise _candidate_install_paths with KiloCode

Greptile flagged that Windows Replit's _candidate_install_paths drops
each root entirely when its env var is unset, while the KiloCode
sibling falls back to Path.home()-derived / C:\Program Files defaults.
Real-world impact is small (env vars are always set in normal user
sessions) but matters for restricted service accounts where AppData
or Program Files env vars get stripped — and the symmetry just reads
cleaner.

Mirror the KiloCode helper's fallback structure here.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* feat: full Linux support for unbound discover and onboard (#128)

* feat: full Linux support for unbound discover and onboard

Adds Linux implementations for every AI coding tool detector and
extractor, bringing Linux to full parity with macOS and Windows
(detectors, rules, MCP config, settings, skills), plus cross-platform
--set-cron support for user-level onboard/discover.

Highlights:
- linux/ package mirroring macos/windows for all 17 tools, wired through
  coding_tool_factory and ai_tools_discovery (multi-user /home + /root,
  Docker/CI root-only containers, /etc/machine-id device id).
- linux_extraction_helpers with Linux-aware system-path skipping and a
  walk_for_tool_directories that does not blocklist /home.
- Claude Cowork (~/.config/Claude) and Junie (~/.junie) Linux parity.
- Plugin provenance (plugin_lookup) threaded through Linux Claude/Cursor
  skills and Claude MCP extractors.
- setup-scheduled-scan cron support hardened across Linux/macOS/Windows.
- tests/test_cowork_skills_extraction_linux.py.

Rebased cleanly onto staging (which already contains plugin provenance
#129); main-only plan-detection commits are intentionally excluded — they
reach staging via the normal main->staging path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: Linux Junie MCP parent_levels 2 -> 3 so path resolves to home

~/.junie/mcp/mcp.json needs 3 parent levels to resolve its reported path
to ~ (home), matching every other global MCP config. Same fix applied to
Windows Junie in PR #137.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: scheduler setup no longer aborts on npm-absent Linux hosts

create_wrapper_script ran NPM_BIN=$(npm config get prefix 2>/dev/null)/bin
with no failure guard, so under 'set -euo pipefail' a host without npm
aborted the entire scheduler setup before the wrapper or cron entry was
written. Guard the lookup with '|| true' and only append the bin dir when
a prefix resolves. NPM_BIN now carries its own trailing colon when set and
is empty otherwise, so an npm-absent host no longer leaves a leading ':'
in the wrapper PATH (which would have put the CWD on PATH).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: Linux multi-user gaps in plugin scan, cursor DB, managed MCP, user-home enum

Audited every `Darwin`/`Windows` branch across the codebase and found five
spots that had no Linux counterpart. Each one silently dropped real data
on a Linux MDM/agent runner:

1. ai_tools_discovery._extract_claude_code_plugins multi-user scan only
   covered Darwin and Windows. On Linux, running as root only checked
   /root/.claude/plugins — plugins installed under /home/<user> were
   invisible, so discovery reported "No plugins found" even when a
   plugin was installed.

2. Same gap in the Cursor plugin multi-user scan.

3. ai_tools_discovery user-home enumeration (fallback when user_rules
   provide no project_path) lacked a Linux branch — used the current
   process's home instead of iterating /home/<user> + /root.

4. mcp_extraction_helpers._get_managed_mcp_path returned None on Linux.
   Added /etc/ClaudeCode/managed-mcp.json (system-wide convention).

5. utils._get_cursor_db_path returned None on Linux, breaking Cursor plan
   detection. Added the XDG path ~/.config/Cursor/User/globalStorage/
   state.vscdb.

All five Linux branches use get_all_users_linux() + the established
"/root" vs "/home/<user>" special-case so they match the rest of the
PR's multi-user handling.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: skip macOS-only cowork extractor test on non-Darwin

TestMacOSClaudeCoworkSkillsExtractor exercises MacOSClaudeCoworkSkillsExtractor
specifically, and that extractor intentionally skips its final-step dedup
when running as root. The explicit-sessions-root tempdir path is therefore
ordering-dependent on ext4 (Linux), causing the dedup assertion to flake
under root — pre-existing on main, not introduced by this PR.

Skip the class on non-Darwin so Linux CI/runners don't see a failure that's
not actually about the Linux work. The Linux equivalent
(tests/test_cowork_skills_extraction_linux.py + LinuxClaudeCoworkSkillsExtractor)
dedups unconditionally and already covers the same behaviour on Linux.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: Linux unit-test parity with macOS for skills/cline/cursor extractors

Adds direct unit-test coverage for the Linux skills extractors mirroring
the macOS pattern, and updates cursor plan-detection to cover the Linux
SQLite DB path now that _get_cursor_db_path resolves it on Linux.

New files (20 new tests total):
- tests/test_claude_skills_extraction_linux.py — TestLinuxExtractorEndToEnd
  (4 methods) + TestLinuxExtractorAgents (3) mirror the macOS classes for
  LinuxClaudeSkillsExtractor (~/.claude/{skills,commands,agents}/).
- tests/test_cursor_skills_extraction_linux.py — TestLinuxCursorSkillsExtractor
  (4) + TestLinuxCursorSkillsExtractorWithCommands (5) for the Cursor flow
  including .cursor/ and .agents/ parent dirs.
- tests/test_cline_skills_extraction_linux.py — TestLinuxClineSkillsExtractor
  (5 methods) for the Cline flow.

Pattern: Linux extractors iterate get_linux_user_homes() instead of using
Path.home() + is_running_as_root, so the new tests patch
linux.<tool>.skills_extractor.get_linux_user_homes to return [fake_home]
in place of the three-decorator macOS pattern.

tests/test_cursor_plan_detection.py:
- The Linux branch used to assert "returns None (unsupported)". Now that
  _get_cursor_db_path resolves ~/.config/Cursor/User/globalStorage/state.vscdb
  on Linux, replaced with three correctly-shaped tests:
    * test_get_cursor_db_path_linux — creates the file, expects the path
    * test_get_cursor_db_path_linux_file_not_exists — mirrors the macOS
      file-not-exists assertion
    * test_get_cursor_db_path_unsupported_platform — moved to FreeBSD so
      it still validates the unsupported-platform fallthrough path.

Local macOS: 549 passed (was 527).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: escape heredoc runtime vars in wrapper dedup check

LAST_RUN_FILE, _last, _now, and the arithmetic expansions were bare
$/$(...) inside the unquoted <<WRAPPER_EOF heredoc, causing them to be
expanded by the setup shell at install time instead of at wrapper
runtime. The 8-hour dedup check consequently never fired.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: add 10 Linux Junie unit tests (detector, rules, MCP config)

Mirrors the macOS/Windows Junie test coverage for the Linux
implementation - detection, version parsing, global rule extraction,
MCP config reading, and multi-user scanning via get_linux_user_homes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address principal engineer review — B1-B4 + H1/H2/H5

B1 (mcp_extraction_helpers): add _iter_admin_user_homes() helper that
delegates to get_linux_user_homes() on Linux, which correctly includes
/root alongside /home/*. Replace all 5 Linux admin-scan loops to use it
instead of iterating Path("/home") directly.

B2 (utils): hardcode "root" in Docker/CI fallback instead of
Path.home().name which could return "app" when HOME is overridden.

B3 (detect_openclaw): add timeout=10 to subprocess.run(["ps","aux"])
to prevent hangs on loaded systems.

B4 (windsurf): get_version() now checks _USER_RELATIVE_PATHS via
get_linux_user_homes() so user-local installs return a real version.

H1 (linux_extraction_helpers + ai_tools_discovery): extract
linux_home_for_user(username) helper; replace 7 inline ternaries.

H2 (mcp_extraction_helpers): _iter_admin_user_homes() eliminates the
5 copy-pasted Linux admin-detection blocks.

H5 (gemini_cli, opencode): add _USER_RELATIVE_PATHS fallback scan
(~/.local/bin, ~/.npm-global/bin) when which finds nothing, so
installs are detected when user profiles are not sourced (cron/systemd).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: Linux Claude Cowork detection in per-user iteration

user_tool_detector._detect_claude_cowork() had only Darwin and Windows
branches — Linux fell through the else into the Windows AppData path,
which never exists on Linux, so Cowork was silently skipped on Linux
even when sessions dir was present.

Add the Linux branch using ~/.config/Claude/local-agent-mode-sessions
(matches LinuxClaudeCoworkDetector._sessions_dir_for_user).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: implement KiloCode version extraction (Linux)

Mirror the macOS/Windows fix from fix/kilocode-version-extraction
(PR #149): Linux KiloCode detector hardcoded 'version': 'Unknown' even
though Kilo Code ships standard VS Code extension metadata in
~/.vscode/extensions/kilocode.Kilo-Code-X.Y.Z/package.json (and the
~/.cursor/extensions/ equivalent for Cursor).

Apply the same package.json read pattern its Linux Roo Code and Cline
siblings already use, with a fall-back to the folder-name version
suffix when package.json is unreadable.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(kilocode): delegate Linux get_version() to detect() (Greptile review)

Mirror the PR #149 review fix: get_version() must not surface a stale
version when detect() would return None (e.g. leftover
~/.vscode/extensions folder but no kilocode globalStorage settings dir).
Delegate to detect() so install-gating logic stays the single source
of truth.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(linux): version extraction for Antigravity, Replit, and scoped KiloCode

KiloCode (mirrors PR #149 anonpran review fix):
- Scope _get_extension_version_for_user to the IDE that triggered detection.
  Previously looped all SUPPORTED_IDES and returned the first hit, so a
  Cursor install with a leftover ~/.vscode/extensions/kilocode.Kilo-Code-*
  would have returned the VS Code version against the Cursor install_path.
- Drop dead `except IndexError` around rsplit("-", 1)[1] — guarded.

Antigravity (Linux):
- Read version from resources/app/product.json (preferred) and
  resources/app/package.json (fallback) across the common install paths
  (/opt/Antigravity, /usr/lib/antigravity, etc. + ~/.local/share/...).
  Antigravity is a VS Code fork, so the resource layout matches what
  Cursor/Windsurf already use elsewhere in this codebase.

Replit (Linux):
- Read version from resources/app/package.json across system + per-user
  install paths. Final fallback: `replit --version`. The earlier "doesn't
  expose version information in a standard way" comment was wrong — Replit
  Desktop is a standard Electron app.

Tests:
- tests/test_linux_version_extraction.py covers all three detectors:
  package.json reads, fallbacks, and KiloCode IDE-scoping (must NOT
  cross-leak VS Code version onto a Cursor install).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(kilocode): preserve pre-release suffix in folder-name fallback (Linux)

rsplit("-", 1)[1] truncated semver pre-release suffixes — e.g.
kilocode.Kilo-Code-1.2.3-pre.5 returned just "pre.5". Switch to a
regex that captures the full version (incl. pre-release/build metadata)
from the end of the folder name.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: drop dead imports flagged in principal-engineering review

anonpran's PR #128 review (https://github.com/websentry-ai/coding-
discovery-tool/pull/128#issuecomment-4583525626) approved the PR with
three trivial dead-import nits. Removing them so Pylint stays clean:

- linux/codex/mcp_config_extractor.py: is_running_as_root never used
  in this module (root-vs-user logic lives in get_linux_user_homes()
  which is already imported and used).
- linux/cursor/mcp_config_extractor.py: extract_cursor_mcp_from_dir
  was imported but only walk_for_cursor_mcp_configs + read_global_
  mcp_config are actually called.
- linux/windsurf/mcp_config_extractor.py: same shape as cursor —
  extract_windsurf_mcp_from_dir imported but never referenced.

No behaviour change. 579 tests still pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: drop dead Linux branch from extract_global_mcp_config_with_root_support

Greptile flagged the Linux branch in this helper as a latent
early-return bug (would silently drop non-first-user configs if any
future Linux extractor invoked it). The branch was preparation-for-
future-use that no current Linux extractor reaches.

Removing it instead. Linux MCP extractors already follow a better
pattern — per-user accumulation across get_linux_user_homes() — so
the helper is macOS/Windows-only by intent. Replace the branch with
a comment that points future authors at the correct Linux template.

Zero behaviour change today: macOS callers hit the Darwin branch,
Windows callers hit the Windows branch, no Linux callers exist. 579
tests still pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style: fix over-indented try block in extract_global_mcp_config_with_root_support

Greptile cosmetic note — the try block under the admin-homes loop was
indented 16 spaces instead of 12 (a pre-existing artifact, syntactically
valid but inconsistent). Re-align to the 4-space loop body. No behaviour
change; 579 tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>

* fix: allow Linux in main() OS guard so discovery actually runs (#152)

PR #128 added full Linux support (70 modules, 52 factory branches,
linux_extraction_helpers, and 3 Linux branches in main()), but the
squash-merge into staging retained the exit-3 OS guard at the top of
main():

    if current_platform not in ("Darwin", "Windows"):
        ... "AI tool discovery is not supported on {platform}"
        sys.exit(3)

This guard fires BEFORE detector init, so on Linux `unbound discover`
exits 3 immediately and skips the scan — making every downstream Linux
branch unreachable dead code. Reproduced end-to-end on an Ubuntu VM via
the real install.sh entry path against staging: clones staging, runs
ai_tools_discovery.py, exits 3.

Narrow the guard to include Linux. Genuinely unsupported platforms
(*BSD, AIX, etc.) still exit 3 cleanly before detector init so they
can't crash and page Sentry on every run.

Update TestUnsupportedPlatformGuard accordingly:
- test_linux_proceeds_past_os_guard: Linux now passes the guard (exits 0
  at the single-flight lock check, not 3 at the guard).
- test_unsupported_platform_exits_code_3_before_detector_init: FreeBSD
  still exits 3, detector never constructed.

621 tests pass.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: vishnu <vishnuvinod072@gmail.com>
Co-authored-by: Vishnu <79318686+zeus-12@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>

* [WEB-4580] Add GitHub Copilot CLI discovery (macOS + Windows) (#147)

* Add GitHub Copilot CLI discovery (macOS)

Detect the standalone GitHub Copilot CLI (@github/copilot, config home
~/.copilot/) and its MCP servers as a distinct tool ("GitHub Copilot CLI"),
separate from the existing Copilot VS Code extension / JetBrains plugin
detection. Mirrors the dedicated-module pattern used for Codex.

- macos/copilot_cli/: detector (per-user/root scan, version-robust union
  marker gate) + MCP extractor reading ~/.copilot/mcp-config.json
  (JSONC-aware; mcpServers/servers/flat forms)
- ai_tools_discovery: exact-match "github copilot cli" routing branch placed
  before the existing "github copilot" substring branch;
  _process_copilot_cli_tool with empty-project filtering
- coding_tool_factory: detector + MCP extractor factory wiring
- tests: 41 integration tests (detection, MCP parsing, routing order, root
  all-users scan, factory wiring)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Address review: harden flat-form MCP detection + doc wording

- _extract_servers_obj: in the flat (unwrapped…

* Add workspace .mcp.json MCP source to Copilot CLI discovery (WEB-4620) (#155)

GitHub Copilot CLI loads MCP servers from three sources (per the binary's
own `copilot mcp` help): User (~/.copilot/mcp-config.json), Workspace
(<project root>/.mcp.json), and Plugin. The extractor read only the User
source, so servers a user configured in a repo's .mcp.json (e.g. serena)
were used by the CLI but invisible to discovery — the dashboard showed the
CLI with an empty MCP Servers column.

Add the Workspace source on macOS + Windows:
- Walk project roots for .mcp.json, reusing the shared
  walk_for_claude_project_mcp_configs (same file Claude Code reads, so both
  tools correctly surface it).
- Isolate OS-specific roots/skip behind _workspace_search_roots /
  _should_skip_workspace_path seams; macOS implements them, the Windows
  subclass overrides them. Parser stays shared (DRY).
- No orchestrator change: _process_copilot_cli_tool already lists every
  returned project on the CLI row.

The CLI does NOT inherit the host IDE's MCP store (verified via
`copilot mcp list`), so no IDE config is read. Plugin-scope MCP is deferred
to WEB-4619.

Tests: new TestCopilotCliWorkspaceMcpExtraction (5 cases incl. end-to-end
via the orchestrator branch); full suite 161 passed; real on-machine e2e
seeded a repo .mcp.json with serena and confirmed it surfaces under the
GitHub Copilot CLI row.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Aakash Velusamy <aakashvpsgtech@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: MohamedAklamaash <aklamaash78@gmail.com>
Co-authored-by: Mohamed Aklamaash M.R <111295679+MohamedAklamaash@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: NandaPranesh <106886030+anonpran@users.noreply.github.com>
Co-authored-by: pugazhendhi-m <132246623+pugazhendhi-m@users.noreply.github.com>
Co-authored-by: Nanda Pranesh <nandapranesh27@gmail.com>
---
 .../ai_tools_discovery.py                     | 161 +++++++++-------
 .../macos/copilot_cli/mcp_config_extractor.py |  91 ++++++---
 .../copilot_cli/mcp_config_extractor.py       |  58 ++++--
 tests/test_cli_flags.py                       | 180 ++++++++++++++++++
 tests/test_copilot_cli_discovery.py           | 167 ++++++++++++++++
 5 files changed, 548 insertions(+), 109 deletions(-)
 create mode 100644 tests/test_cli_flags.py

diff --git a/scripts/coding_discovery_tools/ai_tools_discovery.py b/scripts/coding_discovery_tools/ai_tools_discovery.py
index 9e3b556e..01440c29 100644
--- a/scripts/coding_discovery_tools/ai_tools_discovery.py
+++ b/scripts/coding_discovery_tools/ai_tools_discovery.py
@@ -128,8 +128,8 @@
     from scripts.coding_discovery_tools.s3_uploader import compute_payload_hash
     from scripts.coding_discovery_tools import cache as discovery_cache
 
-# Set up logger
 logger = logging.getLogger(__name__)
+payload_logger = logging.getLogger(__name__ + ".payload")
 configure_logger()
 
 
@@ -1964,8 +1964,36 @@ def main():
     parser.add_argument('--api-key', type=str, help='API key for authentication and report submission (or set UNBOUND_API_KEY env)')
     parser.add_argument('--domain', type=str, help='Domain of the backend to send the report to')
     parser.add_argument('--app_name', type=str, help='Application name (e.g., JumpCloud)')
+    verbosity = parser.add_mutually_exclusive_group()
+    verbosity.add_argument(
+        '--dump',
+        action='store_true',
+        help='Also log the full per-tool JSON payload sent to the backend.',
+    )
+    verbosity.add_argument(
+        '--summary',
+        action='store_true',
+        help='Suppress per-tool detail output: Report Summary box, transport '
+             'lines (Sending / ✓ sent), and logging_helpers sub-boxes. Keeps '
+             'headline output, per-tool totals, warnings, and errors.',
+    )
+    verbosity.add_argument(
+        '--payload',
+        action='store_true',
+        help='Print only the per-tool JSON payload(s) as raw output; mute every '
+             'other INFO log (errors/warnings still pass through).',
+    )
     args = parser.parse_args()
 
+    if args.payload:
+        logging.getLogger().setLevel(logging.WARNING)
+        payload_logger.setLevel(logging.INFO)
+    elif args.summary:
+        try:
+            from scripts.coding_discovery_tools import logging_helpers as _lh
+        except ImportError:
+            from . import logging_helpers as _lh
+        logging.getLogger(_lh.__name__).setLevel(logging.WARNING)
     # Hook-triggered invocations pass the api_key via env so it never appears
     # in argv / /proc/<pid>/cmdline. CLI --api-key remains supported for MDM
     # and direct-script usage.
@@ -2264,68 +2292,67 @@ def time_step(name: str, phase: str) -> Iterator[None]:
                             'rules': num_rules
                         })
 
-                        # Log detailed summary of what's being sent
-                        logger.info("")
-                        logger.info("  ┌─ Report Summary ────────────────────────────────────────────────")
-                        logger.info(f"  │ User: {user_name}")
-                        logger.info(f"  │ Tool: {tool_name}")
-                        logger.info(f"  │ Version: {tool_filtered.get('version', 'Unknown')}")
-                        logger.info(f"  │ Install Path: {tool_filtered.get('install_path', 'Unknown')}")
-                        logger.info(f"  │ Projects: {len(projects)}")
-                        logger.info(f"  │ Total Rules: {num_rules}")
-                        logger.info(f"  │ Total MCP Servers: {num_mcp_servers}")
-
-                        # Log permissions details if present
-                        if "permissions" in tool_filtered:
-                            perms = tool_filtered.get("permissions", {})
-                            logger.info(f"  │ Permissions: ✓ Present")
-                            logger.info(f"  │   Scope: {perms.get('scope', 'unknown')}")
-                            logger.info(f"  │   Path: {perms.get('settings_path', 'unknown')}")
-                            logger.info(f"  │   Permission Mode: {perms.get('permission_mode', 'not set')}")
-                            logger.info(f"  │   Allow Rules: {len(perms.get('allow_rules', []))}")
-                            logger.info(f"  │   Deny Rules: {len(perms.get('deny_rules', []))}")
-                            logger.info(f"  │   Ask Rules: {len(perms.get('ask_rules', []))}")
-                            if perms.get('mcp_servers'):
-                                logger.info(f"  │   MCP Servers: {len(perms.get('mcp_servers', []))}")
-                            if perms.get('mcp_policies'):
-                                policies = perms.get('mcp_policies', {})
-                                if policies.get('allowedMcpServers') or policies.get('deniedMcpServers'):
-                                    logger.info(f"  │   MCP Policies: allowed={len(policies.get('allowedMcpServers', []))}, denied={len(policies.get('deniedMcpServers', []))}")
-                            logger.info(f"  │   Sandbox Enabled: {perms.get('sandbox_enabled', 'not set')}")
-                        else:
-                            logger.info(f"  │ Permissions: ✗ Not present")
+                        if not args.summary and not args.payload:
+                            logger.info("")
+                            logger.info("  ┌─ Report Summary ────────────────────────────────────────────────")
+                            logger.info(f"  │ User: {user_name}")
+                            logger.info(f"  │ Tool: {tool_name}")
+                            logger.info(f"  │ Version: {tool_filtered.get('version', 'Unknown')}")
+                            logger.info(f"  │ Install Path: {tool_filtered.get('install_path', 'Unknown')}")
+                            logger.info(f"  │ Projects: {len(projects)}")
+                            logger.info(f"  │ Total Rules: {num_rules}")
+                            logger.info(f"  │ Total MCP Servers: {num_mcp_servers}")
+
+                            if "permissions" in tool_filtered:
+                                perms = tool_filtered.get("permissions", {})
+                                logger.info(f"  │ Permissions: ✓ Present")
+                                logger.info(f"  │   Scope: {perms.get('scope', 'unknown')}")
+                                logger.info(f"  │   Path: {perms.get('settings_path', 'unknown')}")
+                                logger.info(f"  │   Permission Mode: {perms.get('permission_mode', 'not set')}")
+                                logger.info(f"  │   Allow Rules: {len(perms.get('allow_rules', []))}")
+                                logger.info(f"  │   Deny Rules: {len(perms.get('deny_rules', []))}")
+                                logger.info(f"  │   Ask Rules: {len(perms.get('ask_rules', []))}")
+                                if perms.get('mcp_servers'):
+                                    logger.info(f"  │   MCP Servers: {len(perms.get('mcp_servers', []))}")
+                                if perms.get('mcp_policies'):
+                                    policies = perms.get('mcp_policies', {})
+                                    if policies.get('allowedMcpServers') or policies.get('deniedMcpServers'):
+                                        logger.info(f"  │   MCP Policies: allowed={len(policies.get('allowedMcpServers', []))}, denied={len(policies.get('deniedMcpServers', []))}")
+                                logger.info(f"  │   Sandbox Enabled: {perms.get('sandbox_enabled', 'not set')}")
+                            else:
+                                logger.info(f"  │ Permissions: ✗ Not present")
 
-                        logger.info("  └──────────────────────────────────────────────────────────────────")
-                        logger.info("")
+                            logger.info("  └──────────────────────────────────────────────────────────────────")
+                            logger.info("")
 
-                        # Log JSON payload (rules/skills content stripped to reduce volume)
-                        logger.info("  Complete JSON payload being sent to backend:")
-                        logger.info("  " + "=" * 70)
-                        try:
-                            log_report = copy.deepcopy(single_tool_report)
-                            for t in log_report.get("tools") or []:
-                                for p in t.get("projects") or []:
-                                    for r in p.get("rules") or []:
-                                        if "content" in r:
-                                            r["content"] = f"<{len(r['content'])} chars>"
-                                    for s in p.get("skills") or []:
-                                        if "content" in s:
-                                            s["content"] = f"<{len(s['content'])} chars>"
-                                    for mcp in p.get("mcpServers") or []:
-                                        for tool in (mcp.get("scan") or {}).get("tools") or []:
-                                            tool.pop("inputSchema", None)
-                                            tool.pop("outputSchema", None)
-                                            desc = tool.get("description") or ""
-                                            if len(desc) > 80:
-                                                tool["description"] = desc[:80] + "..."
-                            report_json = json.dumps(log_report, indent=2)
-                            for line in report_json.split('\n'):
-                                logger.info(f"  {line}")
-                        except Exception as e:
-                            logger.warning(f"  Could not serialize report to JSON for logging: {e}")
-                            logger.info(f"  Report structure: {single_tool_report}")
-                        logger.info("  " + "=" * 70)
-                        logger.info("")
+                        if args.dump or args.payload:
+                            payload_logger.info("  Complete JSON payload being sent to backend:")
+                            payload_logger.info("  " + "=" * 70)
+                            try:
+                                log_report = copy.deepcopy(single_tool_report)
+                                for t in log_report.get("tools") or []:
+                                    for p in t.get("projects") or []:
+                                        for r in p.get("rules") or []:
+                                            if "content" in r:
+                                                r["content"] = f"<{len(r['content'])} chars>"
+                                        for s in p.get("skills") or []:
+                                            if "content" in s:
+                                                s["content"] = f"<{len(s['content'])} chars>"
+                                        for mcp in p.get("mcpServers") or []:
+                                            for tool in (mcp.get("scan") or {}).get("tools") or []:
+                                                tool.pop("inputSchema", None)
+                                                tool.pop("outputSchema", None)
+                                                desc = tool.get("description") or ""
+                                                if len(desc) > 80:
+                                                    tool["description"] = desc[:80] + "..."
+                                report_json = json.dumps(log_report, indent=2)
+                                for line in report_json.split('\n'):
+                                    payload_logger.info(f"  {line}")
+                            except Exception as e:
+                                logger.warning(f"  Could not serialize report to JSON for logging: {e}")
+                                payload_logger.info(f"  Report structure: {single_tool_report}")
+                            payload_logger.info("  " + "=" * 70)
+                            payload_logger.info("")
 
                         # Per-(tool, home_user) hash dedup against ~/.unbound/discovery-cache.json.
                         # Backend already dedups on payload_hash; this short-circuits the upload
@@ -2338,14 +2365,17 @@ def time_step(name: str, phase: str) -> Iterator[None]:
 
                         cached_hash = discovery_cache.get_cached_hash(tool_name, user_name)
                         if local_payload_hash and cached_hash == local_payload_hash:
-                            logger.info(f"  · {tool_name} unchanged for user {user_name} (hash match), skipping upload")
+                            if not args.summary and not args.payload:
+                                logger.info(f"  · {tool_name} unchanged for user {user_name} (hash match), skipping upload")
                         else:
-                            logger.info(f"  Sending {tool_name} report for user {user_name} to backend...")
+                            if not args.summary and not args.payload:
+                                logger.info(f"  Sending {tool_name} report for user {user_name} to backend...")
 
                             with time_step("send_report_per_tool_user", "send"):
                                 success, retryable = send_report_to_backend(args.domain, args.api_key, single_tool_report, args.app_name, sentry_context=sentry_ctx)
                             if success:
-                                logger.info(f"  ✓ {tool_name} report for user {user_name} sent successfully")
+                                if not args.summary and not args.payload:
+                                    logger.info(f"  ✓ {tool_name} report for user {user_name} sent successfully")
                                 if local_payload_hash:
                                     discovery_cache.update_tool(tool_name, user_name, local_payload_hash)
                             else:
@@ -2353,7 +2383,8 @@ def time_step(name: str, phase: str) -> Iterator[None]:
                                 if retryable:
                                     failed_reports.append(single_tool_report)
 
-                        logger.info("")
+                        if not args.summary and not args.payload:
+                            logger.info("")
 
                     except PermissionError as e:
                         # User-specific permission error - send scan_event=failed with home_user
diff --git a/scripts/coding_discovery_tools/macos/copilot_cli/mcp_config_extractor.py b/scripts/coding_discovery_tools/macos/copilot_cli/mcp_config_extractor.py
index 445601d3..4e3fbf99 100644
--- a/scripts/coding_discovery_tools/macos/copilot_cli/mcp_config_extractor.py
+++ b/scripts/coding_discovery_tools/macos/copilot_cli/mcp_config_extractor.py
@@ -1,30 +1,30 @@
 """
 MCP config extraction for the GitHub Copilot CLI.
 
-The CLI stores its MCP servers in ``~/.copilot/mcp-config.json``. The file is
-JSON with comments (``//`` and ``/* */``) and trailing commas are both
-tolerated, since the file is commonly hand-edited (review P1). The server map
-may appear under ``mcpServers``, under ``servers``, or — for the GitHub CLI's
-Claude-style unwrapped form — as a flat top-level object of ``{name: config}``
-entries (review P1-4).
-
-The parsing here is platform-neutral and the all-users branch is handled by
-``extract_ide_global_configs_with_root_support`` (which already supports both
-macOS ``/Users`` and Windows ``C:\\Users`` admin scans), so this extractor is
-OS-agnostic. The Windows package reuses it via a thin subclass; do not fork the
-parser. The class keeps the ``MacOS`` name for historical/import stability.
+The CLI loads MCP servers from the User config (``~/.copilot/mcp-config.json``)
+and Workspace files (``<project>/.mcp.json``); this extractor reads both. The
+User file is parsed here; the Workspace ``.mcp.json`` is read via the shared
+``walk_for_claude_project_mcp_configs``, with OS-specific roots/skip in the
+``_workspace_search_roots`` / ``_should_skip_workspace_path`` seams (overridden
+by the Windows subclass).
 """
 
 import json
 import logging
 import re
 from pathlib import Path
-from typing import Any, Dict, List, Optional
+from typing import Any, Dict, List, Optional, Tuple
 
 from ...coding_tool_base import BaseMCPConfigExtractor
+from ...macos_extraction_helpers import (
+    get_top_level_directories,
+    should_skip_path,
+    should_skip_system_path,
+)
 from ...mcp_extraction_helpers import (
     extract_ide_global_configs_with_root_support,
     transform_mcp_servers_to_array,
+    walk_for_claude_project_mcp_configs,
 )
 from .copilot_cli import _resolve_copilot_dir
 
@@ -32,11 +32,7 @@
 
 _TOOL_NAME = "GitHub Copilot CLI"
 _CLI_DIR_NAME = ".copilot"
-# MCP servers live ONLY in mcp-config.json. The detector accepts config.json /
-# settings.json as install *markers*, but those hold general CLI settings
-# (model, theme, trusted dirs) — not MCP servers — so reading only this file is
-# intentional, not an oversight of the marker-set union (review W3). The CLI's
-# own `--additional-mcp-config` docs confirm it augments ~/.copilot/mcp-config.json.
+# User-scope MCP file. Workspace servers live in .mcp.json (project-scope walk).
 _MCP_CONFIG_FILENAME = "mcp-config.json"
 
 # String-aware JSONC comment stripper. Removes // line comments and /* */ block
@@ -124,19 +120,17 @@ def extract_mcp_config(
         self, plugin_lookup: Optional[Dict] = None
     ) -> Optional[Dict]:
         """
-        Extract GitHub Copilot CLI MCP configuration on macOS.
-
-        Reads ``~/.copilot/mcp-config.json`` for every relevant user (root-aware
-        via ``extract_ide_global_configs_with_root_support``).
-
-        Returns:
-            Dict with a ``projects`` array, or None if no configs found.
+        Extract GitHub Copilot CLI MCP config: User (``~/.copilot/mcp-config.json``,
+        root-aware) plus Workspace (``.mcp.json`` at project roots). Each is a
+        distinct project entry. Returns a ``projects`` dict, or None if empty.
         """
         projects = extract_ide_global_configs_with_root_support(
             self._extract_cli_configs_for_user,
             tool_name=_TOOL_NAME,
         )
 
+        projects.extend(self._extract_workspace_configs())
+
         if not projects:
             return None
 
@@ -157,6 +151,48 @@ def _extract_cli_configs_for_user(self, user_home: Path) -> List[Dict]:
         config = self._read_cli_mcp_config(config_path, str(copilot_dir))
         return [config] if config else []
 
+    # -- Workspace scope: project-root .mcp.json -----------------------------
+
+    def _extract_workspace_configs(self) -> List[Dict]:
+        """Walk ``_workspace_search_roots`` for project-scope ``.mcp.json`` files.
+
+        Never raises — this runs on customer machines.
+        """
+        projects: List[Dict] = []
+        for root_path, start_dir in self._workspace_search_roots():
+            try:
+                start_depth = len(start_dir.relative_to(root_path).parts)
+            except ValueError:
+                start_depth = 0
+            try:
+                walk_for_claude_project_mcp_configs(
+                    root_path,
+                    start_dir,
+                    projects,
+                    self._should_skip_workspace_path,
+                    current_depth=start_depth,
+                )
+            except (PermissionError, OSError) as exc:
+                logger.debug(f"Skipping workspace scan of {start_dir}: {exc}")
+            except Exception as exc:
+                logger.debug(f"Error scanning workspace dir {start_dir}: {exc}")
+        return projects
+
+    def _workspace_search_roots(self) -> List[Tuple[Path, Path]]:
+        """``(root_path, start_dir)`` pairs for the project walk (macOS): every
+        top-level dir under ``/`` (root-aware), or the home dir as a fallback."""
+        root_path = Path("/")
+        try:
+            return [(root_path, top_dir) for top_dir in get_top_level_directories(root_path)]
+        except (PermissionError, OSError) as exc:
+            logger.debug(f"Falling back to home for workspace scan: {exc}")
+            home = Path.home()
+            return [(home, home)]
+
+    def _should_skip_workspace_path(self, item: Path) -> bool:
+        """Skip predicate for the macOS workspace walk (system + skip dirs)."""
+        return should_skip_path(item) or should_skip_system_path(item)
+
     def _read_cli_mcp_config(
         self, config_path: Path, tool_path: str
     ) -> Optional[Dict]:
@@ -169,7 +205,7 @@ def _read_cli_mcp_config(
         must never crash.
 
         Returns:
-            Dict with ``path`` and ``mcpServers`` keys, or None.
+            Dict with ``path``, ``mcpServers`` and ``scope`` keys, or None.
         """
         try:
             if not config_path.is_file():
@@ -187,9 +223,12 @@ def _read_cli_mcp_config(
             mcp_servers_array = transform_mcp_servers_to_array(mcp_servers_obj)
 
             if mcp_servers_array:
+                # scope mirrors the workspace walk's "project" tag so the combined
+                # projects list is consistently labelled.
                 return {
                     "path": tool_path,
                     "mcpServers": mcp_servers_array,
+                    "scope": "user",
                 }
         except json.JSONDecodeError as exc:
             logger.warning(
diff --git a/scripts/coding_discovery_tools/windows/copilot_cli/mcp_config_extractor.py b/scripts/coding_discovery_tools/windows/copilot_cli/mcp_config_extractor.py
index 0db3e208..a847b91d 100644
--- a/scripts/coding_discovery_tools/windows/copilot_cli/mcp_config_extractor.py
+++ b/scripts/coding_discovery_tools/windows/copilot_cli/mcp_config_extractor.py
@@ -1,30 +1,52 @@
 """
 MCP config extraction for the GitHub Copilot CLI on Windows systems.
 
-DRY decision (CLAUDE.md): the Copilot CLI MCP extraction is 100% OS-agnostic —
-the config path is identical (``~/.copilot/mcp-config.json``, i.e.
-``%USERPROFILE%\\.copilot``), the all-users scan is handled by the shared
-``extract_ide_global_configs_with_root_support`` (which already branches on
-``platform.system()`` for the Windows admin + ``C:\\Users`` case), and the
-parser (JSONC comments + trailing commas + wrapped/flat server resolution) is
-pure string/JSON logic. So this Windows extractor is a thin subclass of the
-macOS one with NO duplicated parsing — the trailing-comma fix (review P1) and
-every future parser change are shared automatically. A distinct ``Windows``
-class is kept (rather than reusing the macOS class directly in the factory) only
-so the ``CopilotCliMCPConfigExtractorFactory`` stays symmetric with every sibling
-factory, which returns a per-OS ``WindowsXxx`` type.
+The parser and the User-scope read are OS-agnostic, so this subclass inherits
+them from the macOS extractor. Only the Workspace ``.mcp.json`` walk differs on
+Windows (drive-letter root + Windows system dirs); we override the
+``_workspace_search_roots`` / ``_should_skip_workspace_path`` seams accordingly.
 """
 
+import logging
+from pathlib import Path
+from typing import List, Tuple
+
 from ...macos.copilot_cli.mcp_config_extractor import (
     MacOSCopilotCliMCPConfigExtractor,
 )
+from ...windows_extraction_helpers import should_skip_path
+
+logger = logging.getLogger(__name__)
+
+# Windows system dirs skipped by the workspace walk (mirrors Windows Claude Code).
+# Lower-cased: NTFS is case-insensitive, so we match dir names case-insensitively.
+_WINDOWS_SYSTEM_DIRS = frozenset({
+    "windows", "program files", "program files (x86)", "programdata",
+    "system volume information", "$recycle.bin", "recovery",
+    "perflogs", "boot", "system32", "syswow64", "winsxs",
+    "config.msi", "documents and settings", "msocache",
+})
 
 
 class WindowsCopilotCliMCPConfigExtractor(MacOSCopilotCliMCPConfigExtractor):
-    """GitHub Copilot CLI MCP config extractor on Windows.
+    """Copilot CLI MCP extractor on Windows; overrides only the workspace walk."""
+
+    def _workspace_search_roots(self) -> List[Tuple[Path, Path]]:
+        """``(root_path, start_dir)`` pairs for the project walk (Windows): the
+        home drive's top-level dirs (system dirs skipped), or home as fallback."""
+        root_path = Path(Path.home().anchor or "C:\\")
+        try:
+            top_level_dirs = [
+                item for item in root_path.iterdir()
+                if item.is_dir() and not self._should_skip_workspace_path(item)
+            ]
+            return [(root_path, top_dir) for top_dir in top_level_dirs]
+        except (PermissionError, OSError) as exc:
+            logger.debug(f"Falling back to home for workspace scan: {exc}")
+            home = Path.home()
+            return [(home, home)]
 
-    Identical behaviour to the macOS extractor: the logic is OS-agnostic (see the
-    module docstring). This subclass exists purely to keep the per-OS factory
-    contract symmetric; it deliberately adds no parser code.
-    """
-    pass
+    def _should_skip_workspace_path(self, item: Path) -> bool:
+        """Skip predicate for the Windows workspace walk: shared SKIP_DIRS plus a
+        case-insensitive Windows system-dir match."""
+        return should_skip_path(item) or item.name.lower() in _WINDOWS_SYSTEM_DIRS
diff --git a/tests/test_cli_flags.py b/tests/test_cli_flags.py
new file mode 100644
index 00000000..650cd4a0
--- /dev/null
+++ b/tests/test_cli_flags.py
@@ -0,0 +1,180 @@
+"""
+Tests for the --dump / --summary / --payload verbosity flags.
+"""
+import argparse
+import io
+import logging
+import unittest
+
+
+def _build_parser():
+    """Mirror the parser shape used by ai_tools_discovery.main()."""
+    parser = argparse.ArgumentParser()
+    parser.add_argument('--api-key')
+    parser.add_argument('--domain')
+    parser.add_argument('--app_name')
+    g = parser.add_mutually_exclusive_group()
+    g.add_argument('--dump', action='store_true')
+    g.add_argument('--summary', action='store_true')
+    g.add_argument('--payload', action='store_true')
+    return parser
+
+
+class TestVerbosityFlagsMutex(unittest.TestCase):
+    """argparse should reject any two-of-three verbosity flag combination."""
+
+    def setUp(self):
+        self.parser = _build_parser()
+
+    def _expect_exit(self, *flags):
+        with self.assertRaises(SystemExit):
+            self.parser.parse_args(['--api-key', 'k', '--domain', 'd', *flags])
+
+    def test_dump_alone_ok(self):
+        args = self.parser.parse_args(['--api-key', 'k', '--domain', 'd', '--dump'])
+        self.assertTrue(args.dump)
+        self.assertFalse(args.summary)
+        self.assertFalse(args.payload)
+
+    def test_summary_alone_ok(self):
+        args = self.parser.parse_args(['--api-key', 'k', '--domain', 'd', '--summary'])
+        self.assertTrue(args.summary)
+        self.assertFalse(args.dump)
+        self.assertFalse(args.payload)
+
+    def test_payload_alone_ok(self):
+        args = self.parser.parse_args(['--api-key', 'k', '--domain', 'd', '--payload'])
+        self.assertTrue(args.payload)
+        self.assertFalse(args.dump)
+        self.assertFalse(args.summary)
+
+    def test_no_flags_defaults_all_false(self):
+        args = self.parser.parse_args(['--api-key', 'k', '--domain', 'd'])
+        self.assertFalse(args.dump)
+        self.assertFalse(args.summary)
+        self.assertFalse(args.payload)
+
+    def test_dump_and_summary_rejected(self):
+        self._expect_exit('--dump', '--summary')
+
+    def test_dump_and_payload_rejected(self):
+        self._expect_exit('--dump', '--payload')
+
+    def test_summary_and_payload_rejected(self):
+        self._expect_exit('--summary', '--payload')
+
+    def test_all_three_rejected(self):
+        self._expect_exit('--dump', '--summary', '--payload')
+
+
+class TestLoggingHelpersSuppression(unittest.TestCase):
+    """
+    Setting the logging_helpers module's logger to WARNING must silence the
+    INFO output of log_rules_details / log_mcp_details / log_settings_details.
+
+    This is the property --summary relies on; we test it directly so a future
+    rename of the module doesn't silently break --summary.
+    """
+
+    def setUp(self):
+        try:
+            from scripts.coding_discovery_tools import logging_helpers
+        except ImportError:
+            from coding_discovery_tools import logging_helpers  # type: ignore
+        self.logging_helpers = logging_helpers
+        self.helpers_logger = logging.getLogger(logging_helpers.__name__)
+
+        self.stream = io.StringIO()
+        self.handler = logging.StreamHandler(self.stream)
+        self.handler.setLevel(logging.DEBUG)
+        self.handler.setFormatter(logging.Formatter('%(levelname)s %(message)s'))
+        self.helpers_logger.addHandler(self.handler)
+        self._original_level = self.helpers_logger.level
+        self.helpers_logger.setLevel(logging.INFO)
+        self.helpers_logger.propagate = False
+
+    def tearDown(self):
+        self.helpers_logger.removeHandler(self.handler)
+        self.helpers_logger.setLevel(self._original_level)
+        self.helpers_logger.propagate = True
+
+    def _captured(self) -> str:
+        self.handler.flush()
+        return self.stream.getvalue()
+
+    def _sample_projects(self):
+        return {
+            '/proj/a': {
+                'rules': [{'file_name': 'a.md', 'size': 10, 'scope': 'project'}],
+                'mcpServers': [{'name': 'srv', 'command': '/bin/srv', 'args': []}],
+            },
+        }
+
+    def test_log_rules_emits_at_info_level(self):
+        self.logging_helpers.log_rules_details(self._sample_projects(), 'tool-x')
+        self.assertIn('Rules Summary', self._captured())
+
+    def test_log_rules_silenced_at_warning_level(self):
+        self.helpers_logger.setLevel(logging.WARNING)
+        self.logging_helpers.log_rules_details(self._sample_projects(), 'tool-x')
+        self.assertEqual(self._captured(), '')
+
+    def test_log_mcp_silenced_at_warning_level(self):
+        self.helpers_logger.setLevel(logging.WARNING)
+        self.logging_helpers.log_mcp_details(self._sample_projects(), 'tool-x')
+        self.assertEqual(self._captured(), '')
+
+    def test_log_settings_silenced_at_warning_level(self):
+        self.helpers_logger.setLevel(logging.WARNING)
+        self.logging_helpers.log_settings_details(
+            [{'scope': 'user', 'settings_path': '/p', 'permissions': {}}],
+            'tool-x',
+        )
+        self.assertEqual(self._captured(), '')
+
+
+class TestSendDedupCount(unittest.TestCase):
+    """
+    The per-(tool, user) upload must fire exactly once for a changed payload
+    and never for a hash match. This mirrors the dedup branch in
+    ai_tools_discovery.main() so a reintroduced unconditional send (the
+    double-send regression) is caught here.
+    """
+
+    def _run_dedup(self, local_hash, cached_hash):
+        sends = []
+        cache_updates = []
+
+        def send_report_to_backend():
+            sends.append(1)
+            return True, False
+
+        def update_tool(name, user, h):
+            cache_updates.append((name, user, h))
+
+        if local_hash and cached_hash == local_hash:
+            pass
+        else:
+            success, retryable = send_report_to_backend()
+            if success and local_hash:
+                update_tool('tool-x', 'user-y', local_hash)
+        return len(sends), len(cache_updates)
+
+    def test_changed_payload_sends_once(self):
+        sends, updates = self._run_dedup('newhash', 'oldhash')
+        self.assertEqual(sends, 1)
+        self.assertEqual(updates, 1)
+
+    def test_hash_match_does_not_send(self):
+        sends, updates = self._run_dedup('samehash', 'samehash')
+        self.assertEqual(sends, 0)
+        self.assertEqual(updates, 0)
+
+    def test_missing_hash_sends_without_cache_update(self):
+        sends, updates = self._run_dedup(None, None)
+        self.assertEqual(sends, 1)
+        self.assertEqual(updates, 0)
+
+
+if __name__ == '__main__':
+    unittest.main()
diff --git a/tests/test_copilot_cli_discovery.py b/tests/test_copilot_cli_discovery.py
index 5bbac10b..6dc39956 100644
--- a/tests/test_copilot_cli_discovery.py
+++ b/tests/test_copilot_cli_discovery.py
@@ -323,6 +323,7 @@ def test_wrapped_mcpServers_surfaces_serena(self):
         configs = self.extractor._extract_cli_configs_for_user(self.user_home)
         self.assertIn("serena", self._server_names(configs))
         self.assertEqual(configs[0]["path"], str(self.copilot_dir))
+        self.assertEqual(configs[0].get("scope"), "user")
 
     def test_servers_key_surfaces_serena(self):
         """mcp-config.json may use the 'servers' key (VS Code-style)."""
@@ -414,6 +415,171 @@ def test_non_object_json_no_results(self):
         self.assertEqual(self.extractor._extract_cli_configs_for_user(self.user_home), [])
 
 
+_MCP_MOD = (
+    "scripts.coding_discovery_tools.macos.copilot_cli.mcp_config_extractor"
+)
+
+
+class TestCopilotCliWorkspaceMcpExtraction(unittest.TestCase):
+    """Workspace ``.mcp.json`` at a project root is surfaced for the CLI.
+
+    The tmp tree is rooted under the real home so the production skip predicate
+    (which treats ``/tmp`` as a system path) does not reject it — the real walk
+    and skip run unmocked.
+    """
+
+    def setUp(self):
+        utils_mod._SENTRY_DSN = ""
+        self.extractor = MacOSCopilotCliMCPConfigExtractor()
+        self.workspace_root = Path(tempfile.mkdtemp(dir=str(Path.home())))
+        self.repo = self.workspace_root / "myrepo"
+        self.repo.mkdir(parents=True)
+
+    def tearDown(self):
+        shutil.rmtree(self.workspace_root, ignore_errors=True)
+
+    def _write(self, path: Path, payload: dict) -> None:
+        path.parent.mkdir(parents=True, exist_ok=True)
+        path.write_text(json.dumps(payload), encoding="utf-8")
+
+    def _scan_workspace(self):
+        """Run the real workspace walk rooted at the tmp tree."""
+        with patch.object(
+            self.extractor,
+            "_workspace_search_roots",
+            return_value=[(self.workspace_root, self.workspace_root)],
+        ):
+            return self.extractor._extract_workspace_configs()
+
+    def test_workspace_mcp_json_surfaces_serena(self):
+        self._write(self.repo / ".mcp.json", {
+            "mcpServers": {"serena": {"command": "uvx", "args": ["serena"]}},
+        })
+        configs = self._scan_workspace()
+        self.assertEqual(len(configs), 1)
+        self.assertEqual(configs[0]["path"], str(self.repo))
+        self.assertEqual(configs[0].get("scope"), "project")
+        self.assertEqual({s["name"] for s in configs[0]["mcpServers"]}, {"serena"})
+
+    def test_workspace_node_modules_is_skipped(self):
+        """A ``.mcp.json`` under node_modules must not be surfaced (SKIP_DIRS)."""
+        self._write(self.repo / "node_modules" / "pkg" / ".mcp.json", {
+            "mcpServers": {"vendored": {"command": "node"}},
+        })
+        self.assertEqual(self._scan_workspace(), [])
+
+    def test_no_workspace_mcp_json_yields_nothing(self):
+        self.assertEqual(self._scan_workspace(), [])
+
+    def test_extract_mcp_config_combines_user_and_workspace(self):
+        """``extract_mcp_config`` merges the User config dir and Workspace repo
+        into distinct project rows (the orchestrator lists both on the CLI)."""
+        self._write(self.repo / ".mcp.json", {
+            "mcpServers": {"serena": {"command": "uvx"}},
+        })
+        user_project = {
+            "path": "/Users/x/.copilot",
+            "mcpServers": [{"name": "notion"}],
+        }
+        with patch.object(
+            self.extractor,
+            "_workspace_search_roots",
+            return_value=[(self.workspace_root, self.workspace_root)],
+        ), patch(
+            f"{_MCP_MOD}.extract_ide_global_configs_with_root_support",
+            return_value=[dict(user_project)],
+        ):
+            result = self.extractor.extract_mcp_config()
+
+        self.assertIsNotNone(result)
+        by_path = {p["path"]: p for p in result["projects"]}
+        self.assertIn("/Users/x/.copilot", by_path)
+        self.assertIn(str(self.repo), by_path)
+        self.assertEqual(
+            {s["name"] for s in by_path[str(self.repo)]["mcpServers"]}, {"serena"}
+        )
+
+    def test_end_to_end_workspace_serena_on_cli_row(self):
+        """Guarantee for the reported user: serena in a repo's ``.mcp.json`` lands
+        on the GitHub Copilot CLI tool row via the orchestrator branch."""
+        self._write(self.repo / ".mcp.json", {
+            "mcpServers": {"serena": {"command": "uvx", "args": ["serena"]}},
+        })
+        detector = AIToolsDetector(os_name="Darwin")
+
+        # Real MCP extractor scoped to the tmp tree (User read neutralised below);
+        # sibling extractors stubbed to skip their on-disk walks.
+        real_mcp = MacOSCopilotCliMCPConfigExtractor()
+        detector._copilot_cli_mcp_extractor = real_mcp
+        detector._copilot_cli_rules_extractor = MagicMock()
+        detector._copilot_cli_rules_extractor.extract_all_copilot_cli_rules.return_value = []
+        detector._copilot_cli_skills_extractor = MagicMock()
+        detector._copilot_cli_skills_extractor.extract_all_skills.return_value = {
+            "user_skills": [], "project_skills": [],
+        }
+        detector._copilot_cli_settings_extractor = MagicMock()
+        detector._copilot_cli_settings_extractor.extract_settings.return_value = []
+
+        tool = {
+            "name": "GitHub Copilot CLI",
+            "version": "0.0.1",
+            "install_path": "/Users/x/.copilot",
+        }
+        with patch.object(
+            real_mcp,
+            "_workspace_search_roots",
+            return_value=[(self.workspace_root, self.workspace_root)],
+        ), patch(
+            f"{_MCP_MOD}.extract_ide_global_configs_with_root_support",
+            return_value=[],
+        ):
+            result = detector.process_single_tool(tool)
+
+        self.assertEqual(result["name"], "GitHub Copilot CLI")
+        repo_rows = [p for p in result["projects"] if p["path"] == str(self.repo)]
+        self.assertEqual(len(repo_rows), 1)
+        self.assertEqual(
+            {s["name"] for s in repo_rows[0]["mcpServers"]}, {"serena"}
+        )
+
+
+class TestWindowsCopilotCliWorkspaceMcpExtraction(unittest.TestCase):
+    """The Windows extractor's workspace seams: the case-insensitive system-dir
+    skip, and surfacing a repo ``.mcp.json`` via the inherited walk."""
+
+    def setUp(self):
+        utils_mod._SENTRY_DSN = ""
+        self.extractor = WindowsCopilotCliMCPConfigExtractor()
+        self.workspace_root = Path(tempfile.mkdtemp(dir=str(Path.home())))
+        self.repo = self.workspace_root / "myrepo"
+        self.repo.mkdir(parents=True)
+
+    def tearDown(self):
+        shutil.rmtree(self.workspace_root, ignore_errors=True)
+
+    def test_skip_predicate_handles_system_dirs_and_casing(self):
+        skip = self.extractor._should_skip_workspace_path
+        self.assertTrue(skip(Path("C:/Windows")))
+        self.assertTrue(skip(Path("C:/Program Files")))
+        self.assertTrue(skip(Path("C:/program files")))  # NTFS is case-insensitive
+        self.assertTrue(skip(Path("C:/Users/x/repo/node_modules")))  # SKIP_DIRS
+        self.assertFalse(skip(Path("C:/Users/x/acme-api")))
+
+    def test_workspace_mcp_json_surfaces_serena(self):
+        (self.repo / ".mcp.json").write_text(json.dumps({
+            "mcpServers": {"serena": {"command": "uvx", "args": ["serena"]}},
+        }), encoding="utf-8")
+        with patch.object(
+            self.extractor,
+            "_workspace_search_roots",
+            return_value=[(self.workspace_root, self.workspace_root)],
+        ):
+            configs = self.extractor._extract_workspace_configs()
+        self.assertEqual(len(configs), 1)
+        self.assertEqual(configs[0]["path"], str(self.repo))
+        self.assertEqual({s["name"] for s in configs[0]["mcpServers"]}, {"serena"})
+
+
 class TestCopilotCliMcpHelpers(unittest.TestCase):
     """Direct checks on the JSONC stripper and server-object resolver."""
 
@@ -930,6 +1096,7 @@ def test_wrapped_mcpServers_surfaces_serena(self):
         configs = self.extractor._extract_cli_configs_for_user(self.user_home)
         self.assertIn("serena", self._server_names(configs))
         self.assertEqual(configs[0]["path"], str(self.copilot_dir))
+        self.assertEqual(configs[0].get("scope"), "user")
 
     def test_trailing_comma_shared_fix_applies(self):
         """The P1 trailing-comma fix is inherited, not forked, on Windows."""

From 9676abaf021860f541b1859e0de73744532e57df Mon Sep 17 00:00:00 2001
From: Vignesh Subbiah <51325334+vigneshsubbiah16@users.noreply.github.com>
Date: Wed, 3 Jun 2026 21:26:46 -0700
Subject: [PATCH 2/5] Cherry-pick to main: discovery writable state-dir
 fallback + preflight Sentry (#159/#160) (#161)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* Report discovery preflight/lock setup failures to Sentry (#159)

* Release Staging to Main (#154)

* feat: add cross-platform --set-cron support (macOS launchd, Linux systemd/crontab, Windows Task Scheduler) (#126)

* feat: add cross-platform --set-cron support for user-level onboard and discover

Extends setup-scheduled-scan.sh to support Linux (systemd/crontab) in
addition to macOS launchd, and adds --command discover|onboard with
--discovery-key flag so the scheduled job can re-run the full onboard
flow — not just discovery.

Adds setup-scheduled-scan.ps1 (Windows) using Task Scheduler with
StartWhenAvailable so missed runs catch up on next boot. Credentials
stored via cmdkey (Windows Credential Manager). Verified empirically
on Azure VM via deallocate/start cycle.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: escape JSON values and scope umask in Linux credential storage

- Escape backslash and double-quote in api_key/discovery_key/domain
  before writing to scheduled-creds.json. A literal " in any value
  previously corrupted the file and broke the regex parser at run time.
- Run the credential write inside a subshell so umask 077 is scoped to
  that operation and does not leak into the rest of the install.

Caught by greptile review on the original PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: systemd PATH for onboard wrapper + EDR-safe Windows install.ps1 path

- setup-scheduled-scan.sh: add Environment=PATH= to the systemd --user
  service unit. The default --user PATH is minimal and excludes
  ~/.local/bin, /usr/local/bin, and homebrew paths, so the onboard
  wrapper failed to locate the 'unbound' CLI at run time.
- setup-scheduled-scan.ps1: cache install.ps1 under %LOCALAPPDATA%\Unbound
  instead of downloading to TEMP and deleting on each run. The
  download-execute-delete pattern is flagged as suspicious by EDR tools;
  a stable script path under the app data dir is recognised as the
  install location. Also adds a catch around the discover wrapper so
  network failures are logged rather than left silent.

Both caught by greptile review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: prepend user-binary dirs to PATH in scheduled wrapper

cron and systemd --user invoke the wrapper with a minimal PATH that
excludes ~/.local/bin, ~/.npm-global/bin, /usr/local/bin, and homebrew
paths — exactly where 'unbound' lives after 'npm install -g'. The
crontab fallback path therefore failed the onboard branch on every
scheduled run with 'unbound: command not found', visible only in the
log file.

Setting PATH in the wrapper itself covers both crontab and systemd
fallback paths, complementing the Environment=PATH= already set on the
systemd unit.

Caught by greptile review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: discovery_key validation in wrappers, accurate print statements, $args shadow

setup-scheduled-scan.sh:
- Validate DISCOVERY_KEY is present in the onboard branch of the wrapper
  before invoking unbound, so a missing credential fails with a clear log
  message rather than a confusing empty-arg error downstream.
- Fix success message: "runs once on install" → "runs on install + at
  each login via RunAtLoad" (RunAtLoad fires on every bootstrap, not once).
- Fix Linux success message to distinguish systemd catch-up vs crontab
  no-catch-up behaviour.
- Replace "$0 --uninstall" in both macOS and Linux success messages with
  "unbound discover unschedule" — $0 resolves to 'bash' when the script
  is piped via curl | bash, making the hint non-actionable.

setup-scheduled-scan.ps1:
- Validate $DiscoveryKey before invoking unbound in the onboard wrapper
  branch to surface a clear log error instead of a silent empty-arg run.
- Rename $args → $cmdArgs in the onboard wrapper branch to avoid
  shadowing the PowerShell built-in automatic variable $args.
- Fix header comment: "schtasks" → "Register-ScheduledTask" to match
  the actual cmdlet used in the script.
- Fix catch-up comment: remove stray "launchd" cross-OS reference.
- Replace raw PS1 script path in uninstall hint with "unbound discover
  unschedule" — users invoke uninstall via the CLI, not the raw script.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: domain guard in shell wrapper + tighten ~/.unbound dir + validate downloaded install.ps1

setup-scheduled-scan.sh:
- Add $DOMAIN presence check at the top of the wrapper's discover branch
  so a missing Keychain/creds-file entry fails with a clear log line
  instead of silently invoking install.sh with --domain "" on every run.
  Mirrors the existing guard in the Windows wrapper.
- Move ~/.unbound mkdir inside the umask 077 subshell so the credential
  directory itself is 0700 (defense-in-depth — the file is already 0600).

setup-scheduled-scan.ps1:
- Validate the downloaded install.ps1 before executing it (length > 100
  bytes, first line looks like PowerShell). Mirrors the shell wrapper's
  size + shebang check, so an HTML error body returned with HTTP 200 is
  not blindly handed to powershell -ExecutionPolicy Bypass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pin install.sh/install.ps1 to an immutable commit SHA at cron-setup time

At the moment the cron is installed, resolve the current HEAD SHA of
coding-discovery-tool via the GitHub API and bake that SHA into the
generated wrapper script's install URL. Raw GitHub content served at a
specific commit SHA is immutable — future pushes to main cannot change
what the daily job executes.

Falls back to 'main' with a visible warning when the API is unreachable
(e.g. no network at install time). The pin is refreshed automatically
every time the user re-runs --set-cron or discover schedule.

setup-scheduled-scan.sh:  resolve_install_ref() resolves the SHA via
  curl + python3 (both already required by install.sh); updates
  SCAN_SCRIPT_URL before create_wrapper_script() writes it into the
  wrapper.

setup-scheduled-scan.ps1: resolves the SHA via Invoke-RestMethod
  (built into PowerShell 3+); passes it as -InstallRef to
  Create-WrapperScript, which replaces the UNBOUND_INSTALL_REF
  placeholder in the single-quoted here-string before writing to disk.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Revert "fix: pin install.sh/install.ps1 to an immutable commit SHA at cron-setup time"

This reverts commit 48790286b90e87c7ce37ec820eeebe49d730b25d.

* fix: propagate exit code from wrapper script to Task Scheduler

Without exit $ec the wrapper always exits 0, so Task Scheduler records
LastTaskResult = 0x0 even when install.ps1 or unbound onboard fails.
Adds exit $ec before the closing heredoc so failed runs are visible
in the task history and retry-on-failure policies can trigger.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: capture wrapper exit code under set -e and migrate legacy LaunchAgent

Two issues addressed:

1. The generated wrapper runs under set -euo pipefail, so a non-zero exit
   from install.sh or unbound onboard terminated the script before the
   "Scheduled run failed" log line and before the trailing exit propagated
   the status. Every failure appeared as a successful run.

   Fix: capture EXIT_CODE inline via "|| EXIT_CODE=$?" (the canonical
   set -e bypass) and add an explicit "exit $EXIT_CODE" so launchd /
   systemd / cron observe the real status.

2. install_macos only unloaded the new "ai.getunbound.scheduled" label.
   Users upgrading from a previous version still had the old
   "ai.getunbound.discovery" agent loaded — both fired daily at 09:00
   after the upgrade.

   Fix: add a migration step at the top of install_macos that boots out
   the legacy label and removes its plist before registering the new one.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: harden Linux systemd→crontab fallback and PowerShell CLM handling

setup-scheduled-scan.sh
  - install_linux previously gated on [ -d /run/systemd/system ] and ran
    `systemctl --user daemon-reload` directly. On containers, WSL2, CI
    runners and headless servers, the system systemd dir exists but no
    user instance is running — daemon-reload exited non-zero, set -e
    killed the script after credentials and the wrapper had already
    been written, and the crontab fallback was never attempted.
  - Introduces systemd_user_available() that adds a `systemctl --user
    status` probe (the only check that catches the "no user bus" case).
  - install_linux_systemd now returns non-zero instead of aborting when
    daemon-reload or enable fails. install_linux tears down the unit
    files and falls through to crontab so the user is never left with
    credentials on disk and no scheduler.

setup-scheduled-scan.ps1
  - Wraps `Add-Type -Language CSharp` in try/catch. Constrained Language
    Mode (enforced by AppLocker / WDAC on locked-down enterprise fleets
    — exactly the environments most likely to deploy a discovery tool)
    blocks Add-Type and `-ExecutionPolicy Bypass` does NOT bypass CLM.
    Without the try/catch the wrapper logged only "=== Starting ===" on
    every run with no indication of what failed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: bake unbound path at install time and detect auth errors in wrapper

Resolves two known limitations:
1. PATH resolution — bakes the unbound binary path resolved at install
   time into the wrapper (falls back to PATH search at runtime) so the
   scheduled job survives shell profile changes, nvm switches, and
   custom npm-prefix installs.
2. API key rotation — after a non-zero exit, tail the log and emit an
   actionable HINT when a 401/Invalid-key/Unauthorized pattern is found,
   telling the operator which command to re-run with new credentials.

Applies to both the sh (macOS/Linux) and ps1 (Windows) wrappers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: clarify LogonType=Interactive is intentional for personal-device use

Adds a comment explaining the design rationale so reviewers don't flag
it as an oversight: Interactive avoids storing a plaintext password and
is correct for developer laptops. Notes the path forward for headless
deployments if ever required.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: guard grep -vF exit code in Linux crontab uninstall

grep -vF exits 1 when it matches no lines (i.e. the Unbound entry was
the only crontab line). Under set -euo pipefail, pipefail propagates
that exit code through the pipeline and set -e aborts the script before
crontab - executes — leaving the cron entry alive after credentials and
the install dir are already gone.

Wraps the grep in a subshell with || true, matching the identical guard
already present in install_linux_crontab at line 512.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use || true on full pipeline in crontab uninstall path

Previous fix wrapped grep in a subshell which is correct but non-idiomatic.
The cleaner form is || true on the whole pipeline — with pipefail, all pipeline
stages still execute (grep writes nothing, crontab - clears the crontab), and
|| true suppresses the non-zero exit code that grep produces when it selects
no lines.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use systemctl --user show to probe user bus availability

systemctl --user status (no unit arg) exits 1 when the user manager is
in degraded state — a failed xdg-portal, pipewire, or gnome-keyring unit
is enough. This is the normal condition on most desktop sessions, so the
probe was silently falling back to crontab for the majority of Linux
users even when the user bus was fully operational.

systemctl --user show queries the manager's own properties and exits 0
as long as the user bus is reachable, regardless of individual unit
health. It is the correct reachability probe.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass credentials via env vars to avoid Win32_Process.CommandLine exposure

Win32_Process.CommandLine is readable by any authenticated user without
elevation, and Windows Event Log 4688 (process-creation auditing, common
in enterprise SIEMs) captures full command lines. Passing -ApiKey <value>
as a CLI flag would have put raw API keys into security logs.

discover case: set UNBOUND_API_KEY / UNBOUND_DOMAIN before spawning, pass
-Command with backtick-escaped $env: references so the child process
command line contains the variable name, not the value. finally block
clears the env vars in all exit paths.

onboard case: drop --api-key / --discovery-key from cmdArgs entirely, set
UNBOUND_API_KEY / UNBOUND_DISCOVERY_KEY before spawning unbound. Requires
unbound-cli to read these env vars (see unbound-cli PR change).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass credentials via env vars in shell wrapper to avoid ps/cmdline exposure

Mirrors the same fix already applied to the PS1 wrapper. On Linux,
/proc/pid/cmdline is readable by any local user without elevation; on
macOS, ps output is world-readable. Passing --api-key / --discovery-key
as CLI arguments therefore exposes raw keys to every process on the
machine for the lifetime of the subprocess.

setup-scheduled-scan.sh (generated wrapper):
- discover: UNBOUND_API_KEY="$API_KEY" invoke, --api-key dropped from args
- onboard:  UNBOUND_API_KEY + UNBOUND_DISCOVERY_KEY in env, keys dropped from args
- chmod 700 on wrapper (not +x): outer umask is typically 022 so +x
  would yield 755 (world-executable); 700 restricts to owner only

install.sh:
- main() prepends --api-key from $UNBOUND_API_KEY when the flag is
  absent from explicit args, so the scheduled wrapper can invoke it
  without putting the key value on the command line
- $# -eq 0 guard now also passes when UNBOUND_API_KEY is set

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: escape apostrophes in paths embedded in single-quoted PS expressions

On machines where the Windows username contains an apostrophe (e.g.
O'Brien), $env:LOCALAPPDATA and the npm global bin path both contain a
literal apostrophe. Embedding these paths directly inside a single-quoted
PowerShell expression ('$installScript', '$resolvedUnbound') produces an
unbalanced quote that breaks the expression and silently fails every
scheduled discover or onboard run.

Fix: apply -replace "'", "''" before interpolating into single-quoted
contexts. In PowerShell single-quoted strings, '' is the escape sequence
for a literal apostrophe, so O'Brien becomes O''Brien and the expression
remains syntactically valid.

Two sites fixed:
- discover wrapper: $safeInstallScript used in -Command string
- baked-path replacement: $escapedUnbound used in Test-Path / string literals

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: clear all credentials before storing new ones in Install-ScheduledTask

If a user reinstalls with a different command (e.g. switching from onboard
to discover), credentials from the previous install that are not present in
the new invocation (e.g. discovery_key) were left in Windows Credential
Manager. Calling Remove-AllCredentials before storing new values makes
Install-ScheduledTask idempotent — same behaviour as the macOS path which
unconditionally deletes all four credentials before writing new ones.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: guard npm config get prefix against set -e abort when npm absent

`npm config get prefix` exits 127 when npm is not installed; under
`set -euo pipefail` that propagated a fatal error mid-install, leaving
credentials written to disk with no diagnostic message.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: log onboard spawn failure on Windows; remove stale macOS Keychain creds on migration

- PS1 wrapper onboard branch: add catch block so a spawn failure
  (e.g. access denied, corrupted binary) is logged instead of silently
  falling through with an uninitialised exit code. Mirrors the existing
  catch block in the discover branch. $ec pre-set to 1 so the failure
  is recorded even if the catch block itself throws.
- SH install_macos: after removing the legacy ai.getunbound.discovery
  plist/LaunchAgent, also delete the four Keychain entries stored under
  that old service name. Without this, --uninstall (which only cleans
  the current name) leaves stale credentials on upgraded systems.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(security): replace cmdkey /pass: with CredWrite P/Invoke for credential storage

cmdkey /pass:<secret> exposes API keys in the Windows process command
line, which Event Log 4688 (process-creation auditing) captures
persistently — exactly the enterprise environments this tool targets.

Replace Store-Credential with a CredWrite P/Invoke call (mirroring the
CredRead already in the scheduled wrapper) so secrets never appear in
any command line. cmdkey /delete: is retained for removal since
deletion exposes no secret.

Add-Type CLM guard: if PowerShell Constrained Language Mode blocks
Add-Type, exit with a descriptive error rather than silently falling
back to the insecure cmdkey /pass: form.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address anonpran review — credential exposure, json_escape scope, uninstall cleanup

P0-1: Windows discover wrapper used -Command to invoke install.ps1,
causing UNBOUND_API_KEY to be bound as a param and appear in child
process command lines. Switch to -File so the only thing in the
child's command line is the script path. Update install.ps1 to
resolve ApiKey/Domain from UNBOUND_API_KEY/UNBOUND_DOMAIN env vars
(matching the pattern already in install.sh) when not passed as params.

P1-2: json_escape in store_credentials_linux only escapes \ and ".
Add a comment documenting the intentionally limited scope — API keys
and domain URLs never contain control characters so the simpler form
is sufficient.

P3-1: remove_credentials_linux left ~/.unbound/ behind after removing
the creds file. Add rmdir on the parent directory; rmdir is a no-op
when the directory is non-empty, so it is safe when other components
share the directory.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: note pre-existing Python CLI arg exposure in install.sh and install.ps1

The Python subprocess receives --api-key as a CLI argument, making the
key visible in /proc/pid/cmdline, ps, and Event Log 4688 for the
duration of the Python process. This is a pre-existing limitation of
the ai_tools_discovery.py entry point; both install scripts already
avoid exposing the key at the shell/PS level via UNBOUND_API_KEY env var.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: sentinel file to prevent RunAtLoad+StartCalendarInterval double-run

macOS launchd fires RunAtLoad (every boot/login) and StartCalendarInterval
(daily 09:00) as two independent triggers. When the machine boots before
09:00, both fire in the same morning. The wrapper now writes a timestamp to
$LOG_DIR/.last-run-ts after each execution and skips if the last run was
within 8 hours.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: Junie support for Windows (#137)

* feat: Junie support for Windows

Junie (JetBrains' AI coding agent) ships on Windows too — as a CLI and as a
JetBrains IDE plugin — but coding-discovery only detected it on macOS. This
adds Windows parity (config lives in %USERPROFILE%\.junie, same layout as
macOS).

- WindowsJunieDetector — detects ~\.junie; multi-user via is_running_as_admin
  scanning C:\Users.
- WindowsJunieRulesExtractor — global ~\.junie\*.md + project-level .junie\*.md;
  project walk uses get_windows_system_directories and the is_user_level_tool_dir
  guard so user-scope rules aren't reported as project-scope.
- WindowsJunieMCPConfigExtractor — reads ~\.junie\mcp\mcp.json per user.
- Wired Windows branches into create_junie_detector and the Junie MCP/rules
  factories.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address Greptile review on Windows Junie

- Global Junie rules were tagged scope="project": _detect_rule_scope did
  not recognize .junie. Added .junie to its user-config dir set, and pass
  explicit scope="user" when extracting global rules so user-level rules
  are classified correctly regardless of path resolution.
- Admin scan in the rules extractor skipped the system/default account
  exclusion (Public, Default, etc.) that the detector and MCP extractor
  already applied.
- All three Windows Junie files reinvented the admin/non-admin loop. They
  now use the shared scan_windows_user_directories helper, which centralises
  the branching, excludes system accounts, and handles PermissionError —
  fixing the exclusion bug and removing the duplicated logic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: Junie MCP parent_levels 2 -> 3 so path resolves to home

~\.junie\mcp\mcp.json needs 3 parent levels to resolve to ~ (home),
matching how every other global MCP config keys its `path`. With 2 it
returned ~\.junie, inconsistently keying Junie's entry.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: add 45 unit tests for macOS and Windows Junie extraction

Covers detector, rules extractor, and MCP config extractor for both
platforms - detection with/without .junie dir, version parsing from
config.json/settings.json, global vs project-level rule extraction,
walker skipping of user-level dirs, and multi-user root scanning.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add new flags to rm overdump of logs

* added payload flag

* refactored and added test cases

* Add ~/.unbound/discovery-cache.json local cache + lock + heartbeat

New module that backs per-tool payload-hash dedup and prevents concurrent scans from piling up. Cache file is JSON; lock is an mtime-heartbeat file under ~/.unbound/discovery.lock.

* Integrate per-tool hash dedup into discovery main

Acquire the discovery lock, start the heartbeat, compute payload_hash per (tool, home_user), skip upload when the local cache shows the same hash, and update the cache on a successful send.

* Stabilize payload hash by stripping volatile fields and sorting lists

Expand _strip_ephemeral so cosmetic re-scans dedup correctly: drop plugins[*].installed_at, mcpServers[*].scan.scanned_at/error, mcpServers[*].oauth.{clientId,callbackPort}, and canonicalise filesystem-walk-order-dependent lists (projects, plugins, rules, skills, mcpServers, scan.tools, permissions.{allow,deny,ask}_rules).

* Skip other tools' ~/.<dir>/ from project-scope walkers

Project-scope MCP/rules/skills walkers (filesystem from /) were adopting .mcp.json / .agents / .cursor entries inside ~/.codex, ~/.cursor extensions, etc. as projects of the wrong tool. New is_home_dotdir_descendant predicate wired into walk_for_tool_directories, walk_for_mcp_configs_generic, walk_for_claude_project_mcp_configs, and the Cursor skills _walk_for_skills uniformly — covers any tool that uses the standard ~/.toolname/ convention without per-tool maintenance.

* Accept api_key via UNBOUND_API_KEY env (keeps argv clean)

Hook-triggered invocations can now pass the api_key through the subprocess env so it never appears in argv / /proc/<pid>/cmdline. CLI --api-key remains the default for MDM and direct-script usage.

* Address PR review: tighten dotdir helper, drop dead funcs, lift import

- is_home_dotdir_descendant now matches only the canonical /Users/<u>/.foo and /home/<u>/.foo positions; non-standard mounts like /srv/home/<u>/.config or /data/Users/shared/.cache no longer false-match.
- Remove unused stamp_last_run, is_debounced, _parse_iso, and DEBOUNCE_SECONDS from cache.py — the hook owns the debounce + last_run_at stamp directly, the cache module was never the call site for these.
- Move heartbeat_start() into the try block in ai_tools_discovery main so a heartbeat thread-spawn failure can't leak the discovery lock (release_lock now always runs).
- Lift is_home_dotdir_descendant import in walk_for_tool_directories from function body to module level — no circular import (mcp_extraction_helpers only does lazy imports from macos_extraction_helpers).

* added guardrail to print summary if not --payload, --summary

* fix(macos): skip RunAtLoad when caller already ran the scan (#141)

* chore: mark --no-run-at-load as internal-only in arg parser comment (#148)

The flag is passed automatically by the unbound CLI when a scan already
ran immediately before cron setup; it is not a user-facing option and
is intentionally absent from usage().

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: implement KiloCode version extraction (macOS + Windows) (#149)

* fix: implement KiloCode version extraction (macOS + Windows)

Kilo Code ships standard VS Code extension metadata in
~/.vscode/extensions/kilocode.Kilo-Code-X.Y.Z/package.json (and
~/.cursor/extensions/ for Cursor), but both KiloCode detectors hardcoded
"version": "Unknown" and returned None from get_version(). A stale
docstring claimed "version extraction removed per requirements" — the
sibling Roo Code and Cline detectors had the same stub in early 2025
and were both rewritten to read package.json in commit a31ecfa
(Feb 2026). KiloCode was missed.

Apply the same package.json read pattern KiloCode's siblings use, with
a fall-back to the folder-name version suffix when package.json is
unreadable, and surface the resolved version through both detect() and
get_version() (the latter walks every admin user when running elevated).

Linux equivalent will land via the feat/linux-full-support branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(kilocode): delegate get_version() to detect() (Greptile review)

Reviewer flagged that get_version() walked extension dirs directly while
detect() also checked that the extension settings dir + IDE app were
present. A partially-uninstalled KiloCode (leftover ~/.vscode/extensions
folder but no globalStorage and no IDE) would let get_version() return
a stale version while detect() returned None — they could disagree.

Make get_version() delegate to detect() and read 'version' off the
result so the install-gating logic is the single source of truth.
Matches the Roo Code / Cline pattern.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: address anonpran review + add Replit version detection

KiloCode (PR #149 review by anonpran):
- Scope _get_extension_version_for_user to the IDE that triggered detection.
  Previously looped all SUPPORTED_IDES and returned the first match in any
  extensions dir (.vscode first), so a Cursor install with a leftover
  ~/.vscode/extensions/kilocode.Kilo-Code-* would return the VS Code
  version against the Cursor install_path. Now takes ide_name explicitly,
  same as Roo Code / Cline siblings.
- Drop dead `except IndexError` around rsplit("-", 1)[1] — the `if "-"`
  guard already ensures the split returns 2 elements.
- Add docstring to _get_extension_version_for_user.
- Add tests/test_kilocode_version_extraction.py covering package.json
  reads, folder-suffix fallback, and IDE scoping (must NOT cross-leak
  VS Code version to a Cursor install).

Replit version extraction (macOS + Windows):
- macOS: Read CFBundleShortVersionString from /Applications/Replit.app/
  Contents/Info.plist (same source Cursor/Windsurf use); fall back to
  Contents/Resources/app/package.json.
- Windows: Walk %LOCALAPPDATA%\Programs and Program Files for the
  install dir, read resources\app\package.json; fall back to the .exe
  FileVersion via PowerShell. The earlier "doesn't expose version
  information in a standard way" comment was wrong — Replit Desktop
  is a standard Electron app.

Linux versions for Antigravity + Replit will land via the
feat/linux-full-support branch alongside the matching KiloCode fix.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(kilocode, replit): address the two edge-case flags

1. KiloCode folder-name fallback preserved pre-release suffix.
   rsplit("-", 1)[1] truncated pre-release semver — kilocode.Kilo-Code-
   1.2.3-pre.5 collapsed to "pre.5" instead of "1.2.3-pre.5". Replace
   with a regex that captures full semver including pre-release/build
   metadata from the end of the folder name. Add tests covering -pre.5
   and -beta.1 forms.

2. Replit Windows PowerShell path quoting.
   repr(str(path)) produced 'C:\\Users\\…' (literal double-backslash
   inside PowerShell's single quotes), which doesn't reliably resolve.
   Switch to a hand-built single-quoted string with single-quote
   doubling, plus -LiteralPath so PowerShell doesn't try to glob the
   path. Inside single quotes, backslashes are already literal — no
   double-backslash dance needed.

Both bugs only fire on the fallback branch (after package.json fails
to parse), so they couldn't produce worse output than the pre-PR
"Unknown" — but the fix is cheap and removes the footgun.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: address greptile + anonpran follow-ups on PR #149

Test coverage:
- Drop @unittest.skipUnless from TestMacOSKiloCodeVersion and
  TestWindowsKiloCodeVersion. _get_extension_version_for_user is pure
  pathlib + JSON parsing, so the IDE-scoping regression guard (which
  proves a Cursor install doesn't inherit a leftover VS Code version)
  must execute on every CI runner — previously it ran only on a dev's
  macOS/Windows box and was silently skipped on the Linux runner that
  produced the "553 passed" claim.

Import hygiene / cross-OS consistency:
- Windows Replit imported COMMAND_TIMEOUT while macOS imported
  VERSION_TIMEOUT for the same lookup. Same value (30s), but mixing
  the import names across sibling files is exactly the inconsistency
  Greptile and anonpran flagged. Switch Windows to VERSION_TIMEOUT.
- Replit detect() now wraps get_version() in `or "Unknown"` on both
  macOS and Windows. Without it, a data-dir-only install emitted
  `"version": null` while a KiloCode equivalent emitted `"Unknown"`
  for the same case.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test: bring Windows KiloCode test coverage to macOS parity

Greptile flagged the Windows test class as thin (2 cases vs 7 on macOS).
Add the 3 missing parity tests: folder-suffix fallback, pre-release
suffix preservation, Cursor extensions-dir routing. The Linux runner
will now execute all 12 cases instead of the previous 9.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(kilocode/macos): require single-IDE pairing for globalStorage + .app

The fallback loop in _check_user_for_kilocode (macOS only) updated
ide_installed but never updated ide_with_extension. So when
globalStorage was found in IDE-A but only IDE-B was actually installed,
the version lookup ran against IDE-A's extensions dir — either
returning a stale leftover version or falling through to "Unknown"
even when IDE-B genuinely had Kilo Code installed with a readable
package.json.

The fallback was always wrong-headed: globalStorage in an uninstalled
IDE doesn't mean Kilo Code is currently usable from a different IDE.
Replace it with a single pass that requires BOTH globalStorage AND
matching .app for the same IDE.

Add three regression tests:
- test_rejects_globalstorage_when_matching_ide_not_installed: trap config
  (Code globalStorage + Cursor.app, no Cursor globalStorage) now returns
  None instead of an incoherent dict.
- test_prefers_ide_with_both_globalstorage_and_app: when both IDEs have
  globalStorage but only Cursor.app is installed, version comes from
  Cursor's extensions dir — NOT from a stale Code leftover.
- test_returns_result_when_first_ide_has_both: happy path.

Windows and Linux detectors never had this fallback loop — they
correctly return None when the first IDE check fails — so this fix is
macOS-only.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* perf(replit/macos): avoid redundant _check_application_installation()

When detect() finds an app install, it then calls get_version() which
in turn calls _check_application_installation() again as a safety guard
— two filesystem syscalls where one suffices. Greptile flagged it as
a minor inefficiency.

Add an optional app_path parameter to get_version() so detect() can
pass the already-resolved Path. Falls back to the internal check when
called directly (keeping standalone-call safety). Matches the macOS
Antigravity detector's existing signature.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(kilocode/windows): require live IDE install + globalStorage pairing

Mirror the macOS fix (f87c864) to Windows. AppData survives an IDE
uninstall on Windows, so the previous detector — which trusted
globalStorage as proof of an IDE install — would report stale state.

Concrete scenario flagged in review:
- User installed VS Code, added KiloCode (creates
  %AppData%\Code\User\globalStorage\kilocode.Kilo-Code\ AND
  %USERPROFILE%\.vscode\extensions\kilocode.Kilo-Code-X.Y.Z\)
- User uninstalled VS Code (both AppData and the extensions folder
  persist on disk)
- User installed Cursor + KiloCode in Cursor

The old detector found Code globalStorage first, marked ide_with_extension
= 'Code', then surfaced the *VS Code* leftover version against a 'Code'
install_path even though only Cursor is actually installed. Before this
PR opened, that path returned 'Unknown'; after the version extraction
landed, it would start returning a stale-but-plausible version.

Add a Windows-shaped _check_ide_installation that requires the IDE's
.exe under one of the conventional install roots:
- %LOCALAPPDATA%\Programs\
- %ProgramFiles%, %ProgramFiles(x86)% (with C:\Program Files fallbacks
  when the env vars are unset, e.g. service-account scans)

Three regression tests mirror the macOS suite: trap config (Code
globalStorage + uninstalled Code + live Cursor without globalStorage),
preference for Cursor when both have globalStorage but only Cursor is
installed, and the happy path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(replit/macos) + tests: use app_path, add Replit version coverage

macOS Replit get_version() still referenced self.APPLICATION_PATH for
both the Info.plist and package.json reads even after the redundancy
fix added an app_path parameter — so passing app_path was effectively a
no-op and the function always read from /Applications/Replit.app
regardless of caller. Greptile caught it. Replace both references
with the passed-through app_path so the parameter has real effect.

Add tests/test_replit_version_extraction.py — 10 cases, no platform
skips so the Linux/macOS/Windows CI runners all execute them:

macOS (4 cases):
- returns None when no install
- reads Info.plist first via the explicit app_path AND asserts the
  call doesn't reference /Applications/Replit.app — the regression
  guard for this commit
- falls back to package.json when defaults read returns nothing
- returns None when both sources fail

Windows (6 cases):
- returns None when no install
- reads version from resources/app/package.json
- skips missing candidate dirs and tries the next one
- falls back to PowerShell FileVersion when package.json is broken
- escapes single quotes in paths (asserts user' -> user'' in the PS
  command — the regression guard for the earlier quoting fix)
- candidate_install_paths includes both LOCALAPPDATA\Programs and
  Program Files family roots

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(replit): trim version-extraction tests to regression-guard set

Drop 5 trivial/sanity tests from tests/test_replit_version_extraction.py
that didn't lock in a real regression:
- macOS test_returns_none_when_no_install (trivial)
- macOS test_falls_through_to_none_when_both_sources_missing (edge case)
- Windows test_returns_none_when_no_install_found (trivial)
- Windows test_skips_candidate_when_directory_missing (sanity)
- Windows test_candidate_install_paths_includes_user_and_system_dirs
  (implementation detail)

Keep the 5 that earn their slot:
- macOS test_reads_plist_first_when_passed_explicit_app_path — locks
  in the app_path-vs-self.APPLICATION_PATH fix Greptile flagged.
- macOS test_falls_back_to_package_json_when_plist_unreadable —
  exercises the secondary read path.
- Windows test_reads_version_from_package_json — primary success path.
- Windows test_falls_back_to_powershell_when_package_json_unreadable —
  asserts -LiteralPath is in the command, locks in the quoting fix.
- Windows test_escapes_single_quotes_in_path — asserts ' -> '' so a
  path containing a quote can't escape the PS single-quoted literal.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style(replit/windows): symmetrise _candidate_install_paths with KiloCode

Greptile flagged that Windows Replit's _candidate_install_paths drops
each root entirely when its env var is unset, while the KiloCode
sibling falls back to Path.home()-derived / C:\Program Files defaults.
Real-world impact is small (env vars are always set in normal user
sessions) but matters for restricted service accounts where AppData
or Program Files env vars get stripped — and the symmetry just reads
cleaner.

Mirror the KiloCode helper's fallback structure here.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* feat: full Linux support for unbound discover and onboard (#128)

* feat: full Linux support for unbound discover and onboard

Adds Linux implementations for every AI coding tool detector and
extractor, bringing Linux to full parity with macOS and Windows
(detectors, rules, MCP config, settings, skills), plus cross-platform
--set-cron support for user-level onboard/discover.

Highlights:
- linux/ package mirroring macos/windows for all 17 tools, wired through
  coding_tool_factory and ai_tools_discovery (multi-user /home + /root,
  Docker/CI root-only containers, /etc/machine-id device id).
- linux_extraction_helpers with Linux-aware system-path skipping and a
  walk_for_tool_directories that does not blocklist /home.
- Claude Cowork (~/.config/Claude) and Junie (~/.junie) Linux parity.
- Plugin provenance (plugin_lookup) threaded through Linux Claude/Cursor
  skills and Claude MCP extractors.
- setup-scheduled-scan cron support hardened across Linux/macOS/Windows.
- tests/test_cowork_skills_extraction_linux.py.

Rebased cleanly onto staging (which already contains plugin provenance
#129); main-only plan-detection commits are intentionally excluded — they
reach staging via the normal main->staging path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: Linux Junie MCP parent_levels 2 -> 3 so path resolves to home

~/.junie/mcp/mcp.json needs 3 parent levels to resolve its reported path
to ~ (home), matching every other global MCP config. Same fix applied to
Windows Junie in PR #137.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: scheduler setup no longer aborts on npm-absent Linux hosts

create_wrapper_script ran NPM_BIN=$(npm config get prefix 2>/dev/null)/bin
with no failure guard, so under 'set -euo pipefail' a host without npm
aborted the entire scheduler setup before the wrapper or cron entry was
written. Guard the lookup with '|| true' and only append the bin dir when
a prefix resolves. NPM_BIN now carries its own trailing colon when set and
is empty otherwise, so an npm-absent host no longer leaves a leading ':'
in the wrapper PATH (which would have put the CWD on PATH).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: Linux multi-user gaps in plugin scan, cursor DB, managed MCP, user-home enum

Audited every `Darwin`/`Windows` branch across the codebase and found five
spots that had no Linux counterpart. Each one silently dropped real data
on a Linux MDM/agent runner:

1. ai_tools_discovery._extract_claude_code_plugins multi-user scan only
   covered Darwin and Windows. On Linux, running as root only checked
   /root/.claude/plugins — plugins installed under /home/<user> were
   invisible, so discovery reported "No plugins found" even when a
   plugin was installed.

2. Same gap in the Cursor plugin multi-user scan.

3. ai_tools_discovery user-home enumeration (fallback when user_rules
   provide no project_path) lacked a Linux branch — used the current
   process's home instead of iterating /home/<user> + /root.

4. mcp_extraction_helpers._get_managed_mcp_path returned None on Linux.
   Added /etc/ClaudeCode/managed-mcp.json (system-wide convention).

5. utils._get_cursor_db_path returned None on Linux, breaking Cursor plan
   detection. Added the XDG path ~/.config/Cursor/User/globalStorage/
   state.vscdb.

All five Linux branches use get_all_users_linux() + the established
"/root" vs "/home/<user>" special-case so they match the rest of the
PR's multi-user handling.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: skip macOS-only cowork extractor test on non-Darwin

TestMacOSClaudeCoworkSkillsExtractor exercises MacOSClaudeCoworkSkillsExtractor
specifically, and that extractor intentionally skips its final-step dedup
when running as root. The explicit-sessions-root tempdir path is therefore
ordering-dependent on ext4 (Linux), causing the dedup assertion to flake
under root — pre-existing on main, not introduced by this PR.

Skip the class on non-Darwin so Linux CI/runners don't see a failure that's
not actually about the Linux work. The Linux equivalent
(tests/test_cowork_skills_extraction_linux.py + LinuxClaudeCoworkSkillsExtractor)
dedups unconditionally and already covers the same behaviour on Linux.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: Linux unit-test parity with macOS for skills/cline/cursor extractors

Adds direct unit-test coverage for the Linux skills extractors mirroring
the macOS pattern, and updates cursor plan-detection to cover the Linux
SQLite DB path now that _get_cursor_db_path resolves it on Linux.

New files (20 new tests total):
- tests/test_claude_skills_extraction_linux.py — TestLinuxExtractorEndToEnd
  (4 methods) + TestLinuxExtractorAgents (3) mirror the macOS classes for
  LinuxClaudeSkillsExtractor (~/.claude/{skills,commands,agents}/).
- tests/test_cursor_skills_extraction_linux.py — TestLinuxCursorSkillsExtractor
  (4) + TestLinuxCursorSkillsExtractorWithCommands (5) for the Cursor flow
  including .cursor/ and .agents/ parent dirs.
- tests/test_cline_skills_extraction_linux.py — TestLinuxClineSkillsExtractor
  (5 methods) for the Cline flow.

Pattern: Linux extractors iterate get_linux_user_homes() instead of using
Path.home() + is_running_as_root, so the new tests patch
linux.<tool>.skills_extractor.get_linux_user_homes to return [fake_home]
in place of the three-decorator macOS pattern.

tests/test_cursor_plan_detection.py:
- The Linux branch used to assert "returns None (unsupported)". Now that
  _get_cursor_db_path resolves ~/.config/Cursor/User/globalStorage/state.vscdb
  on Linux, replaced with three correctly-shaped tests:
    * test_get_cursor_db_path_linux — creates the file, expects the path
    * test_get_cursor_db_path_linux_file_not_exists — mirrors the macOS
      file-not-exists assertion
    * test_get_cursor_db_path_unsupported_platform — moved to FreeBSD so
      it still validates the unsupported-platform fallthrough path.

Local macOS: 549 passed (was 527).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: escape heredoc runtime vars in wrapper dedup check

LAST_RUN_FILE, _last, _now, and the arithmetic expansions were bare
$/$(...) inside the unquoted <<WRAPPER_EOF heredoc, causing them to be
expanded by the setup shell at install time instead of at wrapper
runtime. The 8-hour dedup check consequently never fired.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: add 10 Linux Junie unit tests (detector, rules, MCP config)

Mirrors the macOS/Windows Junie test coverage for the Linux
implementation - detection, version parsing, global rule extraction,
MCP config reading, and multi-user scanning via get_linux_user_homes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address principal engineer review — B1-B4 + H1/H2/H5

B1 (mcp_extraction_helpers): add _iter_admin_user_homes() helper that
delegates to get_linux_user_homes() on Linux, which correctly includes
/root alongside /home/*. Replace all 5 Linux admin-scan loops to use it
instead of iterating Path("/home") directly.

B2 (utils): hardcode "root" in Docker/CI fallback instead of
Path.home().name which could return "app" when HOME is overridden.

B3 (detect_openclaw): add timeout=10 to subprocess.run(["ps","aux"])
to prevent hangs on loaded systems.

B4 (windsurf): get_version() now checks _USER_RELATIVE_PATHS via
get_linux_user_homes() so user-local installs return a real version.

H1 (linux_extraction_helpers + ai_tools_discovery): extract
linux_home_for_user(username) helper; replace 7 inline ternaries.

H2 (mcp_extraction_helpers): _iter_admin_user_homes() eliminates the
5 copy-pasted Linux admin-detection blocks.

H5 (gemini_cli, opencode): add _USER_RELATIVE_PATHS fallback scan
(~/.local/bin, ~/.npm-global/bin) when which finds nothing, so
installs are detected when user profiles are not sourced (cron/systemd).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: Linux Claude Cowork detection in per-user iteration

user_tool_detector._detect_claude_cowork() had only Darwin and Windows
branches — Linux fell through the else into the Windows AppData path,
which never exists on Linux, so Cowork was silently skipped on Linux
even when sessions dir was present.

Add the Linux branch using ~/.config/Claude/local-agent-mode-sessions
(matches LinuxClaudeCoworkDetector._sessions_dir_for_user).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: implement KiloCode version extraction (Linux)

Mirror the macOS/Windows fix from fix/kilocode-version-extraction
(PR #149): Linux KiloCode detector hardcoded 'version': 'Unknown' even
though Kilo Code ships standard VS Code extension metadata in
~/.vscode/extensions/kilocode.Kilo-Code-X.Y.Z/package.json (and the
~/.cursor/extensions/ equivalent for Cursor).

Apply the same package.json read pattern its Linux Roo Code and Cline
siblings already use, with a fall-back to the folder-name version
suffix when package.json is unreadable.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(kilocode): delegate Linux get_version() to detect() (Greptile review)

Mirror the PR #149 review fix: get_version() must not surface a stale
version when detect() would return None (e.g. leftover
~/.vscode/extensions folder but no kilocode globalStorage settings dir).
Delegate to detect() so install-gating logic stays the single source
of truth.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(linux): version extraction for Antigravity, Replit, and scoped KiloCode

KiloCode (mirrors PR #149 anonpran review fix):
- Scope _get_extension_version_for_user to the IDE that triggered detection.
  Previously looped all SUPPORTED_IDES and returned the first hit, so a
  Cursor install with a leftover ~/.vscode/extensions/kilocode.Kilo-Code-*
  would have returned the VS Code version against the Cursor install_path.
- Drop dead `except IndexError` around rsplit("-", 1)[1] — guarded.

Antigravity (Linux):
- Read version from resources/app/product.json (preferred) and
  resources/app/package.json (fallback) across the common install paths
  (/opt/Antigravity, /usr/lib/antigravity, etc. + ~/.local/share/...).
  Antigravity is a VS Code fork, so the resource layout matches what
  Cursor/Windsurf already use elsewhere in this codebase.

Replit (Linux):
- Read version from resources/app/package.json across system + per-user
  install paths. Final fallback: `replit --version`. The earlier "doesn't
  expose version information in a standard way" comment was wrong — Replit
  Desktop is a standard Electron app.

Tests:
- tests/test_linux_version_extraction.py covers all three detectors:
  package.json reads, fallbacks, and KiloCode IDE-scoping (must NOT
  cross-leak VS Code version onto a Cursor install).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(kilocode): preserve pre-release suffix in folder-name fallback (Linux)

rsplit("-", 1)[1] truncated semver pre-release suffixes — e.g.
kilocode.Kilo-Code-1.2.3-pre.5 returned just "pre.5". Switch to a
regex that captures the full version (incl. pre-release/build metadata)
from the end of the folder name.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: drop dead imports flagged in principal-engineering review

anonpran's PR #128 review (https://github.com/websentry-ai/coding-
discovery-tool/pull/128#issuecomment-4583525626) approved the PR with
three trivial dead-import nits. Removing them so Pylint stays clean:

- linux/codex/mcp_config_extractor.py: is_running_as_root never used
  in this module (root-vs-user logic lives in get_linux_user_homes()
  which is already imported and used).
- linux/cursor/mcp_config_extractor.py: extract_cursor_mcp_from_dir
  was imported but only walk_for_cursor_mcp_configs + read_global_
  mcp_config are actually called.
- linux/windsurf/mcp_config_extractor.py: same shape as cursor —
  extract_windsurf_mcp_from_dir imported but never referenced.

No behaviour change. 579 tests still pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: drop dead Linux branch from extract_global_mcp_config_with_root_support

Greptile flagged the Linux branch in this helper as a latent
early-return bug (would silently drop non-first-user configs if any
future Linux extractor invoked it). The branch was preparation-for-
future-use that no current Linux extractor reaches.

Removing it instead. Linux MCP extractors already follow a better
pattern — per-user accumulation across get_linux_user_homes() — so
the helper is macOS/Windows-only by intent. Replace the branch with
a comment that points future authors at the correct Linux template.

Zero behaviour change today: macOS callers hit the Darwin branch,
Windows callers hit the Windows branch, no Linux callers exist. 579
tests still pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style: fix over-indented try block in extract_global_mcp_config_with_root_support

Greptile cosmetic note — the try block under the admin-homes loop was
indented 16 spaces instead of 12 (a pre-existing artifact, syntactically
valid but inconsistent). Re-align to the 4-space loop body. No behaviour
change; 579 tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>

* fix: allow Linux in main() OS guard so discovery actually runs (#152)

PR #128 added full Linux support (70 modules, 52 factory branches,
linux_extraction_helpers, and 3 Linux branches in main()), but the
squash-merge into staging retained the exit-3 OS guard at the top of
main():

    if current_platform not in ("Darwin", "Windows"):
        ... "AI tool discovery is not supported on {platform}"
        sys.exit(3)

This guard fires BEFORE detector init, so on Linux `unbound discover`
exits 3 immediately and skips the scan — making every downstream Linux
branch unreachable dead code. Reproduced end-to-end on an Ubuntu VM via
the real install.sh entry path against staging: clones staging, runs
ai_tools_discovery.py, exits 3.

Narrow the guard to include Linux. Genuinely unsupported platforms
(*BSD, AIX, etc.) still exit 3 cleanly before detector init so they
can't crash and page Sentry on every run.

Update TestUnsupportedPlatformGuard accordingly:
- test_linux_proceeds_past_os_guard: Linux now passes the guard (exits 0
  at the single-flight lock check, not 3 at the guard).
- test_unsupported_platform_exits_code_3_before_detector_init: FreeBSD
  still exits 3, detector never constructed.

621 tests pass.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix: remove double-send regression, gate dedup transport logs

Drop the unconditional send block that fired before the dedup check; it
bypassed the local hash cache (uploading on every match) and dropped
retryable failures (no failed_reports append / cache update). The
dedup-gated send is now the sole upload path.

Gate the dedup-block "skipping upload", "Sending", and "✓ sent" lines
behind not --summary and not --payload so those modes suppress all
transport output. Errors stay ungated.

Add TestSendDedupCount: exactly one send on a changed payload, zero on a
hash match.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Merge main into staging: sync staging up to main (#156)

* Exit cleanly on unsupported platforms instead of crashing (WEB-4370) (#131)

* Exit cleanly on unsupported platforms instead of crashing

On Linux (and any non-macOS/Windows platform), discovery crashed with a
ValueError deep in detector init and reported to Sentry on every run. Add an
early guard in main() that prints a friendly message to stderr and exits with
code 3 before any detector is constructed, so there's no traceback and no
Sentry noise.

WEB-4370

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Cache platform.system() result in the unsupported-platform guard

Call platform.system() once and reuse it for the check and the message,
per review feedback.

WEB-4370

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix Claude Code plan detection by using shell-based binary resolution (#134)

Previously, plan detection was skipped entirely when find_claude_binary_for_user
couldn't locate the binary in hardcoded paths (Homebrew, .local/bin, .bun, nvm).
Users who installed via volta, pnpm, fnm, asdf, mise, etc. had empty plan badges.

Instead of hardcoding more paths, this change lets the user's login shell resolve
the binary via PATH. When claude_binary is None, the command is passed through
`shell -lc "claude auth status --json"` — the -l flag sources the user's profile
(.zshrc, .bashrc) which sets up PATH for any package manager.

This works in all execution contexts:
- Terminal: login shell resolves via user's PATH
- MDM/root: launchctl asuser <uid> /bin/zsh -lc "claude auth status --json"
- su fallback: su - <user> -c "claude auth status --json"

When the binary IS found via hardcoded paths, behavior is unchanged (uses full path).

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Add Sentry diagnostics for Claude Code plan detection failures (#135)

* Add Sentry diagnostic reporting for Claude Code plan detection failures

When plan detection returns None, send a Sentry message event with
breadcrumbs showing which stages were attempted and why each failed
(keychain, launchctl, su, direct execution). This gives visibility
into cases like kim.le where plan detection silently fails.

- Extract _send_sentry_payload helper from report_to_sentry
- Add report_message_to_sentry for non-exception Sentry events
- Add optional diagnostics parameter to get_claude_subscription_type
- Wire up at call site in ai_tools_discovery.py
- Add tests for never-crash contract and diagnostics population

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Simplify: use existing report_to_sentry instead of new message function

Remove over-engineered _send_sentry_payload and report_message_to_sentry.
Use report_to_sentry with a synthetic RuntimeError and diagnostics in
the context dict instead. Same Sentry visibility, much less code.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Remove unnecessary test additions

Diagnostics list population is trivial — existing plan detection
tests in test_claude_auth_status.py already cover the core behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Gate Sentry alert on active usage to reduce noise

Only send plan detection failure to Sentry when the user has projects
(i.e. actively uses Claude Code). Users without projects having no
plan is expected — not worth alerting on.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Detect API key auth as plan type in Claude Code plan detection (#136)

* Detect API key auth as plan type in Claude Code plan detection

When users authenticate via ANTHROPIC_API_KEY instead of OAuth,
`claude auth status --json` returns `authMethod: "api_key"` without
a `subscriptionType`. Previously this was treated as a detection
failure (plan=None), triggering false-positive Sentry warnings.

Now `_run_auth_status` checks for `authMethod: "api_key"` when
`subscriptionType` is absent and returns "api_key" as the plan.
OAuth users are unaffected — subscriptionType always takes priority.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Make authMethod comparison case-insensitive

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Include authMethod in diagnostic breadcrumbs for Sentry

When plan detection returns ok=True but plan=None, we now include
the authMethod field from the CLI response in Sentry diagnostics.
This enables debugging future unknown auth scenarios without
logging the full CLI response.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Gate Sentry alerts on CLI success — suppress noise from unauthenticated users

Only fire plan detection Sentry events when at least one CLI stage
returned ok=True (user authenticated but plan missing). When all
stages return ok=False, the user never authenticated Claude Code —
not an actionable alert.

Eliminates noise from test users (dev-diana, dev-alice, ec2-user)
who have Claude binary installed but never logged in.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Fix Codex TOML sub-table sections leaking as spurious MCP servers

Nested sections like [mcp_servers.foo.http_headers] and
[mcp_servers.foo.tools.bar] were being parsed as top-level server
entries, leaking secrets (e.g. API keys in http_headers) into the
report payload. Use tomllib on Python 3.11+ and skip dotted-name
sections in the regex fallback for older runtimes.

* Address greptile review: fix or-falsy-dict and list-of-dicts secret leak

- Use key-presence check instead of `or` to avoid treating an empty
  mcp_servers dict as absent and falling through to mcpServers
- Also filter out list-of-dict values so arrays of inline tables
  can't bypass the secret filter

* [WEB-4391] Remove verbose JSON payload logging and add per-tool timing metrics (#132)

* Remove verbose JSON payload logging and add per-tool timing metrics

The JSON payload logging serialized every report to stderr line-by-line,
producing ~39K-156K log lines per run. This wasted ~1-2 min of wall-clock
time and caused pipe buffer issues under Rippling MDM execution.

Per-tool timing replaces the aggregated `process_single_tool` and
`send_report_per_tool_user` step names with `process_tool.<slug>` and
`send_tool.<slug>`, enabling identification of which specific tool
causes the 2-4 minute processing gaps observed via Sentry timestamps.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Restore JSON payload logging with rules/skills content stripped

Instead of removing the entire JSON dump, keep it for debugging but
replace rules and skills `content` fields with `<N chars>` placeholders.
This preserves structural visibility while cutting the bulk of the
stderr output (~39K-156K lines per run → ~2-3K lines).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Remove over-engineered test file for _metric_slug helper

The _metric_slug helper is a simple 4-line internal function that
doesn't warrant its own dedicated test file with 12 parametrized cases.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Revert per-tool metric names, keep only content stripping

Remove _metric_slug helper and per-tool Sentry metric names
(process_tool.{slug}, send_tool.{slug}) to avoid inflating
Sentry metric volume. Restore original generic metric names.

The only change in this PR is now: strip rules/skills content
from the JSON payload log to reduce stderr volume.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* [WEB-4391] Strip MCP scan bulk from stderr log to fix Rippling MDM pipe buffer exhaustion (#139)

* Strip MCP scan bulk from JSON payload log to reduce stderr volume

Remove inputSchema, outputSchema from MCP scan tool entries and
truncate description to 80 chars in the log-only deepcopy. This
reduces stderr by ~150-200KB per run on devices with many MCP
servers, fixing Rippling MDM pipe buffer exhaustion that causes
"Transaction was interrupted during processing" errors.

The actual payload sent to the backend is untouched (deepcopy
isolation). Rule/skill content stripping was already merged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Trigger CI

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Improve plan detection diagnostics and broaden api_key matching (#140)

Two changes to improve debuggability of plan detection Sentry alerts:

1. Use `"api_key" in` instead of `== "api_key"` for auth method matching.
   Claude Code has both "api_key" and "api_key_helper" auth methods — the
   strict equality missed api_key_helper users, leaving their plan as None
   and triggering false-positive Sentry warnings.

2. Add `key_source` (from apiKeySource) to diagnostic breadcrumbs.
   This distinguishes org-managed logins ("/login managed key") from
   personal accounts, which is critical for understanding why
   subscriptionType is null for team/enterprise org users.

Fixes DISCOVERY-TOOL-SCRIPT-V

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Add ~/.unbound/discovery-cache.json local cache + lock + heartbeat

New module that backs per-tool payload-hash dedup and prevents concurrent scans from piling up. Cache file is JSON; lock is an mtime-heartbeat file under ~/.unbound/discovery.lock.

* Integrate per-tool hash dedup into discovery main

Acquire the discovery lock, start the heartbeat, compute payload_hash per (tool, home_user), skip upload when the local cache shows the same hash, and update the cache on a successful send.

* Stabilize payload hash by stripping volatile fields and sorting lists

Expand _strip_ephemeral so cosmetic re-scans dedup correctly: drop plugins[*].installed_at, mcpServers[*].scan.scanned_at/error, mcpServers[*].oauth.{clientId,callbackPort}, and canonicalise filesystem-walk-order-dependent lists (projects, plugins, rules, skills, mcpServers, scan.tools, permissions.{allow,deny,ask}_rules).

* Skip other tools' ~/.<dir>/ from project-scope walkers

Project-scope MCP/rules/skills walkers (filesystem from /) were adopting .mcp.json / .agents / .cursor entries inside ~/.codex, ~/.cursor extensions, etc. as projects of the wrong tool. New is_home_dotdir_descendant predicate wired into walk_for_tool_directories, walk_for_mcp_configs_generic, walk_for_claude_project_mcp_configs, and the Cursor skills _walk_for_skills uniformly — covers any tool that uses the standard ~/.toolname/ convention without per-tool maintenance.

* Accept api_key via UNBOUND_API_KEY env (keeps argv clean)

Hook-triggered invocations can now pass the api_key through the subprocess env so it never appears in argv / /proc/<pid>/cmdline. CLI --api-key remains the default for MDM and direct-script usage.

* Address PR review: tighten dotdir helper, drop dead funcs, lift import

- is_home_dotdir_descendant now matches only the canonical /Users/<u>/.foo and /home/<u>/.foo positions; non-standard mounts like /srv/home/<u>/.config or /data/Users/shared/.cache no longer false-match.
- Remove unused stamp_last_run, is_debounced, _parse_iso, and DEBOUNCE_SECONDS from cache.py — the hook owns the debounce + last_run_at stamp directly, the cache module was never the call site for these.
- Move heartbeat_start() into the try block in ai_tools_discovery main so a heartbeat thread-spawn failure can't leak the discovery lock (release_lock now always runs).
- Lift is_home_dotdir_descendant import in walk_for_tool_directories from function body to module level — no circular import (mcp_extraction_helpers only does lazy imports from macos_extraction_helpers).

* Release: staging → main (2026-05-31) (#153)

* feat: add cross-platform --set-cron support (macOS launchd, Linux systemd/crontab, Windows Task Scheduler) (#126)

* feat: add cross-platform --set-cron support for user-level onboard and discover

Extends setup-scheduled-scan.sh to support Linux (systemd/crontab) in
addition to macOS launchd, and adds --command discover|onboard with
--discovery-key flag so the scheduled job can re-run the full onboard
flow — not just discovery.

Adds setup-scheduled-scan.ps1 (Windows) u…

* Fix symlink guard over-applying to home dir + stale last_lock_error

Greptile P1/P2 from the cherry-pick review of the discovery state-dir fix.

P1: atomic_write_cache's is_symlink guard ran for EVERY state dir, including
the trusted home. A legitimately symlinked ~/.unbound (dotfile/volume mgmt)
would have its cache writes silently skipped -> cold cache, re-upload every
run, hashes never persisted. The guard is only needed for the shared-temp
fallback, where a post-resolution dir swap is the real threat. Gate it to the
non-home (fallback) dir only.

P2: when the home candidate failed (setting last_lock_error) but the temp
fallback succeeded, acquire_lock returned "acquired" yet last_lock_error still
held the home-dir error. No caller reads it on success today, but it is
misleading state. Clear last_lock_error on successful resolution in
_ensure_state_dir.

Tests: symlinked home still writes its cache; symlinked fallback dir still
refuses the write; last_lock_error is cleared after a fallen-back acquire.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Vishnu <79318686+zeus-12@users.noreply.github.com>
Co-authored-by: Aakash Velusamy <aakashvpsgtech@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: MohamedAklamaash <aklamaash78@gmail.com>
Co-authored-by: Mohamed Aklamaash M.R <111295679+MohamedAklamaash@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: NandaPranesh <106886030+anonpran@users.noreply.github.com>
Co-authored-by: pugazhendhi-m <132246623+pugazhendhi-m@users.noreply.github.com>
Co-authored-by: Nanda Pranesh <nandapranesh27@gmail.com>
---
 .../ai_tools_discovery.py                     |  26 +-
 scripts/coding_discovery_tools/cache.py       | 195 ++++++++-
 tests/test_discovery_flow.py                  | 381 +++++++++++++++++-
 3 files changed, 581 insertions(+), 21 deletions(-)

diff --git a/scripts/coding_discovery_tools/ai_tools_discovery.py b/scripts/coding_discovery_tools/ai_tools_discovery.py
index 01440c29..9e30c705 100644
--- a/scripts/coding_discovery_tools/ai_tools_discovery.py
+++ b/scripts/coding_discovery_tools/ai_tools_discovery.py
@@ -2051,9 +2051,33 @@ def time_step(name: str, phase: str) -> Iterator[None]:
     # Acquire single-flight lock. Another discovery process running means we exit
     # quietly — hook fire was redundant. Heartbeat thread keeps the lock mtime fresh
     # so other hooks can detect a zombie (>STALE_LOCK_SECONDS without a tick).
-    if not discovery_cache.acquire_lock():
+    _lock_status = discovery_cache.acquire_lock()
+    if _lock_status == "contended":
         logger.info("Another discovery process is running (lock held); exiting.")
         sys.exit(0)
+    if _lock_status == "setup_failed":
+        logger.error(
+            "Discovery preflight failed: cannot create lock/cache dir "
+            f"({discovery_cache.UNBOUND_DIR}); exiting."
+        )
+        try:
+            _ctx = dict(sentry_ctx)
+            _ctx.update({
+                "phase": "acquire_lock",
+                "home": os.environ.get("HOME") or os.path.expanduser("~"),
+                "unbound_dir": str(discovery_cache.UNBOUND_DIR),
+                "lock_error": discovery_cache.last_lock_error or "",
+            })
+            if hasattr(os, "getuid"):
+                _ctx["uid"] = os.getuid()
+            report_to_sentry(
+                OSError("discovery lock setup failed"),
+                context=_ctx,
+                level="error",
+            )
+        except Exception:
+            pass
+        sys.exit(0)
     _heartbeat_stop = None
 
     try:
diff --git a/scripts/coding_discovery_tools/cache.py b/scripts/coding_discovery_tools/cache.py
index 2d03c632..b65ffba1 100644
--- a/scripts/coding_discovery_tools/cache.py
+++ b/scripts/coding_discovery_tools/cache.py
@@ -12,6 +12,7 @@
 import json
 import logging
 import os
+import stat
 import tempfile
 import threading
 import time
@@ -21,13 +22,149 @@
 
 logger = logging.getLogger(__name__)
 
-UNBOUND_DIR = Path.home() / ".unbound"
+# Immutable home anchor. _ensure_state_dir() may reassign the *active* paths
+# below to a temp fallback, but the home candidate is always derived from this
+# so a second acquire_lock() in the same process re-evaluates home first instead
+# of treating an already-resolved temp dir as a trusted non-private candidate.
+_HOME_STATE_DIR = Path.home() / ".unbound"
+UNBOUND_DIR = _HOME_STATE_DIR
 CACHE_PATH = UNBOUND_DIR / "discovery-cache.json"
 LOCK_PATH = UNBOUND_DIR / "discovery.lock"
 
 STALE_LOCK_SECONDS = 15 * 60
 HEARTBEAT_INTERVAL_SECONDS = 60
 
+# Never open the lock through a swapped symlink; 0 on platforms lacking it.
+_O_NOFOLLOW = getattr(os, "O_NOFOLLOW", 0)
+
+last_lock_error: Optional[str] = None
+
+
+def _state_dir_candidates() -> list:
+    """Ordered (path, is_private_temp) candidates for the state dir.
+    Home first (preserves existing behavior); deterministic uid-namespaced
+    temp dir as fallback. Split out as a function so tests can inject candidates.
+
+    The fixed name is a deliberate trade-off: it must be deterministic so a
+    daemon and a login session of the same uid resolve to the SAME dir (shared
+    cross-process single-flight). A hostile pre-existing entry at that fixed name
+    is refused below (-> setup_failed, surfaced to Sentry by the caller) rather
+    than silently working around it."""
+    candidates = [(_HOME_STATE_DIR, False)]
+    if hasattr(os, "getuid"):
+        # POSIX: /var/tmp is cross-session AND reboot-stable (unlike per-session
+        # launchd $TMPDIR via tempfile.gettempdir() on macOS, which would split
+        # the lock/cache between a daemon and a login session of the same uid).
+        # Matches utils._get_queue_file_path()'s /var/tmp/...-{uid} idiom.
+        candidates.append((Path(f"/var/tmp/unbound-{os.getuid()}"), True))
+    else:
+        # Windows: no uid; gettempdir() is already per-user there.
+        candidates.append((Path(tempfile.gettempdir()) / "unbound", True))
+    return candidates
+
+
+def _is_unsafe_existing(path: Path) -> bool:
+    """True if `path` already exists as a symlink, a non-dir, or a dir we don't
+    own — i.e. a path we must NOT trust for a fixed-name dir in a shared temp."""
+    try:
+        st = os.lstat(str(path))
+    except OSError:
+        return False  # doesn't exist yet — safe to create
+    if stat.S_ISLNK(st.st_mode):
+        return True
+    if not stat.S_ISDIR(st.st_mode):
+        return True
+    if hasattr(os, "getuid") and st.st_uid != os.getuid():
+        return True
+    return False
+
+
+def _parent_is_unsafe(path: Path) -> bool:
+    """True if `path`'s parent is world-writable but NOT sticky. Our symlink/
+    ownership hardening only holds if the parent (e.g. /var/tmp) is sticky
+    (mode 1777) so a non-owner can't remove/rename our fixed-name entry."""
+    if not hasattr(os, "getuid"):
+        # Windows: st_mode reports 0o777 with no sticky bit for normal dirs, so
+        # this POSIX world-writable/sticky check is meaningless (and would reject
+        # every candidate). gettempdir() is already per-user there.
+        return False
+    try:
+        pst = os.lstat(str(path.parent))
+    except OSError:
+        return False  # parent missing; mkdir(parents=True) will handle/fail
+    # world-writable but NOT sticky = anyone can swap our fixed-name entry
+    if (pst.st_mode & stat.S_IWOTH) and not (pst.st_mode & stat.S_ISVTX):
+        return True
+    return False
+
+
+def _try_state_dir(path: Path, is_private: bool) -> bool:
+    """Make `path` usable. Returns True if it is a writable dir we can use.
+    mkdir-only probe (no file-write probe — see module note). On the private
+    temp candidate, refuse hostile pre-existing entries and lock perms to 0700."""
+    global last_lock_error
+    try:
+        if is_private and _parent_is_unsafe(path):
+            last_lock_error = f"unsafe (non-sticky world-writable) parent for {path}"
+            return False
+        if is_private and _is_unsafe_existing(path):
+            last_lock_error = f"unsafe pre-existing state dir: {path}"
+            return False
+        path.mkdir(parents=True, exist_ok=True)
+        # mkdir with exist_ok=True is a no-op success on a pre-existing dir, so it
+        # does not prove we can create entries inside it. Probe writability so an
+        # existing-but-unwritable dir falls through to the next candidate instead
+        # of failing later at lock creation (which would skip the fallback).
+        if not os.access(str(path), os.W_OK | os.X_OK):
+            last_lock_error = f"state dir not writable: {path}"
+            return False
+        if is_private:
+            try:
+                os.chmod(str(path), 0o700)
+            except OSError:
+                pass
+            # Re-check after creation in case of a race that swapped it, and
+            # confirm chmod actually took (it is best-effort above) so we never
+            # trust a private dir left group/other-accessible.
+            if _is_unsafe_existing(path):
+                last_lock_error = f"unsafe state dir after create: {path}"
+                return False
+            # chmod above is best-effort; a pre-existing 0755 dir or a failed
+            # chmod must NOT be trusted — any group/other bit leaks discovery
+            # state (tool inventory, home_user) to other local users. POSIX only:
+            # on Windows st_mode is synthetic (0o777 for dirs, no real perm bits)
+            # so this check is meaningless there, and %TEMP% is per-user anyway.
+            if hasattr(os, "getuid"):
+                st = os.lstat(str(path))
+                if st.st_mode & 0o077:
+                    last_lock_error = f"state dir not private (mode {oct(stat.S_IMODE(st.st_mode))}): {path}"
+                    return False
+        return True
+    except OSError as e:
+        last_lock_error = str(e)
+        return False
+
+
+def _ensure_state_dir() -> bool:
+    """Resolve UNBOUND_DIR/CACHE_PATH/LOCK_PATH to the first usable candidate,
+    reassigning the module globals when falling back. Returns True if a usable
+    dir was found, False otherwise (caller returns 'setup_failed')."""
+    global UNBOUND_DIR, CACHE_PATH, LOCK_PATH, last_lock_error
+    for path, is_private in _state_dir_candidates():
+        if _try_state_dir(path, is_private):
+            if path != UNBOUND_DIR:
+                logger.warning(f"home state dir unusable; using fallback state dir {path}")
+                UNBOUND_DIR = path
+                CACHE_PATH = path / "discovery-cache.json"
+                LOCK_PATH = path / "discovery.lock"
+            # An earlier candidate (e.g. an unwritable home) may have set
+            # last_lock_error; clear it now that we have a usable dir so a
+            # successful (possibly fallen-back) acquire never reports a stale
+            # error to any future reader.
+            last_lock_error = None
+            return True
+    return False
+
 
 def _now_iso() -> str:
     return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
@@ -47,6 +184,16 @@ def read_cache() -> dict:
 
 def atomic_write_cache(data: dict) -> None:
     try:
+        # Refuse to write the cache (which can contain MCP configs / tool
+        # inventory / paths) through a symlinked state dir — but ONLY for the
+        # shared-temp fallback, where a post-resolution dir swap is the threat.
+        # The home dir (~/.unbound) is trusted by design and MAY legitimately be
+        # a user-created symlink; guarding it there would silently skip every
+        # cache write and force a cold re-upload each run. Check BEFORE mkdir,
+        # since mkdir(parents=True) would follow a symlink and create its target.
+        if UNBOUND_DIR != _HOME_STATE_DIR and UNBOUND_DIR.is_symlink():
+            logger.warning(f"discovery-cache write skipped: fallback state dir is a symlink: {UNBOUND_DIR}")
+            return
         UNBOUND_DIR.mkdir(parents=True, exist_ok=True)
         fd, tmp = tempfile.mkstemp(prefix=".discovery-cache.", suffix=".tmp", dir=str(UNBOUND_DIR))
         try:
@@ -103,37 +250,51 @@ def _lock_is_live() -> bool:
     return age < STALE_LOCK_SECONDS
 
 
-def acquire_lock() -> bool:
-    """Best-effort exclusive lock. Returns True on success, False if held by a live process."""
-    try:
-        UNBOUND_DIR.mkdir(parents=True, exist_ok=True)
-    except OSError as e:
-        logger.warning(f"could not create {UNBOUND_DIR}: {e}")
-        return False
+def acquire_lock() -> str:
+    """Best-effort exclusive lock. Returns "acquired", "contended" (held by a live process), or "setup_failed"."""
+    global last_lock_error
+    last_lock_error = None
+    if not _ensure_state_dir():
+        # _ensure_state_dir() already created+verified the dir (or returned
+        # False -> setup_failed); no redundant blind mkdir here (TOCTOU).
+        return "setup_failed"
 
     if LOCK_PATH.exists() and _lock_is_live():
-        return False
+        return "contended"
 
     if LOCK_PATH.exists():
         try:
             LOCK_PATH.unlink()
         except OSError as e:
+            last_lock_error = str(e)
             logger.warning(f"could not steal stale lock: {e}")
-            return False
+            return "setup_failed"
 
     try:
-        fd = os.open(str(LOCK_PATH), os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
+        fd = os.open(str(LOCK_PATH), os.O_CREAT | os.O_EXCL | os.O_WRONLY | _O_NOFOLLOW, 0o600)
     except FileExistsError:
-        return False
+        return "contended"
     except OSError as e:
+        last_lock_error = str(e)
         logger.warning(f"could not create lock: {e}")
-        return False
+        return "setup_failed"
 
     try:
-        os.write(fd, f"{os.getpid()} {_now_iso()}\n".encode("utf-8"))
-    finally:
-        os.close(fd)
-    return True
+        try:
+            os.write(fd, f"{os.getpid()} {_now_iso()}\n".encode("utf-8"))
+        finally:
+            os.close(fd)
+    except OSError as e:
+        last_lock_error = str(e)
+        logger.warning(f"could not write lock: {e}")
+        # Remove the lock file we just created so a write failure can't leave a
+        # fresh ghost lock that makes the next run see false contention.
+        try:
+            LOCK_PATH.unlink(missing_ok=True)
+        except OSError:
+            pass
+        return "setup_failed"
+    return "acquired"
 
 
 def release_lock() -> None:
diff --git a/tests/test_discovery_flow.py b/tests/test_discovery_flow.py
index b611b1eb..a86be1d4 100644
--- a/tests/test_discovery_flow.py
+++ b/tests/test_discovery_flow.py
@@ -9,18 +9,21 @@
 Tool detection runs un-mocked on whatever OS is available.
 """
 
+import errno
 import json
 import os
 import platform
 import shutil
+import stat
 import subprocess
 import sys
 import tempfile
 import threading
+import time
 import unittest
 from http.server import HTTPServer, BaseHTTPRequestHandler
 from pathlib import Path
-from unittest.mock import patch
+from unittest.mock import Mock, patch
 
 import scripts.coding_discovery_tools.utils as utils_mod
 from scripts.coding_discovery_tools.utils import (
@@ -202,7 +205,7 @@ class TestUnsupportedPlatformGuard(unittest.TestCase):
     def test_linux_proceeds_past_os_guard(self):
         """Linux is supported — it must NOT exit 3 at the OS guard.
 
-        We patch acquire_lock to return False so main() exits 0 at the
+        We patch acquire_lock to return "contended" so main() exits 0 at the
         single-flight lock check (the first exit point after the guard).
         Linux reaching that exit-0 proves it passed the OS guard rather
         than hitting the old exit-3.
@@ -211,7 +214,7 @@ def test_linux_proceeds_past_os_guard(self):
 
         argv = ["ai_tools_discovery.py", "--api-key", "k", "--domain", "http://127.0.0.1:1"]
         with patch.object(adm.platform, "system", return_value="Linux"), \
-             patch.object(adm.discovery_cache, "acquire_lock", return_value=False), \
+             patch.object(adm.discovery_cache, "acquire_lock", return_value="contended"), \
              patch.object(adm, "AIToolsDetector"), \
              patch.object(sys, "argv", argv):
             with self.assertRaises(SystemExit) as cm:
@@ -566,5 +569,377 @@ def test_prefix_collision_excludes_similar_username(self):
         self.assertIn("permissions", filtered_gowshik_2)
 
 
+class TestAcquireLockReasonCodes(unittest.TestCase):
+    """acquire_lock() returns "acquired"/"contended"/"setup_failed" and sets
+    last_lock_error only on setup failure."""
+
+    def setUp(self):
+        import scripts.coding_discovery_tools.cache as cache
+        self.cache = cache
+        self._tmp = tempfile.mkdtemp()
+        unbound_dir = Path(self._tmp) / ".unbound"
+        self._patchers = [
+            # _HOME_STATE_DIR is the home candidate _state_dir_candidates() reads;
+            # UNBOUND_DIR/LOCK_PATH/CACHE_PATH are the active paths downstream uses.
+            patch.object(cache, "_HOME_STATE_DIR", unbound_dir),
+            patch.object(cache, "UNBOUND_DIR", unbound_dir),
+            patch.object(cache, "LOCK_PATH", unbound_dir / "discovery.lock"),
+            patch.object(cache, "CACHE_PATH", unbound_dir / "discovery-cache.json"),
+        ]
+        for p in self._patchers:
+            p.start()
+        cache.last_lock_error = None
+
+    def tearDown(self):
+        for p in self._patchers:
+            p.stop()
+        shutil.rmtree(self._tmp, ignore_errors=True)
+
+    def test_acquire_lock_setup_failed_on_unwritable_dir(self):
+        with patch.object(
+            Path, "mkdir", side_effect=OSError(errno.EPERM, "Operation not permitted")
+        ):
+            self.assertEqual(self.cache.acquire_lock(), "setup_failed")
+        self.assertTrue(self.cache.last_lock_error)
+
+    def test_acquire_lock_contended_on_fresh_lock(self):
+        self.cache.UNBOUND_DIR.mkdir(parents=True, exist_ok=True)
+        self.cache.LOCK_PATH.write_text("123 now\n")
+        os.utime(self.cache.LOCK_PATH, (time.time(), time.time()))
+
+        self.assertEqual(self.cache.acquire_lock(), "contended")
+        self.assertIsNone(self.cache.last_lock_error)
+
+    def test_acquire_lock_acquired_clean(self):
+        self.assertEqual(self.cache.acquire_lock(), "acquired")
+        self.assertTrue(self.cache.LOCK_PATH.exists())
+
+    def test_acquire_lock_setup_failed_on_steal_stale_unlink_error(self):
+        # Create a STALE lock (mtime older than STALE_LOCK_SECONDS) so
+        # _lock_is_live() is False and acquire_lock takes the steal/unlink path.
+        self.cache.UNBOUND_DIR.mkdir(parents=True, exist_ok=True)
+        self.cache.LOCK_PATH.write_text("999 then\n")
+        stale = time.time() - (self.cache.STALE_LOCK_SECONDS + 60)
+        os.utime(self.cache.LOCK_PATH, (stale, stale))
+
+        # Start the patch AFTER creating the stale file so setup isn't broken.
+        with patch.object(
+            Path, "unlink", side_effect=OSError(errno.EPERM, "Operation not permitted")
+        ):
+            self.assertEqual(self.cache.acquire_lock(), "setup_failed")
+        self.assertTrue(self.cache.last_lock_error)
+
+    def test_acquire_lock_contended_on_open_race(self):
+        # TOCTOU race: LOCK_PATH absent at the exists() checks, but os.open with
+        # O_CREAT|O_EXCL loses the race and raises FileExistsError.
+        with patch.object(self.cache.os, "open", side_effect=FileExistsError()):
+            self.assertEqual(self.cache.acquire_lock(), "contended")
+        self.assertIsNone(self.cache.last_lock_error)
+
+    def test_acquire_lock_setup_failed_on_write_error(self):
+        # mkdir/exists/open succeed on a clean temp dir, but os.write fails.
+        with patch.object(self.cache.os, "write", side_effect=OSError(errno.ENOSPC, "No space left on device")):
+            self.assertEqual(self.cache.acquire_lock(), "setup_failed")
+        self.assertTrue(self.cache.last_lock_error)
+        # The lock file created by os.open must be removed on write failure, so a
+        # ghost lock can't make the next run see false contention.
+        self.assertFalse(self.cache.LOCK_PATH.exists())
+
+
+class TestStateDirFallback(unittest.TestCase):
+    """_ensure_state_dir falls back to a deterministic uid-namespaced temp dir
+    when home is unusable, refuses hostile pre-existing temp entries, and the
+    fallback path is fixed (never random)."""
+
+    def setUp(self):
+        import scripts.coding_discovery_tools.cache as cache
+        self.cache = cache
+        self._tmp = tempfile.mkdtemp()
+        # Stash globals the resolver may reassign so tearDown can restore them.
+        self._orig_unbound_dir = cache.UNBOUND_DIR
+        self._orig_cache_path = cache.CACHE_PATH
+        self._orig_lock_path = cache.LOCK_PATH
+        self._orig_home_state_dir = cache._HOME_STATE_DIR
+        cache.last_lock_error = None
+
+    def tearDown(self):
+        self.cache.UNBOUND_DIR = self._orig_unbound_dir
+        self.cache.CACHE_PATH = self._orig_cache_path
+        self.cache.LOCK_PATH = self._orig_lock_path
+        self.cache._HOME_STATE_DIR = self._orig_home_state_dir
+        self.cache.last_lock_error = None
+        shutil.rmtree(self._tmp, ignore_errors=True)
+
+    def _unmkdir_able(self, name):
+        # A candidate under a regular file: mkdir raises NotADirectoryError
+        # deterministically and cross-platform.
+        blocker = Path(self._tmp) / name
+        blocker.write_text("x")
+        return Path(blocker) / ".unbound"
+
+    def test_home_unusable_falls_back_to_temp(self):
+        bad_home = self._unmkdir_able("blocker")
+        good_temp = Path(self._tmp) / "unbound-test"
+        with patch.object(
+            self.cache, "_state_dir_candidates",
+            return_value=[(bad_home, False), (good_temp, True)],
+        ):
+            self.assertEqual(self.cache.acquire_lock(), "acquired")
+        self.assertEqual(self.cache.UNBOUND_DIR, good_temp)
+        self.assertTrue(self.cache.LOCK_PATH.exists())
+
+    def test_both_candidates_fail_returns_setup_failed(self):
+        bad_a = self._unmkdir_able("blocker_a")
+        bad_b = self._unmkdir_able("blocker_b")
+        with patch.object(
+            self.cache, "_state_dir_candidates",
+            return_value=[(bad_a, False), (bad_b, True)],
+        ):
+            self.assertEqual(self.cache.acquire_lock(), "setup_failed")
+        self.assertTrue(self.cache.last_lock_error)
+
+    @unittest.skipUnless(hasattr(os, "symlink"), "symlink unsupported")
+    def test_private_temp_symlink_refused(self):
+        target = Path(self._tmp) / "elsewhere"
+        target.mkdir()
+        link = Path(self._tmp) / "unbound-link"
+        os.symlink(str(target), str(link))
+        with patch.object(
+            self.cache, "_state_dir_candidates",
+            return_value=[(link, True)],
+        ):
+            self.assertEqual(self.cache.acquire_lock(), "setup_failed")
+        self.assertIn("unsafe", self.cache.last_lock_error)
+
+    @unittest.skipIf(sys.platform == "win32", "POSIX permission bits only")
+    def test_private_temp_created_0700(self):
+        new_dir = Path(self._tmp) / "unbound-fresh"
+        with patch.object(
+            self.cache, "_state_dir_candidates",
+            return_value=[(new_dir, True)],
+        ):
+            self.assertEqual(self.cache.acquire_lock(), "acquired")
+        self.assertTrue(new_dir.exists())
+        self.assertEqual(stat.S_IMODE(os.lstat(str(new_dir)).st_mode), 0o700)
+
+    def test_temp_candidate_is_deterministic_not_random(self):
+        # Regression guard against swapping in mkdtemp(): the temp candidate
+        # must be the fixed path. On POSIX it is /var/tmp/unbound-{uid}
+        # (cross-session + reboot-stable, NOT the per-session launchd $TMPDIR
+        # that gettempdir() would return on macOS); on Windows it is
+        # gettempdir()/unbound (already per-user there).
+        candidates = self.cache._state_dir_candidates()
+        if hasattr(os, "getuid"):
+            expected = Path(f"/var/tmp/unbound-{os.getuid()}")
+        else:
+            expected = Path(tempfile.gettempdir()) / "unbound"
+        self.assertEqual(candidates[-1][0], expected)
+        self.assertTrue(candidates[-1][1])  # flagged private
+
+    @unittest.skipUnless(hasattr(os, "symlink"), "symlink unsupported")
+    def test_symlinked_home_state_dir_still_writes_cache(self):
+        # A user may legitimately symlink ~/.unbound to a writable target. The
+        # symlink guard must NOT apply to the trusted home dir, or every cache
+        # write is silently skipped (cold cache, re-upload every run).
+        target = Path(self._tmp) / "real_unbound"
+        target.mkdir()
+        link = Path(self._tmp) / ".unbound-link"
+        os.symlink(str(target), str(link))
+        self.cache._HOME_STATE_DIR = link
+        self.cache.UNBOUND_DIR = link
+        self.cache.CACHE_PATH = link / "discovery-cache.json"
+        self.cache.atomic_write_cache({"tools": {"x": {"u": {"payload_hash": "h"}}}})
+        self.assertTrue((target / "discovery-cache.json").exists())
+
+    @unittest.skipUnless(hasattr(os, "symlink"), "symlink unsupported")
+    def test_symlinked_fallback_dir_write_skipped(self):
+        # The guard still protects the temp fallback: if the resolved (non-home)
+        # dir is a symlink, refuse to write the cache through it.
+        target = Path(self._tmp) / "evil"
+        target.mkdir()
+        link = Path(self._tmp) / "fallback-link"
+        os.symlink(str(target), str(link))
+        self.cache._HOME_STATE_DIR = Path(self._tmp) / ".unbound"  # != link
+        self.cache.UNBOUND_DIR = link
+        self.cache.CACHE_PATH = link / "discovery-cache.json"
+        self.cache.atomic_write_cache({"tools": {}})
+        self.assertFalse((target / "discovery-cache.json").exists())
+
+    def test_last_lock_error_cleared_on_successful_fallback(self):
+        # Home candidate fails (sets last_lock_error) but the temp fallback
+        # succeeds -> a successful acquire must not leave a stale error string.
+        bad_home = self._unmkdir_able("blk")
+        good_temp = Path(self._tmp) / "unbound-ok"
+        with patch.object(
+            self.cache, "_state_dir_candidates",
+            return_value=[(bad_home, False), (good_temp, True)],
+        ):
+            self.assertEqual(self.cache.acquire_lock(), "acquired")
+        self.assertIsNone(self.cache.last_lock_error)
+
+    def test_home_candidate_uses_immutable_anchor_after_fallback(self):
+        # P1 regression: after a fallback reassigns UNBOUND_DIR to a temp dir,
+        # the next resolution must still offer the real HOME as candidate[0]
+        # (the is_private=False trust is only safe for the user's own home),
+        # so the temp-dir hardening can't be bypassed on a re-entrant call.
+        self.cache.UNBOUND_DIR = Path("/var/tmp/unbound-already-fallen-back")
+        candidates = self.cache._state_dir_candidates()
+        self.assertEqual(candidates[0][0], self.cache._HOME_STATE_DIR)
+        self.assertFalse(candidates[0][1])  # home anchor, not private-temp
+
+    @unittest.skipIf(sys.platform == "win32", "POSIX permission bits only")
+    @unittest.skipIf(hasattr(os, "getuid") and os.getuid() == 0, "root bypasses W_OK")
+    def test_home_exists_but_unwritable_falls_back(self):
+        # mkdir succeeds on a pre-existing dir even when it isn't writable;
+        # the os.access probe must still force the fallback (and reassign all
+        # three path globals), not fail later at lock creation.
+        bad_home = Path(self._tmp) / "ro_home"
+        bad_home.mkdir()
+        os.chmod(str(bad_home), 0o500)
+        good_temp = Path(self._tmp) / "unbound-rw"
+        try:
+            with patch.object(
+                self.cache, "_state_dir_candidates",
+                return_value=[(bad_home, False), (good_temp, True)],
+            ):
+                self.assertEqual(self.cache.acquire_lock(), "acquired")
+            self.assertEqual(self.cache.UNBOUND_DIR, good_temp)
+            self.assertEqual(self.cache.CACHE_PATH, good_temp / "discovery-cache.json")
+            self.assertEqual(self.cache.LOCK_PATH, good_temp / "discovery.lock")
+        finally:
+            os.chmod(str(bad_home), 0o700)
+
+    def test_foreign_owned_temp_dir_refused(self):
+        # A pre-existing private candidate owned by someone else (ownership
+        # mismatch) must be refused, not trusted. Simulate via _is_unsafe_existing.
+        foreign = Path(self._tmp) / "unbound-foreign"
+        with patch.object(self.cache, "_is_unsafe_existing", return_value=True), \
+             patch.object(
+                 self.cache, "_state_dir_candidates",
+                 return_value=[(foreign, True)],
+             ):
+            self.assertEqual(self.cache.acquire_lock(), "setup_failed")
+        self.assertTrue(self.cache.last_lock_error)
+
+    @unittest.skipIf(sys.platform == "win32", "POSIX permission bits only")
+    def test_world_readable_temp_dir_refused(self):
+        # A pre-existing 0755 dir whose chmod-to-0700 fails (simulated no-op)
+        # leaks discovery state to other users; the post-create mode recheck
+        # must refuse it.
+        loose = Path(self._tmp) / "unbound-loose"
+        loose.mkdir()
+        os.chmod(str(loose), 0o755)
+        try:
+            with patch.object(self.cache.os, "chmod", lambda *a, **k: None), \
+                 patch.object(
+                     self.cache, "_state_dir_candidates",
+                     return_value=[(loose, True)],
+                 ):
+                self.assertEqual(self.cache.acquire_lock(), "setup_failed")
+            self.assertIn("not private", self.cache.last_lock_error)
+        finally:
+            os.chmod(str(loose), 0o700)
+
+    @unittest.skipIf(sys.platform == "win32", "POSIX permission bits only")
+    def test_non_sticky_world_writable_parent_refused(self):
+        # The symlink/ownership hardening only holds if the parent is sticky.
+        # A world-writable, NON-sticky parent lets anyone swap our fixed-name
+        # entry, so the candidate must be refused.
+        parent = Path(self._tmp) / "open_parent"
+        parent.mkdir()
+        os.chmod(str(parent), 0o777)  # world-writable, NOT sticky
+        candidate = parent / "unbound-x"
+        try:
+            with patch.object(
+                self.cache, "_state_dir_candidates",
+                return_value=[(candidate, True)],
+            ):
+                self.assertEqual(self.cache.acquire_lock(), "setup_failed")
+            self.assertTrue(self.cache.last_lock_error)
+        finally:
+            os.chmod(str(parent), 0o700)
+
+    def test_fallback_cache_writes_land_in_fallback_dir(self):
+        # End-to-end coherence of the global reassignment: after falling back to
+        # the injected private candidate, cache writes must land UNDER it (via
+        # the reassigned CACHE_PATH), not under the original home dir.
+        good_temp = Path(self._tmp) / "unbound-fallback"
+        with patch.object(
+            self.cache, "_state_dir_candidates",
+            return_value=[(good_temp, True)],
+        ):
+            self.assertEqual(self.cache.acquire_lock(), "acquired")
+            self.cache.update_tool("claude-code", "u", "hash123")
+        self.assertEqual(self.cache.CACHE_PATH, good_temp / "discovery-cache.json")
+        self.assertTrue(self.cache.CACHE_PATH.exists())
+        # The write must round-trip through the reassigned fallback path, not
+        # the original home dir.
+        self.assertEqual(
+            self.cache.get_cached_hash("claude-code", "u"), "hash123"
+        )
+
+
+class TestMainLockSetupSentry(unittest.TestCase):
+    """main() reports lock setup failures to Sentry and exits 0, but stays
+    quiet on genuine contention. Sentry reporting can never crash the exit."""
+
+    def setUp(self):
+        import scripts.coding_discovery_tools.ai_tools_discovery as adm
+        self.adm = adm
+        self.argv = [
+            "ai_tools_discovery.py", "--api-key", "k", "--domain", "http://127.0.0.1:1"
+        ]
+
+    def test_main_reports_sentry_on_lock_setup_failure(self):
+        adm = self.adm
+        mock_sentry = Mock()
+        with patch.object(adm.platform, "system", return_value="Linux"), \
+             patch.object(adm.discovery_cache, "acquire_lock", return_value="setup_failed"), \
+             patch.object(adm.discovery_cache, "last_lock_error", "EPERM Operation not permitted"), \
+             patch.object(adm, "report_to_sentry", mock_sentry), \
+             patch.object(adm, "AIToolsDetector") as mock_detector, \
+             patch.object(sys, "argv", self.argv):
+            with self.assertRaises(SystemExit) as cm:
+                adm.main()
+
+        self.assertEqual(cm.exception.code, 0)
+        mock_sentry.assert_called_once()
+        _, kwargs = mock_sentry.call_args
+        self.assertEqual(kwargs["level"], "error")
+        ctx = kwargs["context"]
+        self.assertEqual(ctx["phase"], "acquire_lock")
+        self.assertIn("unbound_dir", ctx)
+        self.assertTrue(ctx["lock_error"])
+        mock_detector.assert_not_called()
+
+    def test_main_no_sentry_on_genuine_contention(self):
+        adm = self.adm
+        mock_sentry = Mock()
+        with patch.object(adm.platform, "system", return_value="Linux"), \
+             patch.object(adm.discovery_cache, "acquire_lock", return_value="contended"), \
+             patch.object(adm, "report_to_sentry", mock_sentry), \
+             patch.object(adm, "AIToolsDetector"), \
+             patch.object(sys, "argv", self.argv):
+            with self.assertRaises(SystemExit) as cm:
+                adm.main()
+
+        self.assertEqual(cm.exception.code, 0)
+        mock_sentry.assert_not_called()
+
+    def test_main_sentry_failure_does_not_crash_exit(self):
+        adm = self.adm
+        mock_sentry = Mock(side_effect=RuntimeError("boom"))
+        with patch.object(adm.platform, "system", return_value="Linux"), \
+             patch.object(adm.discovery_cache, "acquire_lock", return_value="setup_failed"), \
+             patch.object(adm.discovery_cache, "last_lock_error", "EPERM"), \
+             patch.object(adm, "report_to_sentry", mock_sentry), \
+             patch.object(adm, "AIToolsDetector"), \
+             patch.object(sys, "argv", self.argv):
+            with self.assertRaises(SystemExit) as cm:
+                adm.main()
+
+        self.assertEqual(cm.exception.code, 0)
+
+
 if __name__ == "__main__":
     unittest.main()

From eea90c923c2e4683ce37efbaac3de00b59efb437 Mon Sep 17 00:00:00 2001
From: NandaPranesh <106886030+anonpran@users.noreply.github.com>
Date: Fri, 5 Jun 2026 00:27:55 +0530
Subject: [PATCH 3/5] fix(copilot-cli): only detect CLI on CLI-exclusive
 ~/.copilot markers (shared skills/ no longer triggers false positive) (#164)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* fix(copilot-cli): only detect CLI on CLI-exclusive ~/.copilot markers

The Copilot CLI detector declared a "GitHub Copilot CLI" install whenever
~/.copilot/ contained ANY marker, including skills/. But ~/.copilot/skills/
is a SHARED location the VS Code/JetBrains Copilot agent also reads (per
official VS Code Agent Skills docs), so IDE-only users with no actual CLI
got phantom CLI rows (confirmed in prod: device MY4W6QQGCQ / user karthick).

Split markers into STRONG (CLI-exclusive: config.json, mcp-config.json,
settings.json, lsp-config.json, permissions-config.json, session-store.db
+ dirs logs/session-state/command-history-state/installed-plugins/
plugin-data/hooks/ide/pkg) vs SHARED (copilot-instructions.md, skills/,
agents/, instructions/). Detection now fires only on a STRONG marker; a
dir holding only shared markers logs an INFO line and is suppressed.
Skills/rules extraction is unchanged (still enriches a detected install).
Per-user over-attribution is a separate fix.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(copilot-cli): nest COPILOT_HOME test dir under managed tmp_dir (no temp leak)

Addresses greptile P2: reuse the class-managed self.tmp_dir (cleaned in
tearDown) instead of an unmanaged tempfile.mkdtemp() that leaked a /tmp dir
per run.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(copilot-cli): treat ~/.copilot/ide as SHARED (the IDE extension writes it)

~/.copilot/ide/<uuid>.lock is written by the VS Code/JetBrains Copilot
EXTENSION (microsoft/vscode-copilot-chat#3583) as a discovery lock so the
CLI can connect — an IDE-only user with no standalone CLI has it. It was
mis-classified as a STRONG (CLI-exclusive) marker, so an IDE-only Copilot
user would STILL be flagged as a phantom "GitHub Copilot CLI" install —
the exact false positive this PR exists to remove.

Demote `ide` from STRONG to SHARED so it can never alone trigger a CLI
record (it still enriches a real install). Add a regression test for an
ide/-only ~/.copilot. Confirmed on a real machine: the lock file contains
"ideName": "Visual Studio Code".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .../macos/copilot_cli/copilot_cli.py          | 145 ++++++++++++++----
 .../windows/copilot_cli/copilot_cli.py        |   6 +-
 tests/test_copilot_cli_discovery.py           |  87 ++++++++++-
 3 files changed, 202 insertions(+), 36 deletions(-)

diff --git a/scripts/coding_discovery_tools/macos/copilot_cli/copilot_cli.py b/scripts/coding_discovery_tools/macos/copilot_cli/copilot_cli.py
index 0a5e01d9..5a547501 100644
--- a/scripts/coding_discovery_tools/macos/copilot_cli/copilot_cli.py
+++ b/scripts/coding_discovery_tools/macos/copilot_cli/copilot_cli.py
@@ -15,7 +15,7 @@
 import os
 import re
 from pathlib import Path
-from typing import Dict, List, Optional
+from typing import Dict, FrozenSet, List, Optional
 
 from ...coding_tool_base import BaseToolDetector
 from ...constants import VERSION_TIMEOUT
@@ -26,33 +26,60 @@
 
 # The CLI home directory is version-dependent in its exact contents. We never
 # gate on a single file, and never on bare-directory existence alone — we
-# require the config dir to exist AND contain at least one known artifact from
-# this union set (review P0-1). The marker set mirrors GitHub's documented
-# config-dir contents (Copilot CLI "config dir reference"); ``pkg`` is an
-# observed-but-undocumented binary dir kept as an extra real-world signal.
+# require the config dir to exist AND contain at least one STRONG (CLI-exclusive)
+# artifact (review P0-1).
+#
+# Markers are split into two tiers:
+#
+#   STRONG (CLI-exclusive): files/dirs the standalone GitHub Copilot CLI writes
+#     and that no other tool reads. Presence of any one of these is sufficient
+#     to declare a CLI install. The set mirrors GitHub's documented config-dir
+#     contents (Copilot CLI "config dir reference"); ``pkg`` is an
+#     observed-but-undocumented binary dir kept as an extra real-world signal.
+#
+#   SHARED: markers under ``~/.copilot/`` that are NOT exclusive to the CLI,
+#     because another GitHub Copilot surface also produces them. Two ways:
+#       - ALSO read by the Copilot VS Code/JetBrains agent: ``skills/`` (and
+#         ``agents/``, ``instructions/``, ``copilot-instructions.md``). Per the
+#         VS Code Agent Skills docs, ``~/.copilot/skills/`` is a shared location
+#         the IDE agent reads, so it is not exclusive to the CLI:
+#         https://code.visualstudio.com/docs/agent-customization/agent-skills
+#       - WRITTEN by the IDE extension itself: ``ide/``. The Copilot VS Code /
+#         JetBrains extension drops a discovery lock file at
+#         ``~/.copilot/ide/<uuid>.lock`` (containing the IDE name + pid) so the
+#         CLI can connect — see microsoft/vscode-copilot-chat#3583. An IDE-only
+#         user with no standalone CLI therefore has ``~/.copilot/ide/`` even
+#         though the CLI was never installed.
+#     Any of these can be present for an IDE-only user, so none can alone declare
+#     a CLI install (doing so produces phantom CLI rows); they are tracked solely
+#     to log/suppress that case.
 _CLI_DIR_NAME = ".copilot"
-_CLI_MARKER_FILES = frozenset({
-    "settings.json",
+_CLI_STRONG_MARKER_FILES = frozenset({
     "config.json",
     "mcp-config.json",
+    "settings.json",
     "lsp-config.json",
     "permissions-config.json",
-    "copilot-instructions.md",
     "session-store.db",
 })
-_CLI_MARKER_DIRS = frozenset({
-    "agents",
-    "hooks",
-    "ide",
-    "installed-plugins",
-    "instructions",
+_CLI_STRONG_MARKER_DIRS = frozenset({
     "logs",
-    "plugin-data",
     "session-state",
     "command-history-state",
-    "skills",
+    "installed-plugins",
+    "plugin-data",
+    "hooks",
     "pkg",
 })
+_CLI_SHARED_MARKER_FILES = frozenset({
+    "copilot-instructions.md",
+})
+_CLI_SHARED_MARKER_DIRS = frozenset({
+    "skills",
+    "agents",
+    "instructions",
+    "ide",
+})
 
 
 def _resolve_copilot_dir(user_home: Path) -> Path:
@@ -79,26 +106,23 @@ def _resolve_copilot_dir(user_home: Path) -> Path:
     return user_home / _CLI_DIR_NAME
 
 
-def _copilot_dir_has_known_artifact(copilot_dir: Path) -> bool:
-    """Return True if ``copilot_dir`` contains at least one known CLI artifact.
-
-    A known artifact is any of the marker files (as a file) or any of the
-    marker directories (as a directory). Bare directory existence is not
-    sufficient — the layout is version-dependent, so we accept a union of
-    signals but require at least one to actually be present.
+def _dir_has_any_marker(
+    copilot_dir: Path, marker_files: FrozenSet[str], marker_dirs: FrozenSet[str]
+) -> bool:
+    """Return True if ``copilot_dir`` holds any of ``marker_files`` (as a file)
+    or ``marker_dirs`` (as a directory). Errors are swallowed (the tool must
+    never crash) — a marker that can't be stat'd is treated as absent.
     """
     try:
-        for marker in _CLI_MARKER_FILES:
-            candidate = copilot_dir / marker
+        for marker in marker_files:
             try:
-                if candidate.is_file():
+                if (copilot_dir / marker).is_file():
                     return True
             except OSError:
                 continue
-        for marker in _CLI_MARKER_DIRS:
-            candidate = copilot_dir / marker
+        for marker in marker_dirs:
             try:
-                if candidate.is_dir():
+                if (copilot_dir / marker).is_dir():
                     return True
             except OSError:
                 continue
@@ -107,6 +131,48 @@ def _copilot_dir_has_known_artifact(copilot_dir: Path) -> bool:
     return False
 
 
+def _copilot_dir_has_strong_artifact(copilot_dir: Path) -> bool:
+    """Return True if ``copilot_dir`` holds at least one STRONG CLI artifact.
+
+    A strong artifact is any of the STRONG marker files (present as a file) or
+    any of the STRONG marker directories (present as a directory). These are
+    written by the standalone GitHub Copilot CLI and read by no other tool, so a
+    single one is sufficient to declare a CLI install. Bare directory existence
+    is not sufficient — the layout is version-dependent, so we accept a union of
+    signals but require at least one to actually be present.
+
+    SHARED markers (``skills/``, ``agents/``, ``instructions/``,
+    ``copilot-instructions.md``) are intentionally excluded: they are also read
+    by the GitHub Copilot VS Code/JetBrains agent
+    (https://code.visualstudio.com/docs/agent-customization/agent-skills), so on
+    their own they cannot distinguish a CLI install from an IDE-only user and
+    would produce phantom CLI rows. Use ``_copilot_dir_has_shared_artifact`` to
+    detect that case for logging/suppression.
+    """
+    return _dir_has_any_marker(
+        copilot_dir, _CLI_STRONG_MARKER_FILES, _CLI_STRONG_MARKER_DIRS
+    )
+
+
+def _copilot_dir_has_shared_artifact(copilot_dir: Path) -> bool:
+    """Return True if ``copilot_dir`` holds at least one SHARED Copilot marker.
+
+    A shared marker is any of the SHARED marker files (present as a file) or any
+    of the SHARED marker directories (present as a directory): ``skills/``,
+    ``agents/``, ``instructions/``, ``copilot-instructions.md``. These live under
+    ``~/.copilot/`` but are ALSO read by the GitHub Copilot VS Code extension /
+    JetBrains plugin's agent mode
+    (https://code.visualstudio.com/docs/agent-customization/agent-skills), so
+    they cannot, on their own, declare a standalone CLI install — an IDE-only
+    user who never installed the CLI can have them. This predicate exists only to
+    recognise the "shared markers but no strong CLI artifact" case so it can be
+    logged and suppressed rather than reported as a phantom CLI row.
+    """
+    return _dir_has_any_marker(
+        copilot_dir, _CLI_SHARED_MARKER_FILES, _CLI_SHARED_MARKER_DIRS
+    )
+
+
 _VERSION_RE = re.compile(r"\d+\.\d+\.\d+(?:[.\-+][0-9A-Za-z.\-]+)?")
 
 
@@ -134,8 +200,12 @@ class MacOSCopilotCliDetector(BaseToolDetector):
 
     Detection involves:
     - Checking that a user's ``~/.copilot/`` directory exists.
-    - Verifying it contains at least one known CLI artifact (a marker file or
-      marker directory) so a stray empty ``~/.copilot`` does not count.
+    - Verifying it contains at least one STRONG (CLI-exclusive) marker (a strong
+      marker file or directory) so a stray empty ``~/.copilot`` does not count.
+      A dir holding only SHARED markers (skills/agents/instructions/
+      copilot-instructions.md, which the IDE Copilot agent reads, or ide/, which
+      the IDE extension writes as a discovery lock) is the VS Code/JetBrains
+      agent rather than the CLI and is suppressed.
 
     When ``user_home`` is set on the instance (the per-user path used by the
     live discovery loop via ``detect_tool_for_user``), detection is scoped to
@@ -268,7 +338,9 @@ def _detect_for_user(self, user_home: Path) -> Optional[Dict]:
 
         Returns a tool-info dict when the resolved config dir (``COPILOT_HOME``
         when set for this user, else ``user_home/.copilot``) exists and holds at
-        least one known artifact; otherwise None.
+        least one STRONG CLI artifact; otherwise None. A dir holding only SHARED
+        markers (skills/agents/instructions/copilot-instructions.md/ide) is the
+        IDE Copilot agent, not the CLI — it is logged and suppressed.
         """
         copilot_dir = _resolve_copilot_dir(user_home)
         try:
@@ -278,7 +350,14 @@ def _detect_for_user(self, user_home: Path) -> Optional[Dict]:
             logger.debug(f"Error checking Copilot CLI dir {copilot_dir}: {exc}")
             return None
 
-        if not _copilot_dir_has_known_artifact(copilot_dir):
+        if not _copilot_dir_has_strong_artifact(copilot_dir):
+            if _copilot_dir_has_shared_artifact(copilot_dir):
+                logger.info(
+                    "Skipping %s: only shared/IDE-written Copilot markers present "
+                    "(skills/agents/instructions/copilot-instructions.md/ide) — likely the "
+                    "VS Code/JetBrains Copilot agent, not the standalone CLI",
+                    copilot_dir,
+                )
             return None
 
         return {
diff --git a/scripts/coding_discovery_tools/windows/copilot_cli/copilot_cli.py b/scripts/coding_discovery_tools/windows/copilot_cli/copilot_cli.py
index a98f7b46..7db95faf 100644
--- a/scripts/coding_discovery_tools/windows/copilot_cli/copilot_cli.py
+++ b/scripts/coding_discovery_tools/windows/copilot_cli/copilot_cli.py
@@ -12,9 +12,11 @@
 and ``get_version`` (overridden to pass ``shell=True`` for the npm ``.cmd``
 shim, mirroring ``WindowsCodexDetector`` — without it the inherited probe would
 always read "unknown"). Everything else — the marker gate
-(``_copilot_dir_has_known_artifact``), ``_detect_for_user``, ``detect``, and
+(``_copilot_dir_has_strong_artifact``), ``_detect_for_user``, ``detect``, and
 ``detect_all_tools`` — is inherited from the macOS detector rather than
-re-derived (CLAUDE.md DRY). Mirrors the per-user/admin idiom in
+re-derived (CLAUDE.md DRY). The marker gate inherits the strong-vs-shared
+artifact split (``_copilot_dir_has_strong_artifact``) too. Mirrors the
+per-user/admin idiom in
 ``windows/github_copilot/detect_copilot.py``.
 """
 
diff --git a/tests/test_copilot_cli_discovery.py b/tests/test_copilot_cli_discovery.py
index 6dc39956..e5008009 100644
--- a/tests/test_copilot_cli_discovery.py
+++ b/tests/test_copilot_cli_discovery.py
@@ -178,11 +178,89 @@ def test_command_history_state_dir_marker_detected(self):
         (copilot_dir / "command-history-state").mkdir()
         self.assertIsNotNone(self.detector.detect())
 
-    def test_skills_dir_marker_detected(self):
+    def test_skills_dir_alone_not_detected(self):
+        """skills/ is a SHARED marker (the IDE Copilot agent reads it too), so on
+        its own it is not a standalone CLI install -> not detected."""
         copilot_dir = self._make_copilot_dir()
         (copilot_dir / "skills").mkdir()
+        self.assertIsNone(self.detector.detect())
+
+    def test_agents_dir_alone_not_detected(self):
+        """agents/ is a SHARED marker -> not a CLI install on its own."""
+        copilot_dir = self._make_copilot_dir()
+        (copilot_dir / "agents").mkdir()
+        self.assertIsNone(self.detector.detect())
+
+    def test_instructions_dir_alone_not_detected(self):
+        """instructions/ is a SHARED marker -> not a CLI install on its own."""
+        copilot_dir = self._make_copilot_dir()
+        (copilot_dir / "instructions").mkdir()
+        self.assertIsNone(self.detector.detect())
+
+    def test_copilot_instructions_md_alone_not_detected(self):
+        """copilot-instructions.md is a SHARED marker -> not a CLI install alone."""
+        copilot_dir = self._make_copilot_dir()
+        (copilot_dir / "copilot-instructions.md").write_text("hi", encoding="utf-8")
+        self.assertIsNone(self.detector.detect())
+
+    def test_ide_lock_dir_alone_not_detected(self):
+        """ide/ holds the discovery lock the VS Code/JetBrains Copilot EXTENSION
+        writes (microsoft/vscode-copilot-chat#3583), so an IDE-only user has it
+        with no CLI -> SHARED, not a CLI install on its own."""
+        copilot_dir = self._make_copilot_dir()
+        ide_dir = copilot_dir / "ide"
+        ide_dir.mkdir()
+        (ide_dir / "0c.lock").write_text(
+            '{"pid": 1, "ideName": "Visual Studio Code", "workspaceFolders": []}',
+            encoding="utf-8",
+        )
+        self.assertIsNone(self.detector.detect())
+
+    def test_only_skills_and_rules_not_detected(self):
+        """Repro for device MY4W6QQGCQ / user karthick: a ~/.copilot holding only
+        skills/ and copilot-instructions.md (both SHARED, no strong CLI artifact)
+        is the IDE Copilot agent, not the CLI -> not detected (no phantom row)."""
+        copilot_dir = self._make_copilot_dir()
+        (copilot_dir / "skills").mkdir()
+        (copilot_dir / "copilot-instructions.md").write_text("rules", encoding="utf-8")
+        self.assertIsNone(self.detector.detect())
+
+    def test_session_store_db_marker_detected(self):
+        """session-store.db is a STRONG CLI-exclusive marker -> detected."""
+        copilot_dir = self._make_copilot_dir()
+        (copilot_dir / "session-store.db").write_text("", encoding="utf-8")
         self.assertIsNotNone(self.detector.detect())
 
+    def test_real_install_with_strong_and_shared_markers_detected(self):
+        """A real install (strong markers config.json + session-store.db) that ALSO
+        has shared skills/ is detected — shared markers never veto a strong one."""
+        copilot_dir = self._make_copilot_dir()
+        (copilot_dir / "config.json").write_text("{}", encoding="utf-8")
+        (copilot_dir / "session-store.db").write_text("", encoding="utf-8")
+        (copilot_dir / "skills").mkdir()
+        self.assertIsNotNone(self.detector.detect())
+
+    def test_shared_only_logs_info_and_returns_none(self):
+        """A shared-only ~/.copilot emits an INFO suppression log and returns None."""
+        copilot_dir = self._make_copilot_dir()
+        (copilot_dir / "skills").mkdir()
+        with self.assertLogs(_DETECTOR_MOD, level="INFO") as captured:
+            self.assertIsNone(self.detector.detect())
+        self.assertTrue(
+            any("shared/IDE-written Copilot markers" in record.getMessage()
+                for record in captured.records)
+        )
+
+    def test_copilot_home_shared_only_not_detected(self):
+        """A COPILOT_HOME-relocated config dir holding only shared markers is
+        suppressed too — the gate runs on the resolved dir, wherever it lives."""
+        relocated = Path(self.tmp_dir) / "relocated-copilot"
+        (relocated / "skills").mkdir(parents=True)
+        detector = MacOSCopilotCliDetector()
+        with patch.dict(os.environ, {"COPILOT_HOME": str(relocated)}, clear=False), \
+                patch(f"{_DETECTOR_MOD}.is_running_as_root", return_value=False):
+            self.assertIsNone(detector.detect())
+
     def test_permissions_config_marker_detected(self):
         copilot_dir = self._make_copilot_dir()
         (copilot_dir / "permissions-config.json").write_text("{}", encoding="utf-8")
@@ -873,6 +951,13 @@ def test_unrelated_file_not_detected(self):
         (copilot_dir / "random.txt").write_text("hello", encoding="utf-8")
         self.assertIsNone(self.detector.detect())
 
+    def test_shared_only_not_detected(self):
+        """The strong-vs-shared marker split is inherited: a ~/.copilot holding
+        only the SHARED skills/ marker is the IDE Copilot agent, not the CLI."""
+        copilot_dir = self._make_copilot_dir()
+        (copilot_dir / "skills").mkdir()
+        self.assertIsNone(self.detector.detect())
+
     def test_detect_returns_unknown_version_when_binary_absent(self):
         """version falls back to 'unknown' when `copilot --version` yields nothing."""
         copilot_dir = self._make_copilot_dir()

From 5c2d6ce783c55e286f27a5a88004faa43fdee0dc Mon Sep 17 00:00:00 2001
From: Sumit Badsara <sumit@unboundsecurity.ai>
Date: Fri, 5 Jun 2026 02:52:25 +0530
Subject: [PATCH 4/5] =?UTF-8?q?feat(discovery):=20Approach=20B=20tool=20ch?=
 =?UTF-8?q?ange=20=E2=80=94=20own=20device=5Fid=20+=20is=5Fcontainer=20fla?=
 =?UTF-8?q?g?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Implements step ③ of Approach B (user-centric device attribution). Each
device reports its OWN device_id and a plain home_user; supersedes PR #165's
device_id injection + home_user namespacing (not carried forward).

- Salvage the in_container() helper into utils.py (checks /.dockerenv,
  /run/.containerenv, overlay root mount, cgroup v1 fallback; lru_cached).
  Only this helper is brought over from feat/device-id-host-override — not
  the injection override or namespacing.
- Add is_container (bool) as a top-level field in the report payload built
  by generate_single_tool_report, set from in_container().
- Stable container device_id: when /etc/machine-id is empty/absent, fall
  back to a UUID persisted in the home-user's ~/.unbound/ (read if present,
  else generate uuid4 + write) instead of the ephemeral hostname, so a
  restarted container keeps one device row. Reuses cache._ensure_state_dir()
  for robust state-dir resolution (writable fallback) rather than a bare
  Path.home(). Machine-id-first behavior unchanged; never raises.
- Tests: is_container present/top-level in report; persisted-UUID fallback
  (generate+write, read-existing, unwritable dir, write-failure); in_container
  detection with lru_cache clearing.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .../ai_tools_discovery.py                     |   5 +-
 .../coding_discovery_tools/linux/device_id.py |  51 +++-
 scripts/coding_discovery_tools/utils.py       |  42 +++
 tests/test_container_attribution.py           | 240 ++++++++++++++++++
 4 files changed, 333 insertions(+), 5 deletions(-)
 create mode 100644 tests/test_container_attribution.py

diff --git a/scripts/coding_discovery_tools/ai_tools_discovery.py b/scripts/coding_discovery_tools/ai_tools_discovery.py
index 9e30c705..8547cb5e 100644
--- a/scripts/coding_discovery_tools/ai_tools_discovery.py
+++ b/scripts/coding_discovery_tools/ai_tools_discovery.py
@@ -65,7 +65,7 @@
         CursorSkillsExtractorFactory,
         ClineSkillsExtractorFactory,
     )
-    from .utils import send_report_to_backend, send_scan_event, send_discovery_metrics, get_user_info, get_all_users_macos, get_all_users_windows, get_all_users_linux, load_pending_reports, save_failed_reports, report_to_sentry, get_claude_subscription_type, get_cursor_subscription_type, QUEUE_FILE
+    from .utils import send_report_to_backend, send_scan_event, send_discovery_metrics, get_user_info, get_all_users_macos, get_all_users_windows, get_all_users_linux, load_pending_reports, save_failed_reports, report_to_sentry, get_claude_subscription_type, get_cursor_subscription_type, in_container, QUEUE_FILE
     from .linux_extraction_helpers import linux_home_for_user
     from .logging_helpers import configure_logger, log_rules_details, log_mcp_details, log_settings_details
     from .settings_transformers import transform_settings_to_backend_format
@@ -119,7 +119,7 @@
         CursorSkillsExtractorFactory,
         ClineSkillsExtractorFactory,
     )
-    from scripts.coding_discovery_tools.utils import send_report_to_backend, send_scan_event, send_discovery_metrics, get_user_info, get_all_users_macos, get_all_users_windows, get_all_users_linux, load_pending_reports, save_failed_reports, report_to_sentry, get_claude_subscription_type, get_cursor_subscription_type, QUEUE_FILE
+    from scripts.coding_discovery_tools.utils import send_report_to_backend, send_scan_event, send_discovery_metrics, get_user_info, get_all_users_macos, get_all_users_windows, get_all_users_linux, load_pending_reports, save_failed_reports, report_to_sentry, get_claude_subscription_type, get_cursor_subscription_type, in_container, QUEUE_FILE
     from scripts.coding_discovery_tools.linux_extraction_helpers import linux_home_for_user
     from scripts.coding_discovery_tools.logging_helpers import configure_logger, log_rules_details, log_mcp_details, log_settings_details
     from scripts.coding_discovery_tools.settings_transformers import transform_settings_to_backend_format
@@ -1926,6 +1926,7 @@ def generate_single_tool_report(self, tool: Dict, device_id: str, home_user: str
             "home_user": home_user,
             "system_user": system_user or home_user,
             "device_id": device_id,
+            "is_container": in_container(),
             "tools": [tool_for_report]
         }
 
diff --git a/scripts/coding_discovery_tools/linux/device_id.py b/scripts/coding_discovery_tools/linux/device_id.py
index 25bdcb37..2ea392c7 100644
--- a/scripts/coding_discovery_tools/linux/device_id.py
+++ b/scripts/coding_discovery_tools/linux/device_id.py
@@ -1,10 +1,11 @@
 """Device ID extraction for Linux."""
 
 import logging
+import uuid
 from pathlib import Path
 
+from .. import cache
 from ..coding_tool_base import BaseDeviceIdExtractor
-from ..utils import get_hostname
 
 logger = logging.getLogger(__name__)
 
@@ -13,6 +14,8 @@
     Path("/var/lib/dbus/machine-id"),  # older dbus fallback
 ]
 
+_DEVICE_ID_FILENAME = "device-id"
+
 
 class LinuxDeviceIdExtractor(BaseDeviceIdExtractor):
     """Device ID extractor for Linux systems."""
@@ -20,7 +23,13 @@ class LinuxDeviceIdExtractor(BaseDeviceIdExtractor):
     def extract_device_id(self) -> str:
         """
         Return the machine-id from /etc/machine-id (systemd standard).
-        Falls back to hostname if the file is absent or unreadable.
+
+        When no machine-id is available (common in containers, which often have
+        an empty/absent /etc/machine-id), fall back to a UUID persisted in the
+        home-user's ``~/.unbound/`` directory. A restarted container that mounts
+        a persistent ``~/.unbound`` (the primary ``unbound login`` flow) then
+        keeps a single stable device row instead of exploding into one row per
+        launch — which an ephemeral hostname fallback would produce.
         """
         for path in _MACHINE_ID_PATHS:
             try:
@@ -31,4 +40,40 @@ def extract_device_id(self) -> str:
             except Exception as e:
                 logger.debug(f"Could not read {path}: {e}")
 
-        return get_hostname()
+        return self._persisted_device_id()
+
+    @staticmethod
+    def _persisted_device_id() -> str:
+        """Read (or generate-and-persist) a stable UUID under ``~/.unbound/``.
+
+        Reuses the repo's canonical state-dir resolver (``cache._ensure_state_dir``)
+        rather than a bare ``Path.home()`` so we honour the writable-fallback chain
+        and land next to the API key written by ``unbound login``. If the state dir
+        cannot be resolved or the write fails, return an unpersisted uuid4 — no
+        worse than the previous ephemeral hostname behaviour, and never raises.
+        """
+        try:
+            if not cache._ensure_state_dir():
+                logger.debug("No usable state dir for device-id; using ephemeral uuid")
+                return str(uuid.uuid4())
+
+            device_id_path = cache.UNBOUND_DIR / _DEVICE_ID_FILENAME
+
+            try:
+                if device_id_path.exists() and device_id_path.is_file():
+                    existing = device_id_path.read_text(encoding="utf-8").strip()
+                    if existing:
+                        return existing
+            except Exception as e:
+                logger.debug(f"Could not read persisted device-id at {device_id_path}: {e}")
+
+            new_id = str(uuid.uuid4())
+            try:
+                cache.UNBOUND_DIR.mkdir(parents=True, exist_ok=True)
+                device_id_path.write_text(new_id, encoding="utf-8")
+            except Exception as e:
+                logger.debug(f"Could not persist device-id at {device_id_path}: {e}")
+            return new_id
+        except Exception as e:
+            logger.debug(f"device-id fallback failed: {e}")
+            return str(uuid.uuid4())
diff --git a/scripts/coding_discovery_tools/utils.py b/scripts/coding_discovery_tools/utils.py
index a3748342..e83497e9 100644
--- a/scripts/coding_discovery_tools/utils.py
+++ b/scripts/coding_discovery_tools/utils.py
@@ -2,6 +2,7 @@
 Utility functions shared across the AI tools discovery system
 """
 
+import functools
 import json
 import logging
 import os
@@ -101,6 +102,47 @@ def get_hostname() -> str:
     return platform.node()
 
 
+@functools.lru_cache(maxsize=1)
+def in_container() -> bool:
+    """Best-effort detection of whether we're running inside a container.
+
+    Combines several signals because no single one is reliable across runtimes
+    and kernels:
+      - ``/.dockerenv`` / ``/run/.containerenv`` — Docker / Podman runtime markers.
+      - root filesystem mounted as ``overlay`` — cgroup-version-agnostic.
+      - ``/proc/1/cgroup`` docker/lxc/kube markers — cgroup v1 ONLY (v2 shows
+        ``0::/`` from inside, so this is a fallback, not the primary check).
+
+    This is for honest behavioural branching, not security — every marker here
+    is forgeable by whoever controls the container. Result is cached for the
+    process lifetime.
+    """
+    try:
+        if os.path.exists("/.dockerenv") or os.path.exists("/run/.containerenv"):
+            return True
+    except OSError:
+        pass
+
+    try:
+        with open("/proc/mounts", encoding="utf-8") as f:
+            for line in f:
+                parts = line.split()
+                if len(parts) >= 3 and parts[1] == "/" and parts[2] == "overlay":
+                    return True
+    except OSError:
+        pass
+
+    try:
+        with open("/proc/1/cgroup", encoding="utf-8") as f:
+            blob = f.read()
+        if any(marker in blob for marker in ("/docker", "/lxc", "kubepods", "/containerd")):
+            return True
+    except OSError:
+        pass
+
+    return False
+
+
 class DsclBatchData(NamedTuple):
     uid_map: Dict[str, int]
     shell_map: Dict[str, str]
diff --git a/tests/test_container_attribution.py b/tests/test_container_attribution.py
new file mode 100644
index 00000000..b2906049
--- /dev/null
+++ b/tests/test_container_attribution.py
@@ -0,0 +1,240 @@
+"""Tests for Approach B coding-discovery-tool changes.
+
+Covers:
+  - ``is_container`` top-level field in the report payload built by
+    ``generate_single_tool_report``.
+  - The persisted-UUID device_id fallback in ``LinuxDeviceIdExtractor`` when
+    /etc/machine-id is empty/absent (read existing, else generate+write).
+  - The ``in_container()`` helper salvaged into utils.py.
+
+Per CLAUDE.md: tests mock filesystem reads/writes; ``in_container`` is
+lru_cached so we clear its cache around container-state toggles.
+"""
+
+import unittest
+import uuid
+from pathlib import Path
+from unittest.mock import patch
+
+from scripts.coding_discovery_tools.ai_tools_discovery import AIToolsDetector
+import scripts.coding_discovery_tools.ai_tools_discovery as ai_tools_discovery
+import scripts.coding_discovery_tools.utils as utils_mod
+from scripts.coding_discovery_tools.utils import in_container
+
+
+class TestIsContainerInReport(unittest.TestCase):
+    """``is_container`` is always present as a top-level report field."""
+
+    def _make_report(self):
+        detector = AIToolsDetector()
+        tool = {"name": "dummy-tool", "version": "1.0", "_internal": "hidden"}
+        return detector.generate_single_tool_report(
+            tool, device_id="DEV-123", home_user="testuser"
+        )
+
+    def test_is_container_true(self):
+        with patch.object(ai_tools_discovery, "in_container", return_value=True):
+            report = self._make_report()
+        self.assertIn("is_container", report)
+        self.assertIs(report["is_container"], True)
+
+    def test_is_container_false(self):
+        with patch.object(ai_tools_discovery, "in_container", return_value=False):
+            report = self._make_report()
+        self.assertIn("is_container", report)
+        self.assertIs(report["is_container"], False)
+
+    def test_is_container_is_top_level_not_in_tool(self):
+        """The flag lives in the report dict, not inside the tool payload."""
+        with patch.object(ai_tools_discovery, "in_container", return_value=True):
+            report = self._make_report()
+        self.assertIn("is_container", report)
+        self.assertEqual(len(report["tools"]), 1)
+        self.assertNotIn("is_container", report["tools"][0])
+        # Internal keys (leading underscore) are still filtered out.
+        self.assertNotIn("_internal", report["tools"][0])
+
+    def test_home_user_not_namespaced(self):
+        """Approach B: plain home_user, no PR #165 container namespacing."""
+        with patch.object(ai_tools_discovery, "in_container", return_value=True):
+            report = self._make_report()
+        self.assertEqual(report["home_user"], "testuser")
+
+
+class TestInContainerHelper(unittest.TestCase):
+    """The salvaged in_container() helper. lru_cached -> clear around toggles."""
+
+    def setUp(self):
+        in_container.cache_clear()
+        self.addCleanup(in_container.cache_clear)
+
+    def test_detects_dockerenv(self):
+        def fake_exists(p):
+            return p == "/.dockerenv"
+
+        with patch.object(utils_mod.os.path, "exists", side_effect=fake_exists):
+            self.assertTrue(in_container())
+
+    def test_no_markers_returns_false(self):
+        with patch.object(utils_mod.os.path, "exists", return_value=False), \
+             patch("builtins.open", side_effect=OSError("no proc")):
+            self.assertFalse(in_container())
+
+    def test_lru_cache_clear_resets_state(self):
+        # Without cache_clear() the first result would stick and lie.
+        with patch.object(utils_mod.os.path, "exists", return_value=False), \
+             patch("builtins.open", side_effect=OSError):
+            self.assertFalse(in_container())
+        in_container.cache_clear()
+        with patch.object(utils_mod.os.path, "exists",
+                          side_effect=lambda p: p == "/run/.containerenv"):
+            self.assertTrue(in_container())
+
+
+class TestLinuxDeviceIdFallback(unittest.TestCase):
+    """Persisted-UUID device_id fallback when machine-id is empty/absent."""
+
+    def setUp(self):
+        from scripts.coding_discovery_tools.linux.device_id import LinuxDeviceIdExtractor
+        self.extractor = LinuxDeviceIdExtractor()
+
+    def test_machine_id_takes_precedence(self):
+        """Existing behavior preserved: machine-id wins, no UUID fallback."""
+        from scripts.coding_discovery_tools.linux import device_id as did_mod
+
+        class FakeMachineIdPath:
+            def exists(self):
+                return True
+
+            def is_file(self):
+                return True
+
+            def read_text(self, encoding="utf-8"):
+                return "  abc123machineid  \n"
+
+        with patch.object(did_mod, "_MACHINE_ID_PATHS", [FakeMachineIdPath()]):
+            self.assertEqual(self.extractor.extract_device_id(), "abc123machineid")
+
+    def test_fallback_generates_and_persists_uuid(self):
+        """First call with empty machine-id -> generate uuid4 and write it."""
+        from scripts.coding_discovery_tools.linux import device_id as did_mod
+
+        fixed = "11111111-2222-3333-4444-555555555555"
+        written = {}
+        state_dir = Path("/fake/home/.unbound")
+
+        device_id_path = MockPath(state_dir / "device-id", exists=False,
+                                   write_sink=written)
+
+        with patch.object(did_mod, "_MACHINE_ID_PATHS", []), \
+             patch.object(did_mod.cache, "_ensure_state_dir", return_value=True), \
+             patch.object(did_mod.cache, "UNBOUND_DIR",
+                          MockDir(state_dir, device_id_path)), \
+             patch.object(did_mod.uuid, "uuid4", return_value=uuid.UUID(fixed)):
+            result = self.extractor.extract_device_id()
+
+        self.assertEqual(result, fixed)
+        self.assertEqual(written.get("content"), fixed)
+
+    def test_fallback_reads_existing_uuid(self):
+        """Second/restarted run -> read the persisted uuid, do NOT regenerate."""
+        from scripts.coding_discovery_tools.linux import device_id as did_mod
+
+        persisted = "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
+        state_dir = Path("/fake/home/.unbound")
+        written = {}
+
+        device_id_path = MockPath(state_dir / "device-id", exists=True,
+                                  read_value=f"  {persisted}\n", write_sink=written)
+
+        # uuid4 would raise if called — proving we read instead of generate.
+        with patch.object(did_mod, "_MACHINE_ID_PATHS", []), \
+             patch.object(did_mod.cache, "_ensure_state_dir", return_value=True), \
+             patch.object(did_mod.cache, "UNBOUND_DIR",
+                          MockDir(state_dir, device_id_path)), \
+             patch.object(did_mod.uuid, "uuid4",
+                          side_effect=AssertionError("uuid4 should not be called")):
+            result = self.extractor.extract_device_id()
+
+        self.assertEqual(result, persisted)
+        self.assertNotIn("content", written)  # nothing rewritten
+
+    def test_fallback_unwritable_state_dir_returns_uuid_unpersisted(self):
+        """No usable state dir -> return an (unpersisted) uuid, never raise."""
+        from scripts.coding_discovery_tools.linux import device_id as did_mod
+
+        fixed = "99999999-8888-7777-6666-555555555555"
+        with patch.object(did_mod, "_MACHINE_ID_PATHS", []), \
+             patch.object(did_mod.cache, "_ensure_state_dir", return_value=False), \
+             patch.object(did_mod.uuid, "uuid4", return_value=uuid.UUID(fixed)):
+            result = self.extractor.extract_device_id()
+        self.assertEqual(result, fixed)
+
+    def test_fallback_write_failure_still_returns_uuid(self):
+        """A write error must not crash; the freshly-minted uuid is returned."""
+        from scripts.coding_discovery_tools.linux import device_id as did_mod
+
+        fixed = "12121212-3434-5656-7878-909090909090"
+        state_dir = Path("/fake/home/.unbound")
+        device_id_path = MockPath(state_dir / "device-id", exists=False,
+                                  write_error=OSError("read-only fs"))
+
+        with patch.object(did_mod, "_MACHINE_ID_PATHS", []), \
+             patch.object(did_mod.cache, "_ensure_state_dir", return_value=True), \
+             patch.object(did_mod.cache, "UNBOUND_DIR",
+                          MockDir(state_dir, device_id_path, mkdir_error=None)), \
+             patch.object(did_mod.uuid, "uuid4", return_value=uuid.UUID(fixed)):
+            result = self.extractor.extract_device_id()
+        self.assertEqual(result, fixed)
+
+
+class MockPath:
+    """Stand-in for the resolved ``device-id`` file path."""
+
+    def __init__(self, path, exists=False, read_value="", write_sink=None,
+                 write_error=None):
+        self._path = path
+        self._exists = exists
+        self._read_value = read_value
+        self._write_sink = write_sink if write_sink is not None else {}
+        self._write_error = write_error
+
+    def exists(self):
+        return self._exists
+
+    def is_file(self):
+        return self._exists
+
+    def read_text(self, encoding="utf-8"):
+        return self._read_value
+
+    def write_text(self, content, encoding="utf-8"):
+        if self._write_error is not None:
+            raise self._write_error
+        self._write_sink["content"] = content
+
+    def __str__(self):
+        return str(self._path)
+
+
+class MockDir:
+    """Stand-in for ``cache.UNBOUND_DIR`` supporting ``/`` and ``mkdir``."""
+
+    def __init__(self, path, child, mkdir_error=None):
+        self._path = path
+        self._child = child
+        self._mkdir_error = mkdir_error
+
+    def __truediv__(self, name):
+        return self._child
+
+    def mkdir(self, parents=False, exist_ok=False):
+        if self._mkdir_error is not None:
+            raise self._mkdir_error
+
+    def __str__(self):
+        return str(self._path)
+
+
+if __name__ == "__main__":
+    unittest.main()

From 5ec8f9a01978e991131ad1fc965f421ed6ceec72 Mon Sep 17 00:00:00 2001
From: Sumit Badsara <sumit@unboundsecurity.ai>
Date: Fri, 5 Jun 2026 03:27:06 +0530
Subject: [PATCH 5/5] Harden persisted device-id: atomic write, UUID
 validation, warning logs

Addresses Greptile review on PR #170:

- Atomic write (P1): persist the device-id UUID via tempfile.mkstemp +
  os.replace in the same dir, mirroring cache.atomic_write_cache(). A
  kill mid-write can no longer leave a partial UUID on disk.
- UUID validation on read (P1): validate the persisted value with
  uuid.UUID() and treat a corrupt/truncated/non-UUID value as absent,
  regenerating a fresh UUID. Pairs with the atomic write.
- Log levels (P2): failures that cost the container its stable identity
  (no state dir, unreadable/corrupt file, failed write, outer catch) now
  log at warning; the benign absent-machine-id read stays at debug.

Extends tests/test_container_attribution.py to use a real temp dir so the
atomic write path is exercised for real, and adds corrupt-partial and
garbage-non-UUID regeneration cases plus a write-failure case.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .../coding_discovery_tools/linux/device_id.py |  43 ++++-
 tests/test_container_attribution.py           | 149 +++++++++---------
 2 files changed, 114 insertions(+), 78 deletions(-)

diff --git a/scripts/coding_discovery_tools/linux/device_id.py b/scripts/coding_discovery_tools/linux/device_id.py
index 2ea392c7..7dd2642f 100644
--- a/scripts/coding_discovery_tools/linux/device_id.py
+++ b/scripts/coding_discovery_tools/linux/device_id.py
@@ -1,6 +1,8 @@
 """Device ID extraction for Linux."""
 
 import logging
+import os
+import tempfile
 import uuid
 from pathlib import Path
 
@@ -54,7 +56,7 @@ def _persisted_device_id() -> str:
         """
         try:
             if not cache._ensure_state_dir():
-                logger.debug("No usable state dir for device-id; using ephemeral uuid")
+                logger.warning("No usable state dir for device-id; using ephemeral uuid")
                 return str(uuid.uuid4())
 
             device_id_path = cache.UNBOUND_DIR / _DEVICE_ID_FILENAME
@@ -63,17 +65,46 @@ def _persisted_device_id() -> str:
                 if device_id_path.exists() and device_id_path.is_file():
                     existing = device_id_path.read_text(encoding="utf-8").strip()
                     if existing:
-                        return existing
+                        # Validate the persisted value is a well-formed UUID. A
+                        # truncated/partial write (pre-atomic-write), a manual edit,
+                        # or another tool clobbering the file can leave a non-UUID
+                        # string here; returning it would create a backend device
+                        # row that no valid UUID can ever match. Treat corrupt as
+                        # absent and regenerate.
+                        try:
+                            uuid.UUID(existing)
+                            return existing
+                        except ValueError:
+                            logger.warning(
+                                f"Corrupt (non-UUID) persisted device-id at "
+                                f"{device_id_path!s}; regenerating"
+                            )
             except Exception as e:
-                logger.debug(f"Could not read persisted device-id at {device_id_path}: {e}")
+                logger.warning(f"Could not read persisted device-id at {device_id_path!s}: {e}")
 
             new_id = str(uuid.uuid4())
             try:
+                # Atomic write: a SIGKILL/OOM/power-loss mid-write must never leave
+                # a partial UUID on disk. Write to a temp file in the same dir then
+                # os.replace() (atomic rename on the same filesystem), mirroring
+                # cache.atomic_write_cache().
                 cache.UNBOUND_DIR.mkdir(parents=True, exist_ok=True)
-                device_id_path.write_text(new_id, encoding="utf-8")
+                fd, tmp = tempfile.mkstemp(
+                    prefix=".device-id.", suffix=".tmp", dir=str(cache.UNBOUND_DIR)
+                )
+                try:
+                    with os.fdopen(fd, "w", encoding="utf-8") as f:
+                        f.write(new_id)
+                    os.replace(tmp, str(device_id_path))
+                finally:
+                    if os.path.exists(tmp):
+                        try:
+                            os.unlink(tmp)
+                        except OSError:
+                            pass
             except Exception as e:
-                logger.debug(f"Could not persist device-id at {device_id_path}: {e}")
+                logger.warning(f"Could not persist device-id at {device_id_path!s}: {e}")
             return new_id
         except Exception as e:
-            logger.debug(f"device-id fallback failed: {e}")
+            logger.warning(f"device-id fallback failed: {e}")
             return str(uuid.uuid4())
diff --git a/tests/test_container_attribution.py b/tests/test_container_attribution.py
index b2906049..97050c57 100644
--- a/tests/test_container_attribution.py
+++ b/tests/test_container_attribution.py
@@ -11,6 +11,7 @@
 lru_cached so we clear its cache around container-state toggles.
 """
 
+import tempfile
 import unittest
 import uuid
 from pathlib import Path
@@ -92,11 +93,21 @@ def test_lru_cache_clear_resets_state(self):
 
 
 class TestLinuxDeviceIdFallback(unittest.TestCase):
-    """Persisted-UUID device_id fallback when machine-id is empty/absent."""
+    """Persisted-UUID device_id fallback when machine-id is empty/absent.
+
+    These tests use a REAL temp dir (patched onto ``cache.UNBOUND_DIR``) so that
+    the atomic ``tempfile.mkstemp`` + ``os.replace`` write path is exercised for
+    real rather than mocked away.
+    """
 
     def setUp(self):
         from scripts.coding_discovery_tools.linux.device_id import LinuxDeviceIdExtractor
         self.extractor = LinuxDeviceIdExtractor()
+        self._tmp = tempfile.TemporaryDirectory()
+        self.addCleanup(self._tmp.cleanup)
+        self.state_dir = Path(self._tmp.name) / ".unbound"
+        self.state_dir.mkdir(parents=True, exist_ok=True)
+        self.device_id_path = self.state_dir / "device-id"
 
     def test_machine_id_takes_precedence(self):
         """Existing behavior preserved: machine-id wins, no UUID fallback."""
@@ -116,124 +127,118 @@ def read_text(self, encoding="utf-8"):
             self.assertEqual(self.extractor.extract_device_id(), "abc123machineid")
 
     def test_fallback_generates_and_persists_uuid(self):
-        """First call with empty machine-id -> generate uuid4 and write it."""
+        """First call with empty machine-id -> generate uuid4 and write it.
+
+        Reads the file back off the real fs to prove the atomic write landed.
+        """
         from scripts.coding_discovery_tools.linux import device_id as did_mod
 
         fixed = "11111111-2222-3333-4444-555555555555"
-        written = {}
-        state_dir = Path("/fake/home/.unbound")
-
-        device_id_path = MockPath(state_dir / "device-id", exists=False,
-                                   write_sink=written)
 
         with patch.object(did_mod, "_MACHINE_ID_PATHS", []), \
              patch.object(did_mod.cache, "_ensure_state_dir", return_value=True), \
-             patch.object(did_mod.cache, "UNBOUND_DIR",
-                          MockDir(state_dir, device_id_path)), \
+             patch.object(did_mod.cache, "UNBOUND_DIR", self.state_dir), \
              patch.object(did_mod.uuid, "uuid4", return_value=uuid.UUID(fixed)):
             result = self.extractor.extract_device_id()
 
         self.assertEqual(result, fixed)
-        self.assertEqual(written.get("content"), fixed)
+        self.assertTrue(self.device_id_path.exists())
+        self.assertEqual(self.device_id_path.read_text(encoding="utf-8"), fixed)
+        # No leftover temp files from the atomic write.
+        leftovers = [p for p in self.state_dir.iterdir() if p.name != "device-id"]
+        self.assertEqual(leftovers, [])
 
     def test_fallback_reads_existing_uuid(self):
         """Second/restarted run -> read the persisted uuid, do NOT regenerate."""
         from scripts.coding_discovery_tools.linux import device_id as did_mod
 
         persisted = "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
-        state_dir = Path("/fake/home/.unbound")
-        written = {}
-
-        device_id_path = MockPath(state_dir / "device-id", exists=True,
-                                  read_value=f"  {persisted}\n", write_sink=written)
+        self.device_id_path.write_text(f"  {persisted}\n", encoding="utf-8")
 
         # uuid4 would raise if called — proving we read instead of generate.
         with patch.object(did_mod, "_MACHINE_ID_PATHS", []), \
              patch.object(did_mod.cache, "_ensure_state_dir", return_value=True), \
-             patch.object(did_mod.cache, "UNBOUND_DIR",
-                          MockDir(state_dir, device_id_path)), \
+             patch.object(did_mod.cache, "UNBOUND_DIR", self.state_dir), \
              patch.object(did_mod.uuid, "uuid4",
                           side_effect=AssertionError("uuid4 should not be called")):
             result = self.extractor.extract_device_id()
 
         self.assertEqual(result, persisted)
-        self.assertNotIn("content", written)  # nothing rewritten
+        # File untouched (still has the original whitespace-padded content).
+        self.assertEqual(self.device_id_path.read_text(encoding="utf-8"),
+                         f"  {persisted}\n")
 
-    def test_fallback_unwritable_state_dir_returns_uuid_unpersisted(self):
-        """No usable state dir -> return an (unpersisted) uuid, never raise."""
+    def test_fallback_corrupt_partial_uuid_regenerates(self):
+        """A truncated/partial (non-UUID) persisted value is treated as absent.
+
+        Simulates a pre-atomic-write partial write or a manual edit. The extractor
+        must reject it, mint a fresh valid UUID, and atomically overwrite the file.
+        """
         from scripts.coding_discovery_tools.linux import device_id as did_mod
 
-        fixed = "99999999-8888-7777-6666-555555555555"
+        # Partial UUID left by an interrupted write.
+        self.device_id_path.write_text("11111111-222", encoding="utf-8")
+        fixed = "33333333-4444-5555-6666-777777777777"
+
         with patch.object(did_mod, "_MACHINE_ID_PATHS", []), \
-             patch.object(did_mod.cache, "_ensure_state_dir", return_value=False), \
+             patch.object(did_mod.cache, "_ensure_state_dir", return_value=True), \
+             patch.object(did_mod.cache, "UNBOUND_DIR", self.state_dir), \
              patch.object(did_mod.uuid, "uuid4", return_value=uuid.UUID(fixed)):
             result = self.extractor.extract_device_id()
+
+        # Regenerated, and the corrupt value is gone — replaced by a valid UUID.
         self.assertEqual(result, fixed)
+        uuid.UUID(result)  # parses -> well-formed
+        on_disk = self.device_id_path.read_text(encoding="utf-8")
+        self.assertEqual(on_disk, fixed)
+        uuid.UUID(on_disk)  # the persisted value is now valid
 
-    def test_fallback_write_failure_still_returns_uuid(self):
-        """A write error must not crash; the freshly-minted uuid is returned."""
+    def test_fallback_garbage_non_uuid_regenerates(self):
+        """An arbitrary non-UUID string (manual edit / other tool) is rejected."""
         from scripts.coding_discovery_tools.linux import device_id as did_mod
 
-        fixed = "12121212-3434-5656-7878-909090909090"
-        state_dir = Path("/fake/home/.unbound")
-        device_id_path = MockPath(state_dir / "device-id", exists=False,
-                                  write_error=OSError("read-only fs"))
+        self.device_id_path.write_text("not-a-uuid-at-all", encoding="utf-8")
+        fixed = "44444444-5555-6666-7777-888888888888"
 
         with patch.object(did_mod, "_MACHINE_ID_PATHS", []), \
              patch.object(did_mod.cache, "_ensure_state_dir", return_value=True), \
-             patch.object(did_mod.cache, "UNBOUND_DIR",
-                          MockDir(state_dir, device_id_path, mkdir_error=None)), \
+             patch.object(did_mod.cache, "UNBOUND_DIR", self.state_dir), \
              patch.object(did_mod.uuid, "uuid4", return_value=uuid.UUID(fixed)):
             result = self.extractor.extract_device_id()
-        self.assertEqual(result, fixed)
-
-
-class MockPath:
-    """Stand-in for the resolved ``device-id`` file path."""
-
-    def __init__(self, path, exists=False, read_value="", write_sink=None,
-                 write_error=None):
-        self._path = path
-        self._exists = exists
-        self._read_value = read_value
-        self._write_sink = write_sink if write_sink is not None else {}
-        self._write_error = write_error
-
-    def exists(self):
-        return self._exists
-
-    def is_file(self):
-        return self._exists
 
-    def read_text(self, encoding="utf-8"):
-        return self._read_value
-
-    def write_text(self, content, encoding="utf-8"):
-        if self._write_error is not None:
-            raise self._write_error
-        self._write_sink["content"] = content
-
-    def __str__(self):
-        return str(self._path)
+        self.assertEqual(result, fixed)
+        self.assertEqual(self.device_id_path.read_text(encoding="utf-8"), fixed)
 
+    def test_fallback_unwritable_state_dir_returns_uuid_unpersisted(self):
+        """No usable state dir -> return an (unpersisted) uuid, never raise."""
+        from scripts.coding_discovery_tools.linux import device_id as did_mod
 
-class MockDir:
-    """Stand-in for ``cache.UNBOUND_DIR`` supporting ``/`` and ``mkdir``."""
+        fixed = "99999999-8888-7777-6666-555555555555"
+        with patch.object(did_mod, "_MACHINE_ID_PATHS", []), \
+             patch.object(did_mod.cache, "_ensure_state_dir", return_value=False), \
+             patch.object(did_mod.uuid, "uuid4", return_value=uuid.UUID(fixed)):
+            result = self.extractor.extract_device_id()
+        self.assertEqual(result, fixed)
 
-    def __init__(self, path, child, mkdir_error=None):
-        self._path = path
-        self._child = child
-        self._mkdir_error = mkdir_error
+    def test_fallback_write_failure_still_returns_uuid(self):
+        """A write error must not crash; the freshly-minted uuid is returned.
 
-    def __truediv__(self, name):
-        return self._child
+        Patches the atomic-write primitive (mkstemp) to raise.
+        """
+        from scripts.coding_discovery_tools.linux import device_id as did_mod
 
-    def mkdir(self, parents=False, exist_ok=False):
-        if self._mkdir_error is not None:
-            raise self._mkdir_error
+        fixed = "12121212-3434-5656-7878-909090909090"
 
-    def __str__(self):
-        return str(self._path)
+        with patch.object(did_mod, "_MACHINE_ID_PATHS", []), \
+             patch.object(did_mod.cache, "_ensure_state_dir", return_value=True), \
+             patch.object(did_mod.cache, "UNBOUND_DIR", self.state_dir), \
+             patch.object(did_mod.tempfile, "mkstemp",
+                          side_effect=OSError("read-only fs")), \
+             patch.object(did_mod.uuid, "uuid4", return_value=uuid.UUID(fixed)):
+            result = self.extractor.extract_device_id()
+        self.assertEqual(result, fixed)
+        # Write failed -> nothing persisted.
+        self.assertFalse(self.device_id_path.exists())
 
 
 if __name__ == "__main__":