Tracks implementation work against DESIGN.md. Mark items [x] as they land. Phases mirror DESIGN.md §10.
- ssh_server_info dual-surface (v1.5.0) -- New read-tier tool plus an mcp://ssh-mcp/server-info MCP resource that share a single _collect_server_info helper. Returns {name, version, total_tools, enabled_tiers, enabled_groups} -- the LLM (and operators) can self-introspect "what version am I talking to / which tiers + groups does the operator have unlocked / how many tools are visible". Resource is the primary discovery path (free per turn, doesn't cost catalog tokens); tool is the fallback for clients that don't surface resources to the LLM (Claude Code, Claude Desktop). Same JSON payload shape on both surfaces. Lives in group:host. No new env vars. Tests: 1649 passing / 3 skipped (+6: 5 server-info + 1 SKILL ascii). ADR-0029. (id: server-info-dual-surface-v1.5.0)
  - Open subdocs: none -- AGENTS/SECURITY/CONFIGURATION untouched (no security boundary shifted, no operator-facing knob added).
- CAS concurrent-writer safety for ssh_host_notes_append (v1.4.0, hot-patch) -- INC-065 fix. Operator hit a real lost-update where two MCP server processes both appended to the same notes/<host>.md and one entry vanished. The read-modify-write was atomic at the FS level (tmp+os.replace) but had no logical CAS. Fix: optimistic CAS via (mtime_ns, size) snapshot captured at read, re-checked right before os.replace; 5-iteration retry loop in ssh_host_notes_append that rebuilds against fresh snapshots when a concurrent writer beats us. Pathological contention raises RuntimeError instead of unbounded spin. ssh_host_notes_set deliberately stays last-writer-wins (whole-file replace; CAS variant with explicit expected_etag deferred). Microseconds-wide TOCTOU window between final stat() and os.replace is documented and accepted for our contention level; fcntl.flock rejected by operator preference. Tests: 1643 passing / 3 skipped (+3 CAS-specific). INC-065 recorded. (id: notes-append-cas-v1.4.0)
  - BACKLOG candidate for v1.14+: ssh_host_notes_set(expected_etag=...) opt-in CAS -- caller passes the snapshot etag they read from ssh_host_notes, write rejects on mismatch with a clear "file changed since you read it, re-read first" error.
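The optimistic-CAS append described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `cas_append`, `_snapshot`, and `MAX_ATTEMPTS` are hypothetical names, and the real tool also handles headers, size caps, and timestamps that are omitted here.

```python
from __future__ import annotations

import os
import tempfile
from pathlib import Path

MAX_ATTEMPTS = 5  # mirrors the 5-iteration retry loop described above


def _snapshot(path: Path) -> tuple[int, int] | None:
    """Return (mtime_ns, size), or None when the file does not exist yet."""
    try:
        st = path.stat()
    except FileNotFoundError:
        return None
    return (st.st_mtime_ns, st.st_size)


def cas_append(path: Path, entry: str) -> None:
    """Append `entry` via tmp + os.replace, retrying if the file changed."""
    for _ in range(MAX_ATTEMPTS):
        before = _snapshot(path)
        current = path.read_bytes() if before is not None else b""
        fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
        try:
            with os.fdopen(fd, "wb") as tmp:
                tmp.write(current + entry.encode("utf-8"))
            # Re-check right before the swap; a mismatch means another
            # writer got in first, so rebuild against a fresh snapshot.
            if _snapshot(path) != before:
                continue
            os.replace(tmp_name, path)
            return
        finally:
            # After a successful os.replace the temp name is gone already.
            if os.path.exists(tmp_name):
                os.unlink(tmp_name)
    raise RuntimeError(f"gave up after {MAX_ATTEMPTS} contended attempts")
```

The small window between the final `stat()` and `os.replace` remains in this sketch too, matching the accepted TOCTOU trade-off noted in the entry.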
- Sudo-tier path-bearing tools + path-aware cheatsheet (v1.4.0) -- Five new sudo-tier tools: ssh_sudo_read, ssh_sudo_read_redacted, ssh_sudo_write, ssh_sudo_edit, ssh_sudo_sftp_list. All tagged {dangerous, sudo, group:sudo}, routed through full resolve_path policy chain. New service services/sudo_file_ops.py with helpers sudo_read_bytes, sudo_stat_owner, sudo_stat_mode, sudo_atomic_write, sudo_ls_parsed. ssh_sudo_write supports three-way payload mutex (content_text / content_base64 / local_path); local_path reads from MCP host into memory then pipes via stdin, capped at SSH_LOCAL_TRANSFER_MAX_BYTES (2 GiB) vs 256 MiB for inline. ssh_sudo_edit preserves both ownership AND mode (sudo stat -c '%a' + sudo stat -c '%U:%G' pre-step; 0o600 secrets files do not get widened to 0o644). Path-aware cheatsheet expansion: 7 new patterns (read-single, read-ambiguous, list-single, sudo-read-single, sudo-write-single, sudo-edit-single, sudo-list-single); read-single / sudo-read-single route to the _redacted variant when path matches redact_paths_globs. Live-verified on iruelg4 (sudo cat .env -> RedactBypassBlocked, ssh_sudo_write mode-0o600 round-trip, ssh_sudo_edit mode-preservation). Phase-3 bug fixed: sudo_atomic_write positional-arg parse failure (sh: Syntax error: word unexpected) -- shell vars inlined at top of script body instead. Tests: 1640 passing / 3 skipped. ADR-0028 recorded. INC-064 updated with partial mitigation. (id: sudo-path-tools-v1.4.0)
  - BACKLOG candidates for v1.14+:
    - ssh_sudo_write(local_path=) true streaming (currently in-memory buffer; no chunked pipe to sudo stdin yet)
    - fetch_sudo_password per-call caching: 4 keyring/subprocess lookups per ssh_sudo_edit is operator-visible latency when SSH_SUDO_PASSWORD_CMD is slow
    - Project convention: shell-script-body construction tests should dry-run via subprocess.run(["sh", "-n", "-c", body]) to catch parse errors before live-verify (the Phase-3 positional-arg bug would have been caught in milliseconds)
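The dry-run convention in the last backlog item could look something like the helper below. `shell_syntax_error` is a hypothetical name, and the sketch assumes a POSIX `sh` is on the PATH of the machine running the tests.

```python
from __future__ import annotations

import subprocess


def shell_syntax_error(body: str) -> str | None:
    """Parse `body` with `sh -n` (no execution); return stderr on failure."""
    proc = subprocess.run(
        ["sh", "-n", "-c", body],
        capture_output=True,
        text=True,
        check=False,
    )
    return None if proc.returncode == 0 else proc.stderr.strip()
```

A test would then assert `shell_syntax_error(body) is None` for every generated script body. Because `-n` only parses, nothing in the body runs on the test machine.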
- Secret-redaction policy (v1.4.0) -- Full redact layer shipped and live-verified on a real host. New ssh_read_redacted tool (read-tier, group:sftp-read): auto-detects format from extension (env/yaml/json/ini/generic), redacts via 3-layer detection (key-match / PEM-always / entropy), HMAC-SHA256 12-char prefix markers, deterministic across calls. redact_bypass_policy gates raw-content SFTP tools on redact_paths_globs-listed paths (block / warn / audit_only). restricted_globs adds glob-pattern hard-deny alongside the existing prefix restricted_paths. Per-host overrides for all 7 knobs in hosts.toml [defaults] block. hosts.toml.example + .env.example updated with full reference. Audit-line redact_bypass=true field lands via ContextVar side-effect in services/audit.py for warn/audit_only modes. Raw-exec bypass gap documented as INC-064 (by-design; mitigated by not allowlisting cat/less in command_allowlist). Anchor syntax (^PREFIX_, _SUFFIX$, ^EXACT$) in _key_matches prevents over-matching (e.g. ^PASS_, _PASS$ instead of bare PASS). 16 pre-existing mypy errors across ssh/, services/edit_service.py, telemetry.py etc. fixed incidentally in this sprint. Tests: 5 new test modules, 1577 passing / 3 skipped. ADR-0027 recorded. (id: redact-policy-v1.4.0)
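As a rough illustration of two ideas in the entry above (deterministic HMAC-SHA256 markers and anchored key patterns), consider the sketch below. The marker text `[REDACTED:...]`, the function names, and treating the anchor patterns as plain regular expressions are all assumptions made for the example, not the shipped behavior.

```python
import hashlib
import hmac
import re


def redaction_marker(secret_value: str, key: bytes) -> str:
    """Same value + same key -> same 12-char prefix, so markers correlate."""
    digest = hmac.new(key, secret_value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"[REDACTED:{digest[:12]}]"  # marker format is hypothetical


def key_matches(name: str, patterns: list[str]) -> bool:
    """Anchored patterns such as ^PASS_ or _PASS$ avoid substring over-matching."""
    upper = name.upper()
    return any(re.search(pattern, upper) for pattern in patterns)
```

With patterns `["^PASS_", "_PASS$"]`, a key named `DB_PASS` matches while `COMPASS_HEADING` does not, which is the over-matching case a bare `PASS` pattern would hit.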
- local_path mode for ssh_upload, ssh_deploy, ssh_sftp_download (v1.3.0) -- Adds a local_path keyword-only param to all three tools, streaming bytes between the MCP host's own filesystem and the remote target without encoding them into tool-call arguments. New SSH_LOCAL_TRANSFER_ROOTS env var (CSV/JSON allowlist of MCP-host directories; empty = disabled, default) and SSH_LOCAL_TRANSFER_MAX_BYTES (default 2 GiB, separate from the existing 256 MiB base64 cap). New services/local_path_policy.py enforces the MCP-host-side boundary; LocalPathPolicyError added to ssh/errors.py. WriteResult and DownloadResult gain local_path_written: str | None. ssh_sftp_download in local_path mode streams remote to local via atomic tmp+os.replace. Fixes the base64-channel bottleneck where the LLM defensively chunked large uploads into many small tool calls. Documented in ADR-0026; skills updated for ssh_upload, ssh_deploy, ssh_sftp_download; runbooks ssh-deploy-verify and ssh-host-snapshot updated with local_path guidance. (id: local-path-transfer)
- Exec-discipline sprint kickoff. Eval of a real OS-upgrade session found ~62% of 127 ssh_exec_run calls were avoidable (matched an existing native tool's cheatsheet entry), with 3 anti-patterns dominating: heredoc file-writes that should go through ssh_upload, lookups that should use native sftp/systemctl/docker tools, and ad-hoc shell composites where the script itself wasn't the artefact. Full breakdown in docs/evals/2026-05-22-exec-run-discipline.md. Kicks off the discipline sprint: A1 (this row + correction #8 in .claude/team/corrections.md + eval relocation), C1/C2 (first-class systemctl/apt mutation tools so ssh_sudo_exec systemctl ... and ssh_sudo_exec apt-get ... stop being the default path), B1 (default-on SSH_EXEC_ALLOW_CHEATSHEET_PATTERNS=false reject patterns so heredoc + native-tool-matching commands fail closed at the tool surface), B2 (hint footer pointing the LLM at the matching native tool when a reject fires), D1/D2 (runbook addition + AGENTS.md sweep so the discipline is documented and audit-checkable). (id: exec-discipline-sprint)
- Native --filter kwargs on read-tier docker list tools (v1.8.0) -- ssh_docker_ps gains name, status, label, ancestor; ssh_docker_images gains reference (glob-style, supports */?/digests), dangling, label; ssh_docker_compose_ps gains service (trailing positional) and status (7-value compose set including removing). Label key regex widened to [A-Za-z0-9._/-]{1,128} so k8s-style keys (app.kubernetes.io/name) are accepted. All filters validated before I/O via _validate_name, _validate_label_filter, _validate_reference_filter. Argv ordering deterministic and locked by tests in tests/test_docker_read_filters.py. (id: docker-filter-kwargs-sprint)
- Follow-up: defense-in-depth regex tightening -- _DOCKER_NAME_RE, _DOCKER_FILTER_RE, _DOCKER_TIME_RE in _helpers.py should be tightened to re.fullmatch instead of re.match to prevent prefix-match bypass. Deferred from the filter sprint; flagged by senior-reviewer.
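To make the prefix-match concern concrete, here is a small sketch. `NAME_RE` is a hypothetical stand-in; the real `_DOCKER_NAME_RE` pattern is not shown in this file and may already carry anchors.

```python
import re

# Hypothetical name pattern with no explicit end anchor.
NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_.-]*")


def is_valid_loose(value: str) -> bool:
    """re.match only anchors at the start, so a valid prefix is enough."""
    return NAME_RE.match(value) is not None


def is_valid_strict(value: str) -> bool:
    """re.fullmatch requires the whole string to fit the pattern."""
    return NAME_RE.fullmatch(value) is not None
```

With the loose check, `"web; rm -rf /"` passes because `web` is a valid prefix; the strict check rejects it. Even a `$`-anchored pattern used with `re.match` accepts a trailing newline (for example `re.match(r"[a-z]+$", "web\n")` matches), which `re.fullmatch` also closes.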
- Follow-up: pre-existing format baseline cleanup -- one-shot uv run ruff format sprint on __init__.py, lifecycle_tools.py, dangerous_tools.py, test_docker_top_cp.py. Deferred; formatting noise in those files was pre-existing and out of scope for the filter sprint.
- Follow-up: mypy baseline cleanup -- 9 pre-existing mypy errors across 8 modules outside tools/docker/ (not introduced by the filter sprint). Deferred; needs a dedicated pass to triage and fix without touching unrelated code.
- 2026-04-30 (latest) Sprint 3 -- Consistency cleanup (v1.4.0). Three independent hardening tasks shipped together. (3a) as_str helper consolidation: new services/text.py exposes as_str(value: bytes | bytearray | str | None) -> str (errors-replace UTF-8 decode; None → ""). Replaces two private _as_str helpers (in ssh/exec.py + tools/sftp_read_tools.py) and 12+ inline coercion sites across the codebase. Tight signature -- no object accepted. (3b) extra="forbid" on systemctl models: models/systemctl.py was the only model file missing INC-046's strictness. Added _RESULT_MODEL_CONFIG = ConfigDict(extra="forbid") constant + applied to all 9 models (mirrors models/results.py pattern; closes ADR-0025 in DECISIONS.md). (3c) @audited(tier="read") on every read tool (Option B): 16 read tools decorated -- SFTP (ssh_sftp_list, ssh_sftp_stat, ssh_sftp_download, ssh_find), sessions (ssh_session_list, ssh_shell_list), host (ssh_host_ping, ssh_host_info, ssh_host_network, ssh_user_info, ssh_host_disk_usage, ssh_host_processes, ssh_host_alerts, ssh_known_hosts_verify, ssh_host_list, ssh_host_notes). New test tests/test_audited_coverage.py -- 24 parametrized tests locking the policy in (every read tool MUST carry @audited(tier="read") at module load). (id: consistency-cleanup-sprint3)
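The (3a) helper is small enough to sketch from its stated signature and semantics alone. The body below is an assumption about how such a helper might be written, not a copy of services/text.py.

```python
from __future__ import annotations


def as_str(value: bytes | bytearray | str | None) -> str:
    """Coerce to str: None -> "", bytes-like -> UTF-8 with errors replaced."""
    if value is None:
        return ""
    if isinstance(value, str):
        return value
    # bytes(...) normalizes bytearray; invalid sequences become U+FFFD.
    return bytes(value).decode("utf-8", errors="replace")
```

Keeping the signature tight (no `object`) means a type checker flags call sites that pass something unexpected, instead of the helper silently stringifying it.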
- 2026-04-27 (latest) resolve_path helper added to path-policy chain -- closes the "forgot check_not_restricted" footgun class. New async def resolve_path(conn, path, policy, settings, *, must_exist=True) -> str in services/path_policy.py bundles canonicalize_and_check + check_not_restricted into one call so tool authors can't accidentally skip the restricted-zones check. ssh_transfer and ssh_upload migrated to the new helper; ssh_link's two-sided validation and compose-file call sites deliberately kept on the underlying primitives. Docs updated: DESIGN.md §5.6, TOOLS.md low-access intro + ssh_transfer row, BOOTSTRAP.md §5 Path safety + footgun worked example + security checklist, skills/ssh-transfer/SKILL.md, skills/ssh-upload/SKILL.md. Pure service-layer refactor; no tool behavior changes; no version bump (covered by 1.1.0). Suite: 826 unit pass (unchanged), 1 skipped. (id: resolve-path-refactor)
- 2026-04-27 ssh_host_ping also auto-injects agent notes -- both layers ride on ping (INC-060). Operator: "ssh ping should have option to show all notes too (on by default)." INC-059 had explicitly held the agent layer back from auto-injection on context-budget grounds; operator overruled -- visibility into past-self memory beats the budget concern, especially since most agent sidecars are far smaller than the 256 KiB cap. Added SSH_PING_INCLUDES_AGENT_NOTES: bool = True setting (parallel structure to SSH_PING_INCLUDES_NOTES, independent toggle so operators can mix-and-match) and PingResult.agent_notes: str | None = None field. The injection logic in ssh_host_ping reuses the existing _HOST_NOTES_ALIAS_RE + _read_sidecar helpers from INC-055 (defense-in-depth: even though resolve_host already filters aliases, the regex re-validates before path concatenation). 0-byte sidecars return None (matches ssh_host_notes semantics for cleared-via-_set("") files). 6 new regression tests on top of INC-059's 7 (13 total in tests/test_ping_notes_injection.py): sidecar exists + setting on → populated; sidecar missing → None; setting off → None; SSH_HOST_NOTES_DIR=None → None; 0-byte sidecar → None; independence test exercising all four (operator x agent on/off) combinations. SKILL updates: ssh-host-ping/SKILL.md shows both fields + documents the 256 KiB context-budget caveat + opt-out; ssh-host-notes/SKILL.md "When to call it" rewritten to reframe the dedicated tool as primarily for "re-reads after writes" / "ping injection disabled" cases (the standard discovery flow now puts ping first). .env.example documents the new setting; TOOLS.md ping row updated with both layers. Catalog: still 74 tools (field addition). Suite: 826 unit pass (up from 820), 1 skipped. Ruff: one TC003 stdlib-Path import in test fixture flagged + fixed (moved to TYPE_CHECKING -- only used for annotations). Mypy strict: zero new errors.
- 2026-04-27 ssh_host_ping auto-injects operator notes -- enforcement-by-ergonomics (INC-059). Operator asked: did we make sure the LLM loads host memory first when connecting to a host? Honest audit: no -- INC-055's two-layer notes were only DOCUMENTED as "always call before doing anything substantive" via SKILL files (load on demand) and a has_notes: bool flag on ssh_host_list (LLM had to think to check). Nothing made the LLM actually CALL ssh_host_notes(host) first; if it skipped the SKILL load and reached for ssh_exec_run straight off, the operator's hard-rule constraints never entered context. Fix: auto-inject the operator-baseline notes into ssh_host_ping's result. Ping is the canonical "I'm starting work on this host" probe; LLMs reach for it naturally early in any host-targeted workflow. Riding the notes on ping means the LLM gets the operator's constraints into context for free. Agent-side memory (the LLM's own session-spanning sidecar) is NOT auto-injected -- it can grow to 256 KiB and would bloat every ping; still requires the dedicated ssh_host_notes call. Three options weighed: (1) auto-inject into ping [shipped], (2) auto-inject into EVERY host-acting tool result [unmissable but invasive across ~15 result models], (3) pre-tool-call hook that fails the first call to a host with has_notes=True until ssh_host_notes(host) was called [most authoritarian; LLM would just learn to call notes blindly]. Option 1 picked for surface-area-to-value ratio. New setting SSH_PING_INCLUDES_NOTES: bool = True (config.py) is the opt-out for tool-execution-only deployments where ping should stay minimal. New field PingResult.operator_notes: str | None = None (models/results.py) populated only when setting is true AND the host has notes set; whitespace-only notes treated as absent (matches has_notes logic). 7 regression tests in tests/test_ping_notes_injection.py covering: notes present + default on injects; setting off omits; no notes returns None; whitespace-only treated as absent; surrounding whitespace stripped; setting off with no notes still None; existing ping fields unaffected. SKILL updates: ssh-host-ping/SKILL.md Returns example shows operator_notes + "When to call it" emphasizes "read them before proposing a plan -- they may forbid the obvious approach you were about to take"; ssh-host-notes/SKILL.md rewritten to describe the operator layer as "auto-injected into ping" and reframe the dedicated tool as primarily for the AGENT layer. TOOLS.md ping row + .env.example updated. Catalog: still 74 tools (field addition, not a new tool). Suite: 820 unit pass (up from 813), 1 skipped. Ruff: one TC003 stdlib-Path import flagged + fixed (moved to TYPE_CHECKING -- safe because from __future__ import annotations makes annotations strings; helpers using Path are not tools so FastMCP's get_type_hints() doesn't see them). Mypy strict: zero new errors.
- 2026-04-25 Output sanitizer (INC-057) + Pass A extension to file-content surfaces (INC-058). Operator asked what happens when remote tools return "poisoned" output. Audit showed the encoding layer was safe (UTF-8 with errors='replace', JSON escaping, byte-cap) but the content itself was unfiltered -- prompt-injection / display-hijack surface. INC-057 (sanitizer core): new services/output_sanitizer.py with sanitize(text) -> (cleaned, warnings) -- strips ANSI escape sequences (CSI / OSC with BEL or ST terminators / single-byte private-use) and NUL bytes; flags-only on bidi overrides (U+202D/E, U+2066-U+2069 -- the trojan-source attack), zero-width chars (U+200B-U+200D, U+FEFF), C1 controls (U+0080-U+009F), LLM protocol markers (<|im_end|>, </s>, [INST], etc., 16 patterns, case-insensitive), and lines mimicking conversation turns (^User: / ^Assistant: / ^System: / ^Human: / ^AI:, line-start anchored, case-insensitive). Wired into ssh/exec.py run() + run_streaming() after truncation; new output_warnings: list[str] = [] field on ExecResult carries the result. Warnings from stdout + stderr merge into a deduplicated list. Streaming chunk_cb deliberately sees raw bytes (progress is ephemeral); the persisted ExecResult.stdout is always sanitized. Coverage extends transitively to every tool that goes through exec.run() -- ssh_exec_*, ssh_sudo_*, ssh_shell_exec, ssh_broadcast, all 22 ssh_docker_* (via_run_docker.model_dump()). 40 regression tests in tests/test_output_sanitizer.py. INC-058 Pass A (extension to non-exec file-content paths): _run_systemctl widened from 3-tuple to 4-tuple (stdout, stderr, exit_code, output_warnings) -- 8 callers updated, 3 propagate (ssh_systemctl_status / _cat / ssh_journalctl), 5 discard (is_active / is_enabled / is_failed / list_units / show). Their result models gained output_warnings: list[str] = []. ssh_sftp_download separately runs the new scan(text) flag-only helper on a UTF-8 view of the bytes -- content_base64 is NOT modified (binary safety) but warnings flag what a text decode would surface so the LLM can sanitize() after decoding if processing as text. DownloadResult gained output_warnings. _run_docker already returned result.model_dump() with INC-057's warnings included, so docker tools (ssh_docker_logs, _inspect, etc.) propagate warnings without code changes. 8 propagation tests in tests/test_output_warnings_propagation.py. The trojan-source meta-loop: every non-ASCII codepoint in the sanitizer + its tests is written as chr(0xNNNN) rather than as literal characters -- IDE bidi-aware lints correctly flagged the literal form as the exact "obfuscated source code" pattern the sanitizer exists to defend against. Module docstring + comments capture this so future readers don't "clean up" the chr() calls. Pass B (filename scanning for ssh_sftp_list / ssh_find + structured docker / host fields) deferred -- lower volume than the file/log content paths. Catalog: still 74 tools (model + middleware change, no new tools). Suite: 813 unit pass (up from 765), 1 skipped. Ruff clean on touched files; mypy strict gains zero new errors and clears 2 pre-existing ones in sftp_read_tools.py via defensive bytes coercion on the SFTP read return.
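A reduced sketch of the strip-versus-flag split described above is given here. It covers only a subset of the listed checks (CSI/OSC stripping, NUL stripping, bidi, zero-width, and turn-marker flags); the regexes and warning names are assumptions, and the real module handles more cases. Codepoints are written with `chr()` in line with the convention the entry describes.

```python
import re

_ANSI_RE = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]"               # CSI: params, intermediates, final
    r"|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"    # OSC ended by BEL or ST (ESC \)
)
_BIDI = {chr(cp) for cp in (0x202D, 0x202E, 0x2066, 0x2067, 0x2068, 0x2069)}
_ZERO_WIDTH = {chr(cp) for cp in (0x200B, 0x200C, 0x200D, 0xFEFF)}
_TURN_RE = re.compile(
    r"^(?:User|Assistant|System|Human|AI):", re.IGNORECASE | re.MULTILINE
)


def sanitize(text: str) -> tuple[str, list[str]]:
    """Strip terminal escapes and NULs; only *flag* the ambiguous classes."""
    cleaned = _ANSI_RE.sub("", text).replace("\x00", "")
    warnings: list[str] = []
    if any(ch in _BIDI for ch in cleaned):
        warnings.append("bidi_override")
    if any(ch in _ZERO_WIDTH for ch in cleaned):
        warnings.append("zero_width")
    if _TURN_RE.search(cleaned):
        warnings.append("conversation_turn_marker")
    return cleaned, warnings
```

Flag-only handling keeps legitimate content intact (a log line may really start with `System:`) while still giving the caller a signal to treat the text with suspicion.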
- 2026-04-25 ssh_link expanded -- hard + symbolic links with both-sides path validation (INC-056 cont.). Added symbolic: bool = False parameter to the existing ssh_link (low-access + group:file-ops). When symbolic=True, calls sftp.symlink(src, dst) directly -- pure SFTP, src stored verbatim (preserves relative-link semantics). Per GNU ln's "Using -s ignores -L and -P", follow_symlinks is silently ignored in symbolic mode. Both sides path-validated per operator direction: dst goes through normal canonicalize_and_check; src is treated as a TARGET STRING (not a real path -- POSIX permits dangling symlinks, target may not exist) and validated string-wise via reject_bad_characters + relative-resolve against dst's parent + posixpath.normpath + check_in_allowlist + check_not_restricted. Original src text passed verbatim to sftp.symlink(); the policy decision was made on the normalized form, but on-disk semantics keep operator intent. Defense-in-depth rationale: read-through-symlink already re-triggers path policy via canonicalize, but write-time validation also catches the prompt-injection pattern where a malicious prompt creates link -> /etc/shadow as a marker. Dangling targets remain ALLOWED (POSIX-correct). NUL / control bytes in target rejected up front. SKILL.md rewritten with three-mode coverage + the path-policy notes split per mode + new examples for the current → release-vN rolling-release pattern + dangling-target case. TOOLS.md row expanded. INC-056 detail updated. 7 new test cases (14 total in tests/test_link.py) covering: sftp.symlink called with verbatim target, dangling targets succeed (no lstat call), target outside allowlist raises PathNotAllowed, relative target resolved against dst's parent for the policy check, NUL bytes rejected, follow_symlinks silently ignored when symbolic=True. Catalog still 74 tools (parameter expansion of an existing tool, not a new one). Suite: 765 unit pass (up from 758), 1 skipped. Ruff clean; mypy strict adds zero new errors. The "flip default to match GNU ln's -P default" question (raised after the operator quoted the GNU man page) was deferred -- current default remains follow_symlinks=True matching OpenSSH SFTP's natural behavior; easy to flip later if real-world surprise materializes.
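The string-wise treatment of a symlink target can be illustrated with the sketch below. `normalize_link_target` and `is_within` are hypothetical stand-ins for the project's reject_bad_characters / check_in_allowlist chain; only `posixpath` behavior is taken as given.

```python
import posixpath


def normalize_link_target(src: str, dst: str) -> str:
    """Resolve a (possibly relative, possibly dangling) target as a string."""
    if any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in src):
        raise ValueError("control character in symlink target")
    if posixpath.isabs(src):
        joined = src
    else:
        # Relative targets are interpreted relative to the link's directory.
        joined = posixpath.join(posixpath.dirname(dst), src)
    return posixpath.normpath(joined)


def is_within(path: str, roots: list[str]) -> bool:
    """True when `path` equals a root or sits underneath one."""
    return any(
        path == root or path.startswith(root.rstrip("/") + "/") for root in roots
    )
```

For `dst="/srv/app/current"`, the target `release-v3` normalizes to `/srv/app/release-v3` and passes an allowlist of `/srv/app`, while `../../etc/shadow` normalizes to `/etc/shadow` and fails. No filesystem call is involved, so dangling targets still work.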
- 2026-04-25 ssh_link -- hard links, default -L, opt-in -P (INC-056). New low-access + group:file-ops tool: ssh_link(host, src, dst, ctx, follow_symlinks=True). Default mode calls sftp.link() directly -- pure SFTP, OpenSSH's sftp-server uses linkat(AT_SYMLINK_FOLLOW) so the link resolves the symlink chain (matches ln -L). follow_symlinks=False is ln -P --physical ("make hard links directly to symbolic links"); SFTP can't express that, so it falls back to a shell ln -P -- <src> <dst> invocation via conn.run with shlex.join argv -- same pattern as ssh_cp / ssh_mv's shell fallbacks, doesn't require ln in command_allowlist. Both src and dst route through path policy. Path-policy weakening for -P (documented in the SKILL): canonicalizing src would resolve the symlink we want to point at, defeating -P's point. Compromise -- canonicalize the parent of src (must exist + be allowlisted) and lstat confirms src exists in that dir; restricted-paths check still applies to the constructed full path. The check is "the symlink lives in an allowed dir," not "everywhere this symlink could ever point is allowed." Defensive details: -P mode rejects directory-only src up front, surfaces clean ValueError on lstat failure (not raw SFTPError), raises WriteError on shell non-zero exit. argv-quoted via shlex.join -- no string interpolation into the shell command. POSIX-only via require_posix. Existing dst raises (no -f / force; use ssh_delete first). No -s (symbolic) for v1 -- explicit ask was hard links. 7 regression tests in tests/test_link.py covering both modes' happy paths, dst-exists propagation, -P lstat-missing -> ValueError, -P shell failure -> WriteError, directory-only src rejection, Windows PlatformNotSupported. TOOLS.md row added; skills/ssh-link/SKILL.md authored with explicit "when to use hard links vs ssh_cp" + the path-policy weakening callout. Catalog: 74 tools across 9 groups. Suite: 758 unit pass (up from 750), 1 skipped. Ruff clean; mypy strict adds zero new errors. Audit pass earlier in the session confirmed no command-injection vector (argv-list construction, no f-string-into-shell anywhere new).
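The argv-quoting point can be shown in a few lines. `build_ln_physical` is a hypothetical helper name; `shlex.join` is the standard-library call the entry refers to.

```python
import shlex


def build_ln_physical(src: str, dst: str) -> str:
    """Build the `ln -P` fallback from an argv list, never by interpolation."""
    # `--` ends option parsing so a src starting with "-" is not read as a flag.
    return shlex.join(["ln", "-P", "--", src, dst])
```

Because every argument is quoted individually, a hostile-looking path such as `x; rm -rf /` reaches `ln` as a single literal argument rather than as a second command.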
- 2026-04-25 Per-host two-layer memory: operator baseline + agent sidecar (INC-055). Operator wanted CLAUDE.md-style persistent host memory the LLM could write itself, so durable lessons survive across sessions. Shipped a two-layer model: (1) operator notes -- notes = """...""" field on [hosts.<alias>] in hosts.toml, hard-rule baseline READ-ONLY to the agent ("never install apache2 here", ownership, on-call routing). (2) agent notes -- markdown sidecar at <SSH_HOST_NOTES_DIR>/<alias>.md (default notes/<alias>.md), READ-WRITE by the LLM. Three new tools: ssh_host_notes(host) (safe + read + group:host) returns both layers in one call -- {operator_notes, agent_notes, agent_notes_path, has_notes}; ssh_host_notes_append(host, entry) (low-access) appends a ## <UTC iso8601>\n<entry> block (creates the file with a header on first call); ssh_host_notes_set(host, content) (low-access) replaces the sidecar verbatim for consolidation or reset (empty string allowed -- clears to 0 bytes). All writes atomic via temp+os.replace; capped at SSH_HOST_NOTES_MAX_BYTES (default 256 KiB; the append error tells the LLM to consolidate via _set when approaching). Aliases validated against ^[A-Za-z0-9._-]+$ before being concatenated into a sidecar filename -- defense-in-depth against any future code path that bypasses resolve_host (which already filters). ssh_host_list's HostListEntry carries has_notes: bool true when EITHER layer is non-empty (one stat per host, cheap). Two new settings: SSH_HOST_NOTES_DIR (None disables the agent layer; operator layer remains) and SSH_HOST_NOTES_MAX_BYTES. .env.example + hosts.toml.example + .gitignore updated (sidecars excluded from source control). New HostNotesResult + HostNotesWriteResult models with extra="forbid". Three SKILL.md authored (ssh-host-notes, ssh-host-notes-append, ssh-host-notes-set) -- the append skill explicitly lists what NOT to record (re-derivable facts, ephemeral state, secrets, long verbatim output) so the sidecar stays useful. Misunderstanding pivot: first pass shipped operator-write / agent-read only -- a notes field on HostPolicy plus a read-only ssh_host_notes tool. Operator clarified they wanted the agent to write its own notes; the operator's role is to seed hard rules, not maintain everything. Pivoted to the two-layer model that keeps the first pass useful (it's now Layer 1) and adds the missing write side as Layer 2. 20 regression tests in tests/test_host_notes.py: operator-layer parsing, has_notes across both layers (operator only, agent only, both, whitespace, 0-byte sidecar, dir disabled), ssh_host_notes returning both layers cleanly, append creates header on first call + preserves history + rejects empty entries + enforces cap + raises when dir disabled + creates parent dir, set writes verbatim + replaces existing + empty clears to 0 bytes + enforces cap, unknown-alias propagation through resolve_host for all three tools. Catalog: 73 tools across 9 groups (up from 71). Suite: 750 unit pass (up from 727), 1 skipped. Ruff + mypy strict clean on touched files. Three SKILL traps hit + fixed: triple-quote inside a Python docstring (SyntaxError), em-dash in a SKILL front-matter description (ASCII-guard test), unescaped | in an INCIDENTS table row (markdownlint column count).
- 2026-04-25 ssh_upload / ssh_deploy accept content_text; ssh_exec_run framing sharpened against heredoc misuse (INC-054). Operator reported long opaque ssh_exec_run calls in tool-output transcripts that were probably just cat > path <<EOF-style file writes -- the LLM was reaching for ssh_exec_run because (a) the discouraging language for file writes was buried as one bullet in a 14-row mapping table, and (b) ssh_upload / ssh_deploy required content_base64, which is real friction for plain-text configs. Two parallel fixes: (1) Sharpened framing -- added an explicit "NEVER use ssh_exec_run for file writes" section to both the tool docstring and skills/ssh-exec-run/SKILL.md, called out the four most common patterns by name (cat > path <<EOF, tee path, echo > path, printf > path), and expanded the cheat-sheet from 14 rows to ~22 (every file-write pattern → ssh_upload(content_text=...); added missing entries for ssh_broadcast, ssh_transfer, ssh_host_network, ssh_user_info, ssh_file_hash, ssh_systemctl_*, ssh_journalctl from INC-052 / earlier sprints). (2) Removed encoding friction -- added content_text: str | None = None as a sibling to content_base64 on both ssh_upload and ssh_deploy. Plain UTF-8 (configs, scripts, code) goes via content_text; binaries keep using content_base64. New shared helper _resolve_upload_payload validates the exactly-one-of contract; empty string is a deliberate valid input (writes a zero-byte file) so the validator uses is not None, not truthiness. Existing callers passing content_base64 positionally keep working -- parameter moved from required to optional but stayed in the same position. 7 new tests in tests/test_upload_payload.py covering plain-text encoding, unicode round-trip, empty string, binary round-trip, both-set / neither-set rejection, malformed base64. TOOLS.md rows for both tools updated with the new payload semantics + explicit "use this instead of ssh_exec_run for cat > path <<EOF / tee / echo > path / printf > path" callout. SKILL.md files for both tools rewritten with both-payload examples. Suite: 727 unit pass (up from 720), 1 skipped; ruff clean on touched files; mypy strict adds no new errors.
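The exactly-one-of contract, including the `is not None` detail that keeps the empty string valid, can be sketched as below. The function name and error messages are illustrative; the project's `_resolve_upload_payload` may differ in signature and exception types.

```python
from __future__ import annotations

import base64
import binascii


def resolve_upload_payload(
    content_base64: str | None = None,
    content_text: str | None = None,
) -> bytes:
    """Return the bytes to write, requiring exactly one payload argument."""
    # Compare against None, not truthiness: "" is a valid zero-byte payload.
    if (content_base64 is not None) == (content_text is not None):
        raise ValueError("pass exactly one of content_text or content_base64")
    if content_text is not None:
        return content_text.encode("utf-8")
    try:
        return base64.b64decode(content_base64, validate=True)
    except binascii.Error as exc:
        raise ValueError("content_base64 is not valid base64") from exc
```

Keeping `content_base64` as the first parameter mirrors the note above that existing positional callers continue to work.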
- 2026-04-17 INC-052 step 2 -- ssh_transfer + ssh_user_info + ssh_host_network + ssh_host_info extension + audit-log README polish. Closed out the remaining port items from the upstream comparison (analyze/ssh-server-mcp-main/). ssh_transfer (tools/multi_host_tools.py, low-access + group:file-ops) streams a file between two remotes via SFTP channels on both connections -- 256 KiB chunks, atomic write on dst (temp + posix_rename), cleanup-on-failure. Both endpoints route through canonicalize_and_check + check_not_restricted independently so per-host path policy applies; size cap from SSH_UPLOAD_MAX_FILE_BYTES; same-host call rejected (use ssh_cp); cross-platform via SFTP. Throughput bottlenecks at the slower of (src→MCP) and (MCP→dst) -- documented so operators know to use direct scp via ssh_exec_run when A and B already trust each other and want gigabit. 7 tests with a fake SFTP harness (tests/test_transfer.py) covering pre-flight rejection, overwrite gating, size cap, atomic temp+rename, mid-transfer cleanup, throughput field. ssh_user_info (tools/host_tools.py, safe + read + group:host) returns structured /etc/passwd row + group memberships via getent passwd + id -Gn + id -gn in parallel; username=None resolves the SSH user via id -un; username regex-validated (POSIX 3.437) before reaching remote argv; no sudo. Dropped the upstream's list-all-users action -- structured per-user lookup is the win. ssh_host_network (tools/host_tools.py, safe + read + group:host) parses ip -j addr show into {name, state, mac, addresses[]} per interface; kernel-internal fields dropped; busybox hosts without iproute2 get [] instead of a raise. ssh_host_info extended with cpu_model (parses /proc/cpuinfo model name / ARM Model/Hardware fallback), cpu_count (parses nproc), hostname_fqdn (parses hostname -f) -- three new probes added to the existing asyncio.gather(return_exceptions=True) so a missing one doesn't lose the siblings. 14 parser unit tests in tests/test_host_extensions.py covering Intel/AMD/ARM cpuinfo, ip-json happy-path + garbage-tolerance, passwd-line parsing. README audit-log section (README.md:438-486) documents the ssh_mcp.audit JSON-line schema and gives four jq recipes (errors-last-hour, slowest-dangerous-calls, count-by-tool, trace-by-correlation_id); replaces the upstream's ssh_get_logs audit-query tool per the INC-052 design-no rationale (audit flows one-way to operators, never back to the agent). Three new result models HostNetworkResult / UserInfoResult / TransferResult + helpers NetworkInterfaceAddress / NetworkInterfaceEntry in models/results.py all with extra="forbid" per INC-046. INC-052 status → resolved. ssh_snapshot still deferred (runbook-first); ssh_get_logs + port-forwarding + local-FS SFTP + ssh_cron design-nos hold. Catalog: 71 tools across 9 groups (up from 68). Suite: 720 unit pass (up from 683), 1 skipped. Ruff clean on touched files; mypy strict adds one pre-existing-pattern attr-defined on asyncssh.sftp.FX_NO_SUCH_FILE mirroring low_access_tools.py:158.
- 2026-04-17 ssh_broadcast -- fan-out exec across pre-configured hosts (INC-052 step 1). First port from the upstream tool-surface comparison (analyze/ssh-server-mcp-main/ -- TypeScript SSH-MCP server). New dangerous + group:exec tool runs the same command on multiple hosts in parallel, returns a structured per-host result. Hard cap of 50 hosts per call; aliases deduplicated; per-host command_allowlist and platform checked independently so one host's CommandNotAllowed / PlatformNotSupported / transport failure does NOT abort the others. Pre-flight validation distinguishes caller errors (empty list, over-cap, HostNotAllowed / HostBlocked aliases -- RAISE up front) from transient per-host failures (captured in the errors map). Result shape {command, results{alias→ExecResult}, succeeded[], failed[], errors{alias→exception-class}, elapsed_ms} -- command echoed because the audit decorator records host="?" for fan-out tools, so the result body is the durable record of what ran where. New module tools/multi_host_tools.py (sibling to exec_tools.py; future ssh_transfer will land here too). New result model BroadcastResult with extra="forbid" per INC-046. 13 regression tests in tests/test_broadcast.py: empty/over-cap/typo/blocked rejection, dedup, all-succeed happy path, per-host allowlist denial, Windows host PlatformNotSupported, generic transport-error catch-all, timeout-as-failed, non-zero-exit-as-failed, command echo, unique-acquire pin. INC-045 trap dodged -- Context is a runtime import in tools/** because FastMCP's @tool calls get_type_hints() at registration. INC-052 status flipped to partial (broadcast shipped; ssh_transfer + ssh_user_info + ssh_host_info extension still pending; design-no for ssh_get_logs + port forwarding stands). TOOLS.md row added; skills/ssh-broadcast/SKILL.md authored (ASCII-only per test_skills_ascii.py). Catalog: 68 tools, 9 groups. Suite: 683 unit pass (up from 670), 1 skipped; ruff + mypy strict clean on touched files.
2026-04-17 Host-catalog introspection + runtime policy reload. Two new `group:host` tools: `ssh_host_list` (safe tier) -- enumerate aliases currently loaded from `hosts.toml` + `SSH_HOSTS_ALLOWLIST`. Returns `{alias, hostname, port, platform, user, auth_method}` -- credentials never exposed. Unblocks LLM self-discovery of the fleet without the operator pre-listing aliases in prompts. `ssh_host_reload` (low-access tier, gated by `ALLOW_LOW_ACCESS_TOOLS=true`) -- re-read `SSH_HOSTS_FILE` from disk and swap the in-memory policy atomically. Returns `{loaded, source, added, removed, changed}` diff. Validates new file BEFORE swap -- parse/validation failure keeps the existing fleet intact (no brick-by-bad-config). Does NOT invalidate pooled connections; live sessions retain original policy until keepalive drops them. Typed accessor `hosts_from(ctx)` added to tools/_context.py alongside the existing `pool_from` / `settings_from` / `known_hosts_from` -- raw `lifespan_context["hosts"]` access eliminated from tools. Catalog: 67 tools, 9 groups. Suite: 664 unit pass (up from 646), 1 skipped.
-
2026-04-17 Windows file-hash path fixed (INC-028 lineage). `_hash_windows` was failing against real Windows OpenSSH: `Get-FileHash` emits Write-Progress records that OpenSSH-for-Windows serializes as CLIXML (`#< CLIXML <Objs…`) into stderr, and the script ending on an expression caused `exit_status=None` channel closes. Three-part fix: (1) prepend `$ProgressPreference='SilentlyContinue';` to silence progress records; (2) append `;exit 0` to force an exit-status channel request; (3) shape-validating fallback -- accept `exit_status ∈ {0, None}` IFF the digest hex matches the expected length for the requested algorithm. E2E `test_file_hash[test_windows11]` now green. 5 unit tests updated to use algorithm-length-appropriate fake digests (was 12-char stub for all); 2 new assertions pin the `$ProgressPreference` line + `exit 0` trailer. Suite: 664 unit, 93 e2e pass (up from 72 pre-Win11). -
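Part (3) of the fix can be sketched as follows. This is an illustration under assumptions, not the body of `_hash_windows`: the function name, the `None`-on-reject return, and the table name are hypothetical; the four algorithms are the ones `ssh_file_hash` supports.

```python
import re
from typing import Optional

# Hex digest lengths for the algorithms ssh_file_hash supports.
_DIGEST_HEX_LEN = {"md5": 32, "sha1": 40, "sha256": 64, "sha512": 128}


def accept_hash_output(
    algorithm: str, exit_status: Optional[int], stdout: str
) -> Optional[str]:
    """Return the lowercase digest, or None when the output is not trusted."""
    if exit_status not in (0, None):
        return None  # a real non-zero exit is always a failure
    digest = stdout.strip().lower()
    expected = _DIGEST_HEX_LEN[algorithm]
    # A missing exit status is tolerated only when the payload has
    # exactly the shape of a digest for the requested algorithm.
    if re.fullmatch(rf"[0-9a-f]{{{expected}}}", digest) is None:
        return None
    return digest
```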
2026-04-17 `systemctl` safe-tier domain (8 tools). New `group:systemctl` -- `ssh_systemctl_status`, `ssh_systemctl_is_active`, `ssh_systemctl_is_enabled`, `ssh_systemctl_is_failed`, `ssh_systemctl_list_units`, `ssh_systemctl_show`, `ssh_systemctl_cat`, `ssh_journalctl`. All tagged `{safe, read, group:systemctl}`, `version="1.0"`. Result models in models/systemctl.py. Lifecycle ops (start/stop/restart/reload/enable/disable/daemon-reload) intentionally documented as `ssh_sudo_exec systemctl …` examples in the runbook rather than first-class tools -- they require root on stock hosts and gating them through the sudo tier avoids a false-low-access ergonomic trap. 8 per-tool skills + 1 consolidated runbook at runbooks/ssh-systemd-diagnostics/SKILL.md. `_JOURNALCTL_TIME_RE` initially copied docker's `s/m/h`-only posture, caught by the e2e run (`since="30d"` → ValueError) -- widened to match systemd.time(7): `s/m/h/d/w/M/y`. Catalog: 65 tools, 9 groups. Suite: 162 unit pass (systemctl), 641 full unit + 1 skipped, 16 e2e pass (8 tests × 2 reachable Linux hosts) + 8 skipped (windows11 unreachable/platform). -
2026-04-30 Resolved (Sprint 3, v1.4.0): Pydantic `model_config = ConfigDict(extra="forbid")` policy now uniform. All 9 systemctl result models in models/systemctl.py now carry `_RESULT_MODEL_CONFIG = ConfigDict(extra="forbid")` -- matching the pattern from models/results.py. Every `models/*.py` result class is now strict; no per-model exceptions remain. See ADR-0025 in DECISIONS.md. -
2026-04-17 Deferred: `SSH_RUNBOOKS_DIR` provider wiring. lifespan.py mounts `SkillsDirectoryProvider` on `SSH_SKILLS_DIR` + `SSH_RUNBOOKS_DIR`, BUT the latter is wired into `_mount_skills` -- confirm against current code before closing. If already wired, mark this closed retroactively; if not, add a second provider or repoint. -
2026-04-17 Deferred: Widen tests/test_skills_ascii.py to cover `runbooks/*/SKILL.md`. The ASCII-only guard scans `skills/` but not `runbooks/`. Would have caught a pre-review em-dash in the new runbook automatically. Also flags `tests/test_systemctl_tools.py:364`, which has a minor em-dash in a comment (non-blocking today; would need attention if the policy extends to test comments). -
2026-04-17 `_canonicalize_posix` `must_exist=True` actually enforces existence now. Surfaced by the full e2e run after the audit-redaction landing: test_cp_mv_edit_patch_deploy `mv`s a file then expects `ssh_sftp_stat` on the now-missing source to raise `PathNotAllowed`, but it raised raw `asyncssh.sftp.SFTPNoSuchFile` instead. Root cause: _canonicalize_posix ran `realpath` with no flags when `must_exist=True` (only added `-m` for `must_exist=False`); GNU `realpath`'s default mode requires the parent to exist but tolerates a missing leaf, so the canonicalize step silently succeeded and the missing-file signal leaked out of the next SFTP op as a transport error. Fix is one line: `argv.append("-e" if must_exist else "-m")`. With `-e` (canonicalize-existing), every component must exist -- exactly the contract the kwarg name advertises. 5 unit-test fixture lines updated to the new argv shape. Suite: 475 unit + 56 e2e pass, 34 e2e skipped (sudo-gated, Windows-no-docker). -
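The one-line fix in context, as a small sketch. The helper name `realpath_argv` and the `--` separator are assumptions for illustration; only the `-e` / `-m` selection is taken from the entry.

```python
def realpath_argv(path: str, must_exist: bool) -> list[str]:
    """Build a GNU realpath argv whose flag matches the must_exist contract."""
    argv = ["realpath"]
    # -e: every component must exist; -m: no component needs to exist.
    # With neither flag, GNU realpath tolerates a missing final component,
    # which is what let the missing-file case slip through.
    argv.append("-e" if must_exist else "-m")
    argv += ["--", path]  # "--" is an assumption: guards paths starting with "-"
    return argv
```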
2026-04-17 Argv-secret redaction wired into audit. Closed the long-standing "Redact
--password=*/--token=*in telemetry and audit" line under ongoing/cross-cutting. Telemetry side was already covered by exclusion (spans don't attach argv per the redaction posture in telemetry.py module docstring); audit side had acommand_hashfield that nothing populated AND, when populated, would have stably hashed the raw secret value. Two-part fix: (1) new redact_command_string helper (regex-based, length-preserving, mirrorsredact_argvsemantics for raw-string commands whereshlex.splitwould fail on partial pipelines); (2) @audited wrapper extractscommand: str(ssh_exec_run / ssh_exec_run_streaming / ssh_sudo_exec / ssh_docker_exec) andargs: list[str](ssh_docker_run) into the audit line, whilescript:bodies (ssh_exec_script) stay deliberately out of the capture per the tool's stdin-only contract.record()itself redacts BEFORE hashing AND strips the:Nlength suffix via _REDACTED_LEN_SUFFIX_RE so two--password=Xinvocations with different X produce the samecommand_hash— dedup-by-shape instead of a stable per-secret fingerprint that would be trivially rainbow-tableable for guessable passwords. 10 new tests pin both helpers and the wiring (includingtest_record_redacts_secret_flags_before_hashingwhich asserts hash equality across two different secret values). Suite: 475 unit pass (up from 465), 1 skipped; ruff clean. -
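A sketch of the redact-then-hash ordering described above. The helper names match the entry, but the bodies are guesses: the regexes, the `<redacted:N>` placeholder format, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import re

# Assumed patterns; the real helpers may cover more flags and forms.
_SECRET_FLAG_RE = re.compile(r"(--(?:password|token)=)(\S+)")
_REDACTED_LEN_SUFFIX_RE = re.compile(r"(<redacted):\d+(>)")


def redact_command_string(command: str) -> str:
    """Replace secret flag values with a placeholder that records length."""
    return _SECRET_FLAG_RE.sub(
        lambda m: f"{m.group(1)}<redacted:{len(m.group(2))}>", command
    )


def command_hash(command: str) -> str:
    """Hash the command shape: redact first, then drop the :N suffix."""
    redacted = redact_command_string(command)
    shape = _REDACTED_LEN_SUFFIX_RE.sub(r"\1\2", redacted)
    return hashlib.sha256(shape.encode("utf-8")).hexdigest()
```

Because the length suffix is stripped before hashing, two invocations that differ only in the secret value hash identically, so the hash cannot serve as a per-secret fingerprint.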
2026-04-17 Phase 5 telemetry wiring landed. Closed the last open item on Phase 5:
telemetry.span()now actually wraps the three transport call sites it was always supposed to (connection.py:_open_single, exec.py:run + run_streaming, path_policy.py:canonicalize_and_check). Span names match the DESIGN.md naming (ssh.connect,ssh.exec,path.canonicalize); attributes attach host, port, user, auth method, proxy hop count, exit code, duration, timed-out flag, and stdout/stderr byte counts — but never argv strings, path content, or auth secrets, per the redaction posture stated in telemetry.py module docstring. Without OTel installed everything degrades to the existing_NoopSpanso this is a zero-cost addition for users who don't opt into[telemetry]. Three new regression tests in test_telemetry.py lock the wiring:test_exec_run_opens_ssh_exec_spanandtest_path_policy_opens_canonicalize_spanmonkeypatch thespanimport in each consumer module, drive the call site, and assert both the open-time attributes AND the absence of argv/path content in any captured value (this is how a future "let's just attachargsfor debugging" regression gets caught at PR time);test_connection_module_imports_spanis a static binding check becauseopen_connectionrequires a live asyncssh handshake to exercise end-to-end. Also closed two pre-existing INC-046 leftovers intests/e2e/: dict-style accesses onHashResult/PingResult/DownloadResultreturns (now attribute access). Addeduv.lockto .gitignore — the file churns per-machine and was generating noise across every shell that ran auvcommand. Suite: 465 unit pass (up from 462), 1 skipped; ruff + mypy strict clean on all touched files. -
2026-04-17
SSH_CONFIG_FILEwired through (INC-051, ext:classfang/ssh-mcp-server#22). Closed a doc/code drift uncovered while cross-checking an upstream feature request: theSSH_CONFIG_FILE: Path | None = Nonefield had been declared in config.py, surfaced in .env.example, promised by AGENTS.md:594, and echoed in DESIGN.md:451 — but the only consumer insrc/was the test conftest cleaning it out of the environment; _open_single never passed anything toasyncssh.connect(config=...). Operators setting the field saw zero effect —ProxyJump,IdentityFile, host-aliasHostNameresolution, andCiphers/MACs/KexAlgorithmsoverrides from their personal SSH config were silently ignored. Fix is three small edits: (1) _open_single appendskwargs["config"] = [str(settings.SSH_CONFIG_FILE.expanduser())]when set —expanduser()called explicitly because pydantic'sPathcoercion does not, and asyncssh treats config-file values as fallbacks for kwargs not explicitly passed so our explicithost/port/username/known_hostsstill win; (2) config.py:107-118 adds an_empty_path_to_nonefield validator soSSH_CONFIG_FILE=(blank in.env.example) doesn't smuggle aPath("")through (truthy, points at CWD); (3) lifespan.py:236-249 emitsssh_config: honoring <abs-path>when set+exists or WARNING when set+missing — asyncssh tolerates a missing config file silently, which makes "I set the env var but ProxyJump still doesn't apply" debug sessions awful. Five regression tests in test_ssh_config_file.py pin the contract:configkwarg appears when set, absent when unset,~expanded before forwarding, blank env normalized to None, whitespace-only env normalized to None — pattern is monkeypatchasyncssh.connectwith a fake that captures kwargs and raises_StopHereto abort before networking. 
README quickstart now has an "Inheriting from~/.ssh/config" subsection right after the hosts.toml writeup so first-time operators with a populated SSH config see the option immediately; .env.example knob got a 6-line block-comment with the precedence rule; AGENTS.md §1.3 line tightened with the precedence + use-case detail. Suite: 462 unit pass (up from 457), 1 skipped; ruff + mypy strict clean on all touched files. -
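The blank-value normalization and the `expanduser()` step from this entry can be sketched with the standard library alone. The real code uses a pydantic field validator and `_open_single`; the two function names below are hypothetical simplifications.

```python
from pathlib import Path
from typing import Optional


def empty_path_to_none(raw: Optional[str]) -> Optional[Path]:
    """Normalize a blank or whitespace-only env value to None."""
    if raw is None or not raw.strip():
        # Path("") is truthy and resolves to the current directory,
        # so it must never reach the connection code.
        return None
    return Path(raw.strip())


def connect_kwargs(config_file: Optional[Path]) -> dict:
    """Build only the config-related kwarg for the connect call."""
    kwargs: dict = {}
    if config_file is not None:
        # expanduser() is explicit because Path coercion leaves "~" as-is.
        kwargs["config"] = [str(config_file.expanduser())]
    return kwargs
```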
2026-04-17 Release-prep incident sweep (INC-035 → INC-048, plus INC-027 superseded). Closed 14 INCIDENTS entries from the project review in one window, including 7 that surfaced during code review of the e2e-suite landing. Highlights: INC-035 awaited cancelled pump tasks in
ssh/exec.py(no moreTask exception was never retrievedon timeout); INC-036 snapshottedConnectionPool._reap_once()so concurrentacquirecan't triggerRuntimeError: dictionary changed size; INC-037 replaced the<received>literal inHostKeyMismatchwith explicit "asyncssh did not expose the received key" wording + thessh-keyscanrecovery path; INC-038 addedreturn_exceptions=Truetossh_host_info/_alertsasyncio.gatherso one missing probe (nouptime, restricted/proc) doesn't lose the others; INC-039 narrowed_atomic_writeexcept Exception:to(asyncssh.Error, OSError)soCancelledError/MemoryErrorpropagate; INC-040 typed_docker_prefix/_compose_prefixfromAnytoHostPolicy/SettingsunderTYPE_CHECKING; INC-041 dropped unusedtenacitydep; INC-042 introduced aLifespanContextTypedDict+ singlecastso all 4 ctx accessors drop their# type: ignore[no-any-return]; INC-043 splittools/docker_tools.pyfrom 1020 lines into adocker/subpackage (_helpers.py360,read_tools.py361,lifecycle_tools.py155,dangerous_tools.py228) with a 103-line facade preserving all historical imports — two test files updated to monkeypatch the correct submodule; INC-045 ran a 104-finding ruff cleanup then expandedselectwithASYNC/PERF/PT/PLE/TCH(waivingASYNC109because every tool intentionally exposestimeout=for per-call MCP override); the ruff--unsafe-fixesforTCH002moved 11from fastmcp import Context/from pathlib import Pathimports underTYPE_CHECKINGand broke 131 tests because FastMCP's@toolcallsget_type_hints()at registration and pydantic'smodel_rebuild()does the same on field annotations — restored as runtime imports and pinned via per-file["TC001", "TC002"]ignores ontools/**andmodels/**; INC-046 (both steps) addedConfigDict(extra="forbid")to all 13 result models (typos at construction now raiseValidationError) AND rewired 22 tools to return their typedBaseModeldirectly so MCP clients see real schemas intools/listinstead of genericobject(~60 test assertions converted fromresult["foo"]toresult.foo); legitimately-merged-dict tools 
(everyssh_docker_*,ssh_shell_exec,ssh_host_alerts,ssh_known_hosts_verify,ssh_session_*, bimodalssh_delete_folder, extendingssh_deploy) deliberately stay asdict[str, Any]with rationale captured at the call site; INC-047 introducedShellSession.exec_scope()async context manager +set_cwd()that assertsself.lock.locked()at the write site, so the "caller forgot to acquire the lock" regression class is now eliminated by construction (3 new regression tests; INC-027 closes asn/a (superseded)because the bypass-the-lock failure mode it worried about is unreachable now); INC-048 type-checks bothkwargs["host"]andargs[0]inaudit.pyso a misordered tool signature drops to "?" instead of smearing a__repr__into the audit stream; INC-028 unblocked Windowsssh_file_hashvia PowerShell-EncodedCommand(base64-UTF16LE of aGet-FileHashscript with''-escapedLiteralPath) — sidesteps every cmd.exe / PowerShell shell-quoting corner the priorshlex.joinattempt couldn't reach. Suite: 457 unit pass (up from 448), 1 skipped; ruff clean (0 findings under the expanded ruleset). Open: only INC-044 (CI / pre-commit /.python-versionscaffolding, deliberately deferred). -
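For the INC-028 item, a minimal sketch of building a PowerShell `-EncodedCommand` argv. The function name and the algorithm check are assumptions; the base64-of-UTF-16LE encoding and the `''` escape for the literal path are as described in the entry.

```python
import base64

_ALGORITHMS = {"md5", "sha1", "sha256", "sha512"}


def encoded_hash_command(path: str, algorithm: str) -> list[str]:
    """Return a powershell argv that hashes `path` with no shell quoting."""
    if algorithm not in _ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    literal = path.replace("'", "''")  # PowerShell single-quote escape
    script = (
        f"(Get-FileHash -Algorithm {algorithm.upper()} "
        f"-LiteralPath '{literal}').Hash"
    )
    # -EncodedCommand expects base64 over the UTF-16LE bytes of the script.
    encoded = base64.b64encode(script.encode("utf-16-le")).decode("ascii")
    return ["powershell", "-NoProfile", "-NonInteractive",
            "-EncodedCommand", encoded]
```

Since the script travels as a single base64 token, no layer of cmd.exe or PowerShell argument parsing sees the path's quotes or spaces.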
2026-04-17
tests/e2e/suite + Windows SFTP realpath fix (INC-034). Newtests/e2e/suite drives every registered tool against the real hosts inhosts.toml, with per-alias parametrization, session-scoped fixtures for Settings / pool /hosts.tomlloading, and TCP reachability probes so unreachable hosts skip rather than fail. Six test modules: test_e2e_real_hosts.py (core tools — ping / host_info / sftp / file-ops / exec / sessions / shell, 15 test functions), test_e2e_docker.py (full docker + compose lifecycle, auto-skip viadocker version/docker compose versionprobes), test_e2e_sudo.py (gated onSSH_E2E_SUDO_PASSWORDso accidental runs can't mutate production), test_e2e_path_policy.py (allowlist +restricted_pathsenforcement — each test rebuilds a narrow ctx becausepolicy.path_allowlist ∪ settings.SSH_PATH_ALLOWLISTwith either containing"*"would mask confinement). Newe2epytest marker registered; suite covers all 57 registered tools across 90 parametrized cases; 62 pass + 13 skipped without sudo, opt-in sudo bumps to 83 pass + 7 skipped / 90 total. First full-catalog e2e run againsttest_windows11surfaced INC-034 (High): OpenSSH-for-Windows returns SFTPrealpathresults in Cygwin form (C:\Users→/C:/Users), which_is_windows_absoluterejects — every Windows SFTP path failed withPathNotAllowed: canonicalized path is not absolute. Fixed in _canonicalize_windows by stripping the single leading/when the next two chars form a drive prefix (C:/orC:\); predicate is tight enough to leave UNC paths (//host/share) alone. Post-merge code review flagged that the fix shipped without a regression unit test; closed by test_sftp_realpath_cygwin_form_is_normalized + test_unc_realpath_is_not_stripped which pin both the Cygwin normalization and the UNC pass-through contract. 
Also this pass: skill/runbook count floors in test_skills_ascii.py tightened (23 → 50, 3 → 7) so a mass-accidental-deletion trips CI instead of sliding through; config.py now explains why there's no symmetricSSH_ENABLE_SKILLStoggle (per-tool skills near-free, runbooks heavy — asymmetry deliberate). Suite: 448 unit pass (up from 446), 1 skipped; e2e suite 62 pass + 13 skipped without sudo. -
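The INC-034 normalization can be illustrated with a small predicate. This is a sketch of the rule as stated (strip one leading `/` only when a drive prefix follows), not the body of `_canonicalize_windows`; the function name is hypothetical.

```python
def strip_cygwin_drive_slash(path: str) -> str:
    """Turn '/C:/Users' into 'C:/Users'; leave everything else untouched."""
    if (
        len(path) >= 4
        and path[0] == "/"
        and path[1].isascii() and path[1].isalpha()
        and path[2] == ":"
        and path[3] in "/\\"
    ):
        return path[1:]
    # UNC paths such as //host/share fail the drive-letter test above.
    return path
```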
2026-04-16 Skills / runbooks directory split + 5 new runbooks.
skills/now holds only per-tool SKILL.md files (57, one per registered tool);runbooks/holds multi-tool workflow procedures (8 total). Two separateSkillsDirectoryProviderinstances mount the directories; newSSH_ENABLE_RUNBOOKS: bool = Truesetting skips the runbooks mount for tool-execution-only assistants. Lifespan log distinguishes the two (mounted skills provider at ...vsmounted runbooks provider at ...). Three existing runbooks moved out of skills/:ssh-incident-response,ssh-docker-incident-response,ssh-verify-signature. Five new runbooks added:ssh-deploy-verify(upload + hash-verify + compose_up + log-tail +.bak-<ts>rollback),ssh-host-healthcheck(identity + alerts + disk + processes + uptime → green/yellow/red),ssh-disk-cleanup(find before prune; branches for logs / Docker / app data; never volume-prune from LLM),ssh-integrity-audit(pinning + hash drift + signature verify + SUID delta),ssh-container-rollout(standalonedocker runrollout withState.Healthverification + image rollback). Cross-refs fixed: skills → runbooks use../../runbooks/<name>/, runbooks → per-tool skills use../../skills/<name>/. Test suite:test_skills_ascii.pyscans both directories; newtest_runbooks_directory_existsguard. Suite: 446 pass, 7 skipped. -
2026-04-16 Security findings consolidated into INCIDENTS.md. Central append-only ledger with stable
INC-NNNIDs replaces the previous scatter across progress entries, ADR context blocks, and inline code comments. 33 entries migrated: 20 internal findings (INC-003..INC-014, INC-021..INC-028), 5 external-project issue scans, 2 external-feedback ADR-triggers, 5 post-merge code reviews, 1 commit-message-style review. Status index at the top of INCIDENTS.md gives the at-a-glance view; detailed per-entry blocks below with refs to fix commit, tests, ADRs. Code comments acrosssrc/+tests/migrated from legacy IDs toINC-NNNreferences. README Architecture section updated. -
2026-04-16 Post-review fixes for file-hash + docker events/volumes. B1 (blocking):
ssh_file_hashWindows branch used POSIXshlex.joinquoting which is incompatible with Windows OpenSSH's cmd.exe / PowerShell host -- the'"'"'single-quote escape is POSIX-shell-only. Windows support gated viarequire_posix;_hash_windows/_WINDOWS_HASH_ALGOremoved; SKILL.md + docstring updated; INC-028 filed as open with the-EncodedCommandfix-forward path. 2 Windows-argv tests replaced by a singlePlatformNotSupportedassertion. I1:_DOCKER_TIME_REdropped thedunit -- Go'stime.ParseDuration(whichdocker events --sincefeeds into) only acceptss/m/h, and acceptingdwould have routed to a confusingtime: unknown unit "d"daemon error. Twod-smuggling cases (7d,1d2h) added to the bad-since parametrize;7dremoved from the good-time list. I2:ssh_docker_volumesdocstring now names theempty volumes on any non-zero exit_codesemantics explicitly so LLMs don't silently read "no such volume" as "genuinely empty". Defaultsinceforssh_docker_events:10m->1h-- operators paged mid-incident weren't seeing the trigger. Also explicit"--filter" not in argvassertion in the default-argv test to lock the no-filters-means-no-filter-token invariant. Minor:_stat_sizenow catchesasyncssh.Error(base class) not justSFTPErrorso transport failures also degrade to the -1 sentinel; unusedAsyncMockimport removed. Suite: 440 pass, 7 skipped. -
2026-04-16 Two new read-tier Docker tools:
ssh_docker_events+ssh_docker_volumes. Fill the two biggest gaps the docker incident-response runbook exposed.ssh_docker_eventsrunsdocker events --since <since> --until <until> --format '{{json .}}'over a bounded time window (we never pass unbounded events — would hang untilSSH_COMMAND_TIMEOUT); time-anchor regex accepts relative (10m,24h30m), Unix epoch, RFC3339, andnow;filters: list[str]accepts conservativeKEY=VALUEexpressions validated by a regex.ssh_docker_volumescombines list + inspect: withoutnamerunsvolume ls --format '{{json .}}', withnamerunsvolume inspect -- <name>(parity withssh_docker_inspect). Closes the "don't blind-prune volumes from an LLM turn" gap — operators can now enumerate + inspect before anyprune(scope='volume')decision. 29 new tests in test_docker_events_volumes.py covering good/bad time formats, injection-attempt filters, argv shape (default + with-filters), NDJSON parsing, volume ls vs inspect argv, non-zero-exit inspect. Docker incident-response runbook updated to reference both in §2 (events for "what just happened") and §6 (volumes before prune). Catalog: 57 tools, docker group 26, suite 443 pass, 7 skipped. -
2026-04-16 Docker incident-response runbook (runbooks/ssh-docker-incident-response/SKILL.md) parallel to
ssh-incident-response. Eight-section workflow: fullps --allinventory, host-level resource triage (disk_usage+alerts+stats),inspect-first root-cause for failing containers (exit-code decoding: 137 OOM, 143 SIGTERM, restart-count/healthcheck semantics), running-but-broken diagnosis (top, network-mode, saturated-resources), compose-stack failure path withcompose_ps+ service-scopedcompose_logs, disk-pressure prune path with an explicit "don't volume-prune from an LLM turn" boundary, tiered recovery actions (low-access restart vs dangerous compose up/down), escalation triggers (unhealthy-after-restart, repeated identical exit codes, image-drift flag that cross-refs the signature-verify runbook). Read-only up to Section 5; low-access + dangerous explicitly gated. Podman-agnostic (SSH_DOCKER_CMD). README runbooks section lists all three canonical runbooks. Suite: 412 pass, 7 skipped. -
2026-04-16 Signature-verification runbook (runbooks/ssh-verify-signature/SKILL.md) instead of a new tool. Covers GPG / cosign / minisign as a workflow via
ssh_exec_run+ per-hostcommand_allowlist-- no crypto-library dep, no trust-store management in ssh-mcp, no one-size-fits-all wrapper. Explicit responsibility boundary: CI/CD signs BEFORE artifact reaches the target; ssh-mcp does second-line verify of deployed artifacts only. Anti-pattern called out: distributing pubkey + artifact + signature through the same channel. Per-tool gotchas documented (gpg pinentry hang,Good signature from <unexpected uid>, cosign keyless needs network to Rekor). Thessh_file_hashSKILL's security note rewritten to point here with a clean integrity-vs-authenticity boundary statement. README runbooks section updated. Declined to shipssh_file_verify_signatureper ADR-0009 (workflow tools out, skills in). Suite: 411 pass, 7 skipped. -
2026-04-16
ssh_file_hash: standalone read-tier tool for transfer-verification / drift-detection. Computesmd5/sha1/sha256/sha512of a remote file; returns lowercase hex digest + byte size. POSIX runs<algo>sum -- <canonical>(coreutils); Windows runspowershell -NoProfile -NonInteractive -Command "(Get-FileHash -Algorithm <ALGO> -LiteralPath '<path>').Hash"with PowerShell single-quote escape ('->''). Kept as standalone two-step verify flow rather than auto-integrated intossh_upload/ssh_deploy/ssh_docker_cpper operator preference -- it's a debug / manual-verify helper. Path goes throughcanonicalize_and_check+restricted_pathslike every other sftp-read tool. 16 new tests in test_file_hash.py covering invalid algorithm rejection, all four POSIX binaries, path-with-spaces parsing, non-zero-exit →HashError, unparseable-digest →HashError, uppercase-digest lowercased, Windows-Algorithm <ALGO>verification, single-quote escape viashlex.splitroundtrip,check_not_restrictedinvocation (happy-path test catches missing-import class of bug). Catalog: 55 tools, suite 406 pass, 7 skipped. -
2026-04-16 Post-review fixes for
ssh_docker_cp(review B1 + I1 + I2 + M1 + coverage gap). B1 (blocking):effective_restricted_pathsandcheck_not_restrictedwere used but not imported -- every real call would have died withNameError. The pre-flight tests passed because they short-circuit on_validate_name/directionvalidator before reaching the path-policy block. Imports added. I1: rewrote theif/elifoverdirectiontoif/elseso a future fourth direction added without updating the validator fails fast at the branch instead of silently leavingargvunbound. I2: comment at the firstpool.acquireconfirms keyed-pool semantics -- the_run_dockerre-acquire returns the cached connection, one TCP/SSH session, two channels. M1:ssh_docker_topdocstring now explains why the metachar check runs on the raw input BEFOREshlex.split(\n-as-whitespace silently smuggles redirects past per-token checks). Coverage gap: added two happy-path tests intests/test_docker_top_cp.pythat monkeypatchcanonicalize_and_check+_run_dockerand assert the resulting argv shape for both directions, plus a third that assertscheck_not_restrictedis invoked. Verified the new tests would have caught B1 by temporarily stripping the imports and observing the expectedNameError. M3:to_containeradded to the bad-container-name parametrize. Suite: 390 pass, 7 skipped. -
2026-04-16 (latest) Two new docker tools:
ssh_docker_top+ssh_docker_cp.ssh_docker_top(read tier) runsdocker top <container>with optionalps_optionsargv suffix; shell metacharacters in the raw option string are rejected beforeshlex.splitso\n-split tricks can't smuggle redirects. Output is plainps-style text instdout(docker has no JSON format fortop).ssh_docker_cp(low-access) does bidirectionaldocker cpwith explicitdirection: Literal["from_container", "to_container"]; host-side path goes throughcanonicalize_and_check+restricted_pathslikessh_cp; container-side path intentionally NOT policy-checked (we don't manage policy inside containers). Docstring + SKILL document the compromised-image symlink-surprise caveat. 10 new input-validation tests, 2 new SKILL-ASCII tests, registration+tag assertions extended. Catalog: 54 tools, suite 386 pass, 7 skipped. -
2026-04-15 (latest) Integration test keypair + known_hosts fixtures landed. The
tests/integration/suite is no longer a placeholder: conftest.py bootstraps an ephemeral ed25519 keypair undertests/integration/keys/on first run (session-scoped, reused across sessions), session-pins the container's live host key via oneknown_hosts=Nonehandshake, and builds aConnectionPoolbound to a realHostPolicy. Six real tests replace the old TCP-probe placeholder in test_integration_readonly.py: pool acquire,echoviaexec_run, non-zero-exit-is-data,canonicalize_and_checkin-scope + out-of-scope, SFTP listdir, pool reuse. All keys + known_hosts files are.gitignore'd. Unit suite still 374 green; integration skips cleanly when the container is down. README Testing section documents thepytest --collect-only+docker compose up+pytest -m integrationflow and therm tests/integration/known_hostsrecovery for container recreate. -
2026-04-15 (latest) MCP ToolAnnotations derived from tier tags. The MCP spec defaults `readOnlyHint=false` and `destructiveHint=true`, so every `safe` / `read` tool was surfacing as "destructive" in clients (Claude Desktop, MCP Inspector). `_apply_mcp_annotations()` runs once in the lifespan, iterates `server._list_tools()`, and maps our existing tag taxonomy onto `ToolAnnotations`: `safe` / `read` → read-only & non-destructive & idempotent; low-access file ops → additive (non-destructive) except `ssh_delete*` / `ssh_docker_rm*` / `ssh_docker_prune` which stay destructive; `dangerous` / `sudo` → destructive + `openWorldHint=True` everywhere. Five in-process regression tests in tests/test_mcp_annotations.py + a stdio round-trip smoke at scripts/check_annotations.py that spawns the server via the MCP Python SDK and dumps a `tool / readOnly / destructive` matrix. Suite: 374 pass, 2 skipped. -
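The tag-to-annotation mapping can be sketched with plain dicts. This is not `_apply_mcp_annotations()`: the function name, the precedence of `dangerous` / `sudo` over `safe` / `read`, and the reading that `openWorldHint=True` applies to every tool are assumptions.

```python
# Low-access tools that stay destructive, per the entry above.
_DESTRUCTIVE_PREFIXES = ("ssh_delete", "ssh_docker_rm")
_DESTRUCTIVE_NAMES = {"ssh_docker_prune"}


def annotations_for(name: str, tags: set[str]) -> dict:
    """Map a tool's tier tags onto MCP annotation hints (plain-dict sketch)."""
    hints = {"openWorldHint": True}  # assumed to apply to all tools
    if tags & {"dangerous", "sudo"}:
        hints.update(readOnlyHint=False, destructiveHint=True)
    elif tags & {"safe", "read"}:
        hints.update(readOnlyHint=True, destructiveHint=False,
                     idempotentHint=True)
    else:
        # Low-access tier: additive unless the tool removes something.
        destructive = (name.startswith(_DESTRUCTIVE_PREFIXES)
                       or name in _DESTRUCTIVE_NAMES)
        hints.update(readOnlyHint=False, destructiveHint=destructive)
    return hints
```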
2026-04-15 (latest) Dropped Poetry for PEP 621 + hatchling + uv. `pyproject.toml` uses standard `[project]` metadata instead of `[tool.poetry]`; `hatchling.build` as PEP 517 backend (replaces `poetry-core`). Optional deps moved from `[tool.poetry.group.*]` to `[project.optional-dependencies]` (`.[tasks]`, `.[telemetry]`, `.[dev]`). `[tool.poetry.scripts]` → `[project.scripts]`. README installation section rewritten around `uv sync` / `uv run` / `uvx --from .` / `pip install -e .` with `poetry` references removed. `hosts.toml` + `hosts.toml.example` fingerprint-lookup snippets switched to `uv run`. Suite: 369 pass, 2 skipped; `uvx --from . ssh-mcp` verified end-to-end. -
2026-04-15 (latest) Podman + arbitrary Docker-compatible CLI support. New
SSH_DOCKER_CMDglobal setting (defaultdocker) + per-hostdocker_cmdfield inhosts.toml(mirrors the existingSSH_DOCKER_COMPOSE_CMDpattern).SSH_DOCKER_COMPOSE_CMDchanged to empty default and derives from the docker cmd at runtime — soSSH_DOCKER_CMD=podmanautomatically yieldspodman composewithout a second knob. All 22 Docker tools now route through_docker_prefix(policy, settings)/_compose_prefix(policy, settings)helpers;_run_dockertakes acompose: boolflag and prepends the right prefix._run_command_for_secret/_run_secret_cmdmagictimeout=10extracted to_SECRET_CMD_TIMEOUT_SECONDSnamed constants. Also fixeduvx --from . ssh-mcpcrash:fastmcpdep now carriesextras = ["tasks"]sossh_exec_run_streaming'sTaskConfigdecorator validates at import time without manual--with tasksinstall. 13 new regression cases intests/test_docker_cmd.py. Suite: 369 pass, 2 skipped. -
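How an empty compose setting can derive from the docker command is sketched below. The real helpers take `(policy, settings)`; the simplified signatures and the use of `shlex.split` here are assumptions for illustration.

```python
import shlex
from typing import Optional


def docker_prefix(host_cmd: Optional[str], global_cmd: str = "docker") -> list[str]:
    """Per-host docker_cmd wins; otherwise fall back to the global setting."""
    return shlex.split(host_cmd or global_cmd)


def compose_prefix(compose_cmd: str, docker_argv: list[str]) -> list[str]:
    """An explicit compose command wins; an empty one derives from docker."""
    if compose_cmd.strip():
        return shlex.split(compose_cmd)
    return [*docker_argv, "compose"]
```

With the global command set to `podman` and the compose setting left empty, `compose_prefix("", docker_prefix(None, "podman"))` yields `["podman", "compose"]`, which is the single-knob behavior the entry describes.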
2026-04-15 (latest) Post-fix findings INC-024 / INC-025 / INC-026 landed. Close two real gaps in the docker-run escalation deny-list plus a test cleanup:
- INC-024 (Medium): `--mount source=/,target=/host` bypassed the existing `--volume=/:` check. New `_mount_source_is_host_root()` parser decodes the KV value format (`type=bind,source=/,target=...`), handles both `--mount X` and `--mount=X` forms. `posixpath.normpath` catches `//`, `/./`, trailing-slash variants.
- INC-025 (Low): container-namespace join (`--pid=container:victim`, `--network container:bar`, etc. for all six namespace flags × both prefix + two-token forms) now rejected alongside the existing `host` match.
- INC-026 (Low): tautological `(entered[0], exited[0]) == (entered[0], entered[0])` assertion removed from shell-session lock test; the `exited[i] == entered[i]` pair already pins the serialization invariant.
- INC-027 (Low): deferred per author note ("optional; only if ssh_shell_exec grows").
- 18 new parametrized regression cases in `tests/test_docker_run_escalation.py`. Suite: 356 pass, 2 skipped. Post-fix verification documented in INCIDENTS.md.
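The INC-024 check can be sketched as follows. This is not the body of `_mount_source_is_host_root()`: the `src` alias (which Docker accepts for `source`) and the argv scanner `argv_mounts_host_root` are additions for illustration. Note that `posixpath.normpath` keeps exactly two leading slashes, so the sketch strips slashes after normalizing.

```python
import posixpath


def mount_source_is_host_root(value: str) -> bool:
    """True when a --mount KV string binds the host root directory."""
    for pair in value.split(","):
        key, sep, val = pair.partition("=")
        if not sep or key.strip() not in {"source", "src"} or not val:
            continue
        normalized = posixpath.normpath(val)
        # normpath("//") stays "//", so compare after stripping slashes.
        if normalized.startswith("/") and normalized.strip("/") == "":
            return True
    return False


def argv_mounts_host_root(args: list[str]) -> bool:
    """Scan docker-run args for both '--mount X' and '--mount=X' forms."""
    for i, arg in enumerate(args):
        if arg.startswith("--mount="):
            value = arg[len("--mount="):]
        elif arg == "--mount" and i + 1 < len(args):
            value = args[i + 1]
        else:
            continue
        if mount_source_is_host_root(value):
            return True
    return False
```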
-
2026-04-15 (latest) Windows SSH targets — Scope 1 minus docker (ADR-0023). New
`HostPolicy.platform` field (default `posix`, legacy `linux`/`macos`/`bsd`/`darwin` aliases normalize to `posix`, new `windows` option). `require_posix()` helper raises `PlatformNotSupported` on POSIX-assuming tools when target is Windows: `ssh_host_info`/`_disk_usage`/`_processes`/`_alerts`, `ssh_exec_*`, `ssh_sudo_*`, `ssh_shell_open`/`_exec`, all `ssh_docker_*`, `ssh_cp`. Error message names the missing capability and points at SFTP alternatives. Supported on Windows: SFTP file-ops (mkdir, delete, delete_folder via SFTP-walk, upload, edit, patch, deploy, mv without cross-fs fallback), SFTP reads (list, stat, download), `ssh_find` via SFTP-walk with `fnmatch` glob, plus ping/known_hosts_verify/session tools. Path policy platform-aware: `canonicalize()` routes to SFTP `realpath` (+ `ntpath.normpath` fallback) on Windows, prefix match case-insensitive + separator-agnostic. `path_allowlist` validator accepts `C:\\...` and `C:/...`. New tests in `tests/test_windows_target.py` (FakeConn + FakeSFTP shims, 29 cases). Suite: 338 pass, 2 skipped.
- 2026-04-15 (later) Hardening + ergonomics pass driven by internal review + cross-project issue scan (bvisible/mcp-ssh-manager#13, tufantunc/ssh-mcp#2/#42/#44, classfang/ssh-mcp-server#31):
  - INC-021 / INC-022 / INC-023 fixed: hook task tracking with backlog warning (hooks.py), `ssh_docker_run` rejects host-escape flags (`--privileged`, `--cap-add`, host namespace, host-root volume) by default behind `ALLOW_DOCKER_PRIVILEGED` (docker_tools.py), per-session `asyncio.Lock` on `ShellSession` acquired by `ssh_shell_exec` (shell_sessions.py + shell_tools.py). Full detail in INCIDENTS.md.
  - Docker list-tools `include_labels` flag: `ssh_docker_ps`/`_images`/`_compose_ps` strip `Labels` by default and rewrite `stdout` as compact NDJSON. OCI labels on common images blew the MCP output cap on hosts with 20+ containers.
  - TTY hint (#31 echo): `ExecResult.hint` populated when stderr matches `is not a tty` / `must be run from a terminal`. Tells the LLM to use batch-mode flags or `ssh_exec_script`. Defends our "no remote PTY" design choice without surprising the operator.
  - Risky-config hint (#13 echo): `_warn_on_risky_config` now warns when `path_allowlist=["*"]` and neither per-host `restricted_paths` nor env `SSH_RESTRICTED_PATHS` cover `/etc/shadow`, `/etc/sudoers`, `/etc/ssh`. Warning, not error -- some hosts genuinely don't have these.
  - `ssh_exec_run` last-resort docstring: tool docstring + SKILL.md now lead with "Last-resort tool" and a 14-row mapping table (`mkdir -p ...` -> `ssh_mkdir`, etc.) so the LLM stops reaching for `ssh_exec_run` when a dedicated wrapper exists.
  - BM25SearchTransform integration (revisits BACKLOG line 139): now reachable via `SSH_ENABLE_BM25=true` (default OFF), `SSH_BM25_MAX_RESULTS=8`, `SSH_BM25_ALWAYS_VISIBLE=ssh_host_ping,ssh_host_info,ssh_session_list,ssh_shell_list`. Replaces tools/list with `search_tools` + `call_tool` once 50+ schemas eat too much context per turn.
  - Tool catalog overview at startup: `_log_tool_catalog` emits `tools registered: N total, M visible (after tier+group filters)` plus per-tier and per-group counts. Helps operators verify their `ALLOW_*` / `SSH_ENABLED_GROUPS` actually does what they think.
  - Logging fix for `fastmcp run`: `fastmcp run fastmcp.json` skips our `run_server.main()`, so root logger stayed at WARNING and our INFO lines vanished into Python's `lastResort` handler. Lifespan now attaches a stderr handler to the `ssh_mcp` logger and respects `LOG_LEVEL`.
  - Suite: 309 pass, 2 skipped. 52 tools registered in 8 groups.
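
The TTY hint in the hardening pass above can be pictured roughly as follows. This is a minimal sketch, not the project's code: the function name `tty_hint` and the hint wording are hypothetical, and only the two stderr phrases come from the entry itself.

```python
import re
from typing import Optional

# The two phrases are the ones quoted in the changelog entry; matching them
# case-insensitively is an assumption of this sketch.
_TTY_PATTERNS = re.compile(r"is not a tty|must be run from a terminal", re.IGNORECASE)


def tty_hint(stderr: str) -> Optional[str]:
    """Return a hint when stderr looks like a missing-TTY failure, else None."""
    if _TTY_PATTERNS.search(stderr):
        # Hypothetical wording; the real hint text is not given in this file.
        return "Command expects a TTY; retry with batch-mode flags or use ssh_exec_script."
    return None
```

A caller would attach the returned string to the result's `hint` field only when it is not `None`, so ordinary failures stay unannotated.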
- 2026-04-14 Phase 0 skeleton landed (pyproject, fastmcp.json, src layout, smoke tests).
- 2026-04-14 Phase 1a host configuration landed (models/policy.py, hosts.py, 14 loader tests pass). Python target bumped 3.11–3.13 → 3.11–3.14 (ADR-0013 unchanged; FastMCP 3 still the target).
- 2026-04-14 Phase 1b SSH transport landed: errors, known_hosts loader, agent fingerprint matching, connection opener with ProxyJump, keyed pool with idle reaper. `app.py` split off to break a circular import between `server.py` and tool modules.
- 2026-04-14 Phase 1c read-only tools landed: 11 tools registered (`ping`, `host_info`, `disk_usage`, `processes`, `known_hosts_verify`, `session_list/stats`, `sftp_list/stat/download`, `find`). Tests: 33 pass, 1 skipped (integration, needs live sshd). Phase 1 ✅.
- 2026-04-14 Host blocklist added (ADR-0015): `SSH_HOSTS_BLOCKLIST` env var, deny wins over allow. Resolution centralized in `services/host_policy.py::resolve()`; pool uses `check_policy()` for defense-in-depth. List env vars (`SSH_HOSTS_ALLOWLIST`, `SSH_HOSTS_BLOCKLIST`, `SSH_PATH_ALLOWLIST`, `SSH_COMMAND_ALLOWLIST`) now accept comma-separated strings in addition to JSON arrays. Tests: 48 pass, 1 skipped.
- 2026-04-14 Phase 2 low-access tier landed: `services/path_policy.py` (remote `realpath` + allowlist check), `services/edit_service.py` (structured edit + unidiff patch), 8 tools (`mkdir`, `delete`, `delete_folder`, `cp`, `mv`, `upload`, `edit`, `patch`) tagged `{low-access, group:file-ops}`. All tools SFTP-first with fixed-argv shell fallback; atomic writes via `<path>.ssh-mcp-tmp.<hex>` + `posix_rename`; caps enforced from Settings. Tests: 78 pass, 1 skipped. Phase 2 ✅.
- 2026-04-14 Phase 5 polish landed: tool groups wiring (`SSH_ENABLED_GROUPS` → per-group `Visibility` with ADR-0016 permissive default), audit log service + `@audited` decorator applied to all 8 low-access tools, FastMCP Skills provider mounted when `skills/` exists (seed runbook at `skills/ssh-incident-response/`), telemetry helper (`span()` + `redact_argv`), README. BM25 search transform skipped (19 tools, under threshold). Tests: 95 pass, 1 skipped.
- 2026-04-15 Feature expansion beyond the original 5 phases (all driven by operator feedback):
  - Docker (22 tools) -- `ssh_docker_ps/logs/inspect/stats/images/compose_ps/compose_logs` (read), `ssh_docker_start/stop/restart/compose_start/compose_stop/compose_restart` (low-access), `ssh_docker_exec/run/pull/rm/rmi/prune/compose_up/compose_down/compose_pull` (dangerous). All tagged `group:docker`; new group added to `ALL_GROUPS`. `SSH_DOCKER_COMPOSE_CMD` env var (default `docker compose`) for legacy-binary hosts. Log tools tighten default `tail` to 50 and default `max_bytes` to 64 KiB to protect LLM context; tool-level `max_bytes` param bounded [1 KiB, 10 MiB].
  - Smart Alerts (`ssh_host_alerts`) -- read-only tool evaluating per-host thresholds (`disk_use_percent_max`, `load_avg_1min_max`, `mem_free_percent_min`, optional `disk_mounts` filter) configured in `[hosts.<name>.alerts]`. Runs `df`, `/proc/loadavg`, `/proc/meminfo` in parallel; returns structured `breaches[]` + `metrics`. No SMTP/Slack/webhook -- caller (LLM or cron) decides what to do with the report.
  - Persistent Shell Sessions -- 4 new tools (`ssh_shell_open/exec/close/list`) with `group:shell`. In-memory `SessionRegistry` tracks cwd across calls; `wrap_command` prefixes `cd <cwd>` + emits `__SSHMCP_STATE__<pwd>` sentinel for cwd-tracking. No real remote PTY. Four-gate story: tier `dangerous` + group `shell` + env `ALLOW_PERSISTENT_SESSIONS` (hides only `open`/`exec`, leaves `list`/`close` for drain) + per-host `persistent_session = true|false`.
  - Smart Deployment (`ssh_deploy`) -- extends `ssh_upload` with automatic pre-deploy backup: if file exists and `backup=True`, SFTP `posix_rename` to `<path>.bak-<UTC-iso8601>` before writing tmp + rename-into-place.
  - Restricted Paths -- per-host `restricted_paths` + env `SSH_RESTRICTED_PATHS` carve out zones inside `path_allowlist` where low-access and sftp-read tools refuse to operate (typical use: SMB-mounted shared data). Exec/sudo tools unaffected (don't go through path policy). New `PathRestricted` error with explicit "use ssh_exec_run/ssh_sudo_exec" pointer.
  - Hooks infrastructure -- `HookRegistry` + `HookEvent` (STARTUP/SHUTDOWN/PRE_TOOL_CALL/POST_TOOL_CALL), bounded per-hook timeout, exception isolation, blocking vs non-blocking emit, `load_external_hooks(registry, module_path)` dotted-path loader. `SSH_HOOKS_MODULE` env points at an operator module exposing `register_hooks(registry)`. Zero hooks registered by default. Side-effect only for now; blocking pre-hooks deferred.
  - Suite: 267 pass, 2 skipped. 52 tools registered in 8 groups.
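
The cwd-tracking idea behind the persistent shell sessions above (prefix `cd <cwd>`, then emit a `__SSHMCP_STATE__<pwd>` sentinel) could look something like this. It is a sketch under assumptions: the exact wrapper text, the exit-status handling, and the `split_state` helper are hypothetical; only the `wrap_command` name and the sentinel string appear in this file.

```python
import shlex

SENTINEL = "__SSHMCP_STATE__"


def wrap_command(cwd: str, command: str) -> str:
    """Build one shell line that runs `command` in `cwd` and reports the final cwd."""
    return (
        # Brace group keeps `cd` inside the command visible to the final $PWD.
        f"cd {shlex.quote(cwd)} && {{ {command}\n}}; "
        "__rc=$?; "
        # Sentinel goes on its own line so it can be split off the real output.
        f"printf '\\n{SENTINEL}%s\\n' \"$PWD\"; "
        "exit $__rc"
    )


def split_state(stdout: str, fallback_cwd: str) -> tuple[str, str]:
    """Separate command output from the trailing sentinel line."""
    head, sep, tail = stdout.rpartition("\n" + SENTINEL)
    if not sep:
        # No sentinel found (e.g. output truncated): keep the previous cwd.
        return stdout, fallback_cwd
    return head, tail.strip() or fallback_cwd
```

The registry would store the second element of `split_state` as the session's cwd for the next call, which is how state survives without a remote PTY.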
- 2026-04-15 INC-006 -- critical bug found during Windows end-to-end verification: `KnownHosts.fingerprint_for` unpacked 3 values from asyncssh's 7-tuple `match()` return and caught `ValueError`, silently returning `None` for every lookup. Impact: `ssh_host_ping` never reported the pinned fingerprint; `ssh_known_hosts_verify` always reported `expected_fingerprint=None`; and the INC-007 fix (UnknownHost vs HostKeyMismatch disambiguation) silently degraded to "always UnknownHost" -- a real host-key rotation or MITM would have been mislabeled. Fix: use tuple indexing + stop swallowing `ValueError`. Regression guard: `test_fingerprint_for_resolves_real_entry` generates a real ed25519 key via `asyncssh.generate_private_key` and round-trips it through the full match path. Suite: 178 pass, 2 skipped.
- 2026-04-15 Internal review pass (all 13 findings addressed, detailed status in INCIDENTS.md). Highlights: (INC-003) `SSH_ENABLED_GROUPS` added to Settings -- without this, every server startup would crash. Regression guard `test_config_has_every_field_lifespan_reads`. (INC-004) `assert re_canonical == canonical` before `rm -rf` replaced with `raise WriteError` -- survives `python -O`. (INC-005) streaming `_pump` updates byte counters nonlocally per chunk so `stdout_truncated` is accurate on timeout. (INC-007) `UnknownHost` vs `HostKeyMismatch` now disambiguated by `known_hosts.fingerprint_for` lookup, not exception message text. (INC-008) audit `error` field reduced to exception class name; full text at DEBUG only. (INC-009) `SSH_SUDO_PASSWORD` env-var rejected at startup (hard fail); operators must use `SSH_SUDO_PASSWORD_CMD` or OS keychain. (INC-010) non-UTF-8 files in `ssh_edit`/`ssh_patch` raise clean `WriteError`. (INC-011) streaming `chunk_cb` receives only captured bytes, matching buffer. (INC-012) absolute-path command_allowlist entries now require exact match; basename matching restricted to bare entries. (INC-013) port-range / type / absolute-path validation tests added. (INC-014) magic SFTP error codes replaced with `asyncssh.sftp.FX_*` constants. Suite: 177 pass, 2 skipped.
- 2026-04-15 Phase 4 sudo tier landed (per-call mode): `ssh/sudo.py` builds `sudo -S -p '' -- sh -c/s --` wrappers with shlex-quoted commands and pipes password on stdin; `fetch_sudo_password` priority chain (`SSH_SUDO_PASSWORD_CMD` → `keyring` → `SSH_SUDO_PASSWORD` env → passwordless); startup WARNINGs for env-password and for unsupported `persistent-su` mode; two new tools `ssh_sudo_exec` (allowlist-checked) + `ssh_sudo_run_script` (stdin body) both tagged `{dangerous, sudo, group:sudo}` and `@audited(tier="sudo")`; 2 new per-tool skills (ASCII-guarded); README "Sudo" section added with recommended scoped `NOPASSWD` sudoers pattern. 24 tools total registered. Tests: 159 pass, 2 skipped. All 5 phases complete.
- 2026-04-14 External review (Findings: trust-before-verify, read-scope gap, host-policy ambiguity, empty-allowlist footgun, status/planned blur). Tightenings landed: (1) ADR-0017 -- path confinement applies to every path-bearing read tool; `ssh_sftp_list`, `ssh_sftp_stat`, `ssh_sftp_download`, `ssh_find` now route paths through `canonicalize_and_check`. (2) ADR-0018 -- empty `command_allowlist` now fails closed; new `ALLOW_ANY_COMMAND` env flag is the only way to permit arbitrary exec. (3) ADR-0019 -- allow/block rules evaluate on canonical `policy.hostname` only; aliases are pure lookup keys. README rewritten: quickstart §2 and troubleshooting now scan-to-tempfile + verify fingerprint out-of-band before appending to `known_hosts`; Phase 3 marked shipped, Phase 4 sudo mentions watermarked (planned). Regression guard: `tests/test_read_tool_path_confinement.py`. Suite: 142 pass, 2 skipped.
- 2026-04-14 Tool skills authored for every tool (22 per-tool + 1 workflow). Each SKILL.md documents tier/group, inputs, returns, when/when-not, an example, common failures, and related tools. Skills are pure ASCII to work around an upstream FastMCP 3.2.4 bug where `Path.read_text()` is called without `encoding=` (Windows defaults to cp1252). Regression test: `tests/test_skills_ascii.py` fails any non-ASCII byte in any SKILL.md. All 23 skills load via `SkillsDirectoryProvider`. Suite: 134 pass, 2 skipped.
- 2026-04-14 Phase 3 exec tier landed: `ssh/exec.py` (`run` + `run_streaming` with timeout + pkill cleanup), `services/exec_policy.py` (command allowlist check), 3 tools (`ssh_exec_run`, `ssh_exec_script`, `ssh_exec_run_streaming` with `TaskConfig(mode="optional")`). All tagged `{dangerous, group:exec}`, audited, gated by `ALLOW_DANGEROUS_TOOLS`. Startup warns if docket backend is in-memory (ADR-0011). 22 tools registered total. Tests: 110 pass, 2 skipped.
- 2026-04-14 SSH agent integration verified end-to-end against live Pageant on Windows. Fixed two real bugs discovered only by running it: (1) `SSHAgentClient(None)` is wrong -- asyncssh needs `""` for auto-detect; `_resolve_socket` now returns `""` on Windows when `SSH_AUTH_SOCK` is unset. (2) Agent keys are `SSHAgentKeyPair`, not `SSHKey`, and lack `get_fingerprint()` -- fingerprints now computed from `public_data` via SHA-256. `pywin32` added as a Windows-only dependency for asyncssh's Pageant backend. New live smoke test (`test_live_agent_returns_well_formed_fingerprints`) runs against the operator's real agent when one is present. Tests: 95 pass, 2 skipped.
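
Computing a fingerprint from `public_data` via SHA-256, as the agent entry above describes, can be illustrated with the standard library alone. The sketch assumes the OpenSSH display convention (`SHA256:` prefix, unpadded base64 of the digest over the public key blob); the helper name `sha256_fingerprint` is hypothetical.

```python
import base64
import hashlib


def sha256_fingerprint(public_data: bytes) -> str:
    """Format a public key blob as an OpenSSH-style SHA256 fingerprint."""
    digest = hashlib.sha256(public_data).digest()
    # OpenSSH prints the base64 digest without trailing '=' padding.
    encoded = base64.b64encode(digest).decode("ascii").rstrip("=")
    return "SHA256:" + encoded
```

A 32-byte digest always encodes to 43 base64 characters once padding is stripped, which is the kind of shape a "minimum length" check on configured fingerprints can rely on.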
- `pyproject.toml` with PEP 621 metadata + hatchling build backend, console script, dev/tasks/telemetry optional-dependencies
- `fastmcp.json` pointing at `src/ssh_mcp/server.py:mcp_server`
- `.env.example`, `.gitignore`
- `src/ssh_mcp/` layout: `__init__`, `__main__`, `run_server`, `server`, `lifespan`, `config`
- Subpackage stubs: `ssh/`, `services/`, `models/`, `tools/`
- `ConnectionPool` no-op stub
- `models/results.py` -- `ExecResult`, `StatResult`, `WriteResult`
- Tier gating wired in lifespan via `Visibility(False, tags={...})`
- Smoke test: imports, server constructs, default-deny config
- `models/policy.py` -- `AuthPolicy` + `HostPolicy` Pydantic models
- `hosts.py` -- `load_hosts(path)` TOML loader with defaults merging
- Validate ProxyJump references + detect circular chains at load
- Refuse `method = "password"` unless `ALLOW_PASSWORD_AUTH=true`
- Fingerprint shape check (`SHA256:` prefix + minimum length)
- Path allowlist entries must be absolute
- Warn on empty `command_allowlist` + `ALLOW_DANGEROUS_TOOLS=true`
- Wire loader into `ssh_lifespan` → `lifespan_context["hosts"]` + `host_allowlist`
- Tests (14): minimal host, defaults merge, unknown proxy_jump, circular chain, relative path, password refusal, password allowed with flag, bad fingerprint, malformed TOML, merged allowlist union, proxy_jump list, empty command_allowlist warning, None/missing file
- `services/host_policy.py` resolves `HostPolicy` by name at tool-call time (ADR-0019; `resolve(name, hosts, settings)` in `services/host_policy.py`)
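
The load-time ProxyJump validation listed above (unknown references plus circular chains) is a small graph check. The following is only a sketch: `validate_proxy_jumps` and its input shape (host name mapped to its list of jump hosts) are assumptions, not the loader's actual interface.

```python
def validate_proxy_jumps(jumps: dict[str, list[str]]) -> None:
    """Raise ValueError on unknown proxy_jump references or circular chains."""
    for host, hops in jumps.items():
        for hop in hops:
            if hop not in jumps:
                raise ValueError(f"{host}: unknown proxy_jump {hop!r}")

    visiting: set[str] = set()  # hosts on the current DFS path
    done: set[str] = set()      # hosts already proven cycle-free

    def visit(host: str, trail: list[str]) -> None:
        if host in done:
            return
        if host in visiting:
            chain = " -> ".join(trail + [host])
            raise ValueError(f"circular proxy_jump chain: {chain}")
        visiting.add(host)
        for hop in jumps[host]:
            visit(hop, trail + [host])
        visiting.discard(host)
        done.add(host)

    for host in jumps:
        visit(host, [])
```

Failing at load rather than at connect time means a misconfigured bastion chain surfaces once, at startup, instead of as a recursion error inside the pool.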
- `ssh/errors.py` -- `UnknownHost`, `HostKeyMismatch`, `HostNotAllowed`, `AuthenticationFailed`, `AgentFingerprintNotFound`, `ConnectError`, `CommandTimeout`, `PathNotAllowed`
- `ssh/known_hosts.py` -- loader; missing file → empty (warn); `fingerprint_for(host, port)` helper
- `ssh/agent.py` -- `list_agent_fingerprints()`, `select_agent_key(agent_path, fingerprint)` via `asyncssh.SSHAgentClient`
- `ssh/connection.py` -- `open_connection()` honoring `HostPolicy.auth` (agent / key / password); `passphrase_cmd` and `password_cmd` via subprocess
- `ssh/pool.py` -- keyed `_Entry` pool with per-key `asyncio.Lock`, proactive 60 s idle reaper, `stats()` + `close_all()`
- ProxyJump / bastion chaining recursive through the pool (`asyncssh.tunnel=`)
- `app.py` split from `server.py` to break the circular import between tools and the FastMCP instance
- Startup fail-fast: agent reachable + fingerprint present when `identity_fingerprint` set (deferred to Phase 1b+ alongside real-host integration)
- `ssh_host_ping` -- pool-backed probe; reports reachable, auth_ok, latency_ms, server_banner, known_host_fingerprint
- `ssh_host_info` -- parallel fixed argv `uname -a` + `cat /etc/os-release` + `uptime`; parsed
- `ssh_host_disk_usage` -- fixed argv `df -PTh`, parsed to `DiskUsageEntry[]`
- `ssh_host_processes` -- fixed argv `ps -eo pid,user,pcpu,pmem,comm --sort=-pcpu`; top-N capped
- `ssh_known_hosts_verify` -- reuses pool acquire; reports expected vs live fingerprint + error reason
- `ssh_session_list`, `ssh_session_stats` -- pool introspection
- `ssh_sftp_list` -- SFTP `listdir` + per-entry `lstat`; offset/limit pagination; `has_more`
- `ssh_sftp_stat` -- SFTP `lstat` + `readlink` for symlinks
- `ssh_sftp_download` -- SFTP read, size-capped, base64-encoded
- `ssh_find` -- fixed argv `find <path> -maxdepth N -type T -name PATTERN`; pattern regex-validated; results capped
- `tools/_context.py` helper -- `pool_from`, `settings_from`, `known_hosts_from`, `resolve_host`
- Result models added: `PingResult`, `HostInfoResult`, `DiskUsageResult/Entry`, `ProcessListResult/Entry`, `SftpListResult/SftpEntry`, `FindResult`, `DownloadResult`
- Tool-registration test: all 11 tools present, carry `{safe, read, group:*}` tags
- Docker-compose fixture at `tests/integration/docker-compose.yml` + skipif-marked integration placeholder
- Populated integration tests against `linuxserver/openssh-server` -- scaffolded but deferred (needs a repeatable keypair fixture)
- `services/path_policy.py` -- `canonicalize_and_check(conn, path)` via `realpath [-m] --`; `reject_bad_characters` for NUL/control; `effective_allowlist` unions per-host + env
- `services/edit_service.py` -- `apply_edit` (single/all occurrence) + `apply_unified_diff` via `unidiff`, rejects on context/removal mismatch
- `ssh_mkdir` -- SFTP mkdir with `parents=True` walk
- `ssh_delete` -- SFTP remove; rejects directories
- `ssh_delete_folder` -- SFTP rmdir (non-recursive); SFTP walk with `SSH_DELETE_FOLDER_MAX_ENTRIES` cap + `rm -rf --` fallback (recursive); `dry_run` returns would-delete list
- `ssh_cp` -- fixed argv `cp -a -- src dst`
- `ssh_mv` -- SFTP `posix_rename` first; fallback to `mv --` on cross-FS errors
- `ssh_upload` -- base64-decoded payload written to `<path>.ssh-mcp-tmp.<hex>` then `posix_rename`; size-capped at `SSH_UPLOAD_MAX_FILE_BYTES`
- `ssh_edit` -- download → `apply_edit` → atomic write; preserves mode; size-capped at `SSH_EDIT_MAX_FILE_BYTES`
- `ssh_patch` -- download → `apply_unified_diff` → atomic write; preserves mode
- Tests: 12 path policy tests (NUL/control/empty reject, allowlist normalization, subdir vs sibling, `realpath`/`realpath -m` scripting via `FakeConn`, symlink-out-of-allowlist blocked, traversal blocked), 14 edit_service tests (single vs all mode, missing/duplicate/no-op rejection, patch context/removal mismatch, multi-file rejection, multi-hunk in-order)
- All 8 tools carry `{low-access, group:file-ops}`; visible only when `ALLOW_LOW_ACCESS_TOOLS=true`
- Low-access tool registration + tag test (expanded `test_tool_registration.py`)
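
The "subdir vs sibling" case in the path policy tests above is the classic prefix-match pitfall. A minimal sketch of a boundary-aware check follows; it assumes the path has already been canonicalized remotely (for example via `realpath`), and `is_within_allowlist` is a hypothetical name rather than the service's real function.

```python
import posixpath


def is_within_allowlist(canonical: str, allowlist: list[str]) -> bool:
    """True if an already-canonical POSIX path sits at or under an allowlist root."""
    for root in allowlist:
        root = posixpath.normpath(root)
        if root == "/":
            return True
        # Compare on a path-component boundary so /opt/app does not admit /opt/app-old.
        if canonical == root or canonical.startswith(root + "/"):
            return True
    return False
```

Checking the canonical path rather than the caller's raw string is what lets symlink escapes and `..` traversal be rejected by the same comparison.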
- `ssh/exec.py::run` -- `asyncssh.conn.run` wrapped with `asyncio.wait_for` for per-call timeout; returns full `ExecResult` with `exit_code`, `stdout`, `stderr`, `duration_ms`, `timed_out`, `killed_by_signal`; non-zero exit is data, not raise
- `ssh/exec.py::run_streaming` -- `create_process` with async pumps for stdout/stderr; `chunk_cb` callback for every non-empty read; caps + truncation flag; terminates + pkills on timeout
- Timeout + pkill cleanup -- `timeout 3s pkill -f -- <pattern>` with `shlex.quote`, bounded at 5 s so stuck pkill can't compound the problem
- `services/exec_policy.py::check_command` -- `shlex.split` + first-token match; accepts bare program or absolute path by basename; empty allowlist = no restriction; malformed shell syntax rejected
- `ssh_exec_run` -- accepts a single command string; caller owns quoting
- `ssh_exec_script` -- script body via stdin to `sh -s --`; never in argv
- `ssh_exec_run_streaming` -- `task=TaskConfig(mode="optional", poll_interval=3s)`; emits latest stdout/stderr line via `Progress.set_message`
- All 3 tools tagged `{dangerous, group:exec}`, audited with `@audited(tier="dangerous")`
- Redis docket warning -- `_warn_task_backend()` fires at startup when `ALLOW_DANGEROUS_TOOLS=true` + `FASTMCP_DOCKET_URL=memory://` (ADR-0011)
- `fastmcp[tasks]` extra available in dev; `fastmcp-tasks = {extras = ["tasks"]}` recommended in pyproject (docket is installed in `.venv` for tests)
- Tests: 8 exec_policy (allowlist bypass via quoting, absolute path basename match, union dedup), 7 exec.py (truncation, stderr-as-data, signal propagation, timeout triggers pkill, stdin piping, streaming chunk cb)
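
The first-token allowlist check named above can be sketched as below. Note the assumptions: this version follows the later tightenings recorded elsewhere in this file (ADR-0018 fail-closed empty allowlist, INC-012 exact match for absolute-path entries) rather than the original Phase 3 behavior, and the signature with an `allow_any` flag is hypothetical.

```python
import posixpath
import shlex


def check_command(command: str, allowlist: list[str], allow_any: bool = False) -> bool:
    """Return True if the command's first token is permitted by the allowlist."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # malformed shell syntax (e.g. unterminated quote)
    if not tokens:
        return False
    if allow_any:
        return True  # stands in for the ALLOW_ANY_COMMAND escape hatch
    program = tokens[0]
    for entry in allowlist:
        if entry.startswith("/"):
            # Absolute entries must match the invoked program exactly.
            if program == entry:
                return True
        elif posixpath.basename(program) == entry:
            # Bare entries match by basename, so /usr/bin/ls satisfies "ls".
            return True
    return False  # an empty allowlist therefore fails closed
```

Only the first token is inspected, which is why the tool descriptions stress that the caller owns quoting for everything after it.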
- `ssh/sudo.py` -- `build_sudo_wrapper` + `build_sudo_script_wrapper` (shlex-quoted command, `-S -p ''` for password piping, `-n` for passwordless); `run_sudo` + `run_sudo_script` compose with `ssh/exec.py::run`
- Password sources: `fetch_sudo_password` with priority chain `SSH_SUDO_PASSWORD_CMD` → `keyring` service `ssh-mcp-sudo` user `default` → `SSH_SUDO_PASSWORD` env → passwordless
- `warn_if_env_password()` + `warn_if_persistent_mode()` emitted from lifespan at startup when `ALLOW_SUDO=true`
- `ssh_sudo_exec` -- allowlist-checked (same `check_command` as exec tier; respects `ALLOW_ANY_COMMAND`)
- `ssh_sudo_run_script` -- body on stdin after password line, no allowlist check (same rationale as `ssh_exec_script`)
- Both tools tagged `{dangerous, sudo, group:sudo}` and `@audited(tier="sudo")`
- Startup validation: persistent-su mode logs WARNING + falls back to per-call
- Skills: `skills/ssh-sudo-exec/SKILL.md`, `skills/ssh-sudo-run-script/SKILL.md` (ASCII-guarded)
- Tests (14): wrapper shapes passwordless + with-password, quoting defends against `$(...)` and embedded quotes, script wrapper concatenates password + body, password-cmd priority, env fallback, timeout propagation, `run_sudo` + `run_sudo_script` plumb correctly to exec.run
- Persistent-su mode -- deferred. Scoped `NOPASSWD` sudoers entries are the recommended alternative; revisit if operator feedback shows repeated sudo prompts are a bottleneck
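
The wrapper shapes described above (`-S -p ''` when a password is piped, `-n` when passwordless, command shlex-quoted) can be sketched in a few lines. The function name matches the one listed, but the signature and the exact output string are assumptions of this sketch.

```python
import shlex


def build_sudo_wrapper(command: str, with_password: bool) -> str:
    """Compose a sudo invocation that runs `command` through sh -c."""
    # -S reads the password from stdin, -p '' suppresses the prompt text;
    # -n makes sudo fail instead of prompting when no password is available.
    flags = "-S -p ''" if with_password else "-n"
    # shlex.quote keeps $(...) and embedded quotes inert on the outer shell.
    return f"sudo {flags} -- sh -c {shlex.quote(command)}"
```

For instance, `build_sudo_wrapper("echo $(id)", with_password=False)` yields `sudo -n -- sh -c 'echo $(id)'`, so the substitution is evaluated only inside the privileged `sh -c`, never by the unprivileged login shell.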
- `telemetry.py` -- `span()` wrapper degrading to a noop when OTel is not wired + `redact_argv()` for `--password=*` / `--token=*` / `--secret=*` / `--api-key=*` (length preserved)
- `services/audit.py` -- `record()` emits one JSON line to the `ssh_mcp.audit` logger (path + command SHA-256-hashed); `@audited(tier=...)` decorator; applied to all 8 low-access tools
- FastMCP Skills provider mounted (`_mount_skills` in lifespan) when `SSH_SKILLS_DIR` exists; seed runbook at `skills/ssh-incident-response/SKILL.md`
- Tool groups -- `SSH_ENABLED_GROUPS` wired to per-group `Visibility(False, tags={"group:<name>"})` in lifespan. Empty = all groups enabled (ADR-0016). Unknown groups logged and ignored.
- README -- setup, env vars, `hosts.toml` schema, tier flags, tool groups, per-host identity patterns, key rotation runbook
- Config knob: `SSH_SKILLS_DIR` (default `skills/`) for optional skills directory
- Evaluate `BM25SearchTransform` -- landed 2026-04-15 behind `SSH_ENABLE_BM25` (default OFF). 52 tools is past the 30 threshold; opt-in keeps the small-deployment ergonomics intact.
- Wire `telemetry.span()` into `ssh/connection.py`, `ssh/exec.py`, `services/path_policy.py` -- landed 2026-04-17. `ssh.connect`, `ssh.exec` (buffered + streaming variants), and `path.canonicalize` spans attach host / port / exit code / duration without ever recording argv, path content, or auth secrets (redaction posture per `telemetry.py` module docstring). Wiring locked in by `tests/test_telemetry.py::test_*_opens_*_span`.
- Wire `@audited` into exec + sudo tools (Phase 3+4) -- applied via `@audited(tier="dangerous"/"sudo")` in `tools/exec_tools.py` and `tools/sudo_tools.py`
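
The length-preserving redaction attributed to `redact_argv()` above might look like the following. Treat it as a sketch: the four flag prefixes come from this list, while the masking character and the handling of only the `--flag=value` form are assumptions.

```python
# Flag prefixes named in the Phase 5 checklist item.
_SECRET_FLAGS = ("--password=", "--token=", "--secret=", "--api-key=")


def redact_argv(argv: list[str]) -> list[str]:
    """Mask the value of secret-bearing flags while keeping each argument's length."""
    redacted = []
    for arg in argv:
        for flag in _SECRET_FLAGS:
            if arg.startswith(flag):
                # One '*' per secret character, so length is preserved.
                arg = flag + "*" * (len(arg) - len(flag))
                break
        redacted.append(arg)
    return redacted
```

Preserving length is convenient for debugging but leaks the secret's size, which is presumably why the audit-side hashing described later in this file strips the length before hashing.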
- FastMCP 3.2.4 skills loader -- `fastmcp.server.providers.skills.skill_provider` calls `Path.read_text()` in 5 places without `encoding="utf-8"`. On Windows this defaults to cp1252 and fails on any non-ASCII byte. Workaround: pure-ASCII SKILL.md files (enforced by `tests/test_skills_ascii.py`). Revisit when FastMCP adds explicit encoding.
- Host blocklist (`SSH_HOSTS_BLOCKLIST`) with deny-wins precedence, centralized in `services/host_policy.py`
- CSV parsing for list-valued env vars (`SSH_HOSTS_ALLOWLIST`, `SSH_HOSTS_BLOCKLIST`, `SSH_PATH_ALLOWLIST`, `SSH_COMMAND_ALLOWLIST`)
- CI: ruff + mypy + pytest on push
- Version bump on any tool-signature change (strict semver `MAJOR.MINOR.PATCH`)
- Redact `--password=*` / `--token=*` in telemetry and audit -- landed 2026-04-17. Telemetry side: `span()` already attaches no argv (redaction by exclusion). Audit side: new `redact_command_string()` helper for raw-string commands, mirrored to `redact_argv()` for list-form. `@audited` now extracts `command: str` (ssh_exec_run / ssh_exec_run_streaming / ssh_sudo_exec / ssh_docker_exec) and `args: list[str]` (ssh_docker_run) into the audit line; `script:` bodies (ssh_exec_script) are deliberately NOT captured per the tool's stdin-only contract. `record()` redacts BEFORE hashing AND strips the `:N` length suffix so two `--password=X` calls with different X produce the same `command_hash` (dedup-by-shape) instead of leaking the secret value via stable-hash rainbow lookup.
- Lint rule or review check for `shell=True`, f-string commands, `os.system`
- Glob / pattern matching for allowlist + blocklist (DESIGN.md §11 Q4 -- deferred)
- Per-host `blocked = true` flag in `hosts.toml` (currently env-only; revisit if operators ask)
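
Two of the cross-cutting items above, list-valued env vars that accept either JSON arrays or comma-separated strings and deny-wins host resolution, fit in a short sketch. Both helper names are hypothetical, and treating a host as allowed only when it appears in the allowlist is an assumption; the file does not spell out empty-allowlist semantics.

```python
import json


def parse_list_env(raw: str) -> list[str]:
    """Parse a list-valued env var given as a JSON array or a comma-separated string."""
    raw = raw.strip()
    if not raw:
        return []
    if raw.startswith("["):
        return [str(item) for item in json.loads(raw)]
    return [part.strip() for part in raw.split(",") if part.strip()]


def host_allowed(hostname: str, allowlist: list[str], blocklist: list[str]) -> bool:
    """Deny wins: a blocklisted host is refused even if it is also allowlisted."""
    if hostname in blocklist:
        return False
    return hostname in allowlist
```

Evaluating the blocklist first keeps an emergency deny effective without requiring anyone to edit the allowlist at the same time.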
- Hooks system (pre/post-connect, pre/post-command)
- Workflow tools (backup/restore, db_dump, deploy) — exposed via Skills instead
- Port/X11 forwarding, tunneling
- Windows target hosts
- 5a -- `ssh_host_alerts` typed result -- `HostAlertsResult` + `AlertBreach` Pydantic models replace the previous `dict[str, Any]` return. `extra="forbid"` on both models; LLM gets schema validation. Aligns with ADR-0025 / INC-046 consistency. (id: sprint5-alerts-typed)
- 5b -- Output sanitizer reach extension -- `output_warnings: list[str]` added to `HostInfoResult` and `UserInfoResult`. `ssh_host_info` scans `uname`, `uptime`, and each `os_release` value; `ssh_user_info` scans `gecos` (attacker-controllable via `chfn` on shared boxes). Strings unchanged; warnings only (INC-058 pattern). (id: sprint5-sanitizer-reach)
- 5c -- Agent-notes hygiene docs -- `skills/ssh-host-notes-append/SKILL.md` extended with "What is SAFE vs UNSAFE to write" section addressing the self-reinforcing-channel risk from INC-060 ping auto-injection. (id: sprint5-notes-hygiene)
- 5d -- `ssh_link` internal refactor -- 152-line tool split into 3 mode helpers (`_create_symbolic_link`, `_create_hard_link_followed`, `_create_hard_link_unfollowed`). Tool body now dispatch + `WriteResult` assembly only. Behavior 1:1. Largest cosmetic change of the sprint. (id: sprint5-link-refactor)
- 5e -- `session_tools.py` merged into `shell_tools.py` -- `ssh_session_list` moved next to `ssh_shell_list` (semantic siblings). `session_tools.py` deleted. `server.py`, `test_audited_coverage.py`, e2e test, and DESIGN.md file-tree updated. Closes the deferred Sprint 4 step. (id: sprint5-session-merge)
- 14 new tests across all five items.
- Sprint 4 -- SOC refactor: `host_tools.py` split (v1.5.0) -- `tools/host_tools.py` trimmed 909→695 lines; new `services/host_notes.py` (public API: `HOST_NOTES_ALIAS_RE`, `either_notes_present`, `resolve_sidecar_path`, `read_sidecar`, `atomic_write_sidecar`) + new `tools/host_notes_tools.py` (3 notes tools: `ssh_host_notes`, `ssh_host_notes_append`, `ssh_host_notes_set`). No tool surface change. `server.py` registers `host_notes_tools`. Addresses code review M6+M7. Step 6 (session_tools merge) deferred due to e2e import coupling. (id: soc-refactor-sprint4)
- Sprint 2 -- Dead-code purge + OTEL_ENABLED wire-up (v1.3.0) -- Removed 9 confirmed-dead items: `src/ssh_mcp/ssh/argv.py` (full module), `CommandTimeout` exception class, `_partial_on_timeout` helper, `_DOCKER_VOLUME_FLAGS` constant + re-export, `SessionRegistry.reap_idle` method, `SSH_ALLOW_KNOWN_HOSTS_WRITE` setting, stale "removed" comment block in `sftp_read_tools.py`, and `ssh_session_stats` tool (tests + e2e + SKILL folder deleted). Wired `OTEL_ENABLED` setting to gate `telemetry._get_tracer` (was declared but never read; now returns `None` when false, suppressing all span emissions). Shell-session lifecycle clarified as caller-owned via `ssh_shell_open` / `ssh_shell_close` -- no idle reaper exists or is intended. Addresses reviewer findings M1-M5, L1, L3, H2, H3. (id: dead-code-purge-sprint2)
- Sprint 1 -- Compose-file path-policy tightening (v1.2.0) -- Migrated all 5 `ssh_docker_compose_*` call sites in `tools/docker/lifecycle_tools.py` from `canonicalize_and_check` to `resolve_path` (which bundles canonicalize + allowlist + restricted-zones). Compose files in `restricted_paths` zones now raise `PathRestricted` instead of silently executing. Closes INC-061 (code-review 2026-04-30). 5 new parametrized unit tests in `tests/test_compose_path_policy.py`. Stale `canonicalize_and_check` carve-out reference removed from DESIGN.md §5.6 and TOOLS.md Docker lifecycle section. All 8 compose SKILL.md files updated with `restricted_paths` constraint and `PathRestricted` failure entry. (id: compose-path-policy-sprint1)
- Thread `ResolvedHost` through systemctl wrapper helpers -- T1 (arjancodes sprint) preserved the double-resolve pattern at `systemctl_tools.py` lines 424, 452, 478, 505, 562, 596, 620, 686 to keep scope tight. Each site calls `resolve_host(ctx, host).policy` then passes `HostPolicy` into a helper that internally calls `resolve_host` again. The same redundancy exists in `ssh_cp` / `ssh_mv` / `ssh_docker_exec`. Mechanical cleanup: thread the `ResolvedHost` returned by the outer `resolve_host` call through to each helper, removing the inner re-resolution. Low priority; no behavior change. (id: resolved-host-thread-systemctl)
- Adopt `SshTransport` Protocol at caller signatures -- T2 (arjancodes sprint, pending merge from worktree `agent-a51fd28feb5f56992`) added `src/ssh_mcp/ssh/protocols.py` with the `SshTransport(Protocol)` interface covering `run`, `start_sftp_client`, `close`, and `wait_closed`. No callers were migrated (intentionally -- T2 was type-declaration only). Future sprint: update `pool.py` / `exec.py` / `connection.py` parameter signatures from `asyncssh.SSHClientConnection` to `SshTransport` so the paramiko-fallback hook described in AGENTS.md §6.5 has a concrete type to plug into. (id: ssh-transport-protocol-callers)
- Per-host group overrides (`hosts.<name>.groups = [...]`) -- revisit if operators ask
- APT package tools -- new `pkg` group -- 3 new read-tier tools: `ssh_apt_list` (apt list with installed/upgradable/all modes + glob filter), `ssh_apt_search` (apt-cache search by name + description), `ssh_apt_show` (combined apt-cache show + policy for one package). New files: `models/apt.py`, `services/apt_parser.py`, `tools/apt_tools.py`. POSIX-only; non-Debian hosts receive clean `PlatformNotSupported` via apt-binary probe. Pattern/package argv-validated. All three carry `@audited(tier="read")`, tags `{safe, read, group:pkg}`. 3 new SKILL.md files under `skills/ssh-apt-list/`, `skills/ssh-apt-search/`, `skills/ssh-apt-show/`. 87 new tests; 968 unit tests passing total. (id: sprint6-apt-pkg-group)