feat(enterprise): proxy/registry mode + security hardening + e2e tests#38
Open
dgokeeffe wants to merge 10 commits into
Conversation
… mode)

Adds the single source of truth for the enterprise env-var contract that redirects every external reach in CoDA (PyPI, npm, GitHub, claude.ai, Hermes git URL) to internal mirrors (JFrog, Nexus, GitHub Enterprise). Defaults preserve current behaviour entirely — non-enterprise deployments see no behavioural change because every helper falls back to the upstream public URL when its env var is unset.

This is PR 1 of 4 in the enterprise-hardening series. The module is unwired (no callers yet) — that lands in PR 2 (install scripts + app.py:run_setup integration).

- enterprise_config.py: is_enabled, proxy_env, npm_env, uv_env, subprocess_env, write_npmrc, mirror_github_release, mirror_github_api, claude_installer_url, hermes_pip_url, deepwiki_mcp_url, exa_mcp_url, startup_banner (secret-masked), bootstrap, doctor.
- tests/test_enterprise_config.py: 72 tests covering env-var permutations, URL rewriting, secret masking, idempotent npmrc writes, doctor with injected http_get.
- docs/enterprise.md: operator-facing config matrix, JFrog mirror conventions, sample app.yaml, troubleshooting, known-gotcha note about the requests-from-GitHub override in pyproject.toml.

Co-authored-by: Isaac
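The fallback contract can be sketched as follows. Helper and variable names mirror the commit message, but the bodies are illustrative assumptions, not the actual enterprise_config.py:

```python
import os

# Illustrative sketch of the env-var contract: each helper falls back to the
# public upstream URL when its env var is unset, so non-enterprise
# deployments see no behavioural change.

_DEFAULT_CLAUDE_INSTALLER_URL = "https://claude.ai/install.sh"

def claude_installer_url() -> str:
    # An empty string counts as unset, so a blank app.yaml entry stays
    # on the public upstream.
    return os.environ.get("CLAUDE_INSTALLER_URL") or _DEFAULT_CLAUDE_INSTALLER_URL

def is_enabled() -> bool:
    # Enterprise mode is "on" as soon as any contract var is set
    # (subset of vars shown for brevity).
    return any(os.environ.get(v) for v in (
        "CLAUDE_INSTALLER_URL", "GITHUB_API_BASE",
        "GITHUB_RELEASE_MIRROR", "HERMES_PIP_URL",
    ))
```

With no vars set, `claude_installer_url()` returns the upstream URL and `is_enabled()` is false, which is the "defaults preserve current behaviour" property the tests pin down.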
PR 2 of 4 in the enterprise-hardening series — turns the env-var contract introduced in PR 1 into actual behaviour. With no enterprise env vars set the behaviour is unchanged; setting any var redirects the corresponding reach to the internal mirror.

- app.py: call enterprise_config.bootstrap() at the top of run_setup() so ~/.npmrc is written and derived vars (npm_config_registry, CURL_CA_BUNDLE, proxy mirrors, etc.) land in os.environ before any subprocess fires. Every _run_step inherits them automatically via `env = os.environ.copy()`.
- install_databricks_cli.sh, install_gh.sh, install_micro.sh: replace hardcoded api.github.com / github.com with GH_API / GH_RELEASES vars derived from GITHUB_API_BASE / GITHUB_RELEASE_MIRROR (public fallback). Three-line patch each; no other behaviour touched.
- enterprise_config: extend proxy_env() to mirror REQUESTS_CA_BUNDLE into CURL_CA_BUNDLE + SSL_CERT_FILE so install_*.sh pick up the corporate root CA without explicit --cacert handling. 2 new tests, 74 total passing.
- scripts/enterprise_doctor.py + `make enterprise-doctor`: pre-deploy reachability check. Probes every configured target and reports PASS/FAIL. Uses .venv/bin/python directly when available so it doesn't itself trigger a uv resolve (the failure mode it's meant to diagnose).

Default-behaviour smoke: `make enterprise-doctor` with no vars set prints the banner and "No enterprise targets configured" — no network calls, no errors. `uv run python -c "import app"` still imports cleanly.

Co-authored-by: Isaac
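The bootstrap-then-inherit flow can be sketched like this. `NPM_REGISTRY_URL` is a hypothetical input var used only for illustration; the CA-bundle mirroring follows the commit notes:

```python
import os
import subprocess

def bootstrap():
    # Sketch, not the real app.py wiring: push derived vars into os.environ
    # once, up front, before any subprocess fires.
    registry = os.environ.get("NPM_REGISTRY_URL")  # hypothetical name
    if registry:
        os.environ.setdefault("npm_config_registry", registry)
    ca = os.environ.get("REQUESTS_CA_BUNDLE")
    if ca:
        # Mirror the CA bundle so curl-based install scripts pick up the
        # corporate root CA without explicit --cacert handling.
        os.environ.setdefault("CURL_CA_BUNDLE", ca)
        os.environ.setdefault("SSL_CERT_FILE", ca)

def run_step(cmd):
    # Every step inherits the enterprise vars automatically because the
    # child env is a copy of os.environ taken after bootstrap() ran.
    return subprocess.run(cmd, env=os.environ.copy(),
                          capture_output=True, text=True)
```

Because `run_step` copies `os.environ` at call time, any var `bootstrap()` sets reaches every subsequent subprocess with no per-script plumbing.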
PR 3 of 4 in the enterprise-hardening series — the per-script URL knobs. codex/gemini/opencode pick up enterprise behaviour transparently via the ~/.npmrc written by PR 2, so they need no code change. Only Claude and Hermes have their own out-of-npm install paths and need explicit override hooks.

- setup_claude.py: read CLAUDE_INSTALLER_URL via enterprise_config.claude_installer_url() instead of hardcoding https://claude.ai/install.sh. The installer URL is interpolated into a single-quoted curl argument to avoid shell escaping issues.
- setup_hermes.py: read HERMES_PIP_URL via enterprise_config.hermes_pip_url() instead of hardcoding the upstream git URL. The value is the full uv tool install spec, so customers can pass either `hermes-agent @ git+<url>` for a mirrored git repo or `hermes-agent==1.2.3` once Hermes lives in their internal PyPI.
- enterprise_config: tighten the Hermes default to include the `hermes-agent @ ` prefix so the env-var contract is "the full uv install spec" rather than "either a URL or a spec, we figure it out". Test updated to match.

All 74 enterprise_config tests still pass. No change for non-enterprise deployments — both overrides fall back to upstream URLs when their env vars are unset.

Co-authored-by: Isaac
…xamples

PR 4 of 4 in the enterprise-hardening series — purely documentation/UX, no code changes. Customers deploying CoDA into restricted environments now see the available enterprise knobs directly in app.yaml without having to hunt through docs/enterprise.md first. Every var is commented out, so default behaviour is unchanged. The section is grouped under a clear "Enterprise mode" header with a one-line pointer to docs/enterprise.md for the full contract.

Co-authored-by: Isaac
…eview
Three independent reviewer passes (initial CCR, threat-model agent, devil's-
advocate agent) converged on a set of P0/P1 issues in the enterprise
proxy/registry feature. This commit addresses all of them with surgical
fixes plus tests. None of these require architectural rework.
F-01 (P0): Strip NPM_TOKEN and UV_INDEX credentials from terminal session env
- app.py: extract _build_terminal_shell_env() helper; add NPM_TOKEN,
UV_DEFAULT_INDEX, UV_INDEX_*_PASSWORD, UV_INDEX_*_USERNAME, and
npm_config_//host/:_authToken pattern to the strip list.
- tests/test_terminal_env_strip.py: 13 new tests covering every credential
shape the strip pattern needs to catch.
- Why this is the highest-severity finding: bootstrap() pushed these
deployer-level credentials into os.environ, and the terminal session
inherited os.environ.copy(). Any user with a CoDA terminal could read
the JFrog service-account token via `env | grep -i npm`.
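A minimal sketch of the strip helper, assuming the credential shapes named above (the real `_build_terminal_shell_env()` in app.py may differ in structure):

```python
import re

# Credential-name patterns from the F-01 finding: NPM_TOKEN, UV index
# credentials, and npm per-registry auth tokens of the form
# npm_config_//host/:_authToken.
_STRIP_PATTERNS = [
    re.compile(r"^NPM_TOKEN$"),
    re.compile(r"^UV_DEFAULT_INDEX$"),
    re.compile(r"^UV_INDEX_.*_(PASSWORD|USERNAME)$"),
    re.compile(r"^npm_config_//.*:_authToken$"),
]

def build_terminal_shell_env(base_env: dict) -> dict:
    # Drop any var matching a credential pattern before handing the
    # environment to a PTY session.
    return {k: v for k, v in base_env.items()
            if not any(p.match(k) for p in _STRIP_PATTERNS)}
```

Anything not matching the patterns (e.g. `PATH`) passes through untouched, so the terminal stays usable while the registry tokens never reach it.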
F-02 (P1): Shell injection in setup_claude.py via single-quote bypass
- enterprise_config.py: add _validate_url() + _SAFE_URL_RE with a
restrictive char allow-list (no $, (, ), ;, &, single quote, backtick,
whitespace). claude_installer_url() now validates the value.
- setup_claude.py: replace `bash -c f"curl -fsSL '{url}' | bash"` with
a positional-arg form that pipes curl into bash. Even if validation
were bypassed, the URL is never interpreted as shell source.
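The positional-arg pattern can be sketched like this (a generic helper, not the exact setup_claude.py code): the producer's stdout is wired into bash's stdin by the Popen plumbing, so the URL only ever appears as an argv element, never inside a shell string.

```python
import subprocess

def pipe_into_bash(producer_cmd):
    # Equivalent of `producer | bash` without ever building a shell string.
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    bash = subprocess.Popen(["bash"], stdin=producer.stdout,
                            stdout=subprocess.PIPE, text=True)
    producer.stdout.close()  # let bash see EOF when the producer exits
    out, _ = bash.communicate()
    return bash.returncode, out

# The real call would be along the lines of:
#   pipe_into_bash(["curl", "-fsSL", url])
```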
F-03 (P1): Validate GitHub mirror env vars before they reach install_micro.sh
- enterprise_config.py: validate_mirror_env() rejects unsafe values for
GITHUB_API_BASE / GITHUB_RELEASE_MIRROR / CLAUDE_INSTALLER_URL /
HERMES_PIP_URL. Called from bootstrap() so misconfig fails loud at
startup instead of becoming a shell injection inside install_micro.sh's
`eval`.
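An illustrative allow-list validator in the spirit of F-02/F-03. The regex here is an assumption, not the exact `_SAFE_URL_RE` from the patch, but it shows the shape: accept only scheme plus benign URL characters, reject whitespace, quotes, backticks, `$`, `;`, `&`, and parentheses.

```python
import re

# Restrictive character allow-list: nothing the shell can expand or chain.
_SAFE_URL_RE = re.compile(r"^https?://[A-Za-z0-9._~:/@%+=?#-]+$")

class UnsafeUrlError(ValueError):
    """Raised at bootstrap when a mirror env var contains unsafe characters."""

def validate_url(url: str) -> str:
    if not _SAFE_URL_RE.match(url):
        raise UnsafeUrlError(f"unsafe URL rejected: {url!r}")
    return url
```

Failing loud at startup converts a would-be shell injection deep inside an install script's `eval` into an immediate, explainable configuration error.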
F-04 (P1): DEEPWIKI_MCP_URL / EXA_MCP_URL overrides were dead-letter code
- setup_claude.py, setup_opencode.py (both gateway and fallback paths),
setup_hermes.py: replace hardcoded mcp.deepwiki.com / mcp.exa.ai URLs
with calls to enterprise_config.deepwiki_mcp_url() / exa_mcp_url().
Empty env var → omit the MCP entry entirely (the documented behaviour
that previously had no effect).
- setup_claude.py: read-merge-write ~/.claude.json instead of overwriting
(also addresses F-09).
F-05 (P1): ~/.hermes/config.yaml written without chmod 0o600 (plaintext PAT)
- setup_hermes.py: chmod the config file 0o600 after write_text, mirroring
what setup_opencode.py already does for its auth.json. Container UID is
shared across processes, so without chmod the PAT was readable by any
sibling process via `cat ~/.hermes/config.yaml`.
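The fix itself is two lines; a sketch (a stricter variant would pre-create the file with 0o600 via os.open so the secret is never readable even briefly between write and chmod):

```python
from pathlib import Path

def write_private(path: Path, text: str) -> None:
    # Write the secret-bearing config, then immediately restrict it to the
    # owning UID, mirroring what setup_opencode.py does for auth.json.
    path.write_text(text)
    path.chmod(0o600)
```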
F-06 (P1): Hermes git URL was unpinned HEAD — force-push compromise risk
- enterprise_config.py: DEFAULT_HERMES_PIN_SHA constant pinned to commit
8e4f3ba4 (2026-05-08, > 7 days old per cooldown semantics).
- DEFAULT_HERMES_PIP_URL now includes `@<sha>` so `uv tool install`
resolves to a fixed commit, not whatever HEAD points at.
- Bump deliberately on CoDA releases; do not auto-update.
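The resulting pinned spec looks roughly like this. The SHA comes from the commit message; the exact repository URL in enterprise_config.py is an assumption here:

```python
# Pin to a fixed commit so `uv tool install` cannot silently follow a
# force-pushed HEAD. Bumped deliberately on CoDA releases, never auto-updated.
DEFAULT_HERMES_PIN_SHA = "8e4f3ba4"
DEFAULT_HERMES_PIP_URL = (
    "hermes-agent @ git+https://github.com/NousResearch/hermes-agent"
    f"@{DEFAULT_HERMES_PIN_SHA}"
)
```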
Test results: 348/349 pass. Single failure is TestNpmVersionLive::
test_resolves_real_package, a pre-existing live-network flake unrelated
to this change.
Reviewers' verdict before this commit: BLOCK MERGE.
After this commit: all P0/P1 findings closed; F-07 through F-14 follow.
Co-authored-by: Isaac
…wn limits

Adds a "Security model and known limits" section to docs/enterprise.md covering what enterprise mode does and does not verify. This is honest framing of the residual risks that the code-level fixes (previous commit) do not address — they are open ProdSec policy questions, not bugs.

F-12: removes the doc's implication of "fail closed" for write_npmrc — the actual behaviour is "degrade gracefully and log", which is fine but should not be advertised as a hard security guarantee.

F-13 (open question, not fixed): documents that no checksum or signature verification is performed on mirror-served binaries. Today the mirror is trusted by URL alone. Adding SHA256 verification requires the customer to ship a manifest alongside their mirror — a policy decision worth raising in ProdSec review.

F-14 (open question, not fixed): documents that the REQUESTS_CA_BUNDLE override enables MITM of every outbound TLS call. This is by design (needed for corporate TLS interception) but worth surfacing to anyone signing off on an enterprise deployment.

Also surfaces:
- the Hermes pin rotation policy (now pinned to a specific SHA per F-06)
- the absence of mirror allow-listing (operator-controlled URLs)
- the localhost:4000 content-filter proxy as a residual local attack surface

These are the deliberate trade-offs of the proxy/registry-mode threat model. They are NOT bugs in the code — they are the threat model.

Co-authored-by: Isaac
…ht suites
Replaces the manual chrome-devtools MCP verification with two automated,
zero-LLM-token test layers that codify what we verified by hand on daveok.
Layer 1 — Docker apps-like integration test (tests/integration/)
================================================================
What it does: builds an Ubuntu 22.04 + uv + node container that
approximates the Databricks Apps runtime, runs the full setup_*.py +
install_*.sh pipeline inside, then asserts the resulting filesystem /
env / version state.
Files:
- Dockerfile.apps-like — Ubuntu 22.04 + uv + node + 'app' user (uid 1001)
- run_pipeline.sh — copies repo from RO mount to writable /work, runs
enterprise_config.bootstrap() + every setup_*.py + install_*.sh, then
invokes verify.sh under _build_terminal_shell_env() so F-01 (cred
strip) is exercised exactly like a real PTY session.
- verify.sh — shared assertion script. [PASS]/[FAIL] markers for F-01
(env cred strip), F-04 (DEEPWIKI/EXA MCP wiring with default and
empty-override branches), F-05 (Hermes config chmod 0o600), F-06
(Hermes SHA-pinned install completes), and the npm cooldown still
picking stable (non-pre-release) versions for opencode/codex/gemini.
- test_setup_pipeline.py — 3 pytest cases:
1. Happy-path full pipeline + verify.sh in default mode
2. MCP-override mode (DEEPWIKI_MCP_URL=`` && EXA_MCP_URL=``) →
verifies servers actually get omitted (the F-04 dead-letter
regression the first review caught)
3. Malicious mirror env (GITHUB_API_BASE with shell metacharacters)
→ verifies validate_mirror_env() rejects at bootstrap
Runs locally and in CI. Skips cleanly when Docker isn't available.
Wall time: ~3-5 min for first build; ~30s on cache hits.
Layer 2 — Playwright e2e against live deployed app (tests/e2e/)
================================================================
What it does: drives a real CoDA deployment (default profile: daveok)
via Playwright. Reuses stored SSO state to skip the Microsoft Entra
flow on each run, mints a fresh PAT, fills the PAT prompt, waits for
'Ready', then runs verify.sh inside the live terminal session via
/api/input + DOM scrape (same pattern the chrome-devtools MCP session
used by hand). Parses [PASS]/[FAIL] markers and asserts the exit code.
Files:
- conftest.py — fixtures for app URL resolution, PAT minting,
storage_state injection. Skips the whole module if auth.json isn't
recorded or the Databricks CLI isn't authed.
- test_live_security.py — drives the live flow end-to-end.
- README.md — one-time setup walkthrough (e2e-auth + browser install).
- auth.json is gitignored (contains workspace cookies).
Wall time: ~1-2 min per test. LLM tokens per run: zero.
Makefile additions
==================
- `make test` — unit tests only (~3 min). Excludes integration + e2e.
- `make integration-test` — Docker-based pytest (3-5 min).
- `make e2e-test PROFILE=daveok` — Playwright against live app.
- `make e2e-auth PROFILE=daveok` — one-time SSO session recording.
Dev deps
========
Added a [dependency-groups] dev table with playwright + pytest-playwright
+ pytest-timeout. Install with `uv sync --group dev`. Tests skip cleanly
when not installed — `playwright` is only required for the e2e suite.
Test results
============
348/349 pass for `make test`. The one failure is
TestNpmVersionLive::test_resolves_real_package — a pre-existing
network-dependent flake unrelated to this change. Docker image builds
cleanly. Pytest collects all 3 integration tests + the e2e test.
Token math
==========
Authoring this took LLM tokens (one-time cost). Every subsequent run
of `make integration-test` or `make e2e-test` is zero LLM tokens —
the tests run themselves in CI.
Co-authored-by: Isaac
Four improvements after running the integration test against a real
Databricks-employee laptop network:
1. Pre-check pypi.org reachability from inside a container BEFORE running
the full pipeline. Corporate networks (notably Databricks-employee
laptops) block pypi.org at the egress proxy with an internal-blocklist
policy — the test would otherwise burn 3 minutes failing inside pip
with confusing "no matching distribution" errors. Now skips in 2s with
a clear reason: "pypi.org not reachable from container (... likely
blocked by corporate proxy). Run on a non-corporate network or in CI."
Uses the already-built apps-like image when available (saves ~30s vs
pulling ubuntu:22.04 + apt install curl). Falls back to curlimages/curl
if the apps image isn't built yet.
2. Switch from `uv venv` to `python3 -m venv` in run_pipeline.sh — `uv venv`
skips pip by default, which caused `pip install` to resolve to the
*system* Python's pip (in user-install mode without CA bundle env), so
package resolution silently broke under corporate TLS interception.
`python3 -m venv` seeds pip into the venv.
3. Use `pip install --no-build-isolation` so pip's isolated build-deps
subprocess (which doesn't inherit our CA bundle env) doesn't fail
trying to reach pypi for setuptools. Pre-install setuptools+wheel
upfront in the venv so build-isolation isn't needed.
4. Add `-rs` to the pytest invocation so skip reasons are visible in
`make integration-test` output.
Verified behaviour on my Databricks-employee laptop:
- `make integration-test` cleanly skips with the pypi-blocked reason
- The test will run in CI (where pypi is reachable) and in customer
environments evaluating CoDA (where they have pypi access or an
internal mirror)
The Playwright e2e test (tests/e2e/) is unaffected — it talks to the
already-deployed app, doesn't need pypi from a local container.
Co-authored-by: Isaac
…orm pin
Five substantial fixes to the Docker integration test after iterating on
a real Databricks-employee laptop network. The test infrastructure is
now correct; current host failures are environmental (Docker daemon
storage corruption), not design bugs.
1. Forward the host's configured PyPI proxy into the container
================================================================
The integration test couldn't reach pypi.org from inside Docker on a
Databricks-employee laptop — corporate proxy explicitly blocks pypi with
an internal-blocklist policy. Solution: detect the host's configured
pypi index (UV_DEFAULT_INDEX / PIP_INDEX_URL / ~/.pip/pip.conf /
~/.config/uv/uv.toml) and forward it as PIP_INDEX_URL + UV_DEFAULT_INDEX
into the container.
This *is* the enterprise feature's contract — operators on firewalled
networks configure an internal proxy via these env vars, and our pipeline
already supports that path. The test now exercises the same flow.
New helpers in test_setup_pipeline.py:
- `_host_pypi_index()` — discovers proxy from env vars / pip.conf / uv.toml
- `_pypi_reachable_from_container()` — probes whichever index is configured
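A sketch of the discovery helper, under the precedence the notes describe (env vars win, then pip.conf; the uv.toml branch is omitted for brevity, and the real `_host_pypi_index()` may differ):

```python
import configparser
import os
from pathlib import Path

def host_pypi_index():
    # Explicit env vars take precedence over on-disk config.
    for var in ("UV_DEFAULT_INDEX", "PIP_INDEX_URL"):
        value = os.environ.get(var)
        if value:
            return value
    # Fall back to the classic pip.conf [global] index-url.
    pip_conf = Path.home() / ".pip" / "pip.conf"
    if pip_conf.exists():
        parser = configparser.ConfigParser()
        parser.read(pip_conf)
        return parser.get("global", "index-url", fallback=None)
    return None  # no proxy configured; probe pypi.org directly
```

Whatever index this returns is then forwarded into the container as both `PIP_INDEX_URL` and `UV_DEFAULT_INDEX`, so the containerised pipeline resolves packages through the same proxy the host does.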
2. Reorder run_pipeline.sh: bootstrap BEFORE install scripts
================================================================
Previously: Stage 1 (pip install) → Stage 2 (install_*.sh) → Stage 3a
(bootstrap). This meant install_micro.sh / install_gh.sh / install_dbcli.sh
used the operator-supplied GITHUB_API_BASE / GITHUB_RELEASE_MIRROR env
vars BEFORE bootstrap had a chance to validate them.
Production app.py:run_setup() calls bootstrap() FIRST, before any
_run_step(). The test pipeline now matches that order: bootstrap()
validates the env, then install_*.sh runs with confidence the URLs are
shell-safe. With `set -e` in effect, a bad URL now fails at bootstrap
with a useful UnsafeUrlError, not deep inside install_gh.sh's eval.
Also wrap install_*.sh calls in `|| echo "(failed)"` so a single failure
doesn't kill the whole pipeline (verify.sh will catch it via missing
binaries).
3. Force linux/amd64 platform on build + run
================================================================
Databricks Apps runs amd64, AND the install_*.sh scripts hardcode
`linux_amd64` GitHub release URLs. On Apple Silicon hosts the container
defaults to arm64 and the install downloads 404. Both `docker build`
and `docker run` now pass `--platform linux/amd64` explicitly.
(Tried `FROM --platform=linux/amd64` in the Dockerfile but Docker emits
a const-platform warning — moved to the CLI flags instead.)
4. Lightweight bootstrap-only test
================================================================
`test_unsafe_mirror_url_rejected_at_bootstrap` used to run the full
~5-min pipeline just to verify bootstrap rejection. Pytest-timeout
killed it before output was captured. Now runs ONLY the bootstrap call
in isolation (no pip install, no install scripts) — completes in ~10s,
asserts on UnsafeUrlError appearing and exit code != 0.
The other two tests (full pipeline + MCP override) still run the whole
flow because they need to assert on the *final* state.
5. Timeout bumps
================================================================
Per-test docker run timeout 600s → 900s. Stage 1 pip install + Hermes
git fetch + npm installs add up; the previous ceiling was tight.
What's verified vs what's still gated
================================================================
- Direct `docker run python3 -c "enterprise_config.bootstrap()"` with a
malicious GITHUB_API_BASE produces the expected UnsafeUrlError
(verified by hand outside pytest — see test docstring).
- Full pipeline test was blocked at the apps_like_image build by a
Docker daemon storage corruption issue on the test machine
("input/output error" reading containerd blob store). This is a
Docker Desktop environmental issue, not a test design bug.
The test infrastructure will run in:
- CI runners (uncorrupted Docker, unrestricted pypi)
- Other developer machines once Docker Desktop is restarted
- Customer environments evaluating CoDA (with their own pypi proxy)
Co-authored-by: Isaac
Verified end-to-end against daveok — 7/7 verify checks PASS in 11 seconds,
zero LLM tokens per run.
Four substantive fixes from running the test against a live deployment:
1. /api/sessions returns a bare list, not {"sessions": [...]}
================================================================
The endpoint returns a JSON array directly. The original test code
indexed into a non-existent 'sessions' key and crashed.
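A defensive parse for this response shape (a sketch, not the actual test code): accept the bare array the live endpoint returns, while tolerating a wrapped shape in case the response ever changes.

```python
def parse_sessions(payload):
    # /api/sessions returns a bare JSON array on the live app; the original
    # test wrongly assumed {"sessions": [...]} and crashed on the list.
    if isinstance(payload, list):
        return payload
    return payload.get("sessions", [])
```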
2. Drive via /api/input + /api/output, not xterm DOM scrape
================================================================
xterm only renders the currently-attached session. We want to target a
specific bash session regardless of which terminal the UI is showing.
HTTP API drives any session directly and /api/output drains the
per-session buffer (independent of the WebSocket-attached UI view).
The xterm-DOM-scrape approach also broke because the user's *other*
browser tabs polling /api/output were stealing chunks of the buffer.
Polling /api/output directly from the test gives us a fair share of
the output stream.
3. Inline verify.sh content via base64
================================================================
The deployed daveok app predates the addition of
tests/integration/verify.sh — file isn't present in
/app/python/source_code on the container. Inlining lets the test work
against any deployment without requiring a re-deploy to refresh test
fixtures.
The test now reads verify.sh from the local repo, base64-encodes it,
and sends a one-liner that decodes + executes inside the container.
Sidesteps shell-escape issues entirely.
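The inlining trick, sketched (helper name is illustrative): base64 turns the whole script into a single shell-safe token, so none of verify.sh's quotes or metacharacters need escaping in the one-liner sent over /api/input.

```python
import base64
from pathlib import Path

def inline_script_command(script_path) -> str:
    # Encode the local script, decode and execute it in the remote shell,
    # then emit a parseable exit-code marker.
    payload = base64.b64encode(Path(script_path).read_bytes()).decode("ascii")
    return f"echo {payload} | base64 -d | bash; echo VERIFY-EXIT-CODE=$?"
```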
4. Wait for VERIFY-EXIT-CODE *with a digit*, not just the bare string
================================================================
The string 'VERIFY-EXIT-CODE=' appears in the *echoed command* before
the script has actually run — checking for the substring exited the
polling loop too early. Now we require the full regex match
'VERIFY-EXIT-CODE=(\d+)'.
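The fixed wait condition, sketched: the bare marker string appears in the echoed command line (where `$?` has not yet expanded to a digit), so the poll loop must require a digit after the `=` before treating the run as finished.

```python
import re

_EXIT_RE = re.compile(r"VERIFY-EXIT-CODE=(\d+)")

def parse_exit_code(buffer):
    # Returns the script's exit code once the real marker has appeared,
    # or None while only the echoed command is in the buffer.
    match = _EXIT_RE.search(buffer)
    return int(match.group(1)) if match else None
```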
Plus minor: timeout bumped 90s → 120s (verify.sh + Hermes/cooldown
checks add up to ~30s; buffer-race with concurrent pollers extends it).
What this test now covers, autonomously:
================================================================
F-01 — terminal env credentials stripped (live PTY env)
F-04 — DEEPWIKI/EXA MCP wiring in ~/.claude.json
F-05 — ~/.hermes/config.yaml chmod 0o600
F-06 — Hermes installed from SHA-pinned source (v0.13.0)
cooldown — opencode/codex/gemini are stable versions (not pre-release)
Live run output:
[PASS] F-01 terminal env has no leaked credentials
[PASS] F-04 Claude MCP wiring (default: deepwiki,exa)
[PASS] F-05 Hermes config chmod 0o600
[PASS] F-06 Hermes installed (Hermes Agent v0.13.0 (2026.5.7))
[PASS] cooldown opencode stable version (1.14.41)
[PASS] cooldown codex stable version (0.130.0)
[PASS] cooldown gemini stable version (0.41.2)
=== Summary: 7 passed, 0 failed ===
Co-authored-by: Isaac
Summary
Enables CoDA deployment into restricted enterprise environments (JFrog / Nexus / GitHub Enterprise behind a firewall), plus closes 6 security findings from independent code review, plus codifies a Playwright e2e test that re-verifies the fixes in ~11 seconds.
Branched off main, rebased on top of the importlib-metadata hotfix (#f0741cc) that unblocks deploys.

What's in this PR
Enterprise proxy/registry mode (the original feature)
- `enterprise_config.py` module — single source of truth for the env-var contract. All helpers (npm_env, uv_env, proxy_env, bootstrap, doctor, validate_mirror_env, claude_installer_url, hermes_pip_url, deepwiki_mcp_url, exa_mcp_url) live here.
- `install_*.sh` scripts read `GITHUB_API_BASE` and `GITHUB_RELEASE_MIRROR` with public fallback. No behavior change when env vars are unset.
- `setup_claude.py`, `setup_hermes.py`, `setup_opencode.py` honour DEEPWIKI/EXA MCP overrides.
- `scripts/enterprise_doctor.py` + `make enterprise-doctor` — pre-deploy reachability check.
- `docs/enterprise.md` — operator-facing config matrix + threat model + known limits.
- `app.yaml` — commented env-var examples so customers see the knobs.

Security fixes (from independent CCR review)
- `_build_terminal_shell_env()` extracted from `create_session()`; strips `NPM_TOKEN`, `UV_INDEX_*_PASSWORD/USERNAME`, and `npm_config_//host/:_authToken` from the PTY env. Prevents deployer-level registry credentials leaking to the user terminal.
- `CLAUDE_INSTALLER_URL`: `_validate_url()` rejects values containing shell metacharacters; `setup_claude.py` switched to the positional-arg curl-into-bash form.
- `validate_mirror_env()`, called at the top of `bootstrap()`, rejects unsafe `GITHUB_API_BASE` / `GITHUB_RELEASE_MIRROR` / `HERMES_PIP_URL` before any subprocess uses them.
- `~/.hermes/config.yaml` chmod 0o600 after write_text. Prevents in-container PAT exfil via `cat`.
- `DEFAULT_HERMES_PIP_URL` pinned to commit SHA `8e4f3ba4` (≥7 days old, matching npm cooldown semantics). Mitigates force-push compromise of NousResearch/hermes-agent.

Test infrastructure (codifies verification)
- Unit: `tests/test_enterprise_config.py` + `tests/test_terminal_env_strip.py`. Cover env-var permutations, URL validation, secret masking, doctor reachability, terminal-env strip.
- Docker integration (`tests/integration/`) — `Dockerfile.apps-like` + pytest driver. Runs the full setup pipeline in an Ubuntu 22.04 + uv container approximating Databricks Apps. Verifies F-01/F-04/F-05/F-06 + cooldown via `verify.sh`. Auto-forwards the host pypi proxy / CA bundle into the container so the test works on Databricks-employee networks too.
- Playwright e2e (`tests/e2e/`) — drives the live deployed app via stored SSO cookies, mints a fresh PAT per run, inlines `verify.sh` via base64 so the test is self-contained against any deployed branch state. Verified 7/7 PASS on daveok in 11 seconds.

Test plan
Unit:
- `uv run pytest tests/ --ignore=tests/integration --ignore=tests/e2e` — 348/349 pass (one pre-existing live-npm flake)
- `make test` runs the unit suite, excludes the slow integration + e2e

Live verification on daveok:
- `make e2e-test PROFILE=daveok` — 7/7 verify checks PASS

Docker integration:
- `validate_mirror_env()` rejection test passes in ~10s (verified)

Notes for review
- `enterprise_config.py` and `docs/enterprise.md` § Security model and known limits.
- `enterprise_config.py` — NO_PROXY auto-injection for DATABRICKS_HOST, UV_INDEX password masking, registry resolution unification, doctor SSRF guard. Unit-tested only (require operator env vars to exercise).
- `docs/enterprise.md` as the threat-model boundary.

Out of scope
- (`vendor/` bundling pipeline)
- (`docs/enterprise.md` § 8.5)

This pull request and its description were written by Isaac.