Merged
42 changes: 42 additions & 0 deletions .github/workflows/codeql.yaml
@@ -0,0 +1,42 @@
name: CodeQL

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 3 * * 0'

concurrency:
  group: codeql-${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  analyze:
    name: Analyze
    runs-on: ubuntu-latest
    permissions:
      actions: read
      contents: read
      security-events: write

    strategy:
      fail-fast: false
      matrix:
        language: [python]

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Initialize CodeQL
        uses: github/codeql-action/init@v4
        with:
          languages: ${{ matrix.language }}

      - name: Autobuild
        uses: github/codeql-action/autobuild@v4

      - name: Perform CodeQL Analysis
        uses: github/codeql-action/analyze@v4
53 changes: 53 additions & 0 deletions .github/workflows/premerge.yaml
@@ -52,3 +52,56 @@ jobs:

      - name: Run ruff
        run: uvx ruff check .

  integration-tests:
    # Runs the suite gated by `-m integration` against the Docker daemon that
    # ships pre-installed on ubuntu-latest runners. Skipped if the unit tests
    # didn't pass, since there's no point burning daemon time on a broken branch.
    runs-on: [ubuntu-latest]
    name: Run integration tests
    needs: [pytest-run]
    timeout-minutes: 20
    # The Hub-login step below gates on `env.DOCKERHUB_TOKEN`, so the env var must
    # already be resolved when the step's `if:` is evaluated. Step-scoped `env:` is
    # applied too late for that, so the values are hoisted to job-level env, which
    # IS resolved before any step `if:` runs. Referencing `secrets.*` directly in
    # the step `if:` is rejected at workflow-validation time with
    # "Unrecognized named-value: 'secrets'", so hoisting is the simplest working
    # shape for an opt-in-via-secret gate.
    env:
      DOCKERHUB_USER: ${{ secrets.DOCKERHUB_USER }}
      DOCKERHUB_TOKEN: ${{ secrets.DOCKERHUB_TOKEN }}
    steps:
      - name: Check out source repository
        uses: actions/checkout@v4

      - name: Set up Python environment
        uses: actions/setup-python@v5
        with:
          python-version: "3.14"

      - name: Set up uv
        uses: astral-sh/setup-uv@v5

      - name: Install dependencies
        run: uv sync --all-groups

      - name: Verify Docker tooling on the runner
        # ubuntu-latest ships Docker Engine + docker CLI + compose v2 + buildx.
        # Scout is NOT pre-installed; tests skip cleanly via has_plugin().
        run: |
          docker version
          docker info | head -20
          docker compose version
          docker buildx version
          docker scout version 2>/dev/null || echo "scout not installed; scout integration tests will skip"

      - name: Optional Docker Hub login to dodge anonymous rate limits
        # Opt-in: set DOCKERHUB_USER / DOCKERHUB_TOKEN as repo secrets to authenticate.
        # Without them, the job runs anonymous and the existing skip-on-pull-failure
        # behaviour in the compose lifecycle test catches throttling.
        if: ${{ env.DOCKERHUB_TOKEN != '' && env.DOCKERHUB_USER != '' }}
        run: echo "$DOCKERHUB_TOKEN" | docker login --username "$DOCKERHUB_USER" --password-stdin

      - name: Run integration tests
        run: uv run pytest -m integration -v --tb=short
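
Reviewer sketch (not part of this PR): to reproduce the opt-in login gate outside Actions, for example before running the integration suite locally, the same condition can be expressed in Python. The function name `hub_login_invocation` is hypothetical; only the two environment variable names and the `docker login --username ... --password-stdin` invocation come from the workflow above.

```python
from typing import List, Mapping, Optional, Tuple


def hub_login_invocation(env: Mapping[str, str]) -> Optional[Tuple[List[str], str]]:
    """Return (argv, stdin_text) for `docker login`, or None when the gate is closed.

    Hypothetical helper mirroring the workflow step; it builds the command
    but does not run it.
    """
    user = env.get("DOCKERHUB_USER", "")
    token = env.get("DOCKERHUB_TOKEN", "")
    # Same condition as the step's `if:`: both values must be non-empty.
    if not user or not token:
        return None
    # The token is sent on stdin (--password-stdin), never on the command line,
    # so it does not show up in process listings.
    argv = ["docker", "login", "--username", user, "--password-stdin"]
    return argv, token + "\n"
```

A caller would pass `os.environ` and, when the result is not `None`, run it with `subprocess.run(argv, input=stdin_text, text=True, check=True)`.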
4 changes: 3 additions & 1 deletion CLAUDE.md
@@ -67,7 +67,9 @@ Each file maps to one Docker SDK domain (or, for CLI-only and registry-only feat
| `tools/swarm.py` | Swarm init, join, leave | docker-py |
| `tools/compose.py` | Docker Compose v2 | `docker compose` CLI via `_cli.py` |
| `tools/context.py` | Docker CLI contexts | `docker context` CLI via `_cli.py` |
| `tools/registry.py` | OCI v2 registries + Docker Hub | HTTPS via `httpx` (no daemon) |
| `tools/buildx.py` | Buildx / BuildKit (multi-arch builds, imagetools — supersedes `docker manifest`) | `docker buildx` CLI via `_cli.py` |
| `tools/scout.py` | Vulnerability scanning, SBOMs, base-image recommendations | `docker scout` CLI via `_cli.py` |
| `tools/registry.py` | OCI v2 registries + Docker Hub (with 429 retry policy) | HTTPS via `httpx` (no daemon) |
| `tools/prompts.py` | `@mcp.prompt()` workflow templates | — |
| `tools/resources.py` | `@mcp.resource()` doc endpoints | — |

20 changes: 18 additions & 2 deletions README.md
@@ -55,7 +55,9 @@ Once loaded, the agent gets MCP tools grouped by Docker domain. A few examples:
- **System** — `ping`, `info`, `version`, `df`, `events`
- **Compose** — `compose_up`, `compose_down`, `compose_ps`, `compose_logs`, `compose_config`, `compose_build`, `compose_pull`, `compose_run`, `compose_exec`, `compose_ls` *(wraps the `docker compose` CLI plugin)*
- **Contexts** — `context_ls`, `context_inspect`, `context_create`, `context_use`, `context_rm` *(wraps the `docker context` CLI)*
- **Registry / Hub** — `registry_list_tags`, `registry_inspect_manifest`, `hub_list_tags`, `hub_repo_info` *(HTTPS to OCI v2 registries and the Docker Hub API — no daemon required)*
- **Registry / Hub** — `registry_list_tags`, `registry_inspect_manifest`, `hub_list_tags`, `hub_repo_info` *(HTTPS to OCI v2 registries and the Docker Hub API — no daemon required; transparent retry on a brief 429)*
- **Buildx** — `buildx_build`, `buildx_bake`, `buildx_imagetools_inspect`, `buildx_imagetools_create`, `buildx_ls`, `buildx_inspect`, `buildx_du`, `buildx_prune`, `buildx_create`, `buildx_use`, `buildx_rm` *(wraps the `docker buildx` CLI plugin). Use `buildx_imagetools_*` in place of `docker manifest` — that command is in maintenance mode and lacks support for OCI image indexes and attestations.*
- **Scout** — `scout_cves`, `scout_quickview`, `scout_recommendations`, `scout_compare`, `scout_sbom` *(wraps the `docker scout` CLI plugin; most features benefit from `docker login` on the host running this server).*
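
Reviewer sketch (not the code in `tools/registry.py`): one way a "transparent retry on a brief 429" can work is to honour a short `Retry-After` value, cap the wait, and give up after a small number of attempts. Every name here (`get_with_429_retry`, `retry_after_seconds`, the `(status, headers, body)` tuple, the default limits) is an assumption made for illustration. The request is injected as a callable so the sketch runs without `httpx` or network access.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status_code, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]


def retry_after_seconds(headers: Mapping[str, str], default: float, cap: float) -> float:
    """Parse a numeric Retry-After header, falling back to `default`, capped at `cap`."""
    raw = headers.get("Retry-After", "")
    try:
        delay = float(raw)
    except ValueError:
        # Missing header or HTTP-date form: use the default delay instead.
        delay = default
    return max(0.0, min(delay, cap))


def get_with_429_retry(
    send: Callable[[], Response],
    max_retries: int = 2,
    default_delay: float = 1.0,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` and retry only on HTTP 429, at most `max_retries` times."""
    attempt = 0
    while True:
        response = send()
        status, headers, _body = response
        if status != 429 or attempt >= max_retries:
            # Success, a non-429 error, or retries exhausted: hand it back as-is.
            return response
        sleep(retry_after_seconds(headers, default_delay, max_delay))
        attempt += 1
```

Capping the delay keeps the retry "brief": a registry that asks for a long back-off gets its 429 surfaced to the caller after a few short waits instead of blocking the tool call.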

The SDK-backed surface mirrors the [Docker SDK reference](https://docker-py.readthedocs.io/en/stable/) — if it's documented there, it's available here. The Compose and Context surfaces follow the [Compose CLI](https://docs.docker.com/reference/cli/docker/compose/) and [docker context](https://docs.docker.com/reference/cli/docker/context/) references.

@@ -97,6 +99,18 @@ Many AI clients let you invoke registered MCP prompts directly (in Claude Code,
/find_latest_image_tag image=ghcr.io/org/repo
```

**Buildx, Scout, and multi-arch manifests**

```
/plan_multiarch_build image=ghcr.io/org/app:v1 platforms=linux/amd64,linux/arm64
/audit_image_cves image=alpine:3.19
/compare_image_versions old_image=org/app:v1 new_image=org/app:v2
/recommend_base_image image=org/app:v1
/inspect_multiarch_manifest image=alpine:3.19
/create_multiarch_manifest target_tag=org/app:v1 source_tags=org/app:v1-amd64,org/app:v1-arm64
/migrate_from_docker_manifest
```

…or in plain English:

> Pull `redis:7-alpine` and run it as a container called `cache` on a new `app-net` network, exposing port 6379 only inside that network.
@@ -120,7 +134,7 @@ Connecting this server to an AI agent grants it the same level of access as a lo
- **HTTPS-backed registry tools** (`registry_list_tags`, `registry_inspect_manifest`, `hub_list_tags`, `hub_repo_info`) talk to the registry directly over HTTPS and do NOT read `~/.docker/config.json`. The `registry_*` tools accept `username` / `password` for private registries; the `hub_*` tools currently support public Hub repositories only. Use a per-invocation token with the minimum required scope rather than a long-lived password.
- **`exec_in_container`, `compose_exec`, and `compose_run` run arbitrary commands.** When any part of the command is derived from agent-controlled input, use an exec-form argv list that does not invoke a shell (e.g. `["python", "-V"]`). A list like `["sh", "-c", template]` that invokes a shell will interpret shell metacharacters in the untrusted substrings.
- **Container archive paths.** `get_container_archive` and `put_container_archive` forward the supplied path verbatim to the daemon. The container is the trust boundary — if you do not trust its filesystem, do not assume `..` traversal will be rejected.
- **Destructive operations have no built-in confirmation.** `prune_*`, `remove_*`, `kill_container`, `leave_swarm`, and `compose_down(volumes=True)` execute immediately. The shipped `clean_environment` prompt asks the agent to confirm before pruning volumes, but tool calls themselves are not gated. If you need an approval step, configure it at the MCP client (e.g. Claude Code's permission prompts) rather than relying on the server.
- **Destructive operations have no built-in confirmation.** `prune_*`, `remove_*`, `kill_container`, `leave_swarm`, `compose_down(volumes=True)`, `buildx_prune` (always runs with `--force`), and `buildx_rm` execute immediately. The shipped `clean_environment` prompt asks the agent to confirm before pruning volumes, but tool calls themselves are not gated. If you need an approval step, configure it at the MCP client (e.g. Claude Code's permission prompts) rather than relying on the server.
- **CLI shell-out attack surface.** Compose and Context tools spawn `docker` subprocesses on the host running this MCP server. Every invocation passes arguments as a list (no shell, no metacharacter interpretation), resolves the binary via `shutil.which`, and runs against a scrubbed environment (DOCKER_HOST and related vars only). Filesystem paths supplied to `compose_*` (project_dir, files) are read by the docker CLI on the server host — passing an unfamiliar path can expose any compose file the server's user can read.
- **Docker Context retargeting.** `context_use` only changes the CLI default for subsequent CLI-backed tools. SDK-backed tools (`list_containers`, `pull_image`, etc.) keep using whatever daemon the docker-py client connected to at server startup. Restart the server with a different `DOCKER_HOST` / `DOCKER_CONTEXT` to retarget those. `context_create(skip_tls_verify=True)` disables TLS verification for a context; use only against trusted local daemons.
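
Reviewer sketch (not the contents of `_cli.py`): the three properties listed under "CLI shell-out attack surface" (list-form arguments, binary resolution via `shutil.which`, scrubbed environment) can be illustrated as below. The allow-list `ALLOWED_ENV` and the function names are assumptions; the README only says "DOCKER_HOST and related vars", and `PATH` and `HOME` are added here on the assumption that the CLI needs them to locate plugins and config.

```python
import os
import shutil
import subprocess
from typing import Dict, List, Mapping, Optional, Sequence

# Hypothetical allow-list for this sketch, not the project's actual list.
ALLOWED_ENV = (
    "PATH", "HOME",
    "DOCKER_HOST", "DOCKER_CONTEXT", "DOCKER_CONFIG",
    "DOCKER_TLS_VERIFY", "DOCKER_CERT_PATH",
)


def scrubbed_env(source: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy only allow-listed variables so unrelated secrets never reach the child."""
    source = os.environ if source is None else source
    return {key: source[key] for key in ALLOWED_ENV if key in source}


def build_argv(binary: str, args: Sequence[str]) -> List[str]:
    """Resolve the binary to an absolute path and keep every argument as one list item."""
    resolved = shutil.which(binary)
    if resolved is None:
        raise FileNotFoundError(f"{binary!r} not found on PATH")
    return [resolved, *args]


def run_cli(binary: str, args: Sequence[str], timeout_seconds: float = 60.0) -> Dict[str, object]:
    """Run without a shell: metacharacters in `args` are passed through literally."""
    completed = subprocess.run(
        build_argv(binary, args),
        env=scrubbed_env(),
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
        check=False,
    )
    return {
        "returncode": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }
```

Because no shell is involved, an argument such as `"$(echo hi); ls"` reaches the child process as a single literal string. This is the same reason the `exec_in_container` guidance above prefers an exec-form argv list over `["sh", "-c", template]`.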

@@ -150,6 +164,8 @@ Contributions are welcome. The project values a tight mapping between the Docker
│ ├── plugins.py
│ ├── compose.py # `docker compose` CLI plugin (shells out via _cli.py)
│ ├── context.py # `docker context` CLI (shells out via _cli.py)
│ ├── buildx.py # `docker buildx` CLI plugin (shells out via _cli.py)
│ ├── scout.py # `docker scout` CLI plugin (shells out via _cli.py)
│ ├── registry.py # OCI v2 registries + Docker Hub HTTPS APIs (no daemon)
│ ├── prompts.py # @mcp.prompt() templates for common docker workflows
│ └── resources.py # @mcp.resource() endpoints exposing SDK + CLI + registry docs
62 changes: 62 additions & 0 deletions tests/integration/test_buildx.py
@@ -0,0 +1,62 @@
# integration tests for buildx — require a real Docker daemon AND the `docker buildx` plugin.
# run with: uv run pytest -m integration

from pathlib import Path

import pytest

from tools._cli import has_plugin
from tools.buildx import buildx_build, buildx_du, buildx_imagetools_inspect, buildx_ls

# A minimal Dockerfile that produces a tiny image without pulling anything.
# `scratch` is Docker's reserved empty base: it is built into the builder, not fetched from a registry.
_DOCKERFILE = """\
FROM scratch
COPY hello.txt /hello.txt
"""


@pytest.fixture(scope="module", autouse=True)
def _require_buildx_plugin():
    if not has_plugin("buildx"):
        pytest.skip("docker buildx plugin not installed on this host; skipping buildx integration tests")
    yield


@pytest.fixture
def build_context(tmp_path: Path) -> Path:
    (tmp_path / "Dockerfile").write_text(_DOCKERFILE)
    (tmp_path / "hello.txt").write_text("hello\n")
    return tmp_path


def test_buildx_ls_lists_at_least_one_builder():
    builders = buildx_ls()
    assert isinstance(builders, list)
    assert builders, "expected at least one buildx builder to be configured"
    assert all("Name" in b for b in builders)


def test_buildx_du_returns_records():
    records = buildx_du()
    # An empty cache is allowed but the call must succeed and return a list.
    assert isinstance(records, list)


def test_buildx_build_scratch_context_succeeds(build_context: Path):
    result = buildx_build(
        context=str(build_context),
        tags=["docker-mcp-it-buildx-scratch:test"],
        load=True,
        timeout_seconds=300.0,
    )
    assert result["returncode"] == 0, result["stderr"]


def test_buildx_imagetools_inspect_alpine_returns_manifest():
    # `alpine:3` is a multi-arch manifest list on Docker Hub. The call hits the registry
    # over HTTPS via buildx; no local image is required.
    result = buildx_imagetools_inspect("alpine:3", raw=True)
    if result["returncode"] != 0:
        pytest.skip(f"buildx imagetools inspect unreachable (registry/network?): {result['stderr'][:200]}")
    assert result["stdout"].strip().startswith("{")
27 changes: 27 additions & 0 deletions tests/integration/test_scout.py
@@ -0,0 +1,27 @@
# integration tests for scout — require a real Docker daemon AND the `docker scout` plugin.
# Scout is NOT pre-installed on plain Engine hosts (only Docker Desktop), so the whole
# module skips cleanly when the plugin isn't available.
# run with: uv run pytest -m integration

import pytest

from tools._cli import has_plugin
from tools.scout import scout_quickview


@pytest.fixture(scope="module", autouse=True)
def _require_scout_plugin():
    if not has_plugin("scout"):
        pytest.skip("docker scout plugin not installed on this host; skipping scout integration tests")
    yield


def test_scout_quickview_alpine_returns_json_or_skip():
    # Scout requires network access to its CDN. If the CDN is unreachable or the host
    # is offline, skip rather than fail: this test exercises the wiring, not Scout itself.
    result = scout_quickview("alpine:3")
    if result["raw"]["returncode"] != 0:
        pytest.skip(f"scout quickview unreachable (offline or auth required?): {result['raw']['stderr'][:200]}")
    assert result["format"] == "json"
    # `result["result"]` should be a parsed dict, or the raw text if Scout returned non-JSON.
    assert result["result"] is not None
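
Reviewer sketch (not the `has_plugin` in `tools/_cli.py`, whose implementation is not shown in this diff): both test modules and the workflow's tooling check rely on detecting a CLI plugin before running. One plausible shape, assuming detection is done by running `docker <plugin> version` and checking the exit code, looks like this. The `docker_binary` and `timeout_seconds` parameters are hypothetical and exist only to make the sketch testable.

```python
import shutil
import subprocess


def has_plugin(name: str, docker_binary: str = "docker", timeout_seconds: float = 10.0) -> bool:
    """Return True if `docker <name> version` exits 0, False in every other case.

    Illustrative only: a missing docker binary, a missing plugin, a timeout,
    or an OS-level launch error all count as "plugin not available".
    """
    docker = shutil.which(docker_binary)
    if docker is None:
        return False
    try:
        completed = subprocess.run(
            [docker, name, "version"],
            capture_output=True,
            timeout=timeout_seconds,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return completed.returncode == 0
```

Returning `False` instead of raising keeps the module-level fixtures simple: they can call `pytest.skip(...)` on a falsy result without wrapping the check in their own error handling.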