1 change: 0 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -190,7 +190,6 @@ Key environment variables (server) from packages/platform-server/.env.example an
- DOCKER_MIRROR_URL (default http://registry-mirror:5000)
- DOCKER_RUNNER_GRPC_HOST (default docker-runner)
- DOCKER_RUNNER_GRPC_PORT (default 50051; DOCKER_RUNNER_PORT is accepted as an alias)
- DOCKER_RUNNER_SHARED_SECRET (required HMAC credential)
- DOCKER_RUNNER_TIMEOUT_MS (optional request timeout; default 30000)
- DOCKER_RUNNER_OPTIONAL (default true; set to false to keep fail-fast bootstrap)
- DOCKER_RUNNER_CONNECT_RETRY_BASE_DELAY_MS (default 500)
4 changes: 2 additions & 2 deletions docs/product-spec.md
Original file line number Diff line number Diff line change
@@ -117,7 +117,7 @@ Configuration matrix (server env vars)
- VAULT_ENABLED: true|false (default false)
- VAULT_ADDR, VAULT_TOKEN
- DOCKER_MIRROR_URL (default http://registry-mirror:5000)
- DOCKER_RUNNER_GRPC_HOST, DOCKER_RUNNER_GRPC_PORT (or DOCKER_RUNNER_PORT), DOCKER_RUNNER_SHARED_SECRET (required for docker-runner), plus optional DOCKER_RUNNER_TIMEOUT_MS (default 30000), DOCKER_RUNNER_OPTIONAL (default true; set false to fail-fast), and DOCKER_RUNNER_CONNECT_* knobs (RETRY_BASE_DELAY_MS=500, RETRY_MAX_DELAY_MS=30000, RETRY_JITTER_MS=250, PROBE_INTERVAL_MS=30000, MAX_RETRIES=0 for unlimited background retries).
- DOCKER_RUNNER_GRPC_HOST, DOCKER_RUNNER_GRPC_PORT (or DOCKER_RUNNER_PORT), plus optional DOCKER_RUNNER_TIMEOUT_MS (default 30000), DOCKER_RUNNER_OPTIONAL (default true; set false to fail-fast), and DOCKER_RUNNER_CONNECT_* knobs (RETRY_BASE_DELAY_MS=500, RETRY_MAX_DELAY_MS=30000, RETRY_JITTER_MS=250, PROBE_INTERVAL_MS=30000, MAX_RETRIES=0 for unlimited background retries).
- MCP_TOOLS_STALE_TIMEOUT_MS
- LANGGRAPH_CHECKPOINTER: postgres (default)
- POSTGRES_URL (postgres connection string)
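For reviewers, the runner settings that remain after this change can be summarized as a small resolver. This is a minimal sketch, assuming the defaults quoted in the configuration matrix above and the `DOCKER_RUNNER_GRPC_PORT`-before-`DOCKER_RUNNER_PORT` precedence used by the test helpers in this PR. The names `RunnerEnvConfig`, `intFromEnv`, and `resolveRunnerEnv` are hypothetical and are not part of platform-server's `ConfigService`.

```typescript
// Hypothetical illustration only; not the platform-server ConfigService.
export interface RunnerEnvConfig {
  host: string;
  port: number;
  timeoutMs: number;
  optional: boolean;
  retryBaseDelayMs: number;
  retryMaxDelayMs: number;
  retryJitterMs: number;
  probeIntervalMs: number;
  maxRetries: number; // 0 means unlimited background retries
}

type Env = Record<string, string | undefined>;

// Parse a non-negative integer, falling back when the variable is unset or blank.
function intFromEnv(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === '') return fallback;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed < 0) {
    throw new Error(`${key} must be a non-negative integer, got "${raw}"`);
  }
  return parsed;
}

export function resolveRunnerEnv(env: Env): RunnerEnvConfig {
  // DOCKER_RUNNER_GRPC_PORT takes precedence; DOCKER_RUNNER_PORT is the accepted alias.
  const grpcPort = env['DOCKER_RUNNER_GRPC_PORT'];
  const portKey = grpcPort !== undefined && grpcPort.trim() !== ''
    ? 'DOCKER_RUNNER_GRPC_PORT'
    : 'DOCKER_RUNNER_PORT';
  const host = env['DOCKER_RUNNER_GRPC_HOST'];
  return {
    host: host !== undefined && host.trim() !== '' ? host.trim() : 'docker-runner',
    port: intFromEnv(env, portKey, 50051),
    timeoutMs: intFromEnv(env, 'DOCKER_RUNNER_TIMEOUT_MS', 30000),
    // Default true; only the literal "false" switches to fail-fast bootstrap.
    optional: (env['DOCKER_RUNNER_OPTIONAL'] ?? 'true').trim().toLowerCase() !== 'false',
    retryBaseDelayMs: intFromEnv(env, 'DOCKER_RUNNER_CONNECT_RETRY_BASE_DELAY_MS', 500),
    retryMaxDelayMs: intFromEnv(env, 'DOCKER_RUNNER_CONNECT_RETRY_MAX_DELAY_MS', 30000),
    retryJitterMs: intFromEnv(env, 'DOCKER_RUNNER_CONNECT_RETRY_JITTER_MS', 250),
    probeIntervalMs: intFromEnv(env, 'DOCKER_RUNNER_CONNECT_PROBE_INTERVAL_MS', 30000),
    maxRetries: intFromEnv(env, 'DOCKER_RUNNER_CONNECT_MAX_RETRIES', 0),
  };
}
```

In a Node process this would be called as `resolveRunnerEnv(process.env)`. Note that no shared-secret field appears, matching the removal of `DOCKER_RUNNER_SHARED_SECRET` in this PR.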
@@ -133,7 +133,7 @@ HTTP API and sockets (pointers)
Runbooks
- Local dev
- Prereqs: Node 18+, pnpm, Docker, Postgres.
- Set: LITELLM_BASE_URL, LITELLM_MASTER_KEY (LLM_PROVIDER optional; defaults to litellm; only `openai` is also accepted), GITHUB_*, GH_TOKEN, AGENTS_DATABASE_URL, DOCKER_RUNNER_GRPC_HOST, DOCKER_RUNNER_GRPC_PORT (or DOCKER_RUNNER_PORT), DOCKER_RUNNER_SHARED_SECRET. Optional VAULT_* and DOCKER_MIRROR_URL.
- Set: LITELLM_BASE_URL, LITELLM_MASTER_KEY (LLM_PROVIDER optional; defaults to litellm; only `openai` is also accepted), GITHUB_*, GH_TOKEN, AGENTS_DATABASE_URL, DOCKER_RUNNER_GRPC_HOST, DOCKER_RUNNER_GRPC_PORT (or DOCKER_RUNNER_PORT). Optional VAULT_* and DOCKER_MIRROR_URL.
- Start deps (compose or local Postgres)
- Server: pnpm -w -F @agyn/platform-server dev
- UI: pnpm -w -F @agyn/platform-ui dev
2 changes: 1 addition & 1 deletion docs/technical-overview.md
Original file line number Diff line number Diff line change
@@ -68,7 +68,7 @@ Per-workspace Docker-in-Docker and registry mirror

Remote Docker runner
- The platform-server always routes container lifecycle, exec, and log streaming calls through the external docker-runner service (not part of this monorepo).
- The runner exposes authenticated gRPC endpoints; every request includes HMAC metadata derived solely from `DOCKER_RUNNER_SHARED_SECRET`.
- The runner exposes gRPC endpoints; platform-server currently sends no auth metadata.
- Only the docker-runner service mounts `/var/run/docker.sock` in default stacks; platform-server and auxiliary services talk to it over the internal network (default `docker-runner:${DOCKER_RUNNER_GRPC_PORT}` with `DOCKER_RUNNER_GRPC_PORT` defaulting to 50051; `DOCKER_RUNNER_PORT` remains an accepted alias).
- Container events, logs, and exec streams flow over long-lived gRPC streams so the existing watcher pipeline (ContainerEventProcessor, cleanup jobs, metrics) remains unchanged.
- Connectivity is tracked by a background `DockerRunnerConnectivityMonitor` that probes the gRPC `Ready` method with exponential backoff (base-delay, max-delay, jitter, probe interval, and optional retry cap are configurable via DOCKER_RUNNER_CONNECT_* env vars).
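The backoff behaviour described for `DockerRunnerConnectivityMonitor` can be sketched as follows. This is an illustration under stated assumptions, not the monitor's actual code: the delay doubles per attempt, is capped at the max delay, and jitter is added on top as a uniform value in `[0, jitterMs]`. The names `BackoffOptions`, `backoffDelayMs`, and `probeUntilReady` are hypothetical, and `probe` stands in for a call to the gRPC `Ready` method.

```typescript
// Hypothetical sketch of exponential backoff with additive jitter.
export interface BackoffOptions {
  baseDelayMs: number;
  maxDelayMs: number;
  jitterMs: number;
}

// attempt is zero-based: attempt 0 waits roughly baseDelayMs.
export function backoffDelayMs(
  attempt: number,
  opts: BackoffOptions,
  random: () => number = Math.random,
): number {
  // Clamp the exponent so 2 ** exponent stays well within safe numeric range.
  const exponent = Math.min(Math.max(attempt, 0), 30);
  const exponential = Math.min(opts.baseDelayMs * 2 ** exponent, opts.maxDelayMs);
  const jitter = Math.floor(random() * (opts.jitterMs + 1));
  return exponential + jitter;
}

// Calls probe until it succeeds; maxRetries = 0 means retry without limit.
// Resolves with the number of failed attempts that preceded the success.
export async function probeUntilReady(
  probe: () => Promise<void>,
  opts: BackoffOptions & { maxRetries: number },
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
  random: () => number = Math.random,
): Promise<number> {
  for (let attempt = 0; ; attempt += 1) {
    try {
      await probe();
      return attempt;
    } catch (err) {
      if (opts.maxRetries > 0 && attempt >= opts.maxRetries) throw err;
      await sleep(backoffDelayMs(attempt, opts, random));
    }
  }
}
```

With the documented defaults (base 500 ms, max 30000 ms, jitter 250 ms) the un-jittered delays under this sketch would be 500, 1000, 2000, 4000 ms and so on until they reach the 30000 ms cap.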
3 changes: 1 addition & 2 deletions packages/platform-server/.env.example
Original file line number Diff line number Diff line change
@@ -39,11 +39,10 @@ VAULT_TOKEN=dev-root

# (Slack env removed intentionally; no global Slack config or tokens)

# docker-runner gRPC endpoint and credentials (required)
# docker-runner gRPC endpoint (required)
DOCKER_RUNNER_GRPC_HOST=docker-runner
DOCKER_RUNNER_GRPC_PORT=7171
# DOCKER_RUNNER_PORT=7171 # legacy alias still accepted
DOCKER_RUNNER_SHARED_SECRET=dev-shared-secret
# Optional overrides
# DOCKER_RUNNER_TIMEOUT_MS=30000
# DOCKER_RUNNER_CONNECT_MAX_RETRIES=0
Original file line number Diff line number Diff line change
@@ -351,7 +351,6 @@ describe('App bootstrap smoke test', () => {
DOCKER_RUNNER_GRPC_HOST: process.env.DOCKER_RUNNER_GRPC_HOST,
DOCKER_RUNNER_PORT: process.env.DOCKER_RUNNER_PORT,
DOCKER_RUNNER_GRPC_PORT: process.env.DOCKER_RUNNER_GRPC_PORT,
DOCKER_RUNNER_SHARED_SECRET: process.env.DOCKER_RUNNER_SHARED_SECRET,
DOCKER_RUNNER_OPTIONAL: process.env.DOCKER_RUNNER_OPTIONAL,
VOLUME_GC_ENABLED: process.env.VOLUME_GC_ENABLED,
VOLUME_GC_INTERVAL_MS: process.env.VOLUME_GC_INTERVAL_MS,
@@ -361,7 +360,6 @@ describe('App bootstrap smoke test', () => {
process.env.DOCKER_RUNNER_GRPC_HOST = '127.0.0.1';
process.env.DOCKER_RUNNER_PORT = '59999';
delete process.env.DOCKER_RUNNER_GRPC_PORT;
process.env.DOCKER_RUNNER_SHARED_SECRET = 'shared-secret';
process.env.DOCKER_RUNNER_OPTIONAL = 'true';
process.env.VOLUME_GC_ENABLED = 'true';
process.env.VOLUME_GC_INTERVAL_MS = '25';
1 change: 0 additions & 1 deletion packages/platform-server/__e2e__/bootstrap.di.test.ts
Original file line number Diff line number Diff line change
@@ -26,7 +26,6 @@ const REQUIRED_ENV = {
process.env.AGENTS_DATABASE_URL ?? 'postgresql://postgres:postgres@127.0.0.1:5432/agents_test?schema=public',
DOCKER_RUNNER_GRPC_HOST: '127.0.0.1',
DOCKER_RUNNER_PORT: '59999',
DOCKER_RUNNER_SHARED_SECRET: 'dev-shared-secret',
DOCKER_RUNNER_OPTIONAL: 'true',
CONTAINERS_CLEANUP_ENABLED: 'false',
VOLUME_GC_ENABLED: 'false',
Original file line number Diff line number Diff line change
@@ -17,10 +17,8 @@ import { PrismaClient as Prisma } from '@prisma/client';

import {
DEFAULT_SOCKET,
RUNNER_SECRET,
hasTcpDocker,
runnerAddressMissing,
runnerSecretMissing,
socketMissing,
startDockerRunner,
startPostgres,
@@ -33,7 +31,7 @@ import {
Reflect.defineMetadata('design:paramtypes', [PrismaService, ContainerAdminService, ConfigService], ContainersController);
Reflect.defineMetadata('design:paramtypes', [Object, ContainerRegistry], ContainerAdminService);

const shouldSkip = process.env.SKIP_DOCKER_DELETE_E2E === '1' || runnerAddressMissing || runnerSecretMissing;
const shouldSkip = process.env.SKIP_DOCKER_DELETE_E2E === '1' || runnerAddressMissing;

const describeOrSkip = shouldSkip || (socketMissing && !hasTcpDocker) ? describe.skip : describe;

@@ -80,7 +78,7 @@ describeOrSkip('DELETE /api/containers/:id docker runner integration', () => {
await registry.ensureIndexes();

runner = await startDockerRunner();
dockerClient = new RunnerGrpcClient({ address: runner.grpcAddress, sharedSecret: RUNNER_SECRET });
dockerClient = new RunnerGrpcClient({ address: runner.grpcAddress });

const moduleRef = await Test.createTestingModule({
controllers: [ContainersController],
@@ -290,7 +288,7 @@ describeOrSkip('DELETE /api/containers/:id docker runner external process integr
await registry.ensureIndexes();

runner = await startDockerRunner();
dockerClient = new RunnerGrpcClient({ address: runner.grpcAddress, sharedSecret: RUNNER_SECRET });
dockerClient = new RunnerGrpcClient({ address: runner.grpcAddress });

const moduleRef = await Test.createTestingModule({
controllers: [ContainersController],
Original file line number Diff line number Diff line change
@@ -278,7 +278,6 @@ describe('ContainersController wiring via InfraModule', () => {

beforeAll(async () => {
registerTestConfig({
dockerRunnerSharedSecret: 'runner-secret',
dockerRunnerGrpcHost: 'runner-grpc.test',
dockerRunnerGrpcPort: 9091,
agentsDatabaseUrl: 'postgresql://postgres:postgres@localhost:5432/agents_test',
Original file line number Diff line number Diff line change
@@ -19,10 +19,8 @@ import { RunnerGrpcClient, DockerRunnerRequestError } from '../src/infra/contain

import {
DEFAULT_SOCKET,
RUNNER_SECRET,
hasTcpDocker,
runnerAddressMissing,
runnerSecretMissing,
socketMissing,
startDockerRunner,
startPostgres,
@@ -32,7 +30,7 @@ import {
type PostgresHandle,
} from './helpers/docker.e2e';

const shouldSkip = process.env.SKIP_PLATFORM_FULLSTACK_E2E === '1' || runnerAddressMissing || runnerSecretMissing;
const shouldSkip = process.env.SKIP_PLATFORM_FULLSTACK_E2E === '1' || runnerAddressMissing;
const describeOrSkip = shouldSkip || (socketMissing && !hasTcpDocker) ? describe.skip : describe;
const TEST_IMAGE = 'nginx:1.25-alpine';

@@ -84,12 +82,11 @@ describeOrSkip('workspace create → delete full-stack flow', () => {
await runPrismaMigrations(dbHandle.connectionString);

runner = await startDockerRunner();
dockerClient = new RunnerGrpcClient({ address: runner.grpcAddress, sharedSecret: RUNNER_SECRET });
dockerClient = new RunnerGrpcClient({ address: runner.grpcAddress });

clearTestConfig();
const [grpcHost, grpcPort] = runner.grpcAddress.split(':');
configService = registerTestConfig({
dockerRunnerSharedSecret: RUNNER_SECRET,
dockerRunnerGrpcHost: grpcHost ?? '127.0.0.1',
dockerRunnerGrpcPort: grpcPort ? Number(grpcPort) : undefined,
agentsDatabaseUrl: dbHandle.connectionString,
1 change: 0 additions & 1 deletion packages/platform-server/__tests__/helpers/config.ts
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
import { ConfigService, configSchema } from '../../src/core/services/config.service';

export const runnerConfigDefaults = {
dockerRunnerSharedSecret: 'test-shared-secret',
dockerRunnerGrpcHost: '127.0.0.1',
dockerRunnerGrpcPort: 50051,
litellmKeyAlias: 'agents/test/local',
36 changes: 10 additions & 26 deletions packages/platform-server/__tests__/helpers/docker.e2e.ts
Original file line number Diff line number Diff line change
@@ -6,12 +6,9 @@ import { spawn } from 'node:child_process';
import { setTimeout as sleep } from 'node:timers/promises';
import { credentials, Metadata } from '@grpc/grpc-js';
import { create } from '@bufbuild/protobuf';

import { NonceCache, buildAuthHeaders } from '../../src/infra/container/auth';
import { RunnerServiceGrpcClient, RUNNER_SERVICE_READY_PATH } from '../../src/proto/grpc.js';
import { RunnerServiceGrpcClient } from '../../src/proto/grpc.js';
import { ReadyRequestSchema } from '../../src/proto/gen/agynio/api/runner/v1/runner_pb.js';

export const RUNNER_SECRET = process.env.DOCKER_RUNNER_SHARED_SECRET ?? '';
export const DEFAULT_SOCKET = process.env.DOCKER_SOCKET ?? '/var/run/docker.sock';
export const hasTcpDocker = Boolean(process.env.DOCKER_HOST);
export const socketMissing = !fs.existsSync(DEFAULT_SOCKET);
@@ -20,8 +17,6 @@ const runnerPort = process.env.DOCKER_RUNNER_GRPC_PORT ?? process.env.DOCKER_RUN
export const runnerAddress =
process.env.DOCKER_RUNNER_GRPC_ADDRESS ?? (runnerHost && runnerPort ? `${runnerHost}:${runnerPort}` : undefined);
export const runnerAddressMissing = !runnerAddress;
export const runnerSecretMissing = !RUNNER_SECRET;
const readinessNonceCache = new NonceCache();

export type RunnerHandle = {
grpcAddress: string;
@@ -34,39 +29,39 @@ export type PostgresHandle = {
};

export async function startDockerRunner(): Promise<RunnerHandle> {
if (!runnerAddress || !RUNNER_SECRET) {
throw new Error('DOCKER_RUNNER_GRPC_ADDRESS and DOCKER_RUNNER_SHARED_SECRET are required to run docker e2e tests.');
if (!runnerAddress) {
throw new Error('DOCKER_RUNNER_GRPC_ADDRESS is required to run docker e2e tests.');
}
await waitForRunnerReadyOnAddress(runnerAddress, RUNNER_SECRET);
await waitForRunnerReadyOnAddress(runnerAddress);
return {
grpcAddress: runnerAddress,
close: async () => undefined,
};
}

async function waitForRunnerReady(client: RunnerServiceGrpcClient, secret: string): Promise<void> {
async function waitForRunnerReady(client: RunnerServiceGrpcClient): Promise<void> {
await waitFor(async () => {
try {
await callRunnerReady(client, secret);
await callRunnerReady(client);
return true;
} catch {
return false;
}
}, { timeoutMs: 30_000, intervalMs: 250 });
}

async function waitForRunnerReadyOnAddress(address: string, secret: string): Promise<void> {
async function waitForRunnerReadyOnAddress(address: string): Promise<void> {
const client = new RunnerServiceGrpcClient(address, credentials.createInsecure());
try {
await waitForRunnerReady(client, secret);
await waitForRunnerReady(client);
} finally {
client.close();
}
}

function callRunnerReady(client: RunnerServiceGrpcClient, secret: string): Promise<void> {
function callRunnerReady(client: RunnerServiceGrpcClient): Promise<void> {
const request = create(ReadyRequestSchema, {});
const metadata = authMetadata(secret, RUNNER_SERVICE_READY_PATH);
const metadata = new Metadata();
return new Promise<void>((resolve, reject) => {
client.ready(request, metadata, (err) => {
if (err) {
@@ -78,17 +73,6 @@ function callRunnerReady(client: RunnerServiceGrpcClient, secret: string): Promi
});
}

function authMetadata(secret: string, path: string): Metadata {
const nonce = randomUUID();
readinessNonceCache.add(nonce);
const headers = buildAuthHeaders({ method: 'POST', path, body: '', secret, nonce });
const metadata = new Metadata();
for (const [key, value] of Object.entries(headers)) {
metadata.set(key, value);
}
return metadata;
}

export async function startPostgres(): Promise<PostgresHandle> {
const containerName = `containers-pg-${randomUUID()}`;
const port = await getAvailablePort();
Original file line number Diff line number Diff line change
@@ -2,8 +2,6 @@ import { describe, expect, it, vi } from 'vitest';
import { EventEmitter } from 'node:events';
import type { ClientDuplexStream } from '@grpc/grpc-js';
import { Metadata, status } from '@grpc/grpc-js';
import { NonceCache, verifyAuthHeaders } from '../../src/infra/container/auth';
import { RUNNER_SERVICE_TOUCH_WORKLOAD_PATH } from '../../src/proto/grpc.js';
import type { RunnerServiceGrpcClientInstance } from '../../src/proto/grpc.js';

import {
@@ -21,8 +19,8 @@ class MockClientStream<Req = unknown> extends EventEmitter {
}

describe('RunnerGrpcClient', () => {
it('sends signed runner metadata on touchLastUsed calls', async () => {
const client = new RunnerGrpcClient({ address: 'grpc://runner', sharedSecret: 'test-secret' });
it('sends empty runner metadata on touchLastUsed calls', async () => {
const client = new RunnerGrpcClient({ address: 'grpc://runner' });
const captured: { metadata?: Metadata } = {};

const touchStub = vi.fn((_: unknown, metadata: Metadata, maybeOptions?: unknown, maybeCallback?: (err: Error | null) => void) => {
@@ -40,26 +38,11 @@ describe('RunnerGrpcClient', () => {

expect(touchStub).toHaveBeenCalledTimes(1);
expect(captured.metadata).toBeInstanceOf(Metadata);

const headers: Record<string, string> = {};
const metadataMap = captured.metadata?.getMap() ?? {};
for (const [key, value] of Object.entries(metadataMap)) {
headers[key] = Buffer.isBuffer(value) ? value.toString('utf8') : String(value);
}

const verification = verifyAuthHeaders({
headers,
method: 'POST',
path: RUNNER_SERVICE_TOUCH_WORKLOAD_PATH,
body: '',
secret: 'test-secret',
nonceCache: new NonceCache(),
});
expect(verification.ok).toBe(true);
expect(Object.keys(captured.metadata?.getMap() ?? {})).toHaveLength(0);
});

it('sanitizes infra details from gRPC errors', async () => {
const client = new RunnerGrpcClient({ address: 'grpc://runner', sharedSecret: 'secret' });
const client = new RunnerGrpcClient({ address: 'grpc://runner' });
const error = Object.assign(new Error('Deadline exceeded after 305.002s,LB pick: 0.001s,remote_addr=172.21.0.3:50051'), {
code: status.DEADLINE_EXCEEDED,
details: 'Deadline exceeded after 305.002s,LB pick: 0.001s,remote_addr=172.21.0.3:50051',
@@ -89,7 +72,6 @@ describe('RunnerGrpcExecClient', () => {
);
const execClient = new RunnerGrpcExecClient({
address: 'grpc://runner',
sharedSecret: 'secret',
client: { exec: execStub } as unknown as RunnerServiceGrpcClientInstance,
});

Original file line number Diff line number Diff line change
@@ -4,16 +4,14 @@ import { afterAll, beforeAll, describe, expect, it } from 'vitest';

import { RunnerGrpcClient } from '../src/infra/container/runnerGrpc.client';
import {
RUNNER_SECRET,
hasTcpDocker,
runnerAddressMissing,
runnerSecretMissing,
socketMissing,
startDockerRunner,
type RunnerHandle,
} from './helpers/docker.e2e';

const shouldSkip = process.env.SKIP_RUNNER_EXEC_E2E === '1' || runnerAddressMissing || runnerSecretMissing;
const shouldSkip = process.env.SKIP_RUNNER_EXEC_E2E === '1' || runnerAddressMissing;
const describeOrSkip = shouldSkip || (socketMissing && !hasTcpDocker) ? describe.skip : describe;

describeOrSkip('runner gRPC exec cancellation integration', () => {
@@ -23,7 +21,7 @@ describeOrSkip('runner gRPC exec cancellation integration', () => {

beforeAll(async () => {
runner = await startDockerRunner();
dockerClient = new RunnerGrpcClient({ address: runner.grpcAddress, sharedSecret: RUNNER_SECRET });
dockerClient = new RunnerGrpcClient({ address: runner.grpcAddress });
}, 120_000);

afterAll(async () => {
1 change: 0 additions & 1 deletion packages/platform-server/__tests__/vitest.setup.ts
Original file line number Diff line number Diff line change
@@ -3,4 +3,3 @@ import 'reflect-metadata';
process.env.LITELLM_BASE_URL ||= 'http://127.0.0.1:4000';
process.env.LITELLM_MASTER_KEY ||= 'sk-dev-master-1234';
process.env.CONTEXT_ITEM_NULL_GUARD ||= '0';
process.env.DOCKER_RUNNER_SHARED_SECRET ||= 'test-shared-secret';
Original file line number Diff line number Diff line change
@@ -4,15 +4,13 @@ import { RunnerGrpcClient } from '../src/infra/container/runnerGrpc.client';
import type { ContainerRegistry } from '../src/infra/container/container.registry';
import { DockerWorkspaceRuntimeProvider } from '../src/workspace/providers/docker.workspace.provider';

const RUNNER_SECRET_OVERRIDE = process.env.DOCKER_RUNNER_SHARED_SECRET_OVERRIDE;
const RUNNER_SECRET = RUNNER_SECRET_OVERRIDE ?? process.env.DOCKER_RUNNER_SHARED_SECRET;
const RUNNER_ADDRESS_OVERRIDE = process.env.DOCKER_RUNNER_GRPC_ADDRESS;
const RUNNER_HOST = process.env.DOCKER_RUNNER_GRPC_HOST ?? process.env.DOCKER_RUNNER_HOST;
const RUNNER_PORT = process.env.DOCKER_RUNNER_GRPC_PORT ?? process.env.DOCKER_RUNNER_PORT;

const resolvedRunnerAddress =
RUNNER_ADDRESS_OVERRIDE ?? (RUNNER_HOST && RUNNER_PORT ? `${RUNNER_HOST}:${RUNNER_PORT}` : undefined);
const shouldRunTests = Boolean(RUNNER_SECRET && resolvedRunnerAddress);
const shouldRunTests = Boolean(resolvedRunnerAddress);
const TEST_IMAGE = 'ghcr.io/agynio/devcontainer:latest';
const THREAD_ID = `grpc-exec-${Date.now()}`;
const TEST_TIMEOUT_MS = 30_000;
@@ -30,7 +28,7 @@ const describeRunner = shouldRunTests ? describe : describe.skip;

describeRunner('DockerWorkspaceRuntimeProvider exec over gRPC runner', () => {
beforeAll(async () => {
runnerClient = new RunnerGrpcClient({ address: resolvedRunnerAddress!, sharedSecret: RUNNER_SECRET! });
runnerClient = new RunnerGrpcClient({ address: resolvedRunnerAddress! });
provider = new DockerWorkspaceRuntimeProvider(runnerClient, registry);

const ensure = await provider.ensureWorkspace(