"shim disconnected" error from dind container causing long initialization #620

@jgrabenstein

We have the Codefresh runtime installed with Helm (chart cf-runtime-8.3.1 from artifacthub.io) on a GKE cluster (1.32.6-gke.1060000).

Sometimes our pipeline runs in Codefresh spend an extremely long time in the "initializing process" (validating connection to docker daemon) step. The step seems to retry automatically after 10+ minutes, but the retry also fails and it retries again. Other times, the pipelines run normally. This is causing a lot of frustration and delay.

In the logs, I see the dind container emitting some messages that seem related:

time="2025-09-15T15:39:08.919776421Z" level=info msg="shim disconnected" id=a3e103d17bb7421d6178970f350e8d703cbc505297c4e50d63bceca1bc4ae471 namespace=moby
time="2025-09-15T15:39:08.919838401Z" level=warning msg="cleaning up after shim disconnected" id=a3e103d17bb7421d6178970f350e8d703cbc505297c4e50d63bceca1bc4ae471 namespace=moby
time="2025-09-15T15:39:08.919851111Z" level=info msg="cleaning up dead shim" namespace=moby
time="2025-09-15T15:39:08.919968911Z" level=info msg="ignoring event" container=a3e103d17bb7421d6178970f350e8d703cbc505297c4e50d63bceca1bc4ae471 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
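
To check how often this happens across a longer log capture, the "shim disconnected" events can be pulled out with a small script. This is only a minimal sketch: it assumes every line uses the same key=value layout as the excerpt above, and the function names `parse_logfmt` and `shim_disconnects` are hypothetical helpers, not part of any Codefresh or Docker tooling.

```python
import shlex


def parse_logfmt(line):
    """Parse one key=value log line (values may be double-quoted) into a dict."""
    fields = {}
    try:
        tokens = shlex.split(line)
    except ValueError:
        # Unbalanced quotes: treat the line as unparseable and skip it.
        return fields
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def shim_disconnects(lines):
    """Return (time, container id) for each line whose msg is exactly 'shim disconnected'."""
    events = []
    for line in lines:
        fields = parse_logfmt(line)
        if fields.get("msg") == "shim disconnected":
            events.append((fields.get("time"), fields.get("id")))
    return events
```

Feeding it the lines of a saved dind log (for example `shim_disconnects(open(path))`, where `path` is wherever the log was saved) gives a list of timestamps and container IDs that can then be compared against the container ID in the engine's "no such container" errors.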

There is also this message from the engine container:

fn: inspect, attempt: 1 failed with: Error: (HTTP code 404) no such container - No such container: bcea459935492295070c1c12933ac385f6d17a6838cbdba2a7d6dc9cd169dcd3  (transaction: 82932591-72b8-4a1e-a920-0085922e2828). throwing error
