docker

Drives the docker CLI on a remote host over SSH. All operations are shell commands sent through ssh -o BatchMode=yes -o StrictHostKeyChecking=accept-new — there is no Docker API client.
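
As a rough illustration of the transport described above, the sketch below assembles a local ssh argv in Python. The helper name `ssh_argv` is hypothetical and not part of stratum; only the two `-o` options come from the text.

```python
def ssh_argv(host: str, remote_command: str) -> list[str]:
    """Build the local argv that runs one shell command on `host` (hypothetical helper)."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                     # never stop to prompt for a password
        "-o", "StrictHostKeyChecking=accept-new",  # accept unseen host keys, refuse changed ones
        host,
        remote_command,                            # a single string, interpreted by the remote shell
    ]
```

Because the remote command travels as one string through the remote shell, any argument containing spaces or metacharacters has to be escaped before it is placed in that string.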

The provider takes no configuration block.

Update strategy

docker_container updates by recreating the container: docker rm -f <name>, followed by docker pull <image> and docker run. There is no in-place update, so a change to any attribute triggers a recreate. Networks are reconciled by name; the create path is idempotent (docker network inspect ... || docker network create ...), so re-applying without changes is a no-op.

docker_image is a producer kind: it runs docker build on the host and records the resulting image_id. A change to build_args, context, dockerfile, target, or pull_base triggers a rebuild. A change to tag itself also recreates (the new tag won't share an image ID with the old one). See docker_image below.

Kinds

docker_network

Ensures a user-defined network exists on the host.

resource "docker_network" "edge" {
  host = host.primary.addr
  name = "stratum-edge"
}

| attr | required | type | default | description |
|---|---|---|---|---|
| host | yes | string | | SSH target. |
| name | no | string | resource name | Docker network name. Falls back to the resource label. |
| driver | no | string | bridge | Network driver passed to --driver. |

The create/update path runs docker network inspect <name> >/dev/null 2>&1 || docker network create --driver <driver> <name>, so applying twice is safe.
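
A minimal sketch of how that idempotent line could be assembled, assuming the name and driver are shell-quoted with Python's `shlex.quote`. The function name `network_ensure_cmd` is hypothetical, and the quoting step is an assumption rather than something the text states.

```python
import shlex


def network_ensure_cmd(name: str, driver: str = "bridge") -> str:
    """Return an inspect-or-create shell line for a user-defined network (hypothetical helper)."""
    n = shlex.quote(name)    # assumption: values are quoted before crossing SSH
    d = shlex.quote(driver)
    return (
        f"docker network inspect {n} >/dev/null 2>&1 "
        f"|| docker network create --driver {d} {n}"
    )
```

The `||` means the create half only runs when inspect fails, which is why a second apply leaves an existing network untouched.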

Delete runs docker network rm <name>.

Drift detection: read runs docker network inspect <name> --format '{{json .}}'. Returns Absent on No such network, otherwise Present { host, name, driver }. The inspect output is parsed for Name and Driver.
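
The read step could look roughly like the following. This is a sketch under stated assumptions: `parse_network_inspect` is a hypothetical name, `None` stands in for Absent, a plain dict stands in for Present, and the handling of other failures is invented for illustration.

```python
import json


def parse_network_inspect(host: str, stdout: str, stderr: str, exit_code: int):
    """Map `docker network inspect` output to None (Absent) or a dict (Present)."""
    if exit_code != 0:
        if "No such network" in stderr:
            return None  # Absent
        # Assumption: any other failure is surfaced rather than treated as Absent.
        raise RuntimeError(f"network inspect failed: {stderr.strip()}")
    data = json.loads(stdout)
    # Only Name and Driver are read from the inspect document.
    return {"host": host, "name": data["Name"], "driver": data["Driver"]}
```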

docker_container

A long-running container.

resource "docker_container" "traefik" {
  host    = host.primary.addr
  name    = "traefik"
  image   = "traefik:v3.1"
  restart = "unless-stopped"
  env = {
    NODE_ENV = "production"
  }
  ports    = ["80:80", "443:443"]
  volumes  = ["/var/run/docker.sock:/var/run/docker.sock:ro"]
  networks = ["stratum-edge"]
  labels = {
    "traefik.enable" = "true"
  }
  command = "--api.dashboard=true"
}

| attr | required | type | default | description |
|---|---|---|---|---|
| host | yes | string | | SSH target. |
| image | yes | string | | Image reference. Pulled before run unless pull = false. |
| name | no | string | resource name | Container name (--name). Falls back to the resource label. |
| restart | no | string | unless-stopped | Passed to --restart. |
| pull | no | bool | true | If false, skip the docker pull step. Use for locally-built images that aren't in any registry (see pull = false below). |
| env | no | map | none | Each entry becomes -e KEY=VALUE. Non-string values are JSON-stringified. A value that resolves from secret.<name>.value flows through as a normal env var; state stores only a redaction marker. See Secrets. |
| ports | no | list of string | none | Each string becomes -p <spec>. Use the standard host:container[/proto]. |
| volumes | no | list of string | none | Each string becomes -v <spec>. |
| labels | no | map | none | Each entry becomes -l KEY=VALUE. Dotted keys (Traefik) must be quoted. |
| networks | no | list of string | none | Each string becomes --network <name>. |
| command | no | string or list of string | none | Appended after the image. String form is split on whitespace; list form preserves each element as one argv token (see command). |
| memory | no | string | none | Hard memory limit, passed to --memory. Same syntax as docker (256m, 1g). |
| memory_swap | no | string | none | Total memory + swap limit, passed to --memory-swap. See docker's docs for the swap-disable / swap-unlimited shorthands. |
| healthcheck | no | map | none | Lowered to --health-* flags on docker run. See healthcheck below. |
| depends_on | no | list of string | none | Resource addresses (<kind>.<name>) this container depends on. The planner topo-sorts before applying. See depends_on below. |

The constructed command is:

docker pull <image> >/dev/null; docker run -d --name <name> --restart <restart> [...flags...] <image> [<command tokens>]

On update it becomes:

docker rm -f <name> >/dev/null 2>&1 || true; docker pull <image> >/dev/null; docker run -d ...

Stored state preserves the input attrs verbatim and adds container_id (the last line of docker run's stdout, which is the container ID).
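
To make the lowering concrete, here is a hedged Python sketch that builds the create and recreate lines from a dict of attrs. The function name `container_command`, the order of the flags, and the use of `shlex.quote` are assumptions; command tokens and healthcheck flags are left out to keep it short.

```python
import json
import shlex


def container_command(name: str, attrs: dict, recreate: bool = False) -> str:
    """Assemble the shell line for a docker_container create or recreate (sketch)."""
    run = ["docker", "run", "-d", "--name", name,
           "--restart", attrs.get("restart", "unless-stopped")]
    for key, value in attrs.get("env", {}).items():
        # Non-string values are JSON-stringified, e.g. True becomes "true".
        text = value if isinstance(value, str) else json.dumps(value)
        run += ["-e", f"{key}={text}"]
    for spec in attrs.get("ports", []):
        run += ["-p", spec]
    for spec in attrs.get("volumes", []):
        run += ["-v", spec]
    for key, value in attrs.get("labels", {}).items():
        run += ["-l", f"{key}={value}"]
    for net in attrs.get("networks", []):
        run += ["--network", net]
    if "memory" in attrs:
        run += ["--memory", attrs["memory"]]
    if "memory_swap" in attrs:
        run += ["--memory-swap", attrs["memory_swap"]]
    run.append(attrs["image"])

    steps = []
    if recreate:
        steps.append(f"docker rm -f {shlex.quote(name)} >/dev/null 2>&1 || true")
    if attrs.get("pull", True):  # pull = false drops this step entirely
        steps.append(f"docker pull {shlex.quote(attrs['image'])} >/dev/null")
    steps.append(" ".join(shlex.quote(token) for token in run))
    return "; ".join(steps)
```

With `recreate=True` the same run line is prefixed by the forced remove, which matches the "no in-place update" behaviour described earlier.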

Delete runs docker rm -f <name>.

pull = false: locally-built images

When pull = false, the docker pull <image> >/dev/null; prefix is dropped from both the create and the recreate command. Use it when the image is built directly on the host (a sibling ssh_exec runs docker build -t myapp:dev ...) so there is no registry to pull from. With the default pull = true, docker pull myapp:dev errors and the create/update fails.

docker run itself still errors loudly if the image isn't present on the host, so a typo in image is caught at apply time — there's no silent fallback to a stale image.

For a locally-built image, prefer an immutable SHA-derived tag over :latest so rebuilds recreate the container through the normal plan diff. See Vars.

resource "ssh_exec" "build-app" {
  host    = host.primary.addr
  command = "cd /srv/repos/app && docker build -t app:dev -f Dockerfile ."
}

resource "docker_container" "app" {
  host  = host.primary.addr
  image = "app:dev"
  pull  = false
  # ...
}

command: string or argv list

# String form — split on whitespace at apply time.
command = "node server.js --port 4000"

# List form — each element is one argv token. Spaces inside an element are
# preserved (the third element below is a single shell line).
command = ["sh", "-c", "redis-server --requirepass ${secret.redis_pw.value}"]

Use the list form when an argument contains whitespace, embedded quotes, or other shell metacharacters. Each list element is shell-escaped independently when the run command crosses the SSH boundary, so spaces inside one element do not split it into two arguments.

A non-string, non-list value (a number, a bool, a map) errors at apply time.
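
The two forms can be sketched as follows. The names `command_tokens` and `escape_for_ssh` are hypothetical, the error text is invented, and `shlex.quote` stands in for whatever escaping stratum actually applies.

```python
import shlex


def command_tokens(command) -> list[str]:
    """Turn the `command` attr into argv tokens (sketch of the rules above)."""
    if command is None:
        return []
    if isinstance(command, str):
        return command.split()  # string form: split on whitespace
    if isinstance(command, list) and all(isinstance(t, str) for t in command):
        return list(command)    # list form: one element, one argv token
    raise TypeError("command must be a string or a list of strings")


def escape_for_ssh(tokens: list[str]) -> str:
    """Quote each token independently so spaces inside one token survive the remote shell."""
    return " ".join(shlex.quote(t) for t in tokens)


print(escape_for_ssh(command_tokens(["sh", "-c", "redis-server --requirepass pw"])))
# prints: sh -c 'redis-server --requirepass pw'
```

The string form would have split the third element into three separate tokens, which is why the list form is the safer choice for `sh -c` lines.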

healthcheck

A map lowered to --health-* flags on docker run. test is the only required field; everything else has a docker-side default or is sensible to omit.

resource "docker_container" "cache" {
  host  = host.primary.addr
  image = "redis:7-alpine"
  healthcheck = {
    test         = "redis-cli ping | grep -q PONG"
    interval     = "5s"
    retries      = 5
    timeout      = "3s"
    start_period = "10s"
  }
}

| field | required | type | lowering | default |
|---|---|---|---|---|
| test | yes | string | --health-cmd <value> | |
| interval | no | string | --health-interval <v> | docker default (30s) |
| retries | no | number | --health-retries <v> | docker default (3) |
| timeout | no | string | --health-timeout <v> | 30s |
| start_period | no | string | --health-start-period <v> | 0s |

Declaring healthcheck opts the container into the post-apply readiness wait: apply will not move on to dependent steps until docker inspect --format '{{.State.Health.Status}}' reports healthy, with a 60s budget.

A healthcheck map without a test field is a hard error at apply time:

`<name>` healthcheck missing required field `test`
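
A small sketch of the lowering and the missing-`test` check, assuming the map arrives as a Python dict. The function name `healthcheck_flags` is hypothetical; the flag names and the error text are the ones given above.

```python
def healthcheck_flags(name: str, healthcheck: dict) -> list[str]:
    """Lower a healthcheck map to --health-* flags for docker run (sketch)."""
    if "test" not in healthcheck:
        raise ValueError(f"`{name}` healthcheck missing required field `test`")
    flags = ["--health-cmd", str(healthcheck["test"])]
    optional = (
        ("interval", "--health-interval"),
        ("retries", "--health-retries"),
        ("timeout", "--health-timeout"),
        ("start_period", "--health-start-period"),
    )
    for field, flag in optional:
        if field in healthcheck:  # omitted fields fall back to docker's own defaults
            flags += [flag, str(healthcheck[field])]
    return flags
```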

depends_on and the post-apply wait

resource "docker_container" "api" {
  host       = host.primary.addr
  image      = "api:dev"
  depends_on = ["docker_container.db", "docker_container.cache"]
}

Each entry must be a <kind>.<name> resource address. The planner uses these edges in three places:

  1. Topo sort of create / update steps. api is reordered after both of its dependencies, regardless of file order. The sort is stable: resources with no edges keep their relative input order, and implicit _stratum_* resources (per-host swap, sshd OOM tuning) fall through with in_degree = 0 and stay at the front.
  2. Cycle detection. A cycle is a hard error at plan time, with the cycle path in the message.
  3. Reverse-topo delete order. When api and db are both removed from config, api is deleted first (reverse topo over the state-resident edges): the dependent goes down before its dependency.

A missing reference is a hard error at plan time, naming both addresses:

depends_on edge: `docker_container.api` references unknown resource `docker_container.db`

depends_on edges only resolve within a single namespace. If a producer and consumer end up in different namespaces, duplicate the producer into the consumer's namespace — see Cross-namespace depends_on.
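
The ordering rules above can be sketched in a few lines of Python. This is not the planner's code: `plan_order` and `delete_order` are hypothetical names, the cycle message lists the stuck resources rather than the exact cycle path, and the simple repeated scan is chosen for readability over speed.

```python
def plan_order(resources: list[str], depends_on: dict[str, list[str]]) -> list[str]:
    """Stable topological order: always place the earliest resource whose deps are placed."""
    known = set(resources)
    for addr in resources:
        for dep in depends_on.get(addr, []):
            if dep not in known:
                raise ValueError(
                    f"depends_on edge: `{addr}` references unknown resource `{dep}`"
                )
    ordered: list[str] = []
    placed: set[str] = set()
    while len(ordered) < len(resources):
        for addr in resources:  # scanning in input order keeps the sort stable
            if addr not in placed and all(d in placed for d in depends_on.get(addr, [])):
                ordered.append(addr)
                placed.add(addr)
                break
        else:
            # Nothing could be placed, so the remaining resources form a cycle.
            stuck = [a for a in resources if a not in placed]
            raise ValueError("depends_on cycle among: " + ", ".join(stuck))
    return ordered


def delete_order(resources: list[str], depends_on: dict[str, list[str]]) -> list[str]:
    """Deletes walk the create order backwards, so a dependent goes before its dependency."""
    return list(reversed(plan_order(resources, depends_on)))
```

Resources with no edges are always ready, so they keep their input positions relative to each other, which is how a leading implicit resource stays at the front.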

Post-apply readiness wait. After every successful docker_container create or update, the planner pauses before moving on:

  • If healthcheck is declared, it polls docker inspect --format '{{.State.Health.Status}}' <name> once a second for up to 60s, waiting for healthy. A status of unhealthy or a timeout fails the apply with a named error. An empty status (no healthcheck configured at the docker level) is treated as ready immediately.
  • Otherwise, a cosmetic 500ms pause gives docker time to wire networks and volumes before the next step pokes the container.

This wait is what makes depends_on actually useful — a dependent container starts against a healthy dependency, not just a running one.
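
A hedged sketch of that wait loop follows. The `inspect_status` callable is a stand-in for running the docker inspect command over SSH, the `sleep` and `clock` parameters exist only to make the sketch testable, and the error messages are invented.

```python
import time


def wait_ready(name, inspect_status, has_healthcheck,
               budget_s=60.0, sleep=time.sleep, clock=time.monotonic):
    """Block until the container is considered ready, or raise (sketch)."""
    if not has_healthcheck:
        sleep(0.5)  # short pause so networks and volumes can settle
        return
    deadline = clock() + budget_s
    while True:
        status = inspect_status(name).strip()
        if status in ("", "healthy"):  # empty: no healthcheck at the docker level
            return
        if status == "unhealthy":
            raise RuntimeError(f"container `{name}` reported unhealthy")
        if clock() >= deadline:
            raise TimeoutError(f"container `{name}` not healthy after {budget_s:.0f}s")
        sleep(1.0)  # poll once a second
```

Intermediate states such as `starting` simply fall through to the next poll.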

Drift detection: read runs docker inspect <name> --format '{{json .}}'. Returns Absent on No such object / No such container, otherwise Present. The parsed shape is { host, name, image, restart, labels, networks, container_id }.

Two normalization rules in parse_container_inspect keep observed labels from being noisy:

  • All com.docker.* labels (compose metadata, etc.) are dropped.
  • Remaining labels are intersected with the state's label key set, so image-baked LABELs and daemon-injected labels don't surface as drift.

Networks come from NetworkSettings.Networks, sorted lexicographically. (diff_observed treats string arrays as sets, so order changes don't drift either.)
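
The two label rules and the network ordering can be illustrated like this. It is a sketch, not the body of parse_container_inspect; the function names and the dict-based inputs are assumptions.

```python
def normalize_labels(observed: dict[str, str], state_labels: dict[str, str]) -> dict[str, str]:
    """Keep only labels that stratum itself declared, minus docker's own metadata."""
    return {
        key: value
        for key, value in observed.items()
        if not key.startswith("com.docker.")  # rule 1: drop com.docker.* labels
        and key in state_labels               # rule 2: intersect with the state's key set
    }


def observed_networks(inspect_doc: dict) -> list[str]:
    """Return attached network names from NetworkSettings.Networks, sorted."""
    networks = (inspect_doc.get("NetworkSettings") or {}).get("Networks") or {}
    return sorted(networks)  # iterating a dict yields its keys
```

A label whose key is in state but whose value changed still passes through, so a genuine edit made out of band continues to show up as drift.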

--refresh and rebuilt images

A running container keeps the image SHA it started with even after the image tag is re-pointed by a rebuild, so comparing the container's own .Image against the tag tells you nothing. To catch a rebuilt-but-unchanged-tag image as drift, read resolves what the tag currently points to (docker images --no-trunc --format '{{.ID}}' <image>) and compares it against the image_id stratum recorded at the last apply. A mismatch surfaces as an image_id change under plan --refresh.

This only runs when prior state already tracks an image_id (so legacy state doesn't drift spuriously) and the tag still resolves on the host. It is the secondary safety net for the mutable-:latest case. The primary recommended pattern is an immutable SHA-derived tag built with a var: the tag changes with the source commit, so the container's image attr changes and the ordinary plan diff recreates it without any --refresh.
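
As an illustration of the two guards, the comparison might be sketched as below. The function name `image_id_drift` and the shape of the returned change record are hypothetical; `resolved_id` stands for the stdout of the docker images lookup.

```python
def image_id_drift(prior_state: dict, resolved_id: str):
    """Return a change record when the tag now resolves to a different image, else None."""
    recorded = prior_state.get("image_id")
    current = resolved_id.strip()
    if not recorded:
        return None  # legacy state never tracked image_id, so there is nothing to compare
    if not current:
        return None  # the tag no longer resolves on the host
    if current != recorded:
        return {"image_id": {"from": recorded, "to": current}}
    return None
```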

docker_image

Builds an image on a remote host from a context directory. Producer kind: state captures the resulting image_id so drift can be detected when the image is rebuilt or removed underneath stratum.

resource "docker_image" "api" {
  host       = host.primary.addr
  context    = "/srv/repos/api"
  dockerfile = "Dockerfile"
  target     = "runner"
  tag        = "api:dev"
  pull_base  = true
  build_args = {
    NODE_ENV = "production"
    API_URL  = "https://api.example.com"
  }
}

| attr | required | type | default | description |
|---|---|---|---|---|
| host | yes | string | | SSH target. The build runs on this host's docker daemon. |
| tag | yes | string | | Image tag (-t) and the lookup key for read. |
| context | yes | string | | Absolute path on the host containing the Dockerfile and build context. Stratum cds into it before docker build. |
| dockerfile | no | string | Dockerfile | Dockerfile filename, passed to -f. |
| target | no | string | none | Multi-stage build target, passed to --target. |
| build_args | no | map | none | Each entry becomes --build-arg KEY=VALUE. Keys are sorted alphabetically for deterministic command output. |
| pull_base | no | bool | false | If true, passes --pull so docker re-pulls the base image instead of reusing a cached one. |

The build line is:

cd <context> && DOCKER_BUILDKIT=1 docker build [--target T] [--build-arg K=V ...] [--pull] -t <tag> -f <dockerfile> .

DOCKER_BUILDKIT=1 is always set. The host must have the buildx plugin available (docker buildx).
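
For a concrete picture of the build line, the sketch below assembles it from a dict of attrs. The function name `build_line` is hypothetical and the `shlex.quote` escaping is an assumption; the flag order follows the template above.

```python
import shlex


def build_line(attrs: dict) -> str:
    """Assemble the remote build command for a docker_image resource (sketch)."""
    q = shlex.quote
    parts = ["DOCKER_BUILDKIT=1", "docker", "build"]
    if attrs.get("target"):
        parts += ["--target", q(attrs["target"])]
    build_args = attrs.get("build_args", {})
    for key in sorted(build_args):  # sorted keys give a deterministic command
        parts += ["--build-arg", q(f"{key}={build_args[key]}")]
    if attrs.get("pull_base", False):
        parts.append("--pull")
    parts += ["-t", q(attrs["tag"]), "-f", q(attrs.get("dockerfile", "Dockerfile")), "."]
    return f"cd {q(attrs['context'])} && " + " ".join(parts)
```

Sorting the build args means two configs that differ only in map ordering produce the same command string.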

After a successful build, stratum runs docker images --no-trunc --format '{{.ID}}' <tag> and records the full image ID as both image_id and id in state (the id alias is reserved for future resource-attr refs of the form docker_image.X.id).

Update strategy. docker_image does not have a separate update path — both create and update run the same build. The desired-vs-prior diff drives whether the build runs at all: a change to any of context, dockerfile, target, build_args, pull_base, or tag produces an Update step that re-runs docker build. A drift-detected change to image_id (image deleted or rebuilt out of band) also re-runs the build.

Drift detection: read runs docker images --no-trunc --format '{{.ID}}' <tag>. Empty output → Absent. Otherwise Present { host, tag, image_id, id, ...prior fields }. The build-time fields (context, dockerfile, target, build_args, pull_base) are echoed forward from prior state — docker doesn't preserve them post-build, and re-querying them is impossible. Drift on those is detected at plan time via desired-vs-prior diff, not by read.
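
A minimal sketch of that read step, assuming prior state is a dict and using `None` for Absent. The function name `read_image` is hypothetical.

```python
def read_image(prior_state: dict, stdout: str):
    """Map the docker images lookup to None (Absent) or an observed dict (Present)."""
    ids = stdout.split()
    if not ids:
        return None  # empty output: the tag does not resolve on the host
    observed = dict(prior_state)  # build-time fields are echoed forward from prior state
    observed["image_id"] = ids[0]
    observed["id"] = ids[0]       # alias of image_id
    return observed
```

Because context, dockerfile, target, build_args, and pull_base are copied rather than observed, this function can never report drift on them; only image_id can differ from prior state.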

Delete runs docker rmi <tag> best-effort.

Apply behavior

Trace lines per resource:

[docker] NETWORK `stratum-edge` on root@192.0.2.10
[docker] IMAGE `api:dev` on root@192.0.2.10 (context=/srv/repos/api)
[docker] CONTAINER `traefik` on root@192.0.2.10 (create)
[docker] CONTAINER `traefik` on root@192.0.2.10 (recreate)