docker

Drives the docker CLI on a remote host over SSH. All operations are shell commands sent through ssh -o BatchMode=yes -o StrictHostKeyChecking=accept-new — there is no Docker API client.

The provider takes no configuration block.

Update strategy

docker_container updates are recreates: docker rm -f <name>, followed by docker pull <image> and docker run. There is no in-place update; a change to any attribute triggers a recreate. Networks are reconciled by name; the create path is idempotent (docker network inspect ... || docker network create ...), so re-applying without changes is a no-op.

docker_image is a producer kind: it runs docker build on the host and records the resulting image_id. Drift on build_args / context / dockerfile / target / pull_base triggers a rebuild. Drift on tag itself recreates (the new tag won't share an image ID with the old one). See docker_image below.

Kinds

`docker_network`

Ensures a user-defined network exists on the host.

resource "docker_network" "edge" {
  host = host.primary.addr
  name = "stratum-edge"
}

attr	required	type	default	description
`host`	yes	string	—	SSH target.
`name`	no	string	resource name	Docker network name. Falls back to the resource label.
`driver`	no	string	`bridge`	Network driver passed to `--driver`.

The create/update path runs docker network inspect <name> >/dev/null 2>&1 || docker network create --driver <driver> <name>, so applying twice is safe.
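As a rough illustration of why that line is safe to re-run, here is a minimal Python sketch that assembles the same inspect-or-create shell line. The function name `network_ensure_command` and the use of `shlex.quote` are assumptions for illustration, not stratum's actual code.

```python
import shlex


def network_ensure_command(name: str, driver: str = "bridge") -> str:
    """Build the inspect-or-create line (hypothetical helper).

    `docker network create` only runs when `docker network inspect`
    exits non-zero, so a second apply does nothing.
    """
    n = shlex.quote(name)
    d = shlex.quote(driver)
    return (
        f"docker network inspect {n} >/dev/null 2>&1 "
        f"|| docker network create --driver {d} {n}"
    )


print(network_ensure_command("stratum-edge"))
```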

Delete runs docker network rm <name>.

Drift detection: read runs docker network inspect <name> --format '{{json .}}'. Returns Absent on No such network, otherwise Present { host, name, driver }. The inspect output is parsed for Name and Driver.
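The read step above could be sketched roughly as follows. This is an assumption-laden illustration: `parse_network_inspect` is a hypothetical name, `None` stands in for Absent, and a plain dict stands in for Present { host, name, driver }.

```python
import json


def parse_network_inspect(host: str, stdout: str, stderr: str):
    """Map `docker network inspect --format '{{json .}}'` output to observed state.

    Hypothetical helper: returns None for Absent, a dict for Present.
    """
    if "No such network" in stderr:
        return None
    data = json.loads(stdout)
    # Only Name and Driver are read; everything else in the inspect JSON is ignored.
    return {"host": host, "name": data["Name"], "driver": data["Driver"]}
```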

`docker_container`

A long-running container.

resource "docker_container" "traefik" {
  host    = host.primary.addr
  name    = "traefik"
  image   = "traefik:v3.1"
  restart = "unless-stopped"
  env = {
    NODE_ENV = "production"
  }
  ports    = ["80:80", "443:443"]
  volumes  = ["/var/run/docker.sock:/var/run/docker.sock:ro"]
  networks = ["stratum-edge"]
  labels = {
    "traefik.enable" = "true"
  }
  command = "--api.dashboard=true"
}

attr	required	type	default	description
`host`	yes	string	—	SSH target.
`image`	yes	string	—	Image reference. Pulled before run unless `pull = false`.
`name`	no	string	resource name	Container name (`--name`). Falls back to the resource label.
`restart`	no	string	`unless-stopped`	Passed to `--restart`.
`pull`	no	bool	`true`	If `false`, skip the `docker pull` step. Use for locally-built images that aren't in any registry (see `pull = false` below).
`env`	no	map	none	Each entry becomes `-e KEY=VALUE`. Non-string values are JSON-stringified. A value that resolves from `secret.<name>.value` flows through as a normal env var; state stores only a redaction marker. See Secrets.
`ports`	no	list of string	none	Each string becomes `-p <spec>`. Use the standard `host:container[/proto]`.
`volumes`	no	list of string	none	Each string becomes `-v <spec>`.
`labels`	no	map	none	Each entry becomes `-l KEY=VALUE`. Dotted keys (Traefik) must be quoted.
`networks`	no	list of string	none	Each string becomes `--network <name>`.
`command`	no	string | list of string	none	Appended after the image. String form is split on whitespace; list form preserves each element as one argv token (see `command`).
`memory`	no	string	none	Hard memory limit, passed to `--memory`. Same syntax as docker (`256m`, `1g`).
`memory_swap`	no	string	none	Total memory + swap limit, passed to `--memory-swap`. See docker's docs for the swap-disable / swap-unlimited shorthands.
`healthcheck`	no	map	none	Lowered to `--health-*` flags on `docker run`. See `healthcheck` below.
`depends_on`	no	list of string	none	Resource addresses (`<kind>.<name>`) this container depends on. The planner topo-sorts before applying. See `depends_on` below.

The constructed command is:

docker pull <image> >/dev/null; docker run -d --name <name> --restart <restart> [...flags...] <image> [<command tokens>]

On update it becomes:

docker rm -f <name> >/dev/null 2>&1 || true; docker pull <image> >/dev/null; docker run -d ...

Stored state preserves the input attrs verbatim and adds container_id (the last line of docker run's stdout, which is the container ID).

Delete runs docker rm -f <name>.
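To make the create and recreate command lines concrete, here is a minimal Python sketch of how they might be assembled from a resource's attrs. The function name `build_run_command`, the flag ordering, and the use of `shlex` for escaping are all assumptions; `memory`, `healthcheck`, and `command` are left out to keep it short.

```python
import json
import shlex


def build_run_command(attrs: dict, recreate: bool = False) -> str:
    """Assemble the pull/run line, with the rm prefix on recreate (hypothetical)."""
    name, image = attrs["name"], attrs["image"]
    argv = ["docker", "run", "-d", "--name", name,
            "--restart", attrs.get("restart", "unless-stopped")]
    for key, value in attrs.get("env", {}).items():
        # Non-string env values are JSON-stringified.
        text = value if isinstance(value, str) else json.dumps(value)
        argv += ["-e", f"{key}={text}"]
    for spec in attrs.get("ports", []):
        argv += ["-p", spec]
    for spec in attrs.get("volumes", []):
        argv += ["-v", spec]
    for key, value in attrs.get("labels", {}).items():
        argv += ["-l", f"{key}={value}"]
    for net in attrs.get("networks", []):
        argv += ["--network", net]
    argv.append(image)

    parts = []
    if recreate:
        parts.append(f"docker rm -f {shlex.quote(name)} >/dev/null 2>&1 || true")
    if attrs.get("pull", True):
        # Dropped entirely when pull = false.
        parts.append(f"docker pull {shlex.quote(image)} >/dev/null")
    parts.append(shlex.join(argv))
    return "; ".join(parts)


print(build_run_command({"name": "traefik", "image": "traefik:v3.1",
                         "ports": ["80:80"]}))
```

Each token is escaped separately so that values containing spaces survive the SSH hop as a single argument.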

`pull = false`: locally-built images

When pull = false, the docker pull <image> >/dev/null; prefix is dropped from both the create and the recreate command. Use it when the image is built directly on the host (a sibling ssh_exec runs docker build -t myapp:dev ...) so there is no registry to pull from. With the default pull = true, docker pull myapp:dev errors and the create/update fails.

docker run itself still errors loudly if the image isn't present on the host, so a typo in image is caught at apply time — there's no silent fallback to a stale image.

For a locally-built image, prefer an immutable SHA-derived tag over :latest so rebuilds recreate the container through the normal plan diff. See Vars.

resource "ssh_exec" "build-app" {
  host    = host.primary.addr
  command = "cd /srv/repos/app && docker build -t app:dev -f Dockerfile ."
}

resource "docker_container" "app" {
  host  = host.primary.addr
  image = "app:dev"
  pull  = false
  # ...
}

`command`: string or argv list

# String form — split on whitespace at apply time.
command = "node server.js --port 4000"

# List form — each element is one argv token. Spaces inside an element are
# preserved (the third element below is a single shell line).
command = ["sh", "-c", "redis-server --requirepass ${secret.redis_pw.value}"]

Use the list form when an argument contains whitespace, embedded quotes, or other shell metacharacters. Each list element is shell-escaped independently when the run command crosses the SSH boundary, so spaces inside one element do not split it into two arguments.

A non-string, non-list value (a number, a bool, a map) errors at apply time.
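The two forms can be illustrated with a small Python sketch. `command_tokens` is a hypothetical name, and `shlex.join` stands in for whatever escaping stratum actually applies before the command crosses SSH.

```python
import shlex


def command_tokens(value) -> list:
    """Normalize the `command` attr into argv tokens (hypothetical helper)."""
    if value is None:
        return []
    if isinstance(value, str):
        # String form: split on whitespace.
        return value.split()
    if isinstance(value, list) and all(isinstance(v, str) for v in value):
        # List form: each element stays one argv token.
        return list(value)
    raise TypeError(f"command must be a string or list of strings, got {type(value).__name__}")


tokens = command_tokens(["sh", "-c", "redis-server --requirepass hunter2"])
print(shlex.join(tokens))
```

In the list form the third element is quoted as a whole, so `sh -c` receives it as one argument.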

`healthcheck`

A map lowered to --health-* flags on docker run. test is the only required field; everything else has a docker-side default or is sensible to omit.

resource "docker_container" "cache" {
  host  = host.primary.addr
  image = "redis:7-alpine"
  healthcheck = {
    test         = "redis-cli ping | grep -q PONG"
    interval     = "5s"
    retries      = 5
    timeout      = "3s"
    start_period = "10s"
  }
}

field	required	type	lowering	default
`test`	yes	string	`--health-cmd <value>`	—
`interval`	no	string	`--health-interval <v>`	docker default (30s)
`retries`	no	number	`--health-retries <v>`	docker default (3)
`timeout`	no	string	`--health-timeout <v>`	docker default (30s)
`start_period`	no	string	`--health-start-period <v>`	docker default (0s)

Declaring healthcheck opts the container into the post-apply readiness wait: apply will not move on to dependent steps until docker inspect --format '{{.State.Health.Status}}' reports healthy, with a 60s budget.

A healthcheck map without a test field is a hard error at apply time:

`<name>` healthcheck missing required field `test`
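A minimal Python sketch of the lowering, under the assumption that omitted fields emit no flag at all and docker's own defaults apply. The function name `healthcheck_flags` is hypothetical; the error text mirrors the message above.

```python
def healthcheck_flags(name: str, healthcheck: dict) -> list:
    """Lower a healthcheck map to `docker run` flags (hypothetical helper)."""
    if "test" not in healthcheck:
        raise ValueError(f"`{name}` healthcheck missing required field `test`")
    flags = ["--health-cmd", str(healthcheck["test"])]
    optional = (
        ("interval", "--health-interval"),
        ("retries", "--health-retries"),
        ("timeout", "--health-timeout"),
        ("start_period", "--health-start-period"),
    )
    for field, flag in optional:
        if field in healthcheck:
            # Assumption: omitted fields produce no flag, so docker's default applies.
            flags += [flag, str(healthcheck[field])]
    return flags


print(healthcheck_flags("cache", {"test": "redis-cli ping | grep -q PONG",
                                  "interval": "5s", "retries": 5}))
```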

`depends_on` and the post-apply wait

resource "docker_container" "api" {
  host       = host.primary.addr
  image      = "api:dev"
  depends_on = ["docker_container.db", "docker_container.cache"]
}

Each entry must be a <kind>.<name> resource address. The planner uses these edges in three places:

Topo sort of create / update steps. api is reordered after both of its dependencies, regardless of file order. The sort is stable: resources with no edges keep their relative input order, and implicit _stratum_* resources (per-host swap, sshd OOM tuning) fall through with in_degree = 0 and stay at the front.
Cycle detection. A cycle is a hard error at plan time, with the cycle path in the message.
Reverse-topo delete order. When api and db are both removed from config, api is deleted first (reverse topo over the state-resident edges): the dependent goes down before its dependency.

A missing reference is a hard error at plan time, naming both addresses:

depends_on edge: `docker_container.api` references unknown resource `docker_container.db`
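The ordering rules can be sketched with Kahn's algorithm, using a min-heap of input positions so that unconstrained resources keep their file order. This is an illustration only: `plan_order` is a hypothetical name, and unlike the real planner it reports the resources left unsorted rather than a cycle path.

```python
import heapq


def plan_order(addresses: list, depends_on: dict) -> list:
    """Return a stable create/update order (hypothetical sketch)."""
    index = {addr: i for i, addr in enumerate(addresses)}
    in_degree = {addr: 0 for addr in addresses}
    dependents = {addr: [] for addr in addresses}
    for addr in addresses:
        for dep in depends_on.get(addr, []):
            if dep not in index:
                raise ValueError(
                    f"depends_on edge: `{addr}` references unknown resource `{dep}`"
                )
            in_degree[addr] += 1
            dependents[dep].append(addr)

    # Ready set keyed by input position keeps the sort stable.
    ready = [index[a] for a in addresses if in_degree[a] == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        addr = addresses[heapq.heappop(ready)]
        order.append(addr)
        for child in dependents[addr]:
            in_degree[child] -= 1
            if in_degree[child] == 0:
                heapq.heappush(ready, index[child])

    if len(order) != len(addresses):
        stuck = [a for a in addresses if in_degree[a] > 0]
        raise ValueError("depends_on cycle involving: " + ", ".join(stuck))
    return order


order = plan_order(
    ["docker_container.api", "docker_container.db", "docker_container.cache"],
    {"docker_container.api": ["docker_container.db", "docker_container.cache"]},
)
print(order)                  # create / update order
print(list(reversed(order)))  # delete order: dependents first
```

Walking the same order backwards gives a delete sequence in which the dependent goes down before its dependency.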

depends_on edges only resolve within a single namespace. If a producer and consumer end up in different namespaces, duplicate the producer into the consumer's namespace — see Cross-namespace depends_on.

Post-apply readiness wait. After every successful docker_container create or update, the planner pauses before moving on:

If healthcheck is declared, it polls docker inspect --format '{{.State.Health.Status}}' <name> once a second for up to 60s, waiting for healthy. A status of unhealthy or a timeout fails the apply with a named error. An empty status (no healthcheck configured at the docker level) is treated as ready immediately.
Otherwise, a cosmetic 500ms pause gives docker time to wire networks and volumes before the next step pokes the container.

This wait is what makes depends_on actually useful — a dependent container starts against a healthy dependency, not just a running one.
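The wait logic might look roughly like the following Python sketch. The function name `wait_ready`, the injected `read_status` callable (standing in for the docker inspect call over SSH), and the error wording are all assumptions for illustration.

```python
import time


def wait_ready(name, read_status, has_healthcheck,
               budget_s=60.0, poll_s=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Block until the container is considered ready (hypothetical sketch).

    read_status(name) is assumed to return the .State.Health.Status string.
    """
    if not has_healthcheck:
        # No healthcheck declared: short fixed pause only.
        sleep(0.5)
        return
    deadline = clock() + budget_s
    while True:
        status = read_status(name).strip()
        if status in ("healthy", ""):
            # Empty status means docker has no healthcheck; treat as ready.
            return
        if status == "unhealthy":
            raise RuntimeError(f"container `{name}` reported unhealthy")
        if clock() >= deadline:
            raise TimeoutError(
                f"container `{name}` not healthy after {budget_s:.0f}s"
            )
        sleep(poll_s)
```

Passing `sleep` and `clock` as parameters is a choice made here so the loop can be exercised without real delays.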

Drift detection: read runs docker inspect <name> --format '{{json .}}'. Returns Absent on No such object / No such container, otherwise Present. The parsed shape is { host, name, image, restart, labels, networks, container_id }.

Two normalization rules in parse_container_inspect keep observed labels from being noisy:

All com.docker.* labels (compose metadata, etc.) are dropped.
Remaining labels are intersected with the state's label key set, so image-baked LABELs and daemon-injected labels don't surface as drift.

Networks come from NetworkSettings.Networks, sorted lexicographically. (diff_observed treats string arrays as sets, so order changes don't drift either.)
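The normalization rules above can be sketched in a few lines of Python. The names `normalize_labels` and `normalize_networks` are hypothetical and are not the real `parse_container_inspect`; they only show the filtering described.

```python
def normalize_labels(observed: dict, state_labels: dict) -> dict:
    """Drop com.docker.* labels, then keep only keys tracked in state."""
    return {
        key: value
        for key, value in observed.items()
        if not key.startswith("com.docker.") and key in state_labels
    }


def normalize_networks(network_settings: dict) -> list:
    """Return the attached network names, sorted lexicographically."""
    return sorted(network_settings.get("Networks") or {})


observed = {
    "traefik.enable": "true",
    "com.docker.compose.project": "edge",       # compose metadata, dropped
    "org.opencontainers.image.title": "Traefik",  # image-baked, not in state
}
print(normalize_labels(observed, {"traefik.enable": "true"}))
print(normalize_networks({"Networks": {"stratum-edge": {}, "bridge": {}}}))
```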

`--refresh` and rebuilt images

A running container keeps the image SHA it started with even after the image tag is re-pointed by a rebuild, so comparing the container's own .Image against the tag tells you nothing. To catch a rebuilt-but-unchanged-tag image as drift, read resolves what the tag currently points to (docker images --no-trunc --format '{{.ID}}' <image>) and compares it against the image_id stratum recorded at the last apply. A mismatch surfaces as an image_id change under plan --refresh.

This only runs when prior state already tracks an image_id (so legacy state doesn't drift spuriously) and the tag still resolves on the host. It is the secondary safety net for the mutable-:latest case. The primary recommended pattern is an immutable SHA-derived tag built with a var: the tag changes with the source commit, so the container's image attr changes and the ordinary plan diff recreates it without any --refresh.
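The comparison and its two guards can be summarized in a short Python sketch. `image_id_drift` is a hypothetical name; `resolved_id` stands for the stdout of the docker images call described above.

```python
def image_id_drift(prior_state: dict, resolved_id: str):
    """Return (recorded, current) when the tag now points at a different image.

    Returns None when there is nothing to compare or nothing changed.
    """
    recorded = prior_state.get("image_id")
    current = resolved_id.strip()
    if not recorded:
        # Legacy state without image_id: skip, so it does not drift spuriously.
        return None
    if not current:
        # Tag no longer resolves on the host: this check does not apply.
        return None
    return (recorded, current) if recorded != current else None


print(image_id_drift({"image_id": "sha256:aaa"}, "sha256:bbb\n"))
```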

`docker_image`

Builds an image on a remote host from a context directory. Producer kind: state captures the resulting image_id so drift can be detected when the image is rebuilt or removed underneath stratum.

resource "docker_image" "api" {
  host       = host.primary.addr
  context    = "/srv/repos/api"
  dockerfile = "Dockerfile"
  target     = "runner"
  tag        = "api:dev"
  pull_base  = true
  build_args = {
    NODE_ENV = "production"
    API_URL  = "https://api.example.com"
  }
}

attr	required	type	default	description
`host`	yes	string	—	SSH target. The build runs on this host's docker daemon.
`tag`	yes	string	—	Image tag (`-t`) and the lookup key for `read`.
`context`	yes	string	—	Absolute path on the host containing the Dockerfile and build context. Stratum `cd`s into it before `docker build`.
`dockerfile`	no	string	`Dockerfile`	Dockerfile filename, passed to `-f`.
`target`	no	string	none	Multi-stage build target, passed to `--target`.
`build_args`	no	map	none	Each entry becomes `--build-arg KEY=VALUE`. Keys are sorted alphabetically for deterministic command output.
`pull_base`	no	bool	`false`	If true, passes `--pull` so docker re-pulls the base image instead of reusing a cached one.

The build line is:

cd <context> && DOCKER_BUILDKIT=1 docker build [--target T] [--build-arg K=V ...] [--pull] -t <tag> -f <dockerfile> .

DOCKER_BUILDKIT=1 is always set. The host must have the buildx plugin available (docker buildx).
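For illustration, the build line can be assembled along these lines in Python. `build_line` is a hypothetical name and the `shlex` escaping is an assumption; the flag order follows the template above, with build args sorted by key.

```python
import shlex


def build_line(attrs: dict) -> str:
    """Assemble the docker build shell line for a docker_image (hypothetical)."""
    argv = ["docker", "build"]
    if attrs.get("target"):
        argv += ["--target", attrs["target"]]
    build_args = attrs.get("build_args", {})
    for key in sorted(build_args):
        # Sorted keys keep the command deterministic across runs.
        argv += ["--build-arg", f"{key}={build_args[key]}"]
    if attrs.get("pull_base", False):
        argv.append("--pull")
    argv += ["-t", attrs["tag"], "-f", attrs.get("dockerfile", "Dockerfile"), "."]
    context = shlex.quote(attrs["context"])
    return f"cd {context} && DOCKER_BUILDKIT=1 {shlex.join(argv)}"


print(build_line({
    "context": "/srv/repos/api",
    "target": "runner",
    "tag": "api:dev",
    "pull_base": True,
    "build_args": {"NODE_ENV": "production", "API_URL": "https://api.example.com"},
}))
```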

After a successful build, stratum runs docker images --no-trunc --format '{{.ID}}' <tag> and records the full image ID as both image_id and id in state (the id alias is reserved for future resource-attr refs of the form docker_image.X.id).

Update strategy. docker_image does not have a separate update path — both create and update run the same build. The desired-vs-prior diff drives whether the build runs at all: a change to any of context, dockerfile, target, build_args, pull_base, or tag produces an Update step that re-runs docker build. A drift-detected change to image_id (image deleted or rebuilt out of band) also re-runs the build.

Drift detection: read runs docker images --no-trunc --format '{{.ID}}' <tag>. Empty output → Absent. Otherwise Present { host, tag, image_id, id, ...prior fields }. The build-time fields (context, dockerfile, target, build_args, pull_base) are echoed forward from prior state — docker doesn't preserve them post-build, and re-querying them is impossible. Drift on those is detected at plan time via desired-vs-prior diff, not by read.

Delete runs docker rmi <tag> best-effort.

Apply behavior

Trace lines per resource:

[docker] NETWORK `stratum-edge` on root@192.0.2.10
[docker] IMAGE `api:dev` on root@192.0.2.10 (context=/srv/repos/api)
[docker] CONTAINER `traefik` on root@192.0.2.10 (create)
[docker] CONTAINER `traefik` on root@192.0.2.10 (recreate)