Introduction

stratum is a tiny declarative IaC tool, written from scratch in Rust. It's scoped to system bootstrap: install packages, drop config files, manage systemd services, configure the firewall, and run system-tier containers (Traefik, monitoring agents, log shippers). Describe resources in a .strat file, run stratum plan to preview, and stratum apply -y to make it happen. State lives in a JSON file at .stratum/state.json, next to your config.

The tool is intentionally small. No plugin system, no remote backend, no cloud SDK. Providers are first-class crates in the workspace; today there are four:

system — packages, services, files, secret files, directories, ufw rules.
ssh — shells out to system ssh to run commands and write files.
docker — drives the remote docker CLI over SSH (networks, containers, image builds).
git — clones and pins git working trees on a remote host.

What stratum is not

Stratum is not an app-deployment tool. It will not build your application image, manage per-app environments, or rotate deploys. Stratum's role stops at "the host has docker, traefik, a firewall, and the right config files." Reach for a per-app deploy tool from there.

What works today

.strat config: host, secret, provider, resource blocks; nested maps, lists, string/number/bool values; comments (# and //); refs of the form host.<name>.<field> and secret.<name>.value; ${...} string interpolation for embedding refs inside larger strings.
JSON state at .stratum/state.json, with create / update / delete actions and a recursive structural diff that ignores state-only provider-computed fields.
CLI: plan (with --refresh for live drift detection and --allow-unresolved-secrets for plan-only review), apply (with -y to execute and --allow-destroy to permit deletes), status (per-host resource snapshot), state list, state show, state merge. Global --env-file for loading env vars before resolving secret { from_env } refs, with auto-load of ./.env when no flag is passed.
Repeatable -c on plan/apply: multiple .strat files merge into one document and evaluate together. Hosts and secrets declared in one file are visible to refs in another. Duplicates across files are hard errors that name both paths. See Multi-file configs.
Namespaces: a top-level namespace "<name>" { configs = [...] } block declares a deployable slice with its own state file. The global -n NAME flag selects a namespace; the manifest's shared host / secret / provider blocks are visible to every namespace's configs. Cross-namespace docker_container port and name collisions are caught at plan time. Implicit per-host tuning resources are kept in a shared state file (.stratum/_shared.json) so multiple namespaces sharing a host don't fight over them. Bundle mode (no -n) is unchanged.
system_package, system_service, system_file, system_secret_file, system_dir, system_ufw_rule (apt + systemd + ufw + file + secret-file + directory-tree management). system_dir also has an empty-dir mode for pre-creating daemon directories without uploading anything. system_secret_file stores sha256 only — plaintext never persists in state.
ssh_exec (with optional env map for sensitive shell prefixes), ssh_file (run commands, write files).
docker_network, docker_container, docker_image. Containers support depends_on (planner topo-sorts), healthcheck (post-apply readiness wait), memory / memory_swap limits, list-form command for argv passthrough, and pull = false for locally-built images. docker_image builds images on the host with DOCKER_BUILDKIT=1 and tracks the resulting image_id.
git_repo (new git provider) clones a remote repo to a fixed path on a host and pins it to a branch, tag, or full SHA. State tracks commit_sha; drift triggers fetch + reset.
Secrets v0: secret { from_env = ... } or from_file = ..., referenced as secret.<name>.value or ${secret.<name>.value} inside strings. Plaintext flows to providers; state stores a {__secret, __secret_sha256} marker for whole-leaf matches and a <secret:NAME:sha256:HEX> inline marker for substring matches inside interpolated strings; diff/diff_observed are marker-aware so no perpetual drift; plan output renders object markers as <secret:name sha:abc123>. See Secrets.
Planner-side validators: docker_container.ports conflict check across the merged config (fails on (host, ip, host_port) collisions, with 0.0.0.0:N / 127.0.0.1:N symmetry); depends_on topo sort with cycle and unknown-ref detection.
Post-apply readiness wait: a docker_container with a healthcheck map blocks subsequent steps until docker inspect reports healthy (60s budget). Otherwise a 500ms cosmetic pause.
Provider::read on system_* (except system_ufw_rule), ssh_file, the docker_* kinds, and git_repo: surfaces drift between recorded state and live host reality.
Post-apply self-check: every successful stratum apply -y re-reads each resource and reports post-apply drift: clean (or counts of differ/missing/unreadable).
content_file = "<relative-path>" on system_file and source_dir = "<relative-path>" on system_dir, resolved against the .strat file's directory.
Destruction guard: stratum apply refuses to run a plan with Delete steps unless --allow-destroy is passed. Error names the resources, the loaded configs, and the state file path.
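
The docker_container.ports conflict check listed above can be pictured with a short Python sketch. It is not stratum's Rust code. The port-string forms (HOST_PORT:CONTAINER_PORT and IP:HOST_PORT:CONTAINER_PORT), the function names, and the reading of the symmetry rule (a bind on 0.0.0.0 collides with any specific IP on the same port, in either order) are assumptions made for illustration.

```python
from itertools import combinations

def parse_port(spec):
    """Split a docker-style port string into (ip, host_port).

    Assumed forms: "HOST:CONTAINER" or "IP:HOST:CONTAINER".
    """
    parts = spec.split(":")
    if len(parts) == 2:
        return ("0.0.0.0", int(parts[0]))  # no IP: bind on all interfaces
    if len(parts) == 3:
        return (parts[0], int(parts[1]))
    raise ValueError(f"unsupported port spec: {spec!r}")

def bindings_collide(a, b):
    """Two (ip, port) bindings collide when ports match and the IPs overlap."""
    (ip_a, port_a), (ip_b, port_b) = a, b
    if port_a != port_b:
        return False
    return ip_a == ip_b or "0.0.0.0" in (ip_a, ip_b)

def find_port_conflicts(containers):
    """containers: list of dicts with "name", "host", and "ports" keys."""
    claims = [
        (c["host"], parse_port(p), c["name"])
        for c in containers
        for p in c.get("ports", [])
    ]
    conflicts = []
    for (host_a, bind_a, name_a), (host_b, bind_b, name_b) in combinations(claims, 2):
        if host_a == host_b and bindings_collide(bind_a, bind_b):
            conflicts.append((name_a, name_b, bind_a[1]))
    return conflicts
```

Under these assumptions, a container publishing 8080:80 and another publishing 127.0.0.1:8080:9000 on the same host are reported as a conflict, while the same ports on two different hosts are not.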

What does not work yet

No drift detection for system_ufw_rule or ssh_exec — both return Unknown from read, so they show up in unreadable counts (intentional).
No DNS provider.
No partial / targeted apply — every plan is whole-config.
No state locking or remote backend.
No cross-resource refs of the form <kind>.<name>.<attr> (e.g. docker_image.X.id) — producer-to-consumer wiring still uses the literal tag + a depends_on edge. Planned, not shipped.
The provider "<name>" { ... } block parses but no shipped provider reads one today.

Quick start

cargo build --release

# write a config
cat > stratum.strat <<'EOF'
host "primary" {
  addr = "root@192.0.2.10"
}

resource "system_package" "curl" {
  host = host.primary.addr
  name = "curl"
}
EOF

# see what would happen
./target/release/stratum plan

# preview again with live drift annotation
./target/release/stratum plan --refresh

# apply for real
./target/release/stratum apply -y

State is written to .stratum/state.json. Inspect it with stratum state list and stratum state show.

Where to go next

Bootstrap a fresh droplet — end-to-end walkthrough: blank Ubuntu 24.04 → ufw + docker + traefik in one apply.
Inject a secret into a docker container — the env-var-from-shell-to-container pattern, with rotation.
Multi-namespace deployments — split one host into independent deployable slices.
The .strat language — full grammar reference.
String interpolation — embed ${host.X.field} and ${secret.Y.value} inside larger strings.
Multi-file configs — one state per host, cross-file refs, duplicate handling.
Namespaces — manifest blocks, per-namespace state, cross-namespace conflict checks.
Secrets — the secret block, refs, redaction, the honesty guard.
Providers — what each kind does and the exact attribute schema.
Architecture — how plan, apply, and drift detection fit together.

The `.strat` language

A .strat file is a flat list of top-level blocks. Five kinds exist:

host "<name>" { ... } — a named SSH target. See Hosts.
secret "<name>" { ... } — a sensitive value sourced from env or file, referenced as secret.<name>.value. See Secrets.
provider "<name>" { ... } — provider configuration. See Providers.
resource "<kind>" "<name>" { ... } — a piece of declared infra. See Resources.
namespace "<name>" { ... } — a deployable slice; lists the .strat files that apply together against one dedicated state file. Only meaningful in a top-level manifest. See Namespaces.

document    ::= block*
block       ::= ident label* "{" body "}"
label       ::= string
body        ::= ( attr | block )*
attr        ::= ident "=" value
value       ::= string | number | bool | list | map | ref
list        ::= "[" ( value ( "," value )* ","? )? "]"
map         ::= "{" ( ( ident | string ) "=" value )* "}"
ref         ::= ident ( "." ident )+

Whitespace is insignificant. Block bodies may also contain nested blocks; those are folded into the parent map under the key <kind> (or <kind>_<label> when labels are present).

Comments

Both # and // start a line comment. There are no block comments.

# this is a comment
// so is this
resource "ssh_exec" "uptime" {
  host    = host.prod.addr  # inline comments work too
  command = "uptime"
}

Evaluation order

The evaluator runs four passes:

Hosts first. All host blocks are evaluated with an empty scope, so they must be made of literals only. Any ref inside a host block is a hard error.
Secrets next. All secret blocks are evaluated, sources resolved eagerly. Secrets are also literal-only — refs inside secret bodies error.
Namespaces. All namespace blocks are collected. Body attributes (configs, optional state) must be literal — no refs allowed. The namespaces don't take part in resource evaluation; they only inform the CLI's -n NAME resolution.
Providers and resources. With the host + secret scope built, provider and resource bodies are evaluated and any host.<name>.<field> / secret.<name>.value reference is resolved.

See References & scope for the resolution rules.

Resource kind naming

A resource kind must start with the provider name, separated by an underscore: for example system_package, ssh_exec, docker_container. The prefix is what stratum uses to route resources to a provider. A kind with no underscore, or one that starts with an underscore (an empty prefix), is rejected.

Values

Strings, numbers, booleans, lists, and maps. See Types & values for the exact rendering rules — in particular the integer-vs-float behavior for numbers.

Strings may embed ${<ref>} placeholders that are substituted at config-load time. See String interpolation.

Hosts

A host block names an SSH target. Other blocks reference its fields via host.<name>.<field>.

host "prod" {
  addr = "root@192.0.2.10"
  port = 22
}

host_block ::= "host" string "{" attr* "}"

The single label is the host's name and must be unique within the document. The body is normally a flat list of attributes; nested blocks are accepted but rare in practice.

Literal-only

Host blocks are evaluated in pass 1 with an empty scope. They cannot reference anything — not other hosts, not providers, not themselves. Any ref inside a host body errors with references not allowed inside host blocks.

host "prod" {
  addr = host.staging.addr   # error: hosts must be literal
}

Fields

There is no schema for host fields — anything you put in a host body is accessible by ref. The conventional fields used by the SSH and Docker providers are:

field	type	notes
`addr`	string	`user@host` form passed verbatim to the `ssh` binary.
`port`	number	Currently unused by the providers; reserved for future use.

In the providers shipped today, only addr is consumed; SSH connection options come from your ~/.ssh/config and ssh-agent. The port field is parsed but not yet wired through.

Multiple hosts

Declare as many as you need. Each one is independent.

host "prod"    { addr = "root@192.0.2.10" }
host "staging" { addr = "root@5.6.7.8" }

resource "ssh_exec" "uptime_prod" {
  host    = host.prod.addr
  command = "uptime"
}
resource "ssh_exec" "uptime_staging" {
  host    = host.staging.addr
  command = "uptime"
}

Secrets

A secret "<name>" { ... } block sources a sensitive value from outside the .strat file and makes it referenceable as secret.<name>.value. Plaintext flows into provider attrs at apply time; state stores a redaction marker, never the value itself.

secret "pg_password" {
  from_env = "PG_PASSWORD"
}

resource "docker_container" "db" {
  host  = host.primary.addr
  image = "postgres:16-alpine"
  env = {
    POSTGRES_PASSWORD = secret.pg_password.value
  }
}

secret_block ::= "secret" string "{" attr* "}"

The single label is the secret's name. Names must be unique within the document — duplicates across -c files error with duplicate secret.

Sources

Exactly one of from_env or from_file is required. Setting both, or neither, is a hard error (BadSecretBody).

attr	type	description
`from_env`	string	Name of an environment variable. Resolved with `std::env::var` at config-load time.
`from_file`	string	Path to a file. `~` and `~/` expand to `$HOME` (or `$USERPROFILE` on Windows). Relative paths resolve against the `.strat` file's directory. The file's contents are loaded with `std::fs::read_to_string`; one trailing `\n` or `\r` is trimmed.
`sensitive`	bool	Default `true`. When `false`, the resolved value is still used by providers but is never placed in the redaction map — CLI output and state will show it in the clear. Opt-out for values you don't mind printing.

Both sources resolve eagerly at config-load time. A missing env var or unreadable file is a hard error unless --allow-unresolved-secrets is set on plan.
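
The source rules above can be sketched in Python as follows. This is an illustration, not stratum's Rust code: the function name and signature are hypothetical, os.path.expanduser stands in for the tilde expansion (it also expands ~user forms, which the text does not mention), and the trimming is read as "remove one trailing \n or \r character".

```python
import os

class BadSecretBody(Exception):
    pass

class SecretEnvMissing(Exception):
    pass

class SecretFileMissing(Exception):
    pass

def trim_one_trailing_newline(text):
    # Assumption: exactly one trailing "\n" or "\r" character is removed.
    if text.endswith(("\n", "\r")):
        return text[:-1]
    return text

def resolve_secret(body, base_dir, env=os.environ):
    """Resolve a secret body ({"from_env": ...} or {"from_file": ...})."""
    has_env = "from_env" in body
    has_file = "from_file" in body
    if has_env == has_file:  # both set, or neither set
        raise BadSecretBody("exactly one of from_env or from_file is required")
    if has_env:
        try:
            return env[body["from_env"]]
        except KeyError:
            raise SecretEnvMissing(body["from_env"]) from None
    path = os.path.expanduser(body["from_file"])
    if not os.path.isabs(path):
        path = os.path.join(base_dir, path)  # relative to the .strat file's dir
    try:
        with open(path, encoding="utf-8", newline="") as fh:
            return trim_one_trailing_newline(fh.read())
    except OSError:
        raise SecretFileMissing(path) from None
```

The --allow-unresolved-secrets soft-failure path is left out here to keep the sketch short.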

References

The only allowed field is value:

env = { POSTGRES_PASSWORD = secret.pg_password.value }

Anything else (secret.pg.fingerprint, secret.pg, secret.pg.value.foo) errors with unknown secret field or reference ... too short.

Like host blocks, secret bodies are literal-only: any ref inside (including refs to other secrets) errors with ``references not allowed inside `secret` blocks``.

Redaction

When a secret is resolved and meets all of these conditions, its plaintext is added to a private redaction map:

sensitive = true (the default).
The resolved value is at least 8 characters long.
The value is non-empty.
The secret is not in the unresolved-placeholder state (see below).

The redaction walk runs over every plan step's desired and prior values before printing, and over every provider's returned attrs before they're persisted to state. Any leaf string that exactly matches a known plaintext is replaced with a marker object:

{
  "__secret": "pg_password",
  "__secret_sha256": "sha256:f7c3bc1d808e04..."
}

The marker is what lives in .stratum/state.json. Re-loading state and re-planning produces the same marker (no re-redaction needed — redact_walk is idempotent on markers).
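
A minimal Python sketch of the exact-match walk, assuming the redaction map is a plain plaintext-to-name dictionary and that the digest is the SHA-256 of the UTF-8 bytes of the value. The helper names other than redact_walk are hypothetical, and the unresolved-placeholder condition is omitted.

```python
import hashlib

MIN_REDACT_LEN = 8  # shorter values are not added to the redaction map

def build_redaction_map(secrets):
    """secrets: name -> (plaintext, sensitive). Returns plaintext -> name."""
    redaction = {}
    for name, (plaintext, sensitive) in secrets.items():
        if sensitive and len(plaintext) >= MIN_REDACT_LEN:
            redaction[plaintext] = name
    return redaction

def marker_for(name, plaintext):
    digest = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return {"__secret": name, "__secret_sha256": f"sha256:{digest}"}

def redact_walk(value, redaction):
    """Replace every leaf string that exactly equals a known plaintext."""
    if isinstance(value, dict):
        if "__secret" in value and "__secret_sha256" in value:
            return value  # already a marker: left untouched, so the walk is idempotent
        return {k: redact_walk(v, redaction) for k, v in value.items()}
    if isinstance(value, list):
        return [redact_walk(v, redaction) for v in value]
    if isinstance(value, str) and value in redaction:
        return marker_for(redaction[value], value)
    return value
```

Running redact_walk a second time over its own output returns the same structure, which mirrors the idempotence property described above.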

Substring redaction

The exact-match case covers env = { POSTGRES_PASSWORD = secret.pg.value } — the leaf string equals the plaintext, so the whole leaf is replaced with the object marker. A secret ref inside a ${...} interpolation (see String interpolation) is different: the plaintext is glued into a larger string at evaluation time, so the redaction walk sees "postgresql://app:CORRECTHORSEBATTERY@db:5432/app", not the bare secret.

For those cases, redact_walk falls back to substring replacement. Every known secret plaintext that appears in the leaf is replaced inline with a marker token of the form <secret:NAME:sha256:HEX>. Longest match wins (so overlapping secrets stay deterministic), and replacement is per-occurrence. The substituted string is what lands in state:

{
  "env": {
    "DATABASE_URL": "postgresql://app:<secret:pg:sha256:f7c3bc...>@db:5432/app"
  }
}
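
The substring fallback might look like the following Python sketch. It assumes the same plaintext-to-name map as the exact-match case and a full-length hex digest in the token; the function names are hypothetical. A single regex pass with the alternatives sorted longest-first is one way to get "longest match wins"; whether stratum does it this way is not stated in the text.

```python
import hashlib
import re

def inline_marker(name, plaintext):
    digest = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return f"<secret:{name}:sha256:{digest}>"

def redact_substrings(leaf, redaction):
    """redaction: plaintext -> secret name. Returns leaf with inline markers."""
    if not redaction:
        return leaf
    # Longest alternatives first: at any given start position the longest
    # known plaintext is tried before a shorter one it contains.
    ordered = sorted(redaction, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(p) for p in ordered))
    return pattern.sub(
        lambda m: inline_marker(redaction[m.group(0)], m.group(0)),
        leaf,
    )
```

Because the replacement happens in one pass, text inside an already-inserted marker token is never rescanned.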

Substring redaction also runs over diff_observed output: when state holds the inline marker and the live host returns plaintext, the marker is applied to the observed side before comparison, so both sides collapse to the same string and the diff disappears. Without this, every plan --refresh would emit a spurious Update for every interpolated secret-bearing field. The post-redaction equality check happens in Extracted::redact_plan, which the CLI calls on both plan and apply before printing.

A short value (< 8 chars) still resolves and flows into provider attrs — it's just not added to the redaction map, because substring-substitution on short strings ("root", "5432") has a high false-positive rate. The CLI emits a warning to stderr in that case:

[secrets] warning: secret `s` resolved value is <8 chars; CLI/state will not redact it

If two distinct secrets resolve to the same plaintext, you get a different warning — the marker is ambiguous because there's no way to tell which secret a given plaintext leaf came from:

[secrets] warning: secrets `a` and `b` resolved to the same value — redaction marker may be ambiguous

Plan output

Secret-bearing fields render with a 6-char hash prefix:

 ~ docker_container.db
      ~ env.POSTGRES_PASSWORD: <secret:pg_password sha:f7c3bc> -> <secret:pg_password sha:9a1e44>

The hash is enough signal to tell that the value changed without leaking the value or a full attackable digest.
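
As a rough illustration, the rendering can be derived from the stored marker object like this. The function names are hypothetical, and the sketch assumes the stored digest carries the sha256: prefix shown in the state example.

```python
def render_marker(marker, prefix_len=6):
    """Render a state marker as <secret:NAME sha:PREFIX> for plan output."""
    digest = marker["__secret_sha256"]
    if digest.startswith("sha256:"):
        digest = digest[len("sha256:"):]
    return f"<secret:{marker['__secret']} sha:{digest[:prefix_len]}>"

def render_change(path, old, new):
    """Render one changed secret-bearing field."""
    return f"~ {path}: {render_marker(old)} -> {render_marker(new)}"
```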

Drift detection

diff_observed is marker-aware:

Marker (state) vs plaintext (observed) — hash the plaintext, compare to the marker's __secret_sha256. Match → no drift.
Marker vs marker — compare hashes directly.
Mismatch → emit a single <secret> -> <secret-drifted> change, with no plaintext on either side.

This is what stops --refresh from showing a perpetual drift on every secret-bearing field. Without marker awareness, every refresh would compare the state-side marker object against the host-side plaintext and flag a difference.
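
The three rules above can be sketched in Python for a single leaf. The function names are hypothetical and the digest format is assumed to match the state marker (sha256: followed by the hex digest of the UTF-8 value).

```python
import hashlib

def is_marker(value):
    return isinstance(value, dict) and "__secret" in value and "__secret_sha256" in value

def sha256_tag(plaintext):
    return "sha256:" + hashlib.sha256(plaintext.encode("utf-8")).hexdigest()

def diff_secret_leaf(state_value, observed_value):
    """Return None when there is no drift, else a plaintext-free change pair."""
    if not is_marker(state_value):
        raise ValueError("state side must be a secret marker")
    if is_marker(observed_value):
        observed_hash = observed_value["__secret_sha256"]  # marker vs marker
    else:
        observed_hash = sha256_tag(str(observed_value))    # marker vs plaintext
    if observed_hash == state_value["__secret_sha256"]:
        return None
    return ("<secret>", "<secret-drifted>")
```

Neither branch ever returns the observed plaintext, so a drifted secret can be reported without leaking it.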

The honesty guard

Some resource attrs receive opaque blobs that stratum can't substring-redact after the fact: file contents, file paths that may contain content interpolation, directory uploads. Embedding a secret ref in any of these is rejected at config-load time:

kind	forbidden attr
`system_file`	`content`
`system_file`	`content_file`
`system_dir`	`source_dir`

resource "system_file" "x" {
  content = secret.s.value   # error: SecretInUnsupportedAttr
}

The error message is ``secret reference not allowed in `system_file.content` — secrets must be in single-leaf string attrs (e.g. inside `env`), not embedded in file content or path strings``.
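
A sketch of how such a check can work on unevaluated attrs, written in Python for illustration. The value model is an assumption: a bare ref is represented as {"ref": "..."} and a string keeps its ${...} placeholders. SECRET_FORBIDDEN_ATTRS mirrors the table above; the other names are hypothetical.

```python
import re

SECRET_FORBIDDEN_ATTRS = {
    ("system_file", "content"),
    ("system_file", "content_file"),
    ("system_dir", "source_dir"),
}

# ${...} placeholders that are not escaped as \${
PLACEHOLDER = re.compile(r"(?<!\\)\$\{([^}]*)\}")

class SecretInUnsupportedAttr(Exception):
    pass

def refs_in(value):
    """Return the refs used by one unevaluated attr value."""
    if isinstance(value, dict) and "ref" in value:
        return [value["ref"]]
    if isinstance(value, str):
        return PLACEHOLDER.findall(value)
    return []

def check_honesty_guard(kind, attrs):
    """Reject secret refs in forbidden (kind, attr) pairs, by attr name."""
    for attr, value in attrs.items():
        if (kind, attr) not in SECRET_FORBIDDEN_ATTRS:
            continue
        if any(ref.startswith("secret.") for ref in refs_in(value)):
            raise SecretInUnsupportedAttr(
                f"secret reference not allowed in `{kind}.{attr}`"
            )
```

The check keys on the attr name, so a bare ref and an interpolated ref in the same attr are rejected alike.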

This is a deliberate limitation. The right shape for a config-file secret is a templated config rendered outside stratum (or via a future system_template kind that knows about secret boundaries) — not a secret embedded inside a content blob whose redaction story is "search the file for the plaintext, hope you find it."

For the common case of "drop a whole secret blob on the host" (a Firebase service-account JSON, an .env file, an age-encrypted key), use system_secret_file. Its content attribute accepts a secret ref directly — state stores sha256 plus permissions, never the plaintext.

resource "system_secret_file" "firebase-sa" {
  host    = host.primary.addr
  path    = "/etc/app/firebase-sa.json"
  content = secret.firebase_sa.value
  mode    = "0400"
}

`--allow-unresolved-secrets`

stratum plan --allow-unresolved-secrets treats a missing env var or unreadable file as a soft failure: the secret's value becomes the placeholder string <unresolved-secret:NAME> instead of erroring out. Useful when reviewing someone else's config without their env populated.

The placeholder flows through eval_value like any other string, so it shows up in plan output wherever the secret was referenced. Apply refuses to run a plan containing any placeholder:

refusing to apply: plan contains unresolved-secret placeholder for `pg_password`
hint: this only happens via `plan --allow-unresolved-secrets`; set the secret's source and retry.

Placeholders are not added to the redaction map.
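
The placeholder flow can be sketched in Python as below. Only the from_env source is shown, the function names are hypothetical, and plan steps are modelled as plain nested dicts and lists.

```python
import re

UNRESOLVED = re.compile(r"<unresolved-secret:([^>]+)>")

def resolve_from_env(name, var, env, allow_unresolved=False):
    """Resolve a from_env secret, or fall back to the placeholder string."""
    if var in env:
        return env[var]
    if allow_unresolved:
        return f"<unresolved-secret:{name}>"
    raise KeyError(f"secret `{name}`: environment variable {var} is not set")

def unresolved_in(value):
    """Collect placeholder names from any nested plan value."""
    if isinstance(value, dict):
        return [n for v in value.values() for n in unresolved_in(v)]
    if isinstance(value, list):
        return [n for v in value for n in unresolved_in(v)]
    if isinstance(value, str):
        return UNRESOLVED.findall(value)
    return []

def guard_apply(plan_steps):
    """Refuse to apply a plan that still contains a placeholder."""
    names = unresolved_in(plan_steps)
    if names:
        raise RuntimeError(
            "refusing to apply: plan contains unresolved-secret "
            f"placeholder for `{names[0]}`"
        )
```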

Errors

condition	error
Neither `from_env` nor `from_file`	`BadSecretBody`
Both `from_env` and `from_file`	`BadSecretBody`
Env var unset (without `--allow-unresolved-secrets`)	`SecretEnvMissing`
`from_file` path unreadable (without the flag)	`SecretFileMissing`
`from_file` is relative but the source has no base dir	`SecretFileNoBaseDir` (only with `load_str`, not CLI flows)
`~` in `from_file` but neither `HOME` nor `USERPROFILE` set	`SecretTildeNoHome`
Ref inside the secret body	`RefInSecretBlock`
Unknown secret name	`UnknownSecret`
Unknown field (anything other than `value`)	`UnknownSecretField`
Duplicate name across `-c` files	`DuplicateSecret` (names both paths)
Secret ref in a forbidden attr	`SecretInUnsupportedAttr`

Tutorial

See Inject a secret into a docker container for the env-var-on-docker_container pattern end to end.

Resources

A resource block declares a piece of infra that should exist.

resource "<kind>" "<name>" {
  <attr> = <value>
  ...
}

resource_block ::= "resource" string string "{" body "}"
body           ::= ( attr | nested_block )*

The two labels are positional:

<kind> — the resource kind, e.g. docker_container. The prefix before the first _ selects the provider.
<name> — a stable identifier unique within the kind. The pair <kind>.<name> is the address used everywhere else (state file, plan output, stratum state show).

The address (kind, name) must be unique within the document; duplicates error.

Kind-to-provider routing

stratum splits the kind on the first _ and uses the prefix as the provider name. The kind must contain at least one underscore and must not start with one.

kind	provider
`system_package`	`system`
`ssh_exec`	`ssh`
`docker_container`	`docker`

A kind with no underscore (foo) or with an empty prefix (_bar) is rejected with resource kind ... must be prefixed with provider name.
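
The routing rule is small enough to show in a few lines of Python. The function name is hypothetical and the error text is abbreviated.

```python
def provider_for(kind):
    """Return the provider name: everything before the first underscore."""
    prefix, sep, _ = kind.partition("_")
    if not sep or not prefix:
        # No underscore at all, or an empty prefix such as "_bar".
        raise ValueError(f"resource kind {kind!r} must be prefixed with provider name")
    return prefix
```

Because the split is on the first underscore only, a multi-word kind such as system_secret_file still routes to system.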

Example

resource "docker_container" "hello" {
  host    = host.primary.addr
  name    = "hello"
  image   = "nginxdemos/hello:latest"
  restart = "unless-stopped"
  networks = ["stratum-edge"]
  labels = {
    "traefik.enable"                   = "true"
    "traefik.http.routers.hello.rule"  = "Host(`hello.example.com`)"
  }
}

Nested blocks

A resource body may contain nested blocks. They are folded into the parent map. If the nested block has labels, the key is <kind>_<label1>_<label2>...; with no labels it is just <kind>.

resource "ssh_exec" "demo" {
  host    = host.prod.addr
  command = "true"

  meta {
    owner = "ops"
  }
}

Stored attrs:

{
  "host": "root@192.0.2.10",
  "command": "true",
  "meta": { "owner": "ops" }
}

Provider implementations today expect attributes, not nested blocks — see each provider page for the shape it reads.
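
The folding rule can be sketched in Python, modelling a nested block as a (kind, labels, attrs, nested_blocks) tuple. That tuple shape and the function names are assumptions made for the example.

```python
def fold_key(kind, labels):
    """Key for a nested block: <kind>, or <kind>_<label1>_<label2>..."""
    return "_".join([kind, *labels])

def fold_body(attrs, nested_blocks):
    """Fold nested blocks into the parent attribute map.

    attrs: dict of plain attributes.
    nested_blocks: list of (kind, labels, attrs, nested_blocks) tuples.
    """
    out = dict(attrs)
    for kind, labels, child_attrs, child_blocks in nested_blocks:
        out[fold_key(kind, labels)] = fold_body(child_attrs, child_blocks)
    return out
```

Applied to the ssh_exec example above, the unlabelled meta block ends up under the key meta.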

What providers see

After parsing and ref resolution, each resource body is a serde_json::Value::Object. Providers receive that object and pick the fields they care about. Unknown fields are ignored: there is no schema validation step between the parser and the provider.

References & scope

A reference is a dotted path of identifiers used in place of a literal value.

ref ::= ident ( "." ident )+

The first segment is the root. Two roots are supported:

root	resolves to	shape
`host`	a field on a declared host block.	`host.<name>.<field>` (3+ parts)
`secret`	the resolved plaintext of a declared secret block.	`secret.<name>.value` (exactly 3 parts; `value` is the only allowed field)

Anything else errors with unknown reference root.

Host references

Form: host.<name>.<field> (at least three segments).

host "prod" {
  addr = "root@192.0.2.10"
  port = 22
}

resource "ssh_exec" "uptime" {
  host    = host.prod.addr   # -> "root@192.0.2.10"
  command = "uptime"
}

Resolution walks the host's evaluated attrs as a JSON map. Each segment after the host name indexes one level deeper, so nested fields work too:

host "prod" {
  ssh = {
    user = "root"
    addr = "1.2.3.4"
  }
}

resource "ssh_exec" "x" {
  host    = host.prod.ssh.addr   # -> "1.2.3.4"
  command = "true"
}
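
The segment-by-segment walk can be illustrated with a short Python sketch. Hosts are modelled as nested dicts, the function name is hypothetical, and the error texts are paraphrased.

```python
def resolve_host_ref(ref, hosts):
    """Resolve host.<name>.<field>[.<field>...] against evaluated host attrs."""
    parts = ref.split(".")
    if len(parts) < 3:
        raise ValueError(f"reference {ref} too short: expected at least 3 segments")
    root, name, *path = parts
    if root != "host":
        raise ValueError(f"unknown reference root {root}")
    if name not in hosts:
        raise KeyError(f"unknown host {name} in reference {ref}")
    node = hosts[name]
    for segment in path:
        # Each segment indexes one level deeper into the host's map.
        if not isinstance(node, dict) or segment not in node:
            raise KeyError(f"host {name} has no field {segment}")
        node = node[segment]
    return node
```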

Secret references

Form: secret.<name>.value. Exactly three segments — value is the only allowed field, since secrets are leaves.

secret "pg_password" {
  from_env = "PG_PASSWORD"
}

resource "docker_container" "db" {
  host  = host.primary.addr
  image = "postgres:16"
  env = {
    POSTGRES_PASSWORD = secret.pg_password.value
  }
}

Secret refs are only allowed inside single-leaf string attrs. They are rejected at config-load time inside system_file.content, system_file.content_file, and system_dir.source_dir — see Secrets: the honesty guard for why.

Error cases

condition	error
Fewer than 3 segments (`host.prod`)	`reference ... too short — expected at least 3 segments`
Unknown host name (`host.ghost.addr`)	``unknown host `ghost` in reference ...``
Unknown field (`host.prod.missing`)	``host `prod` has no field `missing` ...``
Unknown secret name (`secret.ghost.value`)	``unknown secret `ghost` in reference ...``
Secret field other than `value` (`secret.s.fingerprint`)	``reference to secret `s` has unsupported field ...``
Unknown root (`provider.x.y`)	``unknown reference root `provider` ``
Any ref inside a `host` body	``references not allowed inside `host` blocks``
Any ref inside a `secret` body	``references not allowed inside `secret` blocks``
Secret ref inside `system_file.content` / `content_file` / `system_dir.source_dir`	`secret reference not allowed in ...`

Why these two roots only

The four-pass evaluator collects hosts first (pass 1), then secrets (pass 2), then namespaces (pass 3), and finally evaluates providers and resources (pass 4). Providers and resources see fully populated host + secret scopes but cannot reference each other: there is no resource.foo.attr form. Cross-resource references would need a topological pass; that has not shipped.

References inside strings

A reference may also appear inside a string as ${<ref>} — the placeholder is replaced with the resolved scalar value at evaluation time. The same root rules apply (host.* and secret.* only), and the same honesty guard fires for forbidden attrs. See String interpolation for the grammar and edge cases.

env = {
  DATABASE_URL = "postgresql://app:${secret.pg.value}@${host.primary.addr}:5432/app"
}

Bare identifiers as values

A single identifier with no dots is not a valid value. Either quote it as a string or extend it to a ref. The parser will report bare identifier ... not allowed as value (use a string or a reference like ...).

resource "ssh_exec" "x" {
  host    = prod          # error: bare identifier
  host    = "prod"        # ok — string
  host    = host.prod.addr  # ok — reference
}

String interpolation

A double-quoted string may embed ${<ref>} placeholders. Each placeholder is replaced at config-load time with the resolved value of the reference (see References & scope), coerced to its string form.

env = {
  DATABASE_URL = "postgresql://app:${secret.pg.value}@${host.primary.addr}:5432/app"
}

string         ::= '"' ( char | escape | interp )* '"'
interp         ::= "${" ref "}"
ref            ::= ident ( "." ident )+
escape         ::= "\\\"" | "\\\\" | "\\n" | "\\r" | "\\t" | "\\${"

A string that contains no ${...} lexes as a plain string literal. A string with at least one ${...} lexes as a template (alternating literal and interpolation segments) and the evaluator concatenates the resolved parts.

Allowed refs inside `${...}`

Any reference the References & scope rules accept:

host.<name>.<field> — including nested fields like host.prod.ssh.addr.
secret.<name>.value — the resolved plaintext flows in. State stores a substring marker (see Secrets).

Malformed placeholders are rejected. A bare identifier, a too-short ref, a nested ${, and an empty ${} each produce an error:

${foo}        ← error: "bare identifier `foo` not allowed in `${...}`"
${secret.foo} ← error: too short (secret refs need exactly 3 segments)
${a.${b}}     ← error: "nested `${` is not supported"
${}           ← error: "empty `${}` is not allowed"

Scalar coercion

The reference must resolve to a scalar — string, number, or bool. Numbers render via their JSON form (4000, 1.5), bools as true / false. Lists, maps, or null error at evaluation time:

cannot interpolate non-scalar value `${host.h.tags}` into a string
  (refs inside `${...}` must resolve to a string, number, or bool)

Escaping

\${ produces a literal ${ in the output and is not a placeholder. There is no other \$ escape — a bare \$ followed by anything else is a lex error.

shell_var = "literal \${HOME} not stratum"   # -> "literal ${HOME} not stratum"
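
Putting the placeholder, coercion, and escape rules together, a minimal Python sketch of the substitution step could look like this. It works on the raw string text with a regex rather than a real lexer, takes a caller-supplied resolve function (hypothetical) that maps a ref to its value, and does not reproduce the dedicated nested-placeholder error.

```python
import json
import re

# Either the escape \${ or a ${...} placeholder with no braces inside.
TOKEN = re.compile(r"\\\$\{|\$\{([^{}]*)\}")

def to_text(value, ref):
    """Coerce a resolved scalar to its string form."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, float) and value.is_integer():
        value = int(value)  # whole numbers render without a fractional part
    if isinstance(value, (int, float)):
        return json.dumps(value)
    if isinstance(value, str):
        return value
    raise TypeError(f"cannot interpolate non-scalar value `${{{ref}}}` into a string")

def interpolate(template, resolve):
    """Replace each ${ref} with resolve(ref); turn \\${ into a literal ${."""
    def replace(match):
        if match.group(0) == "\\${":
            return "${"
        ref = match.group(1)
        if not ref:
            raise ValueError("empty `${}` is not allowed")
        if "." not in ref:
            raise ValueError(f"bare identifier `{ref}` not allowed in `${{...}}`")
        return to_text(resolve(ref), ref)
    return TOKEN.sub(replace, template)
```

With a resolve function backed by the host and secret scopes, the DATABASE_URL example from the top of this page comes out as one concatenated string.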

The honesty guard still applies

Embedding secret.X.value inside a string interpolation lands in the same forbidden-attr check as a bare secret reference. The check fires by attr name, not by ref form:

# Both are rejected — system_file.content is in SECRET_FORBIDDEN_ATTRS.
content = secret.s.value
content = "prefix ${secret.s.value} suffix"

See Secrets: the honesty guard for the list of forbidden (kind, attr) pairs and the reasoning.

Where interpolation is most useful

Connection strings and other glued-together values where a bare ref doesn't fit:

resource "docker_container" "api" {
  host  = host.primary.addr
  image = "api:dev"
  env = {
    # Embed a secret inside a URI — bare `secret.pg.value` would only work
    # if the whole env value were the password.
    DATABASE_URL = "postgresql://app:${secret.pg.value}@db:5432/app"

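    # Embed a host field in a URL. This assumes addr holds a bare hostname
    # or IP; a user@host value would end up inside the URL as-is.
    INTERNAL_API = "http://${host.primary.addr}:4000"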
  }
}

When a secret is interpolated into a larger string, state stores the value with an inline substring marker ("postgresql://app:<secret:pg:sha256:HEX>@db:5432/app") instead of replacing the whole leaf with an object marker. See Secrets: substring redaction.

Types & values

Five literal value types, all of which round-trip to JSON, plus references, which resolve to one of them at config-load time.

String

Double-quoted. Supports the escape sequences \", \\, \n, \r, \t. Strings cannot contain a raw newline; use \n.

name = "api"
greeting = "hello\nworld"

A string may also contain one or more ${<ref>} placeholders, replaced at config-load time with the resolved value of the reference. \${ escapes a literal ${. See String interpolation.

db_url = "postgresql://app:${secret.pg.value}@${host.primary.addr}:5432/app"
escaped = "literal \${HOME}"   # -> "literal ${HOME}"

Number

A signed decimal, optionally with a fractional part. Lexed as f64. When emitting JSON, stratum prefers a JSON integer if the number is finite and whole (port = 4000 becomes 4000, not 4000.0); otherwise it emits a JSON float. Non-finite floats serialize as null.

port    = 4000      # -> JSON 4000
timeout = 1.5       # -> JSON 1.5

This matters for diff: changing 4000 to 4000.0 is a no-op, since both render as the integer 4000.
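
The integer-vs-float rule can be checked with a small Python sketch. Python floats are IEEE 754 doubles like Rust's f64; the function name is hypothetical and the result is the JSON text that would be emitted.

```python
import json
import math

def render_number(x):
    """Render a number as JSON text: integer if whole, float otherwise."""
    x = float(x)  # everything is lexed as a double
    if not math.isfinite(x):
        return "null"  # NaN and infinities have no JSON form
    if x.is_integer():
        return json.dumps(int(x))
    return json.dumps(x)
```

Since 4000 and 4000.0 produce the same text, a diff between them sees no change.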

Bool

Bare true or false. These are lexed as keywords, not identifiers.

tls = true

List

Square-bracketed, comma-separated. Trailing commas are allowed. Items can be any value type (including refs and other lists/maps).

ports   = ["8080:80", "8443:443"]
mixed   = [1, "two", true]
nested  = [[1, 2], [3, 4]]

list ::= "[" ( value ( "," value )* ","? )? "]"

Map

Brace-delimited. Each entry is <key> = <value>. Entries are separated by whitespace — no commas. Keys may be either bare identifiers or quoted strings; the string form is required for dotted keys like Traefik labels.

env = {
  NODE_ENV = "production"
  PORT     = 4000
}

labels = {
  "traefik.enable"                          = "true"
  "traefik.http.routers.api.rule"           = "Host(`api.example.com`)"
}

map ::= "{" ( ( ident | string ) "=" value )* "}"

Reference

See References & scope.

host = host.prod.addr

Identifiers

Identifiers start with an ASCII letter or _, then continue with letters, digits, _, or -. They are used for block kinds, attribute keys, map keys, and ref segments.

`content_file` on `system_file`

The system_file resource (see providers/system) accepts a special attribute, content_file, which inlines a local file's bytes into content at config-load time.

resource "system_file" "traefik-config" {
  host         = host.primary.addr
  path         = "/etc/traefik/traefik.yml"
  content_file = "files/traefik.yml"
  mode         = "0644"
}

Semantics:

The value is a path relative to the .strat file's directory (not the current working directory).
The file is read at config-load time. Its bytes become the content attribute the provider sees. content_file itself is stripped — providers never see it.
The file must contain valid UTF-8 (it's loaded with std::fs::read_to_string).

Errors:

condition	error variant
Both `content` and `content_file` on the same `system_file`	`EvalError::ContentConflict`
The referenced file does not exist or is unreadable	`EvalError::ContentFileMissing`
Using `content_file` via `stratum_config::load_str` (no base)	`EvalError::ContentFileNoBaseDir`

The third case only matters if you're embedding stratum-config in another program and calling load_str directly. The stratum CLI always uses load_file, so content_file always works in CLI flows.

This attribute is specific to system_file. The ssh_file resource only supports inline content. Use system_file if you want to load a file from disk.
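
The content_file semantics can be sketched in Python as follows. The exception names mirror the error variants in the table above; the function name is hypothetical, and base_dir=None stands in for the load_str case with no base directory.

```python
import os

class ContentConflict(Exception):
    pass

class ContentFileMissing(Exception):
    pass

class ContentFileNoBaseDir(Exception):
    pass

def inline_content_file(attrs, base_dir):
    """Return attrs with content_file replaced by the file's text as content."""
    if "content_file" not in attrs:
        return dict(attrs)
    if "content" in attrs:
        raise ContentConflict("both content and content_file are set")
    if base_dir is None:
        raise ContentFileNoBaseDir("content_file needs a base directory")
    # Relative to the .strat file's directory, not the working directory.
    path = os.path.join(base_dir, attrs["content_file"])
    try:
        with open(path, encoding="utf-8", newline="") as fh:
            content = fh.read()  # invalid UTF-8 raises UnicodeDecodeError here
    except OSError as err:
        raise ContentFileMissing(path) from err
    out = {k: v for k, v in attrs.items() if k != "content_file"}
    out["content"] = content
    return out
```

The returned map no longer contains content_file, matching the rule that providers never see it.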

`source_dir` on `system_dir`

The system_dir resource (see providers/system) accepts an optional source_dir attribute pointing at a local directory. Same base-dir rule as content_file:

resource "system_dir" "book" {
  host       = host.primary.addr
  source_dir = "../book/book"
  path       = "/srv/stratum-book"
}

Semantics:

The value is a path relative to the .strat file's directory.
At config-load time the path is joined with the base dir and std::fs::canonicalize'd. The provider sees the canonical absolute path. The original relative form is not preserved.
The directory must exist and be a directory.
Omitting source_dir is valid — the resource enters empty-dir mode, where only mkdir -p + chown + chmod run on the host.

Errors:

condition	error variant
`source_dir` points at a missing path or non-directory	`EvalError::SourceDirMissing`
Using `source_dir` via `stratum_config::load_str` (no base dir)	`EvalError::SourceDirNoBaseDir`

Unlike content_file, the contents are not inlined into state at config-load time — system_dir builds a fresh manifest from source_dir every plan and only ships bytes during apply.

Multi-file configs

plan and apply accept -c more than once. Every listed file is parsed independently, then concatenated into one document and evaluated as a single config. Hosts and secrets declared in any file are visible to references in any other file.

stratum apply -y \
  -c infra.strat \
  -c app.strat \
  -s .stratum/host.json

The shape on disk is one .strat file per logical concern (the host's bootstrap, each app, each shared service). The shape on a host is one state file per host, never one per .strat file.

For multiple deployable slices on the same host where you want each slice to plan and apply on its own — without juggling a long -c list and without one slice's state file silently owning another slice's resources — see Namespaces. Namespaces are the higher-level alternative; the bundle workflow described on this page is unchanged and remains the right shape when you have a single deployable slice.

One state per host

Every config that touches a given host must apply against the same -s state file. State is the authority on "what is currently tracked on this host"; splitting it across files means each file's state thinks it owns the host alone, and applying one of them produces a plan full of Delete steps for the resources owned by the others.

The destruction guard catches this case — apply refuses to run if any Delete is present without --allow-destroy — but the structural fix is to apply all -c files for a host together against one state file. Do not apply them one at a time.

If you forget a -c, the apply will still refuse to run, and the error names every loaded config so the missing one is visually obvious:

refusing to apply: plan would delete 9 resources not in config:
  - docker_container.traefik
  - docker_network.edge
  - ...

loaded configs: app.strat
state file: .stratum/host.json

The missing config here is infra.strat — it owns the deleted resources, but it isn't in the loaded set.
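
The destruction guard described above can be sketched as follows. The `Action`, `Step`, and `check_destroy_guard` names are hypothetical and only illustrate the rule "any Delete without --allow-destroy refuses the apply"; the real planner types may look different.

```rust
/// Hypothetical plan action; mirrors the create / update / delete actions in state.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Action {
    Create,
    Update,
    Delete,
}

/// Hypothetical plan step: a `<kind>.<name>` address plus the planned action.
struct Step {
    addr: String,
    action: Action,
}

/// Refuses the plan if it contains any Delete and `allow_destroy` is false.
fn check_destroy_guard(plan: &[Step], allow_destroy: bool) -> Result<(), String> {
    let deletes: Vec<&str> = plan
        .iter()
        .filter(|s| s.action == Action::Delete)
        .map(|s| s.addr.as_str())
        .collect();
    if deletes.is_empty() || allow_destroy {
        return Ok(());
    }
    let mut msg = format!(
        "refusing to apply: plan would delete {} resources not in config:",
        deletes.len()
    );
    for addr in &deletes {
        msg.push_str("\n  - ");
        msg.push_str(addr);
    }
    Err(msg)
}
```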

If you'd rather apply each slice independently against its own state file, that's what Namespaces are for. The per-namespace state file is scoped to its namespace's resources, and the implicit per-host tuning resources land in a shared file so multiple namespaces sharing a host don't fight over them.

Cross-file references

A host or secret declared in file A is referenceable in file B without redeclaration.

# hosts.strat
host "primary" {
  addr = "root@192.0.2.10"
}

# app.strat — no `host "primary"` redeclaration
resource "ssh_exec" "uptime" {
  host    = host.primary.addr
  command = "uptime"
}

stratum plan -c hosts.strat -c app.strat -s .stratum/host.json

This works because all files merge into one Document before evaluation. The evaluation order (hosts → secrets → providers + resources) is global across the merged set, not per file.

Per-file base directories

content_file (on system_file) and source_dir (on system_dir) resolve relative to the declaring file's directory, not relative to the working directory or the first -c file. Two system_file blocks in two different files can both reference files/foo.txt next to themselves, and each resolves to its own files/foo.txt.
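
As a rough illustration of the per-file rule, the helper below joins a relative attribute value onto the directory of the file that declared it. The function name is hypothetical, and the sketch skips the canonicalization that source_dir additionally goes through.

```rust
use std::path::{Path, PathBuf};

/// Resolves `relative` against the directory of the declaring .strat file,
/// not against the working directory or the first -c file.
fn resolve_against_declaring_file(strat_file: &Path, relative: &str) -> PathBuf {
    // A bare file name such as "hosts.strat" has an empty parent,
    // so the result stays relative to wherever that file lives.
    let base = strat_file.parent().unwrap_or_else(|| Path::new(""));
    base.join(relative)
}
```

With `infra/infra.strat` and `app/app.strat` both declaring `files/foo.txt`, the two lookups land on `infra/files/foo.txt` and `app/files/foo.txt` respectively.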

Duplicates are hard errors

Four categories of duplicate are caught at load time, and the error names both source paths:

category	error
Same `host` name in two files	`DuplicateHost { name, first, second }`
Same `provider` name in two files	`DuplicateProvider { name, first, second }`
Same `<kind>.<name>` resource	`DuplicateResource { addr, first, second }`
Same `secret` name	`DuplicateSecret { name, first, second }`

duplicate host `primary`: defined in hosts.strat and infra.strat

There is no "last file wins" rule. Pick one file to own the declaration and remove the other.
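
A minimal sketch of the duplicate check for hosts, assuming the merge step can see which file declared each name. The `DuplicateHost` struct here is a stand-in that only mirrors the documented `{ name, first, second }` shape, and `merge_hosts` is a hypothetical helper, not the real merge routine.

```rust
use std::collections::HashMap;
use std::fmt;

/// Stand-in for the documented `DuplicateHost { name, first, second }` error.
#[derive(Debug, PartialEq)]
struct DuplicateHost {
    name: String,
    first: String,
    second: String,
}

impl fmt::Display for DuplicateHost {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "duplicate host `{}`: defined in {} and {}",
            self.name, self.first, self.second
        )
    }
}

/// `files` lists (path, host names declared in that file) in -c order.
/// Returns a map from host name to the file that owns it, or the first duplicate found.
fn merge_hosts(files: &[(&str, Vec<&str>)]) -> Result<HashMap<String, String>, DuplicateHost> {
    let mut owner: HashMap<String, String> = HashMap::new();
    for (path, hosts) in files {
        for name in hosts {
            if let Some(first) = owner.get(*name) {
                // No "last file wins": the second declaration is an error.
                return Err(DuplicateHost {
                    name: name.to_string(),
                    first: first.clone(),
                    second: path.to_string(),
                });
            }
            owner.insert(name.to_string(), path.to_string());
        }
    }
    Ok(owner)
}
```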

Merging existing state files

If you started with per-config state files (e.g. .stratum/infra.json, .stratum/app.json) and want to consolidate into one bundle state, use stratum state merge — it merges two or more state files into one, refusing on overwrite and on any <kind>.<name> collision. After merging, run stratum plan against the consolidated state to confirm zero diff before removing the old files.

Namespaces

A namespace "<name>" { ... } block declares a deployable slice of the infra: a set of .strat files that apply together against a dedicated state file. Namespaces live in a top-level manifest (by convention ./stratum.strat) alongside the host, secret, and provider blocks they share.

host "primary" {
  addr = "root@192.0.2.10"
}

namespace "infra" {
  configs = ["infra.strat"]
}

namespace "app" {
  configs = ["app/web.strat", "app/db.strat"]
}

namespace_block ::= "namespace" string "{" attr* "}"

A namespace is selected on the CLI with -n <name>:

stratum -n infra apply -y
stratum -n app   apply -y

Both invocations share the host "primary" declared in the manifest; each writes its own state file (.stratum/infra.json, .stratum/app.json).

Body attributes

attr	required	type	default	description
`configs`	yes	list of string	—	`.strat` files this namespace owns. Paths resolve against the manifest's directory. The manifest itself is always loaded first; `configs` entries are loaded after, in order.
`state`	no	string	`.stratum/<name>.json`	Explicit state file path. Relative paths resolve against the manifest's directory. Overridden by a CLI `-s` flag.

Names must be unique within the manifest. Duplicates error with duplicate namespace. References (host.x.y, secret.z.value) are not allowed inside a namespace body — only string literals.
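
The state-path precedence in the table can be sketched like this. The function name and signature are hypothetical. Two assumptions are baked in: the default `.stratum/<name>.json` is resolved against the manifest's directory like an explicit relative `state`, and a CLI `-s` value is used exactly as given.

```rust
use std::path::{Path, PathBuf};

/// Picks the state file for a namespace: CLI -s wins, then the `state` attr,
/// then the default `.stratum/<name>.json`.
fn state_path(
    manifest: &Path,
    namespace: &str,
    state_attr: Option<&str>,
    cli_state: Option<&str>,
) -> PathBuf {
    if let Some(cli) = cli_state {
        // Assumption: the CLI path is taken as given.
        return PathBuf::from(cli);
    }
    let manifest_dir = manifest.parent().unwrap_or_else(|| Path::new(""));
    match state_attr {
        // An absolute `explicit` replaces the base; a relative one is joined onto it.
        Some(explicit) => manifest_dir.join(explicit),
        None => manifest_dir.join(".stratum").join(format!("{namespace}.json")),
    }
}
```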

What goes in the manifest, what goes in a namespace config

The manifest is the shared scope. Blocks in a per-namespace config are visible only to that namespace.

block	manifest	namespace config
`host`	yes (shared)	yes (scoped to ns)
`secret`	yes (shared)	yes (scoped to ns)
`provider`	yes (shared)	yes (scoped to ns)
`resource`	rejected	yes
`namespace`	yes	rejected

A resource block in the manifest is rejected. Because the manifest is always the first file in the merged set, such a block would be loaded into every namespace without being scoped to any one of them: it would appear in every namespace's plan, and the cross-namespace validator would flag every container as colliding with itself. Put resources in the per-namespace files only.

A namespace block in a non-manifest file parses fine but is invisible to CLI resolution — -n NAME only inspects whichever file is passed as --manifest. Treat namespace blocks as manifest-only.

State layout

Each namespace gets its own state file:

.stratum/
  infra.json          # everything declared under namespace "infra"
  app.json            # everything declared under namespace "app"
  _shared.json        # implicit per-host _stratum_* resources

The shared file holds the implicit per-host tuning resources (_stratum_swap_*, _stratum_sshd_oom_*, _stratum_sshd_reload_*). They live in a single file because every namespace that targets a given host wants the same swap and the same sshd drop-in — splitting them per-namespace would have the first apply create them, the second apply see them as missing from state, and recreate them. See Architecture: split state for the merge rules.

What references can cross namespace boundaries

reference shape	works across namespaces?
`host.<name>.<field>`	yes (manifest-scoped)
`secret.<name>.value`	yes (manifest-scoped)
`depends_on` to a sibling ns	no

depends_on on a docker_container must point at a <kind>.<name> declared in the same namespace. The planner only sees one namespace's resources at a time, so a depends_on edge to a sibling namespace's resource errors as an undeclared target.

If a producer-consumer relationship crosses what becomes a namespace boundary, the workaround is to duplicate the trigger resource on the consumer side. A typical case: a ssh_exec runs docker build to produce an image (producer), and a docker_container in another namespace consumes that image. Move (or duplicate) the build ssh_exec into the consumer's namespace so the depends_on edge is local. See Cross-namespace conflicts in the tutorial for an example.

Cross-namespace conflict checks

When -n NAME is set, plan and apply re-load every sibling namespace's configs (with unresolved secrets tolerated) and check the current namespace's docker_container resources for collisions against them. Two cases are caught:

Port collision — two namespaces declare a docker_container binding the same (host, host_port). Random-port and bare-port forms are skipped.
Container name collision — two namespaces declare a docker_container with the same (host, name).

Both errors name the offending resource and the sibling that already claims the port or name. See Cross-namespace conflicts for the error shape.
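
The two checks can be sketched roughly as below. Everything here is illustrative: the `Container` struct, the helper names, and the message wording are hypothetical (the real error shape is documented under Cross-namespace conflicts). The sketch also assumes that a spec with no host part (such as `8080`) is the bare-port form and that an empty or zero host port is the random-port form.

```rust
use std::collections::HashMap;

/// Hypothetical, flattened view of a docker_container resource.
struct Container {
    namespace: String,
    host: String,
    name: String,
    ports: Vec<String>,
}

/// Extracts an explicit host port from `host:container[/proto]` or
/// `ip:host:container[/proto]`. Bare and random-port forms yield None.
fn explicit_host_port(spec: &str) -> Option<u16> {
    let spec = spec.split('/').next().unwrap_or(spec);
    let parts: Vec<&str> = spec.split(':').collect();
    let host_part = match parts.len() {
        2 => parts[0],
        3 => parts[1],
        _ => return None, // bare port: no host port is claimed
    };
    match host_part.parse::<u16>() {
        Ok(0) | Err(_) => None, // assumed random-port form
        Ok(p) => Some(p),
    }
}

/// Checks the current namespace's containers against every sibling's.
fn find_collisions(current: &[Container], siblings: &[Container]) -> Vec<String> {
    let mut names: HashMap<(&str, &str), &str> = HashMap::new();
    let mut ports: HashMap<(&str, u16), &str> = HashMap::new();
    for c in siblings {
        names.insert((c.host.as_str(), c.name.as_str()), c.namespace.as_str());
        for p in c.ports.iter().filter_map(|s| explicit_host_port(s)) {
            ports.insert((c.host.as_str(), p), c.namespace.as_str());
        }
    }
    let mut errors = Vec::new();
    for c in current {
        if let Some(ns) = names.get(&(c.host.as_str(), c.name.as_str())) {
            errors.push(format!(
                "container name `{}` on {} already claimed by namespace `{}`",
                c.name, c.host, ns
            ));
        }
        for p in c.ports.iter().filter_map(|s| explicit_host_port(s)) {
            if let Some(ns) = ports.get(&(c.host.as_str(), p)) {
                errors.push(format!(
                    "host port {} on {} already claimed by namespace `{}`",
                    p, c.host, ns
                ));
            }
        }
    }
    errors
}
```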

Errors

condition	error
Label count other than one (`namespace "a" "b" { ... }`)	`BadNamespaceLabels`
Missing `configs = [...]`	`BadNamespaceBody`
`configs` entry is not a string	`BadNamespaceBody`
Reference inside the body	`BadNamespaceBody` (no refs allowed)
Duplicate namespace name	`DuplicateNamespace` (names both source paths)
`-n NAME` requested but no manifest at `./stratum.strat` (and no `--manifest`)	CLI: `requires a manifest, but ./stratum.strat does not exist`
`-n NAME` requested but the manifest declares no such namespace	CLI: `namespace <name> not declared in <manifest> (known: ...)`
`-n` and `-c` passed together	CLI: `mutually exclusive`

Bundle mode is unchanged

Without -n, stratum operates in bundle mode — the historical workflow. -c X -c Y -s state.json keeps working exactly as before, the cross-namespace validator is skipped, and state is a single file. See Multi-file configs. Namespaces are opt-in; you don't have to migrate.

Tutorial

See Multi-namespace deployments for the end-to-end walkthrough: writing a manifest, splitting an existing bundle, triggering a port collision and resolving it.

Providers

A provider owns one or more resource kinds and implements the Provider trait: create, update, delete, read, and an optional configure for the corresponding provider "<name>" { ... } block.

stratum routes a resource to a provider by splitting the kind on the first _ and looking up the prefix:

kind	provider
`system_package`	`system`
`system_service`	`system`
`system_file`	`system`
`system_secret_file`	`system`
`system_ufw_rule`	`system`
`system_dir`	`system`
`ssh_exec`	`ssh`
`ssh_file`	`ssh`
`docker_network`	`docker`
`docker_container`	`docker`
`docker_image`	`docker`
`git_repo`	`git`
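
The routing rule amounts to a split on the first underscore. A minimal sketch, with a hypothetical function name:

```rust
/// Returns the provider prefix for a resource kind by splitting on the first `_`.
/// Kinds without an underscore have no provider under this rule.
fn provider_for_kind(kind: &str) -> Option<&str> {
    kind.split_once('_').map(|(prefix, _rest)| prefix)
}
```

Because only the first `_` matters, `system_secret_file` and `system_ufw_rule` both route to `system`.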

Providers are registered at CLI startup in crates/cli/src/main.rs. Adding one means adding a new crate under crates/providers/, wiring it into the registry, and documenting it here.

Configuration block

A provider "<name>" { ... } block is optional. When present, its body is passed to the provider's configure method during apply. No shipped provider reads its block today — the grammar exists but is currently dormant.

Execution

Side effects run when you pass -y to stratum apply. Without -y, apply prints the plan and exits without touching providers. There is no dry-run mode for providers: once -y is set, every create / update / delete call hits the remote host.

For drift detection, every provider also implements read (a non-destructive query). Coverage today:

kind	`read` returns
`system_package`	`Present { state: present|absent }`
`system_service`	`Present { enabled, state }`
`system_file`	`Present { mode, owner, group, sha256 }` or `Absent`
`system_secret_file`	`Present { mode, owner, group, sha256 }` or `Absent` (content never observed)
`system_ufw_rule`	`Unknown` (parsing punted)
`system_dir`	`Present { file_count, manifest_sha256, manifest }` or `Absent` (or `Unknown` if state's `file_count` > 200)
`ssh_exec`	`Unknown` (no readable identity)
`ssh_file`	`Present { mode, sha256 }` or `Absent`
`docker_network`	`Present { name, driver }` or `Absent`
`docker_container`	`Present { name, image, restart, labels, networks, container_id }` or `Absent`
`docker_image`	`Present { tag, image_id, id }` (echoes prior `build_args` / `context` / `dockerfile` / `target` / `pull_base`) or `Absent`
`git_repo`	`Present { path, url, ref, commit_sha }` or `Absent`

The Unknown cases show up in unreadable counts when you run stratum plan --refresh or after every stratum apply -y. That's intentional for v1.

See each provider's page for the exact attribute schema.

system

Ansible-shape system bootstrap: install packages, manage systemd units, drop files and directories, configure ufw. Operates against a remote host via the system ssh binary (-o BatchMode=yes -o StrictHostKeyChecking=accept-new). Targets Debian/Ubuntu — package management is apt-only.

All apt invocations use DEBIAN_FRONTEND=noninteractive apt-get -y -o Dpkg::Options::=--force-confold, so package installs never hang on prompts and keep existing config files on upgrade.

The provider takes no configuration block.

Kinds

`system_package`

An apt-managed package, present or absent.

resource "system_package" "docker" {
  host  = host.primary.addr
  name  = "docker.io"
  state = "present"
}

attr	required	type	default	description
`host`	yes	string	—	SSH target in `user@host` form.
`name`	yes*	string	resource label	apt package name. Falls back to the resource label if omitted.
`state`	no	string	`present`	`present` or `absent`.

*name is technically optional in the source — it defaults to the resource label — but documenting it explicitly is the convention so the apt package name doesn't depend on what you called the resource.

On present, the provider runs apt-get update once per (process, host) before the first install, then apt-get install <name>. On absent, it runs apt-get remove <name>.

Stored state: { host, name, state }. read runs dpkg-query -W -f='${Status}' <name> and returns Present { state: present|absent }. There is no version field in observed state — version drift is not surfaced.

Delete is best-effort apt-get remove (errors swallowed so a missing package doesn't fail the apply).

`system_service`

A systemd unit, started/stopped with enabled/disabled independently controlled.

resource "system_service" "docker" {
  host    = host.primary.addr
  name    = "docker"
  enabled = true
  state   = "started"
}

attr	required	type	default	description
`host`	yes	string	—	SSH target.
`name`	yes*	string	resource label	systemd unit name. Falls back to the resource label.
`enabled`	no	bool	`false`	If true, runs `systemctl enable <name>`.
`state`	no	string	`stopped`	`started` or `stopped`.

Order of operations:

(enabled=true, state=started) → enable, then start, then poll is-active for up to 10s.
(enabled=true, state=stopped) → enable, then stop.
(enabled=false, state=stopped) → stop (best-effort), then disable.
(enabled=false, state=started) → disable (best-effort), then start, then poll is-active for up to 10s.

If the 10s is-active poll times out, the provider collects systemctl status --no-pager -n 20 (and journalctl -u <svc> --no-pager -n 50 if the status output is sparse) and includes both in the error message.
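
The four cases in the order-of-operations list can be written as one match. The `Step` enum and `service_steps` function are hypothetical names for illustration; they model the ordering only, not the SSH calls or the failure diagnostics.

```rust
/// Hypothetical step names for the systemctl actions listed above.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Step {
    Enable,
    Disable,
    Start,
    Stop,
    PollIsActive,
}

/// Returns the ordered steps for a desired (enabled, started) pair.
fn service_steps(enabled: bool, started: bool) -> Vec<Step> {
    use Step::*;
    match (enabled, started) {
        (true, true) => vec![Enable, Start, PollIsActive],
        (true, false) => vec![Enable, Stop],
        (false, false) => vec![Stop, Disable],
        (false, true) => vec![Disable, Start, PollIsActive],
    }
}
```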

Stored state: { host, name, enabled, state }. read runs systemctl is-enabled + systemctl is-active in one round-trip and returns Present { enabled, state }.

Delete runs systemctl disable --now <name> best-effort.

`system_file`

Drops a file on the remote host. Auto-creates parent directories via install -D.

resource "system_file" "traefik-config" {
  host         = host.primary.addr
  path         = "/etc/traefik/traefik.yml"
  content_file = "files/traefik.yml"
  mode         = "0644"
  owner        = "root"
  group        = "root"
}

attr	required	type	default	description
`host`	yes	string	—	SSH target.
`path`	yes	string	—	Absolute destination path. Parent dirs are created.
`content`	yes*	string	—	File contents, written verbatim.
`content_file`	yes*	string	—	Path to a local file, resolved relative to the `.strat` file's directory. Inlined at config-load time into `content`.
`mode`	no	string	`0644`	File mode, passed to `install -m`.
`owner`	no	string	`root`	File owner, passed to `install -o`.
`group`	no	string	`root`	File group, passed to `install -g`.

*Exactly one of content or content_file must be set. Both → EvalError::ContentConflict at config-load time. See content_file for the full semantics.

Upload uses install -D -m <mode> -o <owner> -g <group> /dev/stdin <path> with the content streamed via SSH stdin. The -D flag creates intermediate directories.

The provider sha256-hashes the content; on update, if the new sha matches what's in prior state, the upload is skipped and the resource logs unchanged (sha256 match).

Stored state: { host, path, content, mode, owner, group, sha256 }. Persisting content means a re-plan with the same config is a no-op (no spurious updates from "content field appeared").

read runs a single-roundtrip probe that prints either MISSING or <mode>|<owner>|<group>|<sha256>. Returns Absent for missing files, Present { host, path, mode, owner, group, sha256 } otherwise. Note that observed state does not include content — drift surfaces as a sha256 mismatch.

Delete runs rm -f -- <path>.

`system_secret_file`

Like system_file, but the content is treated as a whole-file secret. The plaintext is streamed via SSH stdin (never argv) and never persists in state. State stores sha256 plus the file permissions only — enough to detect drift, not enough to recover the value.

secret "firebase_sa" {
  from_file = "~/.config/app/firebase-sa.json"
}

resource "system_secret_file" "firebase-sa" {
  host    = host.primary.addr
  path    = "/etc/app/firebase-sa.json"
  content = secret.firebase_sa.value
  mode    = "0400"
  owner   = "root"
  group   = "root"
}

attr	required	type	default	description
`host`	yes	string	—	SSH target.
`path`	yes	string	—	Absolute destination path. Parent dirs are created via `install -D`.
`content`	yes	string	—	File contents. Typically `secret.<name>.value` — see Secrets. Unlike `system_file`, there is no `content_file` attribute; whole-file secrets are sourced through a `secret { from_file = ... }` block.
`mode`	no	string	`0400`	File mode, passed to `install -m`. Default is stricter than `system_file` (`0400` vs `0644`) — secret files default to owner-only read.
`owner`	no	string	`root`	File owner, passed to `install -o`.
`group`	no	string	`root`	File group, passed to `install -g`.

Upload uses install -D -m <mode> -o <owner> -g <group> /dev/stdin <path> with the content streamed via SSH stdin. The provider hashes the content before sending; on update, if the new sha matches the prior state's sha and all permissions match, the upload is skipped and the resource logs unchanged (sha256 + perms match).

State shape: { host, path, mode, owner, group, sha256 }. The content field is omitted — recovering the plaintext from state is impossible by design. Plan diff is sha-to-sha: the build_plan normalizer (SECRET_CONTENT_TO_SHA) drops content from desired and substitutes sha256: hash(content) before diffing, so plans never echo plaintext into a content: null -> "<plaintext>" change. See Architecture: secret-content normalization.

Apply log: byte length only — never the content, never the sha.

[system] SECRET_FILE `firebase-sa` -> root@192.0.2.10:/etc/app/firebase-sa.json (1842 bytes, mode=0400 root:root)
[system] SECRET_FILE `firebase-sa` -> root@192.0.2.10:/etc/app/firebase-sa.json unchanged (sha256 + perms match)

Drift detection: read runs the same probe shape as system_file (<mode>|<owner>|<group>|<sha256> or MISSING). Returns Present { host, path, mode, owner, group, sha256 } with no content field. Drift surfaces as sha256 / mode / owner / group mismatch.

Delete runs rm -f -- <path> best-effort.

`system_dir`

Manages a directory on the remote host. Two modes, selected by whether source_dir is set:

Upload mode (source_dir set) — tars + gzips a local tree in memory, streams it over one SSH connection, extracts on the host, applies chown -R + recursive chmod. Used to ship static assets (an mdbook build output, a static-site bundle) to a host where another container will serve them.
Empty-dir mode (source_dir omitted) — just mkdir -p + chown + chmod on the host. No upload. Used to pre-create directories a daemon expects but won't create itself (e.g. /srv/repos, /var/lib/<daemon>).

resource "system_dir" "book" {
  host         = host.primary.addr
  source_dir   = "../book/book"
  path         = "/srv/stratum-book"
  mode         = "0644"
  dir_mode     = "0755"
  owner        = "root"
  group        = "root"
  delete_extra = true
}

attr	required	type	default	description
`host`	yes	string	—	SSH target.
`path`	yes	string	—	Absolute remote destination. Created with `mkdir -p`.
`source_dir`	no	string	—	If set, local directory whose contents are tarred + uploaded. Resolved relative to the `.strat` file's directory and canonicalized at config-load time. Must exist and be a directory. If omitted, the resource is in empty-dir mode.
`mode`	no	string	`0644`	Mode applied to every regular file under `path` (via `find -type f -exec chmod`). Has no effect in empty-dir mode.
`dir_mode`	no	string	`0755`	Mode applied to every directory under `path` (via `find -type d -exec chmod`). In empty-dir mode, applied once to `path` itself.
`owner`	no	string	`root`	Recursive owner, applied with `chown -R <owner>:<group>`. In empty-dir mode, applied once to `path`.
`group`	no	string	`root`	Recursive group.
`delete_extra`	no	bool	`false`	If true, files in prior state's manifest but absent from the new manifest are `rm -f`'d on the host after the upload. Keeps the remote tree in sync as local files are removed. No effect in empty-dir mode (the manifest is always empty).

The provider walks source_dir with walkdir, sha256-hashes every regular file, and stores the result as { relpath -> sha256 } (POSIX / separators even on Windows). The manifest is digested into a single manifest_sha256; on update, if both the manifest digest and every permission attr match prior state, the upload is skipped and the resource logs unchanged (... manifest match).

source_dir is resolved at config-load time: the value the provider sees is the canonicalized absolute path of <dir-of-.strat-file>/<source_dir>. Using system_dir with source_dir via stratum_config::load_str (no base dir) errors with EvalError::SourceDirNoBaseDir. A missing or non-directory path errors with EvalError::SourceDirMissing.

Stored state: { host, source_dir, path, mode, dir_mode, owner, group, delete_extra, file_count, manifest_sha256, manifest }. The manifest map is persisted in full so delete_extra can diff prior keys against the new manifest.
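
The delete_extra diff can be pictured as a key comparison between the two manifests. This is a sketch under the assumption that a manifest is a map from relative path to sha256 hex string; `stale_paths` is a hypothetical helper name.

```rust
use std::collections::BTreeMap;

/// Returns the relative paths present in the prior manifest but absent from
/// the new one. With delete_extra = true these are the files removed on the host.
fn stale_paths(
    prior: &BTreeMap<String, String>,
    new: &BTreeMap<String, String>,
) -> Vec<String> {
    prior
        .keys()
        .filter(|path| !new.contains_key(*path))
        .cloned()
        .collect()
}
```

A file whose hash changed but whose path is still present is not stale; it is simply re-uploaded as part of the new tree.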

read runs find . -type f -print0 | sort -z | xargs -0 sha256sum on the remote tree, returns Absent if the directory is missing, otherwise Present { host, path, file_count, manifest_sha256, manifest }. Drift surfaces as a manifest_sha256 mismatch (or, with delete_extra off, extra keys observed on the host that aren't in state).

File-count cap. If state's file_count exceeds 200, read returns Observed::Unknown("system_dir read skipped: file_count N > cap 200") instead of doing the remote sha256 sweep. Drift detection on large trees needs a smarter strategy; today they show up as unreadable in --refresh output. Apply itself is not capped — uploads of any size work.

Delete runs rm -rf -- <path> best-effort.

Empty-dir mode (no `source_dir`)

Omit source_dir to skip the upload entirely. The provider runs only:

mkdir -p <path>; chown <owner>:<group> <path>; chmod <dir_mode> <path>

The stored state's file_count is 0, the manifest is {}, and manifest_sha256 is the digest of the empty manifest. delete_extra is recorded but has no effect — there are no manifest entries to diff. read returns Absent if the directory is missing on the host, otherwise Present with file_count: 0 (the host's tree is also expected to be empty as far as stratum is concerned; any files placed there by other processes don't drift the manifest).

# Pre-create directories the daemon expects.
resource "system_dir" "etc-app" {
  host = host.primary.addr
  path = "/etc/app"
}

resource "system_dir" "srv-repos" {
  host = host.primary.addr
  path = "/srv/repos"
}

Use this in place of an ssh_exec "mkdir -p ..." chain when a daemon needs the directories to exist before it starts: empty-dir mode is idempotent, drift-detectable (read reports Absent if path goes missing on the host, which matters because the daemon won't recreate the directory itself), and the owner/group/mode become declarative.

`system_ufw_rule`

A single ufw allow/deny rule. Idempotent via ufw itself — adding the same rule twice is harmless.

resource "system_ufw_rule" "allow-ssh" {
  host = host.primary.addr
  port = "22/tcp"
  rule = "allow"
}

attr	required	type	default	description
`host`	yes	string	—	SSH target.
`port`	yes	string	—	Port specifier, e.g. `22/tcp`, `443/tcp`, `8080`. Passed verbatim to ufw.
`rule`	yes	string	—	`allow` or `deny`.

Runs ufw <rule> <port> on create/update. Delete runs ufw delete <rule> <port> best-effort.

Stored state: { host, port, rule }. read returns Observed::Unknown("system_ufw_rule read not implemented (ufw status parsing punted)") — ufw rules always show up as unreadable in drift summaries. This is intentional for v1.

Lockout warning: systemctl start ufw does NOT activate the firewall ruleset. The ruleset becomes active only when you run ufw --force enable (typically via an ssh_exec resource, after the 22/tcp allow rule is in state). If you later remove the 22/tcp allow rule while ufw is enabled and you are connected over SSH, you'll lock yourself out. See the bootstrap tutorial for the recommended ordering.

Apply trace

Each resource logs one line to stderr:

[system] PACKAGE `docker.io` present on root@192.0.2.10
[system] SERVICE `docker` enabled=true state=started on root@192.0.2.10
[system] FILE `traefik-config` -> root@192.0.2.10:/etc/traefik/traefik.yml (188 bytes, mode=0644 root:root)
[system] FILE `traefik-config` -> root@192.0.2.10:/etc/traefik/traefik.yml unchanged (sha256 match)
[system] SECRET_FILE `firebase-sa` -> root@192.0.2.10:/etc/app/firebase-sa.json (1842 bytes, mode=0400 root:root)
[system] DIR `site` -> root@192.0.2.10:/srv/site (42 files, mode=0644 dir_mode=0755 root:root)
[system] DIR `site` -> root@192.0.2.10:/srv/site unchanged (42 files, manifest match)
[system] UFW allow 22/tcp on root@192.0.2.10

Notes

apt-get update is memoized once per (process, host). A separate stratum invocation re-runs it.
All shell quoting is internal (shell_quote helper) — package names with spaces are correctly quoted, but in practice apt names don't contain them.
There is no system_user kind. Use ssh_exec if you need to create users.

ssh

Runs commands and writes files on a remote host. Shells out to the system ssh binary with -o BatchMode=yes -o StrictHostKeyChecking=accept-new, so authentication is whatever your ~/.ssh/config and ssh-agent provide. Passwords are not supported — keys only.

The provider takes no configuration block.

Kinds

`ssh_exec`

Runs a command on a remote host. Stratum re-runs the command whenever any attribute changes (typically command).

resource "ssh_exec" "uptime" {
  host    = host.prod.addr
  command = "uptime"
}

resource "ssh_exec" "bootstrap" {
  host       = host.prod.addr
  command    = "mkdir -p /opt/stratum"
  on_destroy = "rm -rf /opt/stratum"
}

attr	required	type	default	description
`host`	yes	string	—	SSH target in `user@host` form. Passed verbatim to `ssh`.
`command`	yes	string	—	Shell command to run on the remote host.
`env`	no	map	none	Each entry becomes `export KEY=VALUE;` prepended to `command`. Values are shell-quoted. A `secret.<name>.value` ref is supported here (state stores a redaction marker, same as `docker_container.env`). Keys are sorted for deterministic command output.
`on_destroy`	no	string	none	Command to run when this resource is deleted from config.

Stored state adds stdout, stderr, and exit_code from the last run.

The effective remote command is export K1=V1; export K2=V2; ...; <command> — a single shell invocation, not a separate environment. The env map is the right place for short, sensitive values (a GitHub PAT, a deploy token) that flow into a one-shot command without landing in a file. For values that need to persist on the host, use system_secret_file.
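
A sketch of how the effective command could be assembled. The `shell_quote` body below is an assumed single-quote implementation written for this example, not the provider's actual helper; a `BTreeMap` stands in for the sorted-keys behavior.

```rust
use std::collections::BTreeMap;

/// Assumed quoting strategy: wrap in single quotes and escape embedded ones.
fn shell_quote(value: &str) -> String {
    format!("'{}'", value.replace('\'', "'\\''"))
}

/// Prepends `export KEY=VALUE;` for each env entry, in sorted key order,
/// then appends the command itself.
fn effective_command(env: &BTreeMap<String, String>, command: &str) -> String {
    let mut out = String::new();
    // BTreeMap iterates in key order, which gives deterministic output.
    for (key, value) in env {
        out.push_str(&format!("export {}={}; ", key, shell_quote(value)));
    }
    out.push_str(command);
    out
}
```

Since the exports and the command share one shell invocation, the variables are visible to the command but do not persist on the host after it exits.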

If the remote command exits non-zero, apply fails with the captured stderr.

When the resource is removed from config and on_destroy is set, the command is executed; otherwise delete is a no-op (with a log line).

Drift detection: read always returns Observed::Unknown("ssh_exec has no readable identity on the host"). ssh_exec resources show up in unreadable counts on stratum plan --refresh and after every stratum apply -y. That's intentional — a fire-and-forget shell command has no canonical "current state" to read back.

`ssh_file`

Writes a file to a remote path. The file is re-uploaded whenever its content changes; a change is detected by comparing the sha256 of the new content against the hash recorded in prior state. Deletion runs rm -f on the recorded path.

resource "ssh_file" "motd" {
  host    = host.prod.addr
  path    = "/etc/motd"
  content = "Managed by stratum.\n"
  mode    = "0644"
}

attr	required	type	default	description
`host`	yes	string	—	SSH target in `user@host` form.
`path`	yes	string	—	Absolute destination path on the remote host.
`content`	yes	string	—	File contents, written verbatim.
`mode`	no	string	`0644`	File mode, passed to `install -m`.

Upload uses install -m <mode> /dev/stdin <path> with the content streamed via stdin. Binary content is not officially supported (the value comes through the parser as a UTF-8 string).

Stored state: { host, path, mode, sha256 }. The hash is what stratum compares on the next plan to decide whether to re-upload.

For richer file management (owner, group, auto-created parent dirs, content_file = "..." inlining), use system_file instead.

Drift detection: read runs a single-roundtrip probe: if [ -e <path> ]; then printf '%s|' "$(stat -c %a <path>)"; sha256sum <path> | cut -d' ' -f1; else echo MISSING; fi. Returns Absent for missing files, Present { host, path, mode, sha256 } otherwise.

Apply behavior

Trace lines, one per resource:

[ssh] EXEC `uptime` on root@192.0.2.10
[ssh] FILE `motd` -> root@192.0.2.10:/etc/motd
[ssh] FILE `motd` on root@192.0.2.10 unchanged (sha256 match)

docker

Drives the docker CLI on a remote host over SSH. All operations are shell commands sent through ssh -o BatchMode=yes -o StrictHostKeyChecking=accept-new — there is no Docker API client.

The provider takes no configuration block.

Update strategy

docker_container updates are recreate: docker rm -f <name> followed by docker pull <image> and docker run. There is no in-place update. Any change to any attribute triggers a recreate. Networks are reconciled by name; the create path is idempotent (docker network inspect ... || docker network create ...), so re-applying without changes is a no-op.
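
The recreate sequence can be sketched as a list of commands. This is an approximation only: the function name is hypothetical, the `-d` flag is an assumption, shell quoting is omitted, and the real `docker run` line also carries env, ports, volumes, networks, and labels.

```rust
/// Builds the command sequence for a container recreate.
/// Assumptions: detached mode (`-d`) and no shell quoting, for brevity.
fn recreate_commands(name: &str, image: &str, restart: &str, pull: bool) -> Vec<String> {
    let mut cmds = vec![format!("docker rm -f {name}")];
    if pull {
        // Skipped when `pull = false`, e.g. for locally built images.
        cmds.push(format!("docker pull {image}"));
    }
    cmds.push(format!(
        "docker run -d --name {name} --restart {restart} {image}"
    ));
    cmds
}
```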

docker_image is a producer kind: it runs docker build on the host and records the resulting image_id. Drift on build_args / context / dockerfile / target triggers a rebuild. Drift on tag itself recreates (the new tag won't share an image ID with the old one). See docker_image below.

Kinds

`docker_network`

Ensures a user-defined network exists on the host.

resource "docker_network" "edge" {
  host = host.primary.addr
  name = "stratum-edge"
}

attr	required	type	default	description
`host`	yes	string	—	SSH target.
`name`	no	string	resource name	Docker network name. Falls back to the resource label.
`driver`	no	string	`bridge`	Network driver passed to `--driver`.

The create/update path runs docker network inspect <name> >/dev/null 2>&1 || docker network create --driver <driver> <name>, so applying twice is safe.

Delete runs docker network rm <name>.

Drift detection: read runs docker network inspect <name> --format '{{json .}}'. Returns Absent on No such network, otherwise Present { host, name, driver }. The inspect output is parsed for Name and Driver.

`docker_container`

A long-running container.

resource "docker_container" "traefik" {
  host    = host.primary.addr
  name    = "traefik"
  image   = "traefik:v3.1"
  restart = "unless-stopped"
  env = {
    NODE_ENV = "production"
  }
  ports    = ["80:80", "443:443"]
  volumes  = ["/var/run/docker.sock:/var/run/docker.sock:ro"]
  networks = ["stratum-edge"]
  labels = {
    "traefik.enable" = "true"
  }
  command = "--api.dashboard=true"
}

attr	required	type	default	description
`host`	yes	string	—	SSH target.
`image`	yes	string	—	Image reference. Pulled before run unless `pull = false`.
`name`	no	string	resource name	Container name (`--name`). Falls back to the resource label.
`restart`	no	string	`unless-stopped`	Passed to `--restart`.
`pull`	no	bool	`true`	If `false`, skip the `docker pull` step. Use for locally-built images that aren't in any registry (see `pull = false` below).
`env`	no	map	none	Each entry becomes `-e KEY=VALUE`. Non-string values are JSON-stringified. A value that resolves from `secret.<name>.value` flows through as a normal env var; state stores only a redaction marker. See Secrets.
`ports`	no	list of string	none	Each string becomes `-p <spec>`. Use the standard `host:container[/proto]`.
`volumes`	no	list of string	none	Each string becomes `-v <spec>`.
`labels`	no	map	none	Each entry becomes `-l KEY=VALUE`. Dotted keys (Traefik) must be quoted.
`networks`	no	list of string	none	Each string becomes `--network <name>`.
`command`	no	string \| list of string	none	Appended after the image. String form is split on whitespace; list form preserves each element as one argv token (see `command`).
`memory`	no	string	none	Hard memory limit, passed to `--memory`. Same syntax as docker (`256m`, `1g`).
`memory_swap`	no	string	none	Total memory + swap limit, passed to `--memory-swap`. See docker's docs for the swap-disable / swap-unlimited shorthands.
`healthcheck`	no	map	none	Lowered to `--health-*` flags on `docker run`. See `healthcheck` below.
`depends_on`	no	list of string	none	Resource addresses (`<kind>.<name>`) this container depends on. The planner topo-sorts before applying. See `depends_on` below.

The constructed command is:

docker pull <image> >/dev/null; docker run -d --name <name> --restart <restart> [...flags...] <image> [<command tokens>]

On update it becomes:

docker rm -f <name> >/dev/null 2>&1 || true; docker pull <image> >/dev/null; docker run -d ...

Stored state preserves the input attrs verbatim and adds container_id (the last line of docker run's stdout, which is the container ID).

Delete runs docker rm -f <name>.
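The create and recreate lines above differ only in their prefixes, which a short sketch can make concrete. ContainerSpec and build_run_line are hypothetical names for illustration, only a subset of attributes is modelled, and shell escaping is left out, so this is not stratum's actual implementation.

```rust
/// Hypothetical, trimmed-down view of a docker_container's attributes.
struct ContainerSpec {
    name: String,
    image: String,
    restart: String,
    pull: bool,
    ports: Vec<String>,
    networks: Vec<String>,
}

/// Build the shell line for a create (`recreate = false`) or an update
/// (`recreate = true`). Values are assumed to be shell-safe already.
fn build_run_line(spec: &ContainerSpec, recreate: bool) -> String {
    let mut parts: Vec<String> = Vec::new();
    if recreate {
        // Updates remove the old container first and tolerate its absence.
        parts.push(format!("docker rm -f {} >/dev/null 2>&1 || true;", spec.name));
    }
    if spec.pull {
        // Dropped entirely when `pull = false`.
        parts.push(format!("docker pull {} >/dev/null;", spec.image));
    }
    let mut run = format!("docker run -d --name {} --restart {}", spec.name, spec.restart);
    for port in &spec.ports {
        run.push_str(&format!(" -p {}", port));
    }
    for network in &spec.networks {
        run.push_str(&format!(" --network {}", network));
    }
    run.push(' ');
    run.push_str(&spec.image);
    parts.push(run);
    parts.join(" ")
}
```

With pull = false and recreate = true, the line starts with the docker rm -f prefix and goes straight to docker run.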

`pull = false`: locally-built images

When pull = false, the docker pull <image> >/dev/null; prefix is dropped from both the create and the recreate command. Use it when the image is built directly on the host (a sibling ssh_exec runs docker build -t myapp:dev ...) so there is no registry to pull from. With the default pull = true, docker pull myapp:dev errors and the create/update fails.

docker run itself still errors loudly if the image isn't present on the host, so a typo in image is caught at apply time — there's no silent fallback to a stale image.

resource "ssh_exec" "build-app" {
  host    = host.primary.addr
  command = "cd /srv/repos/app && docker build -t app:dev -f Dockerfile ."
}

resource "docker_container" "app" {
  host  = host.primary.addr
  image = "app:dev"
  pull  = false
  # ...
}

`command`: string or argv list

# String form — split on whitespace at apply time.
command = "node server.js --port 4000"

# List form — each element is one argv token. Spaces inside an element are
# preserved (the third element below is a single shell line).
command = ["sh", "-c", "redis-server --requirepass ${secret.redis_pw.value}"]

Use the list form when an argument contains whitespace, embedded quotes, or other shell metacharacters. Each list element is shell-escaped independently when the run command crosses the SSH boundary, so spaces inside one element do not split it into two arguments.

A non-string, non-list value (a number, a bool, a map) errors at apply time.
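The difference between the two forms can be sketched as follows. Command, shell_quote, and command_tokens are hypothetical names; the single-quote escaping shown is a common POSIX technique and only an assumption about how stratum escapes tokens, as is quoting the tokens of the string form.

```rust
/// The two accepted shapes of `command`.
enum Command {
    Line(String),
    Argv(Vec<String>),
}

/// Quote one token for a POSIX shell: wrap it in single quotes and
/// rewrite each embedded single quote as '\''.
fn shell_quote(token: &str) -> String {
    format!("'{}'", token.replace('\'', r"'\''"))
}

/// Lower a command to the argv tokens appended after the image.
fn command_tokens(cmd: &Command) -> Vec<String> {
    let raw: Vec<String> = match cmd {
        // String form: split on whitespace, so quoting inside it is lost.
        Command::Line(line) => line.split_whitespace().map(str::to_string).collect(),
        // List form: each element stays one token, spaces included.
        Command::Argv(items) => items.clone(),
    };
    raw.iter().map(|token| shell_quote(token)).collect()
}
```

The list form keeps "echo it's up" as a single token, while the same text in string form would become three.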

`healthcheck`

A map lowered to --health-* flags on docker run. test is the only required field; everything else has a docker-side default or is sensible to omit.

resource "docker_container" "cache" {
  host  = host.primary.addr
  image = "redis:7-alpine"
  healthcheck = {
    test         = "redis-cli ping | grep -q PONG"
    interval     = "5s"
    retries      = 5
    timeout      = "3s"
    start_period = "10s"
  }
}

field	required	type	lowering	default
`test`	yes	string	`--health-cmd <value>`	—
`interval`	no	string	`--health-interval <v>`	docker default (30s)
`retries`	no	number	`--health-retries <v>`	docker default (3)
`timeout`	no	string	`--health-timeout <v>`	docker default (30s)
`start_period`	no	string	`--health-start-period <v>`	docker default (0s)

Declaring healthcheck opts the container into the post-apply readiness wait: apply will not move on to dependent steps until docker inspect --format '{{.State.Health.Status}}' reports healthy, with a 60s budget.

A healthcheck map without a test field is a hard error at apply time:

`<name>` healthcheck missing required field `test`
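A minimal sketch of the lowering, assuming a typed view of the map. Healthcheck and healthcheck_flags are hypothetical names; the flag names and the error text come from the table and message above.

```rust
/// Hypothetical typed view of the `healthcheck` map.
#[derive(Default)]
struct Healthcheck {
    test: Option<String>,
    interval: Option<String>,
    retries: Option<u32>,
    timeout: Option<String>,
    start_period: Option<String>,
}

/// Lower a healthcheck to `--health-*` flags, failing when `test` is absent.
fn healthcheck_flags(name: &str, hc: &Healthcheck) -> Result<Vec<String>, String> {
    let test = hc
        .test
        .as_ref()
        .ok_or_else(|| format!("`{}` healthcheck missing required field `test`", name))?;
    let mut flags = vec!["--health-cmd".to_string(), test.clone()];
    // Optional fields are only emitted when set, leaving docker's defaults in place.
    let optional = vec![
        ("--health-interval", hc.interval.clone()),
        ("--health-retries", hc.retries.map(|n| n.to_string())),
        ("--health-timeout", hc.timeout.clone()),
        ("--health-start-period", hc.start_period.clone()),
    ];
    for (flag, value) in optional {
        if let Some(v) = value {
            flags.push(flag.to_string());
            flags.push(v);
        }
    }
    Ok(flags)
}
```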

`depends_on` and the post-apply wait

resource "docker_container" "api" {
  host       = host.primary.addr
  image      = "api:dev"
  depends_on = ["docker_container.db", "docker_container.cache"]
}

Each entry must be a <kind>.<name> resource address. The planner uses these edges in three places:

Topo sort of create / update steps. api is reordered after both of its dependencies, regardless of file order. The sort is stable: resources with no edges keep their relative input order, and implicit _stratum_* resources (per-host swap, sshd OOM tuning) fall through with in_degree = 0 and stay at the front.
Cycle detection. A cycle is a hard error at plan time, with the cycle path in the message.
Reverse-topo delete order. When api and db are both removed from config, api is deleted first (reverse topo over the state-resident edges): the dependent goes down before its dependency.

A missing reference is a hard error at plan time, naming both addresses:

depends_on edge: `docker_container.api` references unknown resource `docker_container.db`

depends_on edges only resolve within a single namespace. If a producer and consumer end up in different namespaces, duplicate the producer into the consumer's namespace — see Cross-namespace depends_on.
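The ordering rules can be illustrated with a stable variant of Kahn's algorithm. This is a sketch under assumptions, not the planner's code: topo_sort is a hypothetical name, deps[i] is assumed to list the dependencies of nodes[i], and the cycle error omits the cycle path that the real message includes.

```rust
use std::collections::HashMap;

/// Order `nodes` so every resource comes after its dependencies.
/// `deps[i]` lists the addresses that `nodes[i]` depends on.
fn topo_sort(nodes: &[&str], deps: &[Vec<&str>]) -> Result<Vec<String>, String> {
    let index: HashMap<&str, usize> =
        nodes.iter().enumerate().map(|(i, n)| (*n, i)).collect();
    let mut in_degree = vec![0usize; nodes.len()];
    let mut dependents: Vec<Vec<usize>> = vec![Vec::new(); nodes.len()];
    for (i, list) in deps.iter().enumerate() {
        for dep in list {
            let j = *index.get(dep).ok_or_else(|| {
                format!(
                    "depends_on edge: `{}` references unknown resource `{}`",
                    nodes[i], dep
                )
            })?;
            dependents[j].push(i);
            in_degree[i] += 1;
        }
    }
    let mut done = vec![false; nodes.len()];
    let mut order = Vec::with_capacity(nodes.len());
    while order.len() < nodes.len() {
        // Always take the earliest ready node, which keeps input order stable.
        let next = (0..nodes.len()).find(|&i| !done[i] && in_degree[i] == 0);
        match next {
            Some(i) => {
                done[i] = true;
                order.push(nodes[i].to_string());
                for &k in &dependents[i] {
                    in_degree[k] -= 1;
                }
            }
            // Nodes remain but none is ready: the remaining edges form a cycle.
            None => return Err("depends_on cycle detected".to_string()),
        }
    }
    Ok(order)
}
```

Reversing the returned order gives a delete order in which dependents go down before their dependencies.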

Post-apply readiness wait. After every successful docker_container create or update, the planner pauses before moving on:

If healthcheck is declared, it polls docker inspect --format '{{.State.Health.Status}}' <name> once a second for up to 60s, waiting for healthy. A status of unhealthy or a timeout fails the apply with a named error. An empty status (no healthcheck configured at the docker level) is treated as ready immediately.
Otherwise, a cosmetic 500ms pause gives docker time to wire networks and volumes before the next step pokes the container.

This wait is what makes depends_on actually useful — a dependent container starts against a healthy dependency, not just a running one.
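The polling rules can be sketched with the docker inspect call abstracted behind a closure. wait_ready, Ready, and the error wording are hypothetical; the 60 attempts at one-second intervals described above would correspond to attempts = 60 and interval = Duration::from_secs(1).

```rust
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum Ready {
    Healthy,
    NoHealthcheck,
}

/// Poll `probe` (standing in for the `docker inspect` health query) until it
/// reports ready, for at most `attempts` tries, sleeping `interval` in between.
fn wait_ready<F: FnMut() -> String>(
    name: &str,
    mut probe: F,
    attempts: u32,
    interval: Duration,
) -> Result<Ready, String> {
    for attempt in 0..attempts {
        match probe().trim() {
            "healthy" => return Ok(Ready::Healthy),
            // Empty status: no healthcheck at the docker level, ready at once.
            "" => return Ok(Ready::NoHealthcheck),
            "unhealthy" => return Err(format!("container `{}` reported unhealthy", name)),
            // Any other status (for example "starting") means keep polling.
            _ => {}
        }
        if attempt + 1 < attempts {
            std::thread::sleep(interval);
        }
    }
    Err(format!("container `{}` not healthy after {} checks", name, attempts))
}
```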

Drift detection: read runs docker inspect <name> --format '{{json .}}'. Returns Absent on No such object / No such container, otherwise Present. The parsed shape is { host, name, image, restart, labels, networks, container_id }.

Two normalization rules in parse_container_inspect keep observed labels from being noisy:

All com.docker.* labels (compose metadata, etc.) are dropped.
Remaining labels are intersected with the state's label key set, so image-baked LABELs and daemon-injected labels don't surface as drift.

Networks come from NetworkSettings.Networks, sorted lexicographically. (diff_observed treats string arrays as sets, so order changes don't drift either.)
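The two label rules amount to a filter over the observed map. The sketch below uses a hypothetical normalize_labels helper over plain string maps; the real parse_container_inspect works on docker's inspect JSON and may differ in detail.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Keep only observed labels that stratum tracks in state, after dropping
/// every `com.docker.*` label.
fn normalize_labels(
    observed: &BTreeMap<String, String>,
    state_keys: &BTreeSet<String>,
) -> BTreeMap<String, String> {
    observed
        .iter()
        .filter(|(k, _)| !k.starts_with("com.docker.") && state_keys.contains(*k))
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect()
}
```

An image-baked label such as org.opencontainers.image.title is dropped because it is not in the state's key set, so it never shows up as drift.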

`docker_image`

Builds an image on a remote host from a context directory. Producer kind: state captures the resulting image_id so drift can be detected when the image is rebuilt or removed underneath stratum.

resource "docker_image" "api" {
  host       = host.primary.addr
  context    = "/srv/repos/api"
  dockerfile = "Dockerfile"
  target     = "runner"
  tag        = "api:dev"
  pull_base  = true
  build_args = {
    NODE_ENV = "production"
    API_URL  = "https://api.example.com"
  }
}

attr	required	type	default	description
`host`	yes	string	—	SSH target. The build runs on this host's docker daemon.
`tag`	yes	string	—	Image tag (`-t`) and the lookup key for `read`.
`context`	yes	string	—	Absolute path on the host containing the Dockerfile and build context. Stratum `cd`s into it before `docker build`.
`dockerfile`	no	string	`Dockerfile`	Dockerfile filename, passed to `-f`.
`target`	no	string	none	Multi-stage build target, passed to `--target`.
`build_args`	no	map	none	Each entry becomes `--build-arg KEY=VALUE`. Keys are sorted alphabetically for deterministic command output.
`pull_base`	no	bool	`false`	If true, passes `--pull` so docker re-pulls the base image instead of reusing a cached one.

The build line is:

cd <context> && DOCKER_BUILDKIT=1 docker build [--target T] [--build-arg K=V ...] [--pull] -t <tag> -f <dockerfile> .

DOCKER_BUILDKIT=1 is always set. The host must have the buildx plugin available (docker buildx).
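A sketch of how the build line could be assembled. ImageSpec and build_line are hypothetical names, and values are assumed to be shell-safe; storing build_args in a BTreeMap yields the alphabetical key order described in the attribute table.

```rust
use std::collections::BTreeMap;

/// Hypothetical typed view of a docker_image's attributes.
struct ImageSpec {
    context: String,
    dockerfile: String,
    target: Option<String>,
    tag: String,
    pull_base: bool,
    build_args: BTreeMap<String, String>,
}

/// Assemble the build line in the flag order shown above.
fn build_line(spec: &ImageSpec) -> String {
    let mut cmd = format!("cd {} && DOCKER_BUILDKIT=1 docker build", spec.context);
    if let Some(target) = &spec.target {
        cmd.push_str(&format!(" --target {}", target));
    }
    // BTreeMap iterates in key order, so the output is deterministic.
    for (key, value) in &spec.build_args {
        cmd.push_str(&format!(" --build-arg {}={}", key, value));
    }
    if spec.pull_base {
        cmd.push_str(" --pull");
    }
    cmd.push_str(&format!(" -t {} -f {} .", spec.tag, spec.dockerfile));
    cmd
}
```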

After a successful build, stratum runs docker images --no-trunc --format '{{.ID}}' <tag> and records the full image ID as both image_id and id in state (the id alias is reserved for future resource-attr refs of the form docker_image.X.id).

Update strategy. docker_image does not have a separate update path — both create and update run the same build. The desired-vs-prior diff drives whether the build runs at all: a change to any of context, dockerfile, target, build_args, pull_base, or tag produces an Update step that re-runs docker build. A drift-detected change to image_id (image deleted or rebuilt out of band) also re-runs the build.

Drift detection: read runs docker images --no-trunc --format '{{.ID}}' <tag>. Empty output → Absent. Otherwise Present { host, tag, image_id, id, ...prior fields }. The build-time fields (context, dockerfile, target, build_args, pull_base) are echoed forward from prior state — docker doesn't preserve them post-build, and re-querying them is impossible. Drift on those is detected at plan time via desired-vs-prior diff, not by read.

Delete runs docker rmi <tag> best-effort.

Apply behavior

Trace lines per resource:

[docker] NETWORK `stratum-edge` on root@192.0.2.10
[docker] IMAGE `api:dev` on root@192.0.2.10 (context=/srv/repos/api)
[docker] CONTAINER `traefik` on root@192.0.2.10 (create)
[docker] CONTAINER `traefik` on root@192.0.2.10 (recreate)

git

Owns a git working tree on a remote host. Shells out to the system git binary over SSH — there is no libgit2 dependency. The provider takes no configuration block.

Kinds

`git_repo`

Clones a remote repository to a fixed path and keeps it pinned to a given ref. Producer kind: state captures the resolved commit_sha, which is what other steps (a docker_image rebuild, an ssh_exec post-deploy) can observe drift against.

resource "git_repo" "app" {
  host = host.primary.addr
  path = "/srv/app"
  url  = "https://github.com/example/app.git"
  ref  = "main"
}

attr	required	type	default	description
`host`	yes	string	—	SSH target.
`path`	yes	string	—	Absolute destination path on the remote host. The directory is created by `git clone`.
`url`	yes	string	—	Remote URL. Passed verbatim to `git clone`. Authentication uses whatever's on the host (`~/.git-credentials`, deploy key in `~/.ssh`, etc.) — stratum does not inject credentials.
`ref`	yes	string	—	A branch name, a tag, or a full 40-character lowercase hex SHA. See Refs below for the dispatch rules.
`depth`	no	number	none	Shallow-clone depth, passed to `--depth`. Only honored when `ref` is a branch or tag; full SHAs always clone full (`--depth` + bare SHA is fragile).

State shape

{
  "host": "root@192.0.2.10",
  "path": "/srv/app",
  "url": "https://github.com/example/app.git",
  "ref": "main",
  "depth": null,
  "commit_sha": "bfb77a6c1d808e04..."
}

commit_sha is the output of git rev-parse HEAD after the create / update. Stratum re-reads HEAD on --refresh; a mismatch between the recorded SHA and the HEAD of the working tree on the remote host becomes drift, which update resolves with a fetch + reset.

Refs

The ref value dispatches at command-build time:

Branch or tag (anything not matching a full SHA shape) → git clone --branch <ref> [--depth N] <url> <path>. Updates run git fetch origin && git reset --hard origin/<ref>.
Full 40-character hex SHA → git clone <url> <path> && git -C <path> checkout <sha>. The --branch flag does not accept bare SHAs, so the clone-then-checkout shape is required. --depth is intentionally ignored for SHA refs. Updates run git fetch origin && git checkout <sha>.

A short SHA (bfb77a6) is treated as a branch name — git will reject it as a missing branch at clone time. Use the full 40-char form if you need SHA-pinned behavior.
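The dispatch rule fits in a few lines. is_full_sha and clone_command are hypothetical names, and the arguments are assumed to be shell-safe; the command shapes follow the two cases above.

```rust
/// True only for a full 40-character lowercase hex SHA.
fn is_full_sha(git_ref: &str) -> bool {
    git_ref.len() == 40 && git_ref.chars().all(|c| matches!(c, '0'..='9' | 'a'..='f'))
}

/// Build the clone command for a branch/tag ref or a full SHA.
fn clone_command(url: &str, path: &str, git_ref: &str, depth: Option<u32>) -> String {
    if is_full_sha(git_ref) {
        // --branch does not accept bare SHAs; --depth is ignored here.
        format!("git clone {} {} && git -C {} checkout {}", url, path, path, git_ref)
    } else {
        let depth_flag = depth.map(|d| format!(" --depth {}", d)).unwrap_or_default();
        format!("git clone --branch {}{} {} {}", git_ref, depth_flag, url, path)
    }
}
```

A short SHA such as bfb77a6 fails the is_full_sha check and therefore takes the branch path.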

Recreate vs in-place update

change	action
`url` differs	`rm -rf <path>` then re-clone. State preserves the new url + sha.
`path` differs	`rm -rf <old-path>` then re-clone at the new path.
`ref` / `depth` only	`git fetch origin` + `git reset --hard origin/<ref>` (or `git checkout <sha>`).

There is no git fetch --depth shrink / expand path — depth changes alone do not trigger any work today.

Drift detection

read runs a single-roundtrip probe:

if [ ! -d <path>/.git ]; then echo MISSING;
else printf '%s|' "$(git -C <path> rev-parse HEAD)";
     git -C <path> remote get-url origin;
fi

<path> or <path>/.git missing → Absent.
Probe output <sha>|<origin_url> → Present { host, path, url, ref, depth, commit_sha }. The url in the observed value is the actual remote URL — if it differs from the desired url, the plan flags drift on url and the next apply will re-clone.
Anything else → Unknown with the unparsed output.
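The three outcomes map onto a small parser. GitObserved and parse_git_probe are hypothetical names, and the check that the SHA is 40 hex characters is an assumption about what counts as well-formed probe output.

```rust
#[derive(Debug, PartialEq)]
enum GitObserved {
    Absent,
    Present { commit_sha: String, url: String },
    Unknown(String),
}

/// Classify the probe output: `MISSING`, `<sha>|<origin_url>`, or anything else.
fn parse_git_probe(output: &str) -> GitObserved {
    let trimmed = output.trim();
    if trimmed == "MISSING" {
        return GitObserved::Absent;
    }
    match trimmed.split_once('|') {
        Some((sha, url))
            if sha.len() == 40
                && sha.chars().all(|c| c.is_ascii_hexdigit())
                && !url.trim().is_empty() =>
        {
            GitObserved::Present {
                commit_sha: sha.to_string(),
                url: url.trim().to_string(),
            }
        }
        // Keep the raw output so the caller can report what came back.
        _ => GitObserved::Unknown(output.to_string()),
    }
}
```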

Apply trace

[git] CLONE git_repo `app` -> root@192.0.2.10:/srv/app (ref=main)
[git] UPDATE git_repo `app` (root@192.0.2.10:/srv/app) ref=main
[git] DELETE git_repo `app` (root@192.0.2.10:/srv/app)

Notes

Stratum does not manage SSH host keys for git remotes. If the remote is a private git server reached over SSH, the host where git runs (the stratum-managed remote host, not your laptop) must already have accepted the git server's host key.
Credentials live on the host. For GitHub HTTPS, the common pattern is to drop ~/.git-credentials via a system_secret_file sourcing a personal access token from secret { from_env = "GH_PAT" }.
git_repo does not run anything inside the cloned tree (no npm install, no docker build). Pair it with a docker_image (using the clone path as context) or an ssh_exec for post-clone work.

CLI

Single binary: stratum. All commands operate against one or more config files (default stratum.strat) and a state file (default .stratum/state.json).

stratum <COMMAND>

command	purpose
`plan`	Print the diff between config and state.
`apply`	Execute the plan and save state.
`status`	Print per-host resource usage and per-container stats.
`state list`	List resources currently tracked in state.
`state show`	Print one resource's full state as JSON.
`state merge`	Merge two or more state files into one.

Global flags

These flags work on any subcommand (clap's global = true).

flag	short	default	description
`--env-file <PATH>`		none	Load env vars from a `.env`-style file before resolving `secret { from_env }` refs. Repeatable to layer multiple files. See `--env-file` and `.env` auto-load below.
`--namespace <NAME>`	`-n`	none	Operate within the named namespace declared in the manifest. Resolves `-c` from the namespace's `configs = [...]` and `-s` to `.stratum/<name>.json`. See Namespace mode below.
`--manifest <PATH>`		auto	Override the manifest path used by `-n`. Default: `./stratum.strat` if it exists. Only consulted when `-n` is set.

Namespace mode (`-n` and `--manifest`)

A namespace is a deployable slice declared in a top-level manifest. When -n NAME is set, the CLI:

Locates the manifest. If --manifest PATH is set, that path is used. Otherwise ./stratum.strat is required to exist; if it doesn't, the command errors out (requires a manifest, but ./stratum.strat does not exist).
Loads the manifest and looks up the named namespace. If it isn't declared, the error names every known namespace from the manifest.
Resolves -c from the namespace's configs = [...]. The manifest itself is always the first file in the merged set, so its host / secret / provider blocks are visible to every per-namespace config.
Resolves -s to (in priority order) the explicit -s on the command line, the namespace's body-level state = "...", or .stratum/<name>.json.

stratum -n infra plan                          # uses ./stratum.strat, configs from `namespace "infra"`
stratum -n app   apply -y                      # state at .stratum/app.json
stratum --manifest deploy/manifest.strat -n web apply -y

-n and -c are mutually exclusive. Passing both errors out — the namespace's configs = [...] is the config list, and a -c override would silently shadow the manifest's intent. Drop the -n to operate in bundle mode, or remove the -c to let the namespace pick.

-s is not mutually exclusive with -n. The CLI default is .stratum/state.json; the namespace's default is .stratum/<name>.json. If you want to point a namespace's apply at a custom state file (typically during migration from bundle mode), pass -s explicitly and it wins.
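The priority order for the state path can be written as a simple fallback chain. resolve_state_path is a hypothetical helper; the three sources and their order are taken from the steps above.

```rust
use std::path::PathBuf;

/// Resolve the state path in namespace mode: explicit `-s` first, then the
/// namespace's `state = "..."`, then `.stratum/<name>.json`.
fn resolve_state_path(
    cli_state: Option<&str>,
    namespace_state: Option<&str>,
    namespace: &str,
) -> PathBuf {
    if let Some(path) = cli_state {
        return PathBuf::from(path);
    }
    if let Some(path) = namespace_state {
        return PathBuf::from(path);
    }
    PathBuf::from(format!(".stratum/{}.json", namespace))
}
```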

Split state

In namespace mode, state writes to two files instead of one:

<state_path> (e.g. .stratum/<name>.json) — every user-declared resource.
<state_path's directory>/_shared.json — every implicit per-host _stratum_* resource (auto-injected swap, sshd OOM tuning, sshd reload).

The shared file is what lets multiple namespaces target the same host without each one trying to recreate the tuning resources. The first namespace's apply creates them; subsequent namespaces see them as no-op. See Architecture: split state for the merge rules.

Cross-namespace conflict checks

Before classification, plan and apply in namespace mode re-load every sibling namespace's configs (with unresolved secrets tolerated) and walk every docker_container they declare. Two collision classes are caught:

Host port — two namespaces declare a docker_container binding the same (host, host_port). The check parses H:C and IP:H:C shapes; ranges and bare-port-random forms are skipped.
Container name — two namespaces declare a docker_container with the same (host, name). The default for name is the resource's label.

Both error messages name the offending resource and the sibling that already claims the port or name:

cross-namespace port conflict on host `root@192.0.2.10`:
  - app::docker_container.web  (current) wants 80
  - infra::docker_container.traefik  (sibling) already claims it

The check runs at plan time and is skipped entirely in bundle mode (no -n).
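To illustrate which port specs take part in the check, here is a sketch of host-port extraction. host_port is a hypothetical helper; stripping a /proto suffix before parsing is an assumption, since the text only names the H:C and IP:H:C shapes.

```rust
/// Extract the host port from a `-p` spec, or None for forms the check skips.
fn host_port(spec: &str) -> Option<u16> {
    // Drop an optional protocol suffix such as "/tcp" (assumed behaviour).
    let no_proto = spec.split('/').next().unwrap_or(spec);
    let parts: Vec<&str> = no_proto.split(':').collect();
    let host = match parts.as_slice() {
        [h, _container] => *h,      // H:C
        [_ip, h, _container] => *h, // IP:H:C
        _ => return None,           // bare container port or unknown shape
    };
    // Ranges ("8000-8010") and empty host parts fail to parse and are skipped.
    host.parse::<u16>().ok()
}
```

Two containers on the same host conflict when host_port returns the same Some(port) for specs declared in different namespaces.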

`--env-file` and `.env` auto-load

secret { from_env = "X" } reads std::env::var("X") at config-load time. To avoid having to export X=... (and to keep credentials out of shell history), stratum can load env files before resolving secrets.

stratum --env-file .env.prod plan -c bootstrap.strat
stratum --env-file base.env --env-file overrides.env apply -y

Rules:

Auto-load. If --env-file is not passed at all, stratum auto-loads ./.env if it exists. Auto-load is silent on miss — no error if there's no .env.
Explicit list. If --env-file is passed one or more times, only those files are loaded. The auto-.env is not consulted. Every listed path must exist and parse, or the command errors before doing any work.
Process env wins. A variable already set in the process environment is never overwritten by a file. This is the 12-factor rule — FOO=x stratum apply keeps FOO=x regardless of what .env says.
First-set wins among files. When --env-file a --env-file b is passed, a is parsed first; a variable set in a is not overwritten by b. The CLI prints [env] loaded <path> to stderr for every file successfully loaded.
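The precedence rules reduce to "never overwrite an existing key". The sketch below models that with plain maps; layer_env is a hypothetical helper, and the real CLI presumably sets process environment variables rather than returning a map.

```rust
use std::collections::HashMap;

/// Merge env-file variables under the process environment.
/// `files` is in command-line order; each inner Vec holds one file's pairs.
fn layer_env(
    process_env: &HashMap<String, String>,
    files: &[Vec<(String, String)>],
) -> HashMap<String, String> {
    let mut merged = process_env.clone();
    for file in files {
        for (key, value) in file {
            // An existing entry (process env or an earlier file) is kept.
            merged.entry(key.clone()).or_insert_with(|| value.clone());
        }
    }
    merged
}
```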

Useful with secret { from_env = ... }:

# .env
PG_PASSWORD=correcthorsebatterystaple
GH_PAT=ghp_...

secret "pg_password" { from_env = "PG_PASSWORD" }
secret "gh_pat"      { from_env = "GH_PAT" }

The .env file is not committed (add it to .gitignore). For team workflows, check in a .env.example with placeholder values instead.

`plan`

Print what would change if applied. Read-only — never writes state. By default never contacts a host either; pass --refresh to query live state.

stratum plan
stratum plan --refresh
stratum plan -c infra.strat -c app.strat -s .stratum/host.json
stratum plan --allow-unresolved-secrets
stratum -n app plan                                    # namespace mode

flag	short	default	description
`--config <PATH>`	`-c`	`stratum.strat`	Path to a `.strat` config file. Repeatable: every `-c` file is merged into one document and evaluated together. See Multi-file configs.
`--state <PATH>`	`-s`	`.stratum/state.json`	Path to the JSON state file.
`--refresh`		off	Query live hosts via `Provider::read` and annotate drift between state and reality.
`--allow-unresolved-secrets`		off	If a `secret` block's source is missing (env unset, file unreadable), substitute a `<unresolved-secret:NAME>` placeholder instead of failing. Plan-only — `apply` refuses any plan containing placeholders. See Secrets: `--allow-unresolved-secrets`.

Output is a list of resources prefixed with an action symbol:

symbol	action
(blank)	no change
`+`	create
`~`	update (with field-by-field diff)
`-`	delete

Followed by a summary line: N create, N update, N delete, N no-op.

Secret rendering

A leaf value that is a secret marker (in either prior or desired) prints as <secret:NAME sha:abc123>, where abc123 is the first six hex chars of the SHA-256 of the plaintext. Enough to spot a rotation; not enough to attack offline. See Secrets.

 ~ docker_container.stratum-postgres
      ~ env.POSTGRES_PASSWORD: <secret:pg_password sha:f7c3bc> -> <secret:pg_password sha:9a1e44>

`--refresh`: live drift detection

When --refresh is set, stratum calls each provider's read method for every step that has prior state (i.e. not Create), compares the result against what's recorded, and prints per-resource annotations:

   docker_container.traefik
      ! DRIFT: image: state="traefik:v2.11" observed="traefik:v3.0"
   system_service.docker
      ! DRIFT: resource missing on host (state says exists)

A footer line summarizes:

drift: clean

Or, when there's drift:

drift: 2 differ, 1 missing, 4 unreadable

differ — read returned data that doesn't match state on at least one field.
missing — read returned Absent (the resource is gone on the host but still in state).
unreadable — the provider returned Observed::Unknown (no read impl, or read errored). For system_ufw_rule and ssh_exec, this is always the case by design.
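As an illustration of how the footer relates to the three buckets, the sketch below tallies per-resource outcomes. Outcome and drift_footer are hypothetical names, and omitting zero counts is an assumption based on the post-apply summary format, not confirmed behaviour of plan --refresh.

```rust
/// Per-resource result of a refresh.
#[derive(Clone, Copy)]
enum Outcome {
    Match,
    Differ,
    Missing,
    Unreadable,
}

/// Summarize refresh outcomes into a footer line.
fn drift_footer(outcomes: &[Outcome]) -> String {
    let (mut differ, mut missing, mut unreadable) = (0usize, 0usize, 0usize);
    for outcome in outcomes {
        match outcome {
            Outcome::Match => {}
            Outcome::Differ => differ += 1,
            Outcome::Missing => missing += 1,
            Outcome::Unreadable => unreadable += 1,
        }
    }
    let mut parts = Vec::new();
    if differ > 0 {
        parts.push(format!("{} differ", differ));
    }
    if missing > 0 {
        parts.push(format!("{} missing", missing));
    }
    if unreadable > 0 {
        parts.push(format!("{} unreadable", unreadable));
    }
    if parts.is_empty() {
        "drift: clean".to_string()
    } else {
        format!("drift: {}", parts.join(", "))
    }
}
```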

Drift detection is one-sided: fields that exist in state but not in the observed data are ignored (providers don't surface every field). It is also marker-aware — a state-side secret marker is hashed against an observed plaintext, so secret-bearing fields don't perpetually drift on every refresh. See Architecture: drift detection and Secrets: drift detection.

Create steps are skipped during refresh — there's no prior state to compare.

`apply`

Compute the plan, print it, and — if confirmed with -y — execute side effects against remote hosts.

stratum apply -y
stratum apply -y -c infra.strat -c app.strat -s .stratum/host.json
stratum apply -y --allow-destroy
stratum -n app apply -y                                # namespace mode

flag	short	default	description
`--config <PATH>`	`-c`	`stratum.strat`	Path to a `.strat` config file. Repeatable, like `plan -c`. See Multi-file configs.
`--state <PATH>`	`-s`	`.stratum/state.json`	Path to the JSON state file.
`--yes`	`-y`	off	Skip the confirmation gate and execute. Without it, apply prints the plan and exits.
`--allow-destroy`		off	Permit `Delete` steps. Required whenever the plan would remove a resource that's in state but absent from config. See `--allow-destroy` below.

apply refuses to run a plan containing any <unresolved-secret:NAME> placeholder — those only appear via plan --allow-unresolved-secrets, and they're a plan-only construct. Resolve the secret's source and retry.

Confirmation gate: without -y, apply prints the plan and stops with Apply? Re-run with -y to execute against remote hosts. There is no interactive prompt; the only way to proceed is to re-run with -y.

When -y is set, you'll see this banner before the work starts:

!! Applying: side effects WILL execute on remote hosts.

State is written to --state after every successful apply.

Post-apply self-check

After a successful apply, stratum re-loads the config, rebuilds the plan against the freshly updated state, and runs refresh_plan against the live host. The result is one summary line:

post-apply drift: clean

Or, when something is off:

post-apply drift: 1 differ, 4 unreadable — run 'stratum plan --refresh' to see details

This catches resources that "applied successfully" according to the provider but don't actually match reality (rare, but possible — e.g. a systemd service that exits zero from start but immediately crashes). Unreadable counts are expected when your config includes system_ufw_rule or ssh_exec resources; both return Unknown from read by design.

If the plan was a no-op (Nothing to do.), no apply runs and no post-apply check happens.

`--allow-destroy`: the destruction guard

This flag exists because of a failure shape that's easy to hit by accident. Apply a config against a state file that holds resources the config doesn't declare — by forgetting a -c, by pointing -s at the wrong file, by reusing a bundle state for one slice of a larger deployment — and build_plan will emit a Delete step for every resource in state but not in config. Without a guard, apply executes those deletes, and a single mis-typed command tears down the whole host: containers, networks, services, packages, ufw rules, in roughly the order BTreeMap iteration of <kind>.<name> produces.

The fix is two-part. The structural fix depends on how you organize configs: one state file per host with multi-file configs, or one state file per namespace. The tactical fix is the destruction guard: if a plan contains any Delete step, apply refuses to run unless --allow-destroy is set.

refusing to apply: plan would delete 9 resources not in config:
  - system_ufw_rule.allow-ssh
  - system_service.ufw
  - system_service.docker
  - system_package.ufw
  - system_file.traefik-config
  - docker_network.edge
  - docker_container.traefik
  - ...

loaded configs: app.strat
state file: .stratum/host.json

If this is intended, re-run with --allow-destroy. If not, you may be applying against the wrong state file (-s), or have forgotten a -c flag.

The check fires when PlanSummary.delete > 0 — the list is every step with Action::Delete. The error names:

the resources slated for deletion,
the loaded config files (so a missing -c is visually obvious),
the state file path (so a wrong -s is too),
two likely causes (a wrong -s, a forgotten -c).

The guard runs before the -y confirmation gate, so you'll hit the same bail with or without -y.
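The guard itself is a small check over the plan's steps. Step, Action, and destroy_guard below are simplified, hypothetical stand-ins for the real plan types, and the error is shortened to the first part of the message shown above.

```rust
#[derive(PartialEq)]
enum Action {
    Create,
    Update,
    Delete,
    NoOp,
}

struct Step {
    addr: String,
    action: Action,
}

/// Refuse a plan containing Delete steps unless `allow_destroy` is set.
fn destroy_guard(steps: &[Step], allow_destroy: bool) -> Result<(), String> {
    let doomed: Vec<&str> = steps
        .iter()
        .filter(|step| step.action == Action::Delete)
        .map(|step| step.addr.as_str())
        .collect();
    if doomed.is_empty() || allow_destroy {
        return Ok(());
    }
    let mut message = format!(
        "refusing to apply: plan would delete {} resources not in config:",
        doomed.len()
    );
    for addr in &doomed {
        message.push_str(&format!("\n  - {}", addr));
    }
    Err(message)
}
```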

When to pass --allow-destroy:

You really did remove a resource from config and want it gone on the host.
You're tearing down a whole config (you commented out its resources and intend a full sweep).

When not to pass it: any time the delete list surprises you. Re-check -s and your -c set.

`status`

Snapshot per-host resource usage. For each unique host declared in the loaded configs, stratum connects over SSH and prints uptime, free memory, root-disk usage, and per-container CPU/RAM/IO from docker stats --no-stream. One section per host.

stratum status
stratum status -c infra.strat -c app.strat
stratum status --host root@192.0.2.10

flag	short	default	description
`--config <PATH>`	`-c`	`stratum.strat`	Config file(s) to enumerate hosts from. Repeatable.
`--host <FILTER>`		none	Only query the matching host. Compared against both the host's `addr` and the block label (`host "name" {...}`).

The state file is not consulted — status doesn't read or write state. It only needs host block declarations.

Output shape:

=== root@192.0.2.10 ===
  up:   17:48:12 up 3 days,  4:21,  1 user
  load: 0.42, 0.31, 0.19
  mem:  Mem:  1.9Gi  1.3Gi  85Mi  19Mi  559Mi  625Mi
  swap: Swap: 4.0Gi  500K   4.0Gi
  disk: /dev/vda1  79G  18G  61G  23% /

  CONTAINER                   CPU%      MEM USAGE / LIMIT    MEM%              NET I/O            BLOCK I/O
  traefik                     0.18%     34.2MiB / 1.9GiB     1.78%      1.2MB / 4.5MB       0B / 12.3kB
  web                         0.04%     12.1MiB / 1.9GiB     0.62%      245kB / 89kB        0B / 0B
  db                          0.21%     112MiB / 256MiB      43.75%     332kB / 412kB       2.4MB / 1.8MB
  cache                       0.08%     8.4MiB / 64MiB       13.13%     189kB / 167kB       0B / 0B

The probe is a single composite shell snippet (one ssh round-trip per host) with sentinel headers (===UPTIME===, ===STATS===) to keep parsing simple. docker stats --no-stream is one snapshot, not the streaming default. If docker is absent or there are no running containers, the section prints (no docker stats available) and moves on.

Errors:

A host block with no addr is skipped with a [status] skipping host warning.
An SSH failure on one host is logged with [status] <addr>: failed: <reason> but does not stop the loop — remaining hosts are still queried.
--host filtering with no match is a hard error: no host matched filter `<arg>`.

status is a between-applies diagnostic — not a replacement for a real monitoring stack. The headline use case is spotting RAM/CPU pressure (a container creeping toward its memory limit, a host swap-thrashing) without leaving stratum's CLI.

`state list`

List every resource currently in state.

stratum state list
stratum state list --path infra-state.json

flag	short	default	description
`--path <PATH>`	`-p`	`.stratum/state.json`	Path to the JSON state file.

Output is one line per resource: <kind>.<name> [<provider>]. Empty state prints (empty state).

`state show`

Print the full JSON for a single resource.

stratum state show docker_container.traefik
stratum state show ssh_exec.uptime --path infra-state.json

Positional argument:

arg	description
`<addr>`	Resource address, `<kind>.<name>` (split on the first `.`).

flag	short	default	description
`--path <PATH>`	`-p`	`.stratum/state.json`	Path to the JSON state file.

Prints the pretty-printed ResourceState JSON, or not found if the address is not in state.
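Splitting on the first dot means a resource name may itself contain dots. A minimal sketch, with parse_addr as a hypothetical helper (rejecting empty halves is an assumption):

```rust
/// Split `<kind>.<name>` on the first `.`; None when either half is missing.
fn parse_addr(addr: &str) -> Option<(String, String)> {
    let (kind, name) = addr.split_once('.')?;
    if kind.is_empty() || name.is_empty() {
        return None;
    }
    Some((kind.to_string(), name.to_string()))
}
```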

`state merge`

Merge two or more state files into one. Used to consolidate per-config state files (e.g. .stratum/infra.json, .stratum/app.json) into one state per host.

stratum state merge \
  -o .stratum/host.json \
  .stratum/infra.json \
  .stratum/app.json

arg / flag	short	default	description
`<INPUTS>...`		—	Source state files. At least two are required.
`--out <PATH>`	`-o`	—	Output path. Must not already exist — refuses to overwrite. Remove the file or pick a different `-o` to retry.

On success:

merged 2 state files into .stratum/host.json (13 resources)

Failure modes:

The output path already exists → refusing to overwrite existing state file ....
Any <kind>.<name> key appears in more than one input → key ... present in both X and Y — refusing to merge. There is no last-writer-wins. Resolve the collision (rename one resource, or remove it from one state) and retry.
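The no-last-writer-wins rule can be sketched as follows. This is a simplified model, not stratum's implementation: `merge_states` is a hypothetical name, each input is a `(file label, resources)` pair, plain `String` values stand in for the per-resource JSON, and the error text only approximates the real message.

```rust
use std::collections::BTreeMap;

/// Merge `(file_label, resources)` pairs into one map, refusing duplicates.
/// `String` values stand in for the real per-resource JSON.
fn merge_states(
    inputs: &[(&str, BTreeMap<String, String>)],
) -> Result<BTreeMap<String, String>, String> {
    if inputs.len() < 2 {
        return Err("at least two input state files are required".to_string());
    }
    let mut merged = BTreeMap::new();
    // Remember which input first contributed each key so the error can name both files.
    let mut owner: BTreeMap<String, &str> = BTreeMap::new();
    for (label, resources) in inputs {
        for (key, value) in resources {
            if let Some(first) = owner.get(key) {
                return Err(format!(
                    "key {key} present in both {first} and {label}; refusing to merge"
                ));
            }
            owner.insert(key.clone(), *label);
            merged.insert(key.clone(), value.clone());
        }
    }
    Ok(merged)
}
```

Failing on the first collision, rather than picking a winner, keeps the merged file from silently dropping a resource that one of the inputs still tracks.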

After merge, verify by running stratum plan against the consolidated state with the full -c set — a non-zero diff means something is off and the old per-config files should not be deleted yet.

Architecture

stratum is a Cargo workspace of seven crates: core, config, the CLI, and four providers. The flow is config → desired resources → diff against prior state → plan → apply via providers → new state → post-apply drift check.

Workspace layout

crates/
  core/                   stratum-core
  config/                 stratum-config
  cli/                    stratum-cli (the `stratum` binary)
  providers/
    ssh/                  stratum-provider-ssh
    docker/               stratum-provider-docker
    system/               stratum-provider-system
    git/                  stratum-provider-git

core has no provider dependencies. config depends on core only for its DesiredResource / ResourceAddr types. Providers depend on core for the trait. The CLI wires everything together.

Core types

All types live in crates/core/src/lib.rs.

`ResourceAddr`

struct ResourceAddr { kind: String, name: String }

Renders as <kind>.<name>. Used as the key in the state map and as the user-facing identifier in CLI output.
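As an illustration of the `<kind>.<name>` rendering and the split-on-first-dot rule that `state show` applies to its argument, here is a small sketch. `key()` is the method name used elsewhere in this chapter; `parse` and its rejection of empty halves are assumptions made for the example.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
struct ResourceAddr {
    kind: String,
    name: String,
}

impl ResourceAddr {
    /// Render as `<kind>.<name>`, the form used for state-map keys.
    fn key(&self) -> String {
        format!("{}.{}", self.kind, self.name)
    }

    /// Parse `<kind>.<name>`, splitting on the first `.` only.
    /// Returns `None` when there is no dot or either half is empty
    /// (the empty-half rule is an assumption of this sketch).
    fn parse(s: &str) -> Option<ResourceAddr> {
        let (kind, name) = s.split_once('.')?;
        if kind.is_empty() || name.is_empty() {
            return None;
        }
        Some(ResourceAddr { kind: kind.to_string(), name: name.to_string() })
    }
}
```

Splitting on the first dot means a name may itself contain dots: `system_file.traefik.yml` parses to kind `system_file` and name `traefik.yml`.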

`ResourceState`

struct ResourceState {
    addr: ResourceAddr,
    provider: String,
    attrs: serde_json::Value,
}

One per tracked resource. attrs is whatever the provider returned from its last create / update.

`State`

struct State {
    version: u32,                              // file format version, default 1
    resources: BTreeMap<String, ResourceState>, // keyed by addr.key()
}

On-disk JSON, loaded with State::load(path) and saved with State::save(path). Default path is .stratum/state.json. A missing file is treated as an empty state. The parent directory is created on save.

`Action` and `FieldChange`

enum Action {
    NoOp,
    Create,
    Update { changes: Vec<FieldChange> },
    Delete,
}

struct FieldChange { field: String, from: Value, to: Value }

Action::symbol() returns the two-character prefix used in plan output ( , +, ~, -).

`Observed` and `Drift`

enum Observed {
    Present(Value),       // resource exists; attrs normalized to state shape
    Absent,               // confirmed gone on the host
    Unknown(String),      // provider can't tell (carries a reason)
}

struct Drift {
    changes: Vec<FieldChange>,
    missing: bool,               // state says exists but observed == Absent
    unreadable: Option<String>,  // observed == Unknown OR read returned Err
}

Drift is per-resource, populated by refresh_plan. Drift::is_clean() is true when all three fields are empty/false/none.

`PlannedResource` and `Plan`

struct PlannedResource {
    addr: ResourceAddr,
    provider: String,
    desired: Value,
    prior: Value,
    action: Action,
    drift: Option<Drift>,     // None unless refresh_plan was called
}

struct Plan { steps: Vec<PlannedResource> }

Plan::summary() returns a PlanSummary { create, update, delete, noop, drifted, missing, unreadable }. Plan::was_refreshed() is true iff any step has a Some(drift).

`DesiredResource`

struct DesiredResource {
    addr: ResourceAddr,
    provider: String,
    attrs: Value,
}

The output of the config evaluator and the input to build_plan.

Plan / apply flow

Resolve sources. If -n NAME is set, the CLI loads the manifest, extracts the named namespace, and resolves the config list to [manifest, ...namespace.configs] and the state path to .stratum/<name>.json (or the namespace's explicit state =). Otherwise the configs and state path come straight from -c / -s. See Manifest discovery.
Parse config. stratum_config::load_files(paths) (or load_file for a single path) runs lex → parse on each file, tags each block with its source path, concatenates into one Document, and runs a multi-pass extract: hosts → secrets → namespaces → providers + resources. The result is an Extracted { hosts, providers, resources, secrets, namespaces, redaction_map }. Any system_file with content_file is inlined during this step — see content_file. Duplicate hosts/providers/secrets/resources/namespaces across files are hard errors that name both paths.
Cross-namespace check. In namespace mode only, re-load every sibling namespace's configs and check the current namespace's docker_container resources for port and container-name collisions. See Cross-namespace validator.
Load state. In bundle mode, State::load(state_path). In namespace mode, State::load_merged(state_path, _shared.json) — see Split state. Missing file → default empty state.
Build plan. build_plan(extracted.resources, &state) -> Result<Plan>:
- Run two planner-side validators before classification: a port-conflict check on every docker_container.ports, and a depends_on topo sort that orders create / update steps and rejects cycles / unknown refs.
- For each desired resource (in topo order): look up prior by addr key. No prior → Create. Otherwise, for kinds in SECRET_CONTENT_TO_SHA, normalize desired before diffing (see Secret-content normalization on plan), then run diff_observed(prior.attrs, normalize_for_plan(kind, desired.attrs)). No changes → NoOp; otherwise Update { changes }.
- For each prior resource not in desired → Delete, in forward topo order over state-resident depends_on edges.
Optional refresh. With plan --refresh, run refresh_plan(&mut plan, &registry) to annotate every non-create step with observed drift.
Print plan. Symbol per resource, fields-changed lines for updates, drift annotations if refreshed, summary at the end.
Confirmation gate. Without -y, exit here. apply without -y is identical to plan plus the "Apply? Re-run with -y to execute" line.
Build registry. Instantiate all providers. (No shipped provider reads its provider { ... } block today.)
Execute. For each plan step (in topo order), look up the provider by kind prefix and call create / update / delete. After every docker_container create or update, run the post-apply readiness wait before moving to the next step. Update state with the returned attrs (or remove the entry on delete).
Save state. In bundle mode, state.save(state_path). In namespace mode, state.save_split(state_path, _shared.json) — _stratum_* addresses route to the shared file, everything else to the namespace's file.
Post-apply self-check. Reload the config, rebuild the plan against the new state, run refresh_plan again, and print one summary line: post-apply drift: clean or post-apply drift: N differ, M missing, K unreadable — run 'stratum plan --refresh' to see details.

The `Provider` trait

#[async_trait]
trait Provider: Send + Sync {
    fn name(&self) -> &str;
    fn kinds(&self) -> &[&'static str];
    fn configure(&mut self, _attrs: &Value) -> Result<()> { Ok(()) }
    async fn create(&self, kind: &str, name: &str, attrs: &Value) -> Result<Value>;
    async fn update(&self, kind: &str, name: &str, prior: &Value, attrs: &Value) -> Result<Value>;
    async fn delete(&self, kind: &str, name: &str, prior: &Value) -> Result<()>;
    async fn read(&self, _kind: &str, _name: &str, _prior: &Value) -> Result<Observed> {
        Ok(Observed::Unknown("provider does not implement read".into()))
    }
}

name() is the lookup key in the registry.
kinds() lists every kind the provider owns. The registry's for_kind scans providers and returns the first match.
configure is called once at apply time, with the provider "<name>" { ... } body. Default impl ignores it. No shipped provider implements it today.
create, update, delete return the new attrs to record in state (or () for delete). The returned value is what the next plan will diff against.
read must be non-destructive — it's a query, not a side effect. Default impl returns Unknown. Implementations should normalize the returned Value to the same shape as state attrs.

The diff algorithm

There are two diff functions in core. They serve different purposes.

`diff` (symmetric, used by `Action::Update` legacy path)

fn diff(prior: &Value, desired: &Value) -> Vec<FieldChange>

A recursive walk over JSON values:

If prior == desired exactly, return no changes.
If both are JSON objects, walk their union of keys (sorted, deduplicated). For each key, recurse with the dotted path <prefix>.<key>.
Otherwise, emit a single FieldChange { field: <path>, from: prior, to: desired }. The field is "<root>" when the diff lives at the document root.
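The three rules above can be sketched with a cut-down value type. This is illustrative only: the real function works on serde_json::Value, the `Value` enum below is a stand-in with just enough variants, and treating a key missing on one side as `Null` is an assumption of the sketch.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Stand-in for serde_json::Value with only the variants the sketch needs.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Str(String),
    Obj(BTreeMap<String, Value>),
}

#[derive(Debug, PartialEq)]
struct FieldChange { field: String, from: Value, to: Value }

fn diff(prior: &Value, desired: &Value) -> Vec<FieldChange> {
    let mut out = Vec::new();
    walk("", prior, desired, &mut out);
    out
}

fn walk(path: &str, prior: &Value, desired: &Value, out: &mut Vec<FieldChange>) {
    // Rule 1: exact equality produces no changes.
    if prior == desired {
        return;
    }
    // Rule 2: two objects recurse over the sorted, deduplicated union of keys.
    if let (Value::Obj(p), Value::Obj(d)) = (prior, desired) {
        let null = Value::Null; // assumption: a missing key is treated as Null
        let keys: BTreeSet<&String> = p.keys().chain(d.keys()).collect();
        for key in keys {
            let child = if path.is_empty() { key.clone() } else { format!("{path}.{key}") };
            walk(&child, p.get(key).unwrap_or(&null), d.get(key).unwrap_or(&null), out);
        }
        return;
    }
    // Rule 3: anything else is a single change at the current path.
    let field = if path.is_empty() { "<root>".to_string() } else { path.to_string() };
    out.push(FieldChange { field, from: prior.clone(), to: desired.clone() });
}
```

Changing only `image` inside an object yields one FieldChange with field `image`; diffing two different scalars at the top level yields one change with field `<root>`.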

`diff_observed` (one-sided, used by `build_plan` and `refresh_plan`)

fn diff_observed(prior: &Value, observed: &Value) -> Vec<FieldChange>

Used both by build_plan (comparing state-stored prior against desired config) and by refresh_plan (comparing state against live observation). Rules differ from diff:

State-only fields are ignored. Only keys present in observed are walked. A field that's in prior but not in observed does not generate drift. This is what lets providers store extra fields (container_id, sha256, etc.) without polluting plans.
Missing key vs empty container = no drift. If prior has no key k but observed has k: {} or k: [], that's not drift. Same for prior: null vs observed: {} / [].
String arrays are compared as sets. ["a", "b"] and ["b", "a"] are equal. Non-string arrays are compared by order.
Added keys in observed → flagged. A key in observed but not in prior shows up as from: null, to: <value>.

The provider's read implementation is responsible for trimming the observed value to a shape that mirrors state, so noise doesn't leak through. For example, docker_container strips com.docker.* labels and intersects with the state's label key set.
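A rough sketch of the one-sided rules, again with a stand-in `Value` enum instead of serde_json::Value. The helper names (`is_empty_container`, `string_set`, `equal`, `walk`) are hypothetical, a missing prior key is modelled as `Null`, and the marker-aware secret comparison described later in this chapter is left out.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Str(String),
    Num(i64),
    Arr(Vec<Value>),
    Obj(BTreeMap<String, Value>),
}

#[derive(Debug, PartialEq)]
struct FieldChange { field: String, from: Value, to: Value }

fn is_empty_container(v: &Value) -> bool {
    match v {
        Value::Arr(a) => a.is_empty(),
        Value::Obj(o) => o.is_empty(),
        _ => false,
    }
}

/// If every element is a string, return the elements as a set; otherwise `None`.
fn string_set(items: &[Value]) -> Option<BTreeSet<&str>> {
    items.iter().map(|v| match v {
        Value::Str(s) => Some(s.as_str()),
        _ => None,
    }).collect()
}

fn equal(prior: &Value, observed: &Value) -> bool {
    if let (Value::Arr(p), Value::Arr(o)) = (prior, observed) {
        if let (Some(ps), Some(os)) = (string_set(p), string_set(o)) {
            return ps == os; // string arrays compare as sets
        }
    }
    prior == observed // everything else, including non-string arrays, by value and order
}

fn diff_observed(prior: &Value, observed: &Value) -> Vec<FieldChange> {
    let mut out = Vec::new();
    walk("", prior, observed, &mut out);
    out
}

fn walk(path: &str, prior: &Value, observed: &Value, out: &mut Vec<FieldChange>) {
    // Missing (modelled as Null) versus an empty container is not drift.
    if *prior == Value::Null && is_empty_container(observed) {
        return;
    }
    if equal(prior, observed) {
        return;
    }
    if let (Value::Obj(p), Value::Obj(o)) = (prior, observed) {
        let null = Value::Null;
        // Walk only the keys present in `observed`; state-only keys are ignored.
        for (key, value) in o {
            let child = if path.is_empty() { key.clone() } else { format!("{path}.{key}") };
            walk(&child, p.get(key).unwrap_or(&null), value, out);
        }
        return;
    }
    let field = if path.is_empty() { "<root>".to_string() } else { path.to_string() };
    out.push(FieldChange { field, from: prior.clone(), to: observed.clone() });
}
```

With this shape, a prior that carries an extra `container_id` and lists its networks in a different order than the observation produces no changes, while a key that only the observation has is reported with `from` set to `Null`.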

Drift detection

refresh_plan(&mut plan, &registry) annotates each plan step with observed drift from live reality.

async fn refresh_plan(plan: &mut Plan, registry: &Registry);

Rules:

Action::Create is skipped — there's no prior state to read.
Sequential per resource. SSH round-trips are I/O-bound, but ~10 resources don't justify parallelism yet.
Per-resource errors are caught, not propagated. They become Drift::unreadable = Some("read failed: ..."). refresh_plan itself never returns Err.
The provider's read is called with (kind, name, &step.prior). The returned Observed is mapped:
- Present(observed) → drift.changes = diff_observed(&step.prior, &observed)
- Absent → drift.missing = true
- Unknown(reason) → drift.unreadable = Some(reason)
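The mapping from a provider `read` result to a `Drift` might look like the sketch below. It is synchronous and uses `String` in place of JSON values to stay self-contained; `drift_from_read` is a hypothetical helper name, and the `diff` parameter stands in for the `diff_observed(&step.prior, &observed)` call.

```rust
#[derive(Debug, Clone, PartialEq)]
struct FieldChange { field: String, from: String, to: String }

enum Observed {
    Present(String),
    Absent,
    Unknown(String),
}

#[derive(Debug, Default, PartialEq)]
struct Drift {
    changes: Vec<FieldChange>,
    missing: bool,
    unreadable: Option<String>,
}

impl Drift {
    /// True when all three fields are empty / false / none.
    fn is_clean(&self) -> bool {
        self.changes.is_empty() && !self.missing && self.unreadable.is_none()
    }
}

/// Map one `read` result onto a `Drift`. Errors are recorded, never propagated.
/// `diff` stands in for `diff_observed(&step.prior, &observed)`.
fn drift_from_read(
    read: Result<Observed, String>,
    diff: impl Fn(&str) -> Vec<FieldChange>,
) -> Drift {
    match read {
        Ok(Observed::Present(observed)) => Drift { changes: diff(&observed), ..Drift::default() },
        Ok(Observed::Absent) => Drift { missing: true, ..Drift::default() },
        Ok(Observed::Unknown(reason)) => Drift { unreadable: Some(reason), ..Drift::default() },
        Err(e) => Drift { unreadable: Some(format!("read failed: {e}")), ..Drift::default() },
    }
}
```

Folding `Err` into `unreadable` is what lets the surrounding loop keep going: one unreachable host shows up as an annotation on its steps instead of aborting the refresh.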

PlanSummary counts:

drifted — count of steps where drift.changes is non-empty.
missing — count of steps where drift.missing == true and the action is not Delete. (A Delete step whose resource is already gone is annotated (already gone on host; delete will noop) instead — that's not drift.)
unreadable — count of steps where drift.unreadable.is_some().
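To make the counting rules concrete, here is a sketch that tallies the three drift counters over a list of steps. The types are trimmed down for the example (`changed_fields` replaces the real `changes` vector, and `Step`, `DriftCounts`, and `count_drift` are hypothetical names), so treat it as a model of the rules rather than the actual `Plan::summary()`.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Action { NoOp, Create, Update, Delete }

#[derive(Debug, Default)]
struct Drift { changed_fields: usize, missing: bool, unreadable: Option<String> }

struct Step { action: Action, drift: Option<Drift> }

#[derive(Debug, Default, PartialEq)]
struct DriftCounts { drifted: usize, missing: usize, unreadable: usize }

fn count_drift(steps: &[Step]) -> DriftCounts {
    let mut counts = DriftCounts::default();
    for step in steps {
        // Steps that were never refreshed (drift == None) contribute nothing.
        if let Some(drift) = &step.drift {
            if drift.changed_fields > 0 {
                counts.drifted += 1;
            }
            // A Delete whose target is already gone is not counted as missing.
            if drift.missing && step.action != Action::Delete {
                counts.missing += 1;
            }
            if drift.unreadable.is_some() {
                counts.unreadable += 1;
            }
        }
    }
    counts
}
```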

Planner-side validators

Port-conflict validator

Before classifying steps, build_plan walks every docker_container.ports value across the desired set and checks for (host, ip, host_port) collisions. Two resources binding the same port on the same host is a hard error at plan time, naming both. A 0.0.0.0:N bind symmetrically collides with 127.0.0.1:N — the wildcard bind subsumes the loopback one.

Random ports ("5432" — docker picks the host port) are skipped silently. Port ranges ("8000-8010:8000-8010") get a warning but are not validated. Unrecognized port shapes are skipped to keep the validator forward-compatible.
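A sketch of how such a check could parse and compare binds. Assumptions in this example: `parse_bind` and `check_port_conflicts` are hypothetical names, an omitted IP is treated as `0.0.0.0`, IPv6 binds are not handled, and ranges are skipped silently here instead of producing the warning described above.

```rust
/// Parse the host side of a port spec into `(ip, host_port)`.
/// `ip` is "0.0.0.0" when omitted (an assumption of this sketch).
fn parse_bind(spec: &str) -> Option<(String, u16)> {
    let parts: Vec<&str> = spec.split(':').collect();
    let (ip, host_port) = match parts.as_slice() {
        [host, _container] => ("0.0.0.0", *host),
        [ip, host, _container] => (*ip, *host),
        _ => return None, // bare "5432" or an unrecognized shape: skipped
    };
    // Ranges such as "8000-8010" fail to parse as u16 and are skipped too.
    let port = host_port.parse::<u16>().ok()?;
    Some((ip.to_string(), port))
}

/// Two binds overlap when the IPs match or either side is the wildcard.
fn binds_overlap(a: &str, b: &str) -> bool {
    a == b || a == "0.0.0.0" || b == "0.0.0.0"
}

/// Each claim is `(resource_addr, host, port_spec)`.
fn check_port_conflicts(claims: &[(&str, &str, &str)]) -> Result<(), String> {
    let mut seen: Vec<(&str, &str, String, u16)> = Vec::new();
    for &(addr, host, spec) in claims {
        let (ip, port) = match parse_bind(spec) {
            Some(bind) => bind,
            None => continue,
        };
        for (other_addr, other_host, other_ip, other_port) in &seen {
            if *other_host == host && *other_port == port && binds_overlap(other_ip, &ip) {
                return Err(format!(
                    "port {port} on {host} is claimed by both {other_addr} and {addr}"
                ));
            }
        }
        seen.push((addr, host, ip, port));
    }
    Ok(())
}
```

Here `0.0.0.0:80:80` and `127.0.0.1:80:8080` on the same host collide, while `80:80` on two different hosts does not.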

`depends_on` topo sort

The planner runs a stable Kahn's-algorithm topo sort over the docker_container.depends_on edges (see depends_on). Properties:

Stable. Resources without edges keep their input (file) order. Where ties exist, a BTreeSet ready-set picks them in lexicographic addr order.
Implicit _stratum_* resources stay at the front. They carry no edges and have in_degree = 0, so they land first.
Cycles are a hard error citing the cycle path.
Unknown references are hard errors citing both the source and the missing target.

The topo order applies to Create and Update steps; Delete order is computed separately.
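For readers unfamiliar with Kahn's algorithm, a compact sketch follows. It is not the planner's code: `topo_sort` is a hypothetical name, the ready set is keyed by input position (an assumption chosen so that edge-free resources keep file order), and on a cycle it lists the stuck nodes instead of reconstructing the cycle path.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// `nodes` is `(addr, depends_on)` in input order. Returns addrs ordered so
/// that every dependency precedes its dependents.
fn topo_sort(nodes: &[(&str, Vec<&str>)]) -> Result<Vec<String>, String> {
    let index: BTreeMap<&str, usize> =
        nodes.iter().enumerate().map(|(i, (addr, _))| (*addr, i)).collect();
    let mut in_degree = vec![0usize; nodes.len()];
    let mut dependents: Vec<Vec<usize>> = vec![Vec::new(); nodes.len()];
    for (i, (addr, deps)) in nodes.iter().enumerate() {
        for dep in deps {
            let j = *index
                .get(dep)
                .ok_or_else(|| format!("{addr} depends on unknown resource {dep}"))?;
            dependents[j].push(i);
            in_degree[i] += 1;
        }
    }
    // Ready set keyed by input position, so edge-free nodes keep file order.
    let mut ready: BTreeSet<usize> =
        (0..nodes.len()).filter(|&i| in_degree[i] == 0).collect();
    let mut order = Vec::with_capacity(nodes.len());
    loop {
        // Take the smallest ready index; stop when nothing is ready.
        let i = match ready.iter().next().copied() {
            Some(i) => i,
            None => break,
        };
        ready.remove(&i);
        order.push(nodes[i].0.to_string());
        for &k in &dependents[i] {
            in_degree[k] -= 1;
            if in_degree[k] == 0 {
                ready.insert(k);
            }
        }
    }
    if order.len() != nodes.len() {
        // Any node with edges left over is part of, or downstream of, a cycle.
        let stuck: Vec<&str> = (0..nodes.len())
            .filter(|&i| in_degree[i] > 0)
            .map(|i| nodes[i].0)
            .collect();
        return Err(format!("dependency cycle among: {}", stuck.join(", ")));
    }
    Ok(order)
}
```

Given `app` (depends on `db`), `edge`, and `db` in that input order, the sketch emits `edge`, `db`, `app`: the two edge-free resources keep their relative order and `app` moves behind its dependency.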

Secret-content normalization on plan

For kinds where a content field carries a secret value, state stores only sha256 (the plaintext is unrecoverable from state) but desired carries the full plaintext at plan time. A naive diff_observed(prior, desired) would emit content: null -> "<plaintext>" on every plan, leaking the value into CLI output.

build_plan normalizes desired before diffing. The kinds that opt into this live in a const SECRET_CONTENT_TO_SHA: &[(&str, &str)]:

kind	content field
`system_secret_file`	`content`

For each entry, normalize_for_plan(kind, attrs) clones attrs, removes the named field, and inserts sha256: <hex> derived from its UTF-8 bytes. Diff then compares sha against sha — exactly the same shape state holds. Plaintext never reaches the diff.

This is the inverse half of the kind's own apply-time unchanged check (which compares the same sha against prior state to decide whether to re-upload). The two together guarantee that a plaintext secret never appears in plan output, in state, or in apply logs.

Plan-level secret redaction

After build_plan returns and after refresh_plan runs, the CLI calls Extracted::redact_plan(&mut plan) once before printing. This walk does two things:

Apply substring redaction to every step's desired, prior, and per-FieldChange from / to. A leaf string containing a known secret plaintext (introduced via ${...} interpolation) gets each occurrence replaced with the inline <secret:NAME:sha256:HEX> marker. Exact-match leaves are replaced with the object marker, same as everywhere else.
Drop redaction-cancelled changes. When state holds the inline substring marker and observed returns plaintext, both sides collapse to the same marker after the walk. Any FieldChange where from == to post-redaction is dropped. If an Action::Update's changes list becomes empty, the step is downgraded to Action::NoOp — drift that was only a substring-marker-vs-plaintext difference disappears entirely.

This is what stops plan --refresh from emitting spurious updates on every secret-bearing interpolated field. See Secrets: substring redaction.
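The two-step walk can be modelled on plain strings. In this sketch the `Secret` struct, `redact`, and `redact_changes` are hypothetical names, field values are `String` instead of JSON leaves, and the digest is passed in precomputed (the placeholder `HEX` in the usage note is not a real hash).

```rust
#[derive(Debug, Clone, PartialEq)]
struct FieldChange { field: String, from: String, to: String }

/// One known secret: its name, plaintext, and precomputed sha256 hex digest.
struct Secret<'a> { name: &'a str, plaintext: &'a str, sha256_hex: &'a str }

/// Replace every occurrence of each secret plaintext with its inline marker.
fn redact(text: &str, secrets: &[Secret]) -> String {
    let mut out = text.to_string();
    for s in secrets {
        if s.plaintext.is_empty() {
            continue; // an empty pattern would match everywhere
        }
        let marker = format!("<secret:{}:sha256:{}>", s.name, s.sha256_hex);
        out = out.replace(s.plaintext, &marker);
    }
    out
}

/// Redact both sides of every change, then drop changes that became no-ops.
/// An empty result means the caller would downgrade the step to NoOp.
fn redact_changes(changes: Vec<FieldChange>, secrets: &[Secret]) -> Vec<FieldChange> {
    changes
        .into_iter()
        .map(|c| FieldChange {
            field: c.field,
            from: redact(&c.from, secrets),
            to: redact(&c.to, secrets),
        })
        .filter(|c| c.from != c.to)
        .collect()
}
```

With a secret named `pg_password` whose plaintext is `hunter2`, a change from `postgres://app:<secret:pg_password:sha256:HEX>@db/app` to `postgres://app:hunter2@db/app` collapses to identical strings and is dropped, while an unrelated `image` change survives.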

Post-apply readiness wait

After every successful docker_container create or update, the apply loop pauses before moving on to the next step (which may be a dependent declared via depends_on). The wait lives in the CLI, in post_apply_wait:

If desired.healthcheck is present, poll docker inspect --format '{{.State.Health.Status}}' <name> once a second, up to 60 polls. Terminal statuses are healthy (proceed), unhealthy (fail the apply), or empty / none on the first poll (no health check at the docker level — proceed). starting and other interim values keep polling.
Otherwise, sleep 500ms. This is cosmetic — docker often needs a beat to wire networks and volumes before something else pokes the container.

Non-docker_container steps return immediately. The provider's own create / update is synchronous: git_repo clones return when done, system_secret_file returns when the SSH upload completes.

The poll loop itself is in core (poll_container_health), separated from SSH plumbing so it's unit-testable with a mocked inspector.
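A sketch of what a poll loop with an injectable inspector could look like. The signature, the `Readiness` enum, and the `TimedOut` outcome are assumptions for illustration; the text above does not say what happens when all polls are exhausted, so the real `poll_container_health` may behave differently there.

```rust
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum Readiness { Healthy, NoHealthcheck, Unhealthy, TimedOut }

/// Poll `inspect` up to `max_polls` times, sleeping `interval` between polls.
/// `inspect` returns the raw health status string, so tests can pass a mock
/// instead of shelling out to `docker inspect` over SSH.
fn poll_container_health(
    mut inspect: impl FnMut() -> String,
    max_polls: u32,
    interval: Duration,
) -> Readiness {
    for attempt in 0..max_polls {
        let status = inspect();
        match status.trim() {
            "healthy" => return Readiness::Healthy,
            "unhealthy" => return Readiness::Unhealthy,
            // Empty or "none" on the first poll: no docker-level health check.
            "" | "none" if attempt == 0 => return Readiness::NoHealthcheck,
            // "starting" and other interim values: wait and poll again.
            _ => std::thread::sleep(interval),
        }
    }
    Readiness::TimedOut
}
```

Called with 60 polls and a one-second interval this matches the cadence described above; a unit test can pass a zero interval and a closure that replays a scripted sequence such as `starting`, `starting`, `healthy`.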

Delete ordering

build_plan emits delete steps in forward topo order over state-resident depends_on edges. For two resources X and Y where X depends_on Y at runtime, X is torn down before Y — the dependent goes first so the dependency is still serving while it shuts down.

Resources without recorded depends_on edges fall through with in_degree = 0 and end up before any edged resources, in reverse-iteration order of the state BTreeMap (which preserves the prior file-order-independent behavior). This keeps the heuristic close to "leaves before roots" for hand-written configs even when no depends_on is declared.

depends_on is recorded in state at create / update time and survives across apply runs, so a delete computed against state still knows the edges the resource was declared with — even when the resource is no longer in config.

Implicit per-host resources

For every host block in the merged document, extract injects three implicit resources before any user-declared ones, addressed under the _stratum_ prefix:

addr	kind	purpose
`ssh_exec._stratum_swap_<host>`	`ssh_exec`	Creates a 4 GB `/swapfile`, enables it, persists in fstab.
`system_file._stratum_sshd_oom_<host>`	`system_file`	Drops `/etc/systemd/system/ssh.service.d/oom.conf` with `OOMScoreAdjust=-1000`.
`ssh_exec._stratum_sshd_reload_<host>`	`ssh_exec`	`systemctl daemon-reload && systemctl restart ssh`.

The first two exist so that under memory pressure the kernel does not kill sshd — which would lock the operator out of recovery. The third applies the drop-in. They are stable across versions and live at the front of the desired list (in_degree 0), so they apply before any user resource on the host.

In namespace mode they are routed to _shared.json so multiple namespaces sharing a host don't each try to recreate them. In bundle mode they share the single state file with everything else.

The _stratum_ prefix is reserved. User-declared resources should not use it.

Manifest discovery (namespace mode)

When -n NAME is set, the CLI resolves the config + state paths as follows. See Namespaces for the syntax.

Locate the manifest. If --manifest PATH was passed, that path is used. Otherwise the CLI requires ./stratum.strat to exist; if it doesn't, the command errors.
Load the manifest. Runs stratum_config::load_file(manifest), producing an Extracted with one or more namespace declarations.
Look up the namespace. If the named namespace isn't in the manifest, error with the list of known namespaces.
Resolve configs. The merged list is [manifest, ...ns.configs]. The manifest is always first so its top-level host / secret / provider blocks are visible to every per-namespace file. Each configs entry is absolutized at parse time against the manifest's directory.
Resolve state. Priority order: explicit -s on the command line, then the namespace's body-level state =, then .stratum/<name>.json.
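The state-path priority in the last step reduces to a short fallback chain. The function below is a hypothetical helper written for this example, not the CLI's own code.

```rust
use std::path::PathBuf;

/// Resolve the namespace state path: explicit `-s`, then the namespace's
/// `state =` attribute, then the default `.stratum/<name>.json`.
fn resolve_state_path(
    cli_state: Option<&str>,
    namespace_state: Option<&str>,
    namespace: &str,
) -> PathBuf {
    match cli_state.or(namespace_state) {
        Some(path) => PathBuf::from(path),
        None => PathBuf::from(".stratum").join(format!("{namespace}.json")),
    }
}
```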

Passing -c together with -n is a hard error — the namespace's configs = [...] is the config list, and a -c override would silently shadow it. Bundle mode (no -n) is unchanged by namespace support.

Cross-namespace validator

Namespace mode's plan and apply run a sibling-collision check before classification. The check exists because build_plan operates inside one namespace's view of the world — it has no visibility into what other namespaces declare — so two namespaces could each plan a docker_container binding the same (host, host_port) and only discover the conflict at apply time, when one fails over a port already taken by the other.

The check, in validate_cross_namespace:

Re-loads the manifest (cheap; it has no resources).
For each sibling namespace (every one except the current), loads its configs with LoadOptions::allow_unresolved_secrets = true so a missing env var in some unrelated namespace doesn't block planning the current one.
Walks every docker_container in every sibling, collecting:
- Port claims. Each ports entry is parsed for the host-port half of H:C or IP:H:C. Ranges (8000-8010:...) and bare-port shapes (where docker picks the host port) are skipped.
- Name claims. The container's name attribute, falling back to the resource's label.
Checks every docker_container in the current namespace against the collected claims, erroring on the first (host, port) or (host, name) collision and naming both the offending current-namespace address and the sibling that owns the claim.

The validator is skipped entirely in bundle mode. Within a single namespace, the existing planner-side port-conflict validator catches collisions within the same desired set. The cross-namespace validator is strictly the inter-namespace layer above it.

The sibling loader uses allow_unresolved_secrets = true defensively — it's only collecting addresses, ports, and names, none of which depend on secret plaintext. If a sibling load fails for any other reason, the error is logged and that sibling is skipped (the plan still proceeds), so a broken sibling doesn't gate apply of an unrelated namespace.

Split state (namespace mode)

In namespace mode the state on disk is two files instead of one:

.stratum/
  <name>.json        # user-declared resources for namespace `<name>`
  _shared.json       # implicit per-host _stratum_* resources

State::save_split(ns_path, shared_path) walks self.resources and routes each entry by addr name: anything starting with _stratum_ goes to _shared.json, everything else to <name>.json. Both files are written every save (with parent dirs created), even when one side is empty — that keeps the next load predictable.

State::load_merged(ns_path, shared_path) is the inverse. It loads both files and unions their resources maps, with the namespace's entry winning any addr.key() collision (the more recently touched of the two, since the active scope just ran). Missing files become empty state (matches load).
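The routing and union rules can be modelled on in-memory maps. In this sketch `Resources`, `is_shared`, `split`, and `merge` are hypothetical names, `String` values stand in for the per-resource JSON, and file I/O is left out entirely.

```rust
use std::collections::BTreeMap;

type Resources = BTreeMap<String, String>; // addr key -> resource JSON (stand-in)

/// True when the name half of `<kind>.<name>` starts with `_stratum_`.
fn is_shared(key: &str) -> bool {
    key.split_once('.').map_or(false, |(_, name)| name.starts_with("_stratum_"))
}

/// Route each entry to the namespace map or the shared map by addr name.
fn split(all: &Resources) -> (Resources, Resources) {
    let mut ns = Resources::new();
    let mut shared = Resources::new();
    for (key, value) in all {
        let target = if is_shared(key) { &mut shared } else { &mut ns };
        target.insert(key.clone(), value.clone());
    }
    (ns, shared)
}

/// Union the two maps; the namespace's entry wins any key collision.
fn merge(ns: &Resources, shared: &Resources) -> Resources {
    let mut merged = shared.clone();
    // Namespace entries are inserted last, so they overwrite shared ones.
    merged.extend(ns.iter().map(|(k, v)| (k.clone(), v.clone())));
    merged
}
```

Splitting and then merging an unmodified state round-trips to the same map, which is the property that makes the second namespace's plan see the implicit resources as no-ops.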

Bundle mode keeps using the single-file State::load(path) and State::save(path). The CLI picks the right pair via the -n flag — load_state / save_state in crates/cli/src/main.rs switch on whether a shared path is set.

The split is what lets two namespaces targeting the same host co-exist without each trying to own the per-host tuning resources. First namespace applies: _stratum_swap_*, _stratum_sshd_oom_*, _stratum_sshd_reload_* land in _shared.json. Second namespace plans: load_merged pulls them back from the shared file into its working state, so the new plan sees them as no-op. Without the split, the second apply would see them missing from its state file and recreate them, churning the swap file and restarting sshd on every cross-namespace apply.

State file shape

{
  "version": 1,
  "resources": {
    "docker_container.traefik": {
      "addr": { "kind": "docker_container", "name": "traefik" },
      "provider": "docker",
      "attrs": {
        "host": "root@192.0.2.10",
        "image": "traefik:v2.11",
        "container_id": "abc123...",
        "...": "..."
      }
    },
    "system_package.docker": { "...": "..." }
  }
}

Resources are keyed by <kind>.<name> in a BTreeMap, so the on-disk order is deterministic (lexicographic). The file is overwritten in full on every successful apply.

Secret markers in state

When a resource attr resolves from a secret ref, the provider receives plaintext but state stores a redaction marker:

{
  "env": {
    "POSTGRES_PASSWORD": {
      "__secret": "pg_password",
      "__secret_sha256": "sha256:f7c3bc1d808e04..."
    }
  }
}

The marker is written by Extracted::redact_into, called between every provider return and state.upsert. diff and diff_observed are marker-aware (see core::secret_compare): a marker compares equal to plaintext when the plaintext's hash matches the marker's __secret_sha256, and a marker-vs-marker compare uses only the hashes. This is what keeps --refresh from showing perpetual drift on secret-bearing fields. The CLI's render function prints markers as <secret:NAME sha:abc123> — six hex chars, enough to spot a rotation, not enough to attack offline.

Bootstrap a fresh droplet

End-to-end walkthrough: take a blank Ubuntu 24.04 droplet and bring it to ufw + docker + traefik in one stratum apply. The config and the Traefik file it ships are both written below; substitute your host's address where indicated.

What you get

After apply:

ufw and docker.io installed via apt.
ufw rules allowing 22/tcp, 80/tcp, 443/tcp.
docker and ufw systemd units enabled and started.
ufw ruleset actually activated (ufw --force enable via ssh_exec).
A stratum-edge Docker network for Traefik to attach to.
/etc/traefik/traefik.yml written from a local file.
A traefik:v2.11 container on :80 and :443, attached to stratum-edge, with /var/run/docker.sock mounted.

11 resources, all created in one apply.

Prerequisites

A fresh Ubuntu 24.04 droplet with a public IP. Any cloud provider works; the tutorial uses DigitalOcean. Note the IP.
SSH key access as root. Most cloud providers let you inject an SSH key at provision time. Stratum uses BatchMode=yes, so the key must already be in your local ssh-agent or referenced in ~/.ssh/config. No password prompts — they'll hang the apply.
stratum built. From the repo root: cargo build --release.

Step 1: verify SSH

ssh root@<ip> echo ok

Expect a single ok and exit 0. If you get a password prompt or a Permission denied, fix that before continuing — stratum won't be able to authenticate either.

Step 2: write the config

Create a bootstrap.strat with the host's address:

host "primary" {
  addr = "root@<ip>"
}

Replace <ip> with the droplet's public IP. The rest of the file references host.primary.addr — you don't have to touch anything else.

The bootstrap pulls a Traefik config from files/traefik.yml, resolved relative to the .strat file itself. Place your Traefik static config at files/traefik.yml next to bootstrap.strat.

Step 3: plan

./target/release/stratum plan -c bootstrap.strat

Expected output:

stratum plan
============
 + docker_container.traefik
 + docker_network.edge
 + ssh_exec.ufw-activate
 + system_file.traefik-config
 + system_package.docker
 + system_package.ufw
 + system_service.docker
 + system_service.ufw
 + system_ufw_rule.allow-http
 + system_ufw_rule.allow-https
 + system_ufw_rule.allow-ssh

11 create, 0 update, 0 delete, 0 no-op

(The exact alphabetical order comes from BTreeMap iteration of <kind>.<name> keys.) Read-only — nothing is touched on the host.

Step 4: apply

./target/release/stratum apply -y -c bootstrap.strat

Resources with no depends_on edges between them execute in the order they appear in the config file (see depends_on topo sort). The bootstrap config is hand-ordered to avoid the obvious foot-guns:

ufw and docker.io packages install (apt-get update runs once).
ufw rules are added — before ufw is activated.
docker and ufw systemd units start.
ssh_exec "ufw-activate" runs ufw --force enable. The ruleset is now enforcing.
The stratum-edge Docker network is created.
traefik.yml is dropped at /etc/traefik/traefik.yml.
The Traefik container starts.

After the work runs, you'll see:

State saved to .stratum/state.json
post-apply drift: 4 unreadable — run 'stratum plan --refresh' to see details

4 unreadable is expected: the three system_ufw_rule resources and one ssh_exec resource always return Observed::Unknown from read (see providers/system and providers/ssh).

Step 5: verify

# Traefik should respond — even if just with a 404 (no routes configured yet).
curl -k -I https://<ip>
# HTTP/2 404 ...

# Container should be running.
ssh root@<ip> docker ps
# CONTAINER ID  IMAGE          ...  PORTS                              NAMES
# abc123def     traefik:v2.11   ...  0.0.0.0:80->80/tcp, ...:443->443   traefik

# UFW should be active and have the three rules.
ssh root@<ip> ufw status
# Status: active
# To          Action    From
# 22/tcp      ALLOW     Anywhere
# 80/tcp      ALLOW     Anywhere
# 443/tcp     ALLOW     Anywhere

Step 6: re-plan with drift detection

./target/release/stratum plan --refresh -c bootstrap.strat

Every step prints with (no-op) — config matches state. The footer:

0 create, 0 update, 0 delete, 11 no-op
drift: 4 unreadable

The same 4 unreadable (three ufw rules + one ssh_exec), so nothing surprising. If any value on the host drifts from state (e.g. someone manually runs docker stop traefik), --refresh will surface it:

   docker_container.traefik
      ! DRIFT: resource missing on host (state says exists)

0 create, 0 update, 0 delete, 11 no-op
drift: 1 missing, 4 unreadable

Tear it down

Comment out (or delete) the resource blocks, leaving the host block. The resulting plan is all Delete steps, which trips the destruction guard — pass --allow-destroy to acknowledge:

./target/release/stratum apply -y --allow-destroy -c bootstrap.strat

build_plan emits deletes in forward topo order over recorded depends_on edges; resources with no recorded edges go in reverse alphabetical order of <kind>.<name> (see Delete ordering). For the bootstrap config that conveniently gives you a reasonable teardown order: the Traefik container goes before the docker service, ufw rules go before the ufw service. It is not guaranteed to be safe for arbitrary configs; if you have inter-resource ordering needs, remove resources from the config in stages.

What's next

Layer your own containers on top of Traefik by adding more docker_container resources, or follow the Serve a static site behind Traefik tutorial to add a second app sharing the same state file.
For multiple independent slices on the same host, see Multi-namespace deployments — each slice plans and applies on its own with cross-slice port and name collisions caught at plan time.
Read providers/system for the full attribute schema if you want to extend the config.

Serve a static site behind Traefik

End-to-end walkthrough: take a host that already has docker, Traefik, and the stratum-edge network (i.e. one you just bootstrapped with bootstrap-droplet), and add a second application — an nginx container serving a static directory — routed by Traefik.

This is the canonical "second app behind Traefik" pattern. Use it as a template for any static-asset deploy. The two configs (bootstrap + this one) apply together against one state file — see Multi-file configs for why state is per-host and not per-config. If you'd rather apply them as independent slices, the same shape works as two namespaces — see the closing section of this tutorial.

What you get

After apply:

/srv/site/ on the host, containing whatever you point source_dir at (the output of any static-site build — mdbook build, hugo, a folder of pre-rendered HTML).
A site nginx:alpine container, mounting that directory read-only at the nginx web root.
Traefik labels routing Host(site.example.com) to the container.
Two new resources in state: system_dir.site and docker_container.site. The 11 resources from the bootstrap are untouched.

Prerequisites

Bootstrap done. The host needs the docker daemon, Traefik running on :80/:443, and the stratum-edge network. See Bootstrap a fresh droplet.
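The shared host state file. Use the same state file you applied the bootstrap with. The commands below pass -s .stratum/host.json; if you ran the bootstrap without -s, its state is at the default .stratum/state.json, so pass that path instead. State is one-per-host, not one-per-config: both .strat files apply together against the shared state via repeated -c flags. See Multi-file configs.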
A built static tree. Any directory of HTML/CSS/JS/assets works. The config below uploads ./site/ from next to the .strat file; substitute any local directory.

Step 1: the config

host "primary" {
  addr = "root@192.0.2.10"
}

resource "system_dir" "site" {
  host         = host.primary.addr
  source_dir   = "site"
  path         = "/srv/site"
  mode         = "0644"
  dir_mode     = "0755"
  owner        = "root"
  group        = "root"
  delete_extra = true
}

resource "docker_container" "site" {
  host     = host.primary.addr
  name     = "site"
  image    = "nginx:alpine"
  restart  = "unless-stopped"
  networks = ["stratum-edge"]
  volumes  = [
    "/srv/site:/usr/share/nginx/html:ro",
  ]
  labels = {
    "traefik.enable"                                       = "true"
    "traefik.http.routers.site.rule"                       = "Host(`site.example.com`)"
    "traefik.http.routers.site.entrypoints"                = "web"
    "traefik.http.services.site.loadbalancer.server.port"  = "80"
  }
}

Two resources. Notable details:

delete_extra = true keeps the remote tree in sync: files removed from site/ locally get rm -f'd on the host on the next apply. Without it, deleted local files stay on the host indefinitely.
Substitute site.example.com with a hostname that resolves to your host. For a quick demo without DNS, a service like nip.io resolves any hostname of the form <anything>.<ip>.nip.io to <ip> — useful for development, not for production.
The container attaches to stratum-edge — the same network Traefik discovers via the docker socket. No host port mapping needed; Traefik proxies via the network.
The Traefik labels use the standard Traefik 2.x router/service form. Stratum doesn't parse them; they're opaque strings handed to docker.

Step 2: build the static tree

Build whatever your site is and drop the output in site/ next to the .strat file. The exact command depends on the generator — mdbook build, hugo, npm run build, or cp -R public site/. The system_dir provider doesn't care about the source; it just walks the directory tree and ships every regular file.

Step 3: plan

Plan both configs together against the shared host state. The bootstrap resources are already in state, so the only Create steps are the two new ones.

./target/release/stratum plan \
  -c bootstrap.strat \
  -c site.strat \
  -s .stratum/host.json

Expected output:

stratum plan
============
   docker_container.traefik
   docker_network.edge
   ...
 + docker_container.site
 + system_dir.site
   ...

2 create, 0 update, 0 delete, 11 no-op

(Order shown abbreviated.) Eleven resources are in state and unchanged; two new ones are queued for create.

Step 4: apply

./target/release/stratum apply -y \
  -c bootstrap.strat \
  -c site.strat \
  -s .stratum/host.json

The system_dir step tars + gzips site/ in memory, streams it over SSH, extracts on the host, and applies chown -R + chmod recursively. You'll see one summary line on stderr:

[system] DIR `site` -> root@192.0.2.10:/srv/site (N files, mode=0644 dir_mode=0755 root:root)

Then the container starts. Post-apply self-check:

post-apply drift: clean

(No unreadable count — neither resource uses an Unknown read.)

Step 5: verify

curl -H "Host: site.example.com" http://<host-ip>/
# or open http://site.example.com in a browser (with DNS pointed at the host).

You should get the site's landing page.

Re-deploying after content changes

Rebuild the static tree locally, then apply again. The system_dir provider hashes every file and compares against manifest_sha256 in prior state:

No content changes → [system] DIR `site` -> ... unchanged (N files, manifest match) and the upload is skipped.
Any file added, removed, or modified → manifest digest changes, the whole tree re-tars and re-uploads. With delete_extra = true, files removed locally are rm -f'd on the host as part of the apply.

The docker_container is unchanged in either case — nginx is just serving from a bind mount, so new files on disk show up immediately without a container restart.
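The skip-or-upload decision described above can be sketched in Rust. This is a minimal illustration under stated assumptions, not stratum's actual code: `DirAction`, `plan_dir`, and `extra_remote_files` are hypothetical names, and the manifest digests are passed in as opaque strings rather than computed here.

```rust
use std::collections::BTreeSet;

/// Hypothetical outcome of comparing the new manifest against prior state.
#[derive(Debug, PartialEq)]
enum DirAction {
    /// Manifest digest matches prior state: nothing is uploaded.
    Skip,
    /// Digest differs: re-upload the tree and remove the listed remote files.
    Upload { remove: Vec<String> },
}

/// Paths that exist on the host but are no longer present locally.
fn extra_remote_files(local: &BTreeSet<String>, remote: &BTreeSet<String>) -> Vec<String> {
    remote.difference(local).cloned().collect()
}

/// Decide what an apply would do for one directory resource.
fn plan_dir(
    prior_manifest: Option<&str>,
    new_manifest: &str,
    local: &BTreeSet<String>,
    remote: &BTreeSet<String>,
    delete_extra: bool,
) -> DirAction {
    if prior_manifest == Some(new_manifest) {
        return DirAction::Skip;
    }
    // Without delete_extra, stale remote files are left in place.
    let remove = if delete_extra {
        extra_remote_files(local, remote)
    } else {
        Vec::new()
    };
    DirAction::Upload { remove }
}
```

With `delete_extra` off, the `remove` list stays empty even when the remote tree holds files that no longer exist locally, which matches the "deleted local files stay on the host" behavior noted in Step 1.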

Why both `-c` flags together, not separately

Apply site.strat alone against .stratum/host.json and the plan diff is "11 deletes, 2 creates" — the site config doesn't mention any of the bootstrap resources, so build_plan flags them for deletion. The destruction guard catches the resulting apply with a list naming the loaded configs:

refusing to apply: plan would delete 11 resources not in config:
  - ...

loaded configs: site.strat
state file: .stratum/host.json

The missing config (bootstrap.strat) is visually obvious in loaded configs. The structural fix is to pass every .strat file that touches the host to every plan / apply. State is per-host. See Multi-file configs.
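The guard's logic can be sketched as a set difference followed by a refusal. The sketch below is an assumption-laden illustration: `planned_deletes` and `destruction_guard` are hypothetical names, and only the message layout is taken from the error shown above.

```rust
use std::collections::BTreeSet;

/// Resources recorded in state that none of the loaded configs declare.
fn planned_deletes(state: &BTreeSet<&str>, config: &BTreeSet<&str>) -> Vec<String> {
    state.difference(config).map(|s| s.to_string()).collect()
}

/// Refuse to apply a plan with deletes unless destruction was explicitly allowed.
fn destruction_guard(
    deletes: &[String],
    loaded_configs: &[&str],
    state_file: &str,
    allow_destroy: bool,
) -> Result<(), String> {
    if deletes.is_empty() || allow_destroy {
        return Ok(());
    }
    let mut msg = format!(
        "refusing to apply: plan would delete {} resources not in config:\n",
        deletes.len()
    );
    for addr in deletes {
        msg.push_str(&format!("  - {}\n", addr));
    }
    // Naming the loaded configs makes a forgotten -c flag easy to spot.
    msg.push_str(&format!(
        "\nloaded configs: {}\nstate file: {}",
        loaded_configs.join(", "),
        state_file
    ));
    Err(msg)
}
```

Passing `allow_destroy = true` corresponds to the documented `--allow-destroy` flag; everything else in the signature is illustrative.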

Or: split into two namespaces

If bootstrap.strat and site.strat are logically independent — bootstrap rarely changes, the site re-deploys often — wrapping them as two namespaces lets each one plan and apply on its own without juggling a long -c list. The shape is:

# stratum.strat
host "primary" {
  addr = "root@192.0.2.10"
}

namespace "infra" { configs = ["bootstrap.strat"] }
namespace "site"  { configs = ["site.strat"] }

Then stratum -n infra apply -y for the host-tier setup and stratum -n site apply -y for the site, each against its own state file. See the Multi-namespace deployments tutorial for the full walkthrough.

What's next

Layer additional apps the same way: one system_dir (or system_file for a single config) plus one docker_container with Traefik labels, each in its own .strat file, and add the file to your -c list (or to a new namespace).
For dynamic apps (build images, manage envs, blue/green rollouts), reach for a per-app deploy tool — stratum is for the host-tier setup, not per-app lifecycle.

Inject a secret into a docker container

You have a docker_container resource that needs a sensitive env var — a database password, an API token, a webhook secret. You want the value to come from your shell environment (or a file outside git), flow through stratum into the container's env map, and never land in .stratum/state.json as plaintext.

This is what secret blocks are for. The pattern is two blocks: one secret block sourcing the value, and one docker_container resource reading secret.<name>.value inside its env map. Stratum substitutes the plaintext at apply time and stores a redaction marker in state.

Why this works the way it does

Anything you put in a .strat file is committed alongside your code. Anything you put in .stratum/state.json is committed (if you commit state) or sits on your disk in plaintext (if you don't). Neither place is somewhere a database password belongs.

Stratum splits the problem: the value lives in your shell (or a file you keep out of git), and the reference lives in the config. State stores a {__secret, __secret_sha256} marker, which is enough to tell that a secret is set and whether it changed, but not what it is. See Secrets for the full mechanism.

Step 1: source the value

Decide where the value comes from. Two options:

# From an env var.
secret "pg_password" {
  from_env = "PG_PASSWORD"
}

# Or from a file outside git.
secret "pg_password" {
  from_file = "~/.config/stratum/pg-password"
}

from_env reads the variable with std::env::var at config-load time. from_file reads the file; ~ and ~/ expand to $HOME (or $USERPROFILE on Windows). Relative paths resolve next to the .strat file.

Pick one. Setting both, or neither, is a hard error.
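The "exactly one source" rule and the tilde expansion can be sketched as follows. This is a hedged illustration, not the real loader: `SecretSource`, `pick_source`, and `expand_tilde` are hypothetical names, the error strings are invented, and the home directory is passed in explicitly instead of being read from `$HOME` or `$USERPROFILE`.

```rust
use std::path::PathBuf;

/// Where a secret's value comes from (hypothetical type).
#[derive(Debug, PartialEq)]
enum SecretSource {
    Env(String),
    File(PathBuf),
}

/// Enforce that exactly one of from_env / from_file is set.
fn pick_source(from_env: Option<&str>, from_file: Option<&str>) -> Result<SecretSource, String> {
    match (from_env, from_file) {
        (Some(var), None) => Ok(SecretSource::Env(var.to_string())),
        (None, Some(path)) => Ok(SecretSource::File(PathBuf::from(path))),
        (Some(_), Some(_)) => Err("secret sets both from_env and from_file".to_string()),
        (None, None) => Err("secret sets neither from_env nor from_file".to_string()),
    }
}

/// Expand a leading "~" or "~/" against the given home directory.
/// Other paths are returned unchanged; resolving them relative to the
/// .strat file is left out of this sketch.
fn expand_tilde(path: &str, home: &str) -> PathBuf {
    if path == "~" {
        PathBuf::from(home)
    } else if let Some(rest) = path.strip_prefix("~/") {
        PathBuf::from(home).join(rest)
    } else {
        PathBuf::from(path)
    }
}
```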

Step 2: reference it in a container

host "primary" {
  addr = "root@192.0.2.10"
}

secret "pg_password" {
  from_env = "PG_PASSWORD"
}

resource "docker_container" "db" {
  host     = host.primary.addr
  name     = "db"
  image    = "postgres:16-alpine"
  restart  = "unless-stopped"
  networks = ["stratum-edge"]
  ports    = ["127.0.0.1:5432:5432"]
  volumes  = ["pg-data:/var/lib/postgresql/data"]
  env = {
    POSTGRES_PASSWORD = secret.pg_password.value
    POSTGRES_USER     = "postgres"
    POSTGRES_DB       = "app"
  }
}

The ref secret.pg_password.value evaluates to the plaintext during ref resolution. The provider — docker here — receives a normal string in env.POSTGRES_PASSWORD and sets -e POSTGRES_PASSWORD=<value> on docker run.

A secret ref is only allowed in single-leaf string attrs. Putting it inside system_file.content (where it would land in a config file blob stratum can't redact) is rejected at load time. See the honesty guard for the full list.

Step 3: plan it

Set the env var, then plan:

export PG_PASSWORD=$(openssl rand -hex 32)

./target/release/stratum plan \
  -c db.strat \
  -s .stratum/host.json

The secret-bearing field renders with a 6-char hash prefix:

 + docker_container.db
      ...
      ~ env.POSTGRES_PASSWORD: null -> <secret:pg_password sha:f7c3bc>

The hash is enough to spot a rotation (the prefix changes) without leaking the value or a full attackable digest. The null -> ... is because this is a Create step; on Update you'd see both sides with their respective hashes.
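The rendering shown above can be sketched as a truncation of the full hex digest. The function names `render_secret` and `render_change` are hypothetical, and the digest is assumed to arrive as a plain hex string.

```rust
/// Render a secret as its name plus the first six hex chars of its digest.
fn render_secret(name: &str, sha256_hex: &str) -> String {
    let prefix: String = sha256_hex.chars().take(6).collect();
    format!("<secret:{} sha:{}>", name, prefix)
}

/// Render one side-by-side change; a missing old digest means a Create step.
fn render_change(name: &str, old_hex: Option<&str>, new_hex: &str) -> String {
    let old = match old_hex {
        Some(hex) => render_secret(name, hex),
        None => "null".to_string(),
    };
    format!("{} -> {}", old, render_secret(name, new_hex))
}
```

Because only six characters are kept, two different values can in principle share a prefix; the prefix is a rotation hint for humans, while the full digest in state is what a diff would compare.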

Step 4: apply it

./target/release/stratum apply -y \
  -c db.strat \
  -s .stratum/host.json

What happens at apply time:

The provider receives the plaintext in env.POSTGRES_PASSWORD and runs docker run -e POSTGRES_PASSWORD=<value> .... The value lands inside the container's process environment.
Before stratum persists the provider's returned attrs to state, the redaction walk swaps every leaf string that matches a known plaintext for the marker object. env.POSTGRES_PASSWORD in state ends up as {"__secret": "pg_password", "__secret_sha256": "sha256:f7c3bc..."}.
state.save writes the file. Inspect it:

./target/release/stratum state show docker_container.db -p .stratum/host.json

You'll see the marker in env.POSTGRES_PASSWORD, not the password.
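The redaction walk from the steps above can be sketched as a recursive pass over an attribute tree. This is a simplified model under stated assumptions: `Value`, `RedactionMap`, and `redact` are hypothetical, the real state is JSON, and the digest strings are supplied by the caller rather than computed.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a JSON attribute tree.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    List(Vec<Value>),
    Map(BTreeMap<String, Value>),
    /// Stand-in for the {__secret, __secret_sha256} marker object.
    Marker { secret: String, sha256: String },
}

/// Known plaintext -> (secret name, digest string).
type RedactionMap = BTreeMap<String, (String, String)>;

/// Replace every leaf string that exactly matches a known plaintext.
fn redact(value: Value, known: &RedactionMap) -> Value {
    match value {
        Value::Str(s) => match known.get(&s) {
            Some((name, digest)) => Value::Marker {
                secret: name.clone(),
                sha256: digest.clone(),
            },
            None => Value::Str(s),
        },
        Value::List(items) => {
            Value::List(items.into_iter().map(|v| redact(v, known)).collect())
        }
        Value::Map(map) => Value::Map(
            map.into_iter().map(|(k, v)| (k, redact(v, known))).collect(),
        ),
        // Already-redacted markers pass through untouched.
        marker => marker,
    }
}
```

Matching on exact leaf equality is what lets non-secret siblings such as `POSTGRES_DB` pass through unchanged.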

Step 5: rotate

Rotate by changing the source value and re-applying:

export PG_PASSWORD=$(openssl rand -hex 32)
./target/release/stratum apply -y -c db.strat -s .stratum/host.json

The new value hashes differently, so diff_observed sees a marker change and emits an Update step on the container. The docker provider tears down and recreates the container with the new env var. The plan output shows the hash prefix changing:

 ~ docker_container.db
      ~ env.POSTGRES_PASSWORD: <secret:pg_password sha:f7c3bc> -> <secret:pg_password sha:9a1e44>

No plaintext on either side of that line, in the CLI output or in state.

Embedding a secret inside a larger string

A connection string is the common case where a bare secret.X.value doesn't fit — the password is one piece of a URL, not the whole leaf. Use ${...} interpolation:

resource "docker_container" "api" {
  host     = host.primary.addr
  image    = "api:dev"
  networks = ["stratum-edge"]
  env = {
    DATABASE_URL = "postgresql://app:${secret.pg_password.value}@db:5432/app"
  }
}

At eval time the placeholder is replaced with the plaintext, so the provider receives a working connection string. At redaction time the substring redactor swaps the plaintext for an inline marker, so state ends up with:

"DATABASE_URL": "postgresql://app:<secret:pg_password:sha256:f7c3bc...>@db:5432/app"

Drift behavior is the same as for exact-match secrets: the marker on the state side compares equal to the plaintext on the observed side via the hash, so there is no perpetual update.
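A substring redactor of this kind can be sketched with a plain string replace. The names `redact_inline` and `in_sync` are hypothetical, the digest is passed in as an opaque string, and comparing by redacting the observed side is one assumed way to get the "no perpetual update" property, not a description of stratum's internals.

```rust
/// Swap every occurrence of a known plaintext for an inline marker.
fn redact_inline(s: &str, plaintext: &str, name: &str, digest: &str) -> String {
    // Guard: replacing an empty pattern would insert markers everywhere.
    if plaintext.is_empty() {
        return s.to_string();
    }
    s.replace(plaintext, &format!("<secret:{}:{}>", name, digest))
}

/// Compare a redacted state value against an observed plaintext value by
/// redacting the observed side first.
fn in_sync(state_value: &str, observed: &str, plaintext: &str, name: &str, digest: &str) -> bool {
    redact_inline(observed, plaintext, name, digest) == state_value
}
```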

Whole-file secrets

For values too big or too binary to fit in a single env var — a Firebase service-account JSON, an age-encrypted key, a TLS bundle — use system_secret_file instead. The kind accepts a secret ref directly in content; state stores only the file's sha256 plus its permissions.

secret "firebase_sa" {
  from_file = "~/.config/app/firebase-sa.json"
}

resource "system_secret_file" "firebase-sa" {
  host    = host.primary.addr
  path    = "/etc/app/firebase-sa.json"
  content = secret.firebase_sa.value
  mode    = "0400"
}

resource "docker_container" "api" {
  host    = host.primary.addr
  image   = "api:dev"
  volumes = ["/etc/app/firebase-sa.json:/app/firebase-sa.json:ro"]
  env = {
    GOOGLE_APPLICATION_CREDENTIALS = "/app/firebase-sa.json"
  }
}

The pattern is: drop the file with system_secret_file, mount it into the container as a read-only volume, point the app at it via a non-sensitive env var. State holds sha256 + mode + owner + group for the file, never the bytes. Re-applying with the same contents is a no-op via the sha-match check; rotating the secret changes the sha and triggers a re-upload (and a container recreate if the file is bind-mounted, which it is here).

What about plan-only review?

If you want to share a config or review one without populating the env, run plan --allow-unresolved-secrets:

./target/release/stratum plan --allow-unresolved-secrets \
  -c db.strat \
  -s .stratum/host.json

Unset env vars become the placeholder string <unresolved-secret:NAME> and flow through the plan as a normal string. apply refuses to execute any plan containing such a placeholder — the flag is plan-only by design. See plan --allow-unresolved-secrets.
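The plan-only behavior can be sketched in two small functions. This is an illustration under assumptions: `resolve_env_secret` and `apply_allowed` are hypothetical names, the error text is invented, and the NAME in the placeholder is taken here to be the env var name.

```rust
const PLACEHOLDER_PREFIX: &str = "<unresolved-secret:";

/// Resolve an env-sourced secret, optionally substituting a placeholder
/// when the variable is unset (plan-only mode).
fn resolve_env_secret(
    var: &str,
    value: Option<&str>,
    allow_unresolved: bool,
) -> Result<String, String> {
    match value {
        Some(v) => Ok(v.to_string()),
        None if allow_unresolved => Ok(format!("{}{}>", PLACEHOLDER_PREFIX, var)),
        None => Err(format!("env var {} is not set", var)),
    }
}

/// An apply must refuse any plan whose values still contain a placeholder.
fn apply_allowed(plan_values: &[String]) -> bool {
    !plan_values.iter().any(|v| v.contains(PLACEHOLDER_PREFIX))
}
```

Checking with `contains` rather than equality also catches placeholders embedded in larger interpolated strings.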

What's next

Full reference for the syntax and semantics: Secrets.
For values you don't mind printing (debug toggles, public keys), add sensitive = false inside the secret block — the value still flows but is never added to the redaction map.

Multi-namespace deployments

You have a single host running several independent slices of infrastructure: a base layer (firewall, docker, traefik), a database tier, and a couple of apps. You want each slice to plan and apply on its own — without forgetting a -c flag and tripping the destruction guard, and without one slice's state file silently owning what another slice declared.

This is what namespaces are for. You write one manifest at ./stratum.strat that declares the shared host(s) and lists each slice by name; each slice gets its own state file; stratum -n <name> plan/apply operates inside one slice at a time. Stratum checks for docker_container port and name collisions across siblings at plan time, so two slices can't quietly fight over :80.

This tutorial walks through:

Writing a manifest that splits one host into two namespaces.
Applying each namespace independently.
Observing the on-disk state split (.stratum/<ns>.json + _shared.json).
Provoking a cross-namespace port collision and reading the error.
Migrating from an existing bundle (-c X -c Y -s state.json) to per-namespace state.

If you've never applied a stratum config before, start with Bootstrap a fresh droplet for the basics. The setup here assumes a host already exists.

What you'll end up with

./stratum.strat                # manifest (host + namespace blocks)
./infra/edge.strat             # namespace "infra" config (traefik, edge network)
./app/web.strat                # namespace "app" config (nginx behind traefik)
./app/db.strat                 # namespace "app" config (postgres)

.stratum/
  infra.json                   # state for namespace "infra"
  app.json                     # state for namespace "app"
  _shared.json                 # implicit _stratum_* tuning resources

Two independent state files for two namespaces, one shared file for the per-host tuning resources stratum injects automatically.

Step 1: write the manifest

The manifest is a plain .strat file. It contains:

host blocks — visible to every namespace.
secret blocks — visible to every namespace.
namespace blocks — one per slice.

It does not contain resource blocks. Those live in the per-namespace configs.

# stratum.strat
host "primary" {
  addr = "root@192.0.2.10"
}

namespace "infra" {
  configs = ["infra/edge.strat"]
}

namespace "app" {
  configs = [
    "app/web.strat",
    "app/db.strat",
  ]
}

configs paths are resolved relative to the manifest's directory. The manifest itself is loaded first when a namespace is selected, so anything it declares (the host "primary" block, any secret blocks) is visible to every file under configs.

Step 2: per-namespace configs

# infra/edge.strat
resource "docker_network" "edge" {
  host = host.primary.addr
  name = "stratum-edge"
}

resource "docker_container" "traefik" {
  host    = host.primary.addr
  name    = "traefik"
  image   = "traefik:v2.11"
  restart = "unless-stopped"
  ports   = ["80:80", "443:443"]
  volumes = ["/var/run/docker.sock:/var/run/docker.sock:ro"]
  networks = ["stratum-edge"]
}

# app/web.strat
resource "docker_container" "web" {
  host    = host.primary.addr
  name    = "web"
  image   = "nginx:alpine"
  restart = "unless-stopped"
  networks = ["stratum-edge"]
  labels = {
    "traefik.enable"                     = "true"
    "traefik.http.routers.web.rule"      = "Host(`web.example.com`)"
    "traefik.http.routers.web.entrypoints" = "web"
  }
}

# app/db.strat
secret "db_password" {
  from_env = "DB_PASSWORD"
}

resource "docker_container" "db" {
  host    = host.primary.addr
  name    = "db"
  image   = "postgres:16-alpine"
  restart = "unless-stopped"
  networks = ["stratum-edge"]
  env = {
    POSTGRES_PASSWORD = secret.db_password.value
    POSTGRES_DB       = "app"
  }
  volumes = ["app-db-data:/var/lib/postgresql/data"]
}

Three details:

Both app/web.strat and app/db.strat reference host.primary.addr. The host is declared in the manifest, not in either of these files — that's fine because the manifest is always loaded first in namespace mode.
The secret "db_password" block is scoped to the app namespace. The infra namespace's plan does not load app/db.strat, so DB_PASSWORD does not need to be set when planning infra.
All three containers attach to stratum-edge. The network is created by the infra namespace; the app namespace just attaches to it. Stratum does not validate that the network exists across namespaces — apply infra first.

Step 3: plan and apply the infra namespace

stratum -n infra plan

stratum.strat is auto-discovered in the current directory. The -n infra flag resolves to:

configs: stratum.strat (the manifest) + infra/edge.strat.
state: .stratum/infra.json (default for -n infra since no state = was set).
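The resolution above can be sketched with `std::path`. The sketch rests on assumptions: `Resolved` and `resolve_namespace` are hypothetical names, and how a `state =` override or the default `.stratum/` directory is anchored (current directory versus manifest directory) is not specified here, so both are kept as given.

```rust
use std::path::{Path, PathBuf};

/// What -n <name> expands to (hypothetical type).
struct Resolved {
    configs: Vec<PathBuf>,
    state: PathBuf,
}

/// Manifest first, then the namespace's configs relative to the manifest's
/// directory; state defaults to .stratum/<name>.json.
fn resolve_namespace(
    manifest: &Path,
    name: &str,
    configs: &[&str],
    state: Option<&str>,
) -> Resolved {
    let base = manifest.parent().unwrap_or_else(|| Path::new(""));
    let mut files = vec![manifest.to_path_buf()];
    files.extend(configs.iter().map(|c| base.join(c)));
    let state = match state {
        Some(p) => PathBuf::from(p),
        None => PathBuf::from(".stratum").join(format!("{}.json", name)),
    };
    Resolved { configs: files, state }
}
```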

Expected plan: one create for docker_network.edge, one for docker_container.traefik, plus three implicit _stratum_* tuning resources stratum injects per host.

 + _stratum_sshd_oom_primary
 + _stratum_sshd_reload_primary
 + _stratum_swap_primary
 + docker_container.traefik
 + docker_network.edge

5 create, 0 update, 0 delete, 0 no-op

Apply it:

stratum -n infra apply -y

After the apply, inspect the state directory:

ls .stratum/
# _shared.json    infra.json

infra.json holds the two user-declared resources (docker_container.traefik, docker_network.edge). _shared.json holds the three _stratum_* resources. The split is by addr name: anything starting with _stratum_ goes to _shared.json, everything else to the namespace's file. This is what lets a second namespace targeting the same host see the tuning resources as already-applied instead of trying to recreate them.
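The prefix-based split can be sketched as a single partition. `split_addrs` is a hypothetical name; only the `_stratum_` prefix rule comes from the text above.

```rust
/// Partition resource addresses into (shared, namespace-owned) by prefix.
fn split_addrs<'a>(addrs: &[&'a str]) -> (Vec<&'a str>, Vec<&'a str>) {
    addrs
        .iter()
        .copied()
        .partition(|addr| addr.starts_with("_stratum_"))
}
```

The first list would be written to `_shared.json` and the second to the namespace's own file; the relative order within each list is preserved.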

Step 4: plan and apply the app namespace

export DB_PASSWORD=$(openssl rand -hex 32)
stratum -n app plan

The app namespace's plan loads the manifest, then app/web.strat, then app/db.strat. Two creates for the new containers; the three _stratum_* resources show as no-op (they're already in _shared.json from the infra apply).

   _stratum_sshd_oom_primary
   _stratum_sshd_reload_primary
   _stratum_swap_primary
 + docker_container.db
 + docker_container.web

2 create, 0 update, 0 delete, 3 no-op

The three no-ops are the heart of why _shared.json exists. Without it, the app namespace's first apply would try to recreate the swap file and the sshd drop-in, and the second apply of infra would do the same thing in reverse — every cross-namespace apply would churn the tuning resources.

Apply:

stratum -n app apply -y

State is now:

.stratum/
  _shared.json     # 3 _stratum_* resources
  app.json         # docker_container.{web,db}
  infra.json       # docker_container.traefik, docker_network.edge

Each namespace can now plan / apply / be torn down on its own without touching the other.

Step 5: collisions are caught at plan time

Now provoke a port conflict on purpose. Add a ports line to app/web.strat claiming :80:

# app/web.strat — buggy version
resource "docker_container" "web" {
  # ...
  ports = ["80:80"]   # WRONG: traefik already binds :80 on this host
}

stratum -n app plan

The cross-namespace validator runs before any plan output prints:

Error: cross-namespace port conflict on host `root@192.0.2.10`:
  - app::docker_container.web  (current) wants 80
  - infra::docker_container.traefik  (sibling) already claims it

Three things to notice:

The error names both namespaces (app::... for the resource being planned, infra::... for the sibling that already claimed the port).
The host is named in the prefix. Two namespaces using :80 on different hosts is allowed.
The validator runs at plan time, before any side effects — apply doesn't get a chance to fight docker over the port.

Container name collisions are caught the same way:

Error: cross-namespace container name conflict on host `root@192.0.2.10`:
  - app::docker_container.db  (current) uses name `traefik`
  - infra::docker_container.traefik  (sibling) already uses it

Resolve the port conflict from the first example by removing the ports line: the web container is fronted by traefik over the stratum-edge network, so it doesn't need a host port binding.
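A cross-namespace port check of this shape can be sketched as a lookup over sibling claims. This is a simplified model, not the real validator: `PortClaim`, `host_port`, and `find_conflict` are hypothetical names, and bind addresses such as `127.0.0.1` are ignored, so two claims on the same port with different bind addresses would be reported as a conflict here.

```rust
/// One host-port claim made by a container in some namespace.
#[derive(Debug, Clone, PartialEq)]
struct PortClaim {
    namespace: String,
    addr: String,
    host: String,
    port: u16,
}

/// Extract the host-side port from "80:80" or "127.0.0.1:5432:5432".
/// A bare container port such as "80" publishes no fixed host port.
fn host_port(spec: &str) -> Option<u16> {
    let parts: Vec<&str> = spec.split(':').collect();
    match parts.len() {
        2 => parts[0].parse().ok(),
        3 => parts[1].parse().ok(),
        _ => None,
    }
}

/// Find a sibling namespace that already claims the same port on the same host.
fn find_conflict<'a>(current: &PortClaim, siblings: &'a [PortClaim]) -> Option<&'a PortClaim> {
    siblings.iter().find(|s| {
        s.namespace != current.namespace && s.host == current.host && s.port == current.port
    })
}
```

Keying the comparison on the host string is what permits two namespaces to use `:80` on different hosts.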

Step 6: cross-namespace `depends_on` doesn't work — duplicate instead

Suppose you want to add a build step: an ssh_exec runs docker build to produce an image, and the docker_container in the app namespace consumes it.

# app/web.strat
resource "docker_container" "web" {
  # ...
  depends_on = ["ssh_exec.build-web"]   # WRONG: declared in another namespace
}

If ssh_exec.build-web lives in some other namespace, the planner can't see it — it only loads the current namespace's resources — and you'll get an undeclared-target error at plan time.

The workaround is to duplicate the producer: move (or copy) the build step into the namespace that consumes it.

# app/web.strat
resource "ssh_exec" "build-web" {
  host    = host.primary.addr
  command = "cd /srv/repos/web && docker build -t web:dev ."
}

resource "docker_container" "web" {
  # ...
  image      = "web:dev"
  pull       = false
  depends_on = ["ssh_exec.build-web"]
}

Now depends_on is local to the namespace, and the planner can topo-sort the build ahead of the container start. If two namespaces share the same git checkout and need to build it for different consumers, declare the ssh_exec once per namespace — the apt-package equivalent of "each apt cache update runs once per host, not once per consumer."
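The ordering and the undeclared-target error can be sketched with Kahn's algorithm over the namespace-local resources. This is an assumption-heavy illustration: `topo_order` is a hypothetical name, the error strings are invented, and the real planner may order ties differently.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Order resources so every dependency comes before its dependents.
/// `deps` maps each declared resource to the resources it depends on.
fn topo_order<'a>(deps: &BTreeMap<&'a str, Vec<&'a str>>) -> Result<Vec<String>, String> {
    // Reject edges to resources this namespace does not declare.
    for (addr, targets) in deps {
        for t in targets {
            if !deps.contains_key(t) {
                return Err(format!("{} depends on undeclared resource {}", addr, t));
            }
        }
    }
    // Kahn's algorithm: count each resource's distinct unfinished dependencies.
    let mut remaining: BTreeMap<&str, usize> = deps
        .iter()
        .map(|(a, t)| (*a, t.iter().collect::<BTreeSet<_>>().len()))
        .collect();
    let mut ready: VecDeque<&str> = remaining
        .iter()
        .filter(|(_, n)| **n == 0)
        .map(|(a, _)| *a)
        .collect();
    let mut order = Vec::new();
    while let Some(done) = ready.pop_front() {
        order.push(done.to_string());
        for (addr, targets) in deps {
            if targets.contains(&done) {
                let n = remaining.get_mut(addr).unwrap();
                *n -= 1;
                if *n == 0 {
                    ready.push_back(*addr);
                }
            }
        }
    }
    // Anything left unordered is part of a cycle.
    if order.len() != deps.len() {
        return Err("dependency cycle detected".to_string());
    }
    Ok(order)
}
```

Because the map only contains the current namespace's resources, a producer declared elsewhere fails the first loop, which mirrors the undeclared-target error described in this step.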

Migrating from bundle mode

If you've been running stratum with -c X -c Y -s droplet.json, the path to namespaces is mechanical:

Move shared declarations into a new stratum.strat. Pull every host and secret block out of the per-config files and into the manifest. Add one namespace "<name>" { configs = [...] } per logical slice.
Leave the per-slice configs in place. They keep their resource blocks. Any host.<name>.<field> references in them now resolve against the manifest's host declaration — no edits required, as long as the host name is unchanged.
Split the bundle state file. Run stratum state show <addr> -p droplet.json for each resource to identify which namespace it belongs to. Hand-write the per-namespace state files by copying entries out of droplet.json. Implicit _stratum_* resources go to _shared.json.
Verify with plan. For each namespace, run stratum -n <name> plan. Every step should show no-op (0 create, 0 update, 0 delete, N no-op). Anything else means a resource ended up in the wrong file — fix the split.
Mind the depends_on edges. If any docker_container.depends_on crosses what is now a namespace boundary (the producer is in namespace A, the consumer in namespace B), duplicate the producer into B and point depends_on at the local copy. See Step 6 above.

The bundle workflow keeps working without migration. If you don't need per-slice state files, you can keep using -c X -c Y -s state.json indefinitely — nothing about namespaces is mandatory.

What's next

Namespaces reference — full attribute table and error catalog.
-n and --manifest CLI flags — exact flag semantics, including the -s override rule.
Architecture: split state — how _shared.json is reconciled at load and save time.

Changelog

2026-05-27

Documented the namespace block shipment. New page language/namespaces.md covers the manifest-only namespace "<n>" { configs = [...] state = "..." } syntax, allowed top-level blocks (manifest vs per-namespace), shared-vs-scoped host/secret/provider visibility, the cross-namespace depends_on limitation, and the cross-namespace port + container-name collision check. New tutorial tutorials/namespaces.md walks through writing a manifest with two namespaces sharing a host, observing the split state (.stratum/<n>.json + .stratum/_shared.json), provoking a port collision, and migrating from bundle mode. Updated cli.md for the new global -n / --namespace and --manifest flags, including the -n + -c mutual-exclusion, the -s override rule, the split-state file pair, and the cross-namespace conflict check. Updated architecture.md with new sections: manifest discovery, cross-namespace validator, split state (save_split / load_merged semantics), implicit per-host _stratum_* resource catalog, and renumbered the plan/apply flow to thread namespace mode through it. Updated language/overview.md for the new top-level block and the four-pass evaluator. Updated language/multi-config.md to point at namespaces as the alternative for multi-slice deployments and generalized "one state per droplet" to "one state per host." Refreshed introduction.md "what works" with namespaces. Site-specific names (vortex, portal, 68.183.228.11, sotheara-say/*, *.nip.io, deployd as a deployed app) were swept from cli.md, tutorials/{bootstrap-droplet,book-serve,inject-secret-into-container}.md, providers/{system,docker,git}.md, language/{secrets,types,interpolation,resources}.md, and the introduction — replaced with generic web / api / db / app / host.primary.addr / RFC 5737 documentation IPs (192.0.2.10) / example.com.

2026-05-26

Documented the multi-feature shipment (8 gaps closed across config, core, docker, system, and a new git provider). New page language/interpolation.md covers ${...} string interpolation: grammar, scalar coercion, \${ escape, the honesty guard still applying through templates. New page providers/git.md covers the git_repo kind: branch / tag / SHA dispatch, recreate-on-url-change, commit_sha drift, depth handling. Extended providers/system.md with system_secret_file (whole-file secret kind, sha256-only state, stricter default mode 0400, secret ref directly in content). Extended providers/docker.md with docker_image (build-on-host producer kind, DOCKER_BUILDKIT=1, image_id in state) and four new docker_container attrs: depends_on (planner topo sort + cycle / unknown-ref detection), healthcheck (map lowered to --health-* flags + post-apply readiness wait up to 60s), memory / memory_swap (passthrough to docker run), and list-form command (argv-style with shell-escaping). Updated providers/ssh.md for the new ssh_exec.env map (sorted, shell-quoted, supports secret refs). Updated cli.md for the new global --env-file flag with auto-./.env load (12-factor: process env wins, first-set wins among files) and the new stratum status subcommand (per-host uptime + free + df + docker stats table). Updated architecture.md with planner-side validators (port-conflict, depends_on topo sort), the normalize_for_plan / SECRET_CONTENT_TO_SHA plaintext-leak fix, plan-level redact_plan walk (drops marker-vs-plaintext spurious drift), the post-apply readiness wait, and rewrote the delete-ordering section for forward-topo with reverse-iteration fallback. Refreshed introduction.md "what works" (now four providers; ${...} interpolation, --env-file, status, git_repo, docker_image, system_secret_file, depends_on, healthcheck, planner validators all listed). 
Extended tutorials/inject-secret-into-container.md with the connection-string-via-${...} pattern and the whole-file-secret-via-system_secret_file pattern. SUMMARY.md adds language/interpolation.md and providers/git.md.

2026-05-24 (latest)

Documented four shipped features. Added language/secrets.md (full reference for secret blocks: sources, refs, redaction map, marker shape, sensitive/short-value rules, the honesty guard, --allow-unresolved-secrets). Added language/multi-config.md (one-state-per-host rule, cross-file refs, duplicate hard errors, pointer to state merge). Added tutorials/inject-secret-into-container.md (env-var-on-docker_container pattern with rotation; inline code blocks — listings infra not yet present). Rewrote the destruction-guard section in cli.md around the wrong-state-file deletion footgun, updated the error template to include loaded configs + state path. Added state merge to cli.md. Added --allow-unresolved-secrets to plan flag table. Documented docker_container.pull = false in providers/docker.md with the locally-built-image use case. Documented system_dir empty-dir mode (no source_dir) in providers/system.md and made source_dir optional in the attribute table; updated language/types.md to match. Added secret as a top-level block in language/overview.md and added secret as a ref root in language/references.md. Updated tutorials/book-serve.md to teach one-state-per-host via multi--c, not one-state-per-config. Refreshed introduction.md "what works" with all four features. Fixed traefik:v3.1 -> traefik:v2.11 in architecture.md state-file example and added a secret-marker subsection.

2026-05-24 (later)

Documented system_dir kind (tar+gz upload, manifest sha tracking, delete_extra, 200-file read cap) under providers/system.md. Added source_dir semantics to language/types.md. Added tutorials/book-serve.md walking through the canonical "second app behind Traefik" pattern, including the one-state-file-per-host rule. Documented apply --allow-destroy and the destruction-guard rationale on cli.md; updated the bootstrap teardown step to use it. Refreshed introduction "what works" with system_dir and the destroy guard.

2026-05-24

Scope pivot to ansible-replacement: coolify provider deleted, app-deployment work moved out of scope. --live flag dropped (apply -y now executes). New system provider documented (system_package, system_service, system_file, system_ufw_rule). Drift detection shipped (stratum plan --refresh, post-apply self-check, Observed/Drift types, one-sided diff_observed). content_file attribute on system_file documented under language reference. Replaced tutorials/slice-1-hello.md with tutorials/bootstrap-droplet.md. Architecture page got a drift section and a delete-ordering note.

2026-05-24 (earlier)

Backfilled the book from current source: introduction, language reference (overview / hosts / resources / references / types), provider pages (coolify / ssh / docker), CLI reference, architecture, and the Slice 1 tutorial. Doc agent did the writing; source under crates/ is the source of truth.