Introduction
stratum is a tiny declarative IaC tool, written from scratch in Rust. It's scoped to system bootstrap: install packages, drop config files, manage systemd services, configure the firewall, and run system-tier containers (Traefik, monitoring agents, log shippers). Describe resources in a .strat file, run stratum plan to preview, and stratum apply -y to make it happen. State lives in a JSON file under .stratum/state.json next to your config.
The tool is intentionally small. No plugin system, no remote backend, no cloud SDK. Providers are first-class crates in the workspace; today there are four:
- system: packages, services, files, secret files, directories, ufw rules.
- ssh: shells out to system ssh to run commands and write files.
- docker: drives the remote docker CLI over SSH (networks, containers, image builds).
- git: clones and pins git working trees on a remote host.
What stratum is not
Stratum is not an app-deployment tool. It will not build your application image, manage per-app environments, or rotate deploys. Stratum's role stops at "the host has docker, traefik, a firewall, and the right config files." Reach for a per-app deploy tool from there.
What works today
- .strat config: host, secret, provider, resource blocks; nested maps, lists, string/number/bool values; comments (# and //); refs of the form host.<name>.<field> and secret.<name>.value; ${...} string interpolation for embedding refs inside larger strings.
- JSON state at .stratum/state.json, with create/update/delete actions and a recursive structural diff that ignores state-only provider-computed fields.
- CLI: plan (with --refresh for live drift detection and --allow-unresolved-secrets for plan-only review), apply (with -y to execute and --allow-destroy to permit deletes), status (per-host resource snapshot), state list, state show, state merge. Global --env-file for loading env vars before resolving secret { from_env } refs, with auto-load of ./.env when no flag is passed.
- Repeatable -c on plan/apply: multiple .strat files merge into one document and evaluate together. Hosts and secrets declared in one file are visible to refs in another. Duplicates across files are hard errors that name both paths. See Multi-file configs.
- Namespaces: a top-level namespace "<n>" { configs = [...] } block declares a deployable slice with its own state file. The CLI's -n NAME global selects a namespace; the manifest's shared host/secret/provider blocks are visible to every namespace's configs. Cross-namespace docker_container port and name collisions are caught at plan time. Implicit per-host tuning resources are kept in a shared state file (.stratum/_shared.json) so multiple namespaces sharing a host don't fight over them. Bundle mode (no -n) is unchanged.
- system_package, system_service, system_file, system_secret_file, system_dir, system_ufw_rule (apt + systemd + ufw + file + secret-file + directory-tree management). system_dir also has an empty-dir mode for pre-creating daemon directories without uploading anything. system_secret_file stores sha256 only; plaintext never persists in state.
- ssh_exec (with optional env map for sensitive shell prefixes), ssh_file (run commands, write files).
- docker_network, docker_container, docker_image. Containers support depends_on (planner topo-sorts), healthcheck (post-apply readiness wait), memory/memory_swap limits, list-form command for argv passthrough, and pull = false for locally-built images. docker_image builds images on the host with DOCKER_BUILDKIT=1 and tracks the resulting image_id.
- git_repo (new git provider) clones a remote repo to a fixed path on a host and pins it to a branch, tag, or full SHA. State tracks commit_sha; drift triggers fetch + reset.
- Secrets v0: secret { from_env = ... } or from_file = ..., referenced as secret.<name>.value or ${secret.<name>.value} inside strings. Plaintext flows to providers; state stores a {__secret, __secret_sha256} marker for whole-leaf matches and a <secret:NAME:sha256:HEX> inline marker for substring matches inside interpolated strings; diff/diff_observed are marker-aware so no perpetual drift; plan output renders object markers as <secret:name sha:abc123>. See Secrets.
- Planner-side validators: docker_container.ports conflict check across the merged config (fails on (host, ip, host_port) collisions, with 0.0.0.0:N / 127.0.0.1:N symmetry); depends_on topo sort with cycle and unknown-ref detection.
- Post-apply readiness wait: a docker_container with a healthcheck map blocks subsequent steps until docker inspect reports healthy (60s budget). Otherwise a 500ms cosmetic pause.
- Provider::read on system_* (except system_ufw_rule), ssh_file, both docker_* kinds, and git_repo: surfaces drift between recorded state and live host reality.
- Post-apply self-check: every successful stratum apply -y re-reads each resource and reports post-apply drift: clean (or counts of differ/missing/unreadable).
- content_file = "<relative-path>" on system_file and source_dir = "<relative-path>" on system_dir, resolved against the .strat file's directory.
- Destruction guard: stratum apply refuses to run a plan with Delete steps unless --allow-destroy is passed. Error names the resources, the loaded configs, and the state file path.
What does not work yet
- No drift detection for system_ufw_rule or ssh_exec: both return Unknown from read, so they show up in unreadable counts (intentional).
- No DNS provider.
- No partial / targeted apply: every plan is whole-config.
- No state locking or remote backend.
- No cross-resource refs of the form <kind>.<name>.<attr> (e.g. docker_image.X.id): producer-to-consumer wiring still uses the literal tag + a depends_on edge. Planned, not shipped.
- The provider "<name>" { ... } block parses but no shipped provider reads one today.
Quick start
cargo build --release
# write a config
cat > stratum.strat <<'EOF'
host "primary" {
addr = "root@192.0.2.10"
}
resource "system_package" "curl" {
host = host.primary.addr
name = "curl"
}
EOF
# see what would happen
./target/release/stratum plan
# preview again with live drift annotation
./target/release/stratum plan --refresh
# apply for real
./target/release/stratum apply -y
State is written to .stratum/state.json. Inspect it with stratum state list and stratum state show.
Where to go next
- Bootstrap a fresh droplet: end-to-end walkthrough from blank Ubuntu 24.04 to ufw + docker + traefik in one apply.
- Inject a secret into a docker container: the env-var-from-shell-to-container pattern, with rotation.
- Multi-namespace deployments: split one host into independent deployable slices.
- The .strat language: full grammar reference.
- String interpolation: embed ${host.X.field} and ${secret.Y.value} inside larger strings.
- Multi-file configs: one state per host, cross-file refs, duplicate handling.
- Namespaces: manifest blocks, per-namespace state, cross-namespace conflict checks.
- Secrets: the secret block, refs, redaction, the honesty guard.
- Providers: what each kind does and the exact attribute schema.
- Architecture: how plan, apply, and drift detection fit together.
The .strat language
A .strat file is a flat list of top-level blocks. Five kinds exist:
- host "<name>" { ... }: a named SSH target. See Hosts.
- secret "<name>" { ... }: a sensitive value sourced from env or file, referenced as secret.<name>.value. See Secrets.
- provider "<name>" { ... }: provider configuration. See Providers.
- resource "<kind>" "<name>" { ... }: a piece of declared infra. See Resources.
- namespace "<name>" { ... }: a deployable slice; lists the .strat files that apply together against one dedicated state file. Only meaningful in a top-level manifest. See Namespaces.
document ::= block*
block ::= ident label* "{" body "}"
label ::= string
body ::= ( attr | block )*
attr ::= ident "=" value
value ::= string | number | bool | list | map | ref
list ::= "[" ( value ( "," value )* ","? )? "]"
map ::= "{" ( ( ident | string ) "=" value )* "}"
ref ::= ident ( "." ident )+
Whitespace is insignificant. Block bodies may also contain nested blocks; those are folded into the parent map under the key <kind> (or <kind>_<label> when labels are present).
Comments
Both # and // start a line comment. There are no block comments.
# this is a comment
// so is this
resource "ssh_exec" "uptime" {
host = host.prod.addr # inline comments work too
command = "uptime"
}
Evaluation order
The evaluator runs four passes:
- Hosts first. All host blocks are evaluated with an empty scope, so they must be made of literals only. Any ref inside a host block is a hard error.
- Secrets next. All secret blocks are evaluated, sources resolved eagerly. Secrets are also literal-only: refs inside secret bodies error.
- Namespaces. All namespace blocks are collected. Body attributes (configs, optional state) must be literal; no refs are allowed. The namespaces don't take part in resource evaluation; they only inform the CLI's -n NAME resolution.
- Providers and resources. With the host + secret scope built, provider and resource bodies are evaluated and any host.<name>.<field> / secret.<name>.value reference is resolved.
See References & scope for the resolution rules.
Resource kind naming
A resource kind must start with the provider name, separated by an underscore, for example system_package, ssh_exec, docker_container. The prefix is what stratum uses to route resources to a provider. A kind with no underscore, or one that starts with an underscore (leaving the provider prefix empty), is rejected.
Values
Strings, numbers, booleans, lists, and maps. See Types & values for the exact rendering rules — in particular the integer-vs-float behavior for numbers.
Strings may embed ${<ref>} placeholders that are substituted at config-load time. See String interpolation.
Hosts
A host block names an SSH target. Other blocks reference its fields via host.<name>.<field>.
host "prod" {
addr = "root@192.0.2.10"
port = 22
}
host_block ::= "host" string "{" attr* "}"
The single label is the host's name and must be unique within the document. The body is a flat list of attributes; nested blocks are allowed but rare in practice.
Literal-only
Host blocks are evaluated in pass 1 with an empty scope. They cannot reference anything — not other hosts, not providers, not themselves. Any ref inside a host body errors with references not allowed inside host blocks.
host "prod" {
addr = host.staging.addr # error: hosts must be literal
}
Fields
There is no schema for host fields — anything you put in a host body is accessible by ref. The conventional fields used by the SSH and Docker providers are:
| field | type | notes |
|---|---|---|
| addr | string | user@host form passed verbatim to the ssh binary. |
| port | number | Currently unused by the providers; reserved for future use. |
In the providers shipped today, only addr is consumed; SSH connection options come from your ~/.ssh/config and ssh-agent. The port field is parsed but not yet wired through.
Multiple hosts
Declare as many as you need. Each one is independent.
host "prod" { addr = "root@192.0.2.10" }
host "staging" { addr = "root@5.6.7.8" }
resource "ssh_exec" "uptime_prod" {
host = host.prod.addr
command = "uptime"
}
resource "ssh_exec" "uptime_staging" {
host = host.staging.addr
command = "uptime"
}
Secrets
A secret "<name>" { ... } block sources a sensitive value from outside the .strat file and makes it referenceable as secret.<name>.value. Plaintext flows into provider attrs at apply time; state stores a redaction marker, never the value itself.
secret "pg_password" {
from_env = "PG_PASSWORD"
}
resource "docker_container" "db" {
host = host.primary.addr
image = "postgres:16-alpine"
env = {
POSTGRES_PASSWORD = secret.pg_password.value
}
}
secret_block ::= "secret" string "{" attr* "}"
The single label is the secret's name. Names must be unique within the document — duplicates across -c files error with duplicate secret.
Sources
Exactly one of from_env or from_file is required. Both set or neither set is a hard error (BadSecretBody).
| attr | type | description |
|---|---|---|
| from_env | string | Name of an environment variable. Resolved with std::env::var at config-load time. |
| from_file | string | Path to a file. ~ and ~/ expand to $HOME (or $USERPROFILE on Windows). Relative paths resolve against the .strat file's directory. The file's contents are loaded with std::fs::read_to_string; one trailing \n or \r is trimmed. |
| sensitive | bool | Default true. When false, the resolved value is still used by providers but is never placed in the redaction map, so CLI output and state will show it in the clear. Opt-out for values you don't mind printing. |
Both sources resolve eagerly at config-load time. A missing env var or unreadable file is a hard error unless --allow-unresolved-secrets is set on plan.
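The from_file rules in the table can be sketched in a few lines of Rust. This is an illustration under stated assumptions, not stratum's code: the function names resolve_from_file and trim_one_trailing_newline are hypothetical, the caller passes in the home directory and the .strat file's directory, and "one trailing \n or \r" is read literally as a single character.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: turn a raw `from_file` value into a concrete path.
/// `home` stands in for $HOME / $USERPROFILE, `base_dir` for the .strat
/// file's directory. Returns None when the piece it needs is missing.
fn resolve_from_file(raw: &str, home: Option<&Path>, base_dir: Option<&Path>) -> Option<PathBuf> {
    if raw == "~" {
        return home.map(Path::to_path_buf);
    }
    if let Some(rest) = raw.strip_prefix("~/") {
        return home.map(|h| h.join(rest));
    }
    let path = Path::new(raw);
    if path.is_absolute() {
        return Some(path.to_path_buf());
    }
    // Relative paths resolve against the .strat file's directory.
    base_dir.map(|b| b.join(path))
}

/// Hypothetical helper: drop at most one trailing '\n' or '\r'.
/// Assumption: a "\r\n" ending loses only the '\n'.
fn trim_one_trailing_newline(s: &str) -> &str {
    s.strip_suffix('\n')
        .or_else(|| s.strip_suffix('\r'))
        .unwrap_or(s)
}
```

The None results correspond to the cases the error table calls out: a ~ path with no home directory, and a relative path with no base directory.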
References
The only allowed field is value:
env = { POSTGRES_PASSWORD = secret.pg_password.value }
Anything else (secret.pg.fingerprint, secret.pg, secret.pg.value.foo) errors with unknown secret field or reference ... too short.
Like host blocks, secret bodies are literal-only: any ref inside (including refs to other secrets) errors with references not allowed inside `secret` blocks.
Redaction
When a secret is resolved and meets all of these conditions, its plaintext is added to a private redaction map:
- sensitive = true (the default).
- The resolved value is at least 8 characters long.
- The value is non-empty.
- The secret is not in the unresolved-placeholder state (see below).
The redaction walk runs over every plan step's desired and prior values before printing, and over every provider's returned attrs before they're persisted to state. Any leaf string that exactly matches a known plaintext is replaced with a marker object:
{
"__secret": "pg_password",
"__secret_sha256": "sha256:f7c3bc1d808e04..."
}
The marker is what lives in .stratum/state.json. Re-loading state and re-planning produces the same marker (no re-redaction needed — redact_walk is idempotent on markers).
Substring redaction
The exact-match case covers env = { POSTGRES_PASSWORD = secret.pg.value } — the leaf string equals the plaintext, so the whole leaf is replaced with the object marker. A secret ref inside a ${...} interpolation (see String interpolation) is different: the plaintext is glued into a larger string at evaluation time, so the redaction walk sees "postgresql://app:CORRECTHORSEBATTERY@db:5432/app", not the bare secret.
For those cases, redact_walk falls back to substring replacement. Every known secret plaintext that appears in the leaf is replaced inline with a marker token of the form <secret:NAME:sha256:HEX>. Longest match wins (so overlapping secrets stay deterministic), and replacement is per-occurrence. The substituted string is what lands in state:
{
"env": {
"DATABASE_URL": "postgresql://app:<secret:pg:sha256:f7c3bc...>@db:5432/app"
}
}
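As a rough illustration of the substring pass, the sketch below scans a leaf left to right and, at each position, tries the longest known plaintext first. It is not redact_walk itself: the Secret struct and redact_substrings are hypothetical names, the sha256 hex is passed in precomputed (the standard library has no SHA-256), and the 8-character floor is counted in chars as an assumption.

```rust
/// Hypothetical view of one redaction-map entry.
struct Secret<'a> {
    name: &'a str,
    plaintext: &'a str,
    /// Precomputed hex digest; computing it is out of scope for this sketch.
    sha256_hex: &'a str,
}

/// Replace every occurrence of a known plaintext with an inline marker.
fn redact_substrings(leaf: &str, secrets: &[Secret<'_>]) -> String {
    // Keep only values long enough to redact, longest first so that
    // overlapping secrets resolve deterministically.
    let mut ordered: Vec<&Secret<'_>> = secrets
        .iter()
        .filter(|s| s.plaintext.chars().count() >= 8)
        .collect();
    ordered.sort_by(|a, b| b.plaintext.len().cmp(&a.plaintext.len()));

    let mut out = String::with_capacity(leaf.len());
    let mut rest = leaf;
    'scan: while !rest.is_empty() {
        for s in &ordered {
            if let Some(tail) = rest.strip_prefix(s.plaintext) {
                out.push_str(&format!("<secret:{}:sha256:{}>", s.name, s.sha256_hex));
                rest = tail;
                continue 'scan;
            }
        }
        // No secret starts here: copy one char and move on.
        let ch = rest.chars().next().expect("rest is non-empty");
        out.push(ch);
        rest = &rest[ch.len_utf8()..];
    }
    out
}
```

Running the marker text back through the function leaves it unchanged as long as no plaintext appears inside a marker, which mirrors the idempotence property described for object markers.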
Substring redaction also runs over diff_observed output: when state holds the inline marker and the live host returns plaintext, the marker is applied to the observed side before comparison, so both sides collapse to the same string and the diff disappears. Without this, every plan --refresh would emit a spurious Update for every interpolated secret-bearing field. The post-redaction equality check happens in Extracted::redact_plan, which the CLI calls on both plan and apply before printing.
A short value (< 8 chars) still resolves and flows into provider attrs — it's just not added to the redaction map, because substring-substitution on short strings ("root", "5432") has a high false-positive rate. The CLI emits a warning to stderr in that case:
[secrets] warning: secret `s` resolved value is <8 chars; CLI/state will not redact it
If two distinct secrets resolve to the same plaintext, you get a different warning — the marker is ambiguous because there's no way to tell which secret a given plaintext leaf came from:
[secrets] warning: secrets `a` and `b` resolved to the same value — redaction marker may be ambiguous
Plan output
Secret-bearing fields render with a 6-char hash prefix:
~ docker_container.db
~ env.POSTGRES_PASSWORD: <secret:pg_password sha:f7c3bc> -> <secret:pg_password sha:9a1e44>
The hash is enough signal to tell that the value changed without leaking the value or a full attackable digest.
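A minimal sketch of that rendering, assuming the marker's __secret_sha256 field carries a sha256: prefix as in the example object above. The function name render_marker is hypothetical.

```rust
/// Render an object marker for plan output, keeping only the first
/// six hex characters of the digest.
fn render_marker(name: &str, secret_sha256: &str) -> String {
    // Accept both "sha256:<hex>" and bare "<hex>" (the latter is an assumption).
    let hex = secret_sha256.strip_prefix("sha256:").unwrap_or(secret_sha256);
    let short: String = hex.chars().take(6).collect();
    format!("<secret:{name} sha:{short}>")
}
```

For a marker named pg_password with a digest starting f7c3bc, this yields <secret:pg_password sha:f7c3bc>, matching the plan line shown above.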
Drift detection
diff_observed is marker-aware:
- Marker (state) vs plaintext (observed): hash the plaintext, compare to the marker's __secret_sha256. Match → no drift.
- Marker vs marker: compare hashes directly.
- Mismatch → emit a single <secret> -> <secret-drifted> change, with no plaintext on either side.
This is what stops --refresh from showing a perpetual drift on every secret-bearing field. Without marker awareness, every refresh would compare the state-side marker object against the host-side plaintext and flag a difference.
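The comparison rules can be written down as a small function. This is a sketch, not diff_observed: the Observed enum and secret_drifted are hypothetical names, and the hash function is passed in as a parameter because the standard library has no SHA-256. Any real use would supply an actual sha256 implementation.

```rust
/// What the live host returned for a secret-bearing leaf.
enum Observed<'a> {
    /// The host returned the plaintext value.
    Plaintext(&'a str),
    /// The observed side is already a marker; this is its hash.
    MarkerHash(&'a str),
}

/// True when the observed value no longer matches the hash recorded in state.
fn secret_drifted(state_hash: &str, observed: Observed<'_>, sha256: impl Fn(&str) -> String) -> bool {
    match observed {
        // Hash the plaintext and compare; the plaintext itself is never reported.
        Observed::Plaintext(p) => sha256(p) != state_hash,
        // Marker vs marker: compare hashes directly.
        Observed::MarkerHash(h) => h != state_hash,
    }
}
```

Because only a boolean leaves the function, the caller can emit the <secret> -> <secret-drifted> change without ever holding plaintext in the diff output.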
The honesty guard
Some resource attrs receive opaque blobs that stratum can't substring-redact after the fact: file contents, file paths that may contain content interpolation, directory uploads. Embedding a secret ref in any of these is rejected at config-load time:
| kind | forbidden attr |
|---|---|
| system_file | content |
| system_file | content_file |
| system_dir | source_dir |
resource "system_file" "x" {
content = secret.s.value # error: SecretInUnsupportedAttr
}
The error message begins secret reference not allowed in `system_file.content` and goes on to say that secrets must be in single-leaf string attrs (e.g. inside `env`), not embedded in file content or path strings.
This is a deliberate limitation. The right shape for a config-file secret is a templated config rendered outside stratum (or via a future system_template kind that knows about secret boundaries) — not a secret embedded inside a content blob whose redaction story is "search the file for the plaintext, hope you find it."
For the common case of "drop a whole secret blob on the host" (a Firebase service-account JSON, an .env file, an age-encrypted key), use system_secret_file. Its content attribute accepts a secret ref directly — state stores sha256 plus permissions, never the plaintext.
resource "system_secret_file" "firebase-sa" {
host = host.primary.addr
path = "/etc/app/firebase-sa.json"
content = secret.firebase_sa.value
mode = "0400"
}
--allow-unresolved-secrets
stratum plan --allow-unresolved-secrets treats a missing env var or unreadable file as a soft failure: the secret's value becomes the placeholder string <unresolved-secret:NAME> instead of erroring out. Useful when reviewing someone else's config without their env populated.
The placeholder flows through eval_value like any other string, so it shows up in plan output wherever the secret was referenced. Apply refuses to run a plan containing any placeholder:
refusing to apply: plan contains unresolved-secret placeholder for `pg_password`
hint: this only happens via `plan --allow-unresolved-secrets`; set the secret's source and retry.
Placeholders are not added to the redaction map.
Errors
| condition | error |
|---|---|
| Neither from_env nor from_file | BadSecretBody |
| Both from_env and from_file | BadSecretBody |
| Env var unset (without --allow-unresolved-secrets) | SecretEnvMissing |
| from_file path unreadable (without the flag) | SecretFileMissing |
| from_file is relative but the source has no base dir | SecretFileNoBaseDir (only with load_str, not CLI flows) |
| ~ in from_file but neither HOME nor USERPROFILE set | SecretTildeNoHome |
| Ref inside the secret body | RefInSecretBlock |
| Unknown secret name | UnknownSecret |
| Unknown field (anything other than value) | UnknownSecretField |
| Duplicate name across -c files | DuplicateSecret (names both paths) |
| Secret ref in a forbidden attr | SecretInUnsupportedAttr |
Tutorial
See Inject a secret into a docker container for the env-var-on-docker_container pattern end to end.
Resources
A resource block declares a piece of infra that should exist.
resource "<kind>" "<name>" {
<attr> = <value>
...
}
resource_block ::= "resource" string string "{" body "}"
body ::= ( attr | nested_block )*
The two labels are positional:
- <kind>: the resource kind, e.g. docker_container. The prefix before the first _ selects the provider.
- <name>: a stable identifier unique within the kind. The pair <kind>.<name> is the address used everywhere else (state file, plan output, stratum state show).
The address (kind, name) must be unique within the document; duplicates error.
Kind-to-provider routing
stratum splits the kind on the first _ and uses the prefix as the provider name. The kind must contain at least one underscore and must not start with one.
| kind | provider |
|---|---|
| system_package | system |
| ssh_exec | ssh |
| docker_container | docker |
Anything without an underscore-prefix (foo, _bar) is rejected with resource kind ... must be prefixed with provider name.
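The routing rule is small enough to sketch directly. The function name provider_for_kind is hypothetical, and returning None stands in for the rejection error. What happens to a kind with nothing after the underscore (such as ssh_) is not specified here, so the sketch does not special-case it.

```rust
/// Split a resource kind on its first '_' and return the provider prefix.
/// None means the kind would be rejected: no underscore, or an empty prefix.
fn provider_for_kind(kind: &str) -> Option<&str> {
    kind.split_once('_')
        .map(|(prefix, _rest)| prefix)
        .filter(|prefix| !prefix.is_empty())
}
```

Only the first underscore matters, so system_secret_file routes to system just like system_package does.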
Example
resource "docker_container" "hello" {
host = host.primary.addr
name = "hello"
image = "nginxdemos/hello:latest"
restart = "unless-stopped"
networks = ["stratum-edge"]
labels = {
"traefik.enable" = "true"
"traefik.http.routers.hello.rule" = "Host(`hello.example.com`)"
}
}
Nested blocks
A resource body may contain nested blocks. They are folded into the parent map. If the nested block has labels, the key is <kind>_<label1>_<label2>...; with no labels it is just <kind>.
resource "ssh_exec" "demo" {
host = host.prod.addr
command = "true"
meta {
owner = "ops"
}
}
Stored attrs:
{
"host": "root@192.0.2.10",
"command": "true",
"meta": { "owner": "ops" }
}
Provider implementations today expect attributes, not nested blocks — see each provider page for the shape it reads.
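The key-folding rule for nested blocks, as a short sketch. The function name folded_key is hypothetical; it only builds the map key and does not model how the block body is merged into the parent.

```rust
/// Build the parent-map key for a nested block:
/// `<kind>` with no labels, `<kind>_<label1>_<label2>...` otherwise.
fn folded_key(kind: &str, labels: &[&str]) -> String {
    let mut key = String::from(kind);
    for label in labels {
        key.push('_');
        key.push_str(label);
    }
    key
}
```

For the meta block in the example above there are no labels, so the key is just meta.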
What providers see
After parsing and ref resolution, each resource body is a serde_json::Value::Object. Providers receive that object and pick the fields they care about. Unknown fields are ignored — there is no validation step between the parser and the provider.
References & scope
A reference is a dotted path of identifiers used in place of a literal value.
ref ::= ident ( "." ident )+
The first segment is the root. Two roots are supported:
| root | resolves to | shape |
|---|---|---|
| host | a field on a declared host block. | host.<name>.<field> (3+ parts) |
| secret | the resolved plaintext of a declared secret block. | secret.<name>.value (exactly 3 parts; value is the only allowed field) |
Anything else errors with unknown reference root.
Host references
Form: host.<name>.<field> (at least three segments).
host "prod" {
addr = "root@192.0.2.10"
port = 22
}
resource "ssh_exec" "uptime" {
host = host.prod.addr # -> "root@192.0.2.10"
command = "uptime"
}
Resolution walks the host's evaluated attrs as a JSON map. Each segment after the host name indexes one level deeper, so nested fields work too:
host "prod" {
ssh = {
user = "root"
addr = "1.2.3.4"
}
}
resource "ssh_exec" "x" {
host = host.prod.ssh.addr # -> "1.2.3.4"
command = "true"
}
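A sketch of that walk, using a deliberately tiny stand-in for the JSON value type so the example needs only the standard library. The Value enum and resolve_host_ref are hypothetical; the real evaluator works on its own value representation and returns specific errors where this sketch returns None.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for an evaluated attribute value.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Num(f64),
    Map(BTreeMap<String, Value>),
}

/// Resolve `host.<name>.<field>[.<field>...]` against evaluated host bodies.
/// `hosts` maps each host name to its body (expected to be a Value::Map).
fn resolve_host_ref<'a>(hosts: &'a BTreeMap<String, Value>, path: &str) -> Option<&'a Value> {
    let mut segments = path.split('.');
    if segments.next()? != "host" {
        return None; // unknown root
    }
    let name = segments.next()?;
    let mut current = hosts.get(name)?; // unknown host
    let mut depth = 0;
    for segment in segments {
        // Each further segment indexes one level deeper.
        current = match current {
            Value::Map(map) => map.get(segment)?, // unknown field
            _ => return None,                     // cannot index into a scalar
        };
        depth += 1;
    }
    // A ref needs at least one field after the host name.
    if depth == 0 { None } else { Some(current) }
}
```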
Secret references
Form: secret.<name>.value. Exactly three segments — value is the only allowed field, since secrets are leaves.
secret "pg_password" {
from_env = "PG_PASSWORD"
}
resource "docker_container" "db" {
host = host.primary.addr
image = "postgres:16"
env = {
POSTGRES_PASSWORD = secret.pg_password.value
}
}
Secret refs are only allowed inside single-leaf string attrs. They are rejected at config-load time inside system_file.content, system_file.content_file, and system_dir.source_dir — see Secrets: the honesty guard for why.
Error cases
| condition | error |
|---|---|
| Fewer than 3 segments (host.prod) | reference ... too short — expected at least 3 segments |
| Unknown host name (host.ghost.addr) | unknown host ghost in reference ... |
| Unknown field (host.prod.missing) | host prod has no field missing ... |
| Unknown secret name (secret.ghost.value) | unknown secret ghost in reference ... |
| Secret field other than value (secret.s.fingerprint) | reference to secret s has unsupported field ... |
| Unknown root (provider.x.y) | unknown reference root provider |
| Any ref inside a host body | references not allowed inside host blocks |
| Any ref inside a secret body | references not allowed inside secret blocks |
| Secret ref inside system_file.content / content_file / system_dir.source_dir | secret reference not allowed in ... |
Why these two roots only
The evaluator collects hosts first (pass 1), then secrets (pass 2), then namespaces (pass 3), and only then evaluates providers and resources (pass 4). Providers and resources see fully populated host + secret scopes but cannot reference each other: there is no resource.foo.attr form. Cross-resource references would need a topological pass; that has not shipped.
References inside strings
A reference may also appear inside a string as ${<ref>} — the placeholder is replaced with the resolved scalar value at evaluation time. The same root rules apply (host.* and secret.* only), and the same honesty guard fires for forbidden attrs. See String interpolation for the grammar and edge cases.
env = {
DATABASE_URL = "postgresql://app:${secret.pg.value}@${host.primary.addr}:5432/app"
}
Bare identifiers as values
A single identifier with no dots is not a valid value. Either quote it as a string or extend it to a ref. The parser will report bare identifier ... not allowed as value (use a string or a reference like ...).
resource "ssh_exec" "x" {
host = prod # error: bare identifier
host = "prod" # ok — string
host = host.prod.addr # ok — reference
}
String interpolation
A double-quoted string may embed ${<ref>} placeholders. Each placeholder is replaced at config-load time with the resolved value of the reference (see References & scope), coerced to its string form.
env = {
DATABASE_URL = "postgresql://app:${secret.pg.value}@${host.primary.addr}:5432/app"
}
string ::= '"' ( char | escape | interp )* '"'
interp ::= "${" ref "}"
ref ::= ident ( "." ident )+
escape ::= "\\\"" | "\\\\" | "\\n" | "\\r" | "\\t" | "\\${"
A string that contains no ${...} lexes as a plain string literal. A string with at least one ${...} lexes as a template (alternating literal and interpolation segments) and the evaluator concatenates the resolved parts.
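The literal/template split can be illustrated with a small scanner over the string body. This is a sketch under assumptions, not stratum's lexer: Segment and split_template are hypothetical names, only the \${ escape is handled (the other escapes are left out), a placeholder with no dot is treated as a bare identifier, and the "unterminated" error text is invented for the sketch. The other error strings follow the messages quoted on this page.

```rust
#[derive(Debug, PartialEq)]
enum Segment {
    Literal(String),
    /// The dotted ref text inside `${...}`, e.g. "secret.pg.value".
    Interp(String),
}

/// Split the body of a double-quoted string into literal and `${...}` parts.
fn split_template(body: &str) -> Result<Vec<Segment>, String> {
    let mut segments = Vec::new();
    let mut literal = String::new();
    let mut rest = body;
    while !rest.is_empty() {
        if let Some(tail) = rest.strip_prefix("\\${") {
            // Escaped: emit a literal "${" and keep scanning.
            literal.push_str("${");
            rest = tail;
        } else if let Some(tail) = rest.strip_prefix("${") {
            let end = tail.find('}').ok_or("unterminated `${`")?;
            let inner = &tail[..end];
            if inner.is_empty() {
                return Err("empty `${}` is not allowed".into());
            }
            if inner.contains("${") {
                return Err("nested `${` is not supported".into());
            }
            if !inner.contains('.') {
                return Err(format!("bare identifier `{inner}` not allowed in `${{...}}`"));
            }
            if !literal.is_empty() {
                segments.push(Segment::Literal(std::mem::take(&mut literal)));
            }
            segments.push(Segment::Interp(inner.to_string()));
            rest = &tail[end + 1..];
        } else {
            let ch = rest.chars().next().expect("rest is non-empty");
            literal.push(ch);
            rest = &rest[ch.len_utf8()..];
        }
    }
    if !literal.is_empty() {
        segments.push(Segment::Literal(literal));
    }
    Ok(segments)
}
```

A result with no Interp segment corresponds to the plain-string case; anything else is a template whose parts the evaluator would resolve and concatenate.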
Allowed refs inside ${...}
Any reference the References & scope rules accept:
- host.<name>.<field>, including nested fields like host.prod.ssh.addr.
- secret.<name>.value: the resolved plaintext flows in. State stores a substring marker (see Secrets).
A bare identifier inside ${...} is rejected, as are too-short refs, nested placeholders, and empty placeholders:
${foo} ← error: "bare identifier `foo` not allowed in `${...}`"
${secret.foo} ← error: too short (secret refs need exactly 3 segments)
${a.${b}} ← error: "nested `${` is not supported"
${} ← error: "empty `${}` is not allowed"
Scalar coercion
The reference must resolve to a scalar — string, number, or bool. Numbers render via their JSON form (4000, 1.5), bools as true / false. Lists, maps, or null error at evaluation time:
cannot interpolate non-scalar value `${host.h.tags}` into a string
(refs inside `${...}` must resolve to a string, number, or bool)
Escaping
\${ produces a literal ${ in the output and is not a placeholder. There is no other \$ escape — a bare \$ followed by anything else is a lex error.
shell_var = "literal \${HOME} not stratum" # -> "literal ${HOME} not stratum"
The honesty guard still applies
Embedding secret.X.value inside a string interpolation lands in the same forbidden-attr check as a bare secret reference. The check fires by attr name, not by ref form:
# Both are rejected — system_file.content is in SECRET_FORBIDDEN_ATTRS.
content = secret.s.value
content = "prefix ${secret.s.value} suffix"
See Secrets: the honesty guard for the list of forbidden (kind, attr) pairs and the reasoning.
Where interpolation is most useful
Connection strings and other glued-together values where a bare ref doesn't fit:
resource "docker_container" "api" {
host = host.primary.addr
image = "api:dev"
env = {
# Embed a secret inside a URI — bare `secret.pg.value` would only work
# if the whole env value were the password.
DATABASE_URL = "postgresql://app:${secret.pg.value}@db:5432/app"
# Combine multiple host fields.
INTERNAL_API = "http://${host.primary.addr}:4000"
}
}
When a secret is interpolated into a larger string, state stores the value with an inline substring marker ("--requirepass <secret:pg:sha256:HEX>") instead of replacing the whole leaf with an object marker. See Secrets: substring redaction.
Types & values
Five value types, all of which round-trip to JSON.
String
Double-quoted. Supports the escape sequences \", \\, \n, \r, \t. Strings cannot contain a raw newline; use \n.
name = "api"
greeting = "hello\nworld"
A string may also contain one or more ${<ref>} placeholders, replaced at config-load time with the resolved value of the reference. \${ escapes a literal ${. See String interpolation.
db_url = "postgresql://app:${secret.pg.value}@${host.primary.addr}:5432/app"
escaped = "literal \${HOME}" # -> "literal ${HOME}"
Number
A signed decimal, optionally with a fractional part. Lexed as f64. When emitting JSON, stratum prefers a JSON integer if the number is finite and whole (port = 4000 becomes 4000, not 4000.0); otherwise it emits a JSON float. Non-finite floats serialize as null.
port = 4000 # -> JSON 4000
timeout = 1.5 # -> JSON 1.5
This matters for diff: changing 4000 to 4000.0 is a no-op, since both render as the integer 4000.
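A sketch of the rendering rule, assuming the value is held as an f64. The name render_number is hypothetical, and the magnitude bound that keeps the cast to i64 exact is an assumption of this sketch rather than something stated here.

```rust
/// Render an f64 the way the Number rules describe:
/// whole finite values as integers, other finite values as floats,
/// non-finite values as JSON null.
fn render_number(n: f64) -> String {
    if !n.is_finite() {
        "null".to_string()
    } else if n.fract() == 0.0 && n.abs() < 9.0e15 {
        // Whole number inside the range where f64 -> i64 is exact (assumed bound).
        format!("{}", n as i64)
    } else {
        format!("{}", n)
    }
}
```

Since 4000 and 4000.0 lex to the same f64, both render as 4000 and the diff sees no change.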
Bool
Bare true or false. These are lexed as keywords, not identifiers.
tls = true
List
Square-bracketed, comma-separated. Trailing commas are allowed. Items can be any value type (including refs and other lists/maps).
ports = ["8080:80", "8443:443"]
mixed = [1, "two", true]
nested = [[1, 2], [3, 4]]
list ::= "[" ( value ( "," value )* ","? )? "]"
Map
Brace-delimited. Each entry is <key> = <value>. Entries are separated by whitespace — no commas. Keys may be either bare identifiers or quoted strings; the string form is required for dotted keys like Traefik labels.
env = {
NODE_ENV = "production"
PORT = 4000
}
labels = {
"traefik.enable" = "true"
"traefik.http.routers.api.rule" = "Host(`api.example.com`)"
}
map ::= "{" ( ( ident | string ) "=" value )* "}"
Reference
See References & scope.
host = host.prod.addr
Identifiers
Identifiers start with an ASCII letter or _, then continue with letters, digits, _, or -. They are used for block kinds, attribute keys, map keys, and ref segments.
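The identifier rule as a predicate. The name is_identifier is hypothetical, "letters" is read as ASCII letters for the continuation characters too (an assumption), and the keyword handling for true and false is not modelled.

```rust
/// True if `s` starts with an ASCII letter or '_' and continues with
/// ASCII letters, digits, '_' or '-'.
fn is_identifier(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false, // empty, or bad first character
    }
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}
```

The hyphen is what allows resource names such as firebase-sa, while a leading digit or hyphen is refused.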
content_file on system_file
The system_file resource (see providers/system) accepts a special attribute, content_file, which inlines a local file's bytes into content at config-load time.
resource "system_file" "traefik-config" {
host = host.primary.addr
path = "/etc/traefik/traefik.yml"
content_file = "files/traefik.yml"
mode = "0644"
}
Semantics:
- The value is a path relative to the .strat file's directory (not the current working directory).
- The file is read at config-load time. Its bytes become the content attribute the provider sees. content_file itself is stripped, so providers never see it.
- The file must contain valid UTF-8 (it's loaded with std::fs::read_to_string).
Errors:
| condition | error variant |
|---|---|
| Both content and content_file on the same system_file | EvalError::ContentConflict |
| The referenced file does not exist or is unreadable | EvalError::ContentFileMissing |
| Using content_file via stratum_config::load_str (no base) | EvalError::ContentFileNoBaseDir |
The third case only matters if you're embedding stratum-config in another program and calling load_str directly. The stratum CLI always uses load_file, so content_file always works in CLI flows.
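Put together, the content_file rules amount to a small decision over two optional attributes and an optional base directory. The sketch below is an illustration, not the stratum-config implementation: resolve_content is a hypothetical name, and the local EvalError enum only borrows the three variant names from the table (the real variants may carry extra data).

```rust
use std::fs;
use std::path::Path;

/// Local stand-in that reuses the variant names from the table above.
#[derive(Debug, PartialEq)]
enum EvalError {
    ContentConflict,
    ContentFileMissing,
    ContentFileNoBaseDir,
}

/// Decide what the provider sees as `content`.
/// `base_dir` is the .strat file's directory, or None for a load_str-style call.
fn resolve_content(
    content: Option<&str>,
    content_file: Option<&str>,
    base_dir: Option<&Path>,
) -> Result<Option<String>, EvalError> {
    match (content, content_file) {
        (Some(_), Some(_)) => Err(EvalError::ContentConflict),
        (Some(inline), None) => Ok(Some(inline.to_string())),
        (None, None) => Ok(None),
        (None, Some(relative)) => {
            let base = base_dir.ok_or(EvalError::ContentFileNoBaseDir)?;
            // read_to_string also fails on invalid UTF-8; this sketch folds
            // that into the same "missing or unreadable" variant.
            fs::read_to_string(base.join(relative))
                .map(Some)
                .map_err(|_| EvalError::ContentFileMissing)
        }
    }
}
```

The function returns only content, which reflects the point that content_file is stripped before the provider sees the resource.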
This attribute is specific to system_file. The ssh_file resource only supports inline content. Use system_file if you want to load a file from disk.
source_dir on system_dir
The system_dir resource (see providers/system) accepts an optional source_dir attribute pointing at a local directory. Same base-dir rule as content_file:
resource "system_dir" "book" {
host = host.primary.addr
source_dir = "../book/book"
path = "/srv/stratum-book"
}
Semantics:
- The value is a path relative to the .strat file's directory.
- At config-load time the path is joined with the base dir and std::fs::canonicalize'd. The provider sees the canonical absolute path. The original relative form is not preserved.
- The directory must exist and be a directory.
- Omitting source_dir is valid: the resource enters empty-dir mode, where only mkdir -p + chown + chmod run on the host.
Errors:
| condition | error variant |
|---|---|
| source_dir points at a missing path or non-directory | EvalError::SourceDirMissing |
| Using source_dir via stratum_config::load_str (no base dir) | EvalError::SourceDirNoBaseDir |
Unlike content_file, the contents are not inlined into state at config-load time — system_dir builds a fresh manifest from source_dir every plan and only ships bytes during apply.
Multi-file configs
plan and apply accept -c more than once. Every listed file is parsed independently, then concatenated into one document and evaluated as a single config. Hosts and secrets declared in any file are visible to references in any other file.
stratum apply -y \
-c infra.strat \
-c app.strat \
-s .stratum/host.json
The shape on disk is one .strat file per logical concern (the host's bootstrap, each app, each shared service). The shape on a host is one state file per host, never one per .strat file.
For multiple deployable slices on the same host where you want each slice to plan and apply on its own — without juggling a long -c list and without one slice's state file silently owning another slice's resources — see Namespaces. Namespaces are the higher-level alternative; the bundle workflow described on this page is unchanged and remains the right shape when you have a single deployable slice.
One state per host
Every config that touches a given host must apply against the same -s state file. State is the authority on "what is currently tracked on this host"; splitting it across files means each file's state thinks it owns the host alone, and applying one of them produces a plan full of Delete steps for the resources owned by the others.
The destruction guard catches this case — apply refuses to run if any Delete is present without --allow-destroy — but the structural fix is to apply all -c files for a host together against one state file. Do not apply them one at a time.
If you forget a -c, the apply will still refuse to run, and the error names every loaded config so the missing one is visually obvious:
refusing to apply: plan would delete 9 resources not in config:
- docker_container.traefik
- docker_network.edge
- ...
loaded configs: app.strat
state file: .stratum/host.json
The missing config here is infra.strat — it owns the deleted resources, but it isn't in the loaded set.
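The guard itself is simple to picture. The following is a rough sketch, assuming a hypothetical Step type and destruction_guard function (neither name comes from stratum's source). It reproduces the first line of the error shown above and the loaded-configs line.

```rust
/// Hypothetical plan action; the planner's real types are not shown here.
#[derive(Debug, Clone, PartialEq)]
enum Action {
    Create,
    Update,
    Delete,
}

/// One planned step, addressed as "<kind>.<name>".
struct Step {
    addr: String,
    action: Action,
}

/// Refuse to apply when the plan contains deletes and the caller did not
/// opt in with --allow-destroy (modelled here as `allow_destroy`).
fn destruction_guard(plan: &[Step], allow_destroy: bool, loaded: &[&str]) -> Result<(), String> {
    let doomed: Vec<&str> = plan
        .iter()
        .filter(|s| s.action == Action::Delete)
        .map(|s| s.addr.as_str())
        .collect();
    if doomed.is_empty() || allow_destroy {
        return Ok(());
    }
    let mut msg = format!(
        "refusing to apply: plan would delete {} resources not in config:\n",
        doomed.len()
    );
    for addr in &doomed {
        msg.push_str(&format!("  - {}\n", addr));
    }
    // Naming every loaded config makes a forgotten -c easy to spot.
    msg.push_str(&format!("loaded configs: {}", loaded.join(", ")));
    Err(msg)
}
```

The check runs on the finished plan rather than on the config, so it triggers the same way whether the deletes come from a forgotten -c or from a state file shared by mistake.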
If you'd rather apply each slice independently against its own state file, that's what Namespaces are for. The per-namespace state file is scoped to its namespace's resources, and the implicit per-host tuning resources land in a shared file so multiple namespaces sharing a host don't fight over them.
Cross-file references
A host or secret declared in file A is referenceable in file B without redeclaration.
# hosts.strat
host "primary" {
addr = "root@192.0.2.10"
}
# app.strat — no `host "primary"` redeclaration
resource "ssh_exec" "uptime" {
host = host.primary.addr
command = "uptime"
}
stratum plan -c hosts.strat -c app.strat -s .stratum/host.json
This works because all files merge into one Document before evaluation. The evaluation order (hosts → secrets → providers + resources) is global across the merged set, not per file.
Per-file base directories
content_file (on system_file) and source_dir (on system_dir) resolve relative to the declaring file's directory, not relative to the working directory or the first -c file. Two system_file blocks in two different files can both reference files/foo.txt next to themselves, and each resolves to its own files/foo.txt.
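A minimal sketch of that rule, assuming a hypothetical resolve_relative helper (not a stratum function): the base is always the parent directory of the file that declared the attribute.

```rust
use std::path::{Path, PathBuf};

/// Join a relative attribute value onto the directory of the .strat file
/// that declared it. A bare file name has an empty parent, so the value
/// stays relative to wherever that file lives.
fn resolve_relative(declaring_file: &Path, value: &str) -> PathBuf {
    let base = declaring_file.parent().unwrap_or_else(|| Path::new(""));
    base.join(value)
}
```

Given infra/infra.strat and apps/web/app.strat, the same value files/foo.txt resolves to infra/files/foo.txt and apps/web/files/foo.txt respectively.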
Duplicates are hard errors
Four categories of duplicate are caught at load time, and the error names both source paths:
| category | error |
|---|---|
| Same host name in two files | DuplicateHost { name, first, second } |
| Same provider name in two files | DuplicateProvider { name, first, second } |
| Same <kind>.<name> resource | DuplicateResource { addr, first, second } |
| Same secret name | DuplicateSecret { name, first, second } |
duplicate host `primary`: defined in hosts.strat and infra.strat
There is no "last file wins" rule. Pick one file to own the declaration and remove the other.
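One way to get "first declaration wins the blame, second one errors" is to record the owning file for every name while merging. The sketch below does this for hosts only. The DuplicateHost struct mirrors the field names in the table, but merge_hosts and its input shape are assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::fmt;

/// Modelled on the DuplicateHost { name, first, second } row above.
#[derive(Debug, PartialEq)]
struct DuplicateHost {
    name: String,
    first: String,
    second: String,
}

impl fmt::Display for DuplicateHost {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "duplicate host `{}`: defined in {} and {}",
            self.name, self.first, self.second
        )
    }
}

/// Merge (file path, host names) pairs in -c order. Returns a map of
/// host name -> declaring file, or an error on the first repeated name.
fn merge_hosts(files: &[(&str, Vec<&str>)]) -> Result<HashMap<String, String>, DuplicateHost> {
    let mut owner: HashMap<String, String> = HashMap::new();
    for (path, hosts) in files {
        for name in hosts {
            if let Some(first) = owner.get(*name) {
                return Err(DuplicateHost {
                    name: name.to_string(),
                    first: first.clone(),
                    second: path.to_string(),
                });
            }
            owner.insert(name.to_string(), path.to_string());
        }
    }
    Ok(owner)
}
```

Because the map stores the declaring path, the error can name both files without a second pass over the inputs.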
Merging existing state files
If you started with per-config state files (e.g. .stratum/infra.json, .stratum/app.json) and want to consolidate into one bundle state, use stratum state merge — it merges two or more state files into one, refusing on overwrite and on any <kind>.<name> collision. After merging, run stratum plan against the consolidated state to confirm zero diff before removing the old files.
Namespaces
A namespace "<name>" { ... } block declares a deployable slice of the infra: a set of .strat files that apply together against a dedicated state file. Namespaces live in a top-level manifest (by convention ./stratum.strat) alongside the host, secret, and provider blocks they share.
host "primary" {
addr = "root@192.0.2.10"
}
namespace "infra" {
configs = ["infra.strat"]
}
namespace "app" {
configs = ["app/web.strat", "app/db.strat"]
}
namespace_block ::= "namespace" string "{" attr* "}"
A namespace is selected on the CLI with -n <name>:
stratum -n infra apply -y
stratum -n app apply -y
Both invocations share the host "primary" declared in the manifest; each writes its own state file (.stratum/infra.json, .stratum/app.json).
Body attributes
| attr | required | type | default | description |
|---|---|---|---|---|
| configs | yes | list of string | — | .strat files this namespace owns. Paths resolve against the manifest's directory. The manifest itself is always loaded first; configs entries are loaded after, in order. |
| state | no | string | .stratum/<name>.json | Explicit state file path. Relative paths resolve against the manifest's directory. Overridden by a CLI -s flag. |
Names must be unique within the manifest. Duplicates error with duplicate namespace. References (host.x.y, secret.z.value) are not allowed inside a namespace body — only string literals.
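The precedence for the state path (CLI -s, then the state attribute, then the default) could look roughly like the sketch below. The state_path function is hypothetical. Two details are assumptions rather than documented behaviour: the -s value is returned exactly as given, and the default .stratum/<name>.json is joined onto the manifest's directory like an explicit relative path.

```rust
use std::path::{Path, PathBuf};

/// Pick the state file for one namespace.
fn state_path(
    manifest_dir: &Path,
    namespace: &str,
    state_attr: Option<&str>,
    cli_state: Option<&str>,
) -> PathBuf {
    // A CLI -s flag overrides everything in the manifest.
    if let Some(cli) = cli_state {
        return PathBuf::from(cli); // assumption: used verbatim
    }
    match state_attr {
        // Path::join leaves an absolute `p` untouched and anchors a
        // relative one at the manifest's directory.
        Some(p) => manifest_dir.join(p),
        None => manifest_dir
            .join(".stratum")
            .join(format!("{}.json", namespace)),
    }
}
```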
What goes in the manifest, what goes in a namespace config
The manifest is the shared scope: blocks declared there are visible to every namespace. Blocks declared in a namespace's configs are scoped to that namespace alone.
| block | manifest | namespace config |
|---|---|---|
| host | yes (shared) | yes (scoped to ns) |
| secret | yes (shared) | yes (scoped to ns) |
| provider | yes (shared) | yes (scoped to ns) |
| resource | rejected | yes |
| namespace | yes | rejected |
A resource block in the manifest is loaded into every namespace (because the manifest is always the first file in the merged set), but it is not scoped to any one of them — it would appear in every namespace's plan, and the cross-namespace validator would flag every container as colliding with itself. Put resources in the per-namespace files only.
A namespace block in a non-manifest file parses fine but is invisible to CLI resolution — -n NAME only inspects whichever file is passed as --manifest. Treat namespace blocks as manifest-only.
State layout
Each namespace gets its own state file:
.stratum/
infra.json # everything declared under namespace "infra"
app.json # everything declared under namespace "app"
_shared.json # implicit per-host _stratum_* resources
The shared file holds the implicit per-host tuning resources (_stratum_swap_*, _stratum_sshd_oom_*, _stratum_sshd_reload_*). They live in a single file because every namespace that targets a given host wants the same swap and the same sshd drop-in — splitting them per-namespace would have the first apply create them, the second apply see them as missing from state, and recreate them. See Architecture: split state for the merge rules.
What references can cross namespace boundaries
| reference shape | works across namespaces? |
|---|---|
| host.<name>.<field> | yes (manifest-scoped) |
| secret.<name>.value | yes (manifest-scoped) |
| depends_on to a sibling ns | no |
depends_on on a docker_container must point at a <kind>.<name> declared in the same namespace. The planner only sees one namespace's resources at a time, so a depends_on edge to a sibling namespace's resource errors as an undeclared target.
If a producer-consumer relationship crosses what becomes a namespace boundary, the workaround is to duplicate the trigger resource on the consumer side. A typical case: a ssh_exec runs docker build to produce an image (producer), and a docker_container in another namespace consumes that image. Move (or duplicate) the build ssh_exec into the consumer's namespace so the depends_on edge is local. See Cross-namespace conflicts in the tutorial for an example.
Cross-namespace conflict checks
When -n NAME is set, plan and apply re-load every sibling namespace's configs (with unresolved secrets tolerated) and check the current namespace's docker_container resources for collisions against them. Two cases are caught:
- Port collision: two namespaces declare a docker_container binding the same (host, host_port). Random-port and bare-port forms are skipped.
- Container name collision: two namespaces declare a docker_container with the same (host, name).
Both errors name the offending resource and the sibling that already claims the port or name. See Cross-namespace conflicts for the error shape.
Errors
| condition | error |
|---|---|
| Label count other than one (namespace "a" "b" { ... }) | BadNamespaceLabels |
| Missing configs = [...] | BadNamespaceBody |
| configs entry is not a string | BadNamespaceBody |
| Reference inside the body | BadNamespaceBody (no refs allowed) |
| Duplicate namespace name | DuplicateNamespace (names both source paths) |
| -n NAME requested but no manifest at ./stratum.strat (and no --manifest) | CLI: requires a manifest, but ./stratum.strat does not exist |
| -n NAME requested but the manifest declares no such namespace | CLI: namespace <name> not declared in <manifest> (known: ...) |
| -n and -c passed together | CLI: mutually exclusive |
Bundle mode is unchanged
Without -n, stratum operates in bundle mode — the historical workflow. -c X -c Y -s state.json keeps working exactly as before, the cross-namespace validator is skipped, and state is a single file. See Multi-file configs. Namespaces are opt-in; you don't have to migrate.
Tutorial
See Multi-namespace deployments for the end-to-end walkthrough: writing a manifest, splitting an existing bundle, triggering a port collision and resolving it.
Providers
A provider owns one or more resource kinds and implements the Provider trait: create, update, delete, read, and an optional configure for the corresponding provider "<name>" { ... } block.
stratum routes a resource to a provider by splitting the kind on the first _ and looking up the prefix:
| kind | provider |
|---|---|
| system_package | system |
| system_service | system |
| system_file | system |
| system_secret_file | system |
| system_ufw_rule | system |
| system_dir | system |
| ssh_exec | ssh |
| ssh_file | ssh |
| docker_network | docker |
| docker_container | docker |
| docker_image | docker |
| git_repo | git |
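The routing rule is small enough to show in full. A minimal sketch, assuming a hypothetical provider_for helper and a fixed list standing in for the registry built in crates/cli/src/main.rs:

```rust
/// Stand-in for the provider registry (the four shipped providers).
const PROVIDERS: [&str; 4] = ["system", "ssh", "docker", "git"];

/// Split the kind on its first underscore and look the prefix up.
/// Only the first underscore matters, so "system_secret_file" routes
/// to "system", not "system_secret".
fn provider_for(kind: &str) -> Option<&'static str> {
    let (prefix, _rest) = kind.split_once('_')?;
    PROVIDERS.iter().copied().find(|p| *p == prefix)
}
```

A kind with no underscore, or with an unregistered prefix, yields None in this sketch. How stratum reports that case is not covered here.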
Providers are registered at CLI startup in crates/cli/src/main.rs. Adding one means adding a new crate under crates/providers/, wiring it into the registry, and documenting it here.
Configuration block
A provider "<name>" { ... } block is optional. When present, its body is passed to the provider's configure method during apply. No shipped provider reads its block today — the grammar exists but is currently dormant.
Execution
Side effects run when you pass -y to stratum apply. Without -y, apply prints the plan and exits without touching providers. There is no dry-run mode for providers: once -y is set, every create / update / delete call hits the remote host.
For drift detection, every provider also implements read (a non-destructive query). Coverage today:
| kind | read returns |
|---|---|
| system_package | Present { state: present\|absent } |
| system_service | Present { enabled, state } |
| system_file | Present { mode, owner, group, sha256 } or Absent |
| system_secret_file | Present { mode, owner, group, sha256 } or Absent (content never observed) |
| system_ufw_rule | Unknown (parsing punted) |
| system_dir | Present { file_count, manifest_sha256, manifest } or Absent (or Unknown if state's file_count > 200) |
| ssh_exec | Unknown (no readable identity) |
| ssh_file | Present { mode, sha256 } or Absent |
| docker_network | Present { name, driver } or Absent |
| docker_container | Present { name, image, restart, labels, networks, container_id } or Absent |
| docker_image | Present { tag, image_id, id } (echoes prior build_args / context / dockerfile / target / pull_base) or Absent |
| git_repo | Present { path, url, ref, commit_sha } or Absent |
The Unknown cases show up in unreadable counts when you run stratum plan --refresh or after every stratum apply -y. That's intentional for v1.
See each provider's page for the exact attribute schema.
system
Ansible-shape system bootstrap: install packages, manage systemd units, drop files and directories, configure ufw. Operates against a remote host via the system ssh binary (-o BatchMode=yes -o StrictHostKeyChecking=accept-new). Targets Debian/Ubuntu — package management is apt-only.
All apt invocations use DEBIAN_FRONTEND=noninteractive apt-get -y -o Dpkg::Options::=--force-confold, so package installs never hang on prompts and keep existing config files on upgrade.
The provider takes no configuration block.
Kinds
system_package
An apt-managed package, present or absent.
resource "system_package" "docker" {
host = host.primary.addr
name = "docker.io"
state = "present"
}
| attr | required | type | default | description |
|---|---|---|---|---|
| host | yes | string | — | SSH target in user@host form. |
| name | yes* | string | resource label | apt package name. Falls back to the resource label if omitted. |
| state | no | string | present | present or absent. |
*name is technically optional in the source — it defaults to the resource label — but documenting it explicitly is the convention so the apt package name doesn't depend on what you called the resource.
On present, the provider runs apt-get update once per (process, host) before the first install, then apt-get install <name>. On absent, it runs apt-get remove <name>.
Stored state: { host, name, state }. read runs dpkg-query -W -f='${Status}' <name> and returns Present { state: present|absent }. There is no version field in observed state — version drift is not surfaced.
Delete is best-effort apt-get remove (errors swallowed so a missing package doesn't fail the apply).
system_service
A systemd unit, started/stopped with enabled/disabled independently controlled.
resource "system_service" "docker" {
host = host.primary.addr
name = "docker"
enabled = true
state = "started"
}
| attr | required | type | default | description |
|---|---|---|---|---|
| host | yes | string | — | SSH target. |
| name | yes* | string | resource label | systemd unit name. Falls back to the resource label. |
| enabled | no | bool | false | If true, runs systemctl enable <name>. |
| state | no | string | stopped | started or stopped. |
Order of operations:
- (enabled=true, state=started) → enable, then start, then poll is-active for up to 10s.
- (enabled=true, state=stopped) → enable, then stop.
- (enabled=false, state=stopped) → stop (best-effort), then disable.
- (enabled=false, state=started) → disable (best-effort), then start, then poll is-active for 10s.
If the 10s is-active poll times out, the provider collects systemctl status --no-pager -n 20 (and journalctl -u <svc> --no-pager -n 50 if the status output is sparse) and includes both in the error message.
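The four (enabled, state) combinations map naturally onto an exhaustive match. The sketch below is illustrative only: service_steps is a hypothetical function, and the step labels are plain strings rather than the commands the provider actually runs.

```rust
/// Ordered steps for one system_service, given its desired flags.
/// `started` is true for state = "started", false for "stopped".
fn service_steps(enabled: bool, started: bool) -> Vec<&'static str> {
    match (enabled, started) {
        (true, true) => vec!["enable", "start", "poll is-active"],
        (true, false) => vec!["enable", "stop"],
        (false, false) => vec!["stop (best-effort)", "disable"],
        (false, true) => vec!["disable (best-effort)", "start", "poll is-active"],
    }
}
```

Only the two combinations that end in a running unit include the is-active poll, which is why the timeout diagnostics apply to started services alone.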
Stored state: { host, name, enabled, state }. read runs systemctl is-enabled + systemctl is-active in one round-trip and returns Present { enabled, state }.
Delete runs systemctl disable --now <name> best-effort.
system_file
Drops a file on the remote host. Auto-creates parent directories via install -D.
resource "system_file" "traefik-config" {
host = host.primary.addr
path = "/etc/traefik/traefik.yml"
content_file = "files/traefik.yml"
mode = "0644"
owner = "root"
group = "root"
}
| attr | required | type | default | description |
|---|---|---|---|---|
| host | yes | string | — | SSH target. |
| path | yes | string | — | Absolute destination path. Parent dirs are created. |
| content | yes* | string | — | File contents, written verbatim. |
| content_file | yes* | string | — | Path to a local file, resolved relative to the .strat file's directory. Inlined at config-load time into content. |
| mode | no | string | 0644 | File mode, passed to install -m. |
| owner | no | string | root | File owner, passed to install -o. |
| group | no | string | root | File group, passed to install -g. |
*Exactly one of content or content_file must be set. Both → EvalError::ContentConflict at config-load time. See content_file for the full semantics.
Upload uses install -D -m <mode> -o <owner> -g <group> /dev/stdin <path> with the content streamed via SSH stdin. The -D flag creates intermediate directories.
The provider sha256-hashes the content; on update, if the new sha matches what's in prior state, the upload is skipped and the resource logs unchanged (sha256 match).
Stored state: { host, path, content, mode, owner, group, sha256 }. Persisting content means a re-plan with the same config is a no-op (no spurious updates from "content field appeared").
read runs a single-roundtrip probe that prints either MISSING or <mode>|<owner>|<group>|<sha256>. Returns Absent for missing files, Present { host, path, mode, owner, group, sha256 } otherwise. Note that observed state does not include content — drift surfaces as a sha256 mismatch.
Delete runs rm -f -- <path>.
system_secret_file
Like system_file, but the content is treated as a whole-file secret. The plaintext is streamed via SSH stdin (never argv) and never persists in state. State stores sha256 plus the file permissions only — enough to detect drift, not enough to recover the value.
secret "firebase_sa" {
from_file = "~/.config/app/firebase-sa.json"
}
resource "system_secret_file" "firebase-sa" {
host = host.primary.addr
path = "/etc/app/firebase-sa.json"
content = secret.firebase_sa.value
mode = "0400"
owner = "root"
group = "root"
}
| attr | required | type | default | description |
|---|---|---|---|---|
| host | yes | string | — | SSH target. |
| path | yes | string | — | Absolute destination path. Parent dirs are created via install -D. |
| content | yes | string | — | File contents. Typically secret.<name>.value (see Secrets). Unlike system_file, there is no content_file attribute; whole-file secrets are sourced through a secret { from_file = ... } block. |
| mode | no | string | 0400 | File mode, passed to install -m. Default is stricter than system_file (0400 vs 0644): secret files default to owner-only read. |
| owner | no | string | root | File owner, passed to install -o. |
| group | no | string | root | File group, passed to install -g. |
Upload uses install -D -m <mode> -o <owner> -g <group> /dev/stdin <path> with the content streamed via SSH stdin. The provider hashes the content before sending; on update, if the new sha matches the prior state's sha and all permissions match, the upload is skipped and the resource logs unchanged (sha256 + perms match).
State shape: { host, path, mode, owner, group, sha256 }. The content field is omitted — recovering the plaintext from state is impossible by design. Plan diff is sha-to-sha: the build_plan normalizer (SECRET_CONTENT_TO_SHA) drops content from desired and substitutes sha256: hash(content) before diffing, so plans never echo plaintext into a content: null -> "<plaintext>" change. See Architecture: secret-content normalization.
Apply log: byte length only — never the content, never the sha.
[system] SECRET_FILE `firebase-sa` -> root@192.0.2.10:/etc/app/firebase-sa.json (1842 bytes, mode=0400 root:root)
[system] SECRET_FILE `firebase-sa` -> root@192.0.2.10:/etc/app/firebase-sa.json unchanged (sha256 + perms match)
Drift detection: read runs the same probe shape as system_file (<mode>|<owner>|<group>|<sha256> or MISSING). Returns Present { host, path, mode, owner, group, sha256 } with no content field. Drift surfaces as sha256 / mode / owner / group mismatch.
Delete runs rm -f -- <path> best-effort.
system_dir
Manages a directory on the remote host. Two modes, selected by whether source_dir is set:
- Upload mode (source_dir set): tars + gzips a local tree in memory, streams it over one SSH connection, extracts on the host, applies chown -R + recursive chmod. Used to ship static assets (an mdbook build output, a static-site bundle) to a host where another container will serve them.
- Empty-dir mode (source_dir omitted): just mkdir -p + chown + chmod on the host. No upload. Used to pre-create directories a daemon expects but won't create itself (e.g. /srv/repos, /var/lib/<daemon>).
resource "system_dir" "book" {
host = host.primary.addr
source_dir = "../book/book"
path = "/srv/stratum-book"
mode = "0644"
dir_mode = "0755"
owner = "root"
group = "root"
delete_extra = true
}
| attr | required | type | default | description |
|---|---|---|---|---|
| host | yes | string | — | SSH target. |
| path | yes | string | — | Absolute remote destination. Created with mkdir -p. |
| source_dir | no | string | — | If set, local directory whose contents are tarred + uploaded. Resolved relative to the .strat file's directory and canonicalized at config-load time. Must exist and be a directory. If omitted, the resource is in empty-dir mode. |
| mode | no | string | 0644 | Mode applied to every regular file under path (via find -type f -exec chmod). Has no effect in empty-dir mode. |
| dir_mode | no | string | 0755 | Mode applied to every directory under path (via find -type d -exec chmod). In empty-dir mode, applied once to path itself. |
| owner | no | string | root | Recursive owner, applied with chown -R <owner>:<group>. In empty-dir mode, applied once to path. |
| group | no | string | root | Recursive group. |
| delete_extra | no | bool | false | If true, files in prior state's manifest but absent from the new manifest are rm -f'd on the host after the upload. Keeps the remote tree in sync as local files are removed. No effect in empty-dir mode (the manifest is always empty). |
The provider walks source_dir with walkdir, sha256-hashes every regular file, and stores the result as { relpath -> sha256 } (POSIX / separators even on Windows). The manifest is digested into a single manifest_sha256; on update, if both the manifest digest and every permission attr match prior state, the upload is skipped and the resource logs unchanged (... manifest match).
source_dir is resolved at config-load time: the value the provider sees is the canonicalized absolute path of <dir-of-.strat-file>/<source_dir>. Using system_dir with source_dir via stratum_config::load_str (no base dir) errors with EvalError::SourceDirNoBaseDir. A missing or non-directory path errors with EvalError::SourceDirMissing.
Stored state: { host, source_dir, path, mode, dir_mode, owner, group, delete_extra, file_count, manifest_sha256, manifest }. The manifest map is persisted in full so delete_extra can diff prior keys against the new manifest.
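The delete_extra diff amounts to a key-set difference between two manifests. A minimal sketch, assuming a hypothetical stale_paths helper and a BTreeMap as the manifest container (the actual container type is not stated in this document):

```rust
use std::collections::BTreeMap;

/// Manifest shape from the text: relative path -> sha256 hex digest.
type Manifest = BTreeMap<String, String>;

/// Relative paths recorded in the prior manifest that are missing from
/// the new one. With delete_extra = true these are the removal
/// candidates. Hashes are ignored: a changed file is re-uploaded, not
/// deleted.
fn stale_paths(prior: &Manifest, next: &Manifest) -> Vec<String> {
    prior
        .keys()
        .filter(|rel| !next.contains_key(*rel))
        .cloned()
        .collect()
}
```

Diffing against prior state rather than against the live host means files that other processes placed under path are never candidates for removal in this sketch.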
read runs find . -type f -print0 | sort -z | xargs -0 sha256sum on the remote tree, returns Absent if the directory is missing, otherwise Present { host, path, file_count, manifest_sha256, manifest }. Drift surfaces as a manifest_sha256 mismatch (or, with delete_extra off, extra keys observed on the host that aren't in state).
File-count cap. If state's file_count exceeds 200, read returns Observed::Unknown("system_dir read skipped: file_count N > cap 200") instead of doing the remote sha256 sweep. Drift detection on large trees needs a smarter strategy; today they show up as unreadable in --refresh output. Apply itself is not capped — uploads of any size work.
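The cap is a strict greater-than comparison on the count stored in state, so a tree of exactly 200 files is still read. A sketch, with read_skip_reason as a hypothetical helper that returns the Unknown message when the sweep should be skipped:

```rust
/// Cap from the text: trees larger than this are not hashed on read.
const READ_CAP: u64 = 200;

/// Some(message) when read should return Observed::Unknown, None when
/// the remote sha256 sweep should go ahead.
fn read_skip_reason(state_file_count: u64) -> Option<String> {
    if state_file_count > READ_CAP {
        Some(format!(
            "system_dir read skipped: file_count {} > cap {}",
            state_file_count, READ_CAP
        ))
    } else {
        None
    }
}
```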
Delete runs rm -rf -- <path> best-effort.
Empty-dir mode (no source_dir)
Omit source_dir to skip the upload entirely. The provider runs only:
mkdir -p <path>; chown <owner>:<group> <path>; chmod <dir_mode> <path>
The stored state's file_count is 0, the manifest is {}, and manifest_sha256 is the digest of the empty manifest. delete_extra is recorded but has no effect — there are no manifest entries to diff. read returns Absent if the directory is missing on the host, otherwise Present with file_count: 0 (the host's tree is also expected to be empty as far as stratum is concerned; any files placed there by other processes don't drift the manifest).
# Pre-create directories the daemon expects.
resource "system_dir" "etc-app" {
host = host.primary.addr
path = "/etc/app"
}
resource "system_dir" "srv-repos" {
host = host.primary.addr
path = "/srv/repos"
}
Use this in place of an ssh_exec "mkdir -p ..." chain when a daemon needs the directories to exist before it starts: empty-dir mode is idempotent, drift-detectable (the daemon won't recreate the dir if path is missing on the host), and the owner/group/mode become declarative.
system_ufw_rule
A single ufw allow/deny rule. Idempotent via ufw itself — adding the same rule twice is harmless.
resource "system_ufw_rule" "allow-ssh" {
host = host.primary.addr
port = "22/tcp"
rule = "allow"
}
| attr | required | type | default | description |
|---|---|---|---|---|
| host | yes | string | — | SSH target. |
| port | yes | string | — | Port specifier, e.g. 22/tcp, 443/tcp, 8080. Passed verbatim to ufw. |
| rule | yes | string | — | allow or deny. |
Runs ufw <rule> <port> on create/update. Delete runs ufw delete <rule> <port> best-effort.
Stored state: { host, port, rule }. read returns Observed::Unknown("system_ufw_rule read not implemented (ufw status parsing punted)") — ufw rules always show up as unreadable in drift summaries. This is intentional for v1.
Lockout warning: systemctl start ufw does NOT activate the firewall ruleset. The ruleset becomes active only when you run ufw --force enable (typically via an ssh_exec resource, after the 22/tcp allow rule is in state). If you later remove the 22/tcp allow rule with ufw enabled while you're sshing in, you'll lock yourself out. See the bootstrap tutorial for the recommended ordering.
Apply trace
Each resource logs one line to stderr:
[system] PACKAGE `docker.io` present on root@192.0.2.10
[system] SERVICE `docker` enabled=true state=started on root@192.0.2.10
[system] FILE `traefik-config` -> root@192.0.2.10:/etc/traefik/traefik.yml (188 bytes, mode=0644 root:root)
[system] FILE `traefik-config` -> root@192.0.2.10:/etc/traefik/traefik.yml unchanged (sha256 match)
[system] SECRET_FILE `firebase-sa` -> root@192.0.2.10:/etc/app/firebase-sa.json (1842 bytes, mode=0400 root:root)
[system] DIR `site` -> root@192.0.2.10:/srv/site (42 files, mode=0644 dir_mode=0755 root:root)
[system] DIR `site` -> root@192.0.2.10:/srv/site unchanged (42 files, manifest match)
[system] UFW allow 22/tcp on root@192.0.2.10
Notes
- apt-get update is memoized once per (process, host). A separate stratum invocation re-runs it.
- All shell quoting is internal (shell_quote helper). Package names with spaces are correctly quoted, but in practice apt names don't contain them.
- There is no system_user kind. Use ssh_exec if you need to create users.
ssh
Runs commands and writes files on a remote host. Shells out to the system ssh binary with -o BatchMode=yes -o StrictHostKeyChecking=accept-new, so authentication is whatever your ~/.ssh/config and ssh-agent provide. Passwords are not supported — keys only.
The provider takes no configuration block.
Kinds
ssh_exec
Runs a command on a remote host. Stratum re-runs the command whenever any attribute changes (typically command).
resource "ssh_exec" "uptime" {
host = host.prod.addr
command = "uptime"
}
resource "ssh_exec" "bootstrap" {
host = host.prod.addr
command = "mkdir -p /opt/stratum"
on_destroy = "rm -rf /opt/stratum"
}
| attr | required | type | default | description |
|---|---|---|---|---|
| host | yes | string | — | SSH target in user@host form. Passed verbatim to ssh. |
| command | yes | string | — | Shell command to run on the remote host. |
| env | no | map | none | Each entry becomes export KEY=VALUE; prepended to command. Values are shell-quoted. A secret.<name>.value ref is supported here (state stores a redaction marker, same as docker_container.env). Keys are sorted for deterministic command output. |
| on_destroy | no | string | none | Command to run when this resource is deleted from config. |
Stored state adds stdout, stderr, and exit_code from the last run.
The effective remote command is export K1=V1; export K2=V2; ...; <command> — a single shell invocation, not a separate environment. The env map is the right place for short, sensitive values (a GitHub PAT, a deploy token) that flow into a one-shot command without landing in a file. For values that need to persist on the host, use system_secret_file.
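To make the command shape concrete, here is a sketch of how such a prefix could be assembled. Both functions are hypothetical. In particular, shell_quote below uses standard POSIX single-quote escaping, which is an assumption: stratum's own shell_quote helper may quote differently.

```rust
use std::collections::BTreeMap;

/// Wrap a value in single quotes, escaping embedded single quotes as '\''.
/// Assumption: illustrative quoting, not necessarily stratum's.
fn shell_quote(s: &str) -> String {
    format!("'{}'", s.replace('\'', r"'\''"))
}

/// Build "export K1=V1; export K2=V2; ...; <command>".
/// A BTreeMap iterates in sorted key order, which gives the
/// deterministic output the attribute table calls for.
fn remote_command(env: &BTreeMap<String, String>, command: &str) -> String {
    let mut out = String::new();
    for (key, value) in env {
        out.push_str(&format!("export {}={}; ", key, shell_quote(value)));
    }
    out.push_str(command);
    out
}
```

With env = { ZETA = "last", ALPHA = "first" } and command = "uptime", this sketch yields export ALPHA='first'; export ZETA='last'; uptime. Keys are emitted unquoted here, so the sketch assumes they are valid shell identifiers.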
If the remote command exits non-zero, apply fails with the captured stderr.
When the resource is removed from config and on_destroy is set, the command is executed; otherwise delete is a no-op (with a log line).
Drift detection: read always returns Observed::Unknown("ssh_exec has no readable identity on the host"). ssh_exec resources show up in unreadable counts on stratum plan --refresh and after every stratum apply -y. That's intentional — a fire-and-forget shell command has no canonical "current state" to read back.
ssh_file
Writes a file to a remote path. The file is re-uploaded whenever its content changes, detected by comparing the content's sha256 against the hash recorded in prior state. Deletion runs rm -f on the recorded path.
resource "ssh_file" "motd" {
host = host.prod.addr
path = "/etc/motd"
content = "Managed by stratum.\n"
mode = "0644"
}
| attr | required | type | default | description |
|---|---|---|---|---|
| host | yes | string | — | SSH target in user@host form. |
| path | yes | string | — | Absolute destination path on the remote host. |
| content | yes | string | — | File contents, written verbatim. |
| mode | no | string | 0644 | File mode, passed to install -m. |
Upload uses install -m <mode> /dev/stdin <path> with the content streamed via stdin. Binary content is not officially supported (the value comes through the parser as a UTF-8 string).
Stored state: { host, path, mode, sha256 }. The hash is what stratum compares on the next plan to decide whether to re-upload.
For richer file management (owner, group, auto-created parent dirs, content_file = "..." inlining), use system_file instead.
Drift detection: read runs a single-roundtrip probe: if [ -e <path> ]; then printf '%s|' "$(stat -c %a <path>)"; sha256sum <path> | cut -d' ' -f1; else echo MISSING; fi. Returns Absent for missing files, Present { host, path, mode, sha256 } otherwise.
Apply behavior
Trace lines, one per resource:
[ssh] EXEC `uptime` on root@192.0.2.10
[ssh] FILE `motd` -> root@192.0.2.10:/etc/motd
[ssh] FILE `motd` on root@192.0.2.10 unchanged (sha256 match)
docker
Drives the docker CLI on a remote host over SSH. All operations are shell commands sent through ssh -o BatchMode=yes -o StrictHostKeyChecking=accept-new — there is no Docker API client.
The provider takes no configuration block.
Update strategy
docker_container updates recreate the container: docker rm -f <name>, then docker pull <image>, then docker run. There is no in-place update, so a change to any attribute triggers a recreate. Networks are reconciled by name; the create path is idempotent (docker network inspect ... || docker network create ...), so re-applying without changes is a no-op.
docker_image is a producer kind: it runs docker build on the host and records the resulting image_id. Drift on build_args / context / dockerfile / target triggers a rebuild. Drift on tag itself recreates (the new tag won't share an image ID with the old one). See docker_image below.
Kinds
docker_network
Ensures a user-defined network exists on the host.
resource "docker_network" "edge" {
host = host.primary.addr
name = "stratum-edge"
}
| attr | required | type | default | description |
|---|---|---|---|---|
| host | yes | string | — | SSH target. |
| name | no | string | resource name | Docker network name. Falls back to the resource label. |
| driver | no | string | bridge | Network driver passed to --driver. |
The create/update path runs docker network inspect <name> >/dev/null 2>&1 || docker network create --driver <driver> <name>, so applying twice is safe.
Delete runs docker network rm <name>.
Drift detection: read runs docker network inspect <name> --format '{{json .}}'. Returns Absent on No such network, otherwise Present { host, name, driver }. The inspect output is parsed for Name and Driver.
docker_container
A long-running container.
resource "docker_container" "traefik" {
host = host.primary.addr
name = "traefik"
image = "traefik:v3.1"
restart = "unless-stopped"
env = {
NODE_ENV = "production"
}
ports = ["80:80", "443:443"]
volumes = ["/var/run/docker.sock:/var/run/docker.sock:ro"]
networks = ["stratum-edge"]
labels = {
"traefik.enable" = "true"
}
command = "--api.dashboard=true"
}
| attr | required | type | default | description |
|---|---|---|---|---|
| host | yes | string | — | SSH target. |
| image | yes | string | — | Image reference. Pulled before run unless pull = false. |
| name | no | string | resource name | Container name (--name). Falls back to the resource label. |
| restart | no | string | unless-stopped | Passed to --restart. |
| pull | no | bool | true | If false, skip the docker pull step. Use for locally-built images that aren't in any registry (see pull = false below). |
| env | no | map | none | Each entry becomes -e KEY=VALUE. Non-string values are JSON-stringified. A value that resolves from secret.<name>.value flows through as a normal env var; state stores only a redaction marker. See Secrets. |
| ports | no | list of string | none | Each string becomes -p <spec>. Use the standard host:container[/proto]. |
| volumes | no | list of string | none | Each string becomes -v <spec>. |
| labels | no | map | none | Each entry becomes -l KEY=VALUE. Dotted keys (Traefik) must be quoted. |
| networks | no | list of string | none | Each string becomes --network <name>. |
| command | no | string or list of string | none | Appended after the image. String form is split on whitespace; list form preserves each element as one argv token (see command). |
| memory | no | string | none | Hard memory limit, passed to --memory. Same syntax as docker (256m, 1g). |
| memory_swap | no | string | none | Total memory + swap limit, passed to --memory-swap. See docker's docs for the swap-disable / swap-unlimited shorthands. |
| healthcheck | no | map | none | Lowered to --health-* flags on docker run. See healthcheck below. |
| depends_on | no | list of string | none | Resource addresses (<kind>.<name>) this container depends on. The planner topo-sorts before applying. See depends_on below. |
The constructed command is:
docker pull <image> >/dev/null; docker run -d --name <name> --restart <restart> [...flags...] <image> [<command tokens>]
On update it becomes:
docker rm -f <name> >/dev/null 2>&1 || true; docker pull <image> >/dev/null; docker run -d ...
Stored state preserves the input attrs verbatim and adds container_id (the last line of docker run's stdout, which is the container ID).
Delete runs docker rm -f <name>.
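The create and recreate lines above can be sketched as a small command builder. This is a minimal illustration, not stratum's actual code: `ContainerSpec`, `build_run_line`, and `traefik` are hypothetical names, only a few attributes are modeled, the flag order is an assumption, and shell escaping is left out.

```rust
/// Hypothetical, trimmed-down view of a docker_container's attributes.
struct ContainerSpec {
    name: String,
    image: String,
    restart: String,
    pull: bool,
    ports: Vec<String>,
    networks: Vec<String>,
}

/// Build the shell line for a create (`recreate = false`) or an update
/// (`recreate = true`). Shell escaping is omitted to keep the sketch short.
fn build_run_line(spec: &ContainerSpec, recreate: bool) -> String {
    let mut parts: Vec<String> = Vec::new();
    if recreate {
        parts.push(format!("docker rm -f {} >/dev/null 2>&1 || true", spec.name));
    }
    if spec.pull {
        parts.push(format!("docker pull {} >/dev/null", spec.image));
    }
    let mut run = format!("docker run -d --name {} --restart {}", spec.name, spec.restart);
    for port in &spec.ports {
        run.push_str(&format!(" -p {port}"));
    }
    for network in &spec.networks {
        run.push_str(&format!(" --network {network}"));
    }
    run.push_str(&format!(" {}", spec.image));
    parts.push(run);
    parts.join("; ")
}

/// Example spec mirroring part of the traefik resource shown earlier.
fn traefik(pull: bool) -> ContainerSpec {
    ContainerSpec {
        name: "traefik".into(),
        image: "traefik:v3.1".into(),
        restart: "unless-stopped".into(),
        pull,
        ports: vec!["80:80".into()],
        networks: vec!["stratum-edge".into()],
    }
}

fn main() {
    println!("{}", build_run_line(&traefik(true), false)); // create
    println!("{}", build_run_line(&traefik(false), true)); // recreate, pull = false
}
```

With `pull = false` the `docker pull` segment disappears from both shapes, which matches the behavior described for locally-built images.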
pull = false: locally-built images
When pull = false, the docker pull <image> >/dev/null; prefix is dropped from both the create and the recreate command. Use it when the image is built directly on the host (a sibling ssh_exec runs docker build -t myapp:dev ...) so there is no registry to pull from. With the default pull = true, docker pull myapp:dev errors and the create/update fails.
docker run itself still errors loudly if the image isn't present on the host, so a typo in image is caught at apply time — there's no silent fallback to a stale image.
resource "ssh_exec" "build-app" {
host = host.primary.addr
command = "cd /srv/repos/app && docker build -t app:dev -f Dockerfile ."
}
resource "docker_container" "app" {
host = host.primary.addr
image = "app:dev"
pull = false
# ...
}
command: string or argv list
# String form — split on whitespace at apply time.
command = "node server.js --port 4000"
# List form — each element is one argv token. Spaces inside an element are
# preserved (the third element below is a single shell line).
command = ["sh", "-c", "redis-server --requirepass ${secret.redis_pw.value}"]
Use the list form when an argument contains whitespace, embedded quotes, or other shell metacharacters. Each list element is shell-escaped independently when the run command crosses the SSH boundary, so spaces inside one element do not split it into two arguments.
A non-string, non-list value (a number, a bool, a map) errors at apply time.
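The difference between the two forms can be illustrated with a short sketch. `CommandAttr`, `to_argv`, and `shell_escape` are hypothetical names, and the single-quote escaping shown is one common POSIX approach; the source does not say which escaping scheme stratum uses.

```rust
/// Hypothetical stand-in for the parsed `command` attribute.
enum CommandAttr {
    Str(String),
    List(Vec<String>),
}

/// String form splits on whitespace; list form keeps each element intact.
fn to_argv(cmd: &CommandAttr) -> Vec<String> {
    match cmd {
        CommandAttr::Str(s) => s.split_whitespace().map(String::from).collect(),
        CommandAttr::List(items) => items.clone(),
    }
}

/// POSIX single-quote escaping: wrap in '...' and rewrite each embedded ' as '\''.
fn shell_escape(token: &str) -> String {
    format!("'{}'", token.replace('\'', r"'\''"))
}

fn main() {
    let string_form = CommandAttr::Str("node server.js --port 4000".into());
    let list_form = CommandAttr::List(vec![
        "sh".into(),
        "-c".into(),
        "redis-server --requirepass hunter2".into(),
    ]);
    for cmd in [&string_form, &list_form] {
        let escaped: Vec<String> = to_argv(cmd).iter().map(|t| shell_escape(t)).collect();
        println!("{}", escaped.join(" "));
    }
}
```

Because each token is quoted on its own, the third list element survives the SSH hop as a single argument.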
healthcheck
A map lowered to --health-* flags on docker run. test is the only required field; everything else has a docker-side default or is sensible to omit.
resource "docker_container" "cache" {
host = host.primary.addr
image = "redis:7-alpine"
healthcheck = {
test = "redis-cli ping | grep -q PONG"
interval = "5s"
retries = 5
timeout = "3s"
start_period = "10s"
}
}
| field | required | type | lowering | default |
|---|---|---|---|---|
| test | yes | string | --health-cmd <value> | — |
| interval | no | string | --health-interval <v> | docker default (30s) |
| retries | no | number | --health-retries <v> | docker default (3) |
| timeout | no | string | --health-timeout <v> | 30s |
| start_period | no | string | --health-start-period <v> | 0s |
Declaring healthcheck opts the container into the post-apply readiness wait: apply will not move on to dependent steps until docker inspect --format '{{.State.Health.Status}}' reports healthy, with a 60s budget.
A healthcheck map without a test field is a hard error at apply time:
`<name>` healthcheck missing required field `test`
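The lowering in the table can be sketched as follows. `Healthcheck` and `lower_healthcheck` are hypothetical names, the flag order is an assumption, and optional fields that are not set are simply not emitted so docker's defaults apply.

```rust
/// Hypothetical typed view of the healthcheck map.
#[derive(Default)]
struct Healthcheck {
    test: Option<String>,
    interval: Option<String>,
    retries: Option<u32>,
    timeout: Option<String>,
    start_period: Option<String>,
}

/// Lower the map to `--health-*` argv tokens, or fail when `test` is missing.
fn lower_healthcheck(name: &str, hc: &Healthcheck) -> Result<Vec<String>, String> {
    let test = hc
        .test
        .as_ref()
        .ok_or_else(|| format!("`{name}` healthcheck missing required field `test`"))?;
    let mut flags = vec!["--health-cmd".to_string(), test.clone()];
    let optional = [
        ("--health-interval", hc.interval.clone()),
        ("--health-retries", hc.retries.map(|n| n.to_string())),
        ("--health-timeout", hc.timeout.clone()),
        ("--health-start-period", hc.start_period.clone()),
    ];
    for (flag, value) in optional {
        if let Some(v) = value {
            flags.push(flag.to_string());
            flags.push(v);
        }
    }
    Ok(flags)
}

fn main() {
    let hc = Healthcheck {
        test: Some("redis-cli ping | grep -q PONG".into()),
        interval: Some("5s".into()),
        retries: Some(5),
        ..Default::default()
    };
    println!("{:?}", lower_healthcheck("cache", &hc));
    println!("{:?}", lower_healthcheck("cache", &Healthcheck::default()));
}
```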
depends_on and the post-apply wait
resource "docker_container" "api" {
host = host.primary.addr
image = "api:dev"
depends_on = ["docker_container.db", "docker_container.cache"]
}
Each entry must be a <kind>.<name> resource address. The planner uses these edges in three places:
- Topo sort of create / update steps. api is reordered after both of its dependencies, regardless of file order. The sort is stable: resources with no edges keep their relative input order, and implicit _stratum_* resources (per-host swap, sshd OOM tuning) fall through with in_degree = 0 and stay at the front.
- Cycle detection. A cycle is a hard error at plan time, with the cycle path in the message.
- Reverse-topo delete order. When api and db are both removed from config, api is deleted first (reverse topo over the state-resident edges): the dependent goes down before its dependency.
A missing reference is a hard error at plan time, naming both addresses:
depends_on edge: `docker_container.api` references unknown resource `docker_container.db`
depends_on edges only resolve within a single namespace. If a producer and consumer end up in different namespaces, duplicate the producer into the consumer's namespace — see Cross-namespace depends_on.
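The ordering rules for create / update steps can be sketched with Kahn's algorithm, always picking the ready resource with the lowest input index so that unrelated resources keep their input order. This is an illustration under assumptions: `topo_order` is a hypothetical name, and the cycle error lists the stuck resources rather than the exact cycle path the real planner reports.

```rust
use std::collections::{BTreeSet, HashMap};

/// Order resources so every dependency precedes its dependents.
/// Input: (address, depends_on addresses) pairs in file order.
fn topo_order(resources: &[(&str, Vec<&str>)]) -> Result<Vec<String>, String> {
    let index: HashMap<&str, usize> = resources
        .iter()
        .enumerate()
        .map(|(i, (addr, _))| (*addr, i))
        .collect();
    let mut in_degree = vec![0usize; resources.len()];
    let mut dependents: Vec<Vec<usize>> = vec![Vec::new(); resources.len()];
    for (i, (addr, deps)) in resources.iter().enumerate() {
        for dep in deps {
            let j = *index.get(dep).ok_or_else(|| {
                format!("depends_on edge: `{addr}` references unknown resource `{dep}`")
            })?;
            dependents[j].push(i);
            in_degree[i] += 1;
        }
    }
    // Ready set ordered by input index keeps the sort stable.
    let mut ready: BTreeSet<usize> =
        (0..resources.len()).filter(|&i| in_degree[i] == 0).collect();
    let mut order = Vec::new();
    while let Some(i) = ready.pop_first() {
        order.push(resources[i].0.to_string());
        for &d in &dependents[i] {
            in_degree[d] -= 1;
            if in_degree[d] == 0 {
                ready.insert(d);
            }
        }
    }
    if order.len() != resources.len() {
        let stuck: Vec<&str> = resources
            .iter()
            .enumerate()
            .filter(|(i, _)| in_degree[*i] > 0)
            .map(|(_, (addr, _))| *addr)
            .collect();
        return Err(format!("depends_on cycle among: {}", stuck.join(", ")));
    }
    Ok(order)
}

fn main() {
    let plan = [
        ("docker_container.api", vec!["docker_container.db", "docker_container.cache"]),
        ("docker_container.db", vec![]),
        ("docker_container.cache", vec![]),
    ];
    println!("{:?}", topo_order(&plan));
}
```

Reversing the resulting order gives a delete order in which each dependent is removed before the resources it depends on.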
Post-apply readiness wait. After every successful docker_container create or update, the planner pauses before moving on:
- If healthcheck is declared, it polls docker inspect --format '{{.State.Health.Status}}' <name> once a second for up to 60s, waiting for healthy. A status of unhealthy or a timeout fails the apply with a named error. An empty status (no healthcheck configured at the docker level) is treated as ready immediately.
- Otherwise, a cosmetic 500ms pause gives docker time to wire networks and volumes before the next step pokes the container.
This wait is what makes depends_on actually useful — a dependent container starts against a healthy dependency, not just a running one.
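The polling rules for the healthcheck case can be sketched as a loop over an injected probe. Everything here is illustrative: `wait_until_ready` and `Ready` are hypothetical names, the error strings are made up, and the probe closure stands in for running docker inspect over SSH.

```rust
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum Ready {
    Healthy,
    NoHealthcheck,
}

/// Poll `probe` up to `budget` times, sleeping `interval` between attempts.
/// With a 1s interval and a budget of 60 this approximates the 60s window.
fn wait_until_ready(
    name: &str,
    budget: u32,
    interval: Duration,
    mut probe: impl FnMut() -> String,
) -> Result<Ready, String> {
    for _ in 0..budget {
        match probe().trim() {
            "healthy" => return Ok(Ready::Healthy),
            // Empty status: no healthcheck at the docker level, ready immediately.
            "" => return Ok(Ready::NoHealthcheck),
            "unhealthy" => return Err(format!("container `{name}` reported unhealthy")),
            // e.g. "starting": wait and poll again.
            _ => std::thread::sleep(interval),
        }
    }
    Err(format!("container `{name}` not healthy after {budget} polls"))
}

fn main() {
    // Canned statuses instead of a real `docker inspect` call.
    let mut statuses = vec!["starting", "starting", "healthy"].into_iter();
    let result = wait_until_ready("cache", 60, Duration::ZERO, || {
        statuses.next().unwrap_or("starting").to_string()
    });
    println!("{result:?}");
}
```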
Drift detection: read runs docker inspect <name> --format '{{json .}}'. Returns Absent on No such object / No such container, otherwise Present. The parsed shape is { host, name, image, restart, labels, networks, container_id }.
Two normalization rules in parse_container_inspect keep observed labels from being noisy:
- All com.docker.* labels (compose metadata, etc.) are dropped.
- Remaining labels are intersected with the state's label key set, so image-baked LABELs and daemon-injected labels don't surface as drift.
Networks come from NetworkSettings.Networks, sorted lexicographically. (diff_observed treats string arrays as sets, so order changes don't drift either.)
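A minimal sketch of the two label rules, assuming labels are plain string maps. `normalize_labels` is a hypothetical helper for illustration, not the real `parse_container_inspect`.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Drop `com.docker.*` labels, then keep only keys that state already tracks.
fn normalize_labels(
    observed: &BTreeMap<String, String>,
    state_keys: &BTreeSet<String>,
) -> BTreeMap<String, String> {
    observed
        .iter()
        .filter(|(key, _)| !key.starts_with("com.docker."))
        .filter(|(key, _)| state_keys.contains(*key))
        .map(|(key, value)| (key.clone(), value.clone()))
        .collect()
}

fn main() {
    let observed: BTreeMap<String, String> = [
        ("traefik.enable", "true"),
        ("com.docker.compose.project", "edge"),
        ("org.opencontainers.image.title", "Traefik"),
    ]
    .into_iter()
    .map(|(k, v)| (k.to_string(), v.to_string()))
    .collect();
    let state_keys: BTreeSet<String> = ["traefik.enable".to_string()].into_iter().collect();
    println!("{:?}", normalize_labels(&observed, &state_keys));
}
```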
docker_image
Builds an image on a remote host from a context directory. Producer kind: state captures the resulting image_id so drift can be detected when the image is rebuilt or removed underneath stratum.
resource "docker_image" "api" {
host = host.primary.addr
context = "/srv/repos/api"
dockerfile = "Dockerfile"
target = "runner"
tag = "api:dev"
pull_base = true
build_args = {
NODE_ENV = "production"
API_URL = "https://api.example.com"
}
}
| attr | required | type | default | description |
|---|---|---|---|---|
| host | yes | string | — | SSH target. The build runs on this host's docker daemon. |
| tag | yes | string | — | Image tag (-t) and the lookup key for read. |
| context | yes | string | — | Absolute path on the host containing the Dockerfile and build context. Stratum cds into it before docker build. |
| dockerfile | no | string | Dockerfile | Dockerfile filename, passed to -f. |
| target | no | string | none | Multi-stage build target, passed to --target. |
| build_args | no | map | none | Each entry becomes --build-arg KEY=VALUE. Keys are sorted alphabetically for deterministic command output. |
| pull_base | no | bool | false | If true, passes --pull so docker re-pulls the base image instead of reusing a cached one. |
The build line is:
cd <context> && DOCKER_BUILDKIT=1 docker build [--target T] [--build-arg K=V ...] [--pull] -t <tag> -f <dockerfile> .
DOCKER_BUILDKIT=1 is always set. The host must have the buildx plugin available (docker buildx).
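The build line template can be sketched as a small builder. `ImageSpec` and `build_line` are hypothetical names; a `BTreeMap` is used here so that build args come out in sorted key order, and shell escaping is omitted for brevity.

```rust
use std::collections::BTreeMap;

/// Hypothetical typed view of a docker_image resource.
struct ImageSpec {
    context: String,
    dockerfile: String,
    target: Option<String>,
    tag: String,
    pull_base: bool,
    build_args: BTreeMap<String, String>,
}

/// Assemble the build command in the documented flag order.
fn build_line(spec: &ImageSpec) -> String {
    let mut cmd = format!("cd {} && DOCKER_BUILDKIT=1 docker build", spec.context);
    if let Some(target) = &spec.target {
        cmd.push_str(&format!(" --target {target}"));
    }
    // BTreeMap iterates in key order, so the output is deterministic.
    for (key, value) in &spec.build_args {
        cmd.push_str(&format!(" --build-arg {key}={value}"));
    }
    if spec.pull_base {
        cmd.push_str(" --pull");
    }
    cmd.push_str(&format!(" -t {} -f {} .", spec.tag, spec.dockerfile));
    cmd
}

/// Example spec mirroring the docker_image resource shown earlier.
fn example_spec() -> ImageSpec {
    let mut build_args = BTreeMap::new();
    build_args.insert("NODE_ENV".to_string(), "production".to_string());
    build_args.insert("API_URL".to_string(), "https://api.example.com".to_string());
    ImageSpec {
        context: "/srv/repos/api".into(),
        dockerfile: "Dockerfile".into(),
        target: Some("runner".into()),
        tag: "api:dev".into(),
        pull_base: true,
        build_args,
    }
}

fn main() {
    println!("{}", build_line(&example_spec()));
}
```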
After a successful build, stratum runs docker images --no-trunc --format '{{.ID}}' <tag> and records the full image ID as both image_id and id in state (the id alias is reserved for future resource-attr refs of the form docker_image.X.id).
Update strategy. docker_image does not have a separate update path — both create and update run the same build. The desired-vs-prior diff drives whether the build runs at all: a change to any of context, dockerfile, target, build_args, pull_base, or tag produces an Update step that re-runs docker build. A drift-detected change to image_id (image deleted or rebuilt out of band) also re-runs the build.
Drift detection: read runs docker images --no-trunc --format '{{.ID}}' <tag>. Empty output → Absent. Otherwise Present { host, tag, image_id, id, ...prior fields }. The build-time fields (context, dockerfile, target, build_args, pull_base) are echoed forward from prior state — docker doesn't preserve them post-build, and re-querying them is impossible. Drift on those is detected at plan time via desired-vs-prior diff, not by read.
Delete runs docker rmi <tag> best-effort.
Apply behavior
Trace lines per resource:
[docker] NETWORK `stratum-edge` on root@192.0.2.10
[docker] IMAGE `api:dev` on root@192.0.2.10 (context=/srv/repos/api)
[docker] CONTAINER `traefik` on root@192.0.2.10 (create)
[docker] CONTAINER `traefik` on root@192.0.2.10 (recreate)
git
Owns a git working tree on a remote host. Shells out to the system git binary over SSH — there is no libgit2 dependency. The provider takes no configuration block.
Kinds
git_repo
Clones a remote repository to a fixed path and keeps it pinned to a given ref. Producer kind: state captures the resolved commit_sha, which is what other steps (a docker_image rebuild, a ssh_exec post-deploy) can observe drift against.
resource "git_repo" "app" {
host = host.primary.addr
path = "/srv/app"
url = "https://github.com/example/app.git"
ref = "main"
}
| attr | required | type | default | description |
|---|---|---|---|---|
| host | yes | string | — | SSH target. |
| path | yes | string | — | Absolute destination path on the remote host. The directory is created by git clone. |
| url | yes | string | — | Remote URL. Passed verbatim to git clone. Authentication uses whatever's on the host (~/.git-credentials, deploy key in ~/.ssh, etc.); stratum does not inject credentials. |
| ref | yes | string | — | A branch name, a tag, or a full 40-character lowercase hex SHA. See Refs below for the dispatch rules. |
| depth | no | number | none | Shallow-clone depth, passed to --depth. Only honored when ref is a branch or tag; full SHAs always clone full (--depth + bare SHA is fragile). |
State shape
{
"host": "root@192.0.2.10",
"path": "/srv/app",
"url": "https://github.com/example/app.git",
"ref": "main",
"depth": null,
"commit_sha": "bfb77a6c1d808e04..."
}
commit_sha is the output of git rev-parse HEAD after the create / update. Stratum re-reads this on --refresh; a mismatch between the recorded SHA and what ref currently resolves to on the remote becomes drift, which update resolves with a fetch + reset.
Refs
The ref value dispatches at command-build time:
- Branch or tag (anything not matching a full SHA shape) → git clone --branch <ref> [--depth N] <url> <path>. Updates run git fetch origin && git reset --hard origin/<ref>.
- Full 40-character hex SHA → git clone <url> <path> && git -C <path> checkout <sha>. The --branch flag does not accept bare SHAs, so the clone-then-checkout shape is required. --depth is intentionally ignored for SHA refs. Updates run git fetch origin && git checkout <sha>.
A short SHA (bfb77a6) is treated as a branch name — git will reject it as a missing branch at clone time. Use the full 40-char form if you need SHA-pinned behavior.
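The dispatch rule can be sketched in a few lines. `is_full_sha` and `clone_command` are hypothetical names, and the SHA check assumes lowercase hex as stated in the attribute table.

```rust
/// A ref is SHA-shaped only if it is exactly 40 lowercase hex characters.
fn is_full_sha(git_ref: &str) -> bool {
    git_ref.len() == 40 && git_ref.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

/// Pick the clone shape for a ref; `depth` only applies to branches and tags.
fn clone_command(url: &str, path: &str, git_ref: &str, depth: Option<u32>) -> String {
    if is_full_sha(git_ref) {
        format!("git clone {url} {path} && git -C {path} checkout {git_ref}")
    } else {
        let depth_flag = depth.map(|n| format!(" --depth {n}")).unwrap_or_default();
        format!("git clone --branch {git_ref}{depth_flag} {url} {path}")
    }
}

fn main() {
    let url = "https://github.com/example/app.git";
    println!("{}", clone_command(url, "/srv/app", "main", Some(1)));
    // A short SHA falls into the branch path, as described above.
    println!("{}", clone_command(url, "/srv/app", "bfb77a6", None));
    println!("{}", clone_command(url, "/srv/app", &"a".repeat(40), Some(1)));
}
```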
Recreate vs in-place update
| change | action |
|---|---|
| url differs | rm -rf <path> then re-clone. State preserves the new url + sha. |
| path differs | rm -rf <old-path> then re-clone at the new path. |
| ref / depth only | git fetch origin + git reset --hard origin/<ref> (or git checkout <sha>). |
There is no git fetch --depth shrink / expand path — depth changes alone do not trigger any work today.
Drift detection
read runs a single-roundtrip probe:
if [ ! -d <path>/.git ]; then echo MISSING;
else printf '%s|' "$(git -C <path> rev-parse HEAD)";
git -C <path> remote get-url origin;
fi
- <path> or <path>/.git missing → Absent.
- Probe output <sha>|<origin_url> → Present { host, path, url, ref, depth, commit_sha }. The url in the observed value is the actual remote URL; if it differs from the desired url, the plan flags drift on url and the next apply will re-clone.
- Anything else → Unknown with the unparsed output.
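The three outcomes above can be sketched as a parser over the probe's stdout. This is a simplified illustration: the `Observed` enum here carries only the two probed fields, whereas the real `Present` value also echoes host, path, ref, and depth.

```rust
/// Simplified stand-in for the provider's observation result.
#[derive(Debug, PartialEq)]
enum Observed {
    Present { commit_sha: String, url: String },
    Absent,
    Unknown(String),
}

/// Classify the probe output: MISSING, `<sha>|<origin_url>`, or anything else.
fn parse_probe(output: &str) -> Observed {
    let out = output.trim();
    if out == "MISSING" {
        return Observed::Absent;
    }
    match out.split_once('|') {
        Some((sha, url)) if !sha.is_empty() && !url.trim().is_empty() => Observed::Present {
            commit_sha: sha.to_string(),
            url: url.trim().to_string(),
        },
        _ => Observed::Unknown(out.to_string()),
    }
}

fn main() {
    println!("{:?}", parse_probe("MISSING\n"));
    println!("{:?}", parse_probe("bfb77a6c1d808e04|https://github.com/example/app.git\n"));
    println!("{:?}", parse_probe("fatal: not a git repository"));
}
```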
Apply trace
[git] CLONE git_repo `app` -> root@192.0.2.10:/srv/app (ref=main)
[git] UPDATE git_repo `app` (root@192.0.2.10:/srv/app) ref=main
[git] DELETE git_repo `app` (root@192.0.2.10:/srv/app)
Notes
- Stratum does not manage SSH host keys for git remotes. If the remote is a private git server reached over SSH, the host running stratum's ssh (the remote host, not your laptop) needs to have already accepted the remote's host key.
- Credentials live on the host. For GitHub HTTPS, the common pattern is to drop ~/.git-credentials via a system_secret_file sourcing a personal access token from secret { from_env = "GH_PAT" }.
- git_repo does not run anything inside the cloned tree (no npm install, no docker build). Pair it with a docker_image (using the clone path as context) or an ssh_exec for post-clone work.
CLI
Single binary: stratum. All commands operate against one or more config files (default stratum.strat) and a state file (default .stratum/state.json).
stratum <COMMAND>
| command | purpose |
|---|---|
| plan | Print the diff between config and state. |
| apply | Execute the plan and save state. |
| status | Print per-host resource usage and per-container stats. |
| state list | List resources currently tracked in state. |
| state show | Print one resource's full state as JSON. |
| state merge | Merge two or more state files into one. |
Global flags
These flags work on any subcommand (clap's global = true).
| flag | short | default | description |
|---|---|---|---|
| --env-file <PATH> | | none | Load env vars from a .env-style file before resolving secret { from_env } refs. Repeatable to layer multiple files. See --env-file and .env auto-load below. |
| --namespace <NAME> | -n | none | Operate within the named namespace declared in the manifest. Resolves -c from the namespace's configs = [...] and -s to .stratum/<name>.json. See Namespace mode below. |
| --manifest <PATH> | | auto | Override the manifest path used by -n. Default: ./stratum.strat if it exists. Only consulted when -n is set. |
Namespace mode (-n and --manifest)
A namespace is a deployable slice declared in a top-level manifest. When -n NAME is set, the CLI:
- Locates the manifest. If --manifest PATH is set, that path is used. Otherwise ./stratum.strat is required to exist; if it doesn't, the command errors out (requires a manifest, but ./stratum.strat does not exist).
- Loads the manifest and looks up the named namespace. If it isn't declared, the error names every known namespace from the manifest.
- Resolves -c from the namespace's configs = [...]. The manifest itself is always the first file in the merged set, so its host / secret / provider blocks are visible to every per-namespace config.
- Resolves -s to (in priority order) the explicit -s on the command line, the namespace's body-level state = "...", or .stratum/<name>.json.
stratum -n infra plan # uses ./stratum.strat, configs from `namespace "infra"`
stratum -n app apply -y # state at .stratum/app.json
stratum --manifest deploy/manifest.strat -n web apply -y
-n and -c are mutually exclusive. Passing both errors out — the namespace's configs = [...] is the config list, and a -c override would silently shadow the manifest's intent. Drop the -n to operate in bundle mode, or remove the -c to let the namespace pick.
-s is not mutually exclusive with -n. The CLI default is .stratum/state.json; the namespace's default is .stratum/<name>.json. If you want to point a namespace's apply at a custom state file (typically during migration from bundle mode), pass -s explicitly and it wins.
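The state-path priority order can be written down as a small function. `resolve_state_path` is a hypothetical name; the three inputs model the explicit `-s`, the `-n` namespace name, and the namespace's body-level `state = "..."`.

```rust
/// Resolve the state file: explicit -s first, then the namespace's own
/// `state`, then the per-namespace default, then the bundle-mode default.
fn resolve_state_path(
    cli_state: Option<&str>,
    namespace: Option<&str>,
    namespace_state: Option<&str>,
) -> String {
    if let Some(path) = cli_state {
        return path.to_string();
    }
    match namespace {
        Some(name) => namespace_state
            .map(String::from)
            .unwrap_or_else(|| format!(".stratum/{name}.json")),
        None => ".stratum/state.json".to_string(),
    }
}

fn main() {
    println!("{}", resolve_state_path(None, Some("app"), None));
    println!("{}", resolve_state_path(Some(".stratum/host.json"), Some("app"), None));
    println!("{}", resolve_state_path(None, None, None));
}
```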
Split state
In namespace mode, state writes to two files instead of one:
- <state_path> (e.g. .stratum/<name>.json): every user-declared resource.
- <state_path's directory>/_shared.json: every implicit per-host _stratum_* resource (auto-injected swap, sshd OOM tuning, sshd reload).
The shared file is what lets multiple namespaces target the same host without each one trying to recreate the tuning resources. The first namespace's apply creates them; subsequent namespaces see them as no-op. See Architecture: split state for the merge rules.
Cross-namespace conflict checks
Before classification, plan and apply in namespace mode re-load every sibling namespace's configs (with unresolved secrets tolerated) and walk every docker_container they declare. Two collision classes are caught:
- Host port: two namespaces declare a docker_container binding the same (host, host_port). The check parses H:C and IP:H:C shapes; ranges and bare-port-random forms are skipped.
- Container name: two namespaces declare a docker_container with the same (host, name). The default for name is the resource's label.
Both error messages name the offending resource and the sibling that already claims the port or name:
cross-namespace port conflict on host `root@192.0.2.10`:
- app::docker_container.web (current) wants 80
- infra::docker_container.traefik (sibling) already claims it
The check runs at plan time and is skipped entirely in bundle mode (no -n).
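The host-port half of the check can be sketched as a parser plus a first-claim-wins map. `host_port` and `first_conflict` are hypothetical names, each claim is assumed to come from a different namespace, the message format is simplified, and bracketed IPv6 addresses are not handled.

```rust
use std::collections::HashMap;

/// Extract the host port from `H:C` or `IP:H:C` (with optional `/proto`).
/// Ranges and bare container ports yield `None` and are skipped.
fn host_port(spec: &str) -> Option<u16> {
    let spec = spec.split('/').next().unwrap_or(spec);
    let parts: Vec<&str> = spec.split(':').collect();
    let host = match parts.as_slice() {
        [h, _container] => h,
        [_ip, h, _container] => h,
        _ => return None,
    };
    host.parse().ok()
}

/// Claims are (owner address, host, port spec); returns the first collision.
fn first_conflict(claims: &[(&str, &str, &str)]) -> Option<String> {
    let mut seen: HashMap<(String, u16), String> = HashMap::new();
    for (owner, host, spec) in claims {
        let Some(port) = host_port(spec) else { continue };
        if let Some(previous) = seen.insert((host.to_string(), port), owner.to_string()) {
            return Some(format!("{owner} wants {port} on {host}; {previous} already claims it"));
        }
    }
    None
}

fn main() {
    let claims = [
        ("infra::docker_container.traefik", "root@192.0.2.10", "80:80"),
        ("app::docker_container.web", "root@192.0.2.10", "80:3000"),
    ];
    println!("{:?}", first_conflict(&claims));
}
```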
--env-file and .env auto-load
secret { from_env = "X" } reads std::env::var("X") at config-load time. To avoid having to export X=... (and to keep credentials out of shell history), stratum can load env files before resolving secrets.
stratum --env-file .env.prod plan -c bootstrap.strat
stratum --env-file base.env --env-file overrides.env apply -y
Rules:
- Auto-load. If --env-file is not passed at all, stratum auto-loads ./.env if it exists. Auto-load is silent on miss: no error if there's no .env.
- Explicit list. If --env-file is passed one or more times, only those files are loaded. The auto-.env is not consulted. Every listed path must exist and parse, or the command errors before doing any work.
- Process env wins. A variable already set in the process environment is never overwritten by a file. This is the 12-factor rule: FOO=x stratum apply keeps FOO=x regardless of what .env says.
- First-set wins among files. When --env-file a --env-file b is passed, a is parsed first; a variable set in a is not overwritten by b.

The CLI prints [env] loaded <path> to stderr for every file successfully loaded.
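The two precedence rules (process env wins, then first-set wins among files) reduce to "insert only if absent". The sketch below works on in-memory text rather than real files; `parse_env` and `layer_env` are hypothetical names, and the parser handles only bare KEY=VALUE lines and # comments, which is an assumption about the file format.

```rust
use std::collections::BTreeMap;

/// Parse KEY=VALUE lines, skipping blanks and # comments.
fn parse_env(text: &str) -> Vec<(String, String)> {
    text.lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| line.split_once('='))
        .map(|(key, value)| (key.trim().to_string(), value.trim().to_string()))
        .collect()
}

/// Layer file contents over the process env without ever overwriting a key.
fn layer_env(process_env: &BTreeMap<String, String>, files: &[&str]) -> BTreeMap<String, String> {
    let mut merged = process_env.clone();
    for text in files {
        for (key, value) in parse_env(text) {
            // Earlier sources win: the process env first, then files in order.
            merged.entry(key).or_insert(value);
        }
    }
    merged
}

fn main() {
    let mut process_env = BTreeMap::new();
    process_env.insert("FOO".to_string(), "x".to_string());
    let merged = layer_env(&process_env, &["# a\nFOO=y\nBAR=1\n", "BAR=2\nBAZ=3\n"]);
    println!("{merged:?}");
}
```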
Useful with secret { from_env = ... }:
# .env
PG_PASSWORD=correcthorsebatterystaple
GH_PAT=ghp_...
secret "pg_password" { from_env = "PG_PASSWORD" }
secret "gh_pat" { from_env = "GH_PAT" }
The .env file is not committed (add it to .gitignore). For team workflows, check in a .env.example with placeholder values instead.
plan
Print what would change if applied. Read-only — never writes state. By default never contacts a host either; pass --refresh to query live state.
stratum plan
stratum plan --refresh
stratum plan -c infra.strat -c app.strat -s .stratum/host.json
stratum plan --allow-unresolved-secrets
stratum -n app plan # namespace mode
| flag | short | default | description |
|---|---|---|---|
| --config <PATH> | -c | stratum.strat | Path to a .strat config file. Repeatable: every -c file is merged into one document and evaluated together. See Multi-file configs. |
| --state <PATH> | -s | .stratum/state.json | Path to the JSON state file. |
| --refresh | | off | Query live hosts via Provider::read and annotate drift between state and reality. |
| --allow-unresolved-secrets | | off | If a secret block's source is missing (env unset, file unreadable), substitute a <unresolved-secret:NAME> placeholder instead of failing. Plan-only: apply refuses any plan containing placeholders. See Secrets: --allow-unresolved-secrets. |
Output is a list of resources prefixed with an action symbol:
| symbol | action |
|---|---|
| (space) | no change |
| + | create |
| ~ | update (with field-by-field diff) |
| - | delete |
Followed by a summary line: N create, N update, N delete, N no-op.
Secret rendering
A leaf value that is a secret marker (in either prior or desired) prints as <secret:NAME sha:abc123>, where abc123 is the first six hex chars of the SHA-256 of the plaintext. Enough to spot a rotation; not enough to attack offline. See Secrets.
~ docker_container.stratum-postgres
~ env.POSTGRES_PASSWORD: <secret:pg_password sha:f7c3bc> -> <secret:pg_password sha:9a1e44>
--refresh: live drift detection
When --refresh is set, stratum calls each provider's read method for every step that has prior state (i.e. not Create), compares the result against what's recorded, and prints per-resource annotations:
docker_container.traefik
! DRIFT: image: state="traefik:v2.11" observed="traefik:v3.0"
system_service.docker
! DRIFT: resource missing on host (state says exists)
A footer line summarizes:
drift: clean
Or, when there's drift:
drift: 2 differ, 1 missing, 4 unreadable
- differ: read returned data that doesn't match state on at least one field.
- missing: read returned Absent (the resource is gone on the host but still in state).
- unreadable: the provider returned Observed::Unknown (no read impl, or read errored). For system_ufw_rule and ssh_exec, this is always the case by design.
Drift detection is one-sided: fields that exist in state but not in the observed data are ignored (providers don't surface every field). It is also marker-aware — a state-side secret marker is hashed against an observed plaintext, so secret-bearing fields don't perpetually drift on every refresh. See Architecture: drift detection and Secrets: drift detection.
Create steps are skipped during refresh — there's no prior state to compare.
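The one-sided comparison can be sketched with flat string maps. This is a deliberate simplification: the real diff is recursive over JSON values and marker-aware for secrets, and `one_sided_diff` is a hypothetical name. How fields present only in the observed data are treated is not stated above; this sketch skips them.

```rust
use std::collections::BTreeMap;

/// Compare only fields present on both sides; return (field, state, observed)
/// for each mismatch. State-only fields never produce drift.
fn one_sided_diff(
    state: &BTreeMap<String, String>,
    observed: &BTreeMap<String, String>,
) -> Vec<(String, String, String)> {
    observed
        .iter()
        .filter_map(|(field, seen)| {
            let recorded = state.get(field)?;
            (recorded != seen).then(|| (field.clone(), recorded.clone(), seen.clone()))
        })
        .collect()
}

fn to_map(pairs: &[(&str, &str)]) -> BTreeMap<String, String> {
    pairs.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect()
}

fn main() {
    let state = to_map(&[("image", "traefik:v2.11"), ("restart", "unless-stopped"), ("command", "--api")]);
    let observed = to_map(&[("image", "traefik:v3.0"), ("restart", "unless-stopped")]);
    for (field, recorded, seen) in one_sided_diff(&state, &observed) {
        println!("! DRIFT: {field}: state=\"{recorded}\" observed=\"{seen}\"");
    }
}
```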
apply
Compute the plan, print it, and — if confirmed with -y — execute side effects against remote hosts.
stratum apply -y
stratum apply -y -c infra.strat -c app.strat -s .stratum/host.json
stratum apply -y --allow-destroy
stratum -n app apply -y # namespace mode
| flag | short | default | description |
|---|---|---|---|
| --config <PATH> | -c | stratum.strat | Path to a .strat config file. Repeatable, like plan -c. See Multi-file configs. |
| --state <PATH> | -s | .stratum/state.json | Path to the JSON state file. |
| --yes | -y | off | Skip the confirmation gate and execute. Without it, apply prints the plan and exits. |
| --allow-destroy | | off | Permit Delete steps. Required whenever the plan would remove a resource that's in state but absent from config. See --allow-destroy below. |
apply refuses to run a plan containing any <unresolved-secret:NAME> placeholder — those only appear via plan --allow-unresolved-secrets, and they're a plan-only construct. Resolve the secret's source and retry.
Confirmation gate: without -y, apply prints the plan and stops with Apply? Re-run with -y to execute against remote hosts. There is no interactive prompt.
When -y is set, you'll see this banner before the work starts:
!! Applying: side effects WILL execute on remote hosts.
State is written to --state after every successful apply.
Post-apply self-check
After a successful apply, stratum re-loads the config, rebuilds the plan against the freshly updated state, and runs refresh_plan against the live host. The result is one summary line:
post-apply drift: clean
Or, when something is off:
post-apply drift: 1 differ, 4 unreadable — run 'stratum plan --refresh' to see details
This catches resources that "applied successfully" according to the provider but don't actually match reality (rare, but possible — e.g. a systemd service that exits zero from start but immediately crashes). Unreadable counts are expected when your config includes system_ufw_rule or ssh_exec resources; both return Unknown from read by design.
If the plan was a no-op (Nothing to do.), no apply runs and no post-apply check happens.
--allow-destroy: the destruction guard
This flag exists because of a failure shape that's easy to hit by accident. Apply a config against a state file that holds resources the config doesn't declare — by forgetting a -c, by pointing -s at the wrong file, by reusing a bundle state for one slice of a larger deployment — and build_plan will emit a Delete step for every resource in state but not in config. Without a guard, apply executes those deletes, and a single mis-typed command tears down the whole host: containers, networks, services, packages, ufw rules, in roughly the order BTreeMap iteration of <kind>.<name> produces.
The fix is two-part. The structural fix depends on how you organize configs: one state file per host with multi-file configs, or one state file per namespace. The tactical fix is the destruction guard: if a plan contains any Delete step, apply refuses to run unless --allow-destroy is set.
refusing to apply: plan would delete 9 resources not in config:
- system_ufw_rule.allow-ssh
- system_service.ufw
- system_service.docker
- system_package.ufw
- system_file.traefik-config
- docker_network.edge
- docker_container.traefik
- ...
loaded configs: app.strat
state file: .stratum/host.json
If this is intended, re-run with --allow-destroy. If not, you may be applying against the wrong state file (-s), or have forgotten a -c flag.
The check fires when PlanSummary.delete > 0 — the list is every step with Action::Delete. The error names:
- the resources slated for deletion,
- the loaded config files (so a missing -c is visually obvious),
- the state file path (so a wrong -s is too),
- three likely causes.
The guard runs before the -y confirmation gate, so you'll hit the same bail with or without -y.
When to pass --allow-destroy:
- You really did remove a resource from config and want it gone on the host.
- You're tearing down a whole config (commented out resources, intend a full sweep).
When not to pass it: any time the delete list surprises you. Re-check -s and your -c set.
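The guard's core logic is small enough to sketch. The `Action` enum here is simplified (the real `Update` variant carries field changes), `destroy_guard` is a hypothetical name, and the error text is abbreviated compared with the full message shown above.

```rust
/// Simplified plan action; the real Update variant carries a change list.
#[derive(PartialEq)]
enum Action {
    NoOp,
    Create,
    Update,
    Delete,
}

/// Refuse any plan containing Delete steps unless `allow_destroy` is set.
fn destroy_guard(steps: &[(&str, Action)], allow_destroy: bool) -> Result<(), String> {
    let doomed: Vec<&str> = steps
        .iter()
        .filter(|(_, action)| *action == Action::Delete)
        .map(|(addr, _)| *addr)
        .collect();
    if doomed.is_empty() || allow_destroy {
        return Ok(());
    }
    Err(format!(
        "refusing to apply: plan would delete {} resources not in config:\n  - {}",
        doomed.len(),
        doomed.join("\n  - ")
    ))
}

fn main() {
    let steps = [
        ("docker_container.app", Action::Create),
        ("system_package.ufw", Action::NoOp),
        ("system_file.traefik-config", Action::Update),
        ("docker_container.traefik", Action::Delete),
        ("docker_network.edge", Action::Delete),
    ];
    match destroy_guard(&steps, false) {
        Ok(()) => println!("ok to apply"),
        Err(message) => println!("{message}"),
    }
}
```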
status
Snapshot per-host resource usage. For each unique host declared in the loaded configs, stratum connects to the host over SSH and prints uptime + free memory + root-disk usage + per-container CPU/RAM/IO from docker stats --no-stream. One section per host.
stratum status
stratum status -c infra.strat -c app.strat -s .stratum/host.json
stratum status --host root@192.0.2.10
| flag | short | default | description |
|---|---|---|---|
| --config <PATH> | -c | stratum.strat | Config file(s) to enumerate hosts from. Repeatable. |
| --host <FILTER> | | none | Only query the matching host. Compared against both the host's addr and the block label (host "name" {...}). |
The state file is not consulted — status doesn't read or write state. It only needs host block declarations.
Output shape:
=== root@192.0.2.10 ===
up: 17:48:12 up 3 days, 4:21, 1 user
load: 0.42, 0.31, 0.19
mem: Mem: 1.9Gi 1.3Gi 85Mi 19Mi 559Mi 625Mi
swap: Swap: 4.0Gi 500K 4.0Gi
disk: /dev/vda1 79G 18G 61G 23% /
CONTAINER CPU% MEM USAGE / LIMIT MEM% NET I/O BLOCK I/O
traefik 0.18% 34.2MiB / 1.9GiB 1.78% 1.2MB / 4.5MB 0B / 12.3kB
web 0.04% 12.1MiB / 1.9GiB 0.62% 245kB / 89kB 0B / 0B
db 0.21% 112MiB / 256MiB 43.75% 332kB / 412kB 2.4MB / 1.8MB
cache 0.08% 8.4MiB / 64MiB 13.13% 189kB / 167kB 0B / 0B
The probe is a single composite shell snippet (one ssh round-trip per host) with sentinel headers (===UPTIME===, ===STATS===) to keep parsing simple. docker stats --no-stream is one snapshot, not the streaming default. If docker is absent or there are no running containers, the section prints (no docker stats available) and moves on.
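Splitting the composite output on sentinel headers might look like the sketch below. Only the `===UPTIME===` and `===STATS===` sentinels are named above; `split_sections` is a hypothetical helper that treats any line of the form `===NAME===` as a header.

```rust
use std::collections::BTreeMap;

/// Group output lines under the most recent `===NAME===` sentinel header.
fn split_sections(output: &str) -> BTreeMap<String, Vec<String>> {
    let mut sections: BTreeMap<String, Vec<String>> = BTreeMap::new();
    let mut current: Option<String> = None;
    for line in output.lines() {
        let is_header = line.len() > 6 && line.starts_with("===") && line.ends_with("===");
        if is_header {
            let name = line.trim_matches('=').to_string();
            sections.entry(name.clone()).or_default();
            current = Some(name);
        } else if let Some(name) = &current {
            if let Some(lines) = sections.get_mut(name) {
                lines.push(line.to_string());
            }
        }
    }
    sections
}

fn main() {
    let output = "===UPTIME===\n 17:48:12 up 3 days\n===STATS===\ntraefik 0.18%\nweb 0.04%\n";
    for (name, lines) in split_sections(output) {
        println!("{name}: {lines:?}");
    }
}
```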
Errors:
- A host block with no addr is skipped with a [status] skipping host warning.
- An SSH failure on one host is logged with [status] <addr>: failed: <reason> but does not stop the loop; remaining hosts are still queried.
- --host filtering with no match is a hard error: no host matched filter `<arg>`.
status is a between-applies diagnostic — not a replacement for a real monitoring stack. The headline use case is spotting RAM/CPU pressure (a container creeping toward its memory limit, a host swap-thrashing) without leaving stratum's CLI.
state list
List every resource currently in state.
stratum state list
stratum state list --path infra-state.json
| flag | short | default | description |
|---|---|---|---|
| --path <PATH> | -p | .stratum/state.json | Path to the JSON state file. |
Output is one line per resource: <kind>.<name> [<provider>]. Empty state prints (empty state).
state show
Print the full JSON for a single resource.
stratum state show docker_container.traefik
stratum state show ssh_exec.uptime --path infra-state.json
Positional argument:
| arg | description |
|---|---|
| <addr> | Resource address, <kind>.<name> (split on the first .). |
| flag | short | default | description |
|---|---|---|---|
| --path <PATH> | -p | .stratum/state.json | Path to the JSON state file. |
Prints the pretty-printed ResourceState JSON, or not found if the address is not in state.
state merge
Merge two or more state files into one. Used to consolidate per-config state files (e.g. .stratum/infra.json, .stratum/app.json) into one state per host.
stratum state merge \
-o .stratum/host.json \
.stratum/infra.json \
.stratum/app.json
| arg / flag | short | default | description |
|---|---|---|---|
| <INPUTS>... | | — | Source state files. At least two are required. |
| --out <PATH> | -o | — | Output path. Must not already exist; the command refuses to overwrite. Remove the file or pick a different -o to retry. |
On success:
merged 2 state files into .stratum/host.json (13 resources)
Failure modes:
- The output path already exists → refusing to overwrite existing state file ....
- Any <kind>.<name> key appears in more than one input → key ... present in both X and Y — refusing to merge. There is no last-writer-wins. Resolve the collision (rename one resource, or remove it from one state) and retry.
After merge, verify by running stratum plan against the consolidated state with the full -c set — a non-zero diff means something is off and the old per-config files should not be deleted yet.
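The no-last-writer-wins rule can be sketched as a merge that records which input first supplied each key. `merge_states` is a hypothetical name, resource bodies are reduced to strings, and the error text is abbreviated.

```rust
use std::collections::BTreeMap;

/// Merge (path, resources) inputs; any key seen twice aborts the merge.
fn merge_states(
    inputs: &[(&str, BTreeMap<String, String>)],
) -> Result<BTreeMap<String, String>, String> {
    let mut merged = BTreeMap::new();
    let mut origin: BTreeMap<String, &str> = BTreeMap::new();
    for (path, resources) in inputs {
        for (key, value) in resources {
            if let Some(first) = origin.get(key) {
                return Err(format!("key {key} present in both {first} and {path}"));
            }
            origin.insert(key.clone(), *path);
            merged.insert(key.clone(), value.clone());
        }
    }
    Ok(merged)
}

fn state_of(keys: &[&str]) -> BTreeMap<String, String> {
    keys.iter().map(|k| (k.to_string(), "{}".to_string())).collect()
}

fn main() {
    let inputs = [
        (".stratum/infra.json", state_of(&["docker_network.edge", "docker_container.traefik"])),
        (".stratum/app.json", state_of(&["docker_container.web"])),
    ];
    match merge_states(&inputs) {
        Ok(merged) => println!("merged {} state files ({} resources)", inputs.len(), merged.len()),
        Err(message) => println!("{message}"),
    }
}
```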
Architecture
stratum is a Cargo workspace of seven crates: core, config, the CLI, and four providers. The flow is config → desired resources → diff against prior state → plan → apply via providers → new state → post-apply drift check.
Workspace layout
crates/
core/ stratum-core
config/ stratum-config
cli/ stratum-cli (the `stratum` binary)
providers/
ssh/ stratum-provider-ssh
docker/ stratum-provider-docker
system/ stratum-provider-system
git/ stratum-provider-git
core has no provider dependencies. config depends on core only for its DesiredResource / ResourceAddr types. Providers depend on core for the trait. The CLI wires everything together.
Core types
All types live in crates/core/src/lib.rs.
ResourceAddr
struct ResourceAddr {
    kind: String,
    name: String,
}
Renders as <kind>.<name>. Used as the key in the state map and as the user-facing identifier in CLI output.
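As a rough sketch of how such an address round-trips: key() is named elsewhere in this chapter, while parse() and its rejection of empty halves are assumptions made for illustration. Splitting on the first `.` means the name half may itself contain dots.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ResourceAddr {
    pub kind: String,
    pub name: String,
}

impl ResourceAddr {
    /// Render as `<kind>.<name>`, the form used as the state-map key.
    pub fn key(&self) -> String {
        format!("{}.{}", self.kind, self.name)
    }

    /// Hypothetical parser: split on the first `.` only.
    /// Returns None when there is no dot or either half is empty.
    pub fn parse(addr: &str) -> Option<ResourceAddr> {
        let (kind, name) = addr.split_once('.')?;
        if kind.is_empty() || name.is_empty() {
            return None;
        }
        Some(ResourceAddr {
            kind: kind.to_string(),
            name: name.to_string(),
        })
    }
}
```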
ResourceState
struct ResourceState {
    addr: ResourceAddr,
    provider: String,
    attrs: serde_json::Value,
}
One per tracked resource. attrs is whatever the provider returned from its last create / update.
State
struct State {
    version: u32,                               // file format version, default 1
    resources: BTreeMap<String, ResourceState>, // keyed by addr.key()
}
On-disk JSON, loaded with State::load(path) and saved with State::save(path). Default path is .stratum/state.json. A missing file is treated as an empty state. The parent directory is created on save.
Action and FieldChange
enum Action {
    NoOp,
    Create,
    Update { changes: Vec<FieldChange> },
    Delete,
}

struct FieldChange {
    field: String,
    from: Value,
    to: Value,
}
Action::symbol() returns the two-character prefix used in plan output ( , +, ~, -).
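A possible shape for symbol(), shown only to make the mapping concrete. The exact padding is an assumption (one marker character followed by a space), FieldChange uses String values as a stand-in for the JSON Value, and render_step is a hypothetical helper.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct FieldChange {
    pub field: String,
    pub from: String, // stand-in for the JSON value
    pub to: String,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Action {
    NoOp,
    Create,
    Update { changes: Vec<FieldChange> },
    Delete,
}

impl Action {
    /// Two-character prefix for plan output.
    /// Assumed layout: marker character plus one padding space.
    pub fn symbol(&self) -> &'static str {
        match self {
            Action::NoOp => "  ",
            Action::Create => "+ ",
            Action::Update { .. } => "~ ",
            Action::Delete => "- ",
        }
    }
}

/// Hypothetical helper: one plan line for a resource address.
pub fn render_step(action: &Action, addr: &str) -> String {
    format!("{}{}", action.symbol(), addr)
}
```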
Observed and Drift
enum Observed {
    Present(Value),  // resource exists; attrs normalized to state shape
    Absent,          // confirmed gone on the host
    Unknown(String), // provider can't tell (carries a reason)
}

struct Drift {
    changes: Vec<FieldChange>,
    missing: bool,              // state says exists but observed == Absent
    unreadable: Option<String>, // observed == Unknown OR read returned Err
}
Drift is per-resource, populated by refresh_plan. Drift::is_clean() is true when all three fields are empty/false/none.
PlannedResource and Plan
struct PlannedResource {
    addr: ResourceAddr,
    provider: String,
    desired: Value,
    prior: Value,
    action: Action,
    drift: Option<Drift>, // None unless refresh_plan was called
}

struct Plan {
    steps: Vec<PlannedResource>,
}
Plan::summary() returns a PlanSummary { create, update, delete, noop, drifted, missing, unreadable }. Plan::was_refreshed() is true iff any step has a Some(drift).
DesiredResource
struct DesiredResource {
    addr: ResourceAddr,
    provider: String,
    attrs: Value,
}
The output of the config evaluator and the input to build_plan.
Plan / apply flow
- Resolve sources. If -n NAME is set, the CLI loads the manifest, extracts the named namespace, and resolves the config list to [manifest, ...namespace.configs] and the state path to .stratum/<name>.json (or the namespace's explicit state =). Otherwise the configs and state path come straight from -c / -s. See Manifest discovery.
- Parse config. stratum_config::load_files(paths) (or load_file for a single path) runs lex → parse on each file, tags each block with its source path, concatenates into one Document, and runs a multi-pass extract: hosts → secrets → namespaces → providers + resources. The result is an Extracted { hosts, providers, resources, secrets, namespaces, redaction_map }. Any system_file with content_file is inlined during this step (see content_file). Duplicate hosts/providers/secrets/resources/namespaces across files are hard errors that name both paths.
- Cross-namespace check. In namespace mode only, re-load every sibling namespace's configs and check the current namespace's docker_container resources for port and container-name collisions. See Cross-namespace validator.
- Load state. In bundle mode, State::load(state_path). In namespace mode, State::load_merged(state_path, _shared.json) (see Split state). Missing file → default empty state.
- Build plan. build_plan(extracted.resources, &state) -> Result<Plan>:
  - Run two planner-side validators before classification: a port-conflict check on every docker_container.ports, and a depends_on topo sort that orders create / update steps and rejects cycles / unknown refs.
  - For each desired resource (in topo order): look up prior by addr key. For kinds in SECRET_CONTENT_TO_SHA, normalize desired before diffing (see Secret-content normalization on plan). Run diff_observed(prior.attrs, normalize_for_plan(kind, desired.attrs)). No diff → NoOp. Otherwise Create (no prior) or Update { changes }.
  - For each prior resource not in desired → Delete, in forward topo order over state-resident depends_on edges.
- Optional refresh. With plan --refresh, run refresh_plan(&mut plan, &registry) to annotate every non-create step with observed drift.
- Print plan. Symbol per resource, fields-changed lines for updates, drift annotations if refreshed, summary at the end.
- Confirmation gate. Without -y, exit here. apply without -y is identical to plan plus the "Apply? Re-run with -y to execute" line.
- Build registry. Instantiate all providers. (No shipped provider reads its provider { ... } block today.)
- Execute. For each plan step (in topo order), look up the provider by kind prefix and call create / update / delete. After every docker_container create or update, run the post-apply readiness wait before moving to the next step. Update state with the returned attrs (or remove the entry on delete).
- Save state. In bundle mode, state.save(state_path). In namespace mode, state.save_split(state_path, _shared.json): _stratum_* addresses route to the shared file, everything else to the namespace's file.
- Post-apply self-check. Reload the config, rebuild the plan against the new state, run refresh_plan again, and print one summary line: post-apply drift: clean or post-apply drift: N differ, M missing, K unreadable — run 'stratum plan --refresh' to see details.
The Provider trait
#[async_trait]
trait Provider: Send + Sync {
    fn name(&self) -> &str;
    fn kinds(&self) -> &[&'static str];
    fn configure(&mut self, _attrs: &Value) -> Result<()> { Ok(()) }
    async fn create(&self, kind: &str, name: &str, attrs: &Value) -> Result<Value>;
    async fn update(&self, kind: &str, name: &str, prior: &Value, attrs: &Value) -> Result<Value>;
    async fn delete(&self, kind: &str, name: &str, prior: &Value) -> Result<()>;
    async fn read(&self, _kind: &str, _name: &str, _prior: &Value) -> Result<Observed> {
        Ok(Observed::Unknown("provider does not implement read".into()))
    }
}
- name() is the lookup key in the registry.
- kinds() lists every kind the provider owns. The registry's for_kind scans providers and returns the first match.
- configure is called once at apply time, with the provider "<name>" { ... } body. Default impl ignores it. No shipped provider implements it today.
- create, update, delete return the new attrs to record in state (or () for delete). The returned value is what the next plan will diff against.
- read must be non-destructive: it's a query, not a side effect. Default impl returns Unknown. Implementations should normalize the returned Value to the same shape as state attrs.
The diff algorithm
There are two diff functions in core. They serve different purposes.
diff (symmetric, used by Action::Update legacy path)
fn diff(prior: &Value, desired: &Value) -> Vec<FieldChange>
A recursive walk over JSON values:
- If prior == desired exactly, return no changes.
- If both are JSON objects, walk their union of keys (sorted, deduplicated). For each key, recurse with the dotted path <prefix>.<key>.
- Otherwise, emit a single FieldChange { field: <path>, from: prior, to: desired }. The field is "<root>" when the diff lives at the document root.
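The three rules translate almost directly into a recursive function. The sketch below is dependency-free and therefore uses a small hand-rolled Value enum in place of serde_json::Value. Two details are assumptions: a key missing on one side is treated as Null, and a top-level key's path is the bare key with no leading dot. The obj helper exists only to keep examples short.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Minimal stand-in for a JSON value.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Num(i64),
    Str(String),
    Array(Vec<Value>),
    Object(BTreeMap<String, Value>),
}

#[derive(Debug, Clone, PartialEq)]
pub struct FieldChange {
    pub field: String,
    pub from: Value,
    pub to: Value,
}

static NULL: Value = Value::Null;

pub fn diff(prior: &Value, desired: &Value) -> Vec<FieldChange> {
    let mut out = Vec::new();
    walk("", prior, desired, &mut out);
    out
}

fn walk(path: &str, prior: &Value, desired: &Value, out: &mut Vec<FieldChange>) {
    if prior == desired {
        return; // rule 1: exact equality, nothing to report
    }
    if let (Value::Object(p), Value::Object(d)) = (prior, desired) {
        // rule 2: union of keys; the BTreeSet sorts and deduplicates
        let keys: BTreeSet<&String> = p.keys().chain(d.keys()).collect();
        for key in keys {
            let child = if path.is_empty() {
                key.clone()
            } else {
                format!("{}.{}", path, key)
            };
            // Assumption: a key absent on one side is compared as Null.
            walk(&child, p.get(key).unwrap_or(&NULL), d.get(key).unwrap_or(&NULL), out);
        }
        return;
    }
    // rule 3: leaves and mismatched shapes become one change
    let field = if path.is_empty() { "<root>".to_string() } else { path.to_string() };
    out.push(FieldChange { field, from: prior.clone(), to: desired.clone() });
}

/// Convenience constructor for object values.
pub fn obj(pairs: Vec<(&str, Value)>) -> Value {
    Value::Object(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}
```

Because keys are visited in sorted order, the resulting change list is deterministic for a given pair of inputs.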
diff_observed (one-sided, used by build_plan and refresh_plan)
fn diff_observed(prior: &Value, observed: &Value) -> Vec<FieldChange>
Used both by build_plan (comparing state-stored prior against desired config) and by refresh_plan (comparing state against live observation). Rules differ from diff:
- State-only fields are ignored. Only keys present in observed are walked. A field that's in prior but not in observed does not generate drift. This is what lets providers store extra fields (container_id, sha256, etc.) without polluting plans.
- Missing key vs empty container = no drift. If prior has no key k but observed has k: {} or k: [], that's not drift. Same for prior: null vs observed: {} / [].
- String arrays are compared as sets. ["a", "b"] and ["b", "a"] are equal. Non-string arrays are compared by order.
- Added keys in observed → flagged. A key in observed but not in prior shows up as from: null, to: <value>.
The provider's read implementation is responsible for trimming the observed value to a shape that mirrors state, so noise doesn't leak through. For example, docker_container strips com.docker.* labels and intersects with the state's label key set.
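To make the four diff_observed rules concrete, here is a self-contained sketch. It is not the code in core: the Value enum is a hand-rolled stand-in for serde_json::Value, marker-aware secret comparison is left out, and treating a missing prior key as Null is an assumption.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Num(i64),
    Str(String),
    Array(Vec<Value>),
    Object(BTreeMap<String, Value>),
}

#[derive(Debug, Clone, PartialEq)]
pub struct FieldChange {
    pub field: String,
    pub from: Value,
    pub to: Value,
}

static NULL: Value = Value::Null;

fn is_empty_container(v: &Value) -> bool {
    match v {
        Value::Array(a) => a.is_empty(),
        Value::Object(o) => o.is_empty(),
        _ => false,
    }
}

/// Some(set) when every element is a string, otherwise None.
fn as_string_set(items: &[Value]) -> Option<BTreeSet<&str>> {
    items
        .iter()
        .map(|v| match v {
            Value::Str(s) => Some(s.as_str()),
            _ => None,
        })
        .collect()
}

/// Equality with the string-arrays-as-sets rule applied.
fn equal(prior: &Value, observed: &Value) -> bool {
    if let (Value::Array(p), Value::Array(o)) = (prior, observed) {
        if let (Some(ps), Some(os)) = (as_string_set(p), as_string_set(o)) {
            return ps == os;
        }
    }
    prior == observed
}

pub fn diff_observed(prior: &Value, observed: &Value) -> Vec<FieldChange> {
    let mut out = Vec::new();
    walk("", prior, observed, &mut out);
    out
}

fn walk(path: &str, prior: &Value, observed: &Value, out: &mut Vec<FieldChange>) {
    if equal(prior, observed) {
        return;
    }
    // A missing (Null) prior against an empty container is not drift.
    if matches!(prior, Value::Null) && is_empty_container(observed) {
        return;
    }
    if let (Value::Object(p), Value::Object(o)) = (prior, observed) {
        // Only observed keys are walked, so state-only fields never surface.
        for (key, child) in o {
            let sub = if path.is_empty() { key.clone() } else { format!("{}.{}", path, key) };
            walk(&sub, p.get(key).unwrap_or(&NULL), child, out);
        }
        return;
    }
    let field = if path.is_empty() { "<root>".to_string() } else { path.to_string() };
    out.push(FieldChange { field, from: prior.clone(), to: observed.clone() });
}

/// Convenience constructor for object values.
pub fn obj(pairs: Vec<(&str, Value)>) -> Value {
    Value::Object(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}
```

With this shape, a prior holding container_id and networks ["a", "b"] produces no changes against an observed value that omits container_id, lists networks as ["b", "a"], and adds an empty labels object.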
Drift detection
refresh_plan(&mut plan, &registry) annotates each plan step with observed drift from live reality.
async fn refresh_plan(plan: &mut Plan, registry: &Registry);
Rules:
- Action::Create is skipped: there's no prior state to read.
- Sequential per resource. SSH round-trips are I/O-bound but ~10 resources doesn't justify parallelism yet.
- Per-resource errors are caught, not propagated. They become Drift::unreadable = Some("read failed: ..."). refresh_plan itself never returns Err.
- The provider's read is called with (kind, name, &step.prior). The returned Observed is mapped:
  - Present(observed) → drift.changes = diff_observed(&step.prior, &observed)
  - Absent → drift.missing = true
  - Unknown(reason) → drift.unreadable = Some(reason)
PlanSummary counts:
- drifted: count of steps where drift.changes is non-empty.
- missing: count of steps where drift.missing == true and the action is not Delete. (A Delete step whose resource is already gone is annotated (already gone on host; delete will noop) instead, which is not drift.)
- unreadable: count of steps where drift.unreadable.is_some().
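The mapping from a provider read to a Drift record can be sketched as below. Attrs are plain strings here and the diff is injected as a closure so the example stays self-contained; drift_from_read is a hypothetical name, not a function the source mentions.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct FieldChange {
    pub field: String,
    pub from: String,
    pub to: String,
}

pub enum Observed {
    Present(String), // stand-in for normalized attrs
    Absent,
    Unknown(String),
}

#[derive(Debug, Default, PartialEq)]
pub struct Drift {
    pub changes: Vec<FieldChange>,
    pub missing: bool,
    pub unreadable: Option<String>,
}

impl Drift {
    /// Clean means: no changes, not missing, and readable.
    pub fn is_clean(&self) -> bool {
        self.changes.is_empty() && !self.missing && self.unreadable.is_none()
    }
}

/// Map one read result onto a Drift. Errors are captured, never propagated.
pub fn drift_from_read<F>(prior: &str, read: Result<Observed, String>, diff_observed: F) -> Drift
where
    F: Fn(&str, &str) -> Vec<FieldChange>,
{
    let mut drift = Drift::default();
    match read {
        Ok(Observed::Present(observed)) => drift.changes = diff_observed(prior, &observed),
        Ok(Observed::Absent) => drift.missing = true,
        Ok(Observed::Unknown(reason)) => drift.unreadable = Some(reason),
        Err(e) => drift.unreadable = Some(format!("read failed: {}", e)),
    }
    drift
}
```

Folding the Err branch into unreadable is what lets the caller keep a signature that never returns an error.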
Planner-side validators
Port-conflict validator
Before classifying steps, build_plan walks every docker_container.ports value across the desired set and checks for (host, ip, host_port) collisions. Two resources binding the same port on the same host is a hard error at plan time, naming both. A 0.0.0.0:N bind symmetrically collides with 127.0.0.1:N — the wildcard bind subsumes the loopback one.
Random ports ("5432" — docker picks the host port) are skipped silently. Port ranges ("8000-8010:8000-8010") get a warning but are not validated. Unrecognized port shapes are skipped to keep the validator forward-compatible.
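A minimal sketch of the collision check described above. PortClaim, parse_host_bind, claims_for, and find_port_conflict are illustrative names. The sketch assumes that an H:C entry binds the wildcard address 0.0.0.0, that two distinct non-wildcard IPs do not collide, and it silently skips ranges rather than warning.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct PortClaim {
    pub addr: String, // resource address, e.g. docker_container.traefik
    pub host: String,
    pub ip: String,
    pub host_port: u16,
}

/// Parse `H:C` or `IP:H:C` into (ip, host_port).
/// Bare ports, ranges, and unrecognized shapes return None (skipped).
pub fn parse_host_bind(spec: &str) -> Option<(String, u16)> {
    if spec.contains('-') {
        return None; // port range: not validated
    }
    let parts: Vec<&str> = spec.split(':').collect();
    let (ip, host_port) = match parts.as_slice() {
        [h, _c] => ("0.0.0.0", *h), // assumption: no IP means wildcard
        [ip, h, _c] => (*ip, *h),
        _ => return None,
    };
    host_port.parse::<u16>().ok().map(|p| (ip.to_string(), p))
}

pub fn claims_for(addr: &str, host: &str, ports: &[&str]) -> Vec<PortClaim> {
    ports
        .iter()
        .filter_map(|spec| parse_host_bind(spec))
        .map(|(ip, host_port)| PortClaim {
            addr: addr.to_string(),
            host: host.to_string(),
            ip,
            host_port,
        })
        .collect()
}

/// The wildcard bind subsumes any specific IP, in either direction.
fn ips_overlap(a: &str, b: &str) -> bool {
    a == b || a == "0.0.0.0" || b == "0.0.0.0"
}

/// Return the addresses of the first colliding pair, if any.
pub fn find_port_conflict(claims: &[PortClaim]) -> Option<(String, String)> {
    for (i, a) in claims.iter().enumerate() {
        for b in &claims[i + 1..] {
            if a.host == b.host && a.host_port == b.host_port && ips_overlap(&a.ip, &b.ip) {
                return Some((a.addr.clone(), b.addr.clone()));
            }
        }
    }
    None
}
```

The pairwise scan is quadratic, which seems acceptable for the handful of containers a bootstrap config declares.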
depends_on topo sort
The planner runs a stable Kahn's-algorithm topo sort over the docker_container.depends_on edges (see depends_on). Properties:
- Stable. Resources without edges keep their input (file) order. Where ties exist, a BTreeSet ready-set picks them in lexicographic addr order.
- Implicit _stratum_* resources stay at the front. They carry no edges and have in_degree = 0, so they land first.
- Cycles are a hard error citing the cycle path.
- Unknown references are hard errors citing both the source and the missing target.
The topo order applies to Create and Update steps; Delete order is computed separately.
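For illustration, a compact Kahn's-algorithm sort with a BTreeSet ready-set. This is a simplified sketch: topo_order is a hypothetical name, the input is a plain map from address to the addresses it depends on, ties are broken purely lexicographically (input-file order is not modelled), and a cycle is reported as the set of stuck nodes rather than as a path.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// `deps` maps each resource address to the addresses it depends on.
/// Dependencies are emitted before their dependents.
pub fn topo_order(deps: &BTreeMap<String, Vec<String>>) -> Result<Vec<String>, String> {
    let mut in_degree: BTreeMap<&str, usize> = BTreeMap::new();
    let mut dependents: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for (addr, targets) in deps {
        in_degree.entry(addr.as_str()).or_insert(0);
        for target in targets {
            if !deps.contains_key(target) {
                return Err(format!("{} depends on unknown resource {}", addr, target));
            }
            *in_degree.entry(addr.as_str()).or_insert(0) += 1;
            dependents
                .entry(target.as_str())
                .or_insert_with(Vec::new)
                .push(addr.as_str());
        }
    }
    // Ready set: everything with no unmet dependency, ordered by address.
    let mut ready: BTreeSet<&str> = in_degree
        .iter()
        .filter(|(_, d)| **d == 0)
        .map(|(a, _)| *a)
        .collect();
    let mut order = Vec::new();
    while let Some(next) = ready.iter().next().copied() {
        ready.remove(next);
        order.push(next.to_string());
        if let Some(children) = dependents.get(next) {
            for child in children {
                let d = in_degree.get_mut(child).expect("child registered above");
                *d -= 1;
                if *d == 0 {
                    ready.insert(*child);
                }
            }
        }
    }
    if order.len() != deps.len() {
        // Whatever still has unmet dependencies is part of (or behind) a cycle.
        let stuck: Vec<&str> = in_degree
            .iter()
            .filter(|(_, d)| **d > 0)
            .map(|(a, _)| *a)
            .collect();
        return Err(format!("dependency cycle among: {}", stuck.join(", ")));
    }
    Ok(order)
}
```

Given app depending on db plus an unrelated traefik, the sketch yields db, app, traefik: db is ready first, which unlocks app, and app sorts ahead of traefik in the ready set.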
Secret-content normalization on plan
For kinds where a content field carries a secret value, state stores only sha256 (the plaintext is unrecoverable from state) but desired carries the full plaintext at plan time. A naive diff_observed(prior, desired) would emit content: null -> "<plaintext>" on every plan, leaking the value into CLI output.
build_plan normalizes desired before diffing. The kinds that opt into this live in a const SECRET_CONTENT_TO_SHA: &[(&str, &str)]:
| kind | content field |
|---|---|
| system_secret_file | content |
For each entry, normalize_for_plan(kind, attrs) clones attrs, removes the named field, and inserts sha256: <hex> derived from its UTF-8 bytes. Diff then compares sha against sha — exactly the same shape state holds. Plaintext never reaches the diff.
This is the inverse half of the kind's own apply-time unchanged check (which compares the same sha against prior state to decide whether to re-upload). The two together guarantee that a plaintext secret never appears in plan output, in state, or in apply logs.
Plan-level secret redaction
After build_plan returns and after refresh_plan runs, the CLI calls Extracted::redact_plan(&mut plan) once before printing. This walk does two things:
- Apply substring redaction to every step's desired, prior, and per-FieldChange from / to. A leaf string containing a known secret plaintext (introduced via ${...} interpolation) gets each occurrence replaced with the inline <secret:NAME:sha256:HEX> marker. Exact-match leaves are replaced with the object marker, same as everywhere else.
- Drop redaction-cancelled changes. When state holds the inline substring marker and observed returns plaintext, both sides collapse to the same marker after the walk. Any FieldChange where from == to post-redaction is dropped. If an Action::Update's changes list becomes empty, the step is downgraded to Action::NoOp, so drift that was only a substring-marker-vs-plaintext difference disappears entirely.
This is what stops plan --refresh from emitting spurious updates on every secret-bearing interpolated field. See Secrets: substring redaction.
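The two-step walk can be approximated as follows. This sketch handles only the substring case on string-valued changes: redact and redact_action are hypothetical names, the (plaintext, marker) pairs stand in for the real redaction map, and the marker text is passed in rather than computed, since hashing is outside what the example needs to show.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct FieldChange {
    pub field: String,
    pub from: String,
    pub to: String,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Action {
    NoOp,
    Update { changes: Vec<FieldChange> },
}

/// Replace every occurrence of each known plaintext with its inline marker.
/// `secrets` holds (plaintext, marker) pairs.
pub fn redact(value: &str, secrets: &[(&str, &str)]) -> String {
    let mut out = value.to_string();
    for (plain, marker) in secrets {
        if !plain.is_empty() {
            out = out.replace(*plain, *marker);
        }
    }
    out
}

/// Redact both sides of every change, drop changes that cancel out,
/// and downgrade an Update with nothing left to NoOp.
pub fn redact_action(action: Action, secrets: &[(&str, &str)]) -> Action {
    match action {
        Action::Update { changes } => {
            let kept: Vec<FieldChange> = changes
                .into_iter()
                .map(|c| FieldChange {
                    field: c.field,
                    from: redact(&c.from, secrets),
                    to: redact(&c.to, secrets),
                })
                .filter(|c| c.from != c.to)
                .collect();
            if kept.is_empty() {
                Action::NoOp
            } else {
                Action::Update { changes: kept }
            }
        }
        other => other,
    }
}
```

Redacting before comparing is the important ordering: a marker on one side and plaintext on the other only become equal once both have passed through the same substitution.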
Post-apply readiness wait
After every successful docker_container create or update, the apply loop pauses before moving on to the next step (which may be a dependent declared via depends_on). The wait lives in the CLI in post_apply_wait:
- If desired.healthcheck is present, poll docker inspect --format '{{.State.Health.Status}}' <name> once a second, up to 60 polls. Terminal statuses are healthy (proceed), unhealthy (fail the apply), or empty / none on the first poll (no health check at the docker level, so proceed). starting and other interim values keep polling.
- Otherwise, sleep 500ms. This is cosmetic: docker often needs a beat to wire networks and volumes before something else pokes the container.
Non-docker_container steps return immediately. The provider's own create / update is synchronous: git_repo clones return when done, system_secret_file returns when the SSH upload completes.
The poll loop itself is in core (poll_container_health), separated from SSH plumbing so it's unit-testable with a mocked inspector.
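A sketch of what a poll loop with an injectable inspector might look like. The signature, the Health enum, and the error strings are assumptions, as is treating exhaustion of the poll budget as a failure; the source only names the function and describes the status rules.

```rust
use std::time::Duration;

#[derive(Debug, PartialEq)]
pub enum Health {
    Healthy,
    NoHealthcheck, // docker reports no health check for this container
}

/// Poll `inspect` until a terminal status or until `max_polls` is used up.
/// `inspect` stands in for the `docker inspect` call over SSH, so tests can
/// pass a closure returning canned statuses.
pub fn poll_container_health<F>(
    mut inspect: F,
    max_polls: u32,
    interval: Duration,
) -> Result<Health, String>
where
    F: FnMut() -> String,
{
    for attempt in 0..max_polls {
        let status = inspect();
        match status.trim() {
            "healthy" => return Ok(Health::Healthy),
            "unhealthy" => return Err("container reported unhealthy".to_string()),
            // Empty or "none" is only terminal on the very first poll.
            "" | "none" if attempt == 0 => return Ok(Health::NoHealthcheck),
            _ => {} // "starting" and other interim values: keep polling
        }
        if attempt + 1 < max_polls {
            std::thread::sleep(interval);
        }
    }
    // Assumption: running out of polls is treated as a failure.
    Err(format!("container not healthy after {} polls", max_polls))
}
```

A caller matching the description above would pass 60 polls and a one-second interval; tests can pass a zero interval to avoid real sleeps.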
Delete ordering
build_plan emits delete steps in forward topo order over state-resident depends_on edges. For two resources X and Y where X depends_on Y at runtime, X is torn down before Y — the dependent goes first so the dependency is still serving while it shuts down.
Resources without recorded depends_on edges fall through with in_degree = 0 and end up before any edged resources, in reverse-iteration order of the state BTreeMap (which preserves the prior file-order-independent behavior). This keeps the heuristic close to "leaves before roots" for hand-written configs even when no depends_on is declared.
depends_on is recorded in state at create / update time and survives across apply runs, so a delete computed against state still knows the edges the resource was declared with — even when the resource is no longer in config.
Implicit per-host resources
For every host block in the merged document, extract injects three implicit resources before any user-declared ones, addressed under the _stratum_ prefix:
| addr | kind | purpose |
|---|---|---|
| ssh_exec._stratum_swap_<host> | ssh_exec | Creates a 4 GB /swapfile, enables it, persists in fstab. |
| system_file._stratum_sshd_oom_<host> | system_file | Drops /etc/systemd/system/ssh.service.d/oom.conf with OOMScoreAdjust=-1000. |
| ssh_exec._stratum_sshd_reload_<host> | ssh_exec | systemctl daemon-reload && systemctl restart ssh. |
The first two exist so that under memory pressure the kernel does not kill sshd — which would lock the operator out of recovery. The third applies the drop-in. They are stable across versions and live at the front of the desired list (in_degree 0), so they apply before any user resource on the host.
In namespace mode they are routed to _shared.json so multiple namespaces sharing a host don't each try to recreate them. In bundle mode they share the single state file with everything else.
The _stratum_ prefix is reserved. User-declared resources should not use it.
Manifest discovery (namespace mode)
When -n NAME is set, the CLI resolves the config + state paths as follows. See Namespaces for the syntax.
- Locate the manifest. If --manifest PATH was passed, that path is used. Otherwise the CLI requires ./stratum.strat to exist; if it doesn't, the command errors.
- Load the manifest. Runs stratum_config::load_file(manifest), producing an Extracted with one or more namespace declarations.
- Look up the namespace. If the named namespace isn't in the manifest, error with the list of known namespaces.
- Resolve configs. The merged list is [manifest, ...ns.configs]. The manifest is always first so its top-level host / secret / provider blocks are visible to every per-namespace file. Each configs entry is absolutized at parse time against the manifest's directory.
- Resolve state. Priority order: explicit -s on the command line, then the namespace's body-level state =, then .stratum/<name>.json.
Passing -c together with -n is a hard error — the namespace's configs = [...] is the config list, and a -c override would silently shadow it. Bundle mode (no -n) is unchanged by namespace support.
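The resolution rules above reduce to two small functions. Both names and the error text are hypothetical; path absolutization against the manifest's directory is left out to keep the sketch short.

```rust
use std::path::PathBuf;

/// State path priority: explicit -s, then the namespace's `state =`,
/// then the default `.stratum/<name>.json`.
pub fn resolve_state_path(
    cli_state: Option<&str>,
    namespace_state: Option<&str>,
    namespace: &str,
) -> PathBuf {
    match (cli_state, namespace_state) {
        (Some(path), _) => PathBuf::from(path),
        (None, Some(path)) => PathBuf::from(path),
        (None, None) => PathBuf::from(".stratum").join(format!("{}.json", namespace)),
    }
}

/// Config list in namespace mode: manifest first, then the namespace's
/// configs. Any -c value alongside -n is rejected outright.
pub fn resolve_configs(
    manifest: &str,
    cli_configs: &[&str],
    ns_configs: &[&str],
) -> Result<Vec<String>, String> {
    if !cli_configs.is_empty() {
        return Err("-c cannot be combined with -n".to_string());
    }
    let mut out = vec![manifest.to_string()];
    out.extend(ns_configs.iter().map(|c| c.to_string()));
    Ok(out)
}
```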
Cross-namespace validator
Namespace mode's plan and apply run a sibling-collision check before classification. The check exists because build_plan operates inside one namespace's view of the world — it has no visibility into what other namespaces declare — so two namespaces could each plan a docker_container binding the same (host, host_port) and only discover the conflict at apply time, when one fails over a port already taken by the other.
The check, in validate_cross_namespace:
- Re-loads the manifest (cheap; it has no resources).
- For each sibling namespace (every one except the current), loads its configs with LoadOptions::allow_unresolved_secrets = true so a missing env var in some unrelated namespace doesn't block planning the current one.
- Walks every docker_container in every sibling, collecting:
  - Port claims. Each ports entry is parsed for the host-port half of H:C or IP:H:C. Ranges (8000-8010:...) and bare-port shapes (where docker picks the host port) are skipped.
  - Name claims. The container's name attribute, falling back to the resource's label.
- Checks every docker_container in the current namespace against the collected claims, erroring on the first (host, port) or (host, name) collision and naming both the offending current-namespace address and the sibling that owns the claim.
The validator is skipped entirely in bundle mode. Within a single namespace, the existing planner-side port-conflict validator catches collisions within the same desired set. The cross-namespace validator is strictly the inter-namespace layer above it.
The sibling loader uses allow_unresolved_secrets = true defensively — it's only collecting addresses, ports, and names, none of which depend on secret plaintext. If a sibling load fails for any other reason, the error is logged and that sibling is skipped (the plan still proceeds), so a broken sibling doesn't gate apply of an unrelated namespace.
Split state (namespace mode)
In namespace mode the state on disk is two files instead of one:
.stratum/
<name>.json # user-declared resources for namespace `<name>`
_shared.json # implicit per-host _stratum_* resources
State::save_split(ns_path, shared_path) walks self.resources and routes each entry by addr name: anything starting with _stratum_ goes to _shared.json, everything else to <name>.json. Both files are written every save (with parent dirs created), even when one side is empty — that keeps the next load predictable.
State::load_merged(ns_path, shared_path) is the inverse. It loads both files and unions their resources maps, with the namespace's entry winning any addr.key() collision (the more recently touched of the two, since the active scope just ran). Missing files become empty state (matches load).
Bundle mode keeps using the single-file State::load(path) and State::save(path). The CLI picks the right pair via the -n flag — load_state / save_state in crates/cli/src/main.rs switch on whether a shared path is set.
The split is what lets two namespaces targeting the same host co-exist without each trying to own the per-host tuning resources. First namespace applies: _stratum_swap_*, _stratum_sshd_oom_*, _stratum_sshd_reload_* land in _shared.json. Second namespace plans: load_merged pulls them back from the shared file into its working state, so the new plan sees them as no-op. Without the split, the second apply would see them missing from its state file and recreate them, churning the swap file and restarting sshd on every cross-namespace apply.
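The routing and the merge can be sketched on plain maps. Resources here map `<kind>.<name>` keys to opaque attr strings, and split, load_merged, and is_shared are illustrative free functions rather than the State methods themselves.

```rust
use std::collections::BTreeMap;

/// `<kind>.<name>` key mapped to attrs (opaque String stand-in).
pub type Resources = BTreeMap<String, String>;

/// A key belongs in the shared file when its name half starts with `_stratum_`.
fn is_shared(key: &str) -> bool {
    key.split_once('.')
        .map_or(false, |(_, name)| name.starts_with("_stratum_"))
}

/// Route every entry to (namespace file, shared file).
pub fn split(all: &Resources) -> (Resources, Resources) {
    let mut ns = Resources::new();
    let mut shared = Resources::new();
    for (key, attrs) in all {
        if is_shared(key) {
            shared.insert(key.clone(), attrs.clone());
        } else {
            ns.insert(key.clone(), attrs.clone());
        }
    }
    (ns, shared)
}

/// Union both files; the namespace's entry wins any key collision.
pub fn load_merged(ns: &Resources, shared: &Resources) -> Resources {
    let mut merged = shared.clone();
    // extend() overwrites existing keys, which gives the namespace priority.
    merged.extend(ns.iter().map(|(k, v)| (k.clone(), v.clone())));
    merged
}
```

Splitting and then merging returns the original map, which is the property that keeps the second namespace's plan from recreating the per-host resources.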
State file shape
{
"version": 1,
"resources": {
"docker_container.traefik": {
"addr": { "kind": "docker_container", "name": "traefik" },
"provider": "docker",
"attrs": {
"host": "root@192.0.2.10",
"image": "traefik:v2.11",
"container_id": "abc123...",
"...": "..."
}
},
"system_package.docker": { "...": "..." }
}
}
Resources are keyed by <kind>.<name> in a BTreeMap, so the on-disk order is deterministic (lexicographic). The file is overwritten in full on every successful apply.
Secret markers in state
When a resource attr resolves from a secret ref, the provider receives plaintext but state stores a redaction marker:
{
"env": {
"POSTGRES_PASSWORD": {
"__secret": "pg_password",
"__secret_sha256": "sha256:f7c3bc1d808e04..."
}
}
}
The marker is written by Extracted::redact_into, called between every provider return and state.upsert. diff and diff_observed are marker-aware (see core::secret_compare): a marker compares equal to plaintext when the plaintext's hash matches the marker's __secret_sha256, and a marker-vs-marker compare uses only the hashes. This is what keeps --refresh from showing perpetual drift on secret-bearing fields. The CLI's render function prints markers as <secret:NAME sha:abc123> — six hex chars, enough to spot a rotation, not enough to attack offline.
Bootstrap a fresh droplet
End-to-end walkthrough: take a blank Ubuntu 24.04 droplet and bring it to ufw + docker + traefik in one stratum apply. The config and the Traefik file it ships are both written below; substitute your host's address where indicated.
What you get
After apply:
- ufw and docker.io installed via apt.
- ufw rules allowing 22/tcp, 80/tcp, 443/tcp.
- docker and ufw systemd units enabled and started.
- ufw ruleset actually activated (ufw --force enable via ssh_exec).
- A stratum-edge Docker network for Traefik to attach to.
- /etc/traefik/traefik.yml written from a local file.
- A traefik:v2.11 container on :80 and :443, attached to stratum-edge, with /var/run/docker.sock mounted.
11 resources, all created in one apply.
Prerequisites
- A fresh Ubuntu 24.04 droplet with a public IP. Any cloud provider works; the tutorial uses DigitalOcean. Note the IP.
- SSH key access as root. Most cloud providers let you inject an SSH key at provision time. Stratum uses BatchMode=yes, so the key must already be in your local ssh-agent or referenced in ~/.ssh/config. No password prompts: they'll hang the apply.
- stratum built. From the repo root: cargo build --release.
Step 1: verify SSH
ssh root@<ip> echo ok
Expect a single ok and exit 0. If you get a password prompt or a Permission denied, fix that before continuing — stratum won't be able to authenticate either.
Step 2: write the config
Create a bootstrap.strat with the host's address:
host "primary" {
addr = "root@<ip>"
}
Replace <ip> with the droplet's public IP. The rest of the file references host.primary.addr — you don't have to touch anything else.
The bootstrap pulls a Traefik config from files/traefik.yml, resolved relative to the .strat file itself. Place your Traefik static config at files/traefik.yml next to bootstrap.strat.
Step 3: plan
./target/release/stratum plan -c bootstrap.strat
Expected output:
stratum plan
============
+ docker_container.traefik
+ docker_network.edge
+ ssh_exec.ufw-activate
+ system_file.traefik-config
+ system_package.docker
+ system_package.ufw
+ system_service.docker
+ system_service.ufw
+ system_ufw_rule.allow-http
+ system_ufw_rule.allow-https
+ system_ufw_rule.allow-ssh
11 create, 0 update, 0 delete, 0 no-op
(The exact alphabetical order comes from BTreeMap iteration of <kind>.<name> keys.) Read-only — nothing is touched on the host.
Step 4: apply
./target/release/stratum apply -y -c bootstrap.strat
The order in which resources execute is the same as the plan output (a depends-on graph is not implemented). The bootstrap config is hand-ordered to avoid the obvious foot-guns:
- ufw and docker.io packages install (apt-get update runs once).
- ufw rules are added, before ufw is activated.
- docker and ufw systemd units start.
- ssh_exec "ufw-activate" runs ufw --force enable. The ruleset is now enforcing.
- The stratum-edge Docker network is created.
- traefik.yml is dropped at /etc/traefik/traefik.yml.
- The Traefik container starts.
After the work runs, you'll see:
State saved to .stratum/state.json
post-apply drift: 4 unreadable — run 'stratum plan --refresh' to see details
4 unreadable is expected: the three system_ufw_rule resources and one ssh_exec resource always return Observed::Unknown from read (see providers/system and providers/ssh).
Step 5: verify
# Traefik should respond — even if just with a 404 (no routes configured yet).
curl -k -I https://<ip>
# HTTP/2 404 ...
# Container should be running.
ssh root@<ip> docker ps
# CONTAINER ID IMAGE ... PORTS NAMES
# abc123def traefik:v2.11 ... 0.0.0.0:80->80/tcp, ...:443->443 traefik
# UFW should be active and have the three rules.
ssh root@<ip> ufw status
# Status: active
# To Action From
# 22/tcp ALLOW Anywhere
# 80/tcp ALLOW Anywhere
# 443/tcp ALLOW Anywhere
Step 6: re-plan with drift detection
./target/release/stratum plan --refresh -c bootstrap.strat
Every step prints with (no-op) — config matches state. The footer:
0 create, 0 update, 0 delete, 11 no-op
drift: 4 unreadable
The same 4 unreadable (three ufw rules + one ssh_exec), so nothing surprising. If any value on the host drifts from state (e.g. someone manually runs docker stop traefik), --refresh will surface it:
docker_container.traefik
! DRIFT: resource missing on host (state says exists)
0 create, 0 update, 0 delete, 11 no-op
drift: 1 missing, 4 unreadable
Tear it down
Comment out (or delete) the resource blocks, leaving the host block. The resulting plan is all Delete steps, which trips the destruction guard — pass --allow-destroy to acknowledge:
./target/release/stratum apply -y --allow-destroy -c bootstrap.strat
build_plan emits deletes in reverse alphabetical order of <kind>.<name> (see Delete ordering). For the bootstrap config that conveniently gives you a reasonable teardown order — the Traefik container goes before the docker service, ufw rules go before the ufw service. It is not guaranteed to be safe for arbitrary configs; if you have inter-resource ordering needs, remove resources from the config in stages.
What's next
- Layer your own containers on top of Traefik by adding more docker_container resources, or follow the Serve a static site behind Traefik tutorial to add a second app sharing the same state file.
- For multiple independent slices on the same host, see Multi-namespace deployments: each slice plans and applies on its own, with cross-slice port and name collisions caught at plan time.
- Read providers/system for the full attribute schema if you want to extend the config.
Serve a static site behind Traefik
End-to-end walkthrough: take a host that already has docker, Traefik, and the stratum-edge network (i.e. one you just bootstrapped with bootstrap-droplet), and add a second application — an nginx container serving a static directory — routed by Traefik.
This is the canonical "second app behind Traefik" pattern. Use it as a template for any static-asset deploy. The two configs (bootstrap + this one) apply together against one state file — see Multi-file configs for why state is per-host and not per-config. If you'd rather apply them as independent slices, the same shape works as two namespaces — see the closing section of this tutorial.
What you get
After apply:
- /srv/site/ on the host, containing whatever you point source_dir at (the output of any static-site build: mdbook build, hugo, a folder of pre-rendered HTML).
- A site nginx:alpine container, mounting that directory read-only at the nginx web root.
- Traefik labels routing Host(site.example.com) to the container.
- Two new resources in state: system_dir.site and docker_container.site. The 11 resources from the bootstrap are untouched.
Prerequisites
- Bootstrap done. The host needs the docker daemon, Traefik running on :80 / :443, and the stratum-edge network. See Bootstrap a fresh droplet.
- The shared host state file. Use the same -s .stratum/host.json you applied the bootstrap with. State is one-per-host, not one-per-config; both .strat files apply together against the shared state via repeated -c flags. See Multi-file configs.
- A built static tree. Any directory of HTML/CSS/JS/assets works. The config below uploads ./site/ from next to the .strat file; substitute any local directory.
Step 1: the config
host "primary" {
addr = "root@192.0.2.10"
}
resource "system_dir" "site" {
host = host.primary.addr
source_dir = "site"
path = "/srv/site"
mode = "0644"
dir_mode = "0755"
owner = "root"
group = "root"
delete_extra = true
}
resource "docker_container" "site" {
host = host.primary.addr
name = "site"
image = "nginx:alpine"
restart = "unless-stopped"
networks = ["stratum-edge"]
volumes = [
"/srv/site:/usr/share/nginx/html:ro",
]
labels = {
"traefik.enable" = "true"
"traefik.http.routers.site.rule" = "Host(`site.example.com`)"
"traefik.http.routers.site.entrypoints" = "web"
"traefik.http.services.site.loadbalancer.server.port" = "80"
}
}
Two resources. Notable details:
- delete_extra = true keeps the remote tree in sync: files removed from site/ locally get rm -f'd on the host on the next apply. Without it, deleted local files stay on the host indefinitely.
- Substitute site.example.com with a hostname that resolves to your host. For a quick demo without DNS, a service like nip.io resolves any hostname of the form <anything>.<ip>.nip.io to <ip>, which is useful for development, not for production.
- The container attaches to stratum-edge, the same network Traefik discovers via the docker socket. No host port mapping needed; Traefik proxies via the network.
- The Traefik labels are exactly the Traefik 2.x router/service form. Stratum doesn't parse them; they're opaque strings handed to docker.
Step 2: build the static tree
Build whatever your site is and drop the output in site/ next to the .strat file. The exact command depends on the generator — mdbook build, hugo, npm run build, or cp -R public site/. The system_dir provider doesn't care about the source; it just walks the directory tree and ships every regular file.
Step 3: plan
Apply both configs together against the shared host state. The bootstrap resources are already in state, so the only Create steps are the two new ones.
./target/release/stratum plan \
-c bootstrap.strat \
-c site.strat \
-s .stratum/host.json
Expected output:
stratum plan
============
docker_container.traefik
docker_network.edge
...
+ docker_container.site
+ system_dir.site
...
2 create, 0 update, 0 delete, 11 no-op
(Order shown abbreviated.) Eleven resources are in state and unchanged; two new ones are queued for create.
Step 4: apply
./target/release/stratum apply -y \
-c bootstrap.strat \
-c site.strat \
-s .stratum/host.json
The system_dir step tars + gzips site/ in memory, streams it over SSH, extracts on the host, and applies chown -R + chmod recursively. You'll see one summary line on stderr:
[system] DIR `site` -> root@192.0.2.10:/srv/site (N files, mode=0644 dir_mode=0755 root:root)
Then the container starts. Post-apply self-check:
post-apply drift: clean
(No unreadable count — neither resource uses an Unknown read.)
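The tar + gzip step that system_dir performs before streaming can be pictured with a short Python sketch. This is only an illustration: stratum itself is written in Rust, the function names pack_tree and list_archive are hypothetical, and skipping symlinks is an assumption about what "every regular file" means.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def pack_tree(source_dir):
    """Tar + gzip every regular file under source_dir into an in-memory buffer."""
    root = Path(source_dir)
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        # Sorted walk gives a stable member order. Directories and symlinks
        # are skipped (assumption: only regular files are shipped).
        for path in sorted(root.rglob("*")):
            if path.is_file() and not path.is_symlink():
                tar.add(str(path), arcname=path.relative_to(root).as_posix())
    return buf.getvalue()


def list_archive(blob):
    """Return the member names the receiving side would see on extraction."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        return tar.getnames()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        site = Path(tmp) / "site"
        (site / "css").mkdir(parents=True)
        (site / "index.html").write_text("<h1>hello</h1>")
        (site / "css" / "main.css").write_text("body {}")
        blob = pack_tree(site)
        print(len(blob), "bytes:", list_archive(blob))
```

The whole archive is held in memory, which is reasonable for a static site but would be a poor fit for very large trees.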
Step 5: verify
curl -H "Host: site.example.com" http://<host-ip>/
# or open http://site.example.com in a browser (with DNS pointed at the host).
You should get the site's landing page.
Re-deploying after content changes
Rebuild the static tree locally, then apply again. The system_dir provider hashes every file and compares against manifest_sha256 in prior state:
- No content changes → [system] DIR `site` -> ... unchanged (N files, manifest match) and the upload is skipped.
- Any file added, removed, or modified → manifest digest changes, the whole tree re-tars and re-uploads. With delete_extra = true, files removed locally are rm -f'd on the host as part of the apply.
The docker_container is unchanged in either case — nginx is just serving from a bind mount, so new files on disk show up immediately without a container restart.
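A manifest comparison of this kind can be sketched as follows. The exact format stratum hashes into manifest_sha256 is not stated here, so the "digest, two spaces, relative path" line layout below is an assumption, and needs_upload is a hypothetical helper name.

```python
import hashlib
import tempfile
from pathlib import Path


def manifest_sha256(source_dir):
    """Hash every regular file, then hash the sorted list of (digest, path) lines."""
    root = Path(source_dir)
    outer = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        file_digest = hashlib.sha256(path.read_bytes()).hexdigest()
        # One line per file, so an add, a delete, a rename, or an edit
        # each changes the outer digest. (Line layout is an assumption.)
        outer.update(f"{file_digest}  {rel}\n".encode())
    return outer.hexdigest()


def needs_upload(source_dir, prior_manifest):
    """True when the local tree differs from the digest recorded in prior state."""
    return manifest_sha256(source_dir) != prior_manifest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "index.html").write_text("v1")
        recorded = manifest_sha256(tmp)
        print("unchanged:", needs_upload(tmp, recorded))
        (Path(tmp) / "index.html").write_text("v2")
        print("after edit:", needs_upload(tmp, recorded))
```

Because a single digest summarizes the tree, the comparison answers "did anything change" but not "what changed", which fits the whole-tree re-upload described above.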
Why both -c flags together, not separately
Apply site.strat alone against .stratum/host.json and the plan diff is "11 deletes, 2 creates" — the site config doesn't mention any of the bootstrap resources, so build_plan flags them for deletion. The destruction guard catches the resulting apply with a list naming the loaded configs:
refusing to apply: plan would delete 11 resources not in config:
- ...
loaded configs: site.strat
state file: .stratum/host.json
The missing config (bootstrap.strat) is visually obvious in loaded configs. The structural fix is to pass every .strat file that touches the host to every plan / apply. State is per-host. See Multi-file configs.
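The failure mode is easy to reproduce with sets. The sketch below is a simplified model, not stratum's build_plan: it ignores updates entirely, the resource addresses in the demo are made up, and the indentation of the error text is a guess based on the message shown above.

```python
class DestructionRefused(Exception):
    pass


def build_plan(config_addrs, state_addrs):
    """Classify addresses by comparing what config declares with what state holds."""
    config, state = set(config_addrs), set(state_addrs)
    return {
        "create": sorted(config - state),
        "delete": sorted(state - config),  # in state, but no loaded config mentions it
        "no-op": sorted(config & state),
    }


def guard(plan, loaded_configs, state_file, allow_destroy=False):
    """Refuse to apply a plan with deletes unless the caller opted in."""
    if plan["delete"] and not allow_destroy:
        lines = [f"refusing to apply: plan would delete {len(plan['delete'])} resources not in config:"]
        lines += [f"  - {addr}" for addr in plan["delete"]]
        lines.append("loaded configs: " + ", ".join(loaded_configs))
        lines.append(f"state file: {state_file}")
        raise DestructionRefused("\n".join(lines))


if __name__ == "__main__":
    bootstrap = [f"system_package.pkg{i}" for i in range(11)]  # hypothetical addresses
    site = ["docker_container.site", "system_dir.site"]
    plan = build_plan(site, bootstrap)  # site.strat alone against host state
    try:
        guard(plan, ["site.strat"], ".stratum/host.json")
    except DestructionRefused as err:
        print(err)
```

Passing both address lists to build_plan, the equivalent of both -c flags, leaves the delete set empty and the guard stays quiet.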
Or: split into two namespaces
If bootstrap.strat and site.strat are logically independent — bootstrap rarely changes, the site re-deploys often — wrapping them as two namespaces lets each one plan and apply on its own without juggling a long -c list. The shape is:
# stratum.strat
host "primary" {
addr = "root@192.0.2.10"
}
namespace "infra" { configs = ["bootstrap.strat"] }
namespace "site" { configs = ["site.strat"] }
Then stratum -n infra apply -y for the host-tier setup and stratum -n site apply -y for the site, each against its own state file. See the Multi-namespace deployments tutorial for the full walkthrough.
What's next
- Layer additional apps the same way: one system_dir (or system_file for a single config) plus one docker_container with Traefik labels, each in its own .strat file, and add the file to your -c list (or to a new namespace).
- For dynamic apps (build images, manage envs, blue/green rollouts), reach for a per-app deploy tool; stratum is for the host-tier setup, not per-app lifecycle.
Inject a secret into a docker container
You have a docker_container resource that needs a sensitive env var — a database password, an API token, a webhook secret. You want the value to come from your shell environment (or a file outside git), flow through stratum into the container's env map, and never land in .stratum/state.json as plaintext.
This is what secret blocks are for. The pattern is two blocks: one secret block sourcing the value, and one docker_container resource reading secret.<name>.value inside its env map. Stratum substitutes the plaintext at apply time and stores a redaction marker in state.
Why this works the way it does
Anything you put in a .strat file is committed alongside your code. Anything you put in .stratum/state.json is committed (if you commit state) or sits on your disk in plaintext (if you don't). Neither place is somewhere a database password belongs.
Stratum splits the problem: the value lives in your shell (or a file you keep out of git), and the reference lives in the config. State stores a {__secret, __secret_sha256} marker, which is enough to tell that a secret is set and whether it changed, but not what it is. See Secrets for the full mechanism.
Step 1: source the value
Decide where the value comes from. Two options:
# From an env var.
secret "pg_password" {
from_env = "PG_PASSWORD"
}
# Or from a file outside git.
secret "pg_password" {
from_file = "~/.config/stratum/pg-password"
}
from_env reads the variable with std::env::var at config-load time. from_file reads the file; ~ and ~/ expand to $HOME (or $USERPROFILE on Windows). Relative paths resolve next to the .strat file.
Pick one. Setting both, or neither, is a hard error.
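The source rules above fit in one small function. The following Python sketch is a model of the described behavior, not stratum's loader: resolve_secret is a hypothetical name, the error wording is invented, and whether trailing newlines are trimmed from file contents is not covered here, so the sketch returns the contents untouched.

```python
import os
import tempfile
from pathlib import Path


def resolve_secret(name, block, strat_path, env=None):
    """Return the plaintext for a secret block with from_env or from_file."""
    env = os.environ if env is None else env
    has_env, has_file = "from_env" in block, "from_file" in block
    if has_env == has_file:  # both set, or neither set
        raise ValueError(f"secret {name!r}: set exactly one of from_env or from_file")
    if has_env:
        var = block["from_env"]
        if var not in env:
            raise KeyError(f"secret {name!r}: env var {var} is not set")
        return env[var]
    raw = block["from_file"]
    if raw == "~" or raw.startswith("~/"):
        home = env.get("HOME") or env.get("USERPROFILE")
        if not home:
            raise KeyError("neither HOME nor USERPROFILE is set")
        path = Path(home) / raw[2:]
    else:
        path = Path(raw)
        if not path.is_absolute():
            # Relative paths resolve next to the .strat file, not the cwd.
            path = Path(strat_path).parent / path
    return path.read_text()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "pg-password").write_text("from-a-file")
        strat = Path(tmp) / "db.strat"
        print(resolve_secret("pg_password", {"from_file": "pg-password"}, strat))
        print(resolve_secret("pg_password", {"from_env": "PG_PASSWORD"}, strat,
                             env={"PG_PASSWORD": "from-the-env"}))
```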
Step 2: reference it in a container
host "primary" {
addr = "root@192.0.2.10"
}
secret "pg_password" {
from_env = "PG_PASSWORD"
}
resource "docker_container" "db" {
host = host.primary.addr
name = "db"
image = "postgres:16-alpine"
restart = "unless-stopped"
networks = ["stratum-edge"]
ports = ["127.0.0.1:5432:5432"]
volumes = ["pg-data:/var/lib/postgresql/data"]
env = {
POSTGRES_PASSWORD = secret.pg_password.value
POSTGRES_USER = "postgres"
POSTGRES_DB = "app"
}
}
The ref secret.pg_password.value evaluates to the plaintext during ref resolution. The provider — docker here — receives a normal string in env.POSTGRES_PASSWORD and sets -e POSTGRES_PASSWORD=<value> on docker run.
A secret ref is only allowed in single-leaf string attrs. Putting it inside system_file.content (where it would land in a config file blob stratum can't redact) is rejected at load time. See the honesty guard for the full list.
Step 3: plan it
Set the env var, then plan:
export PG_PASSWORD=$(openssl rand -hex 32)
./target/release/stratum plan \
-c db.strat \
-s .stratum/host.json
The secret-bearing field renders with a 6-char hash prefix:
+ docker_container.db
...
~ env.POSTGRES_PASSWORD: null -> <secret:pg_password sha:f7c3bc>
The hash is enough to spot a rotation (the prefix changes) without leaking the value or a full attackable digest. The null -> ... is because this is a Create step; on Update you'd see both sides with their respective hashes.
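As a rough illustration of that rendering, the sketch below builds the display form from a SHA-256 of the plaintext. It assumes the digest is taken over the UTF-8 bytes of the value; secret_display and render_change are hypothetical helper names, not stratum functions.

```python
import hashlib


def secret_display(name, plaintext, prefix_len=6):
    """Render a secret as its name plus a short hash prefix, never the value."""
    digest = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return f"<secret:{name} sha:{digest[:prefix_len]}>"


def render_change(field, name, old, new):
    """One plan line for a secret-bearing field; None stands for an absent value."""
    left = "null" if old is None else secret_display(name, old)
    right = "null" if new is None else secret_display(name, new)
    return f"~ {field}: {left} -> {right}"


if __name__ == "__main__":
    # Create: nothing on the left. Rotation: both sides show differing prefixes.
    print(render_change("env.POSTGRES_PASSWORD", "pg_password", None, "first-value"))
    print(render_change("env.POSTGRES_PASSWORD", "pg_password", "first-value", "second-value"))
```

Six hex characters carry 24 bits, enough to make an accidental match between two rotations unlikely while revealing too little of the digest to verify guesses against with confidence.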
Step 4: apply it
./target/release/stratum apply -y \
-c db.strat \
-s .stratum/host.json
What happens at apply time:
- The provider receives the plaintext in env.POSTGRES_PASSWORD and runs docker run -e POSTGRES_PASSWORD=<value> .... The value lands inside the container's process environment.
- Before stratum persists the provider's returned attrs to state, the redaction walk swaps every leaf string that matches a known plaintext for the marker object. env.POSTGRES_PASSWORD in state ends up as {"__secret": "pg_password", "__secret_sha256": "sha256:f7c3bc..."}.
- state.save writes the file.
Inspect it:
./target/release/stratum state show docker_container.db -p .stratum/host.json
You'll see the marker in env.POSTGRES_PASSWORD, not the password.
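The redaction walk can be modeled as a recursive pass over the attrs. The marker keys below come from the text; everything else (the function names, the plaintext-to-name map, hashing the UTF-8 bytes) is an assumption made for this Python sketch.

```python
import hashlib
import json


def marker(name, plaintext):
    """Build the state-side marker object for one secret value."""
    digest = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return {"__secret": name, "__secret_sha256": "sha256:" + digest}


def redact(value, secrets):
    """Swap every leaf string that exactly matches a known plaintext.

    secrets maps plaintext -> secret name (a stand-in for the redaction map).
    """
    if isinstance(value, dict):
        return {key: redact(item, secrets) for key, item in value.items()}
    if isinstance(value, list):
        return [redact(item, secrets) for item in value]
    if isinstance(value, str) and value in secrets:
        return marker(secrets[value], value)
    return value


if __name__ == "__main__":
    attrs = {
        "name": "db",
        "env": {"POSTGRES_PASSWORD": "hunter2", "POSTGRES_DB": "app"},
    }
    print(json.dumps(redact(attrs, {"hunter2": "pg_password"}), indent=2))
```

Matching on whole leaf values keeps the walk simple; a secret embedded inside a longer string needs the substring handling covered under interpolation.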
Step 5: rotate
Rotate by changing the source value and re-applying:
export PG_PASSWORD=$(openssl rand -hex 32)
./target/release/stratum apply -y -c db.strat -s .stratum/host.json
The new value hashes differently, so diff_observed sees a marker change and emits an Update step on the container. Docker tears down and recreates the container with the new env var. The plan output shows the hash prefix changing:
~ docker_container.db
~ env.POSTGRES_PASSWORD: <secret:pg_password sha:f7c3bc> -> <secret:pg_password sha:9a1e44>
No plaintext on either side of that line, in the CLI output or in state.
Embedding a secret inside a larger string
A connection string is the common case where a bare secret.X.value doesn't fit — the password is one piece of a URL, not the whole leaf. Use ${...} interpolation:
resource "docker_container" "api" {
host = host.primary.addr
image = "api:dev"
networks = ["stratum-edge"]
env = {
DATABASE_URL = "postgresql://app:${secret.pg_password.value}@db:5432/app"
}
}
At eval time the placeholder is replaced with the plaintext, so the provider receives a working connection string. At redaction time the substring redactor swaps the plaintext for an inline marker, so state ends up with:
"DATABASE_URL": "postgresql://app:<secret:pg_password:sha256:f7c3bc...>@db:5432/app"
The drift behavior is the same as for an exact-match secret: the marker on the state side compares equal to the plaintext on the observed side via the hash, so there is no perpetual update.
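A minimal sketch of substring redaction and the matching drift comparison, in Python. The inline marker shape follows the state example above; the helper names and the single-pass regex approach are assumptions, not a description of stratum's redactor.

```python
import hashlib
import re


def inline_marker(name, plaintext):
    """Inline form of the marker, for secrets embedded in a larger string."""
    digest = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return f"<secret:{name}:sha256:{digest}>"


def redact_substrings(text, secrets):
    """Replace every occurrence of a known plaintext inside text.

    secrets maps plaintext -> secret name. Longer plaintexts are tried first,
    and one regex pass avoids re-scanning markers that were just inserted.
    """
    if not secrets:
        return text
    ordered = sorted(secrets, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(p) for p in ordered))
    return pattern.sub(lambda m: inline_marker(secrets[m.group(0)], m.group(0)), text)


def same_after_redaction(state_value, observed_value, secrets):
    """Drift check: redact the observed side, then compare with state."""
    return state_value == redact_substrings(observed_value, secrets)


if __name__ == "__main__":
    secrets = {"s3cr3t": "pg_password"}
    observed = "postgresql://app:s3cr3t@db:5432/app"
    in_state = redact_substrings(observed, secrets)
    print(in_state)
    print("drift-free:", same_after_redaction(in_state, observed, secrets))
```

Redacting the observed value before comparing is one way to get the "compares equal via the hash" behavior: the plaintext never has to be recovered from state.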
Whole-file secrets
For values too big or too binary to fit in a single env var — a Firebase service-account JSON, an age-encrypted key, a TLS bundle — use system_secret_file instead. The kind accepts a secret ref directly in content; state stores only the file's sha256 plus its permissions.
secret "firebase_sa" {
from_file = "~/.config/app/firebase-sa.json"
}
resource "system_secret_file" "firebase-sa" {
host = host.primary.addr
path = "/etc/app/firebase-sa.json"
content = secret.firebase_sa.value
mode = "0400"
}
resource "docker_container" "api" {
host = host.primary.addr
image = "api:dev"
volumes = ["/etc/app/firebase-sa.json:/app/firebase-sa.json:ro"]
env = {
GOOGLE_APPLICATION_CREDENTIALS = "/app/firebase-sa.json"
}
}
The pattern is: drop the file with system_secret_file, mount it into the container as a read-only volume, point the app at it via a non-sensitive env var. State holds sha256 + mode + owner + group for the file — never the bytes. Re-applying with the same contents is a no-op via the sha-match check; rotating the secret changes the sha and triggers a re-upload (and a container recreate if the mount is bind-mounted, which it is here).
What about plan-only review?
If you want to share a config or review one without populating the env, run plan --allow-unresolved-secrets:
./target/release/stratum plan --allow-unresolved-secrets \
-c db.strat \
-s .stratum/host.json
Unset env vars become the placeholder string <unresolved-secret:NAME> and flow through the plan as a normal string. apply refuses to execute any plan containing such a placeholder — the flag is plan-only by design. See plan --allow-unresolved-secrets.
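The plan-only contract can be sketched as two small checks. This Python model assumes NAME in the placeholder is the secret's name (the text does not say whether it is the secret name or the env var name), and the function names and error wording are invented for illustration.

```python
PLACEHOLDER_PREFIX = "<unresolved-secret:"


def resolve_env_secret(name, var, env, allow_unresolved=False):
    """Return the env value, or a placeholder when plan-only review is allowed."""
    if var in env:
        return env[var]
    if allow_unresolved:
        return f"{PLACEHOLDER_PREFIX}{name}>"  # assumption: NAME is the secret name
    raise KeyError(f"secret {name!r}: env var {var} is not set")


def has_placeholder(value):
    """True if any string leaf in a nested attrs value contains a placeholder."""
    if isinstance(value, dict):
        return any(has_placeholder(item) for item in value.values())
    if isinstance(value, list):
        return any(has_placeholder(item) for item in value)
    return isinstance(value, str) and PLACEHOLDER_PREFIX in value


def check_apply(resources):
    """Refuse to apply when any resource still carries a placeholder."""
    bad = sorted(addr for addr, attrs in resources.items() if has_placeholder(attrs))
    if bad:
        raise RuntimeError("refusing to apply: unresolved secrets in " + ", ".join(bad))


if __name__ == "__main__":
    value = resolve_env_secret("pg_password", "PG_PASSWORD", env={}, allow_unresolved=True)
    resources = {"docker_container.db": {"env": {"POSTGRES_PASSWORD": value}}}
    print(value)
    try:
        check_apply(resources)
    except RuntimeError as err:
        print(err)
```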
What's next
- Full reference for the syntax and semantics: Secrets.
- For values you don't mind printing (debug toggles, public keys), add sensitive = false inside the secret block; the value still flows but is never added to the redaction map.
Multi-namespace deployments
You have a single host running several independent slices of infrastructure: a base layer (firewall, docker, traefik), a database tier, and a couple of apps. You want each slice to plan and apply on its own — without forgetting a -c flag and tripping the destruction guard, and without one slice's state file silently owning what another slice declared.
This is what namespaces are for. You write one manifest at ./stratum.strat that declares the shared host(s) and lists each slice by name; each slice gets its own state file; stratum -n <name> plan/apply operates inside one slice at a time. Stratum checks for docker_container port and name collisions across siblings at plan time, so two slices can't quietly fight over :80.
This tutorial walks through:
- Writing a manifest that splits one host into two namespaces.
- Applying each namespace independently.
- Observing the on-disk state split (.stratum/<ns>.json + _shared.json).
- Provoking a cross-namespace port collision and reading the error.
- Migrating from an existing bundle (-c X -c Y -s state.json) to per-namespace state.
If you've never applied a stratum config before, start with Bootstrap a fresh droplet for the basics. The setup here assumes a host already exists.
What you'll end up with
./stratum.strat # manifest (host + namespace blocks)
./infra/edge.strat # namespace "infra" config (traefik, edge network)
./app/web.strat # namespace "app" config (nginx behind traefik)
./app/db.strat # namespace "app" config (postgres)
.stratum/
infra.json # state for namespace "infra"
app.json # state for namespace "app"
_shared.json # implicit _stratum_* tuning resources
Two independent state files for two namespaces, one shared file for the per-host tuning resources stratum injects automatically.
Step 1: write the manifest
The manifest is a plain .strat file. It contains:
- host blocks: visible to every namespace.
- secret blocks: visible to every namespace.
- namespace blocks: one per slice.
It does not contain resource blocks. Those live in the per-namespace configs.
# stratum.strat
host "primary" {
addr = "root@192.0.2.10"
}
namespace "infra" {
configs = ["infra/edge.strat"]
}
namespace "app" {
configs = [
"app/web.strat",
"app/db.strat",
]
}
configs paths are resolved relative to the manifest's directory. The manifest itself is loaded first when a namespace is selected, so anything it declares (the host "primary" block, any secret blocks) is visible to every file under configs.
Step 2: per-namespace configs
# infra/edge.strat
resource "docker_network" "edge" {
host = host.primary.addr
name = "stratum-edge"
}
resource "docker_container" "traefik" {
host = host.primary.addr
name = "traefik"
image = "traefik:v2.11"
restart = "unless-stopped"
ports = ["80:80", "443:443"]
volumes = ["/var/run/docker.sock:/var/run/docker.sock:ro"]
networks = ["stratum-edge"]
}
# app/web.strat
resource "docker_container" "web" {
host = host.primary.addr
name = "web"
image = "nginx:alpine"
restart = "unless-stopped"
networks = ["stratum-edge"]
labels = {
"traefik.enable" = "true"
"traefik.http.routers.web.rule" = "Host(`web.example.com`)"
"traefik.http.routers.web.entrypoints" = "web"
}
}
# app/db.strat
secret "db_password" {
from_env = "DB_PASSWORD"
}
resource "docker_container" "db" {
host = host.primary.addr
name = "db"
image = "postgres:16-alpine"
restart = "unless-stopped"
networks = ["stratum-edge"]
env = {
POSTGRES_PASSWORD = secret.db_password.value
POSTGRES_DB = "app"
}
volumes = ["app-db-data:/var/lib/postgresql/data"]
}
Three details:
- Both app/web.strat and app/db.strat reference host.primary.addr. The host is declared in the manifest, not in either of these files; that's fine because the manifest is always loaded first in namespace mode.
- The secret "db_password" block is scoped to the app namespace. The infra namespace's plan does not load app/db.strat, so DB_PASSWORD does not need to be set when planning infra.
- All three containers attach to stratum-edge. The network is created by the infra namespace; the app namespace just attaches to it. Stratum does not validate that the network exists across namespaces, so apply infra first.
Step 3: plan and apply the infra namespace
stratum -n infra plan
stratum.strat is auto-discovered in the current directory. The -n infra flag resolves to:
- configs: stratum.strat (the manifest) + infra/edge.strat.
- state: .stratum/infra.json (default for -n infra since no state = was set).
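That resolution rule can be written down as a small Python sketch. It is a model only: resolve_namespace is a hypothetical name, the namespaces dict stands in for the parsed manifest, and resolving the default state path relative to the manifest's directory is an assumption.

```python
from pathlib import Path


def resolve_namespace(manifest_path, namespaces, name):
    """Return (config paths, state path) for one namespace.

    namespaces: {name: {"configs": [...], "state": optional path}},
    a stand-in for the namespace blocks parsed out of the manifest.
    """
    manifest = Path(manifest_path)
    ns = namespaces[name]
    # The manifest is loaded first, then each config relative to its directory.
    configs = [manifest] + [manifest.parent / cfg for cfg in ns["configs"]]
    # No explicit state attribute: fall back to .stratum/<name>.json.
    state = Path(ns.get("state") or manifest.parent / ".stratum" / f"{name}.json")
    return [p.as_posix() for p in configs], state.as_posix()


if __name__ == "__main__":
    namespaces = {
        "infra": {"configs": ["infra/edge.strat"]},
        "app": {"configs": ["app/web.strat", "app/db.strat"]},
    }
    for ns_name in namespaces:
        print(ns_name, resolve_namespace("stratum.strat", namespaces, ns_name))
```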
Expected plan: one create for docker_network.edge, one for docker_container.traefik, plus three implicit _stratum_* tuning resources stratum injects per host.
+ _stratum_sshd_oom_primary
+ _stratum_sshd_reload_primary
+ _stratum_swap_primary
+ docker_container.traefik
+ docker_network.edge
5 create, 0 update, 0 delete, 0 no-op
Apply it:
stratum -n infra apply -y
After the apply, inspect the state directory:
ls .stratum/
# _shared.json infra.json
infra.json holds the two user-declared resources (docker_container.traefik, docker_network.edge). _shared.json holds the three _stratum_* resources. The split is by addr name: anything starting with _stratum_ goes to _shared.json, everything else to the namespace's file. This is what lets a second namespace targeting the same host see the tuning resources as already-applied instead of trying to recreate them.
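The prefix rule is simple enough to sketch in a few lines of Python. The function names echo the save_split / load_merged terms used in the architecture notes, but the bodies are a guess at the idea rather than the real code, and representing state as a flat addr-to-attrs dict is an assumption about the file layout.

```python
SHARED_PREFIX = "_stratum_"


def save_split(resources):
    """Partition resources into (namespace file, shared file) by addr prefix."""
    own = {a: r for a, r in resources.items() if not a.startswith(SHARED_PREFIX)}
    shared = {a: r for a, r in resources.items() if a.startswith(SHARED_PREFIX)}
    return own, shared


def load_merged(own, shared):
    """Recombine both files into the single view the planner works with."""
    return {**shared, **own}


if __name__ == "__main__":
    after_infra_apply = {
        "_stratum_swap_primary": {},
        "_stratum_sshd_oom_primary": {},
        "_stratum_sshd_reload_primary": {},
        "docker_network.edge": {"name": "stratum-edge"},
        "docker_container.traefik": {"name": "traefik"},
    }
    infra, shared = save_split(after_infra_apply)
    print(sorted(infra), sorted(shared))
    # A second namespace starts with an empty file of its own but still sees
    # the shared tuning resources, so they plan as no-op rather than create.
    print(sorted(load_merged({}, shared)))
```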
Step 4: plan and apply the app namespace
export DB_PASSWORD=$(openssl rand -hex 32)
stratum -n app plan
The app namespace's plan loads the manifest, then app/web.strat, then app/db.strat. Two creates for the new containers; the three _stratum_* resources show as no-op (they're already in _shared.json from the infra apply).
_stratum_sshd_oom_primary
_stratum_sshd_reload_primary
_stratum_swap_primary
+ docker_container.db
+ docker_container.web
2 create, 0 update, 0 delete, 3 no-op
The three no-ops are the heart of why _shared.json exists. Without it, the app namespace's first apply would try to recreate the swap file and the sshd drop-in, and the second apply of infra would do the same thing in reverse — every cross-namespace apply would churn the tuning resources.
Apply:
stratum -n app apply -y
State is now:
.stratum/
_shared.json # 3 _stratum_* resources
app.json # docker_container.{web,db}
infra.json # docker_container.traefik, docker_network.edge
Each namespace can now plan / apply / be torn down on its own without touching the other.
Step 5: collisions are caught at plan time
Now provoke a port conflict on purpose. Add a ports line to app/web.strat claiming :80:
# app/web.strat — buggy version
resource "docker_container" "web" {
# ...
ports = ["80:80"] # WRONG: traefik already binds :80 on this host
}
stratum -n app plan
The cross-namespace validator runs before any plan output prints:
Error: cross-namespace port conflict on host `root@192.0.2.10`:
- app::docker_container.web (current) wants 80
- infra::docker_container.traefik (sibling) already claims it
Three things to notice:
- The error names both namespaces (app::... for the resource being planned, infra::... for the sibling that already claimed the port).
- The host is named in the prefix. Two namespaces using :80 on different hosts is allowed.
- The validator runs at plan time, before any side effects, so apply doesn't get a chance to fight docker over the port.
Container name collisions are caught the same way:
Error: cross-namespace container name conflict on host `root@192.0.2.10`:
- app::docker_container.db (current) uses name `traefik`
- infra::docker_container.traefik (sibling) already uses it
Resolve the conflict by removing the ports line — the web container is fronted by traefik over the stratum-edge network, so it doesn't need a host port binding.
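A port check of this shape could look like the Python sketch below. It is not stratum's validator: the input structure and function names are hypothetical, the message text copies the error shown above, and the sketch ignores bind addresses (so 127.0.0.1:80 and 0.0.0.0:80 are treated as the same claim).

```python
def host_port(spec):
    """Extract the host-side port from a docker ports entry.

    '80:80' -> '80', '127.0.0.1:5432:5432' -> '5432'. A bare container port
    such as '80' claims no fixed host port, so it returns None.
    """
    parts = spec.split(":")
    return parts[-2] if len(parts) >= 2 else None


def find_port_conflicts(current_ns, namespaces):
    """namespaces: {ns: [{"addr": ..., "host": ..., "ports": [...]}, ...]}"""
    claimed = {}
    for ns, resources in namespaces.items():
        if ns == current_ns:
            continue
        for res in resources:
            for spec in res.get("ports", []):
                port = host_port(spec)
                if port:
                    claimed[(res["host"], port)] = f"{ns}::{res['addr']}"
    errors = []
    for res in namespaces[current_ns]:
        for spec in res.get("ports", []):
            port = host_port(spec)
            sibling = claimed.get((res["host"], port)) if port else None
            if sibling:
                errors.append(
                    f"cross-namespace port conflict on host `{res['host']}`:\n"
                    f"  - {current_ns}::{res['addr']} (current) wants {port}\n"
                    f"  - {sibling} (sibling) already claims it"
                )
    return errors


if __name__ == "__main__":
    host = "root@192.0.2.10"
    namespaces = {
        "infra": [{"addr": "docker_container.traefik", "host": host, "ports": ["80:80", "443:443"]}],
        "app": [{"addr": "docker_container.web", "host": host, "ports": ["80:80"]}],
    }
    for err in find_port_conflicts("app", namespaces):
        print("Error:", err)
```

Keying claims on the (host, port) pair is what makes the same port on two different hosts pass.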
Step 6: cross-namespace depends_on doesn't work — duplicate instead
Suppose you want to add a build step: an ssh_exec resource runs docker build to produce an image, and the docker_container in the app namespace consumes it.
# app/web.strat
resource "docker_container" "web" {
# ...
depends_on = ["ssh_exec.build-web"] # WRONG: declared in another namespace
}
If ssh_exec.build-web lives in some other namespace, the planner can't see it — it only loads the current namespace's resources — and you'll get an undeclared-target error at plan time.
The workaround is to duplicate the producer: move (or copy) the build step into the namespace that consumes it.
# app/web.strat
resource "ssh_exec" "build-web" {
host = host.primary.addr
command = "cd /srv/repos/web && docker build -t web:dev ."
}
resource "docker_container" "web" {
# ...
image = "web:dev"
pull = false
depends_on = ["ssh_exec.build-web"]
}
Now depends_on is local to the namespace, and the planner can topo-sort the build ahead of the container start. If two namespaces share the same git checkout and need to build it for different consumers, declare the ssh_exec once per namespace — the apt-package equivalent of "each apt cache update runs once per host, not once per consumer."
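Both behaviors, the undeclared-target error and the build-before-container ordering, can be modeled with Python's standard graphlib module. This is a sketch of the idea, not stratum's planner; plan_order is a hypothetical name and the error wording is invented.

```python
from graphlib import CycleError, TopologicalSorter


def plan_order(resources):
    """Order one namespace's resources so every dependency comes first.

    resources: {addr: [depends_on addrs]} for the current namespace only.
    """
    for addr, deps in resources.items():
        for dep in deps:
            if dep not in resources:
                # A target declared in another namespace is invisible here.
                raise ValueError(f"{addr}: depends_on target {dep!r} is not declared")
    try:
        return list(TopologicalSorter(resources).static_order())
    except CycleError as err:
        raise ValueError(f"depends_on cycle: {err.args[1]}") from err


if __name__ == "__main__":
    app = {
        "docker_container.web": ["ssh_exec.build-web"],
        "ssh_exec.build-web": [],
    }
    print(plan_order(app))
    try:
        plan_order({"docker_container.web": ["ssh_exec.build-web"]})
    except ValueError as err:
        print("Error:", err)
```

Validating targets before sorting matters because TopologicalSorter would otherwise treat an unknown dependency as an extra node and quietly schedule it.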
Migrating from bundle mode
If you've been running stratum with -c X -c Y -s droplet.json, the path to namespaces is mechanical:
- Move shared declarations into a new stratum.strat. Pull every host and secret block out of the per-config files and into the manifest. Add one namespace "<name>" { configs = [...] } per logical slice.
- Leave the per-slice configs in place. They keep their resource blocks. Any host.<name>.<field> references in them now resolve against the manifest's host declaration; no edits are required, as long as the host name is unchanged.
- Split the bundle state file. Run stratum state show <addr> -p droplet.json for each resource to identify which namespace it belongs to. Hand-write the per-namespace state files by copying entries out of droplet.json. Implicit _stratum_* resources go to _shared.json.
- Verify with plan. For each namespace, run stratum -n <name> plan. Every step should show no-op (0 create, 0 update, 0 delete, N no-op). Anything else means a resource ended up in the wrong file; fix the split.
- Mind the depends_on edges. If any docker_container.depends_on crosses what is now a namespace boundary (the producer is in namespace A, the consumer in namespace B), duplicate the producer into B and edge to the local copy. See Step 6 above.
The bundle workflow keeps working without migration. If you don't need per-slice state files, you can keep using -c X -c Y -s state.json indefinitely — nothing about namespaces is mandatory.
What's next
- Namespaces reference (full attribute table and error catalog).
- -n and --manifest CLI flags (exact flag semantics, including the -s override rule).
- Architecture: split state (how _shared.json is reconciled at load and save time).
Changelog
2026-05-27
- Documented the namespace block shipment. New page language/namespaces.md covers the manifest-only namespace "<n>" { configs = [...] state = "..." } syntax, allowed top-level blocks (manifest vs per-namespace), shared-vs-scoped host/secret/provider visibility, the cross-namespace depends_on limitation, and the cross-namespace port + container-name collision check. New tutorial tutorials/namespaces.md walks through writing a manifest with two namespaces sharing a host, observing the split state (.stratum/<n>.json + .stratum/_shared.json), provoking a port collision, and migrating from bundle mode. Updated cli.md for the new global -n / --namespace and --manifest flags, including the -n + -c mutual-exclusion, the -s override rule, the split-state file pair, and the cross-namespace conflict check. Updated architecture.md with new sections: manifest discovery, cross-namespace validator, split state (save_split / load_merged semantics), implicit per-host _stratum_* resource catalog, and renumbered the plan/apply flow to thread namespace mode through it. Updated language/overview.md for the new top-level block and the four-pass evaluator. Updated language/multi-config.md to point at namespaces as the alternative for multi-slice deployments and generalized "one state per droplet" to "one state per host." Refreshed introduction.md "what works" with namespaces. Site-specific names (vortex, portal, 68.183.228.11, sotheara-say/*, *.nip.io, deployd as a deployed app) were swept from cli.md, tutorials/{bootstrap-droplet,book-serve,inject-secret-into-container}.md, providers/{system,docker,git}.md, language/{secrets,types,interpolation,resources}.md, and the introduction, and replaced with generic web / api / db / app / host.primary.addr / RFC 5737 documentation IPs (192.0.2.10) / example.com.
2026-05-26
- Documented the multi-feature shipment (8 gaps closed across config, core, docker, system, and a new git provider). New page language/interpolation.md covers ${...} string interpolation: grammar, scalar coercion, \${ escape, the honesty guard still applying through templates. New page providers/git.md covers the git_repo kind: branch / tag / SHA dispatch, recreate-on-url-change, commit_sha drift, depth handling. Extended providers/system.md with system_secret_file (whole-file secret kind, sha256-only state, stricter default mode 0400, secret ref directly in content). Extended providers/docker.md with docker_image (build-on-host producer kind, DOCKER_BUILDKIT=1, image_id in state) and four new docker_container attrs: depends_on (planner topo sort + cycle / unknown-ref detection), healthcheck (map lowered to --health-* flags + post-apply readiness wait up to 60s), memory / memory_swap (passthrough to docker run), and list-form command (argv-style with shell-escaping). Updated providers/ssh.md for the new ssh_exec.env map (sorted, shell-quoted, supports secret refs). Updated cli.md for the new global --env-file flag with auto-./.env load (12-factor: process env wins, first-set wins among files) and the new stratum status subcommand (per-host uptime + free + df + docker stats table). Updated architecture.md with planner-side validators (port-conflict, depends_on topo sort), the normalize_for_plan / SECRET_CONTENT_TO_SHA plaintext-leak fix, plan-level redact_plan walk (drops marker-vs-plaintext spurious drift), the post-apply readiness wait, and rewrote the delete-ordering section for forward-topo with reverse-iteration fallback. Refreshed introduction.md "what works" (now four providers; ${...} interpolation, --env-file, status, git_repo, docker_image, system_secret_file, depends_on, healthcheck, planner validators all listed). Extended tutorials/inject-secret-into-container.md with the connection-string-via-${...} pattern and the whole-file-secret-via-system_secret_file pattern. SUMMARY.md adds language/interpolation.md and providers/git.md.
2026-05-24 (latest)
- Documented four shipped features. Added language/secrets.md (full reference for secret blocks: sources, refs, redaction map, marker shape, sensitive/short-value rules, the honesty guard, --allow-unresolved-secrets). Added language/multi-config.md (one-state-per-host rule, cross-file refs, duplicate hard errors, pointer to state merge). Added tutorials/inject-secret-into-container.md (env-var-on-docker_container pattern with rotation; inline code blocks, listings infra not yet present). Rewrote the destruction-guard section in cli.md around the wrong-state-file deletion footgun, updated the error template to include loaded configs + state path. Added state merge to cli.md. Added --allow-unresolved-secrets to plan flag table. Documented docker_container.pull = false in providers/docker.md with the locally-built-image use case. Documented system_dir empty-dir mode (no source_dir) in providers/system.md and made source_dir optional in the attribute table; updated language/types.md to match. Added secret as a top-level block in language/overview.md and added secret as a ref root in language/references.md. Updated tutorials/book-serve.md to teach one-state-per-host via multi -c, not one-state-per-config. Refreshed introduction.md "what works" with all four features. Fixed traefik:v3.1 -> traefik:v2.11 in architecture.md state-file example and added a secret-marker subsection.
2026-05-24 (later)
- Documented system_dir kind (tar+gz upload, manifest sha tracking, delete_extra, 200-file read cap) under providers/system.md. Added source_dir semantics to language/types.md. Added tutorials/book-serve.md walking through the canonical "second app behind Traefik" pattern, including the one-state-file-per-host rule. Documented apply --allow-destroy and the destruction-guard rationale on cli.md; updated the bootstrap teardown step to use it. Refreshed introduction "what works" with system_dir and the destroy guard.
2026-05-24
- Scope pivot to ansible-replacement: coolify provider deleted, app-deployment work moved out of scope. --live flag dropped (apply -y now executes). New system provider documented (system_package, system_service, system_file, system_ufw_rule). Drift detection shipped (stratum plan --refresh, post-apply self-check, Observed / Drift types, one-sided diff_observed). content_file attribute on system_file documented under language reference. Replaced tutorials/slice-1-hello.md with tutorials/bootstrap-droplet.md. Architecture page got a drift section and a delete-ordering note.
2026-05-24 (earlier)
- Backfilled the book from current source: introduction, language reference (overview / hosts / resources / references / types), provider pages (coolify / ssh / docker), CLI reference, architecture, and the Slice 1 tutorial. Doc agent did the writing; source under crates/ is the source of truth.