Architecture
How podspawn hooks into native sshd to provide ephemeral containers without reimplementing SSH
Podspawn is not an SSH server. It is a single Go binary that sshd invokes on demand -- not a daemon, not a service, not a port listener. Every SSH feature works because OpenSSH handles the protocol. Podspawn handles containers.
The native sshd integration model
The entire server-side integration is two lines in /etc/ssh/sshd_config:

```
AuthorizedKeysCommand /usr/local/bin/podspawn auth-keys %u %t %k
AuthorizedKeysCommandUser nobody
```

When someone SSHes in, sshd calls podspawn auth-keys with the username. Podspawn reads the local key store at `/etc/podspawn/keys/<username>`. If the user exists, it returns their public keys wrapped in a command= directive. If not, it returns nothing and sshd falls through to normal ~/.ssh/authorized_keys authentication.
No custom daemon. No port 2222. No TLS termination. No key exchange code.
How AuthorizedKeysCommand works
OpenSSH's AuthorizedKeysCommand is a hook that runs an external program to fetch authorized keys during authentication. sshd substitutes the configured tokens as arguments (here %u for the username, %t and %k for the offered key's type and contents) and reads authorized_keys-format lines from the program's stdout.
The authkeys.Lookup function in internal/authkeys/authkeys.go does the actual work:
- Validates the username (rejects path traversal like `../` or `/`)
- Opens `/etc/podspawn/keys/<username>` -- if the file doesn't exist, returns 0 keys
- For each public key line, wraps it with a forced command directive
- Writes the result to stdout for sshd to consume
Each key line returned looks like this:
```
command="/usr/local/bin/podspawn spawn --user alice",restrict,pty,agent-forwarding,port-forwarding,X11-forwarding ssh-ed25519 AAAA... alice@laptop
```

The `restrict` keyword (OpenSSH 7.4+) disables everything by default. The explicit options after it re-enable only what podspawn needs: PTY allocation, agent forwarding, port forwarding, and X11 forwarding.
If podspawn auth-keys crashes or errors out, sshd simply gets no keys and proceeds with normal authentication. Real system users are never affected. The panic recovery in cmd/auth_keys.go ensures the process always exits cleanly.
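That fail-closed pattern might look like the following sketch; `lookupKeys` and `safeLookup` are hypothetical stand-ins for the real cmd/auth_keys.go code:

```go
package main

import (
	"fmt"
	"os"
)

// lookupKeys stands in for the real key lookup; it panics on bad input
// to exercise the recovery path.
func lookupKeys(user string) []string {
	if user == "" {
		panic("missing username argument")
	}
	return nil // no podspawn-managed keys for this user
}

// safeLookup fails closed: any panic yields zero keys, so sshd falls
// back to normal ~/.ssh/authorized_keys auth instead of breaking logins.
func safeLookup(user string) (keys []string) {
	defer func() {
		if r := recover(); r != nil {
			// Log for the admin on stderr; stdout stays empty for sshd.
			fmt.Fprintln(os.Stderr, "auth-keys recovered:", r)
			keys = nil
		}
	}()
	return lookupKeys(user)
}

func main() {
	for _, k := range safeLookup("") { // panics internally, recovered
		fmt.Println(k)
	}
	// stdout stays empty: sshd sees no keys and proceeds with normal auth
}
```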
The session router
When a key matches, sshd forces podspawn spawn --user <username>. The session router in internal/spawn/spawn.go reads SSH_ORIGINAL_COMMAND to determine what the user is doing:
```
SSH_ORIGINAL_COMMAND=""                    --> interactive shell (PTY)
SSH_ORIGINAL_COMMAND="sftp-server"         --> SFTP subsystem
SSH_ORIGINAL_COMMAND="scp -t /path"        --> scp transfer
SSH_ORIGINAL_COMMAND="rsync --server ..."  --> rsync
SSH_ORIGINAL_COMMAND="anything else"       --> remote command execution
```

The routing logic in `routeSession` is straightforward:
- Empty command: interactive shell with TTY, SIGWINCH resize handling via `ExecIDCallback`
- SFTP detected: exec `/usr/lib/openssh/sftp-server` inside the container
- Everything else: exec `sh -c "<original command>"` inside the container
All paths pipe stdin/stdout/stderr between sshd and docker exec, then propagate the exit code back through os.Exit. Tools that check exit codes (rsync, CI scripts, VS Code) all behave correctly.
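A minimal sketch of that dispatch (simplified; the real routeSession in internal/spawn/spawn.go also wires up TTY handling and I/O, which is omitted here):

```go
package main

import (
	"fmt"
	"strings"
)

// routeSession classifies SSH_ORIGINAL_COMMAND into the three execution
// paths described above. scp and rsync need no special cases: they fall
// through to the generic sh -c path.
func routeSession(orig string) string {
	switch {
	case orig == "":
		return "shell" // interactive PTY session
	case strings.Contains(orig, "sftp-server"):
		return "sftp" // exec the SFTP server inside the container
	default:
		return "exec" // sh -c "<original command>" inside the container
	}
}

func main() {
	fmt.Println(routeSession(""))                          // shell
	fmt.Println(routeSession("/usr/lib/openssh/sftp-server")) // sftp
	fmt.Println(routeSession("rsync --server ."))          // exec
}
```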
End-to-end flow
```
User runs: ssh alice@work.pod
   |
   v
Client ~/.ssh/config matches *.pod
   |
   v
ProxyCommand: podspawn connect alice work.pod 22
   |
   v
Resolves work.pod --> actual server from ~/.podspawn/config.yaml
   |
   v
SSH connection to real server, username "alice"
   |
   v
sshd calls: podspawn auth-keys alice
   |
   +-- alice in /etc/podspawn/keys/alice?
   |     YES --> return keys with command="podspawn spawn --user alice"
   |     NO  --> return nothing, sshd falls through to normal auth
   |
   v
Key matches --> forced command: podspawn spawn --user alice
   |
   v
Session router reads SSH_ORIGINAL_COMMAND, creates/reattaches container
   |
   v
I/O piped, exit code propagated
   |
   v
User exits --> grace period --> container destroyed
```

Why zero SSH protocol code
Every custom SSH server in Go -- including ContainerSSH and anything built on gliderlabs/ssh or golang.org/x/crypto/ssh -- carries the risk of bugs in key exchange, cipher negotiation, and channel handling. CVE-2024-45337 (authentication bypass in golang.org/x/crypto/ssh) demonstrated this concretely.
Podspawn's architecture is immune because it never touches the SSH protocol. These features are handled entirely by sshd with zero code in podspawn:
| Feature | How sshd handles it |
|---|---|
| Port forwarding (-L, -R) | direct-tcpip and tcpip-forward channels, processed before ForceCommand |
| SOCKS proxy (-D) | Dynamic forwarding, native sshd feature |
| Agent forwarding (-A) | sshd creates socket, sets SSH_AUTH_SOCK; podspawn bind-mounts it |
| X11 forwarding | sshd handles protocol, sets DISPLAY; podspawn passes env |
| VS Code Remote SSH | SFTP for file sync + exec channels -- both routed by session router |
| JetBrains Gateway | Same as VS Code -- SFTP + exec |
| tmux/screen | Runs inside the container, works naturally |
The Runtime interface
The Runtime interface in internal/runtime/runtime.go is the abstraction boundary between podspawn and container engines:
```go
type Runtime interface {
	ContainerExists(ctx context.Context, name string) (bool, error)
	CreateContainer(ctx context.Context, opts ContainerOpts) (string, error)
	StartContainer(ctx context.Context, id string) error
	Exec(ctx context.Context, containerID string, opts ExecOpts) (int, error)
	StopContainer(ctx context.Context, id string, timeout time.Duration) error
	RemoveContainer(ctx context.Context, id string) error
	ResizeExec(ctx context.Context, execID string, height, width uint) error
	BuildImage(ctx context.Context, buildCtx io.Reader, tag string) error
	ImageExists(ctx context.Context, ref string) (bool, error)
	CreateNetwork(ctx context.Context, name string) (string, error)
	RemoveNetwork(ctx context.Context, id string) error
	ListContainers(ctx context.Context, labelFilter map[string]string) ([]ContainerInfo, error)
	InspectContainer(ctx context.Context, id string) (*ContainerInfo, error)
}
```

Two implementations exist:
- `DockerRuntime` (`internal/runtime/docker.go`) -- production implementation using the Docker Go SDK (`github.com/docker/docker/client`). Handles image pulling, container lifecycle, exec with TTY/non-TTY multiplexing, and network management.
- `FakeRuntime` (`internal/runtime/fake.go`) -- test double that records all calls for assertion. Thread-safe via `sync.Mutex`. Used by the 96+ unit tests so they run in under 2 seconds without touching Docker.
This separation is the key design decision for testability. The spawn.Session struct accepts any Runtime, so unit tests inject FakeRuntime and integration tests (behind //go:build integration) use DockerRuntime.
The Runtime interface follows the Fly.io Machines pattern: creation (slow -- image pull, rootfs prep) is separated from start (fast -- subsecond). This enables future warm container pools for instant SSH-to-shell times.
Process model
Podspawn's server component is not a long-running daemon. Every SSH session spawns a separate podspawn spawn process, invoked by sshd. Multiple processes hit the same SQLite database simultaneously, coordinated by per-user file locks at /var/lib/podspawn/locks/<username>.lock (see Session Lifecycle for details).
The only optional daemon is podspawn cleanup --daemon, which enforces max lifetimes and expires grace periods. It is not in the critical path -- the system self-heals without it, just with slightly delayed cleanup.
Key paths
| Path | Purpose |
|---|---|
| /etc/podspawn/config.yaml | Server configuration |
| /etc/podspawn/keys/<username> | Per-user SSH public keys |
| /var/lib/podspawn/state.db | SQLite session state (WAL mode) |
| /var/lib/podspawn/locks/ | Per-user flock files |
| ~/.podspawn/config.yaml | Client configuration |
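On the client side, the flow in the end-to-end diagram typically pairs with a ~/.ssh/config entry like this sketch (the podspawn connect argument order is an assumption based on that diagram; %r, %h, and %p are standard OpenSSH tokens for the remote user, host, and port):

```
Host *.pod
    ProxyCommand podspawn connect %r %h %p
```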