Architecture

HyperionDB Vault has three deployable parts plus a reusable core library, all co-located on each node of a pg_replica cluster.

                      ┌─────────────────────────── node (every pg_replica member) ───────────────────────────┐
   clients  ─────────▶│  hyperion-vault-api (axum)                                                            │
   (readers,          │    ├─ IP allowlist guard (reads)        ├─ writer pool ──(target_session_attrs=rw)──┐ │
    admins)           │    ├─ admin-token guard (writes)        └─ reader pool ──(target_session_attrs=any) │ │
                      │    ├─ service: envelope seal/open ──────▶ AWS KMS (GenerateDataKey / Decrypt)        │ │
                      │    └─ rotation worker (claims jobs)                                                  │ │
                      │                                                                                      ▼ │
                      │  PostgreSQL 18 + extension hyperion_vault                                  primary ◀─┘ │
                      │    ├─ schema `vault` (secrets, secret_versions, admin_tokens, rotation_jobs, audit)    │
                      │    ├─ RLS + service-role policies                                                      │
                      │    └─ rotation supervisor bgworker (primary-only: enqueue + NOTIFY)                    │
                      └──────────────────────────────────────────────────────────────────────────────────────┘
                                              │ physical WAL streaming (pg_replica)
                                              ▼  byte-for-byte to all standbys (read-only)

Components

  • hyperion-vault-core — pure Rust, no I/O. Owns the cryptographic envelope format (seal/open over XChaCha20-Poly1305), the KeyWrapper abstraction, the IPv4 allowlist, admin-token generation/fingerprinting, and the rotation policy math. This is where correctness is proven by tests.
  • hyperion_vault extension — schema, constraints, RLS, SQL helper functions (vault.status, vault.enqueue_due_rotations, vault.expire_grace_versions, vault.grant_service_role), and the rotation supervisor background worker.
  • hyperion-vault-api — the data/management plane: HTTP handlers, the two guards, the dual connection pools, the async KMS providers, and the rotation worker that performs re-encryption.

Data model (schema vault)

Table Holds
admin_tokens name, token_sha256 (32-byte fingerprint), revoked_at, last_used_at.
secrets name, kind (manual/automatic), rotation_interval, grace_period, current_version, next_rotation_at.
secret_versions (secret_id, version), kms_key_id, wrapped_dek, nonce, ciphertext, aad, expires_at. Plaintext never stored.
rotation_jobs work queue for due automatic rotations (claim with FOR UPDATE SKIP LOCKED).
audit_log append-only operation record.

A version is the unit of encryption. The current version has expires_at = NULL. When a secret rotates, the previous version’s expires_at is set to now() + grace_period; it remains decryptable and verify-able until then, after which vault.expire_grace_versions() removes it.

Cryptographic envelope

encrypt(plaintext, name, version):
    aad      = name || ":" || version
    dek      = KMS.GenerateDataKey(AES_256)          # plaintext + wrapped form
    nonce    = 24 random bytes
    ct       = XChaCha20Poly1305(dek).seal(nonce, aad, plaintext)
    store { kms_key_id, wrapped_dek, nonce, ct, aad }
    zeroize(dek)

decrypt(row, name, version):
    assert row.aad == name || ":" || version          # version/name binding
    dek = KMS.Decrypt(row.wrapped_dek)
    plaintext = XChaCha20Poly1305(dek).open(row.nonce, row.aad, row.ct)
    zeroize(dek)

The AAD binds each ciphertext to its name and version, so a stored row cannot be replayed under a different secret or version even by someone with table write access.

Request flows

Read (GET /v1/secrets/{name}): IP allowlist guard → reader pool (local node) → fetch current version → KMS Decrypt → AEAD open → return value.

Write (POST/PUT/DELETE, /rotate): admin-token guard → writer pool (read-write → current primary) → transaction { seal new version via KMS, insert, supersede old version, update secret } → commit → replicate.

Verify (POST /v1/secrets/{name}/verify): IP allowlist guard → reader pool → for each currently-valid version (current + within-grace) → decrypt → constant-time compare → return {valid, version}.

Rotation

  1. The extension’s supervisor bgworker runs on the primary only (pg_is_in_recovery() = false). Every scan_interval_secs it calls vault.enqueue_due_rotations() (inserts jobs for automatic secrets whose next_rotation_at <= now() with no open job) and NOTIFY vault_rotation.
  2. The API rotation worker on each node polls/claims jobs from the queue (FOR UPDATE SKIP LOCKED, with stale-claim reclaim). Because the claim is a write, it executes on the primary; SKIP LOCKED makes concurrent workers across nodes safe.
  3. For each job it generates new material, seals a new version, sets the old version’s expires_at = now() + grace_period, advances current_version and next_rotation_at, then completes the job.

See DECISIONS.md for the rationale behind these choices.