# Roadmap to v0.4.0

A concrete, point-release-by-point-release plan from where we are (v0.3.2) to v0.4.0. Each release is small, shippable, and tied to a measurable claim. Each release runs `verify-all.sh` green pre-tag.

The principle: **grow what works carefully**. Each release adds one or two shippable things, retires nothing that's holding the celebrated path up, and leaves the verify suite passing.

## v0.3.2 → next ship (this point release)

The point release that lands the two new docs (roadmap, issue triage) plus the verify suite. Already in flight.

**Exit criteria:**
- `verify-all.sh` green on hub2 ✓
- `docs/INDEX.md` registers `roadmap-next.md`, `issue-triage.md`, and `scripts/verify/` ✓
- `package.json` has a `verify` script wired up to `verify-all.sh`

**Scope:** small. Don't widen this release; it's a hygiene point release, not a feature point release.

## v0.3.3 — Catalog reads scores

**Why:** v0.3.0 ships the orchestrator producing `scores.json`. But the catalog still emits any registered gateway. The resilience claim ("we route around degraded paths") is hollow until the catalog actually filters on score.

**Demand:** When `gateway.composite < threshold` for ≥ T seconds, that gateway disappears from `/v1/catalog`. When the score recovers, the gateway reappears.

**Code shape:**
- `control-plane/lib/catalog.js` — read `scores.json` if present, exclude gateways with composite below threshold
- Threshold: env var `VPN_CATALOG_MIN_COMPOSITE` (default 0.5)
- New verify suite: `verify/catalog-filtering.sh` — synthesize a low-score scenario, confirm gateway disappears
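
The demand includes a dwell time T that the code shape above does not spell out. A minimal sketch of one way to track it, assuming `scores.json` parses to an object keyed by gateway id with a numeric `composite` field, and that the caller keeps a small in-memory map between catalog builds. The function name, the `belowSince` map, and the 30-second default are illustrative assumptions, not existing code:

```javascript
// Hypothetical helper: returns the set of gateway ids that have been below
// `threshold` for at least `dwellMs`. `belowSince` is a Map(id -> timestamp ms)
// owned by the caller and reused across catalog builds.
function updateDegraded(scores, belowSince, nowMs, threshold = 0.5, dwellMs = 30000) {
  const degraded = new Set();
  for (const [id, score] of Object.entries(scores)) {
    if (score.composite < threshold) {
      // Remember when the dip started; keep the earlier timestamp if known.
      if (!belowSince.has(id)) belowSince.set(id, nowMs);
      if (nowMs - belowSince.get(id) >= dwellMs) degraded.add(id);
    } else {
      // Score recovered: forget the dip so the gateway reappears immediately.
      belowSince.delete(id);
    }
  }
  return degraded;
}
```

Keeping the clock (`nowMs`) as a parameter makes the low-score scenario easy to synthesize in `verify/catalog-filtering.sh` without waiting in real time.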

**Exit criteria:**
- Verify suite green
- Field manual §3.10 updated from "open" to "shipped"
- Story E in field manual closed

**Risk:** could mistakenly filter the only healthy gateway during a transient score dip. Mitigation: never filter to zero gateways; if all are below threshold, emit them all anyway and log a warning.
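
The mitigation can be sketched as a fail-open filter. This assumes `gateways` is an array of records with an `id` field and `scores` is the parsed `scores.json` keyed by that id; both shapes, and the function name, are assumptions for illustration:

```javascript
// Sketch only: drop gateways whose composite score is below the threshold,
// but never return an empty catalog.
function filterCatalog(gateways, scores, threshold = 0.5, warn = console.warn) {
  if (!scores) return gateways; // scores.json absent: emit everything
  const healthy = gateways.filter((gw) => {
    const score = scores[gw.id];
    // A gateway with no score yet is kept rather than silently dropped.
    return !score || score.composite >= threshold;
  });
  if (healthy.length === 0 && gateways.length > 0) {
    // Fail open: every gateway is below threshold, so emit them all and log.
    warn(`catalog: all ${gateways.length} gateways below ${threshold}; emitting all`);
    return gateways;
  }
  return healthy;
}

// Threshold from the env var named in the code shape, defaulting to 0.5.
const minComposite = Number(process.env.VPN_CATALOG_MIN_COMPOSITE ?? 0.5);
```

Treating an unscored gateway as healthy is a design choice in this sketch: it avoids hiding a freshly registered gateway before the orchestrator has scored it.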

**Estimated work:** half a day.

## v0.3.4 — QR code in the wizard

**Why:** Story F (border traversal under time pressure). iOS WireGuard's "Create from QR code" is the cleanest import path on phones; we render `.conf` text with a Download button today.

**Demand:** After enrollment, the wizard renders a QR code containing the `.conf` payload. Tappable, large, captionless.

**Code shape:**
- `scripts/site/render-get-flow.js` — add a `<canvas>` element after enrollment success
- Inline ~10KB QR encoder (Project Nayuki's pure-JS or `qrcode-svg`)
- Encode the generated config as a QR code and draw it onto the canvas

**Exit criteria:**
- Verify suite green
- Manual: scan from iOS WireGuard, confirm tunnel imports

**Risk:** the QR for a full `.conf` is large (~600 modules). On small viewports, scannability suffers. Test on real iPhone viewport sizes.
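
To make the viewport risk concrete, here is a minimal sketch of sizing and drawing the symbol. It assumes the inlined encoder exposes a side length in modules and a per-module lookup; `moduleScale`, `drawQr`, and `isDark` are hypothetical names, and only the standard canvas 2D calls `fillStyle` and `fillRect` are used:

```javascript
// Largest whole-pixel module size that fits the viewport, including the
// 4-module quiet zone that QR symbols need on every side.
function moduleScale(viewportPx, modulesPerSide, quietZone = 4) {
  const total = modulesPerSide + 2 * quietZone;
  return Math.floor(viewportPx / total);
}

// Draws a module matrix onto a 2D canvas context and returns the side length
// in pixels. `isDark(x, y)` is assumed to come from the encoder.
function drawQr(ctx, modulesPerSide, isDark, scale, quietZone = 4) {
  const sidePx = (modulesPerSide + 2 * quietZone) * scale;
  ctx.fillStyle = "#fff";
  ctx.fillRect(0, 0, sidePx, sidePx); // white background, including quiet zone
  ctx.fillStyle = "#000";
  for (let y = 0; y < modulesPerSide; y++) {
    for (let x = 0; x < modulesPerSide; x++) {
      if (isDark(x, y)) {
        ctx.fillRect((x + quietZone) * scale, (y + quietZone) * scale, scale, scale);
      }
    }
  }
  return sidePx;
}
```

As an illustration, a 69-module symbol on a 320 px wide viewport gets `moduleScale(320, 69) === 4`, that is 4 px per module. If the scale drops to 1 or 2, the symbol is likely too dense to scan and the wizard could fall back to the Download button.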

**Estimated work:** half a day.

## v0.3.5 — AmneziaWG (T1) frontdoor

**Why:** Story A (friend on a hostile network blocking UDP/51820). Single highest-ROI ship for hostile-network friends. T0 is dead in many real networks; T1 (AmneziaWG: junk-packet preamble + magic-header rewrite) defeats off-the-shelf DPI.

**Demand:** A second listening port on hub2 running AmneziaWG. Catalog advertises both T0 (port 51820) and T1 (port TBD) as alternative frontdoors on the same gateway. Wizard offers a "try a heavier transport" button if T0 doesn't handshake within ~10s.

**Code shape:**
- Compile/install AmneziaWG userspace fork on hub2 (`amneziawg-go` or kernel module, depending on availability)
- Second `wg-quick@wgT1` unit on a different port, sharing the gateway keypair
- `state.json` gateway record gets a second `frontdoor` entry with `transport: "amneziawg"`
- Wizard's Step 4 polls handshake status; on timeout offers T1 swap
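
A rough sketch of the last two items, under stated assumptions. Only `transport: "amneziawg"` comes from the plan above; the other field names in the record, the `"wireguard"` value, and the `waitForHandshake` helper with its injected `checkHandshake` and `sleep` functions are hypothetical:

```javascript
// Assumed shape of a gateway record carrying two frontdoors.
const gatewayRecord = {
  id: "hub2",
  frontdoors: [
    { transport: "wireguard", port: 51820 }, // T0
    { transport: "amneziawg", port: null },  // T1, port still TBD
  ],
};

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Polls until a handshake is seen or the T0 budget (~10s) runs out, then
// tells the wizard to offer the heavier transport.
async function waitForHandshake(
  checkHandshake,
  { timeoutMs = 10000, intervalMs = 1000, sleep = defaultSleep } = {}
) {
  let waited = 0;
  while (waited < timeoutMs) {
    if (await checkHandshake()) return { status: "connected" };
    await sleep(intervalMs);
    waited += intervalMs;
  }
  // One last check after the final sleep before giving up on T0.
  if (await checkHandshake()) return { status: "connected" };
  return { status: "offer-t1" };
}
```

On `offer-t1` the wizard would show the "try a heavier transport" button rather than switching automatically, since T1 needs a different client app.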

**Exit criteria:**
- Verify suite green
- Manual: from a network blocking UDP/51820, T1 frontdoor handshakes
- Field manual §3.7 updated

**Risk:** AmneziaWG's clients are not as ubiquitous as standard WireGuard. The wizard must clearly say "this requires Amnezia VPN client, not stock WireGuard."

**Estimated work:** 1 day setup + 1 day field-testing.

## v0.3.6 — Captive portal navigator

**Why:** Coffee-shop wifi. The wizard's first network call returns HTML (the captive portal interstitial) where JSON was expected. Today this fails opaquely.

**Demand:** When a wizard request returns non-JSON, surface "this network requires you to click through a portal — open `http://example.com` first."

**Code shape:** ~30 lines in `scripts/site/render-get-flow.js`. Detect via response content-type. Surface a banner with retry.
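
A minimal sketch of the content-type check, assuming the wizard uses `fetch`-style responses. The function names and the returned object shape are illustrative, not existing code:

```javascript
// Returns true when a response that should be JSON is something else,
// which on public wifi usually means a captive-portal interstitial.
function looksLikeCaptivePortal(response) {
  const type = (response.headers.get("content-type") || "").toLowerCase();
  return !type.includes("json");
}

// Wraps a wizard request: either parsed JSON or a hint for the banner.
// `fetchImpl` is injectable so the check can be exercised without a network.
async function fetchJsonOrPortal(url, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  if (looksLikeCaptivePortal(response)) {
    return {
      portal: true,
      hint: "This network requires you to click through a portal. Open http://example.com first, then retry.",
    };
  }
  return { portal: false, data: await response.json() };
}
```

Checking the header before calling `response.json()` avoids the opaque parse error that the wizard surfaces today.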

**Exit criteria:** field demo against a real captive portal.

**Estimated work:** 2 hours.

## v0.4.0 — DigitalOcean live integration + state split

**Why:** v0.3 has been stretching `coordination-layer.md`'s claims by adding stubs (Cloudflare, Tailscale). v0.4 puts the second live API integration online. Validates the abstraction holds across two real providers, not just one. Concurrently, the `state.json` split (premortem §1) keeps the JSON store viable past the device count where we are now.

**Demand A (DigitalOcean live):** With `DO_TOKEN` set, the orchestrator can `createIngressNode` on DigitalOcean and have it appear in the catalog. End-to-end provisioning, no human in the loop beyond setting the token.
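
For orientation, a sketch of the HTTP request that a `createIngressNode` implementation might issue. `POST /v2/droplets` with a bearer token is DigitalOcean's droplet-creation endpoint; the helper name and the `spec` object are assumptions, and the region, size, and image slugs are left to the caller rather than chosen here:

```javascript
// Builds the request for creating one droplet. Kept pure (no network call)
// so the adapter can be unit-tested without a token.
function buildCreateDropletRequest(token, spec) {
  if (!token) throw new Error("DO_TOKEN is not set");
  return {
    url: "https://api.digitalocean.com/v2/droplets",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        name: spec.name,
        region: spec.region, // caller-supplied region slug
        size: spec.size,     // caller-supplied size slug
        image: spec.image,   // caller-supplied image slug
        tags: spec.tags ?? [],
      }),
    },
  };
}
```

The adapter would pass `url` and `init` to `fetch`, then register the new node with the catalog once the droplet reports an address.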

**Demand B (state split):** `state.json` becomes:
- `inventory.json` — gateways, egress pools, access tiers (slow, durable)
- `devices/<id>.json` — one file per device (frequent, hot)
- `events.log` — append-only newline-delimited audit log (rotates via logrotate)
- `signing/` — catalog signing key, mode 600, never alongside backup-able state
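
A minimal sketch of the two hot-path writes under this layout, assuming Node's built-in `fs` module and a device record with an `id` field. The helper names, the id pattern, and the event shape are illustrative, not the current `store-json.js` API:

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Writes one device record to devices/<id>.json via temp file + rename,
// so a concurrent reader never sees a half-written file.
function writeDevice(root, device) {
  // Reject ids that could escape the devices/ directory.
  if (!/^[A-Za-z0-9_-]+$/.test(device.id)) throw new Error(`bad device id: ${device.id}`);
  const dir = path.join(root, "devices");
  fs.mkdirSync(dir, { recursive: true });
  const target = path.join(dir, `${device.id}.json`);
  const tmp = `${target}.${process.pid}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(device, null, 2));
  fs.renameSync(tmp, target);
  return target;
}

// Appends one audit event to events.log as a single JSON line.
function appendEvent(root, event) {
  const line = JSON.stringify({ at: new Date().toISOString(), ...event }) + "\n";
  fs.appendFileSync(path.join(root, "events.log"), line);
}
```

With one file per device, a write for one device no longer rewrites every other record, which is the main point of the split.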

**Why both in one release:** they constrain each other. The orchestrator under load will write more state more often; doing that against a single state.json invites the worst Phase-0 forecast.

**Exit criteria:**
- Verify suite green (with new tests for state-split integrity)
- DigitalOcean adapter creates a real droplet under `DO_TOKEN` policy gates
- Field manual §3.2 (resilient identity / state) and §3.9 (provider polyculture) both move "open" → "shipped"
- Premortem §1 (JSON store growth) marked addressed

**Risk:** state split is invasive. Every `store-json.js` operation touches it. We'd cut a feature freeze branch first.

**Estimated work:** 1 week.

## What's deliberately NOT in this roadmap

- **Tailscale overlay live integration.** Adapter is stubbed; we're not committing to a release until v0.4 lands the cleaner state model. Otherwise we'd be writing tailnet membership records into a `state.json` we're about to retire.
- **Cloudflare cdn_front live integration.** Same reasoning. Plus, requires careful UX (the wizard would need to surface "this connection rides Cloudflare's network" honestly).
- **Stateless admission tickets** (premortem alternative architecture). Genuinely competitive design, but a different product. Not on this v0.x trajectory.
- **Multi-control-plane.** Premature until we have multi-gateway.
- **Phone-to-phone mesh.** Open research, not engineering. Stays in `field-manual.md` §5 brainstorm.

## How releases interact with the verify suite

Every point release runs `npm run verify` (alias for `bash scripts/verify/verify-all.sh`) before tagging. As of v0.3.2, the `damm-orchestrator.timer` already includes an hourly verify hook.

When a verify suite fails:
1. **Don't tag.** The release is held until green.
2. **Triage the failure.** It's either a real regression or verify-suite drift (the suite itself became wrong). Both are bugs.
3. **The celebrated-path check is non-negotiable.** A failure there blocks any other work.

## How this roadmap interacts with the doc set

Per `docs/INDEX.md` rule 1: every doc has a "Last verified" date. When this roadmap ships a feature, the dates on the relevant docs (`field-manual.md`, `architecture-map.md`, `coordination-layer.md`) get updated on the same commit. If we land a feature without updating the dates, that's a drift marker waiting to bite us.

Per `INDEX.md` rule 5: the field manual is the lookup-first doc. Every closed item here gets reflected in `field-manual.md` §3 (demand status: open → shipped) and §7.5 (what just shipped).

## The single grounding metric

After every release, the bedrock fact (field-manual §0) must remain true. The celebrated peer's accumulated tx must be at least as high as it was pre-release. **That's the only metric that matters across releases.** Everything else can be re-checked in the next verify run; the bedrock is permanent and proves the system has done its job at least once.

The roadmap does not add a single feature whose deployment risks the bedrock. If a release would, the release is split.
