# DAMM Architecture

## Purpose

This document describes the actual system shape DAMM is converging on.

It is not meant to be inspirational. It is meant to answer:

- what runs where
- which component is allowed to know what
- how a device gets from zero to connected
- how routing and failover decisions are represented
- where the current implementation is still intentionally thin

## System Thesis

DAMM should be a disciplined WireGuard VPN system rather than a clever tunnel experiment.

That means:

- use WireGuard as the transport primitive
- keep private keys at the edge that owns them
- keep routing and policy intelligence in the control plane
- keep the packet-forwarding path comparatively boring
- make state, evidence, and operator decisions legible

## Top-Level Components

### 1. Device Client

The device client currently exists as:

- a CLI that handles key generation, enrollment, failover, profile bundling, and diagnostics
- a browser-side DAMM companion page for profile inspection and install guidance
- native WireGuard or OS networking integration on the target platform

Responsibilities:

- generate and retain the device private key
- submit only the public key to the control plane
- write the local WireGuard config
- verify signed catalogs before failover
- store local device state and diagnostics

Non-responsibilities:

- acting as a policy authority
- trusting unsigned endpoint changes
- storing server-side secrets

### 2. Control Plane

The control plane is the authority for:

- enrollment
- access-tier policy
- tunnel address allocation
- gateway and egress selection
- signed catalog publication
- admin audit
- operator report generation
- device lifecycle actions such as revoke and key rotation

It should not be a data-plane hot path.

### 3. Gateway

The gateway is the WireGuard session terminator.

Responsibilities:

- hold the gateway private key
- terminate tunnels
- forward traffic
- report heartbeat and usage

Non-responsibilities:

- inventing policy locally
- silently rewriting client routing decisions
- becoming a shadow control plane

### 4. Egress Pool

An egress pool is the outbound identity surface.

Responsibilities:

- represent outbound IP inventory
- advertise supported exit countries
- expose load and headroom characteristics

Why it exists separately:

- ingress reachability and egress reputation are different problems
- inbound IP churn should not force outbound churn
- load and placement decisions need separate levers

### 5. Store

The store persists:

- gateways
- devices
- access tiers
- usage
- audit events
- catalog signing material

Current implementation:

- JSON backend for the main validated path
- Postgres backend scaffolded for durable-state hardening
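
As a rough illustration of the JSON-backed path, the sketch below shows one way a store interface and a file-backed implementation could look. The `Store` protocol, the `JsonStore` class, and the single-document layout are assumptions made for illustration, not the repo's actual API. The write goes through a temporary file and `os.replace` so a crash mid-write does not leave a truncated state file.

```python
import json
import os
import tempfile
from typing import Protocol


class Store(Protocol):
    """Hypothetical interface a JSON or Postgres backend could both satisfy."""

    def load(self) -> dict: ...
    def save(self, state: dict) -> None: ...


class JsonStore:
    """File-backed store that keeps all state in one JSON document."""

    def __init__(self, path: str) -> None:
        self.path = path

    def load(self) -> dict:
        # A missing file is treated as empty state.
        if not os.path.exists(self.path):
            return {}
        with open(self.path, "r", encoding="utf-8") as fh:
            return json.load(fh)

    def save(self, state: dict) -> None:
        # Write to a temp file in the same directory, then swap it into place.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(state, fh, indent=2, sort_keys=True)
        os.replace(tmp_path, self.path)
```

A single-file store like this does not coordinate concurrent writers, which is consistent with durable multi-process state being listed as unsolved.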

### 6. Orchestrator

The orchestrator is the fleet-shaping layer.

Responsibilities:

- plan ingress and egress additions
- reconcile desired and current topology
- apply provider-specific actions
- enforce provider policy guardrails

Non-responsibilities:

- making live device-level failover decisions, which belong to the client
- serving as a packet relay
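
Reconciling desired and current topology can be pictured as a set difference. This is a minimal sketch under the assumption that topology is identified by gateway ids; the `Plan` type and `reconcile` function are hypothetical names, and provider-specific actions and guardrails are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    to_add: frozenset[str]
    to_remove: frozenset[str]


def reconcile(desired: set[str], current: set[str]) -> Plan:
    """Diff the desired topology against the current one."""
    return Plan(
        to_add=frozenset(desired - current),
        to_remove=frozenset(current - desired),
    )


plan = reconcile(
    desired={"gw-eu-1", "gw-us-1"},
    current={"gw-us-1", "gw-ap-1"},
)
```

Here `plan.to_add` holds `gw-eu-1` and `plan.to_remove` holds `gw-ap-1`. A real orchestrator would pass the plan through provider policy checks before applying anything.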

## Trust Boundaries

These are the boundaries that should remain rigid.

### Device Key Boundary

- device private keys stay on the device
- the control plane should only see the device public key
- reissue flows should preserve that boundary

### Gateway Key Boundary

- gateway private keys stay on the gateway
- the control plane stores gateway public keys and identity metadata only

### Control Plane Authority Boundary

The control plane is authoritative for:

- signed catalogs
- enrollment state
- access tier policy
- operator reporting

But it is not authoritative for:

- gateway health in the absence of heartbeat evidence
- raw packet forwarding success

### Browser Boundary

The browser companion may:

- inspect profiles
- store local profile copies
- guide installation
- export diagnostics

The browser companion may not:

- represent itself as the VPN tunnel
- own routing at the OS level
- substitute for a native WireGuard client on iPhone or macOS

## Core Data Model

### Device

A device record includes:

- device identity and name
- public key
- assigned tunnel IP
- current gateway attachment
- egress attachment
- access tier
- usage totals
- lifecycle state such as active, revoked, or suspended
- catalog trust metadata

### Gateway

A gateway record includes:

- gateway identity and region
- provider and status metadata
- gateway public key
- front-door inventory
- registration state
- last heartbeat and load metrics
- tunnel address metadata

### Front Door

A front door is a reachable ingress endpoint associated with a gateway.

It should carry:

- endpoint string
- transport label
- priority
- active/inactive state
- gateway association

### Egress Pool

An egress pool includes:

- provider and region
- supported exit countries
- IP inventory
- load score
- headroom score

### Access Tier

An access tier includes:

- allowed regions
- allowed exit countries
- session limit
- byte limit
- lifecycle status
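
The records above could be expressed as plain data classes. The sketch below covers three of them; every field name and type is an assumption chosen to mirror the bullet lists, not the repo's actual schema. Gateway and egress pool records would follow the same pattern.

```python
from dataclasses import dataclass
from enum import Enum


class DeviceState(Enum):
    ACTIVE = "active"
    REVOKED = "revoked"
    SUSPENDED = "suspended"


@dataclass
class FrontDoor:
    endpoint: str      # reachable ingress endpoint, e.g. "host:port"
    transport: str     # transport label
    priority: int      # assumption: lower value means preferred
    active: bool
    gateway_id: str


@dataclass
class AccessTier:
    name: str
    allowed_regions: list[str]
    allowed_exit_countries: list[str]
    session_limit: int
    byte_limit: int
    status: str = "active"   # lifecycle status; the value set is assumed


@dataclass
class Device:
    device_id: str
    name: str
    public_key: str          # only the public key is stored server-side
    tunnel_ip: str
    gateway_id: str
    egress_pool_id: str
    access_tier: str
    bytes_used: int = 0
    state: DeviceState = DeviceState.ACTIVE
```

Note that `Device` has no private-key field at all, which keeps the device key boundary visible in the type.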

## Current Network Model

The current validated reference model is:

- overlay subnet: `10.44.0.0/24`
- gateway tunnel address default: `10.44.0.1/24`
- client addresses: allocated from `.10` upward
- default client allowed IPs: `0.0.0.0/0, ::/0`
- signed catalogs scoped by region and filtered to healthy registered gateways

This is a product skeleton, not the final routing topology.
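
Client address allocation in this reference model can be sketched with the standard `ipaddress` module. The function name and the "lowest free address wins" strategy are assumptions; only the subnet, the gateway default, and the `.10` starting point come from the model above.

```python
import ipaddress

OVERLAY = ipaddress.ip_network("10.44.0.0/24")
FIRST_CLIENT_HOST = 10  # client addresses start at .10


def allocate_client_ip(in_use: set[str]) -> str:
    """Return the lowest free client address in the overlay subnet."""
    base = int(OVERLAY.network_address)
    broadcast = int(OVERLAY.broadcast_address)
    # Walk from .10 up to, but not including, the broadcast address.
    for value in range(base + FIRST_CLIENT_HOST, broadcast):
        candidate = str(ipaddress.ip_address(value))
        if candidate not in in_use:
            return candidate
    raise RuntimeError("overlay subnet exhausted")
```

With a `/24` and a `.10` start, this leaves room for 245 client addresses (`.10` through `.254`), which is one concrete reason the model is a skeleton rather than a final topology.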

## Control Flow

### Enrollment

1. The device generates a WireGuard keypair locally.
2. The client submits the public key and policy hints.
3. The control plane validates the enrollment token.
4. The control plane resolves:
   - access tier
   - allowed region / exit-country policy
   - healthy ingress gateway
   - primary front door
   - egress pool
   - tunnel IP
5. The control plane returns peer material plus catalog trust metadata.
6. The device writes the local config.
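
Steps 2 and 6 can be sketched from the device side. The request and response field names below are hypothetical, and the key strings are placeholders. The config keys (`PrivateKey`, `Address`, `PublicKey`, `Endpoint`, `AllowedIPs`) are the ones used by standard WireGuard `wg-quick` configuration files.

```python
def build_enrollment_request(
    public_key: str, token: str, region: str, exit_country: str
) -> dict:
    """Build the enrollment payload. The private key is never included."""
    return {
        "public_key": public_key,
        "enrollment_token": token,
        "hints": {"region": region, "exit_country": exit_country},
    }


def render_wireguard_config(private_key: str, response: dict) -> str:
    """Combine the locally held private key with peer material from the control plane."""
    lines = [
        "[Interface]",
        f"PrivateKey = {private_key}",
        f"Address = {response['tunnel_ip']}",
        "",
        "[Peer]",
        f"PublicKey = {response['gateway_public_key']}",
        f"Endpoint = {response['front_door']}",
        f"AllowedIPs = {response['allowed_ips']}",
    ]
    return "\n".join(lines) + "\n"


request = build_enrollment_request("<device-public-key>", "<enrollment-token>", "eu", "DE")
response = {  # assumed response shape, for illustration only
    "tunnel_ip": "10.44.0.10/32",
    "gateway_public_key": "<gateway-public-key>",
    "front_door": "gw.example.net:51820",
    "allowed_ips": "0.0.0.0/0, ::/0",
}
config = render_wireguard_config("<device-private-key>", response)
```

The private key only appears as an argument to the local rendering step, never in the request.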

### Gateway Admission

1. A gateway registers with a gateway-scoped token.
2. The gateway heartbeats with current metrics.
3. The control plane marks it usable only while heartbeat freshness holds.
4. Signed catalogs and new enrollments only consider healthy registered gateways.
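
The admission rule reduces to "registered, and heard from recently". A minimal sketch follows, assuming a 30-second freshness window; the window, the `GatewayState` type, and the function names are all illustrative.

```python
from dataclasses import dataclass
from typing import Optional

HEARTBEAT_TTL_SECONDS = 30.0  # assumed freshness window


@dataclass
class GatewayState:
    gateway_id: str
    registered: bool = False
    last_heartbeat: Optional[float] = None


def record_heartbeat(gateway: GatewayState, now: float) -> None:
    """Accept a heartbeat only from a gateway that has already registered."""
    if not gateway.registered:
        raise PermissionError("heartbeat before registration")
    gateway.last_heartbeat = now


def is_usable(gateway: GatewayState, now: float) -> bool:
    """A gateway is usable only while registered and its heartbeat is fresh."""
    return (
        gateway.registered
        and gateway.last_heartbeat is not None
        and (now - gateway.last_heartbeat) <= HEARTBEAT_TTL_SECONDS
    )
```

Passing `now` in explicitly keeps the freshness check deterministic in tests and makes the decision easy to explain after the fact.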

### Client Failover

1. The client fetches the signed catalog.
2. The client verifies signature and freshness.
3. The client chooses the next front door for the same gateway.
4. The client rewrites local config.
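
The failover steps can be sketched as one function. Signature verification is passed in as a callable because the signing scheme is not specified here. The catalog layout, the 300-second maximum age, and the reading of "next" as "best-priority active front door other than the current one" are assumptions.

```python
from typing import Callable


class FailoverError(Exception):
    """Raised when failover cannot proceed safely."""


def choose_next_front_door(
    catalog: dict,
    signature: bytes,
    verify: Callable[[dict, bytes], bool],
    current_endpoint: str,
    gateway_id: str,
    now: float,
    max_age_seconds: float = 300.0,
) -> str:
    # Step 2: reject catalogs that are unsigned, tampered with, or stale.
    if not verify(catalog, signature):
        raise FailoverError("catalog signature invalid")
    if now - catalog["issued_at"] > max_age_seconds:
        raise FailoverError("stale catalog")

    # Step 3: stay on the same gateway, skip the endpoint that just failed.
    candidates = sorted(
        (
            fd
            for fd in catalog["front_doors"]
            if fd["gateway_id"] == gateway_id
            and fd["active"]
            and fd["endpoint"] != current_endpoint
        ),
        key=lambda fd: fd["priority"],  # assumption: lower value is preferred
    )
    if not candidates:
        raise FailoverError("no alternate front door")
    return candidates[0]["endpoint"]
```

The returned endpoint is what the client would write into its local config in step 4. Each failure raises a distinct message, so a failed failover stays diagnosable.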

### Metering

1. The gateway reports per-device usage.
2. The control plane accumulates usage on the device record.
3. Over-quota devices can be suspended.
4. Report output surfaces that state to operators.
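
A minimal sketch of steps 2 and 3, assuming devices are plain dictionaries and quota is a per-tier byte limit. The function name, the record layout, and the choice to suspend automatically are illustrative; the flow above only says over-quota devices can be suspended.

```python
def apply_usage_report(devices: dict, byte_limits: dict, report: dict) -> list:
    """Add reported bytes to each device and suspend devices that exceed quota.

    Returns the ids of devices suspended by this report.
    """
    newly_suspended = []
    for device_id, reported_bytes in report.items():
        device = devices.get(device_id)
        if device is None:
            continue  # unknown device: ignored in this sketch
        device["bytes_used"] += reported_bytes
        limit = byte_limits[device["access_tier"]]
        if device["state"] == "active" and device["bytes_used"] > limit:
            device["state"] = "suspended"
            newly_suspended.append(device_id)
    return newly_suspended


devices = {
    "dev-1": {"access_tier": "basic", "bytes_used": 900, "state": "active"},
    "dev-2": {"access_tier": "basic", "bytes_used": 100, "state": "active"},
}
suspended = apply_usage_report(devices, {"basic": 1000}, {"dev-1": 200, "dev-2": 50})
```

After this report, `dev-1` has used 1100 bytes against a 1000-byte limit and is suspended, while `dev-2` stays active. This only changes control-plane state; as noted elsewhere in this document, packet-plane enforcement tied to metering is not yet solved.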

## Routing And Placement Logic

Routing decisions should stay explainable.

Current inputs:

- requested region
- requested exit country
- gateway heartbeat freshness
- gateway load score
- egress load score
- front-door priority

Current outputs:

- chosen gateway
- chosen front door
- chosen egress pool
- placement recommendations in the operator report

What is still intentionally missing:

- live latency-aware route ranking
- long-horizon failure memory
- autonomous gateway-side route adaptation
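
One way to keep placement explainable is to return the reasons alongside the choice. The sketch below handles only gateway selection, assuming a lower load score is better and a 30-second heartbeat window; the `Candidate` type, the function name, and both assumptions are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    gateway_id: str
    region: str
    heartbeat_age: float  # seconds since last heartbeat
    load_score: float     # assumption: lower means less loaded


def choose_gateway(
    candidates: list[Candidate], region: str, max_heartbeat_age: float = 30.0
) -> tuple[Optional[str], list[str]]:
    """Pick the least-loaded fresh gateway in the region and explain why."""
    reasons: list[str] = []
    eligible: list[Candidate] = []
    for c in candidates:
        if c.region != region:
            reasons.append(f"{c.gateway_id}: skipped, region {c.region} is not {region}")
        elif c.heartbeat_age > max_heartbeat_age:
            reasons.append(f"{c.gateway_id}: skipped, heartbeat is stale")
        else:
            eligible.append(c)
    if not eligible:
        reasons.append("no healthy gateways")
        return None, reasons
    best = min(eligible, key=lambda c: c.load_score)
    reasons.append(f"{best.gateway_id}: chosen, lowest load score among eligible")
    return best.gateway_id, reasons
```

The reasons list is the kind of evidence that could feed placement recommendations in the operator report.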

## Packaging And UX Boundary

The correct current packaging split is:

- native WireGuard or OS networking owns the tunnel
- DAMM bundle owns the profile and install instructions
- DAMM client companion owns browser-side inspection and support

This is less flashy than a fake browser-native VPN and more correct.

## Failure Model

DAMM should explicitly handle:

- no healthy gateways
- stale catalog
- revoked device
- over-quota device
- heartbeat before registration
- oversized request bodies
- no alternate front door on failover

It should fail with deliberate, meaningful responses rather than generic internal errors.
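
One way to make those responses deliberate is a fixed table from failure condition to response. The error codes, HTTP status codes, and messages below are assumptions for illustration; only the list of conditions comes from this section.

```python
# Maps each handled failure to an (assumed) HTTP status and operator-readable message.
FAILURE_RESPONSES = {
    "no_healthy_gateways": (503, "no healthy registered gateway is available"),
    "stale_catalog": (409, "catalog is older than the allowed freshness window"),
    "revoked_device": (403, "device has been revoked"),
    "over_quota_device": (403, "device has exceeded its byte limit"),
    "heartbeat_before_registration": (409, "gateway must register before sending heartbeats"),
    "oversized_request_body": (413, "request body exceeds the allowed size"),
    "no_alternate_front_door": (404, "no alternate front door exists for this gateway"),
}


def failure_response(code: str) -> dict:
    """Build a structured error body for a known failure condition."""
    if code not in FAILURE_RESPONSES:
        # Unknown codes are a programming error, not something to mask as a 500.
        raise ValueError(f"unhandled failure code: {code}")
    status, message = FAILURE_RESPONSES[code]
    return {"status": status, "error": code, "message": message}
```

Keeping the table in one place makes it easy to check that every condition in the list has an explicit response.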

## What This Repo Does Not Yet Solve

- full durable multi-process state
- full live provider bootstrap into registered gateways
- long-run gateway churn under sustained load
- packet-plane enforcement tied directly to metering state
- polished native first-party clients
- globally robust behavior under severe connectivity collapse

## Operational Direction

The next meaningful architectural improvements are:

1. move the validated runtime toward Postgres-backed durable state
2. feed real gateway metrics into placement instead of synthetic scores
3. turn remote gateway creation into a full register-and-serve flow
4. keep the browser companion thin and honest
5. expand workload validation beyond tracer bullets and point-to-point host tunnels
