Mesh Architecture

The Rafka v2 mesh is the substrate every other layer rides on. Every node — gateway, broker, compute, registry, bridge — is a single binary that:

  1. Holds an iroh-net Endpoint keyed by its NodeId (Ed25519 public key)
  2. Discovers peers via mDNS (LAN) and DERP relays (WAN via iroh-relay)
  3. Maintains one QUIC connection per peer pair under a single ALPN
  4. Multiplexes two distinct planes over that connection

The two planes

| Plane | Mechanism | Reliability | What rides it |
| --- | --- | --- | --- |
| Control | iroh-gossip (HyParView for membership + Plumtree for broadcast) | Internal control messages reliable-ordered; application-level state digests broadcast lossily | Membership churn (join/leave), small state digests (CPU load, peer counts, mesh_id), auth deltas, routing-table updates |
| Data | connection.open_bi() (bidirectional QUIC streams) | Reliable + ordered per stream; independent flow control across streams; shared congestion control across the connection | Heavy request/response payloads (Kafka ops, batch fetches, large auth pushes), anything that exceeds the safe gossip MTU or requires acknowledgement |

These ride the same QUIC connection, not separate connections. One ALPN = one TLS handshake per peer pair = one congestion-control context shared between control and data.

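Opening a new stream on an established connection is effectively free: it is a local stream-ID allocation with no network handshake, because QUIC pre-grants stream quota via MAX_STREAMS frames. The opener only has to wait if it has already used up the peer's advertised stream limit.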
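As a rough illustration of the "What rides it" column, the sketch below picks a plane for an outgoing message. The names `Plane`, `choose_plane`, and `GOSSIP_MTU_CEILING` are hypothetical and not taken from the Rafka crates; the 1200-byte value is an assumption borrowed from the approximate datagram ceiling quoted in the pointer-gossip section of this page.

```rust
/// Assumed safe gossip payload ceiling in bytes (hypothetical constant;
/// the page quotes roughly 1200 bytes for the datagram MTU ceiling).
pub const GOSSIP_MTU_CEILING: usize = 1200;

/// The two planes multiplexed over the single per-peer QUIC connection.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Plane {
    /// Broadcast over gossip.
    Control,
    /// Sent on a bidirectional QUIC stream.
    Data,
}

/// Pick a plane for a message of `encoded_len` bytes.
/// Anything that needs an acknowledgement or exceeds the gossip ceiling
/// goes to a stream; everything else can ride gossip.
pub fn choose_plane(encoded_len: usize, needs_ack: bool) -> Plane {
    if needs_ack || encoded_len > GOSSIP_MTU_CEILING {
        Plane::Data
    } else {
        Plane::Control
    }
}
```

Under these assumptions a 64-byte load digest stays on gossip, while a 64 KiB batch fetch or any acknowledged request goes to a stream.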

Stream wire grammar

Every QUIC stream (bi or uni) follows the same per-stream framing:

stream = tag(u8) length(unsigned-varint) payload(postcard) [EOF]
  • tag — 1 byte routing the stream to a handler. Lookup table below.
  • length — LEB128 unsigned varint, byte length of the payload that follows.
  • payload — postcard-encoded value of the type the tag's handler expects.
  • [EOF] — single-use streams. Sender writes the frame and calls finish(); receiver reads exactly length bytes, deserializes, drops the stream.
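The grammar above can be sketched with the standard library alone. This is a minimal illustration, not the rafka-mesh-ops framer: `encode_frame`, `decode_frame`, and `FrameError` are hypothetical names, the payload is treated as opaque bytes that are assumed to be postcard-encoded already, and the decoder assumes the whole stream has been read up to EOF.

```rust
/// Error cases for decoding one frame from a fully received stream.
#[derive(Debug, PartialEq)]
pub enum FrameError {
    /// The buffer ended before the tag, length, or payload was complete.
    Truncated,
    /// The length varint does not fit in a u64.
    VarintOverflow,
    /// Bytes remain after the payload; a single-use stream carries one frame.
    TrailingBytes,
}

/// Append `value` as an unsigned LEB128 varint (7 data bits per byte, low bits first).
pub fn write_varint(mut value: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80); // continuation bit: more bytes follow
    }
}

/// Read an unsigned LEB128 varint; returns the value and the bytes consumed.
pub fn read_varint(buf: &[u8]) -> Result<(u64, usize), FrameError> {
    let mut value: u64 = 0;
    for (i, &byte) in buf.iter().enumerate() {
        let bits = (byte & 0x7f) as u64;
        // A u64 needs at most 10 bytes, and the 10th may only carry 1 bit.
        if i > 9 || (i == 9 && bits > 1) {
            return Err(FrameError::VarintOverflow);
        }
        value |= bits << (7 * i);
        if byte & 0x80 == 0 {
            return Ok((value, i + 1));
        }
    }
    Err(FrameError::Truncated)
}

/// Build `tag(u8) length(varint) payload` for one single-use stream.
pub fn encode_frame(tag: u8, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(1 + 10 + payload.len());
    out.push(tag);
    write_varint(payload.len() as u64, &mut out);
    out.extend_from_slice(payload);
    out
}

/// Split a complete stream into its tag and payload bytes.
pub fn decode_frame(stream: &[u8]) -> Result<(u8, &[u8]), FrameError> {
    let (&tag, rest) = stream.split_first().ok_or(FrameError::Truncated)?;
    let (len, used) = read_varint(rest)?;
    let body = &rest[used..];
    if (body.len() as u64) < len {
        return Err(FrameError::Truncated);
    }
    if (body.len() as u64) > len {
        return Err(FrameError::TrailingBytes);
    }
    Ok((tag, body))
}
```

For example, `encode_frame(0x10, b"hi")` yields the four bytes `10 02 68 69`, and a 300-byte payload gets the two-byte length prefix `ac 02`. A real receiver would hand the returned payload slice to the postcard deserializer for the type that the tag's handler expects.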

Serialization choice: postcard

  • postcard for mesh frames and control messages: LEB128 varints keep frames compact, and pure serde keeps domain types clean.
  • rkyv reserved for WAL records read many times (future storage layer).
  • serde_json for human-read external data only (REST APIs, audit logs). Never the internal wire format.

Tag namespace

| Range | Class | Use |
| --- | --- | --- |
| 0x00 | RESERVED | Sentinel / null-detection. Never assigned. |
| 0x01–0x0F | Control plane | Pointer-gossip pulls (IHAVE/IWANT), auth-state pushes, control-plane request/response |
| 0x10–0x7F | Data plane | 0x10 = Ping/Pong/Hello (substrate). 0x11+ reserved for KafkaOp, compute, batch fetches |
| 0x80–0xFF | Extensions | Vendor / future / experimental. Drop on unknown tag with unsupported_tag span. |

The KafkaOp carrier currently uses KAFKA_OP_PRODUCE = 0x02, which sits in the control-plane range. It will move into the data-plane range (0x11+) when the full tag namespace ships in Phase 1.2.
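The range table maps directly onto an exhaustive match over the tag byte. The sketch below is illustrative only: `TagClass` and `classify_tag` are hypothetical names, and the 1-byte tag demux itself is listed as Phase 1.2 work rather than shipped code.

```rust
/// Tag classes from the namespace table.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum TagClass {
    /// 0x00: sentinel, never assigned to a handler.
    Reserved,
    /// 0x01 to 0x0F: control-plane streams.
    Control,
    /// 0x10 to 0x7F: data-plane streams.
    Data,
    /// 0x80 to 0xFF: vendor, future, or experimental tags.
    Extension,
}

/// Tag values quoted on this page.
pub const TAG_POINTER_PULL: u8 = 0x01;
pub const KAFKA_OP_PRODUCE: u8 = 0x02; // control-plane range today
pub const TAG_PING_PONG_HELLO: u8 = 0x10;

/// Classify a stream's first byte. The match covers every u8 value,
/// so the compiler rejects any gap or overlap between ranges.
pub fn classify_tag(tag: u8) -> TagClass {
    match tag {
        0x00 => TagClass::Reserved,
        0x01..=0x0f => TagClass::Control,
        0x10..=0x7f => TagClass::Data,
        0x80..=0xff => TagClass::Extension,
    }
}
```

A demux built on this would typically reject `Reserved` outright and drop unknown `Extension` tags, recording the unsupported_tag span mentioned in the table.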

Pointer gossip (oversize control deltas)

For control-plane deltas exceeding the QUIC datagram MTU ceiling (~1200 bytes), Rafka uses Pointer Gossip — the application-level name for Plumtree's native IHAVE/IWANT lazy-push pattern:

  1. Source computes hash(payload), caches it locally, and broadcasts a tiny {hash, size, source_node_id} pointer datagram over gossip.
  2. Receivers see the pointer. Cache hit → done. Cache miss → open a bidirectional QUIC stream (tag 0x01) to source_node_id and request the hash; the source responds with the payload on the same stream.
  3. Receiver decodes, caches, processes.

Sub-millisecond datagram dissemination for routing; reliable stream fallback for the actual transfer. No fragmentation at the gossip layer.
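The three steps can be sketched as a small payload cache. This is a minimal model under stated assumptions, not the Phase 2 implementation: `PayloadCache`, `Pointer`, `PointerAction`, and `content_hash` are hypothetical names, the network is left out entirely, and the standard library's `DefaultHasher` stands in for whatever content hash the real pattern uses (a production system would want a cryptographic hash).

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::Hasher;

/// 32-byte node identity (an Ed25519 public key on this page).
pub type NodeId = [u8; 32];

/// The small {hash, size, source_node_id} record broadcast over gossip.
#[derive(Debug, Clone, PartialEq)]
pub struct Pointer {
    pub hash: u64,
    pub size: u64,
    pub source_node_id: NodeId,
}

/// What a receiver should do after seeing a pointer.
#[derive(Debug, PartialEq)]
pub enum PointerAction {
    AlreadyCached,
    Pull { from: NodeId, hash: u64 },
}

/// Stand-in content hash (assumption: not the hash Rafka uses).
pub fn content_hash(payload: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(payload);
    hasher.finish()
}

#[derive(Default)]
pub struct PayloadCache {
    entries: HashMap<u64, Vec<u8>>,
}

impl PayloadCache {
    /// Step 1: cache the payload locally and build the pointer to broadcast.
    pub fn announce(&mut self, me: NodeId, payload: Vec<u8>) -> Pointer {
        let pointer = Pointer {
            hash: content_hash(&payload),
            size: payload.len() as u64,
            source_node_id: me,
        };
        self.entries.insert(pointer.hash, payload);
        pointer
    }

    /// Step 2: a cache hit needs no work; a miss means pulling from the source.
    pub fn on_pointer(&self, pointer: &Pointer) -> PointerAction {
        if self.entries.contains_key(&pointer.hash) {
            PointerAction::AlreadyCached
        } else {
            PointerAction::Pull { from: pointer.source_node_id, hash: pointer.hash }
        }
    }

    /// Source side of the pull: look up the payload for a requested hash.
    pub fn serve(&self, hash: u64) -> Option<&[u8]> {
        self.entries.get(&hash).map(|v| v.as_slice())
    }

    /// Step 3: verify the pulled payload against the pointer, then cache it.
    pub fn accept(&mut self, pointer: &Pointer, payload: Vec<u8>) -> bool {
        if payload.len() as u64 != pointer.size || content_hash(&payload) != pointer.hash {
            return false;
        }
        self.entries.insert(pointer.hash, payload);
        true
    }
}
```

Checking size and hash in `accept` is a design choice of this sketch: it lets a receiver discard a truncated or mismatched pull before processing it.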

QoS properties

  • Flow control is per-stream. A stalled heavy consumer backpressures the producer on that stream only. A tiny pointer-pull on another stream of the same connection is unaffected.
  • Congestion control is per-connection. Network saturation throttles the shared QUIC connection as a whole, so control and data streams share path bandwidth rather than competing the way separate connections (one per ALPN) would.

What is built today

| Layer | Status | Notes |
| --- | --- | --- |
| iroh-net Endpoint per node | Live | crates/rafka-mesh-transport |
| NodeId-keyed peer registry | Live | crates/rafka-node-base |
| mDNS peer discovery | Live | iroh built-in |
| Single QUIC connection per peer pair | Live | iroh manages |
| Single ALPN (rafka-mesh-v1) | Live | |
| Ping/Pong/Hello frames over QUIC streams | Live | rafka-mesh-ops |
| W3C trace-context in frame header | Live | 32-byte header; cross-process span propagation |
| postcard wire codec | Live | commit 24a19ee |
| KafkaOp correlated RPC carrier | Live | commit 40ba255; Produce op verified |
| 1-byte tag stream demux | Phase 1.2 | |
| Property-tested framer crate | Phase 1.1 | rafka-mesh-ops::framer |
| iroh-gossip wiring | Phase 1.3 | currently bootstrapped over mDNS |
| Pointer Gossip pattern | Phase 2 | needs 0x01 handler + payload cache |

Internal references

Architecture decisions, sprint configs, and implementation notes live in the repo but are not published to this site. Key references:

  • docs/eng/rafka-golden-principles.md — engineering principles (#2 broker design, #7 per-message observability, #11 serialization, #12 election)
  • docs/plans/mesh-v1/06-decisions.md — D-027: locks iroh-gossip + backpressure tests
  • docs/features/frame-exchange/ — Ping/Pong/Hello implementation detail
  • docs/features/peer-discovery/ — mDNS + DERP discovery