
Data Flow

How solution data moves through a per-solution Orbit stack: the agent as the single writer over the solution repositories, the read-only webapp fetching the solution tree over HTTP, and live refresh via server-sent events. This is the architecture introduced by the 2026-07-02 agent-served-solution-data refactor — before it, the webapp read the solution repo from its own mount; after it, the agent is the only process that touches the filesystem.

Single writer, many readers

The governing principle is that exactly one process owns the solution repo mounts: maxq-orbit-agent. It is the sole reader and sole writer of the customer repo (mounted at /repo) and the internal repo (/repo-internal). The orbit-webapp has no repo mount at all — it obtains solution data exclusively over HTTP from the agent.

Why this split:

  • One source of truth for “current”. The agent’s task pipeline mutates the working tree (branch per request, commit per task, merge on success — see the write path). If readers walked the same filesystem independently, they could observe half-written artefacts or a mid-checkout tree. Instead the agent loads the tree into memory at well-defined moments and serves immutable snapshots.
  • No filesystem coupling for readers. On Azure the reader app needs no Azure Files share and no storage link — just an ORBIT_AGENT_URL. The webapp can scale out (0–3 replicas) while the agent stays pinned at a single replica, which is load-bearing: the in-memory model and the event bus are per-process state.
  • Deliberately no file watcher. Bind mounts over SMB/Azure Files drop inotify events unreliably, so freshness is driven by explicit reload hooks in the write pipeline rather than by watching the disk (see the out-of-band edit gotcha).

The shared loader package implementation/shared/trajectory-loader/ (@maxq/trajectory-loader) defines both halves of the contract: the loadSolutionTree(root) walker the agent runs against the solution root, and the RawSolutionTree types the webapp’s transformer consumes. Both codebases depend on it as a file: dependency.

The in-memory model: SolutionModel

The agent holds the whole solution tree in memory (maxq-orbit-agent/codebase/src/solution/SolutionModel.ts). Each successful (re)load produces an immutable SolutionSnapshot:

| Field | Meaning |
| --- | --- |
| version | Monotonic per-process counter, starts at 1. Bumped on every successful reload. |
| loadedAt | ISO timestamp of the load. |
| headSha | Customer-repo git HEAD at load time (observability only, may be null). |
| tree | The RawSolutionTree produced by loadSolutionTree. |
| json | The /solution/tree response body, { version, instance, tree }, serialized once per version so the multi-MB payload is not re-stringified per request. |

Alongside the snapshot, the model carries a per-boot instanceId (randomUUID() at construction). This exists because version restarts at 1 whenever the agent restarts — so cache identity is always the (instance, version) pair, never the bare version. A consumer comparing versions alone would happily keep serving a stale cache across an agent restart (new boot, version 1 again, “nothing changed”).
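A minimal sketch of why the pair matters. The SnapshotKey interface and both helper names are hypothetical and exist only for illustration; the real types live in the agent and loader packages.

```typescript
// Hypothetical shape: only the two fields that form the cache identity.
interface SnapshotKey {
  instance: string; // per-boot id (the agent generates it with randomUUID())
  version: number; // per-process counter, restarts at 1 on every boot
}

// Correct identity check: both halves of the pair must match.
function sameSnapshot(a: SnapshotKey | null, b: SnapshotKey): boolean {
  return a !== null && a.instance === b.instance && a.version === b.version;
}

// The tempting but wrong check: the bare version only.
function sameVersionOnly(a: SnapshotKey | null, b: SnapshotKey): boolean {
  return a !== null && a.version === b.version;
}

const cached: SnapshotKey = { instance: "boot-a", version: 1 };
const afterRestart: SnapshotKey = { instance: "boot-b", version: 1 };

console.log(sameVersionOnly(cached, afterRestart)); // true: stale cache would be kept
console.log(sameSnapshot(cached, afterRestart)); // false: cache is rebuilt
```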

Reload semantics:

  • Serialized and coalescing. One reload runs at a time; any reload requested while one is in flight collapses into exactly one queued follow-up (last reason wins). Callers on the task pipeline fire-and-forget.
  • Failed reloads keep the previous snapshot. Consumers never see a half-loaded tree; the error is logged and the old version keeps serving.
  • Startup is eager but non-blocking. init() fires the first load in the background so the HTTP server comes up promptly. Until that load lands, the data endpoints answer 503 { loading: true } while the load is retried with backoff (5 s doubling to a 60 s cap).
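The serialized, coalescing behaviour described in the list above can be sketched roughly as follows. CoalescingReloader and its Loader callback are hypothetical names, not the actual SolutionModel API; the sketch only shows the queueing discipline (one run at a time, one queued follow-up, last reason wins, failures swallowed so the previous state stays in place).

```typescript
// Hypothetical loader callback: performs one full load for the given reason.
type Loader = (reason: string) => Promise<void>;

class CoalescingReloader {
  private running = false;
  private queuedReason: string | null = null;

  constructor(private readonly load: Loader) {}

  // Fire-and-forget: callers never await the reload.
  reload(reason: string): void {
    if (this.running) {
      // Collapse into the single queued follow-up; the last reason wins.
      this.queuedReason = reason;
      return;
    }
    void this.run(reason);
  }

  private takeQueued(): string | null {
    const reason = this.queuedReason;
    this.queuedReason = null;
    return reason;
  }

  private async run(first: string): Promise<void> {
    this.running = true;
    let reason: string | null = first;
    while (reason !== null) {
      try {
        await this.load(reason);
      } catch (err) {
        // A failed reload is only logged; whatever was loaded before stays in place.
        console.error(`reload (${reason}) failed`, err);
      }
      reason = this.takeQueued();
    }
    this.running = false;
  }
}
```

With this shape, three reload calls issued while the first load is still in flight result in exactly two loads: the first one and a single follow-up carrying the last reason.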

Every successful load publishes a solution.updated event on the agent’s event bus — the trigger for the live-refresh channel.

What the loader produces

loadSolutionTree walks the Trajectory v1.8 solution root and returns a RawSolutionTree bundling everything the reader renders: solution, charter, memory, glossary, components (with their operations, stores, dependencies, integrations, auth, permissions), domains, findings, assessments + assessmentsActivation, concerns, attestations, workflows, entities, questions, sourceOrientations, technologiesActivation, domainCounts, touchpointsByDomain, and a markdown map of raw markdown files by path. The YAML reader is deliberately tolerant: duplicate map keys parse with last-value-wins, and a genuinely broken file is logged and treated as absent so one bad artefact degrades a single view instead of crashing the whole page.

The agent’s solution-data API

On Azure the agent is reachable only from inside the ACA environment (internal ingress since 2026-07-03 — its callers, the reader’s server side and Mission Control, address it as http://agent-<h>; no edge secret or bypass header is involved). Defined in maxq-orbit-agent/codebase/src/routes/solution.ts, routes/sources.ts, and routes/events.ts.

| Method | Path | Purpose | Response (high level) |
| --- | --- | --- | --- |
| GET | /solution | Identity summary for Mission Control. | { id, shortName?, name?, customer? }, from the in-memory model; falls back to a direct solution.yaml read while the first load is still in flight. |
| GET | /solution/version | Cheap freshness probe; the webapp calls this once per page render. | { version, instance, loadedAt, headSha }; 503 { loading: true } until the first load. |
| GET | /solution/tree | The whole-tree payload (multi-MB). | { version, instance, tree }; gzip via hono/compress; ETag: "<instance>:<version>" with If-None-Match → 304. |
| POST | /solution/reload | Manual reload escape hatch (out-of-band edits). | 202 { version } (the pre-reload version); the reload itself is async. |
| GET | /sources/tree | Source-repositories explorer: directory listing. | { root, entries }; 500 { error } on failure. |
| GET | /sources/file?path= | Source-repositories explorer: file content (1 MB cap, extension allowlist, path-escape guards). | File-content result; 400 on missing param / path escape, 404 when not found. |
| GET | /events/subscribe | The SSE event stream (all agent events, including solution.updated). | text/event-stream; supports reconnect-replay via Last-Event-ID header or ?since=; 15 s ping heartbeat. |

ETag weakening gotcha: hono/compress weakens the outgoing ETag to W/"instance:version", so clients echo the weak form back. The agent compares If-None-Match weak-insensitively (stripping the W/ prefix) — a 304 asserts equivalence of content, not of encoding.
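A weak-insensitive comparison along these lines might look like the sketch below. The function names are hypothetical, and handling a comma-separated If-None-Match list is standard HTTP behaviour rather than something this document confirms about the agent's code.

```typescript
// Drop an optional weak-validator prefix ("W/") and surrounding whitespace.
function stripWeak(tag: string): string {
  const trimmed = tag.trim();
  return trimmed.startsWith("W/") ? trimmed.slice(2) : trimmed;
}

// Hypothetical helper: does the If-None-Match header name the current snapshot?
function etagMatches(
  ifNoneMatch: string | undefined,
  instance: string,
  version: number,
): boolean {
  if (!ifNoneMatch) return false;
  const current = `"${instance}:${version}"`;
  // The header may carry several validators separated by commas.
  return ifNoneMatch
    .split(",")
    .some((candidate) => stripWeak(candidate) === current);
}

console.log(etagMatches('W/"abc:3"', "abc", 3)); // true: weak form still matches
console.log(etagMatches('"abc:2"', "abc", 3)); // false: older version
```

A route handler would answer 304 with an empty body when this returns true, and otherwise send the pre-serialized payload with a fresh ETag.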

The webapp’s /api/sources/tree and /api/sources/file routes are thin, transparent proxies to the agent’s /sources/* endpoints (status codes forwarded as-is; 502 when the agent itself is unreachable).

Read path: initial page load

The webapp side lives in orbit-webapp/codebase/src/lib/solution/:

  • agent-source.ts — the server-only HTTP client. ORBIT_AGENT_URL is required; on Azure it carries the agent’s in-env address (http://agent-<h> — the agent has internal ingress and no public URL). A 503 from the agent surfaces as a typed AgentLoadingError. The browser never talks to the agent — everything goes through the webapp’s server side.
  • index.ts → getOrbitData(): the per-request entry point every page render goes through.

The webapp’s caching model

getOrbitData() keeps one module-level cache per webapp replica:

  • One cheap version check per page render. Every request calls GET /solution/version; the expensive pipeline (tree fetch → transform → MDX render) runs only when (instance, version) moved.
  • Keyed by (instance, version). If the agent restarted (new instance), the cache is invalid even though the version number may look equal or lower. Within one instance the cache never regresses: a slow rebuild that finishes after a newer one must not overwrite it.
  • Coalesced rebuilds. Concurrent page loads that all observe the same new version share one in-flight rebuild promise instead of each fetching and MDX-rendering the whole tree.
  • Availability over freshness once warm. If the version check or a rebuild fails and a cached model exists, the webapp logs a warning and serves the cached model. Only with no cache at all does the error propagate to the error page.
  • Per-replica-correct. Each webapp replica rebuilds independently and converges on the same model, because the key comes from the agent — no cross-replica coordination is needed.
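The caching rules above can be condensed into a rough sketch. createModelCache, checkVersion, and rebuild are hypothetical stand-ins (for the version probe and for the tree fetch, transform, and MDX render pipeline); this is not the actual getOrbitData() implementation.

```typescript
interface VersionInfo {
  instance: string;
  version: number;
}

interface CachedModel<T> extends VersionInfo {
  model: T;
}

// checkVersion stands in for GET /solution/version; rebuild for the expensive pipeline.
function createModelCache<T>(
  checkVersion: () => Promise<VersionInfo>,
  rebuild: (v: VersionInfo) => Promise<T>,
): () => Promise<T> {
  let cached: CachedModel<T> | null = null;
  const inFlight = new Map<string, Promise<T>>();

  return async function getModel(): Promise<T> {
    let v: VersionInfo;
    try {
      v = await checkVersion();
    } catch (err) {
      // Availability over freshness: serve the warm cache if there is one.
      if (cached) {
        console.warn("version check failed, serving cached model", err);
        return cached.model;
      }
      throw err;
    }

    if (cached && cached.instance === v.instance && cached.version === v.version) {
      return cached.model;
    }

    // Coalesce: all callers that observe the same key share one rebuild promise.
    const key = `${v.instance}:${v.version}`;
    let pending = inFlight.get(key);
    if (!pending) {
      pending = rebuild(v)
        .then((model) => {
          // Never regress within one instance: a slow, older rebuild must not win.
          const stale =
            cached !== null &&
            cached.instance === v.instance &&
            cached.version > v.version;
          if (!stale) cached = { ...v, model };
          return model;
        })
        .finally(() => inFlight.delete(key));
      inFlight.set(key, pending);
    }

    try {
      return await pending;
    } catch (err) {
      if (cached) {
        console.warn("rebuild failed, serving cached model", err);
        return cached.model;
      }
      throw err;
    }
  };
}
```

Because the key comes entirely from the agent's answer, each replica can run its own copy of this cache and still converge on the same model.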

The rebuild itself is the expensive step: transformToOrbitData reshapes the raw tree into the view model, then every prose field (descriptions, notes, recommendations, charter, packaging docs, …) is MDX-compiled into *Rendered mirrors in roughly fifteen parallel passes over the model.

The write path: how mutations land

Solution data changes when the agent’s TaskProcessor — the single, process-wide, strictly serial consumer of the task queue — runs worker tasks (maxq-orbit-agent/codebase/src/orchestrator/TaskProcessor.ts):

  1. Branch per request. Before a request’s first task runs, the processor checks out (or creates from base) a branch named after the request id. Serial execution is what makes this safe: only one request branch is ever checked out in the shared working tree at a time.
  2. Commit per task. When a task finishes (success or failure), its changes under workspace/ are committed to the request branch and the branch is pushed. The .orbit/ audit trail is committed separately to the internal repo on its linear main.
  3. Reload per commit. Each task commit fires a fire-and-forget SolutionModel.reload("task-commit") — so the reader shows live mid-request progress, not just the final merged state.
  4. Merge on success. Once all of a request’s tasks have completed, the branch is merged back into base and the model reloads with reason request-merged, carrying the list of changed files. If any task failed (or the merge conflicts), the branch is left unmerged for inspection, the working tree returns to base, and the model reloads with reason request-failed, so the in-progress edits the reader was showing correctly disappear from view again.

Every reload reason the model can broadcast:

| Reason | When |
| --- | --- |
| startup | The eager first load at agent boot. |
| task-commit | A worker task committed to the request branch (live mid-request progress). |
| request-merged | The request branch merged into base; the event carries changedFiles. |
| request-failed | Tasks failed or the merge conflicted; the tree flipped back to base. |
| recovery | Unconditional reload after the post-crash recovery checkout. |
| manual | Someone called POST /solution/reload. |
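For consumers, the reasons can be modelled as a closed union, as in the sketch below. The type names and the parser are hypothetical. The field list follows the solution.updated shape given in the live-refresh section, but the field types (in particular ts) are assumptions.

```typescript
const RELOAD_REASONS = [
  "startup",
  "task-commit",
  "request-merged",
  "request-failed",
  "recovery",
  "manual",
] as const;

type ReloadReason = (typeof RELOAD_REASONS)[number];

// Field types are assumed; only the field names come from the document.
interface SolutionUpdatedEvent {
  type: "solution.updated";
  version: number;
  reason: ReloadReason;
  requestId?: string;
  changedFiles?: string[];
  ts: string | number;
}

function isReloadReason(value: unknown): value is ReloadReason {
  return (
    typeof value === "string" &&
    (RELOAD_REASONS as readonly string[]).includes(value)
  );
}

// Parse one SSE data payload; returns null for anything that is not a valid event.
function parseSolutionUpdated(data: string): SolutionUpdatedEvent | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(data);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const event = parsed as Record<string, unknown>;
  if (event.type !== "solution.updated") return null;
  if (typeof event.version !== "number") return null;
  if (!isReloadReason(event.reason)) return null;
  return parsed as SolutionUpdatedEvent;
}
```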

Live refresh: SSE

Freshness reaches the browser through three hops:

  1. Agent → event bus → SSE. Every successful reload publishes solution.updated with shape { type, version, reason, requestId?, changedFiles?, ts } on the agent’s event bus, which /events/subscribe streams to subscribers (with reconnect-replay from a ring buffer via Last-Event-ID).
  2. Webapp SSE proxy — /api/orbit-events. The browser cannot reach the agent, so the webapp exposes a proxy route (orbit-webapp/codebase/src/app/api/orbit-events/route.ts) that opens /events/subscribe upstream (keeping the agent’s in-env URL server-side) and forwards only solution.updated frames. It sends its own 15 s heartbeat toward the browser (the agent’s pings are filtered out), passes the browser’s Last-Event-ID through for replay, and answers 502 when the agent is unreachable.
  3. Browser listener — SolutionRefresh. A client component (src/components/orbit/solution-refresh.tsx) opens an EventSource on /api/orbit-events. On solution.updated it debounces 400 ms (task commits can arrive in bursts) and calls router.refresh() — an in-place RSC reconcile, so navigation state, theme, and scroll survive — and shows a brief “Solution updated” notice.
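The debounce in the last hop can be sketched as a generic trailing-edge helper. The debounce function below is hypothetical and is not the SolutionRefresh component's actual code. It shows how a burst of events inside the wait window collapses into one call.

```typescript
// Trailing-edge debounce: fn runs once, waitMs after the last trigger in a burst.
function debounce(
  fn: () => void,
  waitMs: number,
): { trigger: () => void; cancel: () => void } {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return {
    trigger(): void {
      if (timer !== undefined) clearTimeout(timer);
      timer = setTimeout(() => {
        timer = undefined;
        fn();
      }, waitMs);
    },
    cancel(): void {
      if (timer !== undefined) clearTimeout(timer);
      timer = undefined;
    },
  };
}

// Example wiring (assumed, not taken from the component):
//   const refresh = debounce(() => router.refresh(), 400);
//   source.addEventListener("solution.updated", refresh.trigger);
//   // call refresh.cancel() when the component unmounts
```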

EventSource reconnection: per spec, EventSource only auto-reconnects while the connection stays retriable — a non-200 response (the proxy’s 502 while the agent is down) closes it for good. The listener therefore recreates the EventSource itself with exponential backoff (3 s doubling to 30 s), and fires one catch-up refresh when the connection comes back, since the reader always wants latest, not history.
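The reconnect schedule is a plain capped exponential. A small sketch follows, with a hypothetical backoffDelayMs helper whose defaults mirror the 3 s and 30 s figures above.

```typescript
// Delay before reconnect attempt number `attempt` (0-based), doubling up to a cap.
function backoffDelayMs(attempt: number, baseMs = 3000, capMs = 30000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const schedule = [0, 1, 2, 3, 4].map((attempt) => backoffDelayMs(attempt));
console.log(schedule); // [3000, 6000, 12000, 24000, 30000]
```

The listener would reset the attempt counter to 0 once a connection opens successfully. Called with baseMs = 5000 and capMs = 60000, the same helper reproduces the 5 s to 60 s schedule mentioned for the agent's startup load.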

When the request later merges (or fails), the same chain runs again with reason request-merged (or request-failed), settling the reader on the merged base (or flipping the in-progress edits back out of view).

Gotcha: out-of-band edits

Hand-edits to the solution repo do not show up in the reader by themselves. There is deliberately no filesystem watcher — SMB/Azure Files bind mounts drop inotify events unreliably, so freshness is tied to the write pipeline’s explicit reload hooks instead. Task runs reload automatically; a manual edit does not.

After editing the solution repo out-of-band (hand-editing files in local dev, or a git push landing directly on the share), trigger a reload explicitly:

# local dev: the agent's host port
curl -X POST http://localhost:3010/solution/reload

The endpoint answers 202 immediately (with the pre-reload version) and runs the reload asynchronously; the resulting solution.updated event then refreshes every connected reader through the normal SSE chain.