Data Flow
How solution data moves through a per-solution Orbit stack: the agent as the single writer over the solution repositories, the read-only webapp fetching the solution tree over HTTP, and live refresh via server-sent events. This is the architecture introduced by the 2026-07-02 agent-served-solution-data refactor — before it, the webapp read the solution repo from its own mount; after it, the agent is the only process that touches the filesystem.
Single writer, many readers
The governing principle is that exactly one process owns the solution repo
mounts: maxq-orbit-agent. It is the sole reader and sole writer of the
customer repo (mounted at /repo) and the internal repo (/repo-internal).
The orbit-webapp has no repo mount at all — it obtains solution data
exclusively over HTTP from the agent.
Why this split:
- One source of truth for “current”. The agent’s task pipeline mutates the working tree (branch per request, commit per task, merge on success — see the write path). If readers walked the same filesystem independently, they could observe half-written artefacts or a mid-checkout tree. Instead the agent loads the tree into memory at well-defined moments and serves immutable snapshots.
- No filesystem coupling for readers. On Azure the reader app needs no Azure Files share and no storage link, just an ORBIT_AGENT_URL. The webapp can scale out (0–3 replicas) while the agent stays pinned at a single replica, which is load-bearing: the in-memory model and the event bus are per-process state.
- Deliberately no file watcher. Bind mounts over SMB/Azure Files drop inotify events unreliably, so freshness is driven by explicit reload hooks in the write pipeline rather than by watching the disk (see the out-of-band edit gotcha).
The shared loader package implementation/shared/trajectory-loader/
(@maxq/trajectory-loader) defines both halves of the contract: the
loadSolutionTree(root) walker the agent runs against the solution root, and
the RawSolutionTree types the webapp’s transformer consumes. Both codebases
depend on it as a file: dependency.
The in-memory model: SolutionModel
The agent holds the whole solution tree in memory
(maxq-orbit-agent/codebase/src/solution/SolutionModel.ts). Each successful
(re)load produces an immutable SolutionSnapshot:
| Field | Meaning |
|---|---|
| version | Monotonic per-process counter, starts at 1. Bumped on every successful reload. |
| loadedAt | ISO timestamp of the load. |
| headSha | Customer-repo git HEAD at load time (observability only, may be null). |
| tree | The RawSolutionTree produced by loadSolutionTree. |
| json | The /solution/tree response body: { version, instance, tree }, serialized once per version so the multi-MB payload is not re-stringified per request. |
Alongside the snapshot, the model carries a per-boot instanceId
(randomUUID() at construction). This exists because version restarts at 1
whenever the agent restarts — so cache identity is always the
(instance, version) pair, never the bare version. A consumer comparing
versions alone would happily keep serving a stale cache across an agent
restart (new boot, version 1 again, “nothing changed”).
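To make the restart case concrete, here is a minimal sketch of a cache-identity check. The VersionKey type and the isSameSnapshot helper are hypothetical names for illustration; only the (instance, version) pairing comes from the text above.

```typescript
// Hypothetical shape: field names mirror the { version, instance } pair
// described above; this is not the agent's actual type.
interface VersionKey {
  instance: string; // per-boot instanceId of the agent
  version: number; // monotonic counter, restarts at 1 on every boot
}

// A cached model is reusable only when BOTH parts of the key match.
function isSameSnapshot(cached: VersionKey | null, current: VersionKey): boolean {
  return (
    cached !== null &&
    cached.instance === current.instance &&
    cached.version === current.version
  );
}

// Agent restarted: the version is 1 again, but the instance differs.
const beforeRestart: VersionKey = { instance: "boot-a", version: 1 };
const afterRestart: VersionKey = { instance: "boot-b", version: 1 };

console.log(isSameSnapshot(beforeRestart, afterRestart)); // false
console.log(isSameSnapshot(beforeRestart, { instance: "boot-a", version: 1 })); // true
```

Comparing `version` alone would return true for the first call, which is exactly the stale-cache case described above.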
Reload semantics:
- Serialized and coalescing. One reload runs at a time; any reload requested while one is in flight collapses into exactly one queued follow-up (last reason wins). Callers on the task pipeline fire-and-forget.
- Failed reloads keep the previous snapshot. Consumers never see a half-loaded tree; the error is logged and the old version keeps serving.
- Startup is eager but non-blocking. init() fires the first load in the background so the HTTP server comes up promptly. Until that load lands, the data endpoints answer 503 { loading: true }, and the load is retried with backoff (5 s doubling to a 60 s cap).
Every successful load publishes a solution.updated event on the agent’s
event bus — the trigger for the live-refresh channel.
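The serialized, coalescing reload behaviour listed above can be sketched as follows. This is an illustration under assumptions, not the SolutionModel source: the CoalescingReloader class, its request method, and the injected load function are hypothetical names.

```typescript
// Hypothetical loader signature: resolves on success, rejects on failure.
type LoadFn = (reason: string) => Promise<void>;

class CoalescingReloader {
  private running = false;
  private queuedReason: string | null = null;
  private readonly load: LoadFn;

  constructor(load: LoadFn) {
    this.load = load;
  }

  // Fire-and-forget: callers do not await the reload.
  request(reason: string): void {
    if (this.running) {
      // Collapse into exactly one queued follow-up; the last reason wins.
      this.queuedReason = reason;
      return;
    }
    void this.run(reason);
  }

  private takeQueued(): string | null {
    const reason = this.queuedReason;
    this.queuedReason = null;
    return reason;
  }

  private async run(reason: string): Promise<void> {
    this.running = true;
    let next: string | null = reason;
    while (next !== null) {
      try {
        await this.load(next);
      } catch (err) {
        // A failed reload is only logged; whatever was loaded before stays in place.
        console.error(`reload (${next}) failed`, err);
      }
      next = this.takeQueued();
    }
    this.running = false;
  }
}
```

With one reload in flight, three further `request` calls result in a single follow-up load carrying the reason of the last call.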
What the loader produces
loadSolutionTree walks the Trajectory v1.8 solution root and returns a
RawSolutionTree bundling everything the reader renders: solution,
charter, memory, glossary, components (with their operations, stores,
dependencies, integrations, auth, permissions), domains, findings,
assessments + assessmentsActivation, concerns, attestations,
workflows, entities, questions, sourceOrientations,
technologiesActivation, domainCounts, touchpointsByDomain, and a
markdown map of raw markdown files by path. The YAML reader is deliberately
tolerant: duplicate map keys parse with last-value-wins, and a genuinely
broken file is logged and treated as absent so one bad artefact degrades a
single view instead of crashing the whole page.
The agent’s solution-data API
On Azure the agent is reachable only from inside the ACA environment (internal
ingress since 2026-07-03 — its callers, the reader’s server side and Mission
Control, address it as http://agent-<h>; no edge secret or bypass header is
involved). Defined in maxq-orbit-agent/codebase/src/routes/solution.ts,
routes/sources.ts, and routes/events.ts.
| Method | Path | Purpose | Response (high level) |
|---|---|---|---|
| GET | /solution | Identity summary for Mission Control. | { id, shortName?, name?, customer? }, from the in-memory model; falls back to a direct solution.yaml read while the first load is still in flight. |
| GET | /solution/version | Cheap freshness probe; the webapp calls this once per page render. | { version, instance, loadedAt, headSha }; 503 { loading: true } until the first load. |
| GET | /solution/tree | The whole-tree payload (multi-MB). | { version, instance, tree }; gzip via hono/compress; ETag: "<instance>:<version>" with If-None-Match → 304. |
| POST | /solution/reload | Manual reload escape hatch (out-of-band edits). | 202 { version } (the pre-reload version); the reload itself is async. |
| GET | /sources/tree | Source-repositories explorer: directory listing. | { root, entries }; 500 { error } on failure. |
| GET | /sources/file?path= | Source-repositories explorer: file content (1 MB cap, extension allowlist, path-escape guards). | File-content result; 400 on missing param / path escape, 404 when not found. |
| GET | /events/subscribe | The SSE event stream (all agent events, including solution.updated). | text/event-stream; supports reconnect-replay via Last-Event-ID header or ?since=; 15 s ping heartbeat. |
ETag weakening gotcha: hono/compress weakens the outgoing ETag to
W/"instance:version", so clients echo the weak form back. The agent compares
If-None-Match weak-insensitively (stripping the W/ prefix) — a 304 asserts
equivalence of content, not of encoding.
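A weak-insensitive If-None-Match check of the kind described here might look like the sketch below. The helper names stripWeak and ifNoneMatchHits are hypothetical, and the sketch leaves out the `*` wildcard; it assumes tags contain no commas, which holds for an instance id plus a number.

```typescript
// Strip an optional weak-validator prefix so W/"x" and "x" compare equal.
function stripWeak(tag: string): string {
  const trimmed = tag.trim();
  return trimmed.startsWith("W/") ? trimmed.slice(2) : trimmed;
}

// True when any entity-tag in the If-None-Match header matches the current
// "<instance>:<version>" tag, ignoring weakness on the client's side.
function ifNoneMatchHits(
  ifNoneMatch: string | undefined,
  instance: string,
  version: number,
): boolean {
  if (!ifNoneMatch) return false;
  const current = `"${instance}:${version}"`;
  return ifNoneMatch.split(",").some((candidate) => stripWeak(candidate) === current);
}

console.log(ifNoneMatchHits('W/"boot-a:3"', "boot-a", 3)); // true
console.log(ifNoneMatchHits('"boot-a:2"', "boot-a", 3)); // false
```

A true result corresponds to answering 304; anything else means sending the full tree payload.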
The webapp’s /api/sources/tree and /api/sources/file routes are thin,
transparent proxies to the agent’s /sources/* endpoints (status codes
forwarded as-is; 502 when the agent itself is unreachable).
Read path: initial page load
The webapp side lives in orbit-webapp/codebase/src/lib/solution/:
- agent-source.ts: the server-only HTTP client. ORBIT_AGENT_URL is required; on Azure it carries the agent’s in-env address (http://agent-<h>; the agent has internal ingress and no public URL). A 503 from the agent surfaces as a typed AgentLoadingError. The browser never talks to the agent; everything goes through the webapp’s server side.
- index.ts (getOrbitData()): the per-request entry point every page render goes through.
The webapp’s caching model
getOrbitData() keeps one module-level cache per webapp replica:
- One cheap version check per page render. Every request calls GET /solution/version; the expensive pipeline (tree fetch → transform → MDX render) runs only when (instance, version) moved.
- Keyed by (instance, version). If the agent restarted (new instance), the cache is invalid even though the version number may look equal or lower. Within one instance the cache never regresses: a slow rebuild that finishes after a newer one must not overwrite it.
- Coalesced rebuilds. Concurrent page loads that all observe the same new version share one in-flight rebuild promise instead of each fetching and MDX-rendering the whole tree.
- Availability over freshness once warm. If the version check or a rebuild fails and a cached model exists, the webapp logs a warning and serves the cached model. Only with no cache at all does the error propagate to the error page.
- Per-replica-correct. Each webapp replica rebuilds independently and converges on the same model, because the key comes from the agent — no cross-replica coordination is needed.
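The caching rules above fit in a small closure. The sketch below is an assumption-laden illustration rather than the actual getOrbitData(): createOrbitCache and the injected fetchVersion and rebuild functions are hypothetical stand-ins for the version probe and the tree fetch → transform → MDX render pipeline.

```typescript
interface VersionInfo {
  instance: string;
  version: number;
}

interface Cached<T> extends VersionInfo {
  model: T;
}

// Hypothetical dependencies, injected so the sketch stays self-contained.
interface Deps<T> {
  fetchVersion: () => Promise<VersionInfo>; // stands in for GET /solution/version
  rebuild: (key: VersionInfo) => Promise<T>; // the expensive pipeline
}

function createOrbitCache<T>(deps: Deps<T>): () => Promise<T> {
  let cache: Cached<T> | null = null;
  const inFlight = new Map<string, Promise<T>>();

  async function rebuildOnce(key: VersionInfo): Promise<T> {
    const id = `${key.instance}:${key.version}`;
    let pending = inFlight.get(id);
    if (!pending) {
      pending = deps
        .rebuild(key)
        .then((model) => {
          // Never regress within one instance: a slow, older rebuild must
          // not overwrite a newer cached model.
          const older =
            cache !== null &&
            cache.instance === key.instance &&
            cache.version > key.version;
          if (!older) cache = { instance: key.instance, version: key.version, model };
          return model;
        })
        .finally(() => inFlight.delete(id));
      inFlight.set(id, pending); // concurrent callers share this promise
    }
    return pending;
  }

  return async function getData(): Promise<T> {
    try {
      const current = await deps.fetchVersion();
      if (
        cache &&
        cache.instance === current.instance &&
        cache.version === current.version
      ) {
        return cache.model;
      }
      return await rebuildOnce(current);
    } catch (err) {
      // Availability over freshness: serve the cached model if one exists.
      if (cache) {
        console.warn("serving cached model after failure", err);
        return cache.model;
      }
      throw err;
    }
  };
}
```

Because the key comes from the agent, each replica can run its own copy of this closure and still converge on the same model.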
The rebuild itself is the expensive step: transformToOrbitData reshapes the
raw tree into the view model, then every prose field (descriptions, notes,
recommendations, charter, packaging docs, …) is MDX-compiled into *Rendered
mirrors in roughly fifteen parallel passes over the model.
The write path: how mutations land
Solution data changes when the agent’s TaskProcessor — the single,
process-wide, strictly serial consumer of the task queue — runs worker tasks
(maxq-orbit-agent/codebase/src/orchestrator/TaskProcessor.ts):
- Branch per request. Before a request’s first task runs, the processor checks out (or creates from base) a branch named after the request id. Serial execution is what makes this safe: only one request branch is ever checked out in the shared working tree at a time.
- Commit per task. When a task finishes (success or failure), its changes under workspace/ are committed to the request branch and the branch is pushed. The .orbit/ audit trail is committed separately to the internal repo on its linear main.
- Reload per commit. Each task commit fires a fire-and-forget SolutionModel.reload("task-commit"), so the reader shows live mid-request progress, not just the final merged state.
- Merge on success. Once all of a request’s tasks have completed, the branch is merged back into base and the model reloads with reason request-merged, carrying the list of changed files. If any task failed (or the merge conflicts), the branch is left unmerged for inspection, the working tree returns to base, and the model reloads with reason request-failed; the in-progress edits the reader was showing disappear from view again, correctly.
Every reload reason the model can broadcast:
| Reason | When |
|---|---|
| startup | The eager first load at agent boot. |
| task-commit | A worker task committed to the request branch (live mid-request progress). |
| request-merged | The request branch merged into base; the event carries changedFiles. |
| request-failed | Tasks failed or the merge conflicted; the tree flipped back to base. |
| recovery | Unconditional reload after the post-crash recovery checkout. |
| manual | Someone called POST /solution/reload. |
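How a finished request maps onto the request-merged and request-failed reasons can be sketched as a small decision function. This is a hypothetical illustration, not the TaskProcessor code: settleRequest, TaskStatus, and the tryMerge callback (assumed to return the changed files, or null on a merge conflict) are invented for the example.

```typescript
type TaskStatus = "succeeded" | "failed";

type Settlement =
  | { reason: "request-merged"; changedFiles: string[] }
  | { reason: "request-failed" };

// tryMerge is a hypothetical callback: it merges the request branch into
// base and returns the changed files, or null when the merge conflicts.
function settleRequest(
  statuses: TaskStatus[],
  tryMerge: () => string[] | null,
): Settlement {
  // Any failed task leaves the branch unmerged; no merge is attempted.
  if (statuses.some((status) => status === "failed")) {
    return { reason: "request-failed" };
  }
  const changedFiles = tryMerge();
  if (changedFiles === null) {
    return { reason: "request-failed" }; // merge conflict
  }
  return { reason: "request-merged", changedFiles };
}
```

In both request-failed branches the working tree would return to base before the model reloads, which is why in-progress edits vanish from the reader.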
Live refresh: SSE
Freshness reaches the browser through three hops:
- Agent → event bus → SSE. Every successful reload publishes solution.updated with shape { type, version, reason, requestId?, changedFiles?, ts } on the agent’s event bus, which /events/subscribe streams to subscribers (with reconnect-replay from a ring buffer via Last-Event-ID).
- Webapp SSE proxy: /api/orbit-events. The browser cannot reach the agent, so the webapp exposes a proxy route (orbit-webapp/codebase/src/app/api/orbit-events/route.ts) that opens /events/subscribe upstream (keeping the agent’s in-env URL server-side) and forwards only solution.updated frames. It sends its own 15 s heartbeat toward the browser (the agent’s pings are filtered out), passes the browser’s Last-Event-ID through for replay, and answers 502 when the agent is unreachable.
- Browser listener: SolutionRefresh. A client component (src/components/orbit/solution-refresh.tsx) opens an EventSource on /api/orbit-events. On solution.updated it debounces 400 ms (task commits can arrive in bursts) and calls router.refresh() (an in-place RSC reconcile, so navigation state, theme, and scroll survive), then shows a brief “Solution updated” notice.
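The 400 ms debounce in the browser listener can be illustrated with a generic trailing-edge debounce. This is a sketch, not the SolutionRefresh component itself; the debounce helper is a hypothetical name, and the wiring to router.refresh() appears only as a comment.

```typescript
// Trailing-edge debounce: a burst of calls produces a single invocation,
// waitMs after the last call in the burst.
function debounce(fn: () => void, waitMs: number): () => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return () => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => {
      timer = undefined;
      fn();
    }, waitMs);
  };
}

// Assumed wiring inside the listener (router comes from the framework):
//   const onSolutionUpdated = debounce(() => router.refresh(), 400);
//   eventSource.addEventListener("solution.updated", onSolutionUpdated);
```

A burst of task-commit events within the window therefore triggers one refresh instead of one per event.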
EventSource reconnection: per spec, EventSource only auto-reconnects
while the connection stays retriable — a non-200 response (the proxy’s 502
while the agent is down) closes it for good. The listener therefore
recreates the EventSource itself with exponential backoff (3 s doubling to
30 s), and fires one catch-up refresh when the connection comes back, since
the reader always wants latest, not history.
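The reconnect schedule is a capped doubling sequence. A minimal sketch, with backoffDelayMs as a hypothetical helper name:

```typescript
// Delay before reconnect attempt `attempt` (0-based): base, doubling, capped.
function backoffDelayMs(attempt: number, baseMs: number, capMs: number): number {
  return Math.min(baseMs * 2 ** attempt, capMs);
}

// EventSource reconnects: 3 s doubling to a 30 s cap.
const reconnectDelays = [0, 1, 2, 3, 4, 5].map((n) => backoffDelayMs(n, 3000, 30000));
console.log(reconnectDelays); // [ 3000, 6000, 12000, 24000, 30000, 30000 ]
```

The same helper with a 5 s base and a 60 s cap produces the startup schedule described for the agent, though whether the two share code is not stated here.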
When the request later merges (or fails), the same chain runs again with
reason request-merged (or request-failed), settling the reader on the
merged base (or flipping the in-progress edits back out of view).
Gotcha: out-of-band edits
Hand-edits to the solution repo do not show up in the reader by themselves. There is deliberately no filesystem watcher — SMB/Azure Files bind mounts drop inotify events unreliably, so freshness is tied to the write pipeline’s explicit reload hooks instead. Task runs reload automatically; a manual edit does not.
After editing the solution repo out-of-band (hand-editing files in local dev, or a git push landing directly on the share), trigger a reload explicitly:
```bash
# local dev: the agent's host port
curl -X POST http://localhost:3010/solution/reload
```

The endpoint answers 202 immediately (with the pre-reload version) and runs the reload asynchronously; the resulting solution.updated event then refreshes every connected reader through the normal SSE chain.