Two external apps per solution; the agent is internal-only

Status: Accepted · Date: 2026-06-30 (Q4 resolved), revised 2026-07-01 (same-origin proxy draft reverted) and 2026-07-03 (agent made internal-only) · Area: Deployment

Context

A deployed solution stack has three user-facing surfaces: the read-only reader webapp, the writer agent (an HTTP service), and the Mission Control console for the agent. Two questions had to be settled: which apps are externally reachable, and how the apps find each other when the images are shared across all solutions (ADR-002). Shared images mean a per-solution URL cannot be baked into an image, and NEXT_PUBLIC_* variables would do exactly that, because Next.js inlines them at build time. An initial draft made the agent internal-only behind a same-origin proxy; an even earlier draft had considered internal ingress for the agent.

Decision (as of 2026-06-30, revised 2026-07-01)

All three apps are external — orbit-<h> (reader), agent-<h> (agent), mc-<h> (Mission Control, stateless, no share mount). The agent was public by design; the internal-ingress draft was judged wrong, and the same-origin proxy revision was reverted on 2026-07-01.
Each gets its own Cloudflare subdomain, provisioned per solution: <id>.<zone>, <id>-agent.<zone>, <id>-mc.<zone> — proxied CNAME, Full-strict SSL, and an ACA managed-cert custom-domain binding, via shared lib.sh helpers used by both the singleton and per-solution paths.
Mission Control reaches the agent via a runtime env var. mc-<h> receives the agent’s public URL as AGENT_URL at deploy time; the force-dynamic root layout injects window.__AGENT_URL__ for the browser and the server reads process.env.AGENT_URL. Why not bake: per-solution images would break ADR-002. Why not proxy (the 2026-07-01 reasoning): hiding the agent behind the webapp was held to be incompatible with later securing the agent’s own subdomain with WorkOS.
The reader’s link out to Mission Control is also runtime-wired, but differently, because it is a navigation link rather than a set of browser API calls: the reader gets AGENT_WEBAPP_URL, and a single force-dynamic route handler (/mission-control) issues a 302 at runtime, which avoids making the whole reader force-dynamic.
The apps are public by default. An optional edge-secret lockdown sits behind edge.lock_solutions=true: enforcement primitives that stay dormant until EDGE_SHARED_SECRET is set (Next src/proxy.ts in the two webapps and a Hono edgeGuard middleware in the agent, with /health exempt), plus one per-solution Cloudflare Transform Rule over the external hosts.
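The dormant-until-configured behavior of the edge guard can be sketched as a framework-neutral decision function. This is a minimal illustration under stated assumptions, not the project's middleware: the header name x-edge-secret, the function names, and the result shape are hypothetical, since the source names only EDGE_SHARED_SECRET, the /health exemption, and the 403 outcome.

```typescript
// Hypothetical header name: the source does not say which header the
// Cloudflare Transform Rule injects.
const EDGE_HEADER = "x-edge-secret";

type GuardResult = { allowed: true } | { allowed: false; status: 403 };

// Compares two strings without returning early on the first mismatch.
function safeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    // charCodeAt past the end yields NaN, which "|| 0" turns into 0.
    diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0);
  }
  return diff === 0;
}

// Decides whether a request may pass. Header keys are assumed lower-cased.
function edgeGuard(
  pathname: string,
  headers: Record<string, string | undefined>,
  secret: string | undefined,
): GuardResult {
  // No secret configured: the guard is dormant and every request passes.
  if (!secret) return { allowed: true };
  // Health probes are exempt so platform checks keep working under lockdown.
  if (pathname === "/health") return { allowed: true };
  const presented = headers[EDGE_HEADER] ?? "";
  return safeEqual(presented, secret)
    ? { allowed: true }
    : { allowed: false, status: 403 };
}
```

A Next proxy or Hono middleware would call a function like this with the request path, the request headers, and the value of EDGE_SHARED_SECRET, and return a 403 response when the result is not allowed.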

Revised 2026-07-03: the agent is internal-only

The 2026-07-01 position — “the agent is public by design” — is superseded. Nothing outside the ACA environment ever needed the agent: its only callers are the reader’s server side and Mission Control. Keeping a public subdomain meant keeping an entire exposed surface (and the follow-on plan of putting WorkOS in front of it) for no consumer. The revised decision:

The agent’s ACA ingress is internal (external: false). Requests from outside the environment are rejected by the environment proxy (404). The agent has no Cloudflare subdomain — <id>-agent.<zone> no longer exists for new stacks (teardown still cleans up legacy DNS records).
In-env callers address the agent as http://agent-<h>: the app name resolves via the environment’s internal DNS and hits the ingress on :80. The .internal.<domain> FQDN form does not resolve on this environment; the internal registry services (svc-*) follow the same pattern.
The reader needed no code change: it already talked to the agent server-side only (ORBIT_AGENT_URL); that env var now carries the in-env address.
Mission Control’s browser can no longer call the agent directly, so the same-origin proxy returns — this time as the accepted mechanism. Mission Control gains a streaming proxy route (src/app/agent/[...path]/route.ts); the root layout now injects the literal prefix /agent as window.__AGENT_URL__ whenever AGENT_URL is set, and the proxy forwards fetch and SSE to the agent’s in-env address (AGENT_URL env = http://agent-<h>). Local dev is unchanged: the browser hits the agent directly via NEXT_PUBLIC_AGENT_URL.
The scale-to-zero keep-alive follows the address change: AGENT_SELF_URL now long-polls http://agent-<h>; the request still passes the (internal) ingress, so ACA’s HTTP scale rule still sees the app busy (ADR-004).
Edge lockdown shrinks to the two public apps: the per-solution Transform Rule covers 2 hosts, the agent needs no edge secret, and the reader→agent bypass header (x-orbit-bypass / EDGE_BYPASS_SECRET) is obsolete — the config keys edge.bypass_header_name / edge.bypass_secret were removed.
Aurora’s registry deployment record no longer stores an agentUrl, and the portfolio UI no longer renders an “agent” link (legacy records may still carry the field).
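The same-origin streaming proxy in Mission Control can be sketched as follows. This is a hedged illustration, not the contents of src/app/agent/[...path]/route.ts: the function names are hypothetical, the path segments are assumed to arrive already decoded from the catch-all route, and request bodies are buffered for simplicity while the response body is passed through as a stream so that SSE events reach the browser as they arrive.

```typescript
// Maps /agent/<path>?<query> on Mission Control to the agent's in-env address.
// agentUrl is the value of AGENT_URL, e.g. http://agent-<h>.
function buildUpstreamUrl(
  agentUrl: string,
  pathSegments: string[],
  search: string,
): string {
  const base = agentUrl.replace(/\/+$/, ""); // drop trailing slashes
  const path = pathSegments.map((s) => encodeURIComponent(s)).join("/");
  return `${base}/${path}${search}`;
}

// Forwards one request to the agent and streams the answer back.
async function proxyToAgent(
  req: Request,
  pathSegments: string[],
  agentUrl: string,
): Promise<Response> {
  const target = buildUpstreamUrl(agentUrl, pathSegments, new URL(req.url).search);

  const headers = new Headers(req.headers);
  headers.delete("host"); // let fetch set the host for the in-env address

  const hasBody = req.method !== "GET" && req.method !== "HEAD";
  const upstream = await fetch(target, {
    method: req.method,
    headers,
    body: hasBody ? await req.arrayBuffer() : undefined,
  });

  // fetch may already have decoded the body, so these headers can be stale.
  const resHeaders = new Headers(upstream.headers);
  resHeaders.delete("content-encoding");
  resHeaders.delete("content-length");

  // Handing over upstream.body (a stream) avoids buffering the response,
  // which is what keeps text/event-stream responses flowing incrementally.
  return new Response(upstream.body, {
    status: upstream.status,
    headers: resHeaders,
  });
}
```

A route handler would export one function per HTTP method, each delegating to a helper like proxyToAgent with the catch-all path segments and the AGENT_URL value read at request time.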

There is a conscious irony here: the same-origin proxy from the original draft was rejected on 2026-07-01 and is the accepted mechanism now. What changed is the constraint, not the argument: the proxy was rejected because it conflicted with WorkOS-protecting the agent’s own subdomain, and the agent no longer has a subdomain to protect. The runtime-URL injection machinery (ADR-002’s no-baking rule) survives unchanged; only the injected value differs (/agent instead of a public URL).
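The runtime injection can be sketched as a pair of pure functions that a force-dynamic root layout could call on each request. This is an assumption-laden illustration rather than the code in src/app/layout.tsx: the function names and the Env type are hypothetical, and only the AGENT_URL variable, the /agent prefix, and the window.__AGENT_URL__ global come from the decision text.

```typescript
type Env = Record<string, string | undefined>;

// Value the browser should use as the agent base. When AGENT_URL is set the
// browser is pointed at the same-origin proxy prefix, never at the in-env
// address itself, which is not reachable from outside the environment.
function browserAgentBase(env: Env): string | undefined {
  return env.AGENT_URL ? "/agent" : undefined;
}

// Builds the inline script body that the layout would render at request time.
// Returns an empty string when there is nothing to inject.
function agentUrlScript(env: Env): string {
  const base = browserAgentBase(env);
  if (base === undefined) return "";
  // JSON.stringify produces a quoted JS string literal; escaping "<" keeps
  // the value from closing the surrounding script tag.
  const literal = JSON.stringify(base).replace(/</g, "\\u003c");
  return `window.__AGENT_URL__ = ${literal};`;
}
```

Because the value is computed from the environment on each request, one shared image serves every solution, which is the property the no-baking rule protects.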

Consequences

Origin isolation per public app and per solution, matching the platform’s URL convention (flat one-level subdomains under maxqlabs-orbit.com for the wildcard cert), and a clean path to putting WorkOS user auth in front of each public subdomain — the recorded follow-on, which now concerns only the reader and Mission Control.
The agent’s entire HTTP surface is unreachable from outside the ACA environment — one whole exposed surface removed, with no edge secret or WorkOS front needed for it.
Edge lockdown blocks direct-origin access to the two public apps’ raw ACA FQDNs but is not user authentication; until WorkOS lands, unlocked solutions are public.
Turning lockdown on requires cloudflare.zone_name, because without the Cloudflare Transform Rule injecting the secret header every locked app would return 403; the provisioner refuses to proceed otherwise. With the flag off, empty secrets are pruned so behavior is exactly as it was before lockdown existed.
One naming scar is load-bearing: env-storage link names put the role word first (cust-rw-<h>, never <h>-cust-rw) because Azure requires an alphabetic first character and the hash often starts with a digit.
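The role-first naming rule can be illustrated with a small sketch. The function names are hypothetical and the check covers only the one constraint stated above (an alphabetic first character); any other Azure naming rules are out of scope here.

```typescript
// Checks only the constraint named in this ADR: the first character
// of the link name must be a letter.
function startsWithLetter(name: string): boolean {
  return /^[A-Za-z]/.test(name);
}

// Builds an env-storage link name with the role word first, e.g. cust-rw-<h>.
function storageLinkName(role: string, hash: string): string {
  const name = `${role}-${hash}`;
  if (!startsWithLetter(name)) {
    throw new Error(`link name must start with a letter: ${name}`);
  }
  return name;
}
```

With a hash such as 9f3a, storageLinkName("cust-rw", "9f3a") yields cust-rw-9f3a, whereas the reversed order 9f3a-cust-rw would fail the first-character check.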

Evidence

infrastructure/azure/provision-solution.sh + app-orbit.yaml.tmpl, app-agent.yaml.tmpl (internal ingress), app-agent-webapp.yaml.tmpl
infrastructure/azure/lib.sh — shared Cloudflare/ACA custom-domain helpers and reconcile_transform_rule_hosts (2 hosts per solution)
implementation/maxq-orbit-agent-webapp — src/app/layout.tsx (injects /agent as window.__AGENT_URL__), src/app/agent/[...path]/route.ts (the same-origin streaming proxy), src/lib/agent-client.ts
implementation/orbit-webapp/codebase/src/app/mission-control/route.ts — the runtime redirect
memory/orbit-auto-deploy.md (the three-apps decision and its revisions), memory/orbit-platform-landscape.md (URL convention)