Activity Service

The activity service is the platform’s narrative record: an append-only feed of business events — “Patrick deployed smals-sas-poc to Azure”, “the agent completed request r-0042 on my-fit-portal” — recorded by every producing service and queryable at any level of the portfolio (customer, tenant, solution, actor). It implements designs/activity-service.md (v1, built 2026-07-05) and consists of two pieces:

| Piece | What it is | Where |
| --- | --- | --- |
| activity-service | Hono microservice: append-only event store + query API | implementation/activity-service/ |
| @maxq/activity-kit | Shared fire-and-forget emitter every producer uses | implementation/shared/activity-kit/ |

| Service | Owns | Postgres schema | Local port | Azure app |
| --- | --- | --- | --- | --- |
| activity-service | activity.events (immutable ActivityEvents) | activity | 3024 | svc-activity |

The component name is deliberately prosaic, matching the registry-service fleet convention; the evocative name is reserved for the UI — the planned Aurora feed page is called “Flight Log”.

v1 scope: the service, the emitter kit, and the deployment wiring are built and live-verified. No producer emits anything yet — the emit() call sites are inventoried below and integrate in a follow-up — and the Aurora Flight Log UI does not exist yet. The kit’s disabled mode (url: undefined ⇒ no-op) exists precisely so producers can ship their call sites before the service is deployed everywhere.

What it is — and is not

The registry services already keep an audit.events table (before/after JSON per registry mutation, written by registry-kit’s writeAudit). The activity feed is a different artefact, and deliberately not derived from it (decision E0):

  • audit.events covers registry mutations only; deploys, agent work, and chat actions never touch it.
  • Before/after diffs carry no business meaning — data.tenant: null → smals-main is not “Attached to tenant smals-main”. The producer that performs an action writes the sentence describing it.
  • Ownership stays with the producer; the feed accepts what it is told.

The activity service is also not a metrics/observability system (no latency, no health), not a message bus (nothing subscribes to it to trigger behaviour), and not a guaranteed-delivery audit log — delivery is best-effort by declared contract (E3). Anything that genuinely requires guaranteed capture belongs in audit.events or the internal repo’s .orbit/ trail.

The event model

One entity: ActivityEvent — an immutable camelCase JSON document.

| Field | Set by | Description |
| --- | --- | --- |
| id | service | ULID (time-ordered), so the primary key doubles as the feed’s pagination cursor |
| occurredAt | producer (kit defaults to now) | When it happened |
| recordedAt | service | When it was persisted; divergence beyond the batching window signals delivery lag |
| source | kit (once, at construction) | The producing component: aurora-webapp, maxq-orbit-agent, orbit-deploy, … |
| action | producer | Dotted noun.verb in past tense: solution.deployed, tenant.created |
| actor | producer | { type, id, display? } where type is user, agent, or system; id is the email for users, the component id otherwise |
| subject | producer | { type, id, display? }: the object the event is about |
| context | producer | { customer?, tenant?, solution? }: the portfolio chain the subject sits in |
| description | producer | One human-readable sentence, renderable as-is (only *emphasis* markup) |
| severity | producer (default info) | info, notice (milestone), or warning (a human should look); display weight, not alerting |
| correlationId | producer | Groups events of one logical operation (a deploy run, an agent request) |
| metadata | producer | Free-form JSONB detail, display-only, soft-capped at 8 KB (E5) |
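As a reading aid, the field table can be transcribed into a TypeScript shape. This is a sketch only: the type names (`EntityRef`, `ActorRef`, `Severity`) are illustrative and not taken from the kit, timestamps are assumed to travel as ISO-8601 strings, and every value in the sample document is a placeholder.

```typescript
// Illustrative transcription of the field table; names are not the kit's own.
type Severity = "info" | "notice" | "warning";

interface EntityRef {
  type: string;
  id: string;
  display?: string; // name snapshot taken at event time
}

interface ActorRef extends EntityRef {
  type: "user" | "agent" | "system";
}

interface ActivityEvent {
  id: string;          // ULID, stamped by the service
  occurredAt: string;  // set by the producer (assumed ISO-8601 string)
  recordedAt: string;  // stamped by the service on persist
  source: string;      // fixed once when the emitter is constructed
  action: string;      // dotted noun.verb, past tense
  actor: ActorRef;
  subject: EntityRef;
  context: { customer?: string; tenant?: string; solution?: string };
  description: string; // one renderable sentence
  severity: Severity;
  correlationId?: string;
  metadata?: Record<string, unknown>; // display-only detail
}

// Placeholder values throughout, including the id and the email address.
const example: ActivityEvent = {
  id: "01J00000000000000000000000",
  occurredAt: "2026-07-05T10:00:00.000Z",
  recordedAt: "2026-07-05T10:00:02.000Z",
  source: "orbit-deploy",
  action: "solution.deployed",
  actor: { type: "user", id: "patrick@example.com", display: "Patrick" },
  subject: { type: "solution", id: "smals-sas-poc" },
  context: { tenant: "smals-main", solution: "smals-sas-poc" },
  description: "Patrick deployed *smals-sas-poc* to Azure.",
  severity: "info",
};
```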

Three rules give the model its shape:

  1. The producer resolves the context chain at emit time. If the subject is a tenant, the producer fills context.customer; a solution fills all three. The service stores what it is given and never calls the registry to enrich or verify (E1): the feed must accept events when every other service is down, and historical events must reflect the chain as it was then — a solution later moved to another tenant keeps its old-context events.
  2. The subject is duplicated into the context where applicable (a tenant subject also appears as context.tenant), so one indexed path answers “everything at tenant X” whether the tenant was the subject or merely the stage.
  3. display fields snapshot names at event time. Renames don’t rewrite history; the UI may resolve fresher names from the registry and fall back to the snapshot for deleted objects.
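Rule 2 can be sketched as a small producer-side helper. `withSubjectInContext` is a hypothetical name, not part of the kit; it assumes the producer has already resolved the parent chain (rule 1) and only copies the subject into the matching context slot when the subject is itself a portfolio level.

```typescript
interface PortfolioContext {
  customer?: string;
  tenant?: string;
  solution?: string;
}

// Hypothetical helper: duplicate a portfolio-level subject into the context
// so a single indexed column answers "everything at tenant X".
function withSubjectInContext(
  subject: { type: string; id: string },
  context: PortfolioContext,
): PortfolioContext {
  switch (subject.type) {
    case "customer":
      return { ...context, customer: subject.id };
    case "tenant":
      return { ...context, tenant: subject.id };
    case "solution":
      return { ...context, solution: subject.id };
    default:
      // Other subject types (a deployment, a request) leave the chain as given.
      return context;
  }
}

// A tenant subject: the producer supplies the customer, the helper adds the tenant.
// "smals" is a placeholder customer id.
const tenantContext = withSubjectInContext(
  { type: "tenant", id: "smals-main" },
  { customer: "smals" },
);
```

Here `tenantContext` carries both `customer` and `tenant`, so the event is found by a tenant-scoped query even though the tenant was the subject rather than the stage.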

action and subject.type are validated by pattern, not enum (E2) — a new producer verb must never require redeploying the service. The starting vocabulary (typed helpers ship in the kit as KNOWN_ACTIONS):

| Producer | Actions |
| --- | --- |
| aurora-webapp (BFF / chat) | customer.created, customer.updated, tenant.created, tenant.updated, solution.created, solution.registered, solution.attached, solution.detached, solution.archived, sweep.completed |
| orbit-deploy (engine) | deployment.started, deployment.succeeded, deployment.failed, deployment.torn-down, frontdoor.provisioned |
| maxq-orbit-agent | request.received, request.planned, task.completed, task.failed, request.completed, solution.reloaded |
| platform / ops | release.published, service.migrated |
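Pattern validation, as opposed to an enum, might look like the following. The exact pattern the service uses is not stated here, so `ACTION_PATTERN` is an assumption: two lowercase segments joined by one dot, with hyphens allowed so that `deployment.torn-down` passes.

```typescript
// Assumed pattern, not the service's actual one: noun.verb, lowercase,
// digits and hyphens allowed after the first letter of each segment.
const ACTION_PATTERN = /^[a-z][a-z0-9-]*\.[a-z][a-z0-9-]*$/;

function isValidAction(action: string): boolean {
  return ACTION_PATTERN.test(action);
}

// A verb nobody has used before is accepted without touching the service.
const acceptsNewVerb = isValidAction("solution.promoted");
// Free text is rejected because it does not fit the shape.
const rejectsFreeText = isValidAction("Deployed the solution");
```

The design point is that the check constrains shape only; which verbs exist is the producers' business.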

Storage

Same Postgres server and database (portfolio) as the registry, new schema activity — but not the Cosmos-shaped records table. Events are immutable, so the DocumentStore machinery (version column, If-Match optimistic concurrency) would be dead weight. Instead, one purpose-built append-only table: filterable fields promoted to real columns, the full document in data, and a composite (field, occurred_at DESC) index per feed dimension — every query is “filter + order by time descending”:

    -- activity.events (from migrations/001-init.sql)
    CREATE TABLE IF NOT EXISTS {{schema}}.events (
      id             text PRIMARY KEY,                    -- ULID (time-ordered)
      occurred_at    timestamptz NOT NULL,
      recorded_at    timestamptz NOT NULL DEFAULT now(),
      source         text NOT NULL,
      action         text NOT NULL,
      actor_type     text NOT NULL,
      actor_id       text NOT NULL,
      subject_type   text NOT NULL,
      subject_id     text NOT NULL,
      customer       text,                                -- context chain, denormalised
      tenant         text,
      solution       text,
      severity       text NOT NULL DEFAULT 'info',
      correlation_id text,
      data           jsonb NOT NULL                       -- the full event document
    );
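The "promoted columns plus full document" layout implies a mapping step between the camelCase event and the row. A minimal sketch of that step follows; `toRow`, `EventDoc`, and `EventRow` are hypothetical names, and the real service may structure the insert differently. `recorded_at` is left out on the assumption that the column default (`now()`) fills it.

```typescript
interface EventDoc {
  id: string;
  occurredAt: string;
  source: string;
  action: string;
  actor: { type: string; id: string };
  subject: { type: string; id: string };
  context: { customer?: string; tenant?: string; solution?: string };
  severity?: string;
  correlationId?: string;
}

interface EventRow {
  id: string;
  occurred_at: string;
  source: string;
  action: string;
  actor_type: string;
  actor_id: string;
  subject_type: string;
  subject_id: string;
  customer: string | null;
  tenant: string | null;
  solution: string | null;
  severity: string;
  correlation_id: string | null;
  data: EventDoc;
}

// Hypothetical mapping: filterable fields become columns, the document goes in data.
function toRow(doc: EventDoc): EventRow {
  return {
    id: doc.id,
    occurred_at: doc.occurredAt,
    source: doc.source,
    action: doc.action,
    actor_type: doc.actor.type,
    actor_id: doc.actor.id,
    subject_type: doc.subject.type,
    subject_id: doc.subject.id,
    customer: doc.context.customer ?? null,
    tenant: doc.context.tenant ?? null,
    solution: doc.context.solution ?? null,
    severity: doc.severity ?? "info",
    correlation_id: doc.correlationId ?? null,
    data: doc,
  };
}
```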

From registry-kit the service reuses createPool (including the Entra-token-as-password Azure path), applyMigrations (advisory-locked boot-time SQL), serviceGuard, and healthPayload. It does not use DocumentStore, If-Match, writeAudit, or PeerClient, and it does not call ensureAudit: the feed is append-only and has no audit sidecar.

There is no retention policy in v1 (E6): at portfolio scale the volume is thousands of rows per month. ULID keys and time-indexed queries make the table straightforward to partition if the numbers ever say otherwise.

HTTP API

Same trust model as the registry services: serviceGuard (dormant x-service-secret shared-secret check) on everything, /health exempt, internal ingress only — no browser talks to it directly. Aurora’s BFF will proxy reads; producers write server-side through the kit.

Write — POST /activities

Body is one event or an array (the kit always sends arrays). Validation is per item (E4): the response is 202 { accepted, rejected: [{index, error}], ids } — one malformed event never sinks the batch that shares its flush window. The service stamps id and recordedAt. There is no update and no delete route — immutability is enforced by API absence, not by column grants.
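Per-item acceptance can be illustrated without any HTTP machinery. The sketch below is not the service's handler: `acceptBatch` and `checkItem` are hypothetical, the validation is reduced to two fields, id generation is injected rather than a real ULID, and treating `accepted` as a count is an assumption about the response shape.

```typescript
interface BatchResult {
  accepted: number; // assumed to be a count
  rejected: { index: number; error: string }[];
  ids: string[];
}

// Deliberately minimal shape check, standing in for the real validation.
function checkItem(item: unknown): string | null {
  if (typeof item !== "object" || item === null) return "event must be an object";
  const candidate = item as Record<string, unknown>;
  if (typeof candidate.action !== "string") return "action is required";
  if (typeof candidate.description !== "string") return "description is required";
  return null;
}

// Accepts one event or an array; a bad item is reported by index and skipped.
function acceptBatch(body: unknown, newId: () => string): BatchResult {
  const items: unknown[] = Array.isArray(body) ? body : [body];
  const result: BatchResult = { accepted: 0, rejected: [], ids: [] };
  items.forEach((item, index) => {
    const error = checkItem(item);
    if (error !== null) {
      result.rejected.push({ index, error });
      return;
    }
    // The real service would stamp id and recordedAt and insert the row here.
    result.ids.push(newId());
    result.accepted += 1;
  });
  return result;
}
```

Given a three-item batch whose middle item is malformed, the result reports two accepted events and one rejection at index 1, which is the property E4 asks for.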

Read — GET /activities

All filters optional, AND-combined:

| Parameter | Meaning |
| --- | --- |
| customer= / tenant= / solution= | Context-chain scoping (any level) |
| actor= | actor.id exact match |
| action= | Exact (solution.deployed) or prefix (deployment.*) |
| source= / severity= / correlationId= | Exact match |
| since= / until= | ISO-8601 bounds on occurredAt |
| limit= | Default 50, max 200 |
| cursor= | Id of the last event seen (keyset pagination) |
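The filter semantics (AND-combined, with `action=` accepting either an exact value or a `.*` prefix) can be expressed as an in-memory predicate. This is a sketch for illustration, covering only four of the parameters; the service itself filters in SQL, and `matchesFilter` and `matchesAction` are hypothetical names.

```typescript
interface FilterableEvent {
  action: string;
  context: { customer?: string; tenant?: string; solution?: string };
}

interface FeedFilter {
  customer?: string;
  tenant?: string;
  solution?: string;
  action?: string;
}

// "deployment.*" matches any action starting with "deployment."; otherwise exact.
function matchesAction(filter: string, action: string): boolean {
  if (filter.endsWith(".*")) {
    return action.startsWith(filter.slice(0, -1)); // keep the trailing dot
  }
  return action === filter;
}

// Every filter that is present must match; absent filters are ignored.
function matchesFilter(candidate: FilterableEvent, filter: FeedFilter): boolean {
  if (filter.customer !== undefined && candidate.context.customer !== filter.customer) return false;
  if (filter.tenant !== undefined && candidate.context.tenant !== filter.tenant) return false;
  if (filter.solution !== undefined && candidate.context.solution !== filter.solution) return false;
  if (filter.action !== undefined && !matchesAction(filter.action, candidate.action)) return false;
  return true;
}
```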

Ordering is occurredAt DESC, id DESC (stable; ties within one millisecond are arbitrary by design). Pagination is keyset, not OFFSET: the response carries nextCursor (the last id, or null when the page wasn’t full); the service resolves the cursor row’s own occurred_at to anchor (occurred_at, id) < (…). GET /activities/{id} serves one event for deep-links; GET /health is the registry-kit payload with a db ping.
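The keyset logic described above, sketched over an in-memory array rather than SQL. `readPage` and `FeedRow` are hypothetical; the sketch assumes timestamps are same-format UTC ISO strings (so string comparison orders them correctly) and assumes an unknown cursor yields an empty page, which the source does not specify.

```typescript
interface FeedRow {
  id: string;
  occurredAt: string;
}

// Descending order on the pair (occurredAt, id).
function compareDesc(a: FeedRow, b: FeedRow): number {
  if (a.occurredAt !== b.occurredAt) return a.occurredAt < b.occurredAt ? 1 : -1;
  if (a.id !== b.id) return a.id < b.id ? 1 : -1;
  return 0;
}

function readPage(
  rows: FeedRow[],
  limit: number,
  cursor?: string,
): { items: FeedRow[]; nextCursor: string | null } {
  const sorted = rows.slice().sort(compareDesc);
  let candidates = sorted;
  if (cursor !== undefined) {
    // Resolve the cursor row to recover its occurredAt, then keep only rows
    // strictly after it in feed order: (occurredAt, id) < (anchor pair).
    const anchor = sorted.find((row) => row.id === cursor);
    if (anchor === undefined) return { items: [], nextCursor: null }; // assumption
    candidates = sorted.filter((row) => compareDesc(anchor, row) < 0);
  }
  const items = candidates.slice(0, limit);
  const full = items.length === limit;
  return { items, nextCursor: full ? items[items.length - 1].id : null };
}
```

Unlike OFFSET, the anchor comparison is unaffected by rows appended between two page requests, so a reader paging backwards never sees an event twice.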

Out of v1 deliberately: aggregation endpoints, full-text search over descriptions, and an SSE live tail (the feed UI polls; activity.appended over the existing SSE-proxy pattern is the natural later addition).

The emitter — @maxq/activity-kit

The whole point of the kit is that producers can call it anywhere, including inside request handlers and mutation paths, with zero risk. It is a file: workspace dependency like trajectory-loader and registry-kit (same build-context widening, same baked-into-images rebuild gotcha), with zero runtime dependencies.

    import { createActivityEmitter } from "@maxq/activity-kit";

    const activity = createActivityEmitter({
      url: process.env.ACTIVITY_SERVICE_URL, // undefined ⇒ disabled no-op emitter
      secret: process.env.SERVICE_SHARED_SECRET,
      source: "aurora-webapp",
    });

    activity.emit({
      action: "solution.attached",
      actor: { type: "user", id: actorEmail },
      subject: { type: "solution", id: solutionId, display: solutionName },
      context: { customer: customerId, tenant: tenantId, solution: solutionId },
      description: `Attached solution *${solutionName}* to tenant *${tenantId}*.`,
    });

The mechanics, in contract order:

  • emit() is synchronous and infallible. It validates shape (warn + drop on failure — never throw), stamps occurredAt and the defaults, and pushes onto the in-memory queue. It never awaits network.
  • Batched background flush: one loop posts the queue when it reaches 20 events or 3 seconds, whichever first. Failed flushes retry with exponential backoff (1 s doubling to a 60 s cap) without blocking new emits — this is what makes svc-activity’s scale-from-zero cold starts a non-event instead of a lost write.
  • Bounded queue, drop-oldest: cap 1 000 events, one aggregated warning per overflow burst. An unreachable activity service degrades to lost feed entries, never to memory growth or caller latency.
  • Graceful drain: await emitter.flush() (bounded by a timeout) for short-lived processes such as CLI deploy scenarios; long-running services never need it. The flush timer is unref’d — the kit never keeps a process alive.
  • Disabled mode: constructing with url: undefined returns a no-op emitter.
  • withCorrelation(id) returns a child emitter sharing the parent’s queue that stamps correlationId on every emit — one line at the top of a deploy run or agent request groups everything under it.
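Three of the mechanics above are small enough to sketch in isolation: the drop-oldest bounded queue, the flush trigger, and the retry delay. None of this is the kit's code; `BoundedQueue`, `shouldFlush`, and `backoffMs` are hypothetical names, and the numbers (20 events, 3 seconds, 1 s doubling to 60 s) are taken from the list above.

```typescript
// Drop-oldest bounded queue: pushing past the cap discards the oldest entry.
class BoundedQueue<T> {
  private items: T[] = [];
  private readonly cap: number;
  dropped = 0; // lets the caller log one aggregated warning per overflow burst

  constructor(cap: number) {
    this.cap = cap;
  }

  push(item: T): void {
    if (this.items.length >= this.cap) {
      this.items.shift();
      this.dropped += 1;
    }
    this.items.push(item);
  }

  // Remove and return up to n items from the front, oldest first.
  take(n: number): T[] {
    return this.items.splice(0, n);
  }

  get size(): number {
    return this.items.length;
  }
}

// Flush when the batch is full or the oldest queued event has waited long enough.
function shouldFlush(queueSize: number, msSinceOldest: number): boolean {
  return queueSize >= 20 || (queueSize > 0 && msSinceOldest >= 3000);
}

// Retry delay after `attempt` consecutive failures (0-based): 1 s doubling, capped at 60 s.
function backoffMs(attempt: number): number {
  return Math.min(1000 * 2 ** attempt, 60000);
}
```

With a cap of 1 000 the worst case for an unreachable service is a fixed-size array of recent events and a dropped counter, which is the "lost feed entries, never memory growth" trade-off in miniature.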

Producers (planned integration)

Where the emit() calls land when integration happens — the layer that knows the business meaning and the actor emits, not the storage layer below it:

| Producer | Where | Events |
| --- | --- | --- |
| aurora-webapp BFF | after successful registry mutations, createSolution, attach/detach, sweep | customer.*, tenant.*, solution.*, sweep.completed |
| aurora-webapp agent chat | tool-effect boundaries | solution.created, deployment.started (chat user in metadata.onBehalfOf) |
| orbit-deploy engine | scenario start/success/failure, front door, teardown | deployment.*, frontdoor.provisioned |
| maxq-orbit-agent | TaskProcessor lifecycle | request.*, task.* (context chain injected at provision time; open question Q2) |
| registry services | never: they keep audit.events; the BFF above them emits the business event | |

The consumer side is the planned Aurora Flight Log rail entry (the portfolio-wide feed with a filter bar) plus an Activity tab on the customer, tenant, and solution detail pages — the same feed, pre-scoped, read through a thin BFF proxy.

Deployment

The service copies the registry-service deployment shape wholesale (codebase/ + dockerfiles/, node:20-slim, tsc → node dist/index.js, boot-time migrations):

  • Local: compose service activity-service on host port 3024 (ACTIVITY_SERVICE_PORT), same portfolio database, PG_SCHEMA=activity. Like the registry services, changing the shared kit requires an image rebuild (docker compose up -d --build activity-service).
  • Azure: svc-activity, rendered as the fourth pass of app-service.yaml.tmpl by provision-services.sh (its peer-URL placeholders stay empty and are pruned — the service consults no peers). Internal ingress, minReplicas: 0, Entra token auth to Postgres, in-environment URL in app-name form (http://svc-activity).
  • Releases: build-images.sh activity-service builds from the activity-service/v<semver> git tag and records releases/activity-service/v<version>/release.yaml; the deployed tag is pinned in config.local.yaml’s images.activity_service_tag. The kit is not released separately; it ships inside its consumers’ images.

First deploy on an environment: commit → git tag activity-service/v0.1.0 → pin the tag in config → build-images.sh activity-service → deploy.sh services.

Design decisions (summary)

| # | Decision |
| --- | --- |
| N1 | Component named activity-service (fleet convention); “Flight Log” is the UI name only |
| E0 | Explicit emission from producers, not derived from audit.events, which stays untouched as the forensic trail |
| E1 | The service stores, it does not verify: no peers, no registry lookups, context taken on faith; availability and historical fidelity over referential purity |
| E2 | Open action/subject-type vocabulary, pattern-validated; new verbs need no service deploy |
| E3 | Fire-and-forget with bounded loss, not guaranteed delivery; no outbox tables, no queue infrastructure |
| E4 | Per-item batch acceptance: one malformed event never sinks its flush window |
| E5 / E6 | 8 KB metadata soft cap; no retention policy yet (append-only is years of runway at portfolio scale) |

Full rationale, alternatives, and open questions (actor identity before WorkOS auth, the agent’s context chain, tenant-scoped reader access, SSE live tail) are in designs/activity-service.md.