Draft. This document is a work in progress shared for early review. Content and diagrams may change before the final version.

Fred Platform
Architecture

Component topology, data flows, storage boundaries, network layout, and the security perimeter — for platform engineers, security teams, and architects.

Scope & audience

This document answers the structural questions that the Developer Guide deliberately skips: how the components are wired together, what calls what over which protocol, where data lives, and where the security boundaries sit.

Primary audience: platform engineers sizing or hardening a deployment, security and compliance teams (RSSI/CISO, DSI/CIO) reviewing the attack surface, and architects integrating Fred into a larger infrastructure.

Not covered here: developer onboarding, the contribution workflow, or the reading path through the codebase — those are in the Developer Guide. What each component does (APIs, formats, agents) is in the Feature Reference.

Logical architecture & core flows

Above the physical topology, Fred separates management from execution. The Control Plane is the sole authority for agentic-pod discovery, agent enrollment, and sessions and metadata, and for resolving the execution URL the UI calls. Agentic pods (the bundled fred-agents and any custom team pods) authenticate each request themselves against the caller's Keycloak identity, run the agent logic, and stream results back over SSE. The Control Plane issues no signed grant or capability token on this path; each pod authorizes every request itself with a per-request ReBAC (OpenFGA) check. Governance policies and ReBAC authorization apply across the whole path.

Management (Control Plane) is decoupled from execution (agentic pods). Retention/erasure and other lifecycle work runs on the Control Plane's Temporal worker; long-running ingestion runs on Knowledge Flow's.

Conversation flow

The user authenticates through OIDC (Keycloak); the UI calls the Control Plane's prepare-execution endpoint with its bearer token.
The Control Plane resolves the managed agent and returns an ingress-relative execution URL. It issues no signed grant or capability token.
The UI calls the agentic pod directly at that URL, presenting the same Keycloak bearer token. The pod authenticates the JWT and authorizes the request itself with a per-request ReBAC (OpenFGA) check on the caller's team_id — there is no callback to the Control Plane on this path.
The agentic pod executes the agent (ReAct or Graph via fred-runtime), consulting governance and model-routing policies for the effective model and tool behavior.
Output and reasoning traces stream back to the UI over SSE, directly from the pod; checkpoints and session history are persisted by the pod.
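The steps above can be sketched with in-memory stand-ins. This is a minimal illustration, not Fred's implementation: the class names, the Token shape, and the allowed-pairs set (standing in for OpenFGA tuples) are all assumptions, and JWT signature validation is reduced to a boolean flag. Only the URL pattern comes from this document.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class Token:
    """Hypothetical stand-in for a validated Keycloak JWT."""
    subject: str
    team_id: str
    valid: bool = True


class ControlPlane:
    """Resolves a managed agent to an ingress-relative execution URL."""

    def __init__(self, agents: dict[str, str]):
        self._agents = agents  # agent name -> runtime_id (hypothetical mapping)

    def prepare_execution(self, token: Token, agent: str) -> str:
        if not token.valid:
            raise PermissionError("invalid bearer token")
        runtime_id = self._agents[agent]
        # Only a URL is returned: no grant and no capability token.
        return f"/runtime/{runtime_id}/agents/execute/stream"


class AgentPod:
    """Authenticates and authorizes every request on its own."""

    def __init__(self, allowed: set[tuple[str, str]]):
        # (team_id, agent) pairs; an assumed stand-in for an OpenFGA check.
        self._allowed = allowed

    def execute_stream(self, token: Token, agent: str, prompt: str) -> Iterator[str]:
        # Step 3: the pod checks the same bearer token, with no callback
        # to the control plane.
        if not token.valid:
            raise PermissionError("invalid bearer token")
        if (token.team_id, agent) not in self._allowed:
            raise PermissionError("ReBAC check failed")
        # Steps 4 and 5: run the agent and stream chunks back.
        yield "reasoning trace"
        yield f"answer to: {prompt}"
```

In this sketch the control plane and the pod never talk to each other: the only thing that travels between them is the URL handed to the UI, and the pod's decision depends solely on the token it receives.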

Processing flow

Long-running work — Knowledge Flow ingestion/retrieval, and Control-Plane lifecycle actions such as scheduled erasure — runs on Temporal workflows.
Durable execution gives retries, reconciliation, and operational observability; workers scale independently of the interaction path.

Governance is policy-first. Model, tool/MCP, prompt, agent, and data-scope decisions are resolved from policies rather than hardcoded, and access is enforced by ReBAC (OpenFGA). See Policy-based LLM routing and Security.
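As a rough illustration of "resolved from policies rather than hardcoded", the sketch below merges per-team overrides onto platform defaults. The policy keys, model names, and team identifiers are hypothetical; the real policy schema is not described in this document.

```python
# Hypothetical policy data, for illustration only.
DEFAULT_POLICY = {"model": "default-model", "tools": ("search",)}
TEAM_POLICIES = {
    "team-a": {"model": "large-model"},  # overrides the model only
    "team-b": {"tools": ()},             # removes all tools
}


def effective_policy(team_id: str) -> dict:
    """Merge the team's overrides on top of the platform defaults."""
    return {**DEFAULT_POLICY, **TEAM_POLICIES.get(team_id, {})}
```

Changing what a team can use then means editing policy data, not agent code.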

Why this shape

Decoupling interaction from long-running processing keeps the chat path responsive while ingestion, evaluation, and erasure run durably in the background.
Enterprise governance and auditable behavior — every execution is team-scoped and authorized; policies, not code, decide what a team can use.
Independent, forkable agent pods — agents are built and deployed in their own repositories; the Control Plane discovers and routes to them, with no dependency on the Fred monorepo.

Kubernetes deployment topology

A Fred instance runs entirely inside a single Kubernetes namespace. All external traffic enters through one Ingress. Agent execution pods — both the bundled fred-agents service and any custom team pods — are treated identically by the control plane and by the Ingress routing rules.

Reading the diagram. All traffic enters through the Kubernetes Ingress; there is no other exposed port. When a user starts an agent session, the frontend first calls the control plane's prepare-execution endpoint, which resolves the managed agent and returns an ingress-relative execution URL. The control plane issues no signed grant or capability token. The frontend then opens an SSE stream directly to the agent pod at /runtime/{id}/agents/execute/stream, presenting the user's own Keycloak bearer token. The agent pod authenticates that JWT and authorizes the request itself with a per-request ReBAC (OpenFGA) check on the caller's team, without calling back to the control plane.

This design was chosen deliberately. An earlier revision had the control plane mint a signed ExecutionGrant; it was withdrawn because it made the control plane a proprietary cryptographic root of trust, an unnecessary homologation burden for a C3 deployment. Identity and authorization now rest entirely on Keycloak and OpenFGA, the same model already used elsewhere in the platform. The same flow applies to both the bundled fred-agents service and any custom agent pods. Agent pods are the only services that call external LLM APIs.
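For readers unfamiliar with SSE framing, the sketch below shows how a client might split the stream into events. It follows the standard Server-Sent Events wire format (data: lines, a blank line ends an event, lines starting with a colon are comments). The function name is hypothetical, and the payload carried inside each event on Fred's stream is not specified in this document.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of text lines."""
    data = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event, if it carried any data.
            if data:
                yield "\n".join(data)
                data = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "data":
            data.append(value)
        # Other fields (event, id, retry) are ignored in this sketch.
```

A real client would feed this function the decoded lines of the HTTP response body from the execution URL, sending the Keycloak bearer token in the Authorization header.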

The agentic pod extensibility model

Any team can build and deploy their own agent pod by importing fred-runtime and fred-sdk. Once the pod is deployed and its base_url + runtime_id are added to control-plane-backend's runtime_catalog_sources, the control plane enrolls it and the Ingress gains a new /runtime/{runtime_id}/ route. From the user's perspective, a custom agent pod is indistinguishable from the bundled ones.
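The relationship between catalog entries and Ingress routes can be sketched as follows. The RuntimeSource type, the function names, and the example URLs are assumptions made for illustration; the actual format of runtime_catalog_sources and the Ingress rewrite behavior are not described here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class RuntimeSource:
    """Hypothetical shape of one runtime_catalog_sources entry."""
    runtime_id: str
    base_url: str


def ingress_routes(sources: Iterable[RuntimeSource]) -> Dict[str, str]:
    """Map each /runtime/{runtime_id}/ prefix to the pod's base URL."""
    routes: Dict[str, str] = {}
    for source in sources:
        prefix = f"/runtime/{source.runtime_id}/"
        if prefix in routes:
            raise ValueError(f"duplicate runtime_id: {source.runtime_id}")
        routes[prefix] = source.base_url.rstrip("/")
    return routes


def resolve_backend(routes: Dict[str, str], path: str) -> Optional[str]:
    """Return the backend base URL serving this path, or None if unrouted."""
    for prefix, base_url in routes.items():
        if path.startswith(prefix):
            return base_url
    return None
```

Because bundled and custom pods go through the same mapping, nothing in the route table distinguishes one from the other.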

Future: FRDC v1 (proposed) will replace the static catalog with Kubernetes-native auto-discovery using Service labels (fred.io/runtime=true). Not yet implemented — the static runtime_catalog_sources list is the current production mechanism.

GKE on S3NS — network security model

The Fred instance runs on a private GKE cluster inside an S3NS tenant with no internet exposure. Cloud Armor (GCP-native WAF) enforces a strict separation between user traffic and admin traffic at the application layer, backed by source-IP allowlisting. All admin actions are captured in immutable Cloud Audit Logs — including every Cloud Console operation, with no SSH or VM involved.

Why no bastion, no custom WAF. There are no VMs and no SSH surface — a bastion protects nothing. Cloud Armor is the GCP-native WAF provided by S3NS; bringing a separate appliance would add complexity without adding protection. The admin CIDR allowlist in Cloud Armor provides structural separation: admin paths are unreachable from the user network at the L7 layer, independently of any application-level RBAC. Every Cloud Console action (RBAC changes, config updates, pod restarts) lands in Cloud Audit Logs automatically — no custom instrumentation required.
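The decision the admin allowlist encodes can be modelled in a few lines. This is plain Python describing the logic, not Cloud Armor rule syntax, and the CIDR range and path prefix are hypothetical placeholders.

```python
import ipaddress

# Hypothetical values, for illustration only.
ADMIN_CIDRS = [ipaddress.ip_network("10.20.0.0/24")]
ADMIN_PATH_PREFIXES = ("/admin/",)


def is_allowed(source_ip: str, path: str) -> bool:
    """Admin paths are reachable only from the admin CIDR allowlist."""
    ip = ipaddress.ip_address(source_ip)
    if path.startswith(ADMIN_PATH_PREFIXES):
        return any(ip in network for network in ADMIN_CIDRS)
    # Rules for non-admin paths are out of scope for this sketch.
    return True
```

The check runs before any request reaches the application, which is why it holds independently of application-level RBAC.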
