Note for website readers: Links to source files work on GitHub and in VS Code.
They will not open on this website.
docs / FEATURES
Fred
Feature Reference
Concise, complete inventory of what the Fred platform provides — from document
ingestion and agent execution to platform governance and security certification.
What Fred does
Fred is a governed platform for deploying AI agents at team scale in regulated
environments. It covers the full lifecycle: ingesting knowledge, running agents
against that knowledge, and exposing the result through a managed chat interface
— all within strict team-scoped authorization boundaries.
Knowledge ingestion
31 file formats — documents, spreadsheets, images, audio, video — converted to
searchable, citation-rich Markdown via configurable processing profiles.
Agent execution
HTTP SSE streaming, HITL pause/resume, LangGraph checkpoints,
and a typed authorization envelope that keeps product concerns out of the execution engine.
MCP tool ecosystem
Agents discover and call tools via the Model Context Protocol.
Operators configure which servers each agent instance may use at enrollment time.
Platform governance
ReBAC (OpenFGA) enforces team scoping, role separation, and
resource quotas. Operators set immutable platform guardrails; teams configure within them.
Security certification
First homologation at C3 (French national security classification)
completed. Security posture is actively maintained and re-evaluated across platform updates.
Bring-your-own agents
Import fred-runtime and fred-sdk, define
agent behaviors, deploy as a container. The control plane enrolls custom pods
exactly like the bundled apps/fred-agents.
Services
| Service | Layer | Responsibility |
apps/control-plane-backend |
Control plane |
Teams, sessions, agent enrollment, ExecutionGrant issuance, prompt library, admin APIs |
apps/knowledge-flow-backend |
Control plane |
Document ingestion pipeline, vector search, object storage, resource lifecycle |
apps/fred-agents |
Runtime |
Reference agent pod — ReAct agents, test harness, MCP tool integration |
apps/frontend |
Client |
React SPA — managed chat, agent catalog, knowledge management, session lifecycle |
libs/fred-runtime |
Runtime lib |
Agent execution framework — SSE, HITL, checkpoints, grant validation. Core import for custom agent pods. |
libs/fred-sdk |
SDK lib |
Typed execution contracts, UiPart primitives, prompt utilities, authoring helpers |
libs/fred-core |
Cross-cutting lib |
Caching, configuration, structured logging, KPI stores, team ID utilities |
Processing profiles
Each ingestion request selects a profile. Profiles are declared in
configuration.yaml under processing.profiles
and can be overridden per deployment.
| Profile | PDF engine | OCR | Tables | Images | Use for |
| fast default |
pypdfium2 |
No |
Preserve (no structure) |
No |
Text-native PDFs, high-volume ingestion |
| medium |
docling_parse |
OpenVINO |
Structure detection |
No |
Scanned PDFs, complex tables |
| rich |
docling_parse |
OpenVINO (full page) |
Structure + images |
Yes (captions) |
High-fidelity technical documents, figure extraction |
Text splitting (all profiles): chunk size 1500 tokens, overlap 150 tokens,
table preservation enabled. Configurable per deployment.
Audio & video transcription
AudioProcessor converts spoken content to timestamped Markdown transcripts.
Video files have their audio track extracted first; the intermediate WAV is deleted on
completion. The Whisper model is loaded lazily on first use.
| Property | Value |
| Engine | faster-whisper ≥ 1.1.0 |
| Video demux | PyAV ≥ 14.0.0 — resamples to 16 kHz mono WAV |
| Default model | base (configurable via audio_model.whisper_model_size) |
| Device | cpu default; cuda supported (configurable via audio_model.device) |
| Language | Auto-detected; overridable via audio_model.language |
| Output | Markdown with [HH:MM:SS] timestamp per segment, detected language, total duration |
| Beam size | 5 (Whisper default) |
SSE event stream
Every agent execution produces a typed Server-Sent Events stream. Events are defined
in libs/fred-sdk and frozen — agents and frontends depend on this contract.
| Event | When | Key fields |
assistant_delta |
Each streaming text chunk |
content: str, exchange_id |
status |
Intermediate agent step |
kind: ThoughtKind (planning · tool_use · observation · reflection · synthesis), content |
tool_call |
Agent invokes an MCP tool |
tool_name, tool_call_id, arguments |
tool_result |
Tool returns result |
tool_call_id, content, sources[], latency_ms, metadata |
final |
Turn complete |
content (full), sources: VectorSearchHit[], token_usage, ui_parts[] |
awaiting_human |
HITL pause requested |
checkpoint_id, HumanInputRequest (prompt, options, kind) |
node_error |
Graph node failure with routing |
node, error, recoverable: bool |
execution_error |
Unhandled pipeline exception (terminal) |
error, request_id |
UI rendering parts (UiPart)
Rich structured output emitted in the final event ui_parts[] array.
| Type | Fields | Frontend action |
LinkPart |
href, title, kind: "download" | "open" | "cite" |
Rendered as action button or inline citation |
GeoPart |
GeoJSON FeatureCollection |
Rendered as interactive map component |
Execution endpoints
| Endpoint | Mode |
POST /agents/execute/stream | SSE streaming (primary) |
POST /agents/execute | Synchronous request/response |
GET /agents/sessions/{session_id}/messages | Full turn history as ChatMessage[] |
HITL & checkpoints
Human-in-the-loop pause lets an agent request a decision or input mid-execution.
The graph state is persisted to a LangGraph checkpoint; the session resumes from
that exact point when the human responds.
| Capability | Detail |
| Pause trigger |
Agent emits awaiting_human event with checkpoint_id and a HumanInputRequest (prompt, options list, kind) |
| Resume |
POST /agents/execute/stream with action: "resume" + checkpoint_id + human response payload in ExecutionGrant |
| Persistence |
LangGraph checkpoint storage; HITL request and response persisted as hitl_request / hitl_response channels in session history |
| Stateless execution guarantee |
No in-memory session state between requests; full state in checkpoint only |
| CLI inspection |
/checkpoints [limit], /checkpoint <thread_id>, /stats |
ExecutionGrant
The ExecutionGrant is the trust envelope issued by the control plane and validated
by the runtime on every request. It is the primary security boundary between product
and execution — the runtime never issues one, and the frontend cannot forge or
escalate it.
| Field | Purpose |
user_id | Keycloak UUID of the requesting user |
team_id | Team scope for the execution |
agent_instance_id | Enrolled agent instance being run |
session_id | Stable multi-turn conversation key |
action | execute or resume (HITL) |
expires_at | Short-lived (5 minutes); cannot be cached or replayed |
scopes | Optional fine-grained capability constraints |
storage_scope | Allowed object storage paths for this execution |
Prepare-execution flow: POST /teams/{team_id}/agent-instances/{id}/prepare-execution
returns ExecutionPreparation — grant + runtime URLs + effective chat options + capability flags
(supports_streaming, supports_hitl, supports_ui_parts).
The grant expires after 5 minutes regardless of whether execution has started.
Developer CLI (fred-agents-cli)
A REPL for interacting with any runtime agent directly — authentication, session
management, checkpoint inspection, and KPI display — without the frontend.
| Command | Description |
/agents, /agent <id> | List available agents and switch active agent |
/session <id>, /sessions | Set session scope or list existing sessions |
/history [session_id] | Print full message history for a session |
/checkpoints [limit], /checkpoint <id> | List or inspect LangGraph checkpoint state |
/stats | Checkpoint storage statistics |
/team [team_id|clear] | Control team scope for managed execution |
/context | Print active execution context summary |
/kpi [pattern] | Display runtime KPI counters and timings |
/login, /login-password | PKCE browser flow or password authentication |
/whoami, /logout | Auth status and session termination |
/help <question> | Ask the active agent a question in natural language |
Bundled agent templates
apps/fred-agents is the reference agent pod. Its templates are available
to any team after enrollment via the control plane. Third-party pods can supply
additional templates using the same mechanism.
| Template ID | Name | RAG | Description |
fred.github.general_assistant |
General assistant |
Optional |
General-purpose ReAct agent; configurable system prompt, no retrieval by default |
fred.github.rag_expert |
RAG expert |
Yes |
ReAct with Knowledge Flow vector search; formats cited sources with [N] anchors |
fred.github.sql_expert |
SQL expert |
No |
Natural-language to SQL; table exploration via MCP database tool |
fred.github.sentinel |
Sentinel |
No |
Observability and monitoring assistant; MCP servers locked — operator-configured only |
fred.github.test_assistant |
Test harness |
N/A |
No-LLM deterministic agent covering all SSE event types, HITL flows, markdown rendering, error scenarios; used for integration and UI testing |
ReAct execution engine
All production agents use a LangGraph-backed ReAct loop. The graph is stateless
across requests; all state lives in the LangGraph checkpoint.
| Property | Detail |
| Framework | LangGraph agent graph |
| Cycle | think → plan → tool_call → observe → reflect → finalize |
| Thought kinds |
planning · tool_use · observation · reflection · synthesis — each visible as a status event and in the reasoning trace UI |
| Model routing | Configurable model profile per agent instance; tunable via prompts.system override field |
| KPI per node | Latency, token counts, and tool timings tracked per graph step |
| Langfuse tracing | Full trace exported with user_id, team_id, agent_instance_id, session_id, checkpoint_id, template_agent_id |
MCP tool integration
Agents use tools via the Model Context Protocol. The MCP catalog is declared in the
agent pod's mcp_catalog.yaml and proxied to the frontend by the control
plane — the control plane does not interpret tool behavior, only routes configuration.
| Capability | Detail |
| Catalog declaration | Per-pod mcp_catalog.yaml — server IDs, config fields, behavioral contracts, optional agent_instructions fragment injected into system prompt |
| Per-server config | Typed config_fields[] with FieldSpec (string/number/boolean/enum, required, default, validation); set per agent instance at enrollment |
| Server selection | Agent instance holds tri-state: inherit pod defaults · empty list · explicit subset of server IDs |
| Locked servers | Servers marked locked=True cannot be toggled by team managers — operator-controlled only |
| Bundled servers | Knowledge Flow text search, Knowledge Flow corpus; extensible via custom MCP server deployment |
Markdown rendering
Chat messages are rendered with react-markdown +
remark-gfm + rehype-sanitize. A streaming fence
guard prevents transient parse errors during chunked arrival.
| Feature | Detail |
| CommonMark base | Bold, italic, headings, lists, blockquotes, horizontal rules, links |
| GFM extensions | Tables, strikethrough, task lists, autolinks |
| Code blocks | Syntax highlighting; fenced blocks with language tag |
| Math | KaTeX inline ($...$) and block ($$...$$) |
| Diagrams | Mermaid (` ```mermaid `) rendered as SVG |
| Directives | :::details collapsible blocks; mindmap JSON blocks |
| Citations | [N] markers converted to clickable SourceBadge atoms linked to source detail modal |
| Images | Presigned MinIO URLs (1-minute TTL) injected server-side; rendered inline |
| Security | rehype-sanitize applied; no raw HTML injection |
Reasoning trace
Every agent turn renders an expandable reasoning trace alongside the answer, showing
the full thought process — plans, tool calls with arguments and results, observations,
and reflections.
| Behaviour | Detail |
| Auto-open | Trace panel opens on the first status event during streaming |
| Auto-close | Panel collapses on the final event |
| Entry types | Combo: tool_call + matching tool_result grouped as one card. Solo: plan, thought, observation, error as individual cards. |
| Status chips | Pending (streaming) · ok (success) · error — shown on each tool card |
| Detail drawer | Monaco JSON viewer showing full {call, result} or solo ChatMessage payload |
| Sources panel | Sources extracted from tool_result and final events; expandable with document viewer link when source.uid present |
Chat attachments
Files attached in the chat composer are uploaded to Knowledge Flow and their path
or content is injected into the agent's execution context.
| Feature | Detail |
| Upload endpoint | POST /knowledge-flow/v1/storage/user/upload |
| Path injection | File path /workspace/uploads/{filename} passed to agent as execution context |
| Multimodal | Images base64-encoded for vision-capable models |
| Drag-and-drop | Initiates Knowledge Flow ingestion task with live progress via SSE task stream |
| Progress UI | Reuses TaskStateBadge, TaskProgressBar, and useTaskSseManager from the scheduler primitives |
Sessions
| Property | Detail |
| Ownership split | Control plane owns: title, timestamps, status, agent/team binding. Runtime owns: message content and checkpoints. |
| Session list | GET /teams/{team_id}/sessions — ordered by updated_at DESC, limit 50 |
| Editable titles | SessionTitleEditor component; title defaults to first user message (up to 120 chars) |
| Deep links | /team/{teamId}/managed-chat/{agentInstanceId}?session={uuid} |
| Purge policies | Control-plane-managed session retention; session_purge_queue handles scheduled deletion |
Role model
Fred enforces two orthogonal role axes. Platform roles are global (Keycloak groups).
Team roles are per-relationship (OpenFGA tuples). Neither axis grants access to the
other's surfaces — this is enforced at the API layer, not only in the UI.
| Role | Axis | Authority |
owner |
Team (ReBAC) |
Create teams, assign managers, set TeamPlatformPolicy (quotas, allowed models, allowed MCP servers) |
manager |
Team (ReBAC) |
Configure TeamRoutingPolicy, manage team agents (enroll/edit/delete instances), manage team-scoped prompts |
member |
Team (ReBAC) |
Use team agents, use visible prompts, manage own personal-team prompts and agents |
admin |
Platform (RBAC) |
Access internal/admin APIs (e.g., runtime binding resolver); does not inherit team-level access |
editor / viewer |
Platform (RBAC) |
Application-level read/write permissions distinct from team governance |
Orthogonality rule: owner cannot manage agents or prompts.
manager cannot touch platform policy or quotas. Hardcoded in the authorization layer.
Team scoping
| Concept | Detail |
| Personal team |
Every user has exactly one personal team. ID derived deterministically from Keycloak UUID via personal_team_id(user_id). Cannot be shared; governed by the same authorization model as regular teams. |
| Team-scoped resources |
Agents, prompts, sessions, document libraries, storage quotas — all scoped to a team. There is no global/unscoped resource namespace. |
| Organization singleton |
organization:fred in OpenFGA holds global role context without granting implicit team access. Used for platform-admin checks. |
| FrontendBootstrap |
GET /control-plane/v1/frontend/bootstrap returns resolved current_user, active_team, available_teams, permissions, feature_flags, ui_settings, gcu_version — single authenticated round-trip for the SPA shell. |
| PermissionSummary |
Flattened boolean capability flags per team (can_manage_team_agents, can_create_prompt, etc.) — UI renders conditionally on these, not on role strings. |
Policies & quotas
| Policy | Set by | What it controls |
TeamPlatformPolicy |
Owner |
Maximum storage quota, allowed model profiles, allowed MCP servers, team-level rate limits |
TeamRoutingPolicy |
Manager |
Default model profile for new agent instances, default RAG scope, search policy |
| Storage quota |
Owner (default: 5 GB team, 5 GB personal) |
Enforced on knowledge-flow document uploads per team |
| MCP server locks |
Pod author / platform |
Servers marked locked=True cannot be toggled by managers; operator-configured only |
| GCU (usage terms) |
Platform |
Version-tracked acceptance state per user; returned in FrontendBootstrap.gcu_version |
Runtime API
| Endpoint | Description |
POST /agents/execute/stream | SSE streaming execution (primary path) |
POST /agents/execute | Synchronous execution |
GET /agents/sessions/{session_id}/messages | Full turn history as ChatMessage[] |
GET /agents | Agent template catalog (pod-scoped) |
GET /health | Liveness / readiness probe |
Control-plane API
| Endpoint | Description |
GET /control-plane/v1/frontend/bootstrap | Single-call SPA init — user, team, permissions, flags |
POST /teams/{team_id}/agent-instances/{id}/prepare-execution | Issue ExecutionGrant + runtime URLs + effective chat options |
GET/POST/PATCH/DELETE /teams/{team_id}/agent-instances | Enroll, list, update, remove agent instances |
GET/POST/PATCH/DELETE /teams/{team_id}/sessions | Session metadata CRUD |
GET /teams/{team_id}/agent-templates | Proxied catalog from all registered runtime pods |
POST/GET /api/v1/tasks | Start and list long-running tasks (ingestion, migration) |
GET /api/v1/tasks/{id}/events | SSE task event stream with Last-Event-ID replay |
POST /api/v1/tasks/{id}/cancel | Cancel task (idempotent, 202) |
OpenAI compatibility layer
A secondary interface that allows tools expecting the OpenAI Chat Completions API
to connect to Fred agents without modification.
| Endpoint | Notes |
GET /v1/models | Model list (mapped to registered agent templates) |
POST /v1/chat/completions | Streaming-compatible; X-Fred-Team-Id header for team scoping |
Limitation: team-scoped managed execution and HITL are not fully supported
via the /v1 protocol. For full platform features use the native runtime API.
fred-sdk
The shared contract library imported by agent pods and the runtime. It contains no
server or product dependency — any team can build against it to author a deployable
agent pod.
| Module | Provides |
| Execution types | ExecutionGrant, ChatMessage, VectorSearchHit, HumanInputRequest, all SSE event types |
| UiPart primitives | LinkPart, GeoPart and the UiPart discriminated union |
| ThoughtKind enum | planning · tool_use · observation · reflection · synthesis |
| Prompt utilities | System prompt assembly, tuning field resolution, context injection helpers |
| Authoring primitives | Base classes and decorators for defining agent behaviors in custom pods |
Observability
| System | What it captures | Configuration |
| Langfuse |
Full LLM traces with identity context: user_id, team_id, agent_instance_id, session_id, checkpoint_id, template_agent_id |
Enabled via LANGFUSE_* env vars; traces per turn, per node |
| Prometheus |
Process metrics, SQL connection pool metrics, per-agent KPI counters, latencies |
observability.metrics: prometheus in deployment config |
| KPI pipeline |
In-process counters (token counts, latency per step, tool timings) exported at configurable intervals |
kpi_process_metrics_interval_sec, kpi_log_summary_interval_sec |
| Correlation IDs |
request_id, trace_id, correlation_id propagated through all log lines, KPI entries, and metrics |
Automatic; no configuration required |
| Structured logs |
JSON-structured log output at configurable level (debug default); all services |
app.log_level in deployment config |
Security & compliance
Authentication
| Feature | Detail |
| Identity provider | Keycloak — OpenID Connect, configurable realm |
| Auth flows | PKCE browser flow (primary), resource-owner password flow (CLI/dev), no-security mode (local/airgapped) |
| Token validation | Bearer token required on all protected endpoints; short-lived ExecutionGrant additionally required for execution |
| MFA | TOTP secrets migrate with Keycloak realm export; WebAuthn/passkey supported (hardware-bound, re-enrolment required on migration) |
| No-security mode | Auth bypass for local development and air-gapped deployments; team_id defaults to "personal" |
Authorization
| Feature | Detail |
| Engine | OpenFGA — Relationship-Based Access Control (ReBAC) |
| Policy evaluation | Runtime checks tuple relationships on every protected operation; no role string comparison in business logic |
| Tuple format | user:<keycloak-uuid> format exclusively (no username strings in production) |
| Enforcement layer | API level — not UI-only guards. Even direct API calls are rejected if tuples do not authorize the relationship. |
Data isolation
| Control | Detail |
| Team namespace isolation | All data (agents, prompts, sessions, documents) is team-scoped at the storage and authorization layer |
| Storage scope in grant | ExecutionGrant.storage_scope restricts which object storage paths a runtime execution may access |
| Session content isolation | Runtime returns session messages only to requests carrying a valid grant for that session's team |
| Personal team isolation | Personal team data is accessible only to the owning user; the personal team ID is not guessable (UUID-derived) |
| Document download TTL | Presigned URLs for document images expire after 1 minute |
Security certification
C3 homologation achieved. Fred has completed its first homologation
at the C3 level of French national security classification. Security posture is
actively maintained: architecture reviews, OpenFGA policy audits, and Keycloak
hardening are ongoing as the platform evolves toward new homologation cycles.
Design for regulated environments
| Property | How it is achieved |
| Auditability | All execution attributed to user_id + team_id; Langfuse traces record full identity context per turn |
| Role separation | Owner / manager / member surfaces are orthogonal and API-enforced; no privilege escalation path |
| Least-privilege execution | ExecutionGrant is short-lived, scoped to one session and one agent instance; cannot be forged or escalated by the frontend |
| No shared global state | Every resource is team-scoped; there is no unscoped global API for non-admin operations |
| Air-gap capable | No-security mode + local object storage + local vector store; no mandatory external service dependencies in the execution path |
Deployment
Kubernetes native
| Concern | Approach |
| Service exposure | Standard Kubernetes Service (internal) + Ingress/Gateway (browser); runtime pods exposed at /runtime/{runtime_id} prefix |
| Service discovery | Kubernetes DNS; no custom mesh or sidecar required |
| Network policy | Standard NetworkPolicy primitives; each service is independently policy-able |
| Helm chart | Modernized chart per FRED-CHART-MODERNIZATION-RFC.md; values-driven deployment for all services |
| Custom agent pods | Import fred-runtime, package as container, register runtime URL in platform.runtime_catalog_sources; no Fred core changes needed |
Configuration model
| Layer | File | Contains |
| Secrets / env | ENV_FILE | Database passwords, API keys, Keycloak secrets |
| Deployment config | CONFIG_FILE (YAML) | Base URL, ingress prefix, runtime catalog sources, processing profiles, log level, metrics config |
| Processing profiles | configuration.yaml → processing.profiles | PDF engine, OCR settings, chunk size, input processor registry per suffix |
| Audio config | configuration.yaml → audio_model | Whisper model size, device (cpu/cuda), language override |
Local development
| Command | What it does |
make run | Start the service's API process |
make run-worker | Start the Temporal/background worker |
make cli | Backend validation CLI (all backends expose this) |
make code-quality | Ruff + format (Python) or tsc + prettier (frontend) — run from repo root |
make test | Offline unit tests only (no live stack required) |
Infrastructure stack (local): PostgreSQL, Keycloak, MinIO (object storage),
OpenSearch (vector index), OpenFGA (authorization), Temporal (background workers) —
all provided via Docker Compose from ignored/fred-deployment-factory.