Note for website readers: Links to source files work on GitHub and in VS Code. They will not open on this website.
docs / FEATURES

Fred
Feature Reference

Concise, complete inventory of what the Fred platform provides — from document ingestion and agent execution to platform governance and security certification.

What Fred does

Fred is a governed platform for deploying AI agents at team scale in regulated environments. It covers the full lifecycle: ingesting knowledge, running agents against that knowledge, and exposing the result through a managed chat interface — all within strict team-scoped authorization boundaries.

Knowledge ingestion
31 file formats — documents, spreadsheets, images, audio, video — converted to searchable, citation-rich Markdown via configurable processing profiles.
Agent execution
HTTP SSE streaming, HITL pause/resume, LangGraph checkpoints, and a typed authorization envelope that keeps product concerns out of the execution engine.
MCP tool ecosystem
Agents discover and call tools via the Model Context Protocol. Operators configure which servers each agent instance may use at enrollment time.
Platform governance
ReBAC (OpenFGA) enforces team scoping, role separation, and resource quotas. Operators set immutable platform guardrails; teams configure within them.
Security certification
First homologation at C3 (French national security classification) completed. Security posture is actively maintained and re-evaluated across platform updates.
Bring-your-own agents
Import fred-runtime and fred-sdk, define agent behaviors, deploy as a container. The control plane enrolls custom pods exactly like the bundled apps/fred-agents.

Services

ServiceLayerResponsibility
apps/control-plane-backend Control plane Teams, sessions, agent enrollment, ExecutionGrant issuance, prompt library, admin APIs
apps/knowledge-flow-backend Control plane Document ingestion pipeline, vector search, object storage, resource lifecycle
apps/fred-agents Runtime Reference agent pod — ReAct agents, test harness, MCP tool integration
apps/frontend Client React SPA — managed chat, agent catalog, knowledge management, session lifecycle
libs/fred-runtime Runtime lib Agent execution framework — SSE, HITL, checkpoints, grant validation. Core import for custom agent pods.
libs/fred-sdk SDK lib Typed execution contracts, UiPart primitives, prompt utilities, authoring helpers
libs/fred-core Cross-cutting lib Caching, configuration, structured logging, KPI stores, team ID utilities

Supported ingestion formats

All formats are handled by pluggable BaseMarkdownProcessor implementations registered per file suffix in configuration.yaml. The output is always Markdown with optional inline images, tables, and citation anchors.

Documents

FormatExtensionsEngineNotes
PDF (fast) .pdf LitePdfMarkdownProcessor — pypdfium2 Text-layer only, no OCR, fastest; default in the fast profile
PDF (full) .pdf PdfMarkdownProcessor — docling_parse OpenVINO OCR, table structure detection, figure extraction; medium/rich profiles
Word document .docx DocxMarkdownProcessor Text, tables, inline images
PowerPoint .pptx PptxMarkdownProcessor Slide content and speaker notes
CSV / spreadsheet .csv CsvTabularProcessor Excel-export resilient (encoding detection, BOM stripping); produces Markdown tables
Plain text .txt TextMarkdownProcessor Pass-through with front-matter enrichment
Markdown .md MarkdownMarkdownProcessor Preserved as-is
JSON Lines .jsonl JsonlMarkdownProcessor Each line as a structured entry; supports field projection

Images

Processed by ImageProcessor using vision-capable model inference.

.png .jpg .jpeg .gif .bmp .svg .webp .ico

Audio & video

Processed by AudioProcessor via faster-whisper. See § Audio & video for full details.

.mp3 .wav .ogg .flac .m4a .aac .mp4 .mkv .avi .mov .webm

Processing profiles

Each ingestion request selects a profile. Profiles are declared in configuration.yaml under processing.profiles and can be overridden per deployment.

ProfilePDF engineOCRTablesImagesUse for
fast default pypdfium2 No Preserve (no structure) No Text-native PDFs, high-volume ingestion
medium docling_parse OpenVINO Structure detection No Scanned PDFs, complex tables
rich docling_parse OpenVINO (full page) Structure + images Yes (captions) High-fidelity technical documents, figure extraction
Text splitting (all profiles): chunk size 1500 tokens, overlap 150 tokens, table preservation enabled. Configurable per deployment.

Audio & video transcription

AudioProcessor converts spoken content to timestamped Markdown transcripts. Video files have their audio track extracted first; the intermediate WAV is deleted on completion. The Whisper model is loaded lazily on first use.

PropertyValue
Enginefaster-whisper ≥ 1.1.0
Video demuxPyAV ≥ 14.0.0 — resamples to 16 kHz mono WAV
Default modelbase (configurable via audio_model.whisper_model_size)
Devicecpu default; cuda supported (configurable via audio_model.device)
LanguageAuto-detected; overridable via audio_model.language
OutputMarkdown with [HH:MM:SS] timestamp per segment, detected language, total duration
Beam size5 (Whisper default)

SSE event stream

Every agent execution produces a typed Server-Sent Events stream. Events are defined in libs/fred-sdk and frozen — agents and frontends depend on this contract.

EventWhenKey fields
assistant_delta Each streaming text chunk content: str, exchange_id
status Intermediate agent step kind: ThoughtKind (planning · tool_use · observation · reflection · synthesis), content
tool_call Agent invokes an MCP tool tool_name, tool_call_id, arguments
tool_result Tool returns result tool_call_id, content, sources[], latency_ms, metadata
final Turn complete content (full), sources: VectorSearchHit[], token_usage, ui_parts[]
awaiting_human HITL pause requested checkpoint_id, HumanInputRequest (prompt, options, kind)
node_error Graph node failure with routing node, error, recoverable: bool
execution_error Unhandled pipeline exception (terminal) error, request_id

UI rendering parts (UiPart)

Rich structured output emitted in the final event ui_parts[] array.

TypeFieldsFrontend action
LinkPart href, title, kind: "download" | "open" | "cite" Rendered as action button or inline citation
GeoPart GeoJSON FeatureCollection Rendered as interactive map component

Execution endpoints

EndpointMode
POST /agents/execute/streamSSE streaming (primary)
POST /agents/executeSynchronous request/response
GET /agents/sessions/{session_id}/messagesFull turn history as ChatMessage[]

HITL & checkpoints

Human-in-the-loop pause lets an agent request a decision or input mid-execution. The graph state is persisted to a LangGraph checkpoint; the session resumes from that exact point when the human responds.

CapabilityDetail
Pause trigger Agent emits awaiting_human event with checkpoint_id and a HumanInputRequest (prompt, options list, kind)
Resume POST /agents/execute/stream with action: "resume" + checkpoint_id + human response payload in ExecutionGrant
Persistence LangGraph checkpoint storage; HITL request and response persisted as hitl_request / hitl_response channels in session history
Stateless execution guarantee No in-memory session state between requests; full state in checkpoint only
CLI inspection /checkpoints [limit], /checkpoint <thread_id>, /stats

ExecutionGrant

The ExecutionGrant is the trust envelope issued by the control plane and validated by the runtime on every request. It is the primary security boundary between product and execution — the runtime never issues one, and the frontend cannot forge or escalate it.

FieldPurpose
user_idKeycloak UUID of the requesting user
team_idTeam scope for the execution
agent_instance_idEnrolled agent instance being run
session_idStable multi-turn conversation key
actionexecute or resume (HITL)
expires_atShort-lived (5 minutes); cannot be cached or replayed
scopesOptional fine-grained capability constraints
storage_scopeAllowed object storage paths for this execution
Prepare-execution flow: POST /teams/{team_id}/agent-instances/{id}/prepare-execution returns ExecutionPreparation — grant + runtime URLs + effective chat options + capability flags (supports_streaming, supports_hitl, supports_ui_parts). The grant expires after 5 minutes regardless of whether execution has started.

Developer CLI (fred-agents-cli)

A REPL for interacting with any runtime agent directly — authentication, session management, checkpoint inspection, and KPI display — without the frontend.

CommandDescription
/agents, /agent <id>List available agents and switch active agent
/session <id>, /sessionsSet session scope or list existing sessions
/history [session_id]Print full message history for a session
/checkpoints [limit], /checkpoint <id>List or inspect LangGraph checkpoint state
/statsCheckpoint storage statistics
/team [team_id|clear]Control team scope for managed execution
/contextPrint active execution context summary
/kpi [pattern]Display runtime KPI counters and timings
/login, /login-passwordPKCE browser flow or password authentication
/whoami, /logoutAuth status and session termination
/help <question>Ask the active agent a question in natural language

Bundled agent templates

apps/fred-agents is the reference agent pod. Its templates are available to any team after enrollment via the control plane. Third-party pods can supply additional templates using the same mechanism.

Template IDNameRAGDescription
fred.github.general_assistant General assistant Optional General-purpose ReAct agent; configurable system prompt, no retrieval by default
fred.github.rag_expert RAG expert Yes ReAct with Knowledge Flow vector search; formats cited sources with [N] anchors
fred.github.sql_expert SQL expert No Natural-language to SQL; table exploration via MCP database tool
fred.github.sentinel Sentinel No Observability and monitoring assistant; MCP servers locked — operator-configured only
fred.github.test_assistant Test harness N/A No-LLM deterministic agent covering all SSE event types, HITL flows, markdown rendering, error scenarios; used for integration and UI testing

ReAct execution engine

All production agents use a LangGraph-backed ReAct loop. The graph is stateless across requests; all state lives in the LangGraph checkpoint.

PropertyDetail
FrameworkLangGraph agent graph
Cyclethink → plan → tool_call → observe → reflect → finalize
Thought kinds planning · tool_use · observation · reflection · synthesis — each visible as a status event and in the reasoning trace UI
Model routingConfigurable model profile per agent instance; tunable via prompts.system override field
KPI per nodeLatency, token counts, and tool timings tracked per graph step
Langfuse tracingFull trace exported with user_id, team_id, agent_instance_id, session_id, checkpoint_id, template_agent_id

MCP tool integration

Agents use tools via the Model Context Protocol. The MCP catalog is declared in the agent pod's mcp_catalog.yaml and proxied to the frontend by the control plane — the control plane does not interpret tool behavior, only routes configuration.

CapabilityDetail
Catalog declarationPer-pod mcp_catalog.yaml — server IDs, config fields, behavioral contracts, optional agent_instructions fragment injected into system prompt
Per-server configTyped config_fields[] with FieldSpec (string/number/boolean/enum, required, default, validation); set per agent instance at enrollment
Server selectionAgent instance holds tri-state: inherit pod defaults · empty list · explicit subset of server IDs
Locked serversServers marked locked=True cannot be toggled by team managers — operator-controlled only
Bundled serversKnowledge Flow text search, Knowledge Flow corpus; extensible via custom MCP server deployment

Markdown rendering

Chat messages are rendered with react-markdown + remark-gfm + rehype-sanitize. A streaming fence guard prevents transient parse errors during chunked arrival.

FeatureDetail
CommonMark baseBold, italic, headings, lists, blockquotes, horizontal rules, links
GFM extensionsTables, strikethrough, task lists, autolinks
Code blocksSyntax highlighting; fenced blocks with language tag
MathKaTeX inline ($...$) and block ($$...$$)
DiagramsMermaid (` ```mermaid `) rendered as SVG
Directives:::details collapsible blocks; mindmap JSON blocks
Citations[N] markers converted to clickable SourceBadge atoms linked to source detail modal
ImagesPresigned MinIO URLs (1-minute TTL) injected server-side; rendered inline
Securityrehype-sanitize applied; no raw HTML injection

Reasoning trace

Every agent turn renders an expandable reasoning trace alongside the answer, showing the full thought process — plans, tool calls with arguments and results, observations, and reflections.

BehaviourDetail
Auto-openTrace panel opens on the first status event during streaming
Auto-closePanel collapses on the final event
Entry typesCombo: tool_call + matching tool_result grouped as one card. Solo: plan, thought, observation, error as individual cards.
Status chipsPending (streaming) · ok (success) · error — shown on each tool card
Detail drawerMonaco JSON viewer showing full {call, result} or solo ChatMessage payload
Sources panelSources extracted from tool_result and final events; expandable with document viewer link when source.uid present

Chat attachments

Files attached in the chat composer are uploaded to Knowledge Flow and their path or content is injected into the agent's execution context.

FeatureDetail
Upload endpointPOST /knowledge-flow/v1/storage/user/upload
Path injectionFile path /workspace/uploads/{filename} passed to agent as execution context
MultimodalImages base64-encoded for vision-capable models
Drag-and-dropInitiates Knowledge Flow ingestion task with live progress via SSE task stream
Progress UIReuses TaskStateBadge, TaskProgressBar, and useTaskSseManager from the scheduler primitives

Sessions

PropertyDetail
Ownership splitControl plane owns: title, timestamps, status, agent/team binding. Runtime owns: message content and checkpoints.
Session listGET /teams/{team_id}/sessions — ordered by updated_at DESC, limit 50
Editable titlesSessionTitleEditor component; title defaults to first user message (up to 120 chars)
Deep links/team/{teamId}/managed-chat/{agentInstanceId}?session={uuid}
Purge policiesControl-plane-managed session retention; session_purge_queue handles scheduled deletion

Role model

Fred enforces two orthogonal role axes. Platform roles are global (Keycloak groups). Team roles are per-relationship (OpenFGA tuples). Neither axis grants access to the other's surfaces — this is enforced at the API layer, not only in the UI.

RoleAxisAuthority
owner Team (ReBAC) Create teams, assign managers, set TeamPlatformPolicy (quotas, allowed models, allowed MCP servers)
manager Team (ReBAC) Configure TeamRoutingPolicy, manage team agents (enroll/edit/delete instances), manage team-scoped prompts
member Team (ReBAC) Use team agents, use visible prompts, manage own personal-team prompts and agents
admin Platform (RBAC) Access internal/admin APIs (e.g., runtime binding resolver); does not inherit team-level access
editor / viewer Platform (RBAC) Application-level read/write permissions distinct from team governance
Orthogonality rule: owner cannot manage agents or prompts. manager cannot touch platform policy or quotas. Hardcoded in the authorization layer.

Team scoping

ConceptDetail
Personal team Every user has exactly one personal team. ID derived deterministically from Keycloak UUID via personal_team_id(user_id). Cannot be shared; governed by the same authorization model as regular teams.
Team-scoped resources Agents, prompts, sessions, document libraries, storage quotas — all scoped to a team. There is no global/unscoped resource namespace.
Organization singleton organization:fred in OpenFGA holds global role context without granting implicit team access. Used for platform-admin checks.
FrontendBootstrap GET /control-plane/v1/frontend/bootstrap returns resolved current_user, active_team, available_teams, permissions, feature_flags, ui_settings, gcu_version — single authenticated round-trip for the SPA shell.
PermissionSummary Flattened boolean capability flags per team (can_manage_team_agents, can_create_prompt, etc.) — UI renders conditionally on these, not on role strings.

Policies & quotas

PolicySet byWhat it controls
TeamPlatformPolicy Owner Maximum storage quota, allowed model profiles, allowed MCP servers, team-level rate limits
TeamRoutingPolicy Manager Default model profile for new agent instances, default RAG scope, search policy
Storage quota Owner (default: 5 GB team, 5 GB personal) Enforced on knowledge-flow document uploads per team
MCP server locks Pod author / platform Servers marked locked=True cannot be toggled by managers; operator-configured only
GCU (usage terms) Platform Version-tracked acceptance state per user; returned in FrontendBootstrap.gcu_version

Runtime API

EndpointDescription
POST /agents/execute/streamSSE streaming execution (primary path)
POST /agents/executeSynchronous execution
GET /agents/sessions/{session_id}/messagesFull turn history as ChatMessage[]
GET /agentsAgent template catalog (pod-scoped)
GET /healthLiveness / readiness probe

Control-plane API

EndpointDescription
GET /control-plane/v1/frontend/bootstrapSingle-call SPA init — user, team, permissions, flags
POST /teams/{team_id}/agent-instances/{id}/prepare-executionIssue ExecutionGrant + runtime URLs + effective chat options
GET/POST/PATCH/DELETE /teams/{team_id}/agent-instancesEnroll, list, update, remove agent instances
GET/POST/PATCH/DELETE /teams/{team_id}/sessionsSession metadata CRUD
GET /teams/{team_id}/agent-templatesProxied catalog from all registered runtime pods
POST/GET /api/v1/tasksStart and list long-running tasks (ingestion, migration)
GET /api/v1/tasks/{id}/eventsSSE task event stream with Last-Event-ID replay
POST /api/v1/tasks/{id}/cancelCancel task (idempotent, 202)

OpenAI compatibility layer

A secondary interface that allows tools expecting the OpenAI Chat Completions API to connect to Fred agents without modification.

EndpointNotes
GET /v1/modelsModel list (mapped to registered agent templates)
POST /v1/chat/completionsStreaming-compatible; X-Fred-Team-Id header for team scoping
Limitation: team-scoped managed execution and HITL are not fully supported via the /v1 protocol. For full platform features use the native runtime API.

fred-sdk

The shared contract library imported by agent pods and the runtime. It contains no server or product dependency — any team can build against it to author a deployable agent pod.

ModuleProvides
Execution typesExecutionGrant, ChatMessage, VectorSearchHit, HumanInputRequest, all SSE event types
UiPart primitivesLinkPart, GeoPart and the UiPart discriminated union
ThoughtKind enumplanning · tool_use · observation · reflection · synthesis
Prompt utilitiesSystem prompt assembly, tuning field resolution, context injection helpers
Authoring primitivesBase classes and decorators for defining agent behaviors in custom pods

Observability

SystemWhat it capturesConfiguration
Langfuse Full LLM traces with identity context: user_id, team_id, agent_instance_id, session_id, checkpoint_id, template_agent_id Enabled via LANGFUSE_* env vars; traces per turn, per node
Prometheus Process metrics, SQL connection pool metrics, per-agent KPI counters, latencies observability.metrics: prometheus in deployment config
KPI pipeline In-process counters (token counts, latency per step, tool timings) exported at configurable intervals kpi_process_metrics_interval_sec, kpi_log_summary_interval_sec
Correlation IDs request_id, trace_id, correlation_id propagated through all log lines, KPI entries, and metrics Automatic; no configuration required
Structured logs JSON-structured log output at configurable level (debug default); all services app.log_level in deployment config

Security & compliance

Authentication

FeatureDetail
Identity providerKeycloak — OpenID Connect, configurable realm
Auth flowsPKCE browser flow (primary), resource-owner password flow (CLI/dev), no-security mode (local/airgapped)
Token validationBearer token required on all protected endpoints; short-lived ExecutionGrant additionally required for execution
MFATOTP secrets migrate with Keycloak realm export; WebAuthn/passkey supported (hardware-bound, re-enrolment required on migration)
No-security modeAuth bypass for local development and air-gapped deployments; team_id defaults to "personal"

Authorization

FeatureDetail
EngineOpenFGA — Relationship-Based Access Control (ReBAC)
Policy evaluationRuntime checks tuple relationships on every protected operation; no role string comparison in business logic
Tuple formatuser:<keycloak-uuid> format exclusively (no username strings in production)
Enforcement layerAPI level — not UI-only guards. Even direct API calls are rejected if tuples do not authorize the relationship.

Data isolation

ControlDetail
Team namespace isolationAll data (agents, prompts, sessions, documents) is team-scoped at the storage and authorization layer
Storage scope in grantExecutionGrant.storage_scope restricts which object storage paths a runtime execution may access
Session content isolationRuntime returns session messages only to requests carrying a valid grant for that session's team
Personal team isolationPersonal team data is accessible only to the owning user; the personal team ID is not guessable (UUID-derived)
Document download TTLPresigned URLs for document images expire after 1 minute

Security certification

C3 homologation achieved. Fred has completed its first homologation at the C3 level of French national security classification. Security posture is actively maintained: architecture reviews, OpenFGA policy audits, and Keycloak hardening are ongoing as the platform evolves toward new homologation cycles.

Design for regulated environments

PropertyHow it is achieved
AuditabilityAll execution attributed to user_id + team_id; Langfuse traces record full identity context per turn
Role separationOwner / manager / member surfaces are orthogonal and API-enforced; no privilege escalation path
Least-privilege executionExecutionGrant is short-lived, scoped to one session and one agent instance; cannot be forged or escalated by the frontend
No shared global stateEvery resource is team-scoped; there is no unscoped global API for non-admin operations
Air-gap capableNo-security mode + local object storage + local vector store; no mandatory external service dependencies in the execution path

Deployment

Kubernetes native

ConcernApproach
Service exposureStandard Kubernetes Service (internal) + Ingress/Gateway (browser); runtime pods exposed at /runtime/{runtime_id} prefix
Service discoveryKubernetes DNS; no custom mesh or sidecar required
Network policyStandard NetworkPolicy primitives; each service is independently policy-able
Helm chartModernized chart per FRED-CHART-MODERNIZATION-RFC.md; values-driven deployment for all services
Custom agent podsImport fred-runtime, package as container, register runtime URL in platform.runtime_catalog_sources; no Fred core changes needed

Configuration model

LayerFileContains
Secrets / envENV_FILEDatabase passwords, API keys, Keycloak secrets
Deployment configCONFIG_FILE (YAML)Base URL, ingress prefix, runtime catalog sources, processing profiles, log level, metrics config
Processing profilesconfiguration.yaml → processing.profilesPDF engine, OCR settings, chunk size, input processor registry per suffix
Audio configconfiguration.yaml → audio_modelWhisper model size, device (cpu/cuda), language override

Local development

CommandWhat it does
make runStart the service's API process
make run-workerStart the Temporal/background worker
make cliBackend validation CLI (all backends expose this)
make code-qualityRuff + format (Python) or tsc + prettier (frontend) — run from repo root
make testOffline unit tests only (no live stack required)
Infrastructure stack (local): PostgreSQL, Keycloak, MinIO (object storage), OpenSearch (vector index), OpenFGA (authorization), Temporal (background workers) — all provided via Docker Compose from ignored/fred-deployment-factory.