Policy-based LLM Routing
Overview
Fred v2 uses a policy-based routing layer to select models. Agents do not hardcode providers or model names. They execute with a capability and runtime context, and the platform resolves the effective model profile.
Model routing is one domain of Fred’s broader governance plane (models, tools/MCP, prompts, agents, data). This page focuses specifically on the model domain.
This gives:
- consistent governance
- centralized model lifecycle management
- deterministic, auditable behavior
- lower coupling between agent code and model infrastructure
Concepts
- Capability: technical interface required by the runtime (chat, language, embedding, image).
- Purpose: optional business intent discriminator (for example rag, chatbot).
- Operation: pipeline phase (routing, planning, analysis, generate_draft, self_check, …). This is the primary routing dimension for multi-step agents.
- Model Profile: named, reusable model configuration (provider, name, settings).
Source of Truth
Routing policy is loaded from config/models_catalog.yaml, or from an override path supplied through an environment variable.
Main sections:
- common_model_settings: global defaults merged into profiles
- default_profile_by_capability: fallback profile per capability
- profiles: concrete model profiles
- rules: routing overrides
Settings merge order is deterministic:
1. common_model_settings
2. common_model_settings_by_capability[capability] (if defined)
3. profile.model.settings
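The layering above can be sketched in a few lines of Python. This is a minimal illustration, not Fred's actual loader. It assumes that later layers override earlier ones and that the merge is shallow (top-level keys only). The function name `merge_settings` and the sample keys `temperature` and `timeout` are hypothetical.

```python
from typing import Any, Mapping, Optional


def merge_settings(
    common: Mapping[str, Any],
    common_by_capability: Mapping[str, Mapping[str, Any]],
    capability: str,
    profile_settings: Optional[Mapping[str, Any]],
) -> dict[str, Any]:
    """Shallow-merge the three layers in the documented order.

    Assumption: a later layer overrides an earlier one on key conflicts.
    """
    merged: dict[str, Any] = dict(common)                    # 1. global defaults
    merged.update(common_by_capability.get(capability, {}))  # 2. per-capability, if defined
    merged.update(profile_settings or {})                    # 3. profile-level settings
    return merged


# Hypothetical values, for illustration only.
effective = merge_settings(
    common={"temperature": 0.2, "timeout": 30},
    common_by_capability={"chat": {"timeout": 60}},
    capability="chat",
    profile_settings={"temperature": 0.0},
)
print(effective)  # {'temperature': 0.0, 'timeout': 60}
```

Under these assumptions, the profile-level value wins for `temperature`, and the capability-level value wins for `timeout`.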
Rule Shape
Preferred rule format (flat):
    rules:
      - rule_id: react.phase.routing.fast
        capability: chat
        operation: routing
        target_profile_id: chat.openai.gpt5mini
      - rule_id: react.phase.planning.quality
        capability: chat
        operation: planning
        target_profile_id: chat.openai.gpt5

Optional criteria can be added at the rule root: purpose, agent_id, team_id, user_id.
The legacy match: { ... } format is still accepted for backward compatibility.
Resolution Algorithm
For each model selection request, Fred applies the following steps:
- Keep rules with the same capability as the request.
- Keep rules whose criteria all match the request context (purpose, agent_id, team_id, user_id, operation).
- Select the winner by:
  - highest specificity (number of defined criteria),
  - then first declared rule (stable tie-break).
- If no rule matches, use default_profile_by_capability[capability].
This behavior is deterministic and testable.
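The resolution steps can be illustrated with a small, self-contained Python sketch. It is not Fred's implementation. The `Rule` dataclass, the `resolve_profile` function, and the `chat.default` fallback profile id are hypothetical names chosen for this example. The sketch also assumes that a criterion left undefined on a rule matches any request value.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence

CRITERIA = ("purpose", "agent_id", "team_id", "user_id", "operation")


@dataclass(frozen=True)
class Rule:
    rule_id: str
    capability: str
    target_profile_id: str
    purpose: Optional[str] = None
    agent_id: Optional[str] = None
    team_id: Optional[str] = None
    user_id: Optional[str] = None
    operation: Optional[str] = None

    def defined_criteria(self) -> dict[str, str]:
        """Return only the criteria this rule actually sets."""
        return {k: getattr(self, k) for k in CRITERIA if getattr(self, k) is not None}


def resolve_profile(
    rules: Sequence[Rule],
    default_by_capability: Mapping[str, str],
    capability: str,
    context: Mapping[str, str],
) -> tuple[str, Optional[str]]:
    """Return (profile_id, winning rule_id), or (default profile, None)."""
    best: Optional[Rule] = None
    best_score = -1
    for rule in rules:  # iterate in declaration order
        if rule.capability != capability:
            continue
        criteria = rule.defined_criteria()
        if any(context.get(k) != v for k, v in criteria.items()):
            continue
        # Strict ">" keeps the first declared rule when specificity ties.
        if len(criteria) > best_score:
            best, best_score = rule, len(criteria)
    if best is not None:
        return best.target_profile_id, best.rule_id
    return default_by_capability[capability], None


rules = [
    Rule("react.phase.routing.fast", "chat", "chat.openai.gpt5mini", operation="routing"),
    Rule("react.phase.planning.quality", "chat", "chat.openai.gpt5", operation="planning"),
]
defaults = {"chat": "chat.default"}  # hypothetical fallback profile id

planning = resolve_profile(rules, defaults, "chat", {"operation": "planning"})
fallback = resolve_profile(rules, defaults, "chat", {"operation": "analysis"})
print(planning)  # ('chat.openai.gpt5', 'react.phase.planning.quality')
print(fallback)  # ('chat.default', None)
```

Because the function is pure (its output depends only on the rule list, the defaults, and the request context), each branch can be covered by a plain unit test.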
Runtime Behavior in v2 Agents
- ReAct v2 (LangChain path): the model can be selected per call using operation inference:
  - routing after a user message
  - planning after a tool result
- ReAct v2 (HITL path): the same operation-based resolution is applied in the custom loop.
- Graph v2: routing is available; today, selection is generally performed during runtime/model build unless agent logic adds per-operation calls.
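The operation inference used on the ReAct v2 path can be pictured as a lookup on the kind of the most recent message. The sketch below is an assumption-laden illustration, not the actual runtime code. The function name `infer_operation` and the message kind labels `"user"` and `"tool"` are hypothetical. Returning `None` for other kinds is also an assumption, since the source does not say what happens in that case.

```python
from typing import Optional

# Mapping taken from the description above: a user message leads to the
# "routing" operation, a tool result leads to the "planning" operation.
_OPERATION_BY_LAST_MESSAGE = {
    "user": "routing",
    "tool": "planning",
}


def infer_operation(last_message_kind: str) -> Optional[str]:
    """Return the inferred operation, or None when nothing can be inferred."""
    return _OPERATION_BY_LAST_MESSAGE.get(last_message_kind)


print(infer_operation("user"))  # routing
print(infer_operation("tool"))  # planning
```

The inferred operation would then be passed as part of the request context when resolving the model profile for that call.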
Observability
Routing decisions are logged with prefix:
[V2][MODEL_ROUTING]
Typical fields include:
- source=rule or source=default
- rule=...
- profile=...
- model=provider/name
- context (team, user, agent)
Scope and Governance Position
Current production posture is policy-first:
- routing managed from catalog/policies
- no end-user model picker as routing authority
This keeps enterprise behavior predictable and aligned with team governance rules.
For the broader governance architecture and roadmap positioning, see:
- docs/reference/architecture
- docs/guides/roadmap