July 20, 2025 in architecture, ops, orchestration by Alban Capitant, Simon Cariou, Florian Muller and Dimitri Tombroff · 5 minutes
Thanks to Temporal, Fred makes document ingestion as reliable as microservices — with automatic retries, fine-grained observability, and custom pipelines for push or pull ingestion. A true game-changer for AI apps.
Ingesting documents sounds simple — until you need to do it at scale, reliably, and with visibility.
What if you could retry failures automatically, observe every step, and compose custom pipelines for push or pull ingestion?
Fred now does exactly that. Thanks to Temporal, we’ve given our ingestion system a durable, observable, and developer-friendly architecture.
When you build GenAI apps that process user documents, ingestion is always the first bottleneck.
You need to fetch files from local or remote sources, extract their metadata, convert them to a searchable format, and generate embeddings.
And if any of this fails — especially in a batch — you want to know which file failed, why, and how to fix it.
Traditionally, teams solve this with brittle scripts, Celery queues, or one-off APIs. But that leads to poor observability, no retry logic, and painful debugging.
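This is the kind of hand-rolled retry loop such scripts usually grow into. The sketch below is minimal and self-contained (none of it is Fred code), and it already hints at the complexity Temporal absorbs: backoff, per-file error tracking, and resumability all have to be built and maintained by hand.

```python
import time

def ingest_with_retry(files, process, max_attempts=3, base_delay=0.01):
    """Hand-rolled per-file retry with exponential backoff.

    Returns (succeeded, failed) lists so the caller at least knows
    which file failed and why -- bookkeeping Temporal tracks for free.
    """
    succeeded, failed = [], []
    for f in files:
        for attempt in range(1, max_attempts + 1):
            try:
                process(f)
                succeeded.append(f)
                break
            except Exception as exc:
                if attempt == max_attempts:
                    # Give up and record the reason for later inspection.
                    failed.append((f, repr(exc)))
                else:
                    # Exponential backoff before the next attempt.
                    time.sleep(base_delay * 2 ** (attempt - 1))
    return succeeded, failed
```

Even this toy version loses all state if the process crashes mid-batch, which is exactly the gap a durable orchestrator closes.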
Fred now uses Temporal workflows to define its ingestion pipeline in three reusable steps:
1. Extract Metadata
2. Input Process
3. Output Process
Each step is a Temporal activity — which means automatic retries on failure, fine-grained observability for every step, and workers that scale out as needed.
The architecture is quite simple and is illustrated next:
```mermaid
flowchart TD
    subgraph UI["🧑 User Interface"]
        User[Fred Web UI]
    end
    subgraph KnowledgeFlow["📚 Knowledge Flow Backend"]
        KnowledgeAPI[Ingestion API]
        TemporalClient[Temporal Client SDK]
    end
    subgraph Temporal["⏱️ Temporal Orchestration Layer"]
        TemporalServer[🧭 Temporal Server]
        subgraph Workers["📦 Ingestion Workers (Knowledge Flow Workers)"]
            Worker1[🔧 Knowledge Flow Worker 1]
            Worker2[🔧 Knowledge Flow Worker 2]
            ScaleHint[...auto-scale as needed...]
        end
    end
    User -->|REST or Upload| KnowledgeAPI
    KnowledgeAPI -->|launch pipeline| TemporalClient
    TemporalClient --> TemporalServer
    TemporalServer --> Worker1
    TemporalServer --> Worker2
    style User fill:#d0e1ff,stroke:#333,stroke-width:1.5px
    style KnowledgeAPI fill:#d7fada,stroke:#333,stroke-width:1.5px
    style TemporalClient fill:#bbf3ff,stroke:#333,stroke-dasharray: 3,3
    style TemporalServer fill:#ffd6e7,stroke:#333,stroke-width:1.5px
    style Worker1 fill:#f2f2f2,stroke:#333,stroke-width:1.5px
    style Worker2 fill:#f2f2f2,stroke:#333,stroke-width:1.5px
    style ScaleHint fill:#ffffff,stroke:#888,stroke-dasharray: 5,5
```
A closer look at the worker shows how its architecture is laser-focused on turning raw input — whether local or remote — into structured, searchable knowledge through a streamlined fetch → process → generate flow.
```mermaid
flowchart TD
    subgraph Source["📦 Input Sources"]
        Push[🗃️ S3 Fred Store]
        Pull[🌍 Remote Source]
    end
    subgraph Worker["⚙️ Knowledge Flow Worker"]
        Step2["🧠 LLM: Metadata + Markdown"]
        Step3["🔎 LLM: Embeddings"]
        Step4[💾 Save Artifacts]
        Step2 --> Step3 --> Step4
    end
    subgraph Output["📤 Generated Artifacts"]
        Markdown[📄 Fred Previews]
        Embeddings[🧬 Fred Vectors]
        Archive[🗃️ S3 Fred Store]
    end
    Push --> Step2
    Pull --> Step2
    Step2 --> Markdown
    Step3 --> Embeddings
    Step4 --> Archive
    style Step2 fill:#e6f7ff,stroke:#333,stroke-width:1.5px
    style Step3 fill:#e6f7ff,stroke:#333,stroke-width:1.5px
    style Step4 fill:#f9f9f9,stroke:#333,stroke-width:1.5px
    style Push fill:#f2f2f2,stroke:#333,stroke-width:1.5px
    style Pull fill:#f2f2f2,stroke:#888,stroke-dasharray: 5,5
    style Markdown fill:#f0f8ff,stroke:#333,stroke-width:1.5px
    style Embeddings fill:#e2ffe2,stroke:#333,stroke-width:1.5px
    style Archive fill:#eeeeee,stroke:#333,stroke-width:1.5px
```
Here’s how the full workflow looks in code:
```python
@workflow.defn
class Process:
    @workflow.run
    async def run(self, definition: PipelineDefinition) -> str:
        for file in definition.files:
            metadata = await workflow.execute_child_workflow(ExtractMetadata.run, args=[file])
            metadata = await workflow.execute_child_workflow(InputProcess.run, args=[file, metadata])
            await workflow.execute_child_workflow(OutputProcess.run, args=[file, metadata])
        return "success"
```

You can ingest 1 file or 1,000 — each goes through its own reliable workflow.
This design changes the game.
We’re now prototyping new pipelines on top of this foundation.
If you run Fred locally, start a Temporal server (it will shortly be integrated into the devcontainers):
```shell
# Download the CLI
curl -LO https://github.com/temporalio/cli/releases/download/v1.4.1/temporal_cli_1.4.1_linux_amd64.tar.gz
tar xvzf temporal_cli_1.4.1_linux_amd64.tar.gz
sudo cp temporal /usr/local/bin/

# Start the Temporal server locally (for testing)
temporal server start-dev
```

You can also run it via Docker using the `temporalio/auto-setup` image: https://hub.docker.com/r/temporalio/auto-setup
Then, make sure to configure the `temporal` section in `config/configuration.yaml` of the Fred Knowledge Flow backend. This allows the ingestion API to act as a Temporal client and dispatch workflows to your Temporal server.
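Assuming a typical Temporal client setup, that section might look something like the fragment below; the key names are illustrative, so check the sample configuration shipped with Knowledge Flow for the exact schema:

```yaml
temporal:
  host: localhost:7233   # address of the Temporal server
  namespace: default     # Temporal namespace to dispatch workflows into
  task_queue: ingestion  # queue the Knowledge Flow workers poll
```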
To trigger an ingestion manually, visit the Fred UI and launch a Process on your files.
Once everything is running, you can open the Temporal Web UI (http://localhost:8233) and watch each workflow execute. With just a few lines, you’ll have a full ingestion pipeline running — durable, observable, and ready for scale.
Fred’s ingestion system is now ready to support new use cases, and we’re exploring further extensions.
With Temporal in the picture, Fred’s ingestion story goes from “hope it works” to “I can see exactly what happened.” That’s a huge step for reliable AI pipelines.