Agent Architecture
Building multi-agent systems that work reliably: orchestration, failure modes, and production patterns.
What is an Agent
An agent is not a smarter chatbot: it is a different execution model. This module defines what makes something agentic, maps the spectrum from single call to autonomous agent, and gives you the decision matrix to know which approach fits your problem.
Memory and State
Memory is what separates a stateless chatbot from an agent that can work across sessions and build on past experience. This module covers the four memory types, how to manage the lifecycle of each, and the anti-patterns that cause agents to accumulate stale, conflicting, or poisoned state.
Planning and Decomposition
Complex tasks fail when handed to an agent as a single goal. Planning is the process of decomposing a goal into executable steps: deciding what to do, in what order, and when to revise the plan based on what actually happens.
Multi-Agent Patterns
A single agent hits limits: context windows fill, specialisation is hard, and long tasks become fragile. Multi-agent architectures solve this by distributing work, but they introduce coordination costs, trust boundaries, and new failure modes. This module covers the patterns that work in production.
Agent Failure Modes
Agents fail in ways that are qualitatively different from single API calls: errors compound, loops consume unbounded resources, and failures can be invisible until they cause damage. This module catalogues the failure modes and the structural mitigations for each.
Human-in-the-Loop
Human oversight is not a bolt-on safety feature: it is an architectural primitive that determines what an agent is permitted to do autonomously and what requires a human decision. This module covers the design of approval gates, interrupt points, confidence escalation, and audit trails that make human oversight practical at scale.
Production Agent Systems
An agent that works in a demo fails in production the first time it crashes mid-task, gets retried with a duplicate side effect, or loses its state to a process restart. This module covers the durability semantics that separate toy agents from production systems.
Agent Evaluation
Evaluating an agent is fundamentally different from evaluating a model. The question is not just 'was the answer correct?' but 'did the agent take the right path to get there, and would it hold up under different conditions?' This module covers offline trajectory evaluation and online production monitoring: the two distinct disciplines that together keep agent quality measurable.
Cognitive Architectures
The way you structure an agent's memory, planning, and action loop determines what kinds of tasks it can handle well, and where it will fail. This module maps the three main cognitive architecture families and gives you the decision criteria to choose between them.
Internal Coding Agents
Coding agents are moving from personal tools to team infrastructure. This module covers the architecture for deploying coding agents internally: startup context, sandboxed execution, CI integration, and the review gates that keep automation safe.
Middleware and Deterministic Injection
LLM reasoning is powerful but unreliable for things you can formally specify. Middleware lets you enforce hard rules deterministically (input normalisation, output validation, routing) while leaving everything else to the model. This module covers where to draw that line and how to implement it.
Reliability Patterns for Agent Systems
Agent failures are often silent, partial, and hard to replay. This module applies distributed-systems reliability patterns (idempotency, compensating transactions, circuit breakers, and graceful degradation) to the specific failure modes agents introduce.