Agent Architecture

Building multi-agent systems that work reliably — orchestration, failure modes, and production patterns.


4.1

What is an Agent

An agent is not a smarter chatbot: it is a different execution model. This module defines what makes something agentic, maps the spectrum from single call to autonomous agent, and gives you the decision matrix to know which approach fits your problem.

6 min →

4.2

Memory and State

Memory is what separates a stateless chatbot from an agent that can work across sessions and build on past experience. This module covers the four memory types, how to manage the lifecycle of each, and the anti-patterns that cause agents to accumulate stale, conflicting, or poisoned state.

6 min →

4.3

Planning and Decomposition

Complex tasks fail when handed to an agent as a single goal. Planning is the process of decomposing a goal into executable steps: deciding what to do, in what order, and when to revise the plan based on what actually happens.

5 min →

4.4

Multi-Agent Patterns

A single agent hits limits: context windows fill, specialisation is hard, and long tasks become fragile. Multi-agent architectures solve this by distributing work, but they introduce coordination costs, trust boundaries, and new failure modes. This module covers the patterns that work in production.

5 min →

4.5

Agent Failure Modes

Agents fail in ways that are qualitatively different from single API calls: errors compound, loops consume unbounded resources, and failures can be invisible until they cause damage. This module catalogues the failure modes and the structural mitigations for each.

6 min →

4.6

Human-in-the-Loop

Human oversight is not a bolt-on safety feature: it is an architectural primitive that determines what an agent is permitted to do autonomously and what requires a human decision. This module covers the design of approval gates, interrupt points, confidence escalation, and audit trails that make human oversight practical at scale.

5 min →

4.7

Production Agent Systems

An agent that works in a demo fails in production the first time it crashes mid-task, gets retried with a duplicate side effect, or loses its state to a process restart. This module covers the durability semantics that separate toy agents from production systems.

5 min →
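
To make the duplicate-side-effect problem named in the Production Agent Systems blurb concrete, here is a minimal sketch of one common mitigation, an idempotency key. It is an illustration under stated assumptions, not the module's own implementation: `IdempotentExecutor`, `send_email`, and the key format are hypothetical names, and the in-memory dict stands in for what would need to be a durable store.

```python
from typing import Callable, Dict


class IdempotentExecutor:
    """Runs a side-effecting step at most once per idempotency key (hypothetical helper)."""

    def __init__(self) -> None:
        # Assumption: a real system would persist this mapping durably.
        # A dict keeps the sketch self-contained.
        self._results: Dict[str, str] = {}

    def run(self, key: str, step: Callable[[], str]) -> str:
        # If this key already completed, return the recorded result
        # instead of repeating the side effect.
        if key in self._results:
            return self._results[key]
        result = step()
        self._results[key] = result
        return result


sent = []  # records each simulated side effect


def send_email() -> str:
    sent.append("welcome")
    return "sent"


executor = IdempotentExecutor()
first = executor.run("task-42:send-welcome", send_email)
retry = executor.run("task-42:send-welcome", send_email)  # retried step, no second email
```

Note that this sketch does not cover a crash between running the step and recording its result; closing that gap needs a durable store and usually cooperation from the downstream service.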

4.8

Agent Evaluation

Evaluating an agent is fundamentally different from evaluating a model. The question is not just 'was the answer correct?' but 'did the agent take the right path to get there, and would it hold up under different conditions?' This module covers offline trajectory evaluation and online production monitoring: the two distinct disciplines that together keep agent quality measurable.

5 min →

4.9

Cognitive Architectures

The way you structure an agent's memory, planning, and action loop determines what kinds of tasks it can handle well — and where it will fail. This module maps the three main cognitive architecture families and gives you the decision criteria to choose between them.

6 min →

4.10

Internal Coding Agents

Coding agents are moving from personal tools to team infrastructure. This module covers the architecture for deploying coding agents internally — startup context, sandboxed execution, CI integration, and the review gates that keep automation safe.

6 min →

4.11

Middleware and Deterministic Injection

LLM reasoning is powerful but unreliable for things you can formally specify. Middleware lets you enforce hard rules deterministically — input normalisation, output validation, routing — while leaving everything else to the model. This module covers where to draw that line and how to implement it.

5 min →
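
As a rough illustration of the Middleware and Deterministic Injection idea, the sketch below wraps a model call with deterministic input normalisation and output validation. Everything named here is an assumption for illustration: `with_middleware`, `ALLOWED_ACTIONS`, the JSON output shape, and `fake_model`, which stands in for a real LLM call.

```python
import json
from typing import Callable

# Hypothetical allow-list: the hard rule enforced outside the model.
ALLOWED_ACTIONS = {"search", "summarise"}


def with_middleware(model: Callable[[str], str]) -> Callable[[str], dict]:
    """Wrap a model call with deterministic pre- and post-processing."""

    def wrapped(user_input: str) -> dict:
        # Input normalisation: collapse runs of whitespace and trim the ends.
        prompt = " ".join(user_input.split())
        raw = model(prompt)
        # Output validation: must be a JSON object with an allowed action.
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            raise ValueError("model output is not valid JSON") from exc
        if not isinstance(data, dict) or data.get("action") not in ALLOWED_ACTIONS:
            raise ValueError("model output failed validation")
        return data

    return wrapped


def fake_model(prompt: str) -> str:
    # Stand-in for an LLM call; always proposes a search action.
    return json.dumps({"action": "search", "query": prompt})


agent = with_middleware(fake_model)
result = agent("  find   the   docs ")
```

The model decides what to search for; the wrapper decides, without any model involvement, whether the proposed action is permitted at all.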

4.12

Reliability Patterns for Agent Systems

Agent failures are often silent, partial, and hard to replay. This module applies distributed-systems reliability patterns — idempotency, compensation transactions, circuit breakers, and graceful degradation — to the specific failure modes agents introduce.

6 min →
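
Two of the patterns named in the Reliability Patterns blurb, circuit breakers and graceful degradation, can be sketched together in a few lines. This is a deliberately simplified illustration: `CircuitBreaker`, `flaky_tool`, and `cached_answer` are hypothetical, and a production breaker would normally also reopen after a cool-down period, which is omitted here.

```python
from typing import Callable


class CircuitBreaker:
    """Stops calling a failing dependency after max_failures consecutive errors."""

    def __init__(self, max_failures: int = 3) -> None:
        self.max_failures = max_failures
        self.failures = 0

    def call(self, fn: Callable[[], str], fallback: Callable[[], str]) -> str:
        # Circuit open: skip the dependency and degrade gracefully.
        if self.failures >= self.max_failures:
            return fallback()
        try:
            result = fn()
        except Exception:
            self.failures += 1
            return fallback()
        # A success resets the consecutive-failure count.
        self.failures = 0
        return result


attempts = []  # records each real call to the failing tool


def flaky_tool() -> str:
    attempts.append(1)
    raise RuntimeError("tool unavailable")


def cached_answer() -> str:
    return "stale but usable result"


breaker = CircuitBreaker(max_failures=3)
outputs = [breaker.call(flaky_tool, cached_answer) for _ in range(5)]
```

After three consecutive failures the breaker stops invoking the tool, so an agent stuck in a retry loop keeps getting the degraded answer instead of consuming more resources.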
