Safety & Compliance

Prompt injection, supply chain risk, guardrails, EU AI Act — what you need to ship responsibly.

Tech LeaderJunior DevSenior DevSRE

The AI Threat Landscape

Every LLM application has a multi-layer attack surface: model, context, tools, memory, and outputs. Understanding what attackers want and what they can do is the prerequisite to building defences that actually hold. This module maps the threat landscape and establishes why defence in depth is not optional.

5 min →

7.2

Prompt Injection

Prompt injection is the most prevalent attack class in LLM applications. It takes two forms: direct injection from user input, and indirect injection through retrieved documents or tool results. Both exploit the same root cause: the model cannot distinguish instructions from data when they share the same channel.

5 min →

7.3

Jailbreaking and Policy Bypass

Jailbreaking is the attempt to get a model to produce output that its alignment training or system prompt prohibit. No defence is permanent: the arms race between jailbreak techniques and countermeasures is ongoing. This module covers the attack taxonomy and the multi-layer defences that reduce, but never eliminate, the risk.

5 min →

7.4

Data Privacy and PII

LLM systems create new PII leakage vectors that traditional data protection controls do not cover: model memorisation, cross-user context leakage, and RAG pipelines that pull in customer records without scrubbing. This module covers detection, scrubbing, retention, and the vendor agreements that govern what happens to your data.

5 min →

7.5

Guardrails Architecture

Guardrails are controls on inputs, outputs, or both: classifiers, validators, and policy checks that run independently of the model. Designing a guardrails architecture means choosing which controls to apply, how to layer them for coverage and performance, and how to calibrate them so false positives do not kill legitimate use.

5 min →

7.6

Supply Chain Security

The AI supply chain, base model, fine-tuning data, adapters, Python packages, and API keys, has more attack surfaces than teams typically consider. A .pkl file is executable code. An unverified model weight can contain backdoors. This module covers the controls that keep your AI system trustworthy from training data to production inference.

5 min →

7.7

Regulatory Landscape

The regulatory environment for AI is moving quickly. The EU AI Act introduced risk tiers and mandatory requirements. GDPR has always applied to automated decision-making. The US has the NIST AI RMF. This module maps the landscape for a B2B SaaS product using LLMs: what you likely need to document, what you need to avoid, and where you need legal counsel.

5 min →

7.8

Incident Response for AI Systems

An AI incident is not a software incident: it involves model misbehaviour, safety violations, or data leakage, each with distinct root causes and remediation paths. This module covers detection, containment, investigation, and post-mortem structure for AI-specific incidents, and the one logging investment that makes all of it possible.

5 min →

7.9

EU AI Act & Governance, Risk, and Compliance

The EU AI Act is the first comprehensive binding regulation for AI systems. It classifies AI by risk tier, imposes strict obligations on high-risk deployments, and prohibits specific uses outright. This module covers what you must do, what you cannot do, and how to determine which rules apply to your system.

8 min →

7.10

Constitutional AI & RLHF

Safety-aligned models like Claude and GPT-4 are trained, not just prompted, to be helpful and avoid harm. Understanding how Constitutional AI and RLHF bake safety into model weights explains why inference-time guardrails are still necessary — and what they can and cannot catch.

6 min →

Start here →